Use Your Head: Improving Long-Tail Video Recognition

Toby J Perrett*, Saptarshi Sinha, Tilo Burghardt, Majid Mirmehdi, Dima Damen

*Corresponding author for this work

Research output: Contribution to conference · Conference Paper · peer-review


Abstract

This paper presents an investigation into long-tail video recognition. We demonstrate that, unlike naturally-collected video datasets and existing long-tail image benchmarks, current video benchmarks fall short on multiple long-tailed properties. Most critically, they lack few-shot classes in their tails. In response, we propose new video benchmarks that better assess long-tail recognition, by sampling subsets from two datasets: SSv2 and VideoLT. We then propose a method, Long-Tail Mixed Reconstruction (LMR), which reduces overfitting to instances from few-shot classes by reconstructing them as weighted combinations of samples from head classes. LMR then employs label mixing to learn robust decision boundaries. It achieves state-of-the-art average class accuracy on EPIC-KITCHENS and the proposed SSv2-LT and VideoLT-LT. Benchmarks and code at: github.com/tobyperrett/lmr
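The core idea described in the abstract — reconstructing a few-shot (tail) sample as a weighted combination of head-class samples, then mixing labels accordingly — can be sketched as follows. This is a minimal, hypothetical illustration in feature space, not the authors' implementation; the function name `lmr_sketch`, the softmax-over-similarity weighting, and the mixing coefficient `alpha` are all assumptions made for exposition (see the linked repository for the actual method).

```python
import numpy as np

def lmr_sketch(tail_feat, tail_label, head_feats, head_labels,
               num_classes, alpha=0.5):
    """Hypothetical sketch of the reconstruction + label-mixing idea.

    tail_feat:   (d,) feature of a few-shot-class sample
    head_feats:  (n, d) features of head-class samples
    head_labels: (n,) integer class labels of the head samples
    alpha:       how much of the original tail sample to keep
    """
    # Weight each head sample by its similarity to the tail sample (softmax).
    sims = head_feats @ tail_feat
    w = np.exp(sims - sims.max())
    w /= w.sum()

    # Reconstruct the tail feature as a weighted combination of head features,
    # then mix it back with the original feature.
    recon = w @ head_feats
    mixed_feat = alpha * tail_feat + (1 - alpha) * recon

    # Mix labels with the same weights, giving a soft target distribution.
    head_soft = w @ np.eye(num_classes)[head_labels]
    mixed_label = alpha * np.eye(num_classes)[tail_label] + (1 - alpha) * head_soft
    return mixed_feat, mixed_label
```

Training on `(mixed_feat, mixed_label)` pairs, rather than the raw few-shot sample with a hard label, is what would discourage the classifier from overfitting to individual tail instances.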
Original language: English
Number of pages: 12
Publication status: Published - 23 Jun 2023
Event: IEEE/CVF Computer Vision and Pattern Recognition - Vancouver, Canada
Duration: 18 Jun 2023 - 23 Jun 2023

Conference

Conference: IEEE/CVF Computer Vision and Pattern Recognition
Abbreviated title: CVPR
Country/Territory: Canada
City: Vancouver
Period: 18/06/23 - 23/06/23

