Temporal-Relational CrossTransformers for Few-Shot Action Recognition

Toby J Perrett, Alessandro Masullo, Tilo Burghardt, Majid Mirmehdi, Dima Damen

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

192 Citations (Scopus)
208 Downloads (Pure)

Abstract

We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set. Distinct from previous few-shot works, we construct class prototypes using the CrossTransformer attention mechanism to observe relevant sub-sequences of all support videos, rather than using class averages or single best matches. Video representations are formed from ordered tuples of varying numbers of frames, which allows sub-sequences of actions at different speeds and temporal offsets to be compared. Our proposed Temporal-Relational CrossTransformers (TRX) achieve state-of-the-art results on few-shot splits of Kinetics, Something-Something V2 (SSv2), HMDB51 and UCF101. Importantly, our method outperforms prior work on SSv2 by a wide margin (12%) due to its ability to model temporal relations. A detailed ablation showcases the importance of matching to multiple support set videos and learning higher-order relational CrossTransformers.
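The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each frame is already a small feature vector, uses raw features with no learned projections, and scores a class by squared Euclidean distance. All function names (`ordered_tuples`, `tuple_features`, `query_specific_prototype`, `class_distance`) are hypothetical and chosen for illustration only.

```python
import math
from itertools import combinations


def ordered_tuples(num_frames, cardinality):
    """Index tuples (i1 < i2 < ...) of the given size, in temporal order."""
    return list(combinations(range(num_frames), cardinality))


def tuple_features(frames, cardinality):
    """Concatenate per-frame feature vectors for every ordered tuple."""
    return [
        [x for i in idx for x in frames[i]]
        for idx in ordered_tuples(len(frames), cardinality)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def query_specific_prototype(query_tuple, support_videos, cardinality):
    """Attend over tuples pooled from ALL support videos of one class.

    Assumption: scaled dot-product attention on raw features; a trained
    model would typically apply learned projections first.
    """
    pool = [t for video in support_videos
            for t in tuple_features(video, cardinality)]
    scale = math.sqrt(len(query_tuple))
    weights = softmax([dot(query_tuple, t) / scale for t in pool])
    dim = len(query_tuple)
    return [sum(w * t[d] for w, t in zip(weights, pool)) for d in range(dim)]


def class_distance(query_video, support_videos, cardinality=2):
    """Mean squared distance between query tuples and their prototypes."""
    dists = []
    for q in tuple_features(query_video, cardinality):
        proto = query_specific_prototype(q, support_videos, cardinality)
        dists.append(sum((a - b) ** 2 for a, b in zip(q, proto)))
    return sum(dists) / len(dists)
```

In this sketch, a query would be assigned to the class with the smallest `class_distance`. Calling it with several values of `cardinality` (pairs, triples, and so on) and combining the results is one way to reflect the "tuples of varying numbers of frames" described above; how the results are combined is left open here.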
Original language: English
Title of host publication: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 475-484
Number of pages: 10
ISBN (Electronic): 978-1-6654-4509-2
ISBN (Print): 978-1-6654-4510-8
Publication status: Published - 2 Nov 2021
Event: Computer Vision and Pattern Recognition 2021 - Online
Duration: 19 Jun 2021 to 25 Jun 2021
http://cvpr2021.thecvf.com/

Publication series

Name: Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher: IEEE
ISSN (Print): 1063-6919
ISSN (Electronic): 2575-7075

Conference

Conference: Computer Vision and Pattern Recognition 2021
Abbreviated title: CVPR
Period: 19/06/21 to 25/06/21

Research Groups and Themes

  • SPHERE
  • Digital Health
