Abstract
We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set. Distinct from previous few-shot works, we construct class prototypes using the CrossTransformer attention mechanism to observe relevant sub-sequences of all support videos, rather than using class averages or single best matches. Video representations are formed from ordered tuples of varying numbers of frames, which allows sub-sequences of actions at different speeds and temporal offsets to be compared. Our proposed Temporal-Relational CrossTransformers (TRX) achieve state-of-the-art results on few-shot splits of Kinetics, Something-Something V2 (SSv2), HMDB51 and UCF101. Importantly, our method outperforms prior work on SSv2 by a wide margin (12%) due to its ability to model temporal relations. A detailed ablation showcases the importance of matching to multiple support set videos and learning higher-order relational CrossTransformers.
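As an illustration of the "ordered tuples of varying numbers of frames" mentioned in the abstract, the following minimal sketch enumerates ordered frame-index pairs and triples for one video. The clip length of 8 frames and the helper name `ordered_frame_tuples` are assumptions made for this example only; the abstract does not specify them, and this is not the authors' implementation.

```python
from itertools import combinations


def ordered_frame_tuples(num_frames, cardinality):
    """Return every tuple of frame indices in strictly increasing order.

    Hypothetical helper for illustration: each tuple picks `cardinality`
    frames from a clip of `num_frames` sampled frames, preserving
    temporal order.
    """
    return list(combinations(range(num_frames), cardinality))


# Assumed clip length of 8 sampled frames (not stated in the abstract).
pairs = ordered_frame_tuples(8, 2)    # ordered pairs, e.g. (0, 1), (0, 2), ...
triples = ordered_frame_tuples(8, 3)  # ordered triples, e.g. (0, 1, 2), ...

print(len(pairs), len(triples))
```

Because the tuples span frames that may be far apart or close together, comparing tuple representations between a query and the support videos could, under these assumptions, match the same action performed at different speeds or temporal offsets.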
| Original language | English |
|---|---|
| Title of host publication | 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Pages | 475-484 |
| Number of pages | 10 |
| ISBN (Electronic) | 978-1-6654-4509-2 |
| ISBN (Print) | 978-1-6654-4510-8 |
| Publication status | Published - 2 Nov 2021 |
| Event | Computer Vision and Pattern Recognition 2021, Online. Duration: 19 Jun 2021 → 25 Jun 2021. http://cvpr2021.thecvf.com/ |
Publication series
| Name | Conference on Computer Vision and Pattern Recognition (CVPR) |
|---|---|
| Publisher | IEEE |
| ISSN (Print) | 1063-6919 |
| ISSN (Electronic) | 2575-7075 |
Conference
| Conference | Computer Vision and Pattern Recognition 2021 |
|---|---|
| Abbreviated title | CVPR |
| Period | 19/06/21 → 25/06/21 |
| Internet address | http://cvpr2021.thecvf.com/ |
Research Groups and Themes
- SPHERE
- Digital Health
Fingerprint
Dive into the research topics of 'Temporal-Relational CrossTransformers for Few-Shot Action Recognition'. Together they form a unique fingerprint.
Projects
- 1 Finished
- UMPIRE: United Model for the Perception of Interactions for visual Recognition
Damen, D. (Principal Investigator)
1/02/20 → 31/01/25
Project: Research
Equipment
- HPC (High Performance Computing) and HTC (High Throughput Computing) Facilities
Alam, S. R. (Manager), Williams, D. A. G. (Manager), Eccleston, P. E. (Manager) & Greene, D. (Manager)
Facility/equipment: Facility