Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)


Abstract

Fine-grained action recognition datasets exhibit environmental bias, where even the largest datasets contain sequences from a limited number of environments due to the challenges of large-scale data collection. We show that multi-modal action recognition models suffer with changes in environment, due to the differing levels of robustness of each modality. Inspired by successes in adversarial training for unsupervised domain adaptation, we propose a multi-modal approach for adapting action recognition models to novel environments. We employ late fusion of the two modalities commonly used in action recognition (RGB and Flow), with multiple domain discriminators, so alignment of modalities is jointly optimised with recognition. We test our approach on EPIC Kitchens, proposing the first benchmark for domain adaptation of fine-grained actions. Our multi-modal method outperforms single-modality alignment as well as other alignment methods by up to 3%.
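The abstract describes late fusion of RGB and Flow features with one domain discriminator per modality, trained adversarially alongside the action classifier. A minimal NumPy sketch of that forward pass is below; it is not the authors' code, and all layer sizes, weights, and names are illustrative stand-ins. In training, a gradient-reversal layer between each feature extractor and its discriminator would flip the domain-loss gradient so alignment is jointly optimised with recognition.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, in_dim, feat_dim, n_classes = 4, 16, 8, 5  # illustrative sizes

def linear(x, w, b):
    return x @ w + b

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stand-ins for the per-modality feature extractors (RGB and Flow backbones).
W_rgb, b_rgb = rng.standard_normal((in_dim, feat_dim)), np.zeros(feat_dim)
W_flow, b_flow = rng.standard_normal((in_dim, feat_dim)), np.zeros(feat_dim)

# One binary domain discriminator per modality (source vs target environment).
W_drgb, b_drgb = rng.standard_normal((feat_dim, 1)), np.zeros(1)
W_dflow, b_dflow = rng.standard_normal((feat_dim, 1)), np.zeros(1)

# Action classifier applied to the late-fused (concatenated) features.
W_cls, b_cls = rng.standard_normal((2 * feat_dim, n_classes)), np.zeros(n_classes)

x_rgb = rng.standard_normal((batch, in_dim))   # RGB clip features
x_flow = rng.standard_normal((batch, in_dim))  # optical-flow clip features

f_rgb = np.tanh(linear(x_rgb, W_rgb, b_rgb))
f_flow = np.tanh(linear(x_flow, W_flow, b_flow))

# Per-modality domain predictions; a gradient-reversal layer would sit
# between each extractor and its discriminator during training.
p_dom_rgb = sigmoid(linear(f_rgb, W_drgb, b_drgb))
p_dom_flow = sigmoid(linear(f_flow, W_dflow, b_dflow))

# Late fusion: concatenate modality features, then predict the action.
class_logits = linear(np.concatenate([f_rgb, f_flow], axis=1), W_cls, b_cls)

print(class_logits.shape, p_dom_rgb.shape, p_dom_flow.shape)
```

The total training loss would combine the (source-labelled) classification loss with the two weighted domain losses, so each modality is aligned separately while recognition uses the fused representation.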
Original language: English
Title of host publication: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Subtitle of host publication: CVPR 2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 119-129
Number of pages: 11
ISBN (Electronic): 978-1-7281-7168-5
DOIs
Publication status: E-pub ahead of print - 5 Aug 2020
Event: Computer Vision and Pattern Recognition
Duration: 14 Jun 2020 - 19 Jun 2020

Publication series

Name: Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher: IEEE
ISSN (Electronic): 2575-7075

Conference

Conference: Computer Vision and Pattern Recognition
Period: 14/06/20 - 19/06/20

