Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)


Abstract

Fine-grained action recognition datasets exhibit environmental bias: even the largest datasets contain sequences from a limited number of environments, owing to the challenges of large-scale data collection. We show that multi-modal action recognition models degrade when the environment changes, because each modality differs in its robustness to the shift. Inspired by the success of adversarial training for unsupervised domain adaptation, we propose a multi-modal approach for adapting action recognition models to novel environments. We employ late fusion of the two modalities commonly used in action recognition (RGB and Flow) with multiple domain discriminators, so that alignment of the modalities is optimised jointly with recognition. We test our approach on EPIC Kitchens, proposing the first benchmark for domain adaptation of fine-grained actions. Our multi-modal method outperforms single-modality alignment, as well as other alignment methods, by up to 3%.
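To make the abstract's approach concrete, the sketch below shows one plausible reading of it: late fusion of RGB and Flow features for action classification, with one domain discriminator per modality attached through a gradient reversal layer so that adversarial alignment is optimised jointly with recognition. This is a minimal PyTorch sketch, not the authors' released code; the class and parameter names (`MultiModalDA`, `feat_dim`) and all dimensions are illustrative assumptions, and it presumes pre-extracted clip-level features for each modality.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity on the forward pass,
    negated (scaled) gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

class MultiModalDA(nn.Module):
    """Hypothetical multi-modal adaptation head: late fusion of RGB
    and Flow features, plus one domain discriminator per modality."""
    def __init__(self, feat_dim=1024, num_classes=8):
        super().__init__()
        # Action classifier over the concatenated (late-fused) features.
        self.classifier = nn.Linear(2 * feat_dim, num_classes)
        # Binary (source vs. target) discriminator for each modality.
        self.disc_rgb = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 2))
        self.disc_flow = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 2))

    def forward(self, f_rgb, f_flow, lambd=1.0):
        logits = self.classifier(torch.cat([f_rgb, f_flow], dim=1))
        # Reversed gradients push each modality's features
        # towards domain invariance.
        d_rgb = self.disc_rgb(grad_reverse(f_rgb, lambd))
        d_flow = self.disc_flow(grad_reverse(f_flow, lambd))
        return logits, d_rgb, d_flow
```

Under this reading, the classification loss would be computed on labelled source clips only, while the two domain losses use both source and target clips; the reversed gradients encourage each modality's features to become indistinguishable across environments while the fused classifier preserves recognition accuracy.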
Original language: English
Title of host publication: Computer Vision and Pattern Recognition
Subtitle of host publication: CVPR 2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication status: Accepted/In press - 27 Feb 2020
Event: Computer Vision and Pattern Recognition
Duration: 14 Jun 2020 - 19 Jun 2020

Conference

Conference: Computer Vision and Pattern Recognition
Period: 14/06/20 - 19/06/20

Cite this

Munro, J., & Damen, D. (Accepted/In press). Multi-Modal Domain Adaptation for Fine-Grained Action Recognition. In Computer Vision and Pattern Recognition: CVPR 2020. Institute of Electrical and Electronics Engineers (IEEE).