Fine-grained action recognition datasets exhibit environmental bias: even the largest datasets contain sequences from a limited number of environments, due to the challenges of large-scale data collection. We show that multi-modal action recognition models degrade when the environment changes, because the modalities differ in their robustness. Inspired by the success of adversarial training for unsupervised domain adaptation, we propose a multi-modal approach for adapting action recognition models to novel environments. We employ late fusion of the two modalities commonly used in action recognition (RGB and Flow) with multiple domain discriminators, so that the alignment of modalities is jointly optimised with recognition. We test our approach on EPIC Kitchens, proposing the first benchmark for domain adaptation of fine-grained actions. Our multi-modal method outperforms single-modality alignment as well as other alignment methods by up to 3%.
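The architecture described above can be sketched in PyTorch. This is an illustrative outline, not the authors' implementation: the layer sizes, the linear stand-ins for the CNN backbones, and all names (`MultiModalDA`, `GradReverse`, etc.) are assumptions. It shows the key idea of one domain discriminator per modality, each fed through a gradient reversal layer so adversarial alignment is optimised jointly with late-fusion recognition.

```python
# Hedged sketch of multi-modal adversarial domain alignment.
# All names and dimensions are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity on the forward pass, negated (scaled)
    gradient on the backward pass, as used in adversarial domain adaptation."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class MultiModalDA(nn.Module):
    def __init__(self, in_dim=1024, feat_dim=256, n_classes=8):
        super().__init__()
        # One feature extractor per modality (linear stand-ins for backbones).
        self.rgb_net = nn.Linear(in_dim, feat_dim)
        self.flow_net = nn.Linear(in_dim, feat_dim)
        # One domain discriminator per modality, so each modality is
        # aligned separately while recognition is trained jointly.
        self.rgb_disc = nn.Linear(feat_dim, 2)
        self.flow_disc = nn.Linear(feat_dim, 2)
        # Late fusion: classify from concatenated modality features.
        self.classifier = nn.Linear(2 * feat_dim, n_classes)

    def forward(self, rgb, flow, lambd=1.0):
        f_rgb = torch.relu(self.rgb_net(rgb))
        f_flow = torch.relu(self.flow_net(flow))
        # Action logits from the fused representation.
        logits = self.classifier(torch.cat([f_rgb, f_flow], dim=1))
        # Domain logits from reversed-gradient features per modality.
        d_rgb = self.rgb_disc(GradReverse.apply(f_rgb, lambd))
        d_flow = self.flow_disc(GradReverse.apply(f_flow, lambd))
        return logits, d_rgb, d_flow


model = MultiModalDA()
rgb = torch.randn(4, 1024)   # batch of RGB clip features
flow = torch.randn(4, 1024)  # batch of optical-flow clip features
logits, d_rgb, d_flow = model(rgb, flow)
```

In training, the classification loss on `logits` (source labels only) would be summed with domain-classification losses on `d_rgb` and `d_flow` over both source and target batches; the reversal layer makes the feature extractors adversarial to the discriminators.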
Title of host publication: Computer Vision and Pattern Recognition
Subtitle of host publication: CVPR 2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication status: Accepted/In press - 27 Feb 2020
Event: Computer Vision and Pattern Recognition, 14 Jun 2020 → 19 Jun 2020
Munro, J., & Damen, D. (Accepted/In press). Multi-Modal Domain Adaptation for Fine-Grained Action Recognition. In Computer Vision and Pattern Recognition: CVPR 2020. Institute of Electrical and Electronics Engineers (IEEE).