Abstract
We show the potential of our highly-detailed annotations through a challenging VQA benchmark of 26K questions assessing the capability to recognise recipes, ingredients, nutrition, fine-grained actions, 3D perception, object motion, and gaze direction. The powerful long-context Gemini Pro achieves only 37.6% on this benchmark, showcasing its difficulty and highlighting shortcomings in current VLMs. We additionally assess action recognition, sound recognition, and long-term video-object segmentation on HD-EPIC.
HD-EPIC is 41 hours of video in 9 kitchens with digital twins of 413 kitchen fixtures, capturing 69 recipes, 59K fine-grained actions, 51K audio events, 20K object movements and 37K object masks lifted to 3D. On average, we have 263 annotations per minute of our unscripted videos.
Original language | English |
---|---|
Title of host publication | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Number of pages | 29 |
Publication status | Accepted/In press - 1 Mar 2025 |
Event | IEEE/CVF Computer Vision and Pattern Recognition (CVPR), Nashville, United States. Duration: 11 Jun 2025 → 15 Jun 2025. https://cvpr.thecvf.com |
Publication series
Name | Conference on Computer Vision and Pattern Recognition (CVPR) |
---|---|
Publisher | IEEE |
ISSN (Print) | 1063-6919 |
ISSN (Electronic) | 2575-7075 |
Conference
Conference | IEEE/CVF Computer Vision and Pattern Recognition |
---|---|
Country/Territory | United States |
City | Nashville |
Period | 11/06/25 → 15/06/25 |
Internet address | https://cvpr.thecvf.com |
Fingerprint
Dive into the research topics of 'HD-EPIC: A Highly-Detailed Egocentric Video Dataset'. Together they form a unique fingerprint.
Projects
-
8030 EPSRC via Oxford EP/T028572/1 Visual AI
Damen, D. (Principal Investigator)
1/12/20 → 30/11/25
Project: Research, Parent
-
8459 EPSRC EP/T004991/1 UMPIRE - Dima Aldamen Fellowship
Damen, D. (Principal Investigator)
1/02/20 → 31/01/25
Project: Research
Datasets
-
HD-EPIC
Cramp, L. (Creator), Wray, M. (Creator), Perrett, T. (Creator), Chalk, J. (Creator), Flanagan, K. (Creator), Khalil, A. D. (Creator), Sinha, S. (Creator), Emara, O. (Creator), Zhu, Z. (Creator), Bansal, S. (Creator), Parida, K. (Creator), Gatti, P. (Creator), Guerrier, R. (Creator), Pollard, S. (Creator) & Abdelazim, F. (Creator), University of Bristol, 1 Jul 2014
DOI: 10.5523/bris.3cqb5b81wk2dc2379fx1mrxh47, http://data.bris.ac.uk/data/dataset/3cqb5b81wk2dc2379fx1mrxh47
Dataset