This paper presents an outdoor video dataset annotated with action labels, collected from 24 participants wearing two head-mounted cameras (GoPro and SMI eye tracker) while assembling a camping tent. The recordings total 5.4 hours. Tent assembly involves manual interactions with non-rigid objects, such as spreading the tent, securing guylines, reading instructions, and opening a tent bag. An interesting aspect of the dataset is that it reflects participants' proficiency in completing or understanding the task, which leads to differences between participants in action sequences and action durations. Our dataset, called EPIC-Tent, also provides several new types of annotations for the two synchronised egocentric videos: task errors, self-rated uncertainty, and gaze position, in addition to the task action labels. We present baseline results on the EPIC-Tent dataset using a state-of-the-art method for offline and online action recognition and detection.
|Number of pages||9|
|Publication status||Published - 2 Nov 2019|
|Event||The Fifth International Workshop on Egocentric Perception, Interaction and Computing - Seoul, Korea, Republic of; Duration: 2 Nov 2019; Conference number: 5|
- Visual Perception
Jang, Y., Sullivan, B. T., Ludwig, C. J. H., Gilchrist, I. D., Damen, D., & Mayol-Cuevas, W. W. (2019). EPIC-Tent: An Egocentric Video Dataset for Camping Tent Assembly. Paper presented at The Fifth International Workshop on Egocentric Perception, Interaction and Computing, Seoul, Korea, Republic of.