Abstract
This paper presents an unsupervised approach to automatically extracting video-based guidance on object usage from egocentric video and wearable gaze tracking collected from multiple users while performing tasks. The approach i) discovers task-relevant objects, ii) builds a model for each, iii) distinguishes the different ways in which each discovered object has been used, and iv) discovers the dependencies between object interactions. The work investigates using appearance, position, motion and attention, and presents results using each feature individually as well as a combination of relevant features. Moreover, an online scalable approach is presented and compared to offline results. The paper proposes a method for selecting a suitable video guide to be displayed to a novice user indicating how to use an object, triggered purely by the user's gaze. The potential assistive mode can also recommend an object to be used next, based on the learnt sequence of object interactions. The approach was tested on a variety of daily tasks such as initialising a printer, preparing a coffee and setting up a gym machine.
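Likewise, the next-object recommendation can be sketched as a first-order transition model learnt from observed interaction sequences. This is an assumption made for illustration only, as the paper's dependency model may be richer; the `runs` data and object names below are hypothetical.

```python
# Minimal sketch: recommending the next object from learnt interaction
# sequences, using first-order transition counts (an assumption here,
# not necessarily the paper's dependency model).
from collections import Counter, defaultdict

def learn_transitions(sequences):
    """Count object-to-object transitions across observed task runs."""
    transitions = defaultdict(Counter)
    for seq in sequences:
        for current_obj, next_obj in zip(seq, seq[1:]):
            transitions[current_obj][next_obj] += 1
    return transitions

def recommend_next(transitions, current_obj):
    """Return the most frequent successor of the current object, if any."""
    successors = transitions.get(current_obj)
    return successors.most_common(1)[0][0] if successors else None

# Hypothetical usage on a coffee-preparation task:
runs = [
    ["machine", "cup", "coffee_jar", "machine"],
    ["cup", "coffee_jar", "machine"],
]
model = learn_transitions(runs)
print(recommend_next(model, "coffee_jar"))   # -> "machine"
```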
| Original language | English |
| --- | --- |
| Pages (from-to) | 98-112 |
| Number of pages | 15 |
| Journal | Computer Vision and Image Understanding |
| Volume | 149 |
| Early online date | 7 Jun 2016 |
| DOIs | |
| Publication status | Published - Aug 2016 |
Keywords
- Video Guidance
- Real-time Computer Vision
- Assistive Computing
- Object Discovery
- Object Usage
Profiles
- Professor Dima Damen
  - School of Computer Science - Professor in Computer Vision
  - Person: Academic, Member
- Professor Walterio W Mayol-Cuevas
  - School of Computer Science - Professor in Robotics, Computer Vision and Mobile Systems
  - Visual Information Laboratory
  - Person: Academic, Member