Struggle Determination from Video

  • Shijia Feng

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)

Abstract

Recognising human struggle is a crucial capability for intelligent assistive systems, such as VR applications or robots, because it enables them to provide timely support during task execution. This capability is particularly valuable when individuals are learning new skills with the help of an assistive system, and it is also crucial for systems that must decide what to learn and what to ignore.
This thesis introduces the novel problem of struggle determination in video understanding, where the aim is to develop models that identify whether a person is struggling. The task presents unique challenges, including the absence of annotated struggle datasets, the inherent ambiguity in defining what constitutes struggle, and the need to design appropriate modelling tasks and deep learning architectures. While prior work in skill assessment and action detection provides useful reference points, struggle determination requires new data resources and methodological adaptations.

To address these gaps, this thesis designs and conducts controlled experiments to collect and annotate videos capturing human struggle across diverse scenarios. Building on this foundation, the research investigates struggle determination along several dimensions. First, struggle is annotated in trimmed video segments, and different modelling approaches and architectures are explored. The work then extends to untrimmed recordings, where annotations precisely mark the start and end times of struggle, transforming the problem from video classification to temporal struggle detection. The pipeline is further reformulated for online settings, where video streams are processed to detect ongoing struggle and even anticipate upcoming struggle within a short temporal horizon. This shift better aligns the task with practical applications that require timely and proactive support. Finally, the thesis examines whether struggle detection models can generalise across tasks and activities, a critical step for building scalable datasets and training robust, transferable models. Overall, this thesis establishes struggle determination as a new research direction in video understanding and provides the datasets, benchmarks, and analyses to help advance future developments in this field.
Date of Award: 20 Jan 2026
Original language: English
Awarding Institution
  • University of Bristol
Supervisor: Walterio W Mayol-Cuevas (Supervisor) & Michael Wray (Supervisor)

Keywords

  • struggle determination
