Using computerised picture-based causal explanation speaking tasks to assess young language learners: Validity evidence from multiple perspectives

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)


This PhD project developed computerised picture-based causal explanation speaking tasks (CESTs) for young language learners (YLLs) and collected multiple sources of validity evidence to justify their use. It responds to two gaps: the lack of assessment tasks that match the newly promoted instructional focus in English-as-a-foreign-language (EFL) education (the integration of language and thinking skills), and the limited number of validation studies on YLLs’ response processes. To investigate construct validity and cognitive appropriateness, the project drew on conceptual insights from the argument-based approach to validation, the special consideration for YLLs in Weir’s socio-cognitive framework, and the requirements of the Standards for Educational and Psychological Testing on collecting multiple sources of validity evidence. It also drew on Kormos’s bilingual speech production model and Weir’s socio-cognitive framework for speaking test validation to conceptualise speech production processes. The project involved 96 YLLs in China (48 each from Grade 4 and Grade 6), who performed two CESTs in both L1 (Chinese) and L2 (English) while their eye movements and speaking performance were recorded. YLLs were then interviewed retrospectively about their response processes, prompted by a replay of their recorded eye gaze and speaking performance. They also completed receptive and productive L2 vocabulary size tests. The project investigated different facets of L1 and L2 speaking performance (performance scores, picture options, and the use of causal connectives (CCs) and mental state words (MSWs)) as well as L1 and L2 response processes, in relation to YLLs’ L2 linguistic, age (grade level) and cognitive development levels.
The findings show that CESTs are generally cognitively appropriate for YLLs: they accomplished CESTs in L1 with high scores, and YLLs across the two grade levels had similar L1 performance scores, L1 choices of causal antecedents, L1 use of CCs, and L1 response processes. Nevertheless, YLLs’ cognitive ability to interpret and verbalise MSWs in L1 was found to be still developing. When performing CESTs in L2, YLLs with fewer linguistic resources viewed both content-relevant and content-irrelevant areas significantly more, especially when explaining the first two pictures in the task. This suggests that, under time constraints and linguistic-resource-depleting conditions, some YLLs might have difficulty managing their test-taking time and focusing their visual attention on the content-relevant areas. In terms of construct validity, CESTs were found to capture both language and thinking skills. Firstly, the L2 scores of CESTs were significantly correlated with YLLs’ L2 productive vocabulary size. Secondly, CESTs elicited a large number of CCs, which reflects the cognitive demand CESTs place on verbalising causal reasoning. Thirdly, YLLs paid most of their visual attention to the content-relevant areas in both L1 and L2 performance, and the majority of YLLs reported response processes such as identifying psychological status and psychological causality (i.e., what makes the boy unhappy) and identifying a more reasonable choice. These are the processes CESTs were intended to elicit. However, caution is needed because YLLs’ developing cognitive ability to interpret mental states and their limited meta-cognitive ability to manage test-taking time and visual attention may pose threats to construct validity. Additionally, through a process-oriented approach, the findings revealed YLLs’ dynamic interactions with the multi-modal features of CESTs at different stages of their speaking performance. Different YLLs can engage with the same visual, written or audio stimuli differently at different stages of their speaking performance. The findings further suggest that while YLLs can be strategic test-takers and active visual scrutinisers, they can also be susceptible to test anxiety arising from managing limited test time and retrieving their limited L2 linguistic resources. The project highlights the importance of collecting multiple sources of validity evidence to build a strong and comprehensive validity argument. The findings also have rich implications for evaluating and improving the cognitive appropriateness of computerised task design for YLLs.
Date of Award: 9 May 2023
Original language: English
Awarding Institution
  • The University of Bristol
Supervisor: Guoxing Yu (Supervisor) & Katherin Barg (Supervisor)


  • young language learners
  • assessing speaking
  • thinking skills
  • assessment validation
  • eye-tracking
  • causal explanation
  • computerised assessment
  • picture-based speaking tasks
