Investigating the efficacy of automated writing evaluation as a diagnostic assessment tool in L2 writing instruction: A mixed-method study

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)


Despite growing consensus on the diagnostic potential of Automated Writing Evaluation (AWE) in L2 writing instruction, research into its use as a diagnostic tool remains limited. In particular, little is known about either the quality of diagnostic feedback from current AWE systems or how L2 learners engage with this feedback. Yet understanding learner engagement is crucial to harnessing the full benefits of AWE feedback and linking feedback provision to learning outcomes.
This study aims to bridge these gaps by investigating a) the quality of diagnostic feedback from Write & Improve with Cambridge (W&I), an AWE system targeting L2 learners, focusing on its feedback scope, accuracy, and explicitness, and b) L2 learners' engagement with the feedback during essay revision, along with the factors that influence it. A new conceptual model of learner engagement with AWE feedback is proposed, conceptualising engagement as a dynamic, recursive process involving three key elements: attention allocation, cognitive effort expenditure, and revision responses, influenced by both learner-external and learner-internal factors.
Methodologically, this study combines eye-tracking, stimulated recalls, questionnaires, reflective journals, and text analysis to explore in depth learners’ engagement with feedback and the factors influencing it. The participants were 24 Chinese EFL learners who wrote and revised essays using W&I feedback in response to two graph-based writing prompts. Data were collected and analysed from multiple sources, including eye movements, stimulated recall interviews, participants’ written texts with the corresponding W&I feedback and scores, revisions based on W&I feedback, perceptions of feedback, English language proficiency, and reflective journals.
The findings validate the use of W&I as a diagnostic assessment tool in L2 settings: W&I performed satisfactorily in detecting errors common among Chinese EFL learners, with an overall precision rate of 85.65%. W&I effectively generates feedback at three levels of explicitness and introduces indirect sentence-level feedback as a novel form of AWE feedback. Overall, W&I feedback actively engaged participants in the essay revision process. This engagement was influenced by various factors: a) learner-external factors, including systemic elements (scope, accuracy, and explicitness of W&I feedback) and contextual aspects (learners’ workload and peers’ approach to W&I), b) learner-internal factors (perceptions of W&I feedback and English language proficiency), and c) three emerging factors (prior AWE experience, the time constraint imposed by the research, and the language used in the writing prompts).
By identifying these diverse factors, the study provides valuable insights for educators, AWE tool developers, and L2 learners, supporting the optimal use of AWE as a diagnostic assessment tool in L2 writing instruction. The findings not only strongly support the proposed model but also contribute to its refinement. The refined model offers an alternative theoretical framework for further research, opening new avenues for exploration in the field, with particular emphasis on a cognitive perspective on AWE feedback processing.
Date of Award: 19 Mar 2024
Original language: English
Awarding Institution
  • The University of Bristol
Supervisors: Guoxing Yu & William J Browne


Keywords
  • Automated Writing Evaluation
  • Diagnostic Assessment
  • L2 Writing Instruction
  • Learner Engagement
  • Mixed-Methods Study
