Rater accuracy and training group effects in Expert- and Supervisor-based monitoring systems

Jo Anne Baird*, Michelle Meadows, George Leckie, Daniel Caro

*Corresponding author for this work

Research output: Contribution to journal, Article (Academic Journal), peer-review



This study evaluated rater accuracy with rater-monitoring data from high-stakes examinations in England. Rater accuracy was estimated with cross-classified multilevel modelling. The data included face-to-face training and monitoring of 567 raters in 110 teams, across 22 examinations, giving a total of 5500 data points. Two rater-monitoring systems (Expert consensus scores and Supervisor judgement of correct scores) were used for all raters. Results showed significant group training (table leader) effects on rater accuracy, and these were greater in the Expert consensus score monitoring system. When Supervisor judgement methods of monitoring were used, differences between training teams (table leader effects) were underestimated. Supervisor-based judgements of raters’ accuracies were more widely dispersed than those in the Expert consensus monitoring system. Supervisors not only influenced their teams’ scoring accuracies but also overestimated differences between raters’ accuracies, compared with the Expert consensus system. Systems using supervisor judgements of correct scores and face-to-face rater training are, therefore, likely to underestimate table leader effects and overestimate rater effects.

Original language: English
Pages (from-to): 44-59
Number of pages: 16
Journal: Assessment in Education: Principles, Policy and Practice
Issue number: 1
Early online date: 1 Dec 2015
Publication status: Published - Jan 2017

Structured keywords

  • SoE Centre for Multilevel Modelling


Keywords

  • multilevel modelling
  • rater accuracy
  • rater monitoring
  • rater training
  • table effects
