The presence of randomly distributed measurement errors in scale scores such as those used in educational and behavioural assessments implies that careful adjustments are required to statistical model estimation procedures if inferences are required for ‘true’ as opposed to ‘observed’ relationships. In many cases this requires the use of external values for ‘reliability’ statistics or ‘measurement error variances’ which may be provided by a test constructor or else inferred or estimated by the data analyst. Popular measures are those described as ‘internal consistency’ estimates and sometimes other measures based on data grouping. All such measures, however, make particular assumptions that may be questionable but are often not examined. In this paper we focus on scaled scores derived from aggregating a set of indicators, and set out a general methodological framework for exploring different ways of estimating reliability statistics and measurement error variances, critiquing certain approaches and suggesting more satisfactory methods in the presence of longitudinal data. In particular, we explore the assumption of local (conditional) item response independence and show how a failure of this assumption can lead to biased estimates in statistical models using scaled scores as explanatory variables. We illustrate our methods using a large longitudinal dataset of mathematics test scores from Queensland, Australia.
- SoE Centre for Multilevel Modelling
- instrumental variable
Goldstein, H., Haynes, M., Leckie, G., & Tran, P. (2020). Estimating reliability statistics and measurement error variances using instrumental variables with longitudinal data. Longitudinal and Life Course Studies, 11(3), 289-306. https://doi.org/10.1332/175795920X15844303873216