Multiple imputation using linked proxy outcome data resulted in important bias reduction and efficiency gains: A simulation study

Rosie Cornish*, John Macleod, James Carpenter, Kate Tilling

*Corresponding author for this work

Research output: Contribution to journal › Article (Academic Journal) › peer-review


Abstract

Background: When an outcome variable is missing not at random (MNAR: the probability of missingness depends on the outcome values themselves), estimates of the effect of an exposure on this outcome are often biased. We investigated the extent of this bias and examined whether it can be reduced by incorporating proxy outcomes, obtained through linkage to administrative data, as auxiliary variables in multiple imputation (MI).

Methods: Using data from the Avon Longitudinal Study of Parents and Children (ALSPAC), we estimated the association between breastfeeding and IQ (a continuous outcome), incorporating linked attainment data (proxies for IQ) as auxiliary variables in MI models. Simulation studies explored the impact of varying the proportion of missing data (from 20% to 80%), the correlation between the outcome and its proxy (from 0.1 to 0.9), the strength of the missing data mechanism, and having a proxy variable that was itself incomplete.

Results: Incorporating a linked proxy for the missing outcome as an auxiliary variable reduced bias and increased efficiency in all scenarios, even when 80% of the outcome was missing. Using an incomplete proxy was similarly beneficial. High correlations (> 0.5) between the outcome and its proxy substantially reduced the missing information. Consistent with this, the ALSPAC analysis showed that inclusion of a proxy reduced bias and improved efficiency. Gains with additional proxies were modest.

Conclusions: In longitudinal studies with loss to follow-up, incorporating proxies for the study outcome, obtained via linkage to external sources of data, as auxiliary variables in MI models can give practically important bias reduction and efficiency gains when the study outcome is MNAR.
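The design described in the abstract can be illustrated with a small toy simulation. The sketch below is not the authors' code and does not use ALSPAC data: the function names (`simulate`, `mi_estimate`), the logistic missingness mechanism, the parameter values, and the use of a bootstrap to approximate parameter uncertainty in the imputation model are all assumptions made for illustration. It generates a binary exposure, a continuous outcome, and a proxy whose correlation with the outcome is roughly `rho`, makes the outcome MNAR, and then compares a complete-case estimate with an MI estimate that uses the proxy as an auxiliary variable.

```python
import numpy as np

def ols(design, y):
    """Least-squares coefficients for the linear model y ~ design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def simulate(n, rho, beta, rng):
    """Toy data: exposure x, outcome y, proxy z, and an MNAR observation flag."""
    x = rng.binomial(1, 0.5, n).astype(float)               # binary exposure
    y = beta * x + rng.normal(size=n)                       # continuous outcome
    z = rho * y + np.sqrt(1 - rho**2) * rng.normal(size=n)  # linked proxy for y
    # Assumed MNAR mechanism: higher outcome values are more likely to be missing.
    p_missing = 1 / (1 + np.exp(-2 * (y - 0.5)))
    observed = rng.uniform(size=n) > p_missing
    return x, y, z, observed

def mi_estimate(x, y, z, observed, m, rng):
    """Pooled exposure effect from m imputations, with z as an auxiliary variable."""
    ones = np.ones_like(x)
    design_imp = np.column_stack([ones, x, z])  # imputation model: y ~ x + z
    design_sub = np.column_stack([ones, x])     # analysis model:   y ~ x
    obs_idx = np.flatnonzero(observed)
    missing = ~observed
    estimates = []
    for _ in range(m):
        # Bootstrap the observed cases so each imputation uses different parameters.
        boot = rng.choice(obs_idx, size=obs_idx.size, replace=True)
        coef = ols(design_imp[boot], y[boot])
        resid_sd = np.std(y[boot] - design_imp[boot] @ coef, ddof=3)
        # Only observed outcomes are kept; missing ones are replaced by draws.
        y_imp = y.copy()
        y_imp[missing] = design_imp[missing] @ coef + rng.normal(
            scale=resid_sd, size=int(missing.sum())
        )
        estimates.append(ols(design_sub, y_imp)[1])
    # Rubin's rules: the pooled point estimate is the mean across imputations.
    return float(np.mean(estimates))

rng = np.random.default_rng(2017)
beta = 1.0
x, y, z, observed = simulate(n=20000, rho=0.9, beta=beta, rng=rng)

# Complete-case analysis: regress y on x among those with observed outcomes only.
cc_design = np.column_stack([np.ones(int(observed.sum())), x[observed]])
cc = float(ols(cc_design, y[observed])[1])
mi = mi_estimate(x, y, z, observed, m=20, rng=rng)

print(f"true effect:         {beta:.2f}")
print(f"complete-case:       {cc:.2f}")
print(f"MI with proxy:       {mi:.2f}")
print(f"proportion observed: {observed.mean():.2f}")
```

Lowering `rho` in this sketch should move the MI estimate back towards the complete-case estimate, which mirrors the abstract's finding that the benefit depends on how strongly the proxy is correlated with the outcome. Only the point estimate is pooled here; a fuller version would also combine within- and between-imputation variances.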

Original language: English
Article number: 14
Number of pages: 13
Journal: Emerging Themes in Epidemiology
Volume: 14
Publication status: Published - 19 Dec 2017

Structured keywords

  • Jean Golding

Keywords

  • ALSPAC
  • Bias
  • Breastfeeding
  • Data linkage
  • IQ
  • Missing data
  • Multiple imputation
  • Simulation study
