Abstract
Objective
Missing data are a pervasive problem, often leading to bias in complete records analysis (CRA). Multiple imputation (MI) via chained equations is one solution, but its use in the presence of interactions is not straightforward.
Study Design and Setting
We simulated data with outcome Y dependent on binary explanatory variables X and Z and their interaction XZ. Six scenarios were simulated (Y continuous and binary, each with no interaction, a weak interaction, and a strong interaction), under five missing data mechanisms. We used directed acyclic graphs (DAGs) to identify when CRA and MI would each be unbiased. We evaluated the performance of CRA, MI without interactions, MI including all interactions, and stratified imputation. We also illustrated these methods using a simple example from the National Child Development Study (NCDS).
Results
When an XZ interaction was present, MI excluding interactions was invalid, resulting in biased estimates and low coverage. When the XZ interaction was zero, MI excluding interactions gave unbiased estimates but over-coverage. MI including interactions and stratified MI gave equivalent, valid inference in all cases. In the NCDS example, MI excluding interactions incorrectly concluded there was no evidence for an important interaction.
Conclusions
Epidemiologists carrying out MI should ensure that their imputation model(s) are compatible with their analysis model.
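The compatibility point can be illustrated with a minimal sketch. It is not the authors' simulation: the sample size, coefficient values, the 50% completely-at-random missingness in X, and all function names (`fit_logistic`, `interaction_coef`, `mi_estimate`, `run_demo`) are assumptions made for illustration. For brevity it imputes from a single set of fitted imputation-model parameters and averages point estimates only, so it omits the parameter draws and Rubin's rules variance that proper MI would use. With continuous Y and binary X and Z, the imputation model for X that is compatible with the analysis model Y ~ X + Z + XZ is a logistic regression on Y, Z, and YZ; dropping the YZ term forces the same X-Y association in both Z strata.

```python
import numpy as np

def fit_logistic(design, target, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; returns the coefficients."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(design @ beta)))
        weights = p * (1.0 - p)
        gradient = design.T @ (target - p)
        hessian = design.T @ (design * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

def interaction_coef(y, x, z):
    """OLS fit of the analysis model Y ~ X + Z + XZ; returns the XZ coefficient."""
    design = np.column_stack([np.ones_like(y), x, z, x * z])
    return np.linalg.lstsq(design, y, rcond=None)[0][3]

def mi_estimate(y, x, z, missing, with_interaction, rng, m=10):
    """Impute missing X from Y and Z (optionally with YZ), m times, and
    average the XZ estimates. Simplification: imputation parameters are
    fitted once rather than drawn afresh for each imputation."""
    cols = [np.ones_like(y), y, z]
    if with_interaction:
        cols.append(y * z)  # the term that makes the imputation model compatible
    design = np.column_stack(cols)
    # Only records with X observed are used to fit the imputation model.
    beta = fit_logistic(design[~missing], x[~missing])
    p_missing = 1.0 / (1.0 + np.exp(-(design[missing] @ beta)))
    estimates = []
    for _ in range(m):
        x_imp = x.copy()
        x_imp[missing] = rng.binomial(1, p_missing)  # overwrite the masked values
        estimates.append(interaction_coef(y, x_imp, z))
    return float(np.mean(estimates))

def run_demo(seed=0, n=40000, b_int=1.0):
    """Simulate one dataset (assumed parameter values) and compare methods."""
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, 0.5, n).astype(float)
    z = rng.binomial(1, 0.5, n).astype(float)
    y = 0.5 * x + 0.5 * z + b_int * x * z + rng.normal(size=n)
    missing = rng.random(n) < 0.5  # X missing completely at random (assumption)
    return {
        "cra": float(interaction_coef(y[~missing], x[~missing], z[~missing])),
        "mi_without_interaction": mi_estimate(y, x, z, missing, False, rng),
        "mi_with_interaction": mi_estimate(y, x, z, missing, True, rng),
    }

if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the estimate from the imputation model without the YZ term should be pulled toward zero relative to the true interaction of 1.0, while the model including it should recover the interaction. Fitting the imputation model separately within each level of Z would be the stratified alternative.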
Original language | English |
---|---|
Pages (from-to) | 107-115 |
Number of pages | 9 |
Journal | Journal of Clinical Epidemiology |
Volume | 80 |
Early online date | 19 Jul 2016 |
DOIs | |
Publication status | Published - 1 Dec 2016 |
Structured keywords
- Jean Golding
Keywords
- Bias
- Complete case analysis
- Interaction
- Missing data
- Multiple imputation
- Simulation
Fingerprint
Dive into the research topics of 'Appropriate inclusion of interactions was needed to avoid bias in multiple imputation'. Together they form a unique fingerprint.
Projects
- 2 Finished
- Developing and disseminating robust methods for handling missing data in epidemiological studies
  1/04/10 → 1/10/13
  Project: Research
-