Abstract
Multiple imputation has become one of the most popular approaches for handling missing data in statistical analyses. Part of this success is due to Rubin's simple combination rules. These give frequentist-valid inferences when the imputation model and analysis procedure are so-called congenial and the embedding model is correctly specified, but otherwise may not. Roughly speaking, congeniality means that the imputation model and analysis procedure do not make different assumptions about the data. In practice, imputation models and analysis procedures are often not congenial, so that tests may not have the correct size and confidence interval coverage may deviate from the advertised level. We examine a number of recent proposals which combine bootstrapping with multiple imputation, and determine which are valid under uncongeniality and model misspecification. Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas certain bootstrap-followed-by-imputation methods do. We recommend a particular computationally efficient variant of bootstrapping followed by imputation.
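
For readers unfamiliar with Rubin's combination rules mentioned above, the following is a minimal sketch of the standard pooling step, not code from the paper. It assumes the analyst already has one point estimate and one variance estimate from each imputed dataset; the function name `rubin_pool` and its arguments are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed dataset
    variances: corresponding within-imputation variance estimates
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation (sample) variance
    t = w + (1 + 1 / m) * b      # Rubin's total variance
    return q_bar, t


if __name__ == "__main__":
    # Made-up numbers purely to show the call; not results from the paper.
    pooled, total_var = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(pooled, total_var)
```

The total variance returned here is the quantity whose validity the abstract says can fail under uncongeniality or misspecification, which is what motivates the bootstrap-based alternatives.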
Original language | English |
---|---|
Number of pages | 29 |
Journal | Statistical Methods in Medical Research |
Early online date | 30 Jun 2020 |
DOIs | |
Publication status | Published - 1 Dec 2020 |
Keywords
- multiple imputation
- bootstrap
- congeniality