Abstract
Multiple imputation (MI) is a well-established method for dealing with missing data. MI is computationally intensive when imputing missing covariates with high-dimensional outcome data (e.g. DNA methylation data in epigenome-wide association studies (EWAS)), because every outcome variable must be included in the imputation model to avoid biasing associations towards the null. Instead, EWAS analyses are restricted to complete cases (CC), which limits power and can cause bias. We used simulations to compare five MI methods for high-dimensional data under two missingness mechanisms. All imputation methods had increased power over CC analyses. Imputing separately for each variable was computationally inefficient, but dividing sites at random into evenly sized bins improved efficiency and gave low bias. Methods imputing solely using subsets of sites identified by the CC analysis suffered from bias towards the null. However, if these subsets were added into random bins of sites, the bias was reduced. The optimal methods were applied to an EWAS with missingness in covariates. All methods identified additional sites over the CC analysis, and many of these sites had been replicated in other studies. These methods are also applicable to other high-dimensional datasets, including the rapidly expanding area of ’omics studies.
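The random-binning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `random_bins` and `add_cc_subset`, the bin size of 100, and the example site indices are all hypothetical, chosen only for illustration. The sketch covers only the partitioning step; fitting an imputation model within each bin and then analysing that bin's sites is assumed to happen separately.

```python
import numpy as np

def random_bins(n_sites, bin_size, seed=0):
    """Randomly partition site indices 0..n_sites-1 into bins of about bin_size."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_sites)
    n_bins = int(np.ceil(n_sites / bin_size))
    # array_split yields bins whose sizes differ by at most one
    return np.array_split(shuffled, n_bins)

def add_cc_subset(bins, cc_sites):
    """Add a fixed subset of sites (e.g. sites flagged by the CC analysis) to every bin."""
    cc_sites = np.asarray(cc_sites)
    # union1d avoids duplicating a flagged site that is already in the bin
    return [np.union1d(b, cc_sites) for b in bins]

# Hypothetical example: 1000 sites, bins of 100, three sites flagged by the CC analysis
bins = random_bins(n_sites=1000, bin_size=100)
bins_plus = add_cc_subset(bins, cc_sites=[3, 42, 777])
```

Under these assumptions, the missing covariate would be imputed once per bin, with only that bin's sites in the imputation model, so no single model has to include every outcome variable.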
Original language | English |
---|---|
Article number | kwz186 |
Number of pages | 23 |
Journal | American Journal of Epidemiology |
DOIs | |
Publication status | Published - 5 Sept 2019 |
Keywords
- ALSPAC
- ARIES
- epigenetic data
- imputation
- missing data
Fingerprint
Dive into the research topics of 'Methods for Dealing with Missing Covariate Data in Epigenome-Wide Association Studies'. Together they form a unique fingerprint.
Projects
- 3 Finished
- IEU 2 Relton Programme - Epigenetic Epidemiology
  Relton, C. L. (Principal Investigator)
  1/04/18 → 31/03/23
  Project: Research
- Development of a multilevel and latent class framework for epigenetic change (IEU)
  Tilling, K. M. (Principal Investigator)
  29/01/16 → 28/11/18
  Project: Research
- (IEU) Epigenetics: Environment, Embodiment & Equality (E4)
  Relton, C. L. (Principal Investigator)
  1/01/16 → 31/12/19
  Project: Research
Profiles
- Dr Matthew J Suderman
  - Bristol Medical School (PHS) - Associate Professor in Molecular Epidemiology
  - Bristol Population Health Science Institute
  - MRC Integrative Epidemiology Unit
  Person: Academic, Member