Participants in epidemiologic and genetic studies are rarely true random samples of the populations they are intended to represent, and both known and unknown factors can influence participation in a study (known as selection into a study). The circumstances in which selection causes bias in an instrumental variable (IV) analysis are not widely understood by practitioners of IV analyses. We use directed acyclic graphs (DAGs) to depict assumptions about the selection mechanism (factors affecting selection) and show how DAGs can be used to determine when a two-stage least squares IV analysis is biased by different selection mechanisms. Through simulations, we show that selection can result in a biased IV estimate with substantial confidence interval (CI) undercoverage, and that the level of bias can differ by instrument strength, by whether the exposure-instrument association is linear or nonlinear, and by whether the exposure effect is causal or noncausal. We present an application from the UK Biobank study, which is known to be a selected sample of the general population. Of interest was the causal effect of staying in school at least 1 extra year on the decision to smoke. Based on 22,138 participants, the two-stage least squares exposure estimates differed markedly between the IV analysis that ignored selection and the IV analysis that adjusted for selection (e.g., risk differences, 1.8% [95% CI, -1.5%, 5.0%] and -4.5% [95% CI, -6.6%, -2.4%], respectively). We conclude that selection bias can have a major effect on an IV analysis, and further research is needed on how to conduct sensitivity analyses when selection depends on unmeasured data.
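
The kind of bias described above can be illustrated with a small simulation. The following is a minimal sketch and not the authors' simulation design: the data-generating model, the coefficient values, the hard-threshold selection rules, and the names `two_stage_least_squares` and `simulate` are all assumptions made for illustration. It fits a single-instrument two-stage least squares model on the full sample, on a sample selected on the exposure, and on a sample selected on the instrument alone.

```python
import numpy as np


def two_stage_least_squares(z, x, y):
    """Single-instrument 2SLS: regress x on z, then y on the fitted x."""
    # First stage: exposure on instrument (with intercept).
    design_z = np.column_stack([np.ones_like(z), z])
    first_stage, *_ = np.linalg.lstsq(design_z, x, rcond=None)
    x_hat = design_z @ first_stage
    # Second stage: outcome on the first-stage fitted exposure.
    design_x = np.column_stack([np.ones_like(x_hat), x_hat])
    second_stage, *_ = np.linalg.lstsq(design_x, y, rcond=None)
    return second_stage[1]  # slope = IV estimate of the exposure effect


def simulate(n=200_000, true_effect=0.3, seed=0):
    """Hypothetical linear model with an unmeasured confounder u."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                         # instrument
    u = rng.normal(size=n)                         # unmeasured confounder
    x = 0.5 * z + u + rng.normal(size=n)           # exposure
    y = true_effect * x + u + rng.normal(size=n)   # outcome

    # Assumed selection rules (hard thresholds, for illustration only).
    on_exposure = x > 0    # selection depends on the exposure
    on_instrument = z > 0  # selection depends on the instrument only

    return {
        "full": two_stage_least_squares(z, x, y),
        "selected_on_exposure": two_stage_least_squares(
            z[on_exposure], x[on_exposure], y[on_exposure]
        ),
        "selected_on_instrument": two_stage_least_squares(
            z[on_instrument], x[on_instrument], y[on_instrument]
        ),
    }


results = simulate()
for name, estimate in results.items():
    print(f"{name}: {estimate:.3f}")
```

In this assumed setup the exposure is a collider between the instrument and the unmeasured confounder, so restricting to selected values of the exposure induces an instrument-confounder association and the estimate moves away from the true effect. Selecting on the instrument alone induces no such association here, and the estimate stays close to the true effect.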

Original language: English
Pages (from-to): 350-357
Number of pages: 8
Issue number: 3
Early online date: 1 May 2019
Publication status: Published - 1 May 2019


  • causal exposure effect
  • collider stratification bias
  • instrumental variable
  • selection bias
  • two stage least squares


Title: Selection bias when estimating average treatment effects using one-sample instrumental variable analysis
