Principled distillation of UK Biobank phenotype data reveals underlying structure in human variation

Caitlin Carey*, Rebecca Shafee, Robbee Wedow, Amanda Elliott, Duncan S. Palmer, John Compitello, Masahiro Kanai, Liam Abbott, Patrick Schultz, Konrad J. Karczewski, Samuel C Bryant, Caroline Cusick, Claire Churchhouse, Daniel P. Howrigan, Daniel King, George Davey Smith, Benjamin M Neale, Raymond K Walters*, Elise B. Robinson

*Corresponding author for this work

Research output: Contribution to journal › Article (Academic Journal) › peer-review

8 Citations (Scopus)

Abstract

Data within biobanks capture broad yet detailed indices of human variation, but biobank-wide insights can be difficult to extract due to complexity and scale. Here, using large-scale factor analysis, we distill hundreds of variables (diagnoses, assessments and survey items) into 35 latent constructs, using data from unrelated individuals with predominantly estimated European genetic ancestry in UK Biobank. These factors recapitulate known disease classifications, disentangle elements of socioeconomic status, highlight the relevance of psychiatric constructs to health and improve measurement of pro-health behaviours. We go on to demonstrate the power of this approach to clarify genetic signal, enhance discovery and identify associations between underlying phenotypic structure and health outcomes. In building a deeper understanding of ways in which constructs such as socioeconomic status, trauma, or physical activity are structured in the dataset, we emphasize the importance of considering the interwoven nature of the human phenome when evaluating public health patterns.
Original language: English
Pages (from-to): 1599-1615
Number of pages: 17
Journal: Nature Human Behaviour
Volume: 8
Issue number: 8
Early online date: 4 Jul 2024
Publication status: E-pub ahead of print - 4 Jul 2024

Bibliographical note

Publisher Copyright:
© The Author(s) 2024.

Research Groups and Themes

  • Bristol Population Health Science Institute

  • Integrative Epidemiology Unit

    Davey Smith, G. (Principal Investigator)

    1/04/23 → 31/03/28

    Project: Research
