A framework for research into continental ancestry groups of the UK Biobank

Andrei-Emil Constantinescu, Ruth E. Mitchell, Jie Zheng, Caroline J. Bull, Nicholas J. Timpson, Borko Amulic, Emma E. Vincent, David A. Hughes*

*Corresponding author for this work

Research output: Contribution to journal › Article (Academic Journal) › peer-review

7 Citations (Scopus)
93 Downloads (Pure)


The UK Biobank is a large prospective cohort, based in the UK, that has deep phenotypic and genomic data on roughly half a million individuals. Included in this resource are data on approximately 78,000 individuals with “non-white British ancestry.” While most epidemiological studies have focused predominantly on populations of European ancestry, there is an opportunity to contribute to the study of health and disease for a broader segment of the population by making use of the UK Biobank’s “non-white British ancestry” samples. Here, we present an empirical description of the continental ancestry and population structure among the individuals in this UK Biobank subset.

Reference populations from the 1000 Genomes Project for Africa, Europe, East Asia, and South Asia were used to estimate ancestry for each individual. Those with at least 80% ancestry in one of these four continental ancestry groups were taken forward (N = 62,484). Principal component and K-means clustering analyses were used to identify and characterize population structure within each ancestry group. Of the approximately 78,000 individuals in the UK Biobank who are of “non-white British” ancestry, 50,685, 6653, 2782, and 2364 individuals were assigned to the European, African, South Asian, and East Asian continental ancestry groups, respectively. Each continental ancestry group exhibits prominent population structure that is consistent with self-reported country of birth data and geography.
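The two steps described above, retaining individuals with at least 80% estimated ancestry in one group and then clustering within a group, can be illustrated with a minimal sketch. This is not the authors' pipeline: the group labels (`AFR`, `EUR`, `EAS`, `SAS`), the function names, and the toy proportions and coordinates are all hypothetical, and a plain Lloyd's K-means on made-up two-dimensional points stands in for clustering on real principal components. Only the 80% threshold and the reported group counts come from the text.

```python
# Assumption: ancestry proportions per individual have already been estimated
# against reference populations; the values below are invented toy data.
ANCESTRY_THRESHOLD = 0.80  # threshold stated in the abstract


def assign_group(proportions, threshold=ANCESTRY_THRESHOLD):
    """Return the group whose proportion is >= threshold, else None."""
    group, value = max(proportions.items(), key=lambda kv: kv[1])
    return group if value >= threshold else None


def kmeans(points, centroids, n_iter=20):
    """Plain Lloyd's algorithm on tuples of floats, from given start centroids."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # leave an empty cluster where it is
        if new_centroids == centroids:
            break  # converged: centroids did not move
        centroids = new_centroids
    return labels, centroids


# Hypothetical individuals; "id3" is admixed and falls below the threshold.
individuals = {
    "id1": {"AFR": 0.95, "EUR": 0.03, "EAS": 0.01, "SAS": 0.01},
    "id2": {"AFR": 0.10, "EUR": 0.85, "EAS": 0.02, "SAS": 0.03},
    "id3": {"AFR": 0.45, "EUR": 0.50, "EAS": 0.02, "SAS": 0.03},
    "id4": {"AFR": 0.01, "EUR": 0.04, "EAS": 0.05, "SAS": 0.90},
}
assigned = {iid: assign_group(p) for iid, p in individuals.items()}
retained = {iid: g for iid, g in assigned.items() if g is not None}

# Toy stand-in for the first two principal components within one group.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(points, [points[0], points[3]])

# Group sizes reported in the abstract; they sum to the stated N = 62,484.
reported_counts = {"EUR": 50685, "AFR": 6653, "SAS": 2782, "EAS": 2364}
total_retained = sum(reported_counts.values())
```

Starting K-means from fixed centroids keeps the toy run deterministic; in practice the initialization, the number of clusters, and the number of principal components used would all need to be chosen for the data at hand.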

The methods outlined here provide an avenue to leverage the UK Biobank’s deeply phenotyped data, allowing researchers to maximize its potential in the study of health and disease in individuals of non-white British ancestry.
Original language: English
Article number: 3
Number of pages: 14
Journal: Human Genomics
Issue number: 1
Publication status: Published - 29 Jan 2022

Bibliographical note

Funding Information:
AC acknowledges funding from a Medical Research Council PhD studentship (MR/N013794/1). NJT and REM acknowledge funding from the Medical Research Council (MC_UU_00011/1). NJT is the PI of the Avon Longitudinal Study of Parents and Children (Medical Research Council & Wellcome Trust 217065/Z/19/Z) and is supported by the University of Bristol NIHR Biomedical Research Centre (BRC-1215-2001). EEV, CJB, NJT, and DH acknowledge funding from the Wellcome Trust (202802/Z/16/Z). EEV, CJB, and NJT also acknowledge funding by the CRUK Integrative Cancer Epidemiology Programme (C18281/A29019). EEV and CJB are supported by Diabetes UK (17/0005587) and the World Cancer Research Fund (WCRF UK), as part of the World Cancer Research Fund International grant program (IIG_2019_2009). JZ is supported by the Academy of Medical Sciences (AMS) Springboard Award, the Wellcome Trust, the Government Department of Business, Energy and Industrial Strategy (BEIS), the British Heart Foundation and Diabetes UK (SBF006\1117). JZ is funded by the Vice-Chancellor Fellowship from the University of Bristol and is supported by Shanghai Thousand Talents Program. BA acknowledges funding from the Medical Research Council (MR/R02149x/1). The funders of the study had no role in the study design, data collection, data analysis, data interpretation, or writing of the report.

Publisher Copyright:
© 2022, The Author(s).

Structured keywords

  • ICEP

Keywords

  • Ancestry
  • UK Biobank
  • Population structure

