Personal profile
Research interests
My research interests lie in the development and application of computational methods in population health sciences. I am involved in a wide range of projects and am always interested in hearing from potential PhD students or postdoctoral researchers. A selection of my research interests:
Data mining
I am interested in understanding the mechanisms of disease, and approach this through the integration of diverse biomedical and epidemiological data and the development of software tools for analysis of these data. One of our key developments is EpiGraphDB, a database that integrates epidemiological and biomedical data to support mechanism discovery and aid causal inference. The platform is openly available and is used by academic and industry researchers worldwide.
Systematic analysis of potential interventions
The MR-Base platform aims to systematise causal inference using Mendelian randomization [Gib Hemani, Philip Haycock, Ben Elsworth, Matt Lyon and Jie (Chris) Zheng]. MR-Base integrates an extensive database of genome-wide association study data (the MRC-IEU OpenGWAS database) with Mendelian randomization (MR) methods in both a user-friendly web application and a comprehensive R package.
We have applied these tools to the systematic causal analysis of a wide array of risk factors and diseases and the prioritization of drug targets. OpenGWAS is openly available (hosted in Oracle Cloud) and used by thousands of academic and industry users worldwide to support MR and other post-GWAS analyses.
Drug target prioritization
Working in collaboration with major pharmaceutical companies, we have carried out systematic analyses of potential drug targets using MR, making results openly available in EpiGraphDB [Zheng et al., Nature Genetics 2020]. This approach has been summarised in an animation. We have subsequently applied it in other contexts, including neurological and psychiatric disease, and in a cross-population context for various diseases.
Literature mining and natural language processing (NLP)
The MELODI platform and the newer MELODI-Presto platform both aim to mine mechanistic pathways from the biomedical literature [Ben Elsworth]. The software searches for overlapping terms between two literature sets that represent two different entities (e.g. a risk factor and a disease). Enriched overlapping terms may represent candidate mechanisms for further investigation. MELODI is paralleled by the TeMMPo platform (developed in collaboration with WCRF), which assesses the literature for the number of publications underpinning hypothesised mechanistic pathways.
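The overlap idea described above can be illustrated with a minimal Python sketch. This is not MELODI's implementation: the function names, the toy documents, the background corpus and the simple frequency-ratio score (`min_ratio`) are all assumptions made for illustration, and a real analysis would use a proper enrichment statistic over indexed literature.

```python
from collections import Counter


def term_doc_freq(docs):
    """Return the fraction of documents in which each term appears."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc))  # count each term once per document
    return {term: c / len(docs) for term, c in counts.items()}


def overlapping_terms(set_a, set_b, background, min_ratio=1.5):
    """Terms present in both literature sets and over-represented in each
    relative to a background corpus (hypothetical scoring, for illustration)."""
    freq_a = term_doc_freq(set_a)
    freq_b = term_doc_freq(set_b)
    freq_bg = term_doc_freq(background)
    results = []
    for term in freq_a.keys() & freq_b.keys():
        bg = freq_bg.get(term)
        if not bg:
            continue  # term absent from the background: no ratio available
        score = min(freq_a[term] / bg, freq_b[term] / bg)
        if score >= min_ratio:
            results.append((term, score))
    return sorted(results, key=lambda item: (-item[1], item[0]))


# Toy "literature sets": each document is a list of extracted terms.
risk_factor_docs = [
    ["bmi", "leptin", "insulin", "study"],
    ["bmi", "leptin"],
    ["bmi", "inflammation"],
]
disease_docs = [
    ["cancer", "leptin"],
    ["cancer", "insulin", "leptin"],
    ["cancer", "study"],
]
other_docs = [
    ["study"], ["study", "gene"], ["gene"],
    ["protein"], ["study", "protein"], ["gene", "protein"],
]
background_docs = risk_factor_docs + disease_docs + other_docs

hits = overlapping_terms(risk_factor_docs, disease_docs, background_docs)
for term, score in hits:
    print(f"{term}: enrichment ratio {score:.2f}")
```

In this toy data, a generic term such as "study" occurs in both sets but is filtered out because it is no more frequent there than in the background, which is the intuition behind looking for enriched rather than merely shared terms.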
We implement NLP tools (such as text embeddings and language models) to enable the mapping of human traits across different biomedical and health datasets, with proofs-of-principle including Vectology and the NLP tool in EpiGraphDB. These approaches have been used to provide trait recommendations in the OpenGWAS database. We are also working on natural language interfaces to knowledge graphs (such as EpiGraphDB), and have recently implemented the ASQ EpiGraphDB platform as a proof-of-principle of this approach.
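As a rough illustration of trait mapping by vector similarity, the sketch below matches a free-text trait label to the closest label in a candidate list using cosine similarity. It is only a stand-in under stated assumptions: character-trigram counts replace the text embeddings from language models mentioned above, and the function names and trait labels are hypothetical rather than taken from Vectology, EpiGraphDB or OpenGWAS.

```python
import math
from collections import Counter


def trigram_vector(text):
    """Represent a label as counts of character trigrams (a crude
    stand-in for a learned text embedding)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_match(query, candidates):
    """Return the candidate label most similar to the query label."""
    query_vec = trigram_vector(query)
    return max(candidates, key=lambda c: cosine(query_vec, trigram_vector(c)))


# Hypothetical trait labels from two differently curated datasets.
candidate_traits = ["Body mass index", "Coronary heart disease", "Type 2 diabetes"]
match = best_match("body mass index (BMI)", candidate_traits)
print(match)
```

Swapping `trigram_vector` for a function that returns dense embeddings would leave the ranking logic unchanged, which is why the vectorisation step is kept separate here.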
Machine learning
I am interested in the application of machine learning approaches to molecular data and, with Colin Campbell, have published tools that predict the functional effects of genetic variants (the widely used FATHMM suite of tools), haploinsufficiency (HIPred) and breast cancer survival (FS-MKL).
Epigenetics
As co-I of the BBSRC-funded ARIES project I led the bioinformatics work package responsible for generating, quality-controlling and normalizing the data, and have subsequently been involved in over 20 papers utilizing these data (including a major methylation QTL analysis published in Genome Biology in 2016). The methylation QTLs derived from the ARIES data are presented in our online mQTLdb, and ongoing work with the GoDMC consortium will substantially extend the scale of this analysis.
Other software
Other software tools I have overseen include: FATHMM (Shihab), mQTLdb (Shihab), TeMMPo and GTB (Shihab) (see MRC-IEU software page).
See my Scopus and Google Scholar pages for publications.
Research group and funding
My group currently comprises 6 postdoctoral researchers and 9 PhD students. I lead a programme in Data Mining in the MRC Integrative Epidemiology Unit. As co-investigator on the CRUK Integrative Cancer Epidemiology programme I lead a bioinformatics cross-cutting strand, and with the Bristol NIHR Biomedical Research Centre I co-lead a work-strand within the Translational Population Sciences theme. I am an Executive Board member for the ALSPAC cohort.
Postgraduate research career support
I am a co-director of the Wellcome Molecular, Genetic and Lifecourse Epidemiology PhD programme and PGR co-director for Bristol Medical School.
Keywords
- Bioinformatics
- Data Science
- Population Health
Research output
- Assessing the effects of hyperparameters on knowledge graph embedding quality
Lloyd, O., Gaunt, T. R. & Liu, Y., 6 May 2023, In: Journal of Big Data. 10, 1, 59.
Research output: Contribution to journal › Article (Academic Journal) › peer-review
Open Access
- Association between genetically proxied PCSK9 inhibition and prostate cancer risk: A Mendelian randomisation study
the PRACTICAL Consortium, 3 Jan 2023, In: PLoS Medicine. 20, 1, 23 p., e1003988.
Research output: Contribution to journal › Article (Academic Journal) › peer-review
Open Access
- Biomedical consequences of elevated cholesterol-containing lipoproteins and apolipoproteins on cardiovascular and non-cardiovascular outcomes
Schmidt, A. F., Joshi, R., Gordillo-Marañón, M., Drenos, F., Gaunt, T. R. & Lawlor, D. A., 20 Jan 2023, In: Communications Medicine. 3, 1, 9.
Research output: Contribution to journal › Article (Academic Journal) › peer-review
Open Access
Datasets
- Additional file 1 of Assessing the effects of hyperparameters on knowledge graph embedding quality
Lloyd, O. (Creator), Liu, Y. (Creator) & Gaunt, T. R. (Creator), figshare, 2023
DOI: 10.6084/m9.figshare.22775962, https://springernature.figshare.com/articles/dataset/Additional_file_1_of_Assessing_the_effects_of_hyperparameters_on_knowledge_graph_embedding_quality/22775962
Dataset
- MRC IEU UK Biobank GWAS pipeline version 2
Mitchell, R. E. (Creator), Elsworth, B. L. (Creator), Mitchell, R. (Creator), Raistrick, C. A. (Creator), Paternoster, L. (Creator), Hemani, G. (Creator) & Gaunt, T. R. (Creator), University of Bristol, 19 Feb 2019
DOI: 10.5523/bris.pnoat8cxo0u52p6ynfaekeigi, http://data.bris.ac.uk/data/dataset/pnoat8cxo0u52p6ynfaekeigi
Dataset
- MRC IEU UK Biobank GWAS pipeline version 1
Mitchell, R. E. (Creator), Elsworth, B. (Creator), Raistrick, C. (Creator), Paternoster, L. (Creator), Hemani, G. (Creator) & Gaunt, T. R. (Creator), University of Bristol, 14 Dec 2017
DOI: 10.5523/bris.2fahpksont1zi26xosyamqo8rr, http://data.bris.ac.uk/data/dataset/2fahpksont1zi26xosyamqo8rr
Dataset
Prizes
- Assessing capability of large AI models in text mining of cancer studies
Liu, Yi (Recipient), Xu, Zhaozhen (Recipient), Sobczyk-Barad, Maria K (Recipient), Gaunt, Tom R (Recipient) & Simpson, Edwin D. (Recipient), 19 Sept 2023
Prize: Prizes, Medals, Awards and Grants