Integrating Mendelian randomization and literature-mined evidence for breast cancer risk factors

Marina Vabistsevits*, Tim Robinson, Benjamin Elsworth, Yi Liu, Tom Gaunt

*Corresponding author for this work

Research output: Contribution to journal, Article (Academic Journal), peer-reviewed

Abstract

Objective:
An increasing challenge in population health research is efficiently utilising the wealth of data available from multiple sources to investigate disease mechanisms and identify potential intervention targets. The use of biomedical data integration platforms can facilitate evidence triangulation from these different sources, improving confidence in causal relationships of interest. In this work, we aimed to integrate Mendelian randomization (MR) and literature-mined evidence from the EpiGraphDB biomedical knowledge graph to build a comprehensive overview of risk factors for developing breast cancer.

Methods:
We utilised MR-EvE (“Everything-vs-Everything”) data to identify candidate risk factors for breast cancer and generate hypotheses for potential mediators of their effect. We also integrated this data with literature-mined relationships, which were extracted by overlapping the literature spaces of risk factors and breast cancer. The literature-based discovery (LBD) results were then validated with two-step MR to triangulate the findings from the two data sources.
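
The idea of overlapping two literature spaces can be illustrated with a minimal sketch. It assumes, purely for illustration, that each literature space is a list of subject-predicate-object triples, and that a candidate intermediate is any term appearing downstream of the risk factor in one space and upstream of breast cancer in the other. The `Triple` type, the `overlap_intermediates` function, and all term names are hypothetical placeholders; this is not the authors' pipeline or the EpiGraphDB API, and the toy triples are not results from the study.

```python
from typing import Iterable, List, NamedTuple


class Triple(NamedTuple):
    """A literature-mined relationship (hypothetical structure)."""
    subject: str
    predicate: str
    obj: str


def overlap_intermediates(
    risk_space: Iterable[Triple],
    outcome_space: Iterable[Triple],
    risk_factor: str,
    outcome: str,
) -> List[str]:
    """Return terms linking the risk factor to the outcome across two spaces."""
    # Terms that the risk factor is reported to act on.
    downstream = {t.obj for t in risk_space if t.subject == risk_factor}
    # Terms that are reported to act on the outcome.
    upstream = {t.subject for t in outcome_space if t.obj == outcome}
    # The overlap gives candidate intermediates; drop the endpoints themselves.
    return sorted((downstream & upstream) - {risk_factor, outcome})


# Placeholder triples, invented only to exercise the function.
risk_space = [
    Triple("risk_factor_X", "STIMULATES", "term_A"),
    Triple("risk_factor_X", "INHIBITS", "term_B"),
    Triple("term_C", "AFFECTS", "risk_factor_X"),
]
outcome_space = [
    Triple("term_B", "ASSOCIATED_WITH", "breast cancer"),
    Triple("term_D", "CAUSES", "breast cancer"),
]

candidates = overlap_intermediates(
    risk_space, outcome_space, "risk_factor_X", "breast cancer"
)
print(candidates)  # ['term_B']
```

In this sketch, each returned candidate would then be a hypothesis to test in a follow-up step such as two-step MR (risk factor to intermediate, then intermediate to outcome), rather than evidence of mediation in itself.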

Results:
We identified 129 novel and established lifestyle risk factors and molecular traits with evidence of an effect on breast cancer, and made the MR results available in an R/Shiny app (https://mvab.shinyapps.io/MR_heatmaps/). We developed an LBD approach for identifying potential mechanistic intermediates of identified risk factors. We present the results of MR and literature evidence integration for two case studies (childhood body size and HDL-cholesterol), demonstrating their complementary functionalities.

Conclusion:
We demonstrate that MR-EvE data offers an efficient hypothesis-generating approach for identifying disease risk factors. Moreover, we show that integrating MR evidence with literature-mined data may be used to identify causal intermediates and uncover the mechanisms behind the disease.

Original language: English
Article number: 104810
Number of pages: 19
Journal: Journal of Biomedical Informatics
Volume: 165
Early online date: 22 Mar 2025
Publication status: Published - 1 May 2025

Bibliographical note

Publisher Copyright:
© 2025 The Authors
