Letter regarding, “Association between the use of aspirin and risk of lung cancer: results from pooled cohorts and Mendelian randomization analyses”

* Aayah Nounu an0435@bristol.ac.uk


Dear Editor,
We are writing to comment on the paper "Association between the use of aspirin and risk of lung cancer: results from pooled cohorts and Mendelian randomization analyses" by Jiang et al. (Jiang et al. 2020). Findings from this Mendelian randomization (MR) study suggest that aspirin use decreases the incidence of lung cancer, specifically overall lung cancer and squamous cell carcinoma.
Observational studies have provided some evidence for the use of aspirin as a chemopreventive agent in lung cancer (relative risk 0.93, 95% CI 0.87-1.00) (Qiao et al. 2018). However, few randomized trials have been carried out to answer this question. Using genetic variation as a method of randomization to aspirin use and testing for association with specific cancers within an MR framework is, therefore, an attractive method to appraise causality before long and costly trials are conducted.
The authors used genetic variants from a genome-wide association study (GWAS) on aspirin use conducted by the Neale Lab (UK Biobank-Neale lab) to test whether aspirin intake was causally related to lung cancer incidence. However, there are some potential concerns with this method of instrumenting aspirin use.
One of the first major concerns with using instruments from a GWAS of drug use is disentangling the genetic variants for the drug from those for the drug's indication. Using the MRC IEU OpenGWAS database (Elsworth et al. 2020), we found that the SNPs used to predict aspirin use in this study (rs583104, rs2521501, rs10455872, rs73015016, rs7412, rs1831733, rs117733303) have all been shown to be associated with coronary artery disease (CAD) risk at least at genome-wide significance (van der Harst and Verweij 2018; Nikpay et al. 2015). Therefore, it may be that these SNPs increase the risk of CAD and that these individuals are being instructed to take aspirin as a preventative measure, thereby confounding the SNP association with aspirin use.
Many of the SNPs used to instrument aspirin use are also associated with a large number of other risk factors in the MRC IEU OpenGWAS database (Elsworth et al. 2020). This raises the potential for the violation of two of the MR assumptions: no confounding (independence assumption) and no horizontal pleiotropy (exclusion restriction assumption). Specifically, SNPs rs583104, rs10455872, rs73015016, rs7412 and rs117733303 have previously been associated with levels of low-density lipoprotein (LDL) cholesterol at genome-wide significance (Global Lipids Genetics Consortium, 2013; Prins et al. 2017), with increasing LDL-cholesterol levels leading to increased risk of coronary heart disease (Richardson et al. 2020). If cardiovascular risk factors are causally related to lung cancer incidence, as suggested by a previous MR study in which increasing LDL-cholesterol levels were inversely associated with lung cancer incidence (OR 0.90, 95% CI 0.84-0.97 per SD of 38 mg/dl) (Carreras-Torres et al. 2017), then this may introduce confounding into the MR analysis.
Alternatively, associations between the genetic variants used to instrument aspirin use and other risk factors could indicate violation of the exclusion restriction assumption of MR, namely that the SNP affects the outcome only via the exposure of interest (Lawlor et al. 2008). When SNPs are also associated with other risk factors that may affect disease risk, this is termed horizontal pleiotropy (Burgess and Thompson 2013); however, we acknowledge that MR Egger regression was conducted and found little evidence of pleiotropy (Bowden et al. 2015). Furthermore, a weighted median approach was carried out, and results consistent with the inverse-variance weighted (IVW) estimate were observed for overall lung cancer (OR 1.32 × 10⁻⁴, 95% CI 1.69 × 10⁻⁷ to 0.10, P value: 0.05), indicating a similar causal effect even if up to 50% of the weight came from invalid instruments (Bowden et al. 2015).
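The logic behind the IVW estimate and the MR Egger intercept test mentioned above can be sketched in a few lines of Python. This is a minimal illustration only, not the code used by Jiang et al. or in our replication: the function names and the summary statistics are hypothetical, the IVW estimate is the simple fixed-effect version, and the Egger fit returns point estimates without standard errors.

```python
from math import sqrt


def ivw_estimate(bx, by, se_y):
    """Fixed-effect IVW estimate: weighted regression of SNP-outcome
    associations (by) on SNP-exposure associations (bx) through the origin."""
    weights = [1.0 / s ** 2 for s in se_y]
    num = sum(w * x * y for w, x, y in zip(weights, bx, by))
    den = sum(w * x * x for w, x in zip(weights, bx))
    return num / den, sqrt(1.0 / den)


def egger_fit(bx, by, se_y):
    """MR Egger style weighted regression with a free intercept.
    Returns (intercept, slope); an intercept away from zero suggests
    directional horizontal pleiotropy."""
    # Orient every SNP so that its exposure association is positive.
    xs = [abs(x) for x in bx]
    ys = [y if x >= 0 else -y for x, y in zip(bx, by)]
    weights = [1.0 / s ** 2 for s in se_y]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical summary statistics for four SNPs (not taken from any study).
bx = [0.10, 0.20, 0.30, 0.40]    # SNP-exposure associations
by = [0.05, 0.10, 0.15, 0.20]    # SNP-outcome associations
se_y = [0.02, 0.02, 0.03, 0.03]  # standard errors of by

ivw_beta, ivw_se = ivw_estimate(bx, by, se_y)
intercept, slope = egger_fit(bx, by, se_y)
print(f"IVW estimate: {ivw_beta:.3f} (SE {ivw_se:.3f})")
print(f"Egger intercept: {intercept:.3f}, slope: {slope:.3f}")
```

In this toy data set every SNP-outcome association is exactly proportional to its SNP-exposure association, so the fitted intercept is essentially zero. Adding a constant to every value in `by` would shift the intercept by that constant, which is the pattern the Egger test is designed to detect.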
Another threat to the validity of findings is the potential for selection bias. Since many of the SNPs may also be proxying liability to CAD, this may indicate survival bias, whereby individuals with higher risk of CAD die prematurely and, therefore, do not live long enough to be diagnosed with lung cancer. A frailty analysis could be conducted in this case to re-estimate the causal effect in the presence of survival bias (Noyce et al. 2017).
A final concern with the results presented is the very large effect sizes (overall lung cancer, OR 0.042, 95% CI 0.003-0.564 and squamous cell lung cancer, OR 0.002, 95% CI 1.21 × 10⁻⁵ to 0.301) obtained from the MR analysis, which are much larger in magnitude than the corresponding observational estimates presented in the study (overall lung cancer relative risk (RR): 0.95, 95% CI 0.91-0.98, P value: 0.004; and squamous cell lung cancer RR: 0.80, 95% CI 0.65-0.98, P value: 0.034). While it is possible for MR to estimate larger causal effects than corresponding observational analyses (for example, in the presence of negative confounding or short-term exposure), a more detailed assessment of the genetic estimates used in the analysis is first required. In particular, the GWAS for aspirin use conducted by the Neale lab was conducted on the absolute risk difference scale rather than the log-odds scale and, therefore, the MR effect estimates presented are unlikely to be directly comparable to those obtained in the accompanying observational analysis.
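To illustrate why the scale of the SNP-exposure association matters, the sketch below rescales a risk-difference coefficient to an approximate log-odds coefficient by dividing by μ(1 − μ), where μ is the fraction of cases in the exposure GWAS. This is a commonly used approximation for linear-model coefficients of binary traits; whether it matches the exact formula applied in any particular analysis is an assumption here. All numbers and function names are hypothetical and chosen only to show the direction and size of the change.

```python
from math import exp


def risk_difference_to_log_odds(beta_rd, case_fraction):
    """Approximate log-odds coefficient from a risk-difference coefficient,
    assuming beta_log_odds ~= beta_rd / (mu * (1 - mu))."""
    if not 0.0 < case_fraction < 1.0:
        raise ValueError("case_fraction must lie strictly between 0 and 1")
    return beta_rd / (case_fraction * (1.0 - case_fraction))


def wald_ratio(beta_outcome, beta_exposure):
    """Single-SNP MR estimate: SNP-outcome effect over SNP-exposure effect."""
    return beta_outcome / beta_exposure


# Hypothetical inputs (not taken from the Neale lab GWAS or any other study).
beta_exposure_rd = 0.01  # SNP effect on drug use, risk-difference scale
case_fraction = 0.2      # assumed fraction of drug users in the GWAS sample
beta_outcome = -0.02     # SNP effect on the outcome, log-odds scale

beta_exposure_log_odds = risk_difference_to_log_odds(beta_exposure_rd, case_fraction)
ratio_rd = wald_ratio(beta_outcome, beta_exposure_rd)
ratio_log_odds = wald_ratio(beta_outcome, beta_exposure_log_odds)

print(f"OR using risk-difference exposure scale: {exp(ratio_rd):.3f}")
print(f"OR using log-odds exposure scale:        {exp(ratio_log_odds):.3f}")
```

Because the risk-difference coefficient is much smaller than its log-odds counterpart, dividing by it inflates the ratio estimate, and the resulting odds ratio looks far more extreme than the same data analysed on the log-odds scale.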
Using the same SNPs and datasets, we first replicated the results to confirm that the same exposure and outcome datasets were being used. We then transformed the SNP-exposure associations to a more interpretable scale for binary traits (the log-odds scale) using the formula provided by Elsworth et al. (2019) and re-conducted the MR analysis (code to reproduce the analysis in this paper: https://github.com/an0435/aspirin_lung_cancer_MR/). The transformed MR results can be interpreted as the OR for lung cancer per doubling of aspirin use. After conversion, a doubling of aspirin use decreases the risk of lung cancer and squamous cell lung cancer by 31% and 52%, respectively (lung cancer IVW OR 0.69, 95% CI 0.51-0.94 and squamous cell lung cancer IVW OR 0.48, 95% CI 0.27-0.87) (Table 1). Using the numbers of cases and sample sizes from the cohort studies listed in the paper, the prevalence of lung cancer was 0.51% (72,782/14,369,951 × 100). When a disease is rare in a population (prevalence below 10%), as is the case for lung cancer, the OR can be interpreted as a relative risk, making the results from the observational analysis and MR comparable (Sedgwick 2014).
One alternative approach to study drug effects using MR is to identify SNPs that mimic a drug's mechanism of action by investigating SNPs in the genes of the targeted protein (Gill et al. 2019). For example, statins inhibit the enzyme 3-hydroxy-3-methyl-glutaryl-coenzyme A reductase (HMGCR), resulting in reduced levels of LDL-cholesterol (Ference et al. 2015). Based on this understanding, SNPs that are in or around (within 100 kb) the HMGCR gene and that are associated with LDL-cholesterol have proven a useful method to instrument exposure to statins in MR studies (Ference et al. 2015, 2016).
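The drug-target approach described above amounts to a positional and statistical filter on GWAS results. The following sketch shows one way such a filter could look. The gene coordinates, variant identifiers and P values are hypothetical placeholders rather than real HMGCR data, and the use of P < 5 × 10⁻⁸ as the threshold for association with LDL-cholesterol is an assumption of this example.

```python
from dataclasses import dataclass

GENOME_WIDE_P = 5e-8  # conventional genome-wide significance threshold (assumed here)
WINDOW_BP = 100_000   # 100 kb either side of the gene


@dataclass(frozen=True)
class Variant:
    rsid: str
    chrom: str
    pos: int      # base-pair position
    p_ldl: float  # P value for association with LDL-cholesterol


def select_cis_variants(variants, chrom, gene_start, gene_end,
                        window=WINDOW_BP, p_threshold=GENOME_WIDE_P):
    """Keep variants in or within `window` bp of the gene that are also
    associated with the biomarker below `p_threshold`."""
    lower = max(0, gene_start - window)
    upper = gene_end + window
    return [
        v for v in variants
        if v.chrom == chrom and lower <= v.pos <= upper and v.p_ldl < p_threshold
    ]


# Hypothetical gene boundaries and variants, for illustration only.
GENE_CHROM, GENE_START, GENE_END = "5", 1_000_000, 1_025_000
variants = [
    Variant("rs_hyp_1", "5", 950_000, 1e-12),    # within 100 kb, significant
    Variant("rs_hyp_2", "5", 1_010_000, 1e-4),   # in gene, not significant
    Variant("rs_hyp_3", "5", 1_120_000, 3e-9),   # within 100 kb, significant
    Variant("rs_hyp_4", "5", 1_300_000, 1e-20),  # significant, too far away
    Variant("rs_hyp_5", "7", 1_000_500, 1e-15),  # significant, wrong chromosome
]

selected = select_cis_variants(variants, GENE_CHROM, GENE_START, GENE_END)
print([v.rsid for v in selected])
```

A real analysis would typically also account for linkage disequilibrium between the retained variants before using them as instruments; that step is omitted here for brevity.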
In the case of drugs such as aspirin that have multiple targets, it may be useful to conduct proteomic analysis to identify proteins targeted by the drug and instrument the effect of changes in mRNA/protein levels on cancer risk (Nounu et al. 2020). Instrumenting levels of mRNA/protein expression provides a continuous exposure, compared to aspirin use which is a binary variable and, therefore, results in complications when conducting and interpreting MR analyses (Burgess and Labrecque 2018).
Whilst we acknowledge that conducting MR studies of drug use and cancer incidence would provide much needed answers for clinical intervention, careful consideration in MR study design is needed when instrumenting drug use in MR to avoid the potential pitfalls highlighted. Methods and guidelines are now available for informing best practice in MR (Burgess et al. 2020; Davey Smith et al. 2019) so that appropriate inference can be made.
Code availability The code to reproduce this analysis can be found at https://github.com/an0435/aspirin_lung_cancer_MR/.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Availability of data and material Data are freely available on the online platform for MR Base (http://app.mrbase.org/) as well as through the TwoSampleMR R package (github.com/MRCIEU/TwoSampleMR).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.