Abstract
Motivation: Next generation sequencing technologies have accelerated the discovery of single nucleotide variants (SNVs) in the human genome, stimulating the development of predictors for classifying which of these variants are likely functional in disease, and which neutral. Recently we proposed CScape, a method for discriminating between cancer driver mutations and presumed benign variants (Rogers et al., 2017a). For the neutral class this method relied on benign germline variants found in the 1000 Genomes Project database. Discrimination could therefore be influenced by the distinction of germline versus somatic, rather than neutral versus disease-driver. This motivates the current paper in which we consider predictive discrimination between recurrent and rare somatic single point mutations based solely on using cancer data, and the distinction between these two somatic classes and germline single point mutations.
Results: For somatic point mutations in coding and non-coding regions of the genome, we propose CScape-somatic, an integrative classifier for predictively discriminating between recurrent and rare variants in the human cancer genome. In the present study we use purely cancer genome data and investigate the distinction between minimal occurrence and significantly recurrent somatic single point mutations in the human cancer genome. We show that this type of predictive distinction can give novel insight, and may deliver more meaningful prediction in both coding and non-coding regions of the cancer genome. Tested on somatic mutations, CScape-somatic outperforms alternative methods, reaching 74% balanced accuracy in coding regions and 69% in non-coding regions, while even higher accuracy may be achieved using thresholds to isolate high-confidence predictions.
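The abstract reports balanced accuracy and notes that thresholds can isolate high-confidence predictions. As a rough illustration only, the sketch below computes balanced accuracy (the mean of sensitivity and specificity) and then recomputes it on the subset of predictions whose score lies far from the decision cutoff. The function names, the 0.5 cutoff, the margin value, and the scores are all assumptions made up for this example; they are not taken from the paper or from the CScape-somatic software.

```python
def balanced_accuracy(labels, preds):
    """Mean of sensitivity and specificity for binary labels.

    Assumed encoding for this sketch: 1 = recurrent, 0 = rare.
    """
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (tp / pos + tn / neg)


def confident_subset(labels, scores, cutoff=0.5, margin=0.0):
    """Keep only predictions whose score is at least `margin` from `cutoff`.

    Both `cutoff` and `margin` are hypothetical parameters for illustration.
    """
    kept_labels, kept_preds = [], []
    for y, s in zip(labels, scores):
        if abs(s - cutoff) >= margin:
            kept_labels.append(y)
            kept_preds.append(1 if s >= cutoff else 0)
    return kept_labels, kept_preds


# Made-up illustrative data, not results from the paper.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.45, 0.1, 0.2, 0.55, 0.6, 0.3]

# All predictions, thresholded at the cutoff.
all_labels, all_preds = confident_subset(labels, scores, margin=0.0)
# Only predictions scoring at least 0.25 away from the cutoff.
hi_labels, hi_preds = confident_subset(labels, scores, margin=0.25)

print(balanced_accuracy(all_labels, all_preds))
print(balanced_accuracy(hi_labels, hi_preds), len(hi_labels))
```

In this toy data the stricter margin raises balanced accuracy but keeps only half of the variants, which is the usual trade-off of such thresholds: higher confidence at the cost of coverage.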
Original language | English |
---|---|
Article number | btaa242 |
Journal | Bioinformatics |
DOIs | |
Publication status | Published - 13 Apr 2020 |
Publication title: CScape-somatic: distinguishing driver and passenger point mutations in the cancer genome
Projects
- 1 Finished
- Novel Methodology for Predicting the Functional Effects of Genetic Variation
  Campbell, I. C. G. (Principal Investigator)
  1/06/15 → 31/05/18
  Project: Research