
Impact of Variable RNA-Sequencing Depth on Gene Expression Signatures and Target Compound Robustness: Case Study Examining Brain Tumor (Glioma) Disease Progression

Research output: Contribution to journal (Article)

  • Alexey Stupnikov
  • Paul G O'Reilly
  • Caitriona E McInerney
  • Aideen C Roddy
  • Philip D Dunne
  • Alan Gilmore
  • Hayley P Ellis
  • Tom Flannery
  • Estelle Healy
  • Stuart A McIntosh
  • Kienan Savage
  • Kathreena M Kurian
  • Frank Emmert-Streib
  • Kevin M Prise
  • Manuel Salto-Tellez
  • Darragh G McArt
Original language: English
Article number: 14
Number of pages: 16
Journal: JCO Precision Oncology
Date: Accepted/In press - 26 Jul 2018
Date: Published (current) - 13 Sep 2018


Purpose: Gene expression profiling can uncover biologic mechanisms underlying disease and is important in drug development. RNA sequencing (RNA-seq) is routinely used to assess gene expression, but costs remain high. Sample multiplexing reduces RNA-seq costs; however, multiplexed samples have lower cDNA sequencing depth, which can hinder accurate detection of differential gene expression. The impact of altered sequencing depth on RNA-seq-based downstream analyses is not known, including gene expression connectivity mapping, a method used to identify potential therapeutic compounds for repurposing.

Methods: In this study, published RNA-seq profiles from patients with brain tumor (glioma) were assembled into two disease progression gene signature contrasts for astrocytoma. Available treatments for glioma have limited effectiveness, rendering this a disease of poor clinical outcome. Gene signatures were subsampled to simulate sequencing alterations and analyzed in connectivity mapping to investigate target compound robustness.
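The abstract does not describe how the subsampling was carried out. As a rough illustration only, one common way to simulate reduced sequencing depth is binomial thinning of a read-count matrix, where each read is kept independently with a fixed probability. The function name `thin_counts`, the toy matrix, and the 25% retention fraction below are assumptions made for this sketch, not details taken from the study.

```python
import numpy as np


def thin_counts(counts, fraction, seed=0):
    """Binomially thin read counts to roughly `fraction` of the original depth.

    Hypothetical helper for illustration; not the study's procedure.
    Each read is kept independently with probability `fraction`.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=np.int64)
    return rng.binomial(counts, fraction)


# Toy genes-by-samples count matrix (made-up numbers).
counts = np.array([[1000, 5, 0],
                   [200, 50, 3]])
thinned = thin_counts(counts, 0.25)
```

Under this scheme, low-count genes are the most likely to drop to zero, which is one way data reduction could remove low-abundance genes from a signature.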

Results: Data loss in gene signatures caused significant connections to be lost, gained, or consistently identified. The most accurate gene signature contrast, which had consistent patient gene expression profiles, was more resilient to data loss and identified robust target compounds. Target compounds lost included candidate compounds of potential clinical utility in glioma (eg, suramin, dasatinib). Lost connections may have been linked to low-abundance genes in the gene signature that closely characterized the disease phenotype. Consistently identified connections may have been related to highly expressed, abundant genes that remained present in gene signatures despite data reductions. Potential noise in the findings included false-positive connections that were gained when data loss modified the gene signature.
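The three outcomes described in the results (lost, gained, and consistently identified connections) can be expressed as set operations on the compounds found significant before and after data reduction. The sketch below assumes each analysis yields a collection of significant compound names; `compare_connections` and the placeholder compound names are hypothetical and carry no results from the study.

```python
def compare_connections(full, reduced):
    """Classify significant connections relative to the full-depth signature.

    Hypothetical helper for illustration only.
    `full` and `reduced` are iterables of compound names found significant
    with the full and the subsampled gene signature, respectively.
    """
    full, reduced = set(full), set(reduced)
    return {
        "consistent": full & reduced,  # significant in both analyses
        "lost": full - reduced,        # significant only at full depth
        "gained": reduced - full,      # significant only after data loss
    }


# Placeholder compound names, not findings from the study.
result = compare_connections(
    full=["compound_A", "compound_B", "compound_C"],
    reduced=["compound_B", "compound_C", "compound_D"],
)
```

In this framing, compounds in the "gained" set are the candidates for false-positive connections, and the size of the "consistent" set gives a simple measure of robustness to data loss.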

Conclusion: The findings highlight the need for accurate gene signatures in connectivity mapping, which should improve the clinical utility of future target compound discoveries.


  • Full-text PDF (final published version)

    Rights statement: This is the final published version of the article (version of record). It first appeared online via ASCO at . Please refer to any applicable terms of use of the publisher.

    Final published version, 2.49 MB, PDF document

    Licence: CC BY

