High throughput discovery of protein variants using proteomics informed by transcriptomics

David Matthews, Conrad Bessant, Shyamasree Saha

Research output: Contribution to journal > Article (Academic Journal) > peer-review



Proteomics informed by transcriptomics (PIT), in which proteomic MS/MS spectra are searched against open reading frames derived from de novo assembled transcripts, can reveal previously unknown translated genomic elements (TGEs). However, determining which TGEs are truly novel, which are variants of known proteins, and which are simply artefacts of poor sequence assembly is challenging. We have designed and implemented an automated solution that classifies putative TGEs by comparing them to reference proteome sequences. This allows large-scale identification of sequence polymorphisms, splice isoforms and novel TGEs supported by the presence or absence of variant-specific peptide evidence. Unlike previously reported methods, ours does not require a catalogue of known variants, making it more applicable to non-model organisms. The method was validated on human PIT data, then applied to Mus musculus, Pteropus alecto and Aedes aegypti. Novel discoveries included 60 human protein isoforms, 32,392 polymorphisms in P. alecto, and TGEs with non-methionine start sites including tyrosine.
Original language: English
Journal: Nucleic Acids Research
Early online date: 30 Apr 2018
Publication status: E-pub ahead of print - 30 Apr 2018

