TY - CHAP
T1 - Uncertainty-aware protein-level quantification and differential expression analysis of proteomics data with seaMass
AU - Phillips, Alexander
AU - Unwin, Richard D.
AU - Hubbard, Simon
AU - Dowsey, Andrew
PY - 2021
Y1 - 2021
N2 - seaMass is an R package for protein-level quantification, normalisation and differential expression analysis of proteomics mass spectrometry data after peptide identification, protein grouping and feature-level quantification. Using the concept of a blocked experimental design, seaMass can analyse all common discovery proteomics paradigms including label-free (e.g. Waters Progenesis input), SILAC (e.g. MaxQuant input), isotope labelling (e.g. SCIEX ProteinPilot iTRAQ and Thermo ProteomeDiscoverer TMT input) and data-independent acquisition (e.g. OpenSWATH-PyProphet input), and is able to scale to studies with hundreds of assays or more. By utilising hierarchical Bayesian modelling, seaMass assesses the quantification reliability of each feature and peptide across assays so that only those in consensus influence the resulting protein group quantification strongly. Similarly, unexplained variation in each individual assay is captured, providing both a metric for quality control and automatic down-weighting of suspect assays. To achieve this, each protein group-level quantification output by seaMass is accompanied by the standard deviation of its posterior uncertainty. seaMass integrates a flexible differential expression analysis subsystem with false discovery rate control based on the popular MCMCglmm package for Bayesian mixed-effects modelling, and also provides uncertainty-aware principal components analysis. We provide a description for using seaMass to perform an end-to-end analysis using a real dataset associated with a published clinical proteomics study.
AB - seaMass is an R package for protein-level quantification, normalisation and differential expression analysis of proteomics mass spectrometry data after peptide identification, protein grouping and feature-level quantification. Using the concept of a blocked experimental design, seaMass can analyse all common discovery proteomics paradigms including label-free (e.g. Waters Progenesis input), SILAC (e.g. MaxQuant input), isotope labelling (e.g. SCIEX ProteinPilot iTRAQ and Thermo ProteomeDiscoverer TMT input) and data-independent acquisition (e.g. OpenSWATH-PyProphet input), and is able to scale to studies with hundreds of assays or more. By utilising hierarchical Bayesian modelling, seaMass assesses the quantification reliability of each feature and peptide across assays so that only those in consensus influence the resulting protein group quantification strongly. Similarly, unexplained variation in each individual assay is captured, providing both a metric for quality control and automatic down-weighting of suspect assays. To achieve this, each protein group-level quantification output by seaMass is accompanied by the standard deviation of its posterior uncertainty. seaMass integrates a flexible differential expression analysis subsystem with false discovery rate control based on the popular MCMCglmm package for Bayesian mixed-effects modelling, and also provides uncertainty-aware principal components analysis. We provide a description for using seaMass to perform an end-to-end analysis using a real dataset associated with a published clinical proteomics study.
M3 - Chapter in a book
T3 - Methods in Molecular Biology
BT - Statistical methods for proteomics
PB - Springer
ER -