seaMass is an R package for protein-level quantification, normalization, and differential expression analysis of proteomics mass spectrometry data after peptide identification, protein grouping, and feature-level quantification. Using the concept of a blocked experimental design, seaMass can analyze all common discovery proteomics paradigms, including label-free (e.g., Waters Progenesis input), SILAC (e.g., MaxQuant input), isotope labelling (e.g., SCIEX ProteinPilot iTRAQ and Thermo ProteomeDiscoverer TMT input), and data-independent acquisition (e.g., OpenSWATH-PyProphet input), and can scale to studies with hundreds of assays or more. By utilizing hierarchical Bayesian modelling, seaMass assesses the quantification reliability of each feature and peptide across assays so that only those in consensus strongly influence the resulting protein group quantification. Similarly, unexplained variation in each individual assay is captured, providing both a metric for quality control and automatic down-weighting of suspect assays. To achieve this, each protein group-level quantification output by seaMass is accompanied by the standard deviation of its posterior uncertainty. Moreover, seaMass integrates a flexible differential expression analysis subsystem with false discovery rate control, based on the popular MCMCglmm package for Bayesian mixed-effects modelling, and also provides uncertainty-aware principal components analysis. We describe how to use seaMass to perform an end-to-end analysis of a real dataset associated with a published clinical proteomics study.
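To illustrate why a posterior standard deviation reported alongside each quantification is useful downstream, the following minimal sketch combines several estimates by inverse-variance weighting, so that uncertain values contribute less. It is not the seaMass model (which is hierarchical and Bayesian) and does not use the seaMass API; the function name `precision_weighted_mean` and the example values are hypothetical and chosen purely for illustration.

```python
import math


def precision_weighted_mean(estimates, sds):
    """Combine estimates, weighting each by the inverse of its variance.

    Illustrative helper only (hypothetical name, not part of seaMass).
    Returns the weighted mean and the standard deviation of that mean.
    """
    if len(estimates) != len(sds) or not estimates:
        raise ValueError("estimates and sds must be non-empty and of equal length")
    if any(sd <= 0 for sd in sds):
        raise ValueError("standard deviations must be positive")
    # A small standard deviation gives a large weight, and vice versa.
    weights = [1.0 / (sd * sd) for sd in sds]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, estimates)) / total
    return mean, math.sqrt(1.0 / total)


# Hypothetical log2 ratios for one protein group measured in three assays;
# the third assay is an outlier but also carries a large uncertainty.
log2_ratios = [1.0, 1.2, 3.0]
posterior_sds = [0.1, 0.1, 1.0]

combined, combined_sd = precision_weighted_mean(log2_ratios, posterior_sds)
unweighted = sum(log2_ratios) / len(log2_ratios)
```

In this made-up example the weighted estimate stays close to the two precise assays, whereas the unweighted mean is pulled towards the uncertain outlier.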
|Field|Value|
|---|---|
|Title of host publication|Statistical Analysis of Proteomic Data|
|Number of pages|22|
|Publication status|Published - 25 Aug 2021|
|Name|Methods in Molecular Biology|
Bibliographical note
Funding Information:
The development of seaMass was supported by BBSRC grants BB/M024954/2 and BB/R021430/1, as well as MRC grant MR/N028457/1.
© 2023, The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.
- Bayesian modelling
- Differential expression analysis
- False discovery rate control
- Protein quantification
- Quantitative proteomics