Journal ArticleDOI

mProphet: automated data processing and statistical validation for large-scale SRM experiments

TL;DR: In this article, the authors present mProphet, a fully automated system that computes accurate error rates for the identification of targeted peptides in SRM data sets and maximizes specificity and sensitivity by combining relevant features in the data into a statistical model.
Abstract: Selected reaction monitoring (SRM) is a targeted mass spectrometric method that is increasingly used in proteomics for the detection and quantification of sets of preselected proteins at high sensitivity, reproducibility and accuracy. Currently, data from SRM measurements are mostly evaluated subjectively by manual inspection on the basis of ad hoc criteria, precluding the consistent analysis of different data sets and an objective assessment of their error rates. Here we present mProphet, a fully automated system that computes accurate error rates for the identification of targeted peptides in SRM data sets and maximizes specificity and sensitivity by combining relevant features in the data into a statistical model.
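The idea of combining several features into one score and attaching an error rate to a score cutoff can be illustrated with a minimal sketch. This is not mProphet's implementation: the weights, the feature values, and the helper names `combined_score` and `decoy_fdr` are hypothetical, and the estimate assumes that a set of decoy measurements of the same size as the target set is available to model false signals.

```python
def combined_score(features, weights):
    """Weighted sum of per-peak-group sub-scores (a simple linear discriminant)."""
    return sum(w * x for w, x in zip(weights, features))


def decoy_fdr(target_scores, decoy_scores, cutoff):
    """Estimate the FDR at `cutoff` as (decoys passing) / (targets passing).

    Assumes equally many targets and decoys, so no size correction is applied.
    """
    n_target = sum(s >= cutoff for s in target_scores)
    n_decoy = sum(s >= cutoff for s in decoy_scores)
    if n_target == 0:
        return 0.0
    return min(1.0, n_decoy / n_target)


# Hypothetical weights and sub-scores, for illustration only.
weights = [0.5, 0.3, 0.2]
targets = [[0.9, 0.8, 0.9], [0.7, 0.9, 0.8], [0.2, 0.3, 0.1], [0.8, 0.7, 0.9]]
decoys = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.2], [0.6, 0.7, 0.5], [0.2, 0.1, 0.1]]

target_scores = [combined_score(f, weights) for f in targets]
decoy_scores = [combined_score(f, weights) for f in decoys]
fdr_at_cutoff = decoy_fdr(target_scores, decoy_scores, cutoff=0.5)
```

In this toy data three targets and one decoy pass the cutoff, so the estimated FDR is one third. In practice the weights would be learned from the data rather than fixed by hand.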
Citations
Journal ArticleDOI
TL;DR: A new strategy that systematically queries sample sets for the presence and quantity of essentially any protein of interest is presented, using the information available in fragment ion spectral libraries to mine the complete fragment ion maps generated using a data-independent acquisition method.

2,358 citations


Cites background from "mProphet: automated data processing..."

  • ...Data analysis in targeted proteomics essentially consists of computing the likelihood that a group of transition signal traces are derived from the targeted peptide (9)....


  • ...(9) or Skyline (33)) such as co-elution of the fragment ions...


  • ...(9): co-elution, peak shape similarity, correlation of the relative intensities with reference spectra, correlation of the relative intensities with those of a spike-in reference peptide, co-elution with spiked-in reference, and peak shape similarity with spiked-in reference....
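Two of the scores listed in the excerpt above, co-elution and peak shape similarity, can be sketched for a pair of transition traces. This is an assumption-laden illustration, not the scoring used by any of the cited tools: the helper names `pearson` and `apex_offset` and the intensity values are hypothetical, and the traces are assumed to be sampled at the same retention times.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally sampled traces (peak shape similarity)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat trace carries no shape information
    return cov / math.sqrt(var_a * var_b)


def apex_offset(a, b):
    """Distance between peak apexes in sampling points (0 means co-eluting)."""
    return abs(a.index(max(a)) - b.index(max(b)))


# Hypothetical intensity traces for three transitions.
trace_1 = [0, 2, 8, 20, 8, 2, 0]
trace_2 = [0, 1, 4, 10, 4, 1, 0]        # same shape, half the intensity
trace_shifted = [0, 0, 1, 4, 10, 4, 1]  # same shape, apex one point later

shape_same = pearson(trace_1, trace_2)
shape_shifted = pearson(trace_1, trace_shifted)
```

Traces with the same shape and apex correlate perfectly regardless of absolute intensity, while a shift of a single sampling point lowers the correlation and shows up as a non-zero apex offset.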


Journal ArticleDOI
TL;DR: For a reversed-phase LC-MS/MS analysis of nine algal strains, MS-DIAL using an enriched LipidBlast library identified 1,023 lipid compounds, highlighting the chemotaxonomic relationships between the algal strains.
Abstract: Data-independent acquisition (DIA) in liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) provides comprehensive untargeted acquisition of molecular data. We provide an open-source software pipeline, which we call MS-DIAL, for DIA-based identification and quantification of small molecules by mass spectral deconvolution. For a reversed-phase LC-MS/MS analysis of nine algal strains, MS-DIAL using an enriched LipidBlast library identified 1,023 lipid compounds, highlighting the chemotaxonomic relationships between the algal strains.

1,609 citations

Journal ArticleDOI
TL;DR: This review describes how SRM is applied in proteomics, reviews recent advances, presents selected applications and provides a perspective on the future of this powerful technology.
Abstract: Selected reaction monitoring (SRM) is a targeted mass spectrometry technique that is emerging in the field of proteomics as a complement to untargeted shotgun methods. SRM is particularly useful when predetermined sets of proteins, such as those constituting cellular networks or sets of candidate biomarkers, need to be measured across multiple samples in a consistent, reproducible and quantitatively precise manner. Here we describe how SRM is applied in proteomics, review recent advances, present selected applications and provide a perspective on the future of this powerful technology.

1,187 citations

Journal ArticleDOI
TL;DR: This work proposes a new targeted proteomics paradigm centered on the use of next generation, quadrupole-equipped high resolution and accurate mass instruments: parallel reaction monitoring (PRM), and suggests that PRM will be a promising new addition to the quantitative proteomics toolbox.

993 citations


Cites background from "mProphet: automated data processing..."

  • ...Through use of theoretical calculations, the high specificity of high mass accuracy data lays the groundwork for a generalized and statistically sound detection algorithm for reaction monitoring experiments that incorporates probability of correctness measures, as well as obviates the need for manual curation, ad hoc detection criteria (42), “decoy transitions” at the measurement-level (66), and the potentially high level of human intervention (and error) that comes with such detection strategies....


Journal ArticleDOI
17 Jun 2016 - Science
TL;DR: The amount of the oxidized form of cellular nicotinamide adenine dinucleotide (NAD+), through its effect on mitochondrial activity, is shown to be a pivotal switch that modulates muscle SC (MuSC) senescence; the NAD+ precursor nicotinamide riboside (NR) is also shown to delay senescence of neural SCs and melanocyte SCs and to increase mouse life span.
Abstract: Adult stem cells (SCs) are essential for tissue maintenance and regeneration yet are susceptible to senescence during aging. We demonstrate the importance of the amount of the oxidized form of cellular nicotinamide adenine dinucleotide (NAD(+)) and its effect on mitochondrial activity as a pivotal switch to modulate muscle SC (MuSC) senescence. Treatment with the NAD(+) precursor nicotinamide riboside (NR) induced the mitochondrial unfolded protein response and synthesis of prohibitin proteins, and this rejuvenated MuSCs in aged mice. NR also prevented MuSC senescence in the mdx (C57BL/10ScSn-Dmd(mdx)/J) mouse model of muscular dystrophy. We furthermore demonstrate that NR delays senescence of neural SCs and melanocyte SCs and increases mouse life span. Strategies that conserve cellular NAD(+) may reprogram dysfunctional SCs and improve life span in mammals.

833 citations

References
Journal ArticleDOI
TL;DR: This work proposes an approach to measuring statistical significance in genomewide studies based on the concept of the false discovery rate, which offers a sensible balance between the number of true and false positives that is automatically calibrated and easily interpreted.
Abstract: With the increase in genomewide experiments and the sequencing of multiple genomes, the analysis of large data sets has become commonplace in biology. It is often the case that thousands of features in a genomewide data set are tested against some null hypothesis, where a number of features are expected to be significant. Here we propose an approach to measuring statistical significance in these genomewide studies based on the concept of the false discovery rate. This approach offers a sensible balance between the number of true and false positives that is automatically calibrated and easily interpreted. In doing so, a measure of statistical significance called the q value is associated with each tested feature. The q value is similar to the well known p value, except it is a measure of significance in terms of the false discovery rate rather than the false positive rate. Our approach avoids a flood of false positive results, while offering a more liberal criterion than what has been used in genome scans for linkage.
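How a q value follows from a list of p values can be shown with a short sketch. It is a simplified illustration, not the authors' published procedure: the proportion of true null features `pi0` is passed in as an assumed parameter rather than estimated from the p value distribution, and the function name `q_values` is hypothetical. With `pi0=1.0` the result equals the Benjamini-Hochberg adjusted p value.

```python
def q_values(p_values, pi0=1.0):
    """Convert p-values to q-values; `pi0` is the assumed proportion of true nulls."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each q-value is the
    # minimum estimated FDR over all thresholds that still include the feature.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        estimated_fdr = pi0 * p_values[i] * m / rank
        running_min = min(running_min, estimated_fdr)
        q[i] = running_min
    return q


example_p = [0.01, 0.04, 0.03, 0.20]
example_q = q_values(example_p)
```

Calling a feature significant when its q value is at most 0.05 then means accepting an estimated false discovery rate of 5% among all features called significant at that threshold.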

9,239 citations


"mProphet: automated data processing..." refers methods in this paper

  • ...The weight of the false population is estimated using the parameters of the null distribution and part of the target data point...


Journal ArticleDOI
TL;DR: SILAC is a simple, inexpensive, and accurate procedure that can be used as a quantitative proteomic approach in any cell culture system and is applied to the relative quantitation of changes in protein expression during the process of muscle cell differentiation.

5,653 citations


"mProphet: automated data processing..." refers methods in this paper

  • ...The sample consisted of a tryptic digest of an extract of the human U2OS cell line that had been isotopically labeled with SILAC mediu...


Journal ArticleDOI
TL;DR: A statistical model is presented to estimate the accuracy of peptide assignments to tandem mass (MS/MS) spectra made by database search applications such as SEQUEST, demonstrating that the computed probabilities are accurate and have high power to discriminate between correctly and incorrectly assigned peptides.
Abstract: We present a statistical model to estimate the accuracy of peptide assignments to tandem mass (MS/MS) spectra made by database search applications such as SEQUEST. Employing the expectation maximization algorithm, the analysis learns to distinguish correct from incorrect database search results, computing probabilities that peptide assignments to spectra are correct based upon database search scores and the number of tryptic termini of peptides. Using SEQUEST search results for spectra generated from a sample of known protein components, we demonstrate that the computed probabilities are accurate and have high power to discriminate between correctly and incorrectly assigned peptides. This analysis makes it possible to filter large volumes of MS/MS database search results with predictable false identification error rates and can serve as a common standard by which the results of different research groups are compared.
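The expectation maximization step described in the abstract can be sketched as a two-component mixture fitted to one-dimensional scores. This is a simplified stand-in, not the published model: both components are assumed to be Gaussian, only a single score is used (no tryptic termini information), and the function names and example scores are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean `mu` and std `sigma`."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_gaussians(scores, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns, for each score, the posterior probability of belonging to the
    higher-scoring ("correct") component.
    """
    lo, hi = min(scores), max(scores)
    mu = [lo, hi]  # initialise the two means at the extremes
    sigma = [max((hi - lo) / 4.0, 1e-3)] * 2
    weight = [0.5, 0.5]
    post = [0.5] * len(scores)
    for _ in range(n_iter):
        # E-step: posterior probability of the high-scoring component.
        post = []
        for x in scores:
            a = weight[0] * normal_pdf(x, mu[0], sigma[0])
            b = weight[1] * normal_pdf(x, mu[1], sigma[1])
            post.append(b / (a + b) if a + b > 0 else 0.5)
        # M-step: re-estimate mean, spread and mixing weight of each component.
        for k in (0, 1):
            resp = [p if k == 1 else 1.0 - p for p in post]
            total = sum(resp)
            if total < 1e-12:
                continue
            mu[k] = sum(r * x for r, x in zip(resp, scores)) / total
            var = sum(r * (x - mu[k]) ** 2 for r, x in zip(resp, scores)) / total
            sigma[k] = max(math.sqrt(var), 1e-3)
            weight[k] = total / len(scores)
    return post


example_scores = [0.1, 0.2, 0.15, 0.3, 2.9, 3.1, 3.0, 2.8]
posteriors = em_two_gaussians(example_scores)
```

On well-separated scores like these, the posteriors approach 0 for the low-scoring group and 1 for the high-scoring group; summing `1 - p` over the accepted assignments gives the expected number of false identifications at a chosen probability threshold.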

4,861 citations


"mProphet: automated data processing..." refers methods in this paper

  • ...mProphet estimates a FDR, which can be used to filter the data according to a user-defined quality; this is analogous to the filtering of tandem mass spectrometry (MS) identificatio...


Journal ArticleDOI
TL;DR: A statistical model is presented for computing probabilities that proteins are present in a sample on the basis of peptides assigned to tandem mass (MS/MS) spectra acquired from a proteolytic digest of the sample, and it is shown to produce probabilities that are accurate and have high power to discriminate correct from incorrect protein identifications.
Abstract: A statistical model is presented for computing probabilities that proteins are present in a sample on the basis of peptides assigned to tandem mass (MS/MS) spectra acquired from a proteolytic digest of the sample. Peptides that correspond to more than a single protein in the sequence database are apportioned among all corresponding proteins, and a minimal protein list sufficient to account for the observed peptide assignments is derived using the expectation-maximization algorithm. Using peptide assignments to spectra generated from a sample of 18 purified proteins, as well as complex H. influenzae and Halobacterium samples, the model is shown to produce probabilities that are accurate and have high power to discriminate correct from incorrect protein identifications. This method allows filtering of large-scale proteomics data sets with predictable sensitivity and false positive identification error rates. Fast, consistent, and transparent, it provides a standard for publishing large-scale protein identif...
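The step from peptide probabilities to a protein probability can be sketched as the chance that at least one assigned peptide is correct. This is a simplified reading of the abstract, not the published model: peptide assignments are assumed to be independent, the apportioning of shared peptides is reduced to a fixed per-peptide weight, and the function name and example values are hypothetical.

```python
def protein_probability(peptide_probs, weights=None):
    """Probability that a protein is present, given its peptide probabilities.

    `weights` (optional) are the fractions of each shared peptide apportioned
    to this protein; a peptide unique to the protein has weight 1.0.
    """
    if weights is None:
        weights = [1.0] * len(peptide_probs)
    prob_all_wrong = 1.0
    for p, w in zip(peptide_probs, weights):
        prob_all_wrong *= 1.0 - w * p
    return 1.0 - prob_all_wrong


# Two unique peptides with probabilities 0.9 and 0.5.
p_unique = protein_probability([0.9, 0.5])
# The second peptide is shared, with only half of it apportioned to this protein.
p_shared = protein_probability([0.9, 0.5], weights=[1.0, 0.5])
```

Down-weighting a shared peptide lowers the protein probability, which reflects the abstract's point that such peptides are apportioned among all proteins they could come from.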

4,544 citations


Additional excerpts

  • ...This is analogous to the situation in early shotgun proteomics, in which the tools to control the quality of the identification were developed long after the principal measurement method...


Journal ArticleDOI
TL;DR: The Skyline user interface simplifies the development of mass spectrometer methods and the analysis of data from targeted proteomics experiments performed using selected reaction monitoring (SRM).
Abstract: Summary: Skyline is a Windows client application for targeted proteomics method creation and quantitative data analysis. It is open source and freely available for academic and commercial use. The Skyline user interface simplifies the development of mass spectrometer methods and the analysis of data from targeted proteomics experiments performed using selected reaction monitoring (SRM). Skyline supports using and creating MS/MS spectral libraries from a wide variety of sources to choose SRM filters and verify results based on previously observed ion trap data. Skyline exports transition lists to and imports the native output files from Agilent, Applied Biosystems, Thermo Fisher Scientific and Waters triple quadrupole instruments, seamlessly connecting mass spectrometer output back to the experimental design document. The fast and compact Skyline file format is easily shared, even for experiments requiring many sample injections. A rich array of graphs displays results and provides powerful tools for inspecting data integrity as data are acquired, helping instrument operators to identify problems early. The Skyline dynamic report designer exports tabular data from the Skyline document model for in-depth analysis with common statistical tools. Availability: Single-click, self-updating web installation is available at http://proteome.gs.washington.edu/software/skyline. This web site also provides access to instructional videos, a support board, an issues list and a link to the source code project.

3,794 citations
