
Journal ArticleDOI

MZmine 2: Modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data

23 Jul 2010, BMC Bioinformatics (BioMed Central), Vol. 11, Iss. 1, p. 395

TL;DR: MZmine 2, a new generation of a popular open-source data processing toolbox, is introduced; it is suitable for processing large batches of data and has been applied to both targeted and non-targeted metabolomic analyses.
Abstract: Mass spectrometry (MS) coupled with online separation methods is commonly applied for differential and quantitative profiling of biological samples in metabolomic as well as proteomic research. Such approaches are used for systems biology, functional genomics, and biomarker discovery, among others. An ongoing challenge of these molecular profiling approaches, however, is the development of better data processing methods. Here we introduce a new generation of a popular open-source data processing toolbox, MZmine 2. A key concept of the MZmine 2 software design is the strict separation of core functionality and data processing modules, with emphasis on easy usability and support for high-resolution spectra processing. Data processing modules take advantage of embedded visualization tools, allowing for immediate previews of parameter settings. Newly introduced functionality includes the identification of peaks using online databases, MSn data support, improved isotope pattern support, scatter plot visualization, and a new method for peak list alignment based on the random sample consensus (RANSAC) algorithm. The performance of the RANSAC alignment was evaluated using synthetic datasets as well as actual experimental data, and the results were compared to those obtained using other alignment algorithms. MZmine 2 is freely available under a GNU GPL license and can be obtained from the project website at: http://mzmine.sourceforge.net/ . The current version of MZmine 2 is suitable for processing large batches of data and has been applied to both targeted and non-targeted metabolomic analyses.
Citations

Journal ArticleDOI
TL;DR: By completely re-implementing the MetaboAnalyst suite using the latest web framework technologies, the server has been able to substantially improve its performance, capacity and user interactivity.
Abstract: MetaboAnalyst (www.metaboanalyst.ca) is a web server designed to permit comprehensive metabolomic data analysis, visualization and interpretation. It supports a wide range of complex statistical calculations and high quality graphical rendering functions that require significant computational resources. First introduced in 2009, MetaboAnalyst has experienced more than a 50X growth in user traffic (>50 000 jobs processed each month). In order to keep up with the rapidly increasing computational demands and a growing number of requests to support translational and systems biology applications, we performed a substantial rewrite and major feature upgrade of the server. The result is MetaboAnalyst 3.0. By completely re-implementing the MetaboAnalyst suite using the latest web framework technologies, we have been able to substantially improve its performance, capacity and user interactivity. Three new modules have also been added including: (i) a module for biomarker analysis based on the calculation of receiver operating characteristic curves; (ii) a module for sample size estimation and power analysis for improved planning of metabolomics studies and (iii) a module to support integrative pathway analysis for both genes and metabolites. In addition, popular features found in existing modules have been significantly enhanced by upgrading the graphical output, expanding the compound libraries and by adding support for more diverse organisms.

2,181 citations


Cites methods from "MZmine 2: Modular framework for pro..."

  • ...Users are encouraged to use other dedicated and freely available tools such as XCMSOnline (30) and MZmine (31) for such tasks....



Journal ArticleDOI
Jasmine Chong, Othman Soufan, Carin Li, Iurie Caraus, +4 more; Institutions (3)
TL;DR: The user interface of MetaboAnalyst 4.0 has been reengineered to provide a more modern look and feel, as well as to give more space and flexibility to introduce new functions.
Abstract: We present a new update to MetaboAnalyst (version 4.0) for comprehensive metabolomic data analysis, interpretation, and integration with other omics data. Since the last major update in 2015, MetaboAnalyst has continued to evolve based on user feedback and technological advancements in the field. For this year's update, four new key features have been added to MetaboAnalyst 4.0, including: (1) real-time R command tracking and display coupled with the release of a companion MetaboAnalystR package; (2) a MS Peaks to Pathways module for prediction of pathway activity from untargeted mass spectral data using the mummichog algorithm; (3) a Biomarker Meta-analysis module for robust biomarker identification through the combination of multiple metabolomic datasets and (4) a Network Explorer module for integrative analysis of metabolomics, metagenomics, and/or transcriptomics data. The user interface of MetaboAnalyst 4.0 has been reengineered to provide a more modern look and feel, as well as to give more space and flexibility to introduce new functions. The underlying knowledgebases (compound libraries, metabolite sets, and metabolic pathways) have also been updated based on the latest data from the Human Metabolome Database (HMDB). A Docker image of MetaboAnalyst is also available to facilitate download and local installation of MetaboAnalyst. MetaboAnalyst 4.0 is freely available at http://metaboanalyst.ca.

2,169 citations


Cites methods from "MZmine 2: Modular framework for pro..."

  • ...A number of excellent methods have been developed to deal with the first two tasks (26,27), which typically yield a list of ‘clean’ MS peaks....



Journal ArticleDOI
TL;DR: This work reviews strategies for natural product screening that harness the recent technical advances that have reduced technical barriers and assess the use of genomic and metabolomic approaches to augment traditional methods of studying natural products.
Abstract: Natural products have been a rich source of compounds for drug discovery. However, their use has diminished in the past two decades, in part because of technical barriers to screening natural products in high-throughput assays against molecular targets. Here, we review strategies for natural product screening that harness the recent technical advances that have reduced these barriers. We also assess the use of genomic and metabolomic approaches to augment traditional methods of studying natural products, and highlight recent examples of natural products in antimicrobial drug discovery and as inhibitors of protein-protein interactions. The growing appreciation of functional assays and phenotypic screens may further contribute to a revival of interest in natural products for drug discovery.

1,465 citations


Journal ArticleDOI
TL;DR: While the intrinsic complexity of natural product-based drug discovery necessitates highly integrated interdisciplinary approaches, the reviewed scientific developments, recent technological advances, and research trends clearly indicate that natural products will be among the most important sources of new drugs in the future.
Abstract: Medicinal plants have historically proven their value as a source of molecules with therapeutic potential, and nowadays still represent an important pool for the identification of novel drug leads. In the past decades, the pharmaceutical industry focused mainly on libraries of synthetic compounds as drug discovery source. They are comparably easy to produce and resupply, and demonstrate good compatibility with established high throughput screening (HTS) platforms. However, at the same time there has been a declining trend in the number of new drugs reaching the market, raising renewed scientific interest in drug discovery from natural sources, despite its known challenges. In this survey, a brief outline of historical development is provided together with a comprehensive overview of used approaches and recent developments relevant to plant-derived natural product drug discovery. Associated challenges and major strengths of natural product-based drug discovery are critically discussed. A snapshot of the advanced plant-derived natural products that are currently in actively recruiting clinical trials is also presented. Importantly, the transition of a natural compound from a "screening hit" through a "drug lead" to a "marketed drug" is associated with increasingly challenging demands for compound amount, which often cannot be met by re-isolation from the respective plant sources. In this regard, existing alternatives for resupply are also discussed, including different biotechnology approaches and total organic synthesis. While the intrinsic complexity of natural product-based drug discovery necessitates highly integrated interdisciplinary approaches, the reviewed scientific developments, recent technological advances, and research trends clearly indicate that natural products will be among the most important sources of new drugs also in the future.

1,261 citations


Additional excerpts

  • ...Different open-source and commercial software packages are available for this task (Creek et al., 2011; Hartler et al., 2011; Jankevics et al., 2012; Kamleh et al., 2008; Kawaguchi et al., 2010; Pluskal et al., 2010; Smith et al., 2006)....



Journal ArticleDOI
21 Jul 2016-Nature
TL;DR: It is shown how the human gut microbiome impacts the serum metabolome and associates with insulin resistance in 277 non-diabetic Danish individuals and suggested that microbial targets may have the potential to diminish insulin resistance and reduce the incidence of common metabolic and cardiovascular disorders.
Abstract: Insulin resistance is a forerunner state of ischaemic cardiovascular disease and type 2 diabetes. Here we show how the human gut microbiome impacts the serum metabolome and associates with insulin resistance in 277 non-diabetic Danish individuals. The serum metabolome of insulin-resistant individuals is characterized by increased levels of branched-chain amino acids (BCAAs), which correlate with a gut microbiome that has an enriched biosynthetic potential for BCAAs and is deprived of genes encoding bacterial inward transporters for these amino acids. Prevotella copri and Bacteroides vulgatus are identified as the main species driving the association between biosynthesis of BCAAs and insulin resistance, and in mice we demonstrate that P. copri can induce insulin resistance, aggravate glucose intolerance and augment circulating levels of BCAAs. Our findings suggest that microbial targets may have the potential to diminish insulin resistance and reduce the incidence of common metabolic and cardiovascular disorders.

937 citations


References

Journal ArticleDOI
Martin A. Fischler, Robert C. Bolles; Institutions (1)
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form that provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing.
Abstract: A new paradigm, Random Sample Consensus (RANSAC), for fitting a model to experimental data is introduced. RANSAC is capable of interpreting/smoothing data containing a significant percentage of gross errors, and is thus ideally suited for applications in automated image analysis where interpretation is based on the data provided by error-prone feature detectors. A major portion of this paper describes the application of RANSAC to the Location Determination Problem (LDP): Given an image depicting a set of landmarks with known locations, determine that point in space from which the image was obtained. In response to a RANSAC requirement, new results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form. These results provide the basis for an automatic system that can solve the LDP under difficult viewing

20,503 citations


"MZmine 2: Modular framework for pro..." refers methods in this paper

  • ...The RANSAC algorithm [20] is a non-deterministic iterative algorithm that estimates parameters of a mathematical model from a set of observed data, which may include outliers....

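The RANSAC idea quoted in the excerpt above (repeatedly fit a model to a small random sample, then keep the model that the largest share of the data agrees with) can be illustrated with a minimal Python sketch. This is not the MZmine 2 aligner: the straight-line model, the function name `ransac_line`, and the parameters `n_iter`, `tol`, and `seed` are assumptions chosen only for illustration.

```python
import random


def ransac_line(points, n_iter=200, tol=0.5, seed=0):
    """Fit y = slope * x + intercept to (x, y) points that may contain outliers.

    Hypothetical illustration; the line model and all defaults are assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    best_inliers = []
    for _ in range(n_iter):
        # 1. Draw a minimal random sample: two points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate line, skip it
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # 2. Collect the consensus set: points within `tol` of the candidate.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= tol]
        # 3. Keep the candidate with the largest consensus set so far.
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no consensus set found")
    # 4. Refit by ordinary least squares using only the consensus set.
    n = len(best_inliers)
    mean_x = sum(x for x, _ in best_inliers) / n
    mean_y = sum(y for _, y in best_inliers) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in best_inliers)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in best_inliers)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, best_inliers


# Synthetic example: 20 points on y = x + 2 plus 5 gross outliers.
pairs = [(float(x), x + 2.0) for x in range(20)]
pairs += [(3.0, 15.0), (7.0, 1.0), (11.0, 25.0), (15.0, 4.0), (18.0, 30.0)]
slope, intercept, inliers = ransac_line(pairs)
print(slope, intercept, len(inliers))
```

Because the sampling is random, the result depends on the number of iterations and the tolerance; the fixed seed here only makes the sketch repeatable.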


Journal ArticleDOI
Minoru Kanehisa, Susumu Goto; Institutions (1)
Abstract: Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database that consists of graphical diagrams of biochemical pathways including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with the graphical genome map for chromosomal locations that is represented by Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from the gene expression profiles. The KEGG databases are daily updated and made freely available (http://www.genome.ad.jp/kegg/).

19,643 citations


Journal ArticleDOI
01 Dec 1999-Electrophoresis
TL;DR: A new computer program, Mascot, is presented, which integrates all three types of search for protein identification by searching a sequence database using mass spectrometry data, and the scoring algorithm is probability based.
Abstract: Several algorithms have been described in the literature for protein identification by searching a sequence database using mass spectrometry data. In some approaches, the experimental data are peptide molecular weights from the digestion of a protein by an enzyme. Other approaches use tandem mass spectrometry (MS/MS) data from one or more peptides. Still others combine mass data with amino acid sequence data. We present results from a new computer program, Mascot, which integrates all three types of search. The scoring algorithm is probability based, which has a number of advantages: (i) A simple rule can be used to judge whether a result is significant or not. This is particularly useful in guarding against false positives. (ii) Scores can be compared with those from other types of search, such as sequence homology. (iii) Search parameters can be readily optimised by iteration. The strengths and limitations of probability-based scoring are discussed, particularly in the context of high throughput, fully automated protein identification.

7,877 citations


"MZmine 2: Modular framework for pro..." refers methods in this paper

  • ...The flexibility of MZmine 2, however, allows for easy expansion to other dataset types such as gas chromatography-MS, as well as interoperation with popular proteomics search engines such as MASCOT....


  • ...For proteomic applications, a module allowing identification of peptide peaks using the MASCOT [19] search engine and MS/MS spectra is under development....



Journal ArticleDOI
Abstract: Locally weighted regression, or loess, is a way of estimating a regression surface through a multivariate smoothing procedure, fitting a function of the independent variables locally and in a moving fashion analogous to how a moving average is computed for a time series. With local fitting we can estimate a much wider class of regression surfaces than with the usual classes of parametric functions, such as polynomials. The goal of this article is to show, through applications, how loess can be used for three purposes: data exploration, diagnostic checking of parametric models, and providing a nonparametric regression surface. Along the way, the following methodology is introduced: (a) a multivariate smoothing procedure that is an extension of univariate locally weighted regression; (b) statistical procedures that are analogous to those used in the least-squares fitting of parametric functions; (c) several graphical methods that are useful tools for understanding loess estimates and checking the a

4,803 citations


"MZmine 2: Modular framework for pro..." refers methods in this paper

  • ...Step 3: Apply the locally-weighted scatterplot smoothing (LOESS) method for regression [21] on all points in the model obtained with RANSAC....

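The LOESS step quoted above fits a weighted regression locally around each query point. A minimal univariate sketch in Python, assuming tricube weights and a local straight-line fit, is given below; the function name `loess_point` and the `span` default are hypothetical, and this is not the code used in MZmine 2.

```python
def loess_point(xs, ys, x0, span=0.5):
    """Estimate y at x0 by a locally weighted linear fit (tricube weights).

    Hypothetical illustration; `span` is the fraction of points used locally.
    """
    n = len(xs)
    k = max(2, int(round(span * n)))  # number of nearest neighbours to use
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[min(k, n) - 1]  # bandwidth: distance to the k-th nearest point
    if h == 0:
        # All neighbours coincide with x0: fall back to their mean.
        same = [y for x, y in zip(xs, ys) if x == x0]
        return sum(same) / len(same)
    # Tricube weights: close points count most, points at or beyond h count zero.
    ws = [(1 - (abs(x - x0) / h) ** 3) ** 3 if abs(x - x0) < h else 0.0
          for x in xs]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    if sxx == 0:
        return my  # only one distinct x has weight: use the weighted mean
    slope = sxy / sxx
    return my + slope * (x0 - mx)


# Smooth a small series by evaluating the local fit at every x.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
smoothed = [loess_point(xs, ys, x) for x in xs]
print(smoothed)
```

A local linear fit reproduces exactly linear data, which makes a convenient sanity check; on noisy data a larger `span` gives a smoother curve.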


Journal ArticleDOI
Colin A. Smith, Grace O'Maille, Elizabeth J. Want, Chuan Qin, +5 more; Institutions (1)
TL;DR: METLIN includes an annotated list of known metabolite structural information that is easily cross-correlated with its catalogue of high-resolution Fourier transform mass spectrometry (FTMS) spectra, tandem mass spectrometry (MS/MS) spectra, and LC/MS data.
Abstract: Endogenous metabolites have gained increasing interest over the past 5 years largely for their implications in diagnostic and pharmaceutical biomarker discovery. METLIN (http://metlin.scripps.edu), a freely accessible web-based data repository, has been developed to assist in a broad array of metabolite research and to facilitate metabolite identification through mass analysis. METLIN includes an annotated list of known metabolite structural information that is easily cross-correlated with its catalogue of high-resolution Fourier transform mass spectrometry (FTMS) spectra, tandem mass spectrometry (MS/MS) spectra, and LC/MS data.

1,698 citations


"MZmine 2: Modular framework for pro..." refers methods in this paper

  • ...In MZmine 2, identification of peaks can be performed either by searching a custom database of m/z values and retention times, or by connecting to an online resource such as PubChem [15], KEGG [16], METLIN [17], or HMDB [18] directly from the MZmine 2 interface (Figure 4)....


  • ...Smith CA, O’Maille G, Want EJ, Qin C, Trauger SA, Brandon TR, Custodio DE, Abagyan R, Siuzdak G: METLIN: a metabolite mass spectral database....

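The first excerpt above mentions identifying peaks by searching a custom database of m/z values and retention times. A minimal Python sketch of such a lookup follows, assuming a ppm tolerance on m/z and an absolute tolerance on retention time. The `DbEntry` type, the `match_peak` function, the tolerance defaults, and the example entries are all hypothetical and do not reflect MZmine 2's actual module or file format.

```python
from typing import List, NamedTuple


class DbEntry(NamedTuple):
    """One row of a hypothetical custom database."""
    name: str
    mz: float  # expected m/z value
    rt: float  # expected retention time (minutes, by assumption)


def match_peak(mz: float, rt: float, database: List[DbEntry],
               mz_tol_ppm: float = 10.0, rt_tol: float = 0.2) -> List[DbEntry]:
    """Return database entries within both tolerances, best m/z match first."""
    hits = []
    for entry in database:
        ppm_error = abs(mz - entry.mz) / entry.mz * 1e6
        if ppm_error <= mz_tol_ppm and abs(rt - entry.rt) <= rt_tol:
            hits.append((ppm_error, entry))
    hits.sort(key=lambda hit: hit[0])  # smallest ppm error first
    return [entry for _, entry in hits]


# Made-up entries for illustration only.
database = [
    DbEntry("compound A", 180.0634, 5.2),
    DbEntry("compound B", 180.0700, 5.2),
    DbEntry("compound C", 180.0634, 9.0),
]
for hit in match_peak(180.0640, 5.25, database):
    print(hit.name, hit.mz, hit.rt)
```

Requiring agreement in both m/z and retention time narrows the candidates compared with a mass-only search, at the cost of needing retention times measured under comparable chromatographic conditions.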


No. of citations received by the Paper in previous years
Year    Citations
2022    13
2021    449
2020    381
2019    299
2018    205
2017    156