Feature-based molecular networking in the GNPS analysis environment.

Louis-Félix Nothias, +87 more
24 Aug 2020, Vol. 17, Iss. 9, pp. 905–908
TLDR
Feature-based molecular networking (FBMN) is an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools.
Abstract
Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.


Feature-based molecular networking in the GNPS analysis environment
Nature Methods : techniques for life scientists and chemists
Nothias, Louis Félix; Petras, Daniel; Schmid, Robin; Dührkop, Kai; Rainer, Johannes et al
https://doi.org/10.1038/s41592-020-0933-6

Brief Communication
https://doi.org/10.1038/s41592-020-0933-6
Since its introduction in 2012 (ref. 1), molecular networking has become an essential bioinformatics tool to visualize and annotate non-targeted mass spectrometry (MS) data²,³. Molecular networking, uniquely, goes beyond spectral matching against reference spectra, by aligning experimental spectra against one another and connecting related molecules by their spectral similarity. In a molecular network, related molecules are referred to as a ‘molecular family’, differing by simple transformations such as glycosylation, alkylation and oxidation/reduction. Molecular networking became publicly accessible in 2013 through the initial release of GNPS, a web-enabled MS knowledge capture and analysis platform (https://gnps.ucsd.edu/)⁴, and has been widely applied in MS-based metabolomics to aid in the annotation of molecular families from their fragmentation spectra (MS²).
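The idea of connecting related molecules by their spectral similarity can be illustrated with a minimal Python sketch. It scores pairs of MS² peak lists with a plain cosine similarity and keeps the pairs above a cutoff as network edges. This is an illustration under stated assumptions, not the GNPS implementation: the function names, the 0.02 m/z tolerance, the square-root intensity scaling, the 0.7 cutoff and the toy spectra are all hypothetical choices made for this example.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.02):
    """Cosine similarity between two peak lists [(mz, intensity), ...].

    Peaks are paired greedily when their m/z values agree within `tol`.
    The tolerance and the square-root scaling are assumptions of this sketch.
    """
    # Square-root scaling damps the influence of a few dominant peaks.
    a = [(mz, math.sqrt(i)) for mz, i in spec_a]
    b = [(mz, math.sqrt(i)) for mz, i in spec_b]
    norm_a = math.sqrt(sum(i * i for _, i in a))
    norm_b = math.sqrt(sum(i * i for _, i in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    # All candidate peak pairs whose m/z values agree within the tolerance.
    pairs = [
        (ia * ib, x, y)
        for x, (mza, ia) in enumerate(a)
        for y, (mzb, ib) in enumerate(b)
        if abs(mza - mzb) <= tol
    ]
    # Greedy assignment: take the largest products first, each peak used once.
    pairs.sort(reverse=True)
    used_a, used_b, dot = set(), set(), 0.0
    for product, x, y in pairs:
        if x not in used_a and y not in used_b:
            used_a.add(x)
            used_b.add(y)
            dot += product
    return dot / (norm_a * norm_b)


def build_network(spectra, min_score=0.7):
    """Return edges (id_u, id_v, score) for spectrum pairs scoring >= min_score."""
    ids = sorted(spectra)
    edges = []
    for n, u in enumerate(ids):
        for v in ids[n + 1:]:
            score = cosine_score(spectra[u], spectra[v])
            if score >= min_score:
                edges.append((u, v, round(score, 3)))
    return edges


# Toy spectra (invented values): A and B share two fragments, C shares none.
SPECTRA = {
    "A": [(100.00, 50.0), (150.00, 100.0), (200.00, 25.0)],
    "B": [(100.01, 40.0), (150.00, 90.0), (250.00, 30.0)],
    "C": [(80.00, 100.0), (120.00, 10.0)],
}
EDGES = build_network(SPECTRA)
print(EDGES)
```

In this toy case only A and B are linked, so they would form one small molecular family while C remains an unconnected node. A score that also aligns fragments shifted by the precursor mass difference would be needed to link molecules that differ by a modification; that refinement is deliberately left out here.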
Powered by more than 3,000 CPU cores at the University of California San Diego Center for Computational Mass Spectrometry and the MassIVE data repository, GNPS has provided researchers from more than 150 countries with the ability to perform molecular networking. To build upon the success of the first molecular networking method referred to as ‘classical’ molecular networking (classical MN), which is based on the MS-Cluster algorithm⁵, we introduce a complementary tool named FBMN. FBMN leverages the capability of well-established MS processing software and improves upon classical MN by incorporating not only MS¹ information, such as isotope patterns and retention time, but also ion mobility separation when performed. By relying on processed spectral information, molecular networks obtained with FBMN can (1) distinguish isomers producing similar MS² spectra that are resolved by chromatographic or ion mobility separation, which may have remained hidden in classical MN, (2) facilitate spectral annotation, and (3) incorporate relative quantitative information that enables robust downstream metabolomics statistical analysis. Whereas users of the classical MN would have had to perform molecular networking and MS¹ analysis separately before performing a cumbersome linking of the outputs, the FBMN method accepts the output of feature detection and alignment tools, making them directly compatible with annotation tools and the entirety of the analysis pipeline.
Louis-Félix Nothias¹,²,⁴⁵, Daniel Petras¹,²,³,⁴⁵, Robin Schmid⁴,⁴⁵, Kai Dührkop⁵, Johannes Rainer⁶, Abinesh Sarvepalli¹,², Ivan Protsyuk⁷, Madeleine Ernst¹,²,⁸, Hiroshi Tsugawa⁹,¹⁰, Markus Fleischauer⁵, Fabian Aicheler¹¹,¹², Alexander A. Aksenov¹,², Oliver Alka¹¹,¹², Pierre-Marie Allard¹³, Aiko Barsch¹⁴, Xavier Cachet¹⁵, Andres Mauricio Caraballo-Rodriguez¹,², Ricardo R. Da Silva²,¹⁶, Tam Dang²,¹⁷, Neha Garg¹⁸, Julia M. Gauglitz¹,², Alexey Gurevich¹⁹, Giorgis Isaac²⁰, Alan K. Jarmusch¹,², Zdeněk Kameník²¹, Kyo Bin Kang¹,²,²², Nikolas Kessler¹⁴, Irina Koester¹,²,³, Ansgar Korf⁴, Audrey Le Gouellec²³, Marcus Ludwig⁵, Christian Martin H.²⁴, Laura-Isobel McCall²⁵, Jonathan McSayles²⁶, Sven W. Meyer¹⁴, Hosein Mohimani²⁷, Mustafa Morsy²⁸, Oriane Moyne²³,²⁹, Steffen Neumann³⁰,³¹, Heiko Neuweger¹⁴, Ngoc Hung Nguyen¹,², Melissa Nothias-Esposito¹,², Julien Paolini³², Vanessa V. Phelan³³, Tomáš Pluskal³⁴, Robert A. Quinn³⁵, Simon Rogers³⁶, Bindesh Shrestha²⁰, Anupriya Tripathi¹,²⁹,³⁷, Justin J. J. van der Hooft¹,²,³⁸, Fernando Vargas¹,², Kelly C. Weldon¹,²,³⁹, Michael Witting⁴⁰, Heejung Yang⁴¹, Zheng Zhang¹,², Florian Zubeil¹⁴, Oliver Kohlbacher¹¹,¹²,⁴²,⁴³, Sebastian Böcker⁵, Theodore Alexandrov¹,²,⁷, Nuno Bandeira¹,²,⁴⁴, Mingxun Wang¹,²,⁴⁴ ✉ and Pieter C. Dorrestein¹,²,²⁹,³⁹ ✉
A full list of affiliations appears at the end of the paper.
NATURE METHODS | VOL 17 | SEPTEMBER 2020 | 905–908 | www.nature.com/naturemethods

To fully utilize the MS¹ and MS² data collected during a non-targeted metabolomics experiment in liquid chromatography coupled to tandem MS (LC–MS²), we have created an online and streamlined workflow (Fig. 1a) infrastructure that supports the outputs of feature detection and alignment tools for FBMN analysis (https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/), including the standard output format for analysis of small molecules (mzTab-M)⁶. The diversity of supported software, each offering different functionalities and modules, serves experimentalists, bioinformaticians, and software developers. FBMN is the second most commonly used analysis tool within the GNPS environment (Fig. 1b), with more than 6,767 jobs performed in 2019, and has already been used in more than 80 publications since its introduction in November 2017.
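As a rough illustration of the two inputs that a feature detection tool hands over to the workflow, the Python sketch below reads an MS² spectral summary in MGF-like text and a tab-separated feature quantification table, then links them by feature identifier. The field name `FEATURE_ID`, the column names `feature_id`, `sample_I` and `sample_II`, and all numeric values are assumptions made for this example; the exact layout depends on the exporting software and is described in the FBMN documentation.

```python
import csv
import io


def parse_mgf(text, id_field="FEATURE_ID"):
    """Parse MGF-like text into {feature_id: {"params": {...}, "peaks": [...]}}.

    `id_field` is an assumed header name; a spectrum lacking it is stored
    under the key None.
    """
    spectra = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra[current["params"].get(id_field)] = current
            current = None
        elif current is not None and line:
            if "=" in line:
                # Header line such as PEPMASS=293.098
                key, value = line.split("=", 1)
                current["params"][key] = value
            else:
                # Peak line: m/z and intensity separated by whitespace
                mz, intensity = line.split()[:2]
                current["peaks"].append((float(mz), float(intensity)))
    return spectra


def parse_quant_table(text, id_column="feature_id"):
    """Parse a tab-separated quantification table into {feature_id: row}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row[id_column]: row for row in reader}


def link_features(spectra, quant, sample_prefix="sample_"):
    """Keep features that have both an MS2 spectrum and abundance values."""
    linked = {}
    for feature_id, row in quant.items():
        if feature_id in spectra:
            linked[feature_id] = {
                "peaks": spectra[feature_id]["peaks"],
                "abundance": {
                    name: float(value)
                    for name, value in row.items()
                    if name.startswith(sample_prefix)
                },
            }
    return linked


# Toy inputs (invented values). Feature 3 has no MS2 spectrum and is dropped.
MGF_TEXT = """BEGIN IONS
FEATURE_ID=1
PEPMASS=300.100
101.05 20.0
150.06 100.0
END IONS
BEGIN IONS
FEATURE_ID=2
PEPMASS=450.200
200.10 35.0
320.15 100.0
END IONS
"""
QUANT_TEXT = (
    "feature_id\tmz\trt_min\tsample_I\tsample_II\n"
    "1\t300.100\t0.75\t1200.0\t800.0\n"
    "2\t450.200\t4.20\t300.0\t950.0\n"
    "3\t512.300\t6.10\t50.0\t40.0\n"
)
LINKED = link_features(parse_mgf(MGF_TEXT), parse_quant_table(QUANT_TEXT))
print(sorted(LINKED))
```

Keeping the spectra and the abundances keyed by the same feature identifier is what lets each network node carry per-sample quantitative information in this sketch.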
The molecular networks generated with FBMN enable the efficient visualization and annotation of isomers in LC–MS² datasets, as demonstrated below with LC–MS² data from a drug discovery project from Euphorbia plant extract⁷ (Fig. 2a,b) and the detection of human microbiome-derived lipids belonging to the commendamide family⁸, detected in fecal samples from the American Gut Project (AGP⁹; a crowd-sourced citizen-science microbiome project; Fig. 2c,d). In both cases, FBMN resolved positional isomers/stereoisomers in the molecular networks that have similar MS² spectra but distinct retention times, which would not have been resolved with classical MN. The use of FBMN facilitated the discovery of antiviral compounds⁷ (Fig. 2c), and the annotation of commendamide isomers⁹ and of a putative new derivative, N-(dehydrohexadecanoyl)glycine (Fig. 2d).
In non-targeted LC–MS² data acquisition, the same precursor ion is frequently fragmented multiple times during chromatographic elution. While MS-Cluster is often able to cluster these spectra into one single node in classical MN, there are cases where it will fail and produce multiple nodes representing the same compound. For example, this can happen for compounds producing mostly low-intensity fragment ions or for chimeric spectra resulting from coeluting isobaric ions isolated and fragmented together. With FBMN, a single representative MS² spectrum is selected for the LC–MS feature (defined as the detected ion signal for an eluting molecule)¹⁰. The benefit of using FBMN in such instances can be illustrated with the metal chelating agent EDTA in the LC–MS² analysis of plasma samples (Fig. 2e), in which it was used as an anticoagulant. Classical MN resulted in 13 duplicated nodes with identical precursor m/z values in one molecular family, 10 of which had spectral library matches to EDTA reference MS² data (Fig. 2e,f). In contrast, FBMN displayed a single representative MS² spectrum that matched EDTA spectra in the library. The reduction of redundancy within the resulting molecular network simplifies the discovery of structurally related compounds.
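The effect of selecting one representative MS² spectrum per LC–MS feature can be sketched in a few lines of Python. The example assigns each MS² scan to the feature whose m/z and retention time window contain it, then keeps the scan with the highest summed fragment intensity. That selection rule, the 0.01 m/z tolerance, the class and function names and the toy values are assumptions of this sketch; feature detection tools may select or merge spectra differently.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    feature_id: str
    mz: float
    rt_start: float  # minutes
    rt_end: float    # minutes


@dataclass
class Ms2Scan:
    scan_id: int
    precursor_mz: float
    rt: float        # minutes
    peaks: list      # [(mz, intensity), ...]


def total_ion_current(scan):
    """Summed fragment intensity, used here as a simple quality proxy."""
    return sum(intensity for _, intensity in scan.peaks)


def pick_representatives(features, scans, mz_tol=0.01):
    """Return {feature_id: best scan}, one representative scan per feature."""
    best = {}
    for scan in scans:
        for feature in features:
            same_mz = abs(scan.precursor_mz - feature.mz) <= mz_tol
            in_window = feature.rt_start <= scan.rt <= feature.rt_end
            if same_mz and in_window:
                current = best.get(feature.feature_id)
                if current is None or total_ion_current(scan) > total_ion_current(current):
                    best[feature.feature_id] = scan
                break  # a scan is assigned to at most one feature
    return best


# Toy data (invented): two isomeric features share an m/z but elute apart.
FEATURES = [
    Feature("isomer_I", 589.313, 24.0, 25.0),
    Feature("isomer_II", 589.313, 27.0, 28.0),
]
SCANS = [
    Ms2Scan(1, 589.311, 24.2, [(295.17, 40.0), (423.21, 60.0)]),
    Ms2Scan(2, 589.312, 24.5, [(295.17, 100.0), (423.21, 200.0)]),
    Ms2Scan(3, 589.313, 24.8, [(295.17, 50.0), (423.21, 100.0)]),
    Ms2Scan(4, 589.313, 27.4, [(295.17, 80.0), (423.21, 120.0)]),
    Ms2Scan(5, 589.312, 27.6, [(295.17, 20.0), (423.21, 30.0)]),
    Ms2Scan(6, 589.313, 30.0, [(295.17, 10.0)]),  # outside every feature window
]
REPRESENTATIVES = pick_representatives(FEATURES, SCANS)
print({fid: scan.scan_id for fid, scan in REPRESENTATIVES.items()})
```

In this toy case six scans with near-identical precursor m/z collapse to two nodes, one per chromatographically resolved isomer, rather than to one node (isomers merged) or to many nodes (redundant spectra).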
[Figure 1: panel a, workflow schematics for classical MN (MS² spectral clustering followed by molecular networking, with node size from spectral count) and for FBMN (feature detection, feature grouping and feature alignment, export of .MGF and .TXT files or mzML and mzTab-M files to GNPS, then molecular networking, with node size from peak area); panel b, number of molecular networking jobs per month on GNPS by year, with key events in the development of FBMN: video tutorial on FBMN with MZmine, documentation for FBMN with MZmine, support for other tools and documentation release, support for mzTab-M, and release of the preprint.]

Fig. 1 | Methods for the generation of molecular networks from non-targeted MS data with the GNPS web platform. a, Two methods exist for the generation of molecular networks on the GNPS web platform: classical MN and FBMN. For both methods, the MS data files are converted to the mzML format using tools such as Proteowizard MSConvert²¹. The classical MN method runs entirely on the GNPS platform, in which MS² spectra are clustered with MS-Cluster and the consensus MS² spectra obtained are used for molecular network generation. For FBMN, the user first applies a feature detection and alignment tool to process the LC–MS² data (such as MZmine, MS-DIAL, XCMS, OpenMS, Progenesis QI or MetaboScape) instead of using MS-Cluster (classical MN) on GNPS. Results are then exported as a feature quantification table (.TXT format) and MS² spectral summary (.MGF format) or an mzTab-M file and uploaded to the GNPS web platform for molecular networking analysis with the FBMN workflow. b, Graphs show the number of molecular networking jobs performed on GNPS. Top: the number of classical MN and FBMN jobs since 2016. Bottom: the number of FBMN jobs since its inception and key events accelerating its use.

While classical MN uses the spectral count or the summed precursor ion count, FBMN uses the LC–MS feature abundance (peak area or peak height), resulting in a more accurate estimation of the relative ion intensity. FBMN simplifies and aggregates data by including relative quantitative information and other MS¹-derived information (that is, precursor isotope patterns, adduct annotation). FBMN enables robust statistical analysis by providing accurate relative ion intensities across a dataset. This capacity is demonstrated with a dilution series dataset of the NIST 1950 serum reference standard, containing 150 spiked standards. Here the LC–MS² data were processed with MZmine¹¹ or OpenMS¹⁰ for FBMN (Fig. 2g,h). A linear regression analysis was used to compare the relative quantification obtained with classical MN and FBMN. Figure 2h shows that for FBMN, relative quantification had a coefficient of determination (R²) value distribution mostly above 0.7, whereas this was not found when the precursor ion abundance was obtained from classical MN via spectral counts (Fig. 2g). The shift of the R² distribution toward 1 indicates a more linear response between molecular concentration and ion abundance, which improves the accuracy and precision of the quantification results. In addition, FBMN facilitates the direct application of existing statistical, visualization and annotation tools, such as QIIME2 (ref. 12), MetaboAnalyst¹³, ’ili¹⁴, SIRIUS¹⁵, DEREPLICATOR¹⁶, MS2LDA¹⁷ and Qemistree¹⁸.
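The quantitative comparison described above rests on the coefficient of determination of an ordinary least squares fit between expected and observed abundances across a dilution series. The Python sketch below computes R² for one node and the fraction of nodes above a cutoff. The dilution factors, the abundance values and the function names are invented for illustration and are not taken from the NIST 1950 dataset.

```python
def r_squared(expected, observed):
    """R² of an OLS fit observed = a + b * expected.

    For a simple linear regression with an intercept, R² equals the squared
    Pearson correlation between the two variables.
    """
    n = len(expected)
    if n != len(observed) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear information
    return (sxy * sxy) / (sxx * syy)


def fraction_above(values, cutoff=0.7):
    """Fraction of R² values strictly above the cutoff."""
    return sum(1 for v in values if v > cutoff) / len(values)


# Hypothetical relative concentrations for five dilutions (assumed two-fold).
EXPECTED = [1.0, 0.5, 0.25, 0.125, 0.0625]
# Invented abundances for one node: peak areas track the dilution closely,
# whereas spectral counts barely change between dilutions.
PEAK_AREA = [1000.0, 520.0, 240.0, 130.0, 60.0]
SPECTRAL_COUNT = [4.0, 5.0, 3.0, 4.0, 3.0]

R2_AREA = r_squared(EXPECTED, PEAK_AREA)
R2_COUNT = r_squared(EXPECTED, SPECTRAL_COUNT)
print(f"peak area R2 = {R2_AREA:.3f}, spectral count R2 = {R2_COUNT:.3f}")
print(fraction_above([R2_AREA, R2_COUNT]))
```

Repeating this calculation for every node and plotting the resulting R² values as a histogram gives the kind of distribution that Fig. 2g,h summarizes.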
FBMN further enables the creation of molecular networks from ion mobility spectrometry (IMS) experiments coupled with LC–MS² analysis. As an orthogonal separation method, ion mobility offers additional resolving power to differentiate isomeric ions in the molecular network based on their collisional cross-section. The integration of ion mobility with FBMN on GNPS can currently be performed with MetaboScape, MS-DIAL¹⁹ and Progenesis QI. An example of such isomer separation using trapped IMS (TIMS) coupled to LC–MS² is shown in Supplementary Fig. 1.
Available on the GNPS web platform at https://gnps.ucsd.edu/, FBMN is ideally suited for advanced molecular networking analysis, enabling the characterization of isomers, incorporation of relative quantification and integration of ion mobility data. FBMN analysis is recommended for a single LC–MS² metabolomics study, but its applicability is limited when applied across multiple studies due to different experimental conditions and possible batch effects. Moreover, the use of FBMN for the analysis of very large datasets (containing several thousand samples) is limited by the scalability of most feature detection and alignment software tools. Thus, while FBMN offers an improvement upon many aspects of molecular networking analysis, classical MN remains essential for meta-analysis of large-scale datasets and is convenient for rapid analysis of LC–MS² data with fewer user-defined parameters. One important aspect of molecular networks obtained with FBMN is the use of adequate processing steps and parameters; inadequate choices could negatively affect the resulting molecular networks. To facilitate dissemination and education of the FBMN method and the supported processing software, we have created detailed tutorials and step-by-step instructions, available at https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/.
The FBMN workflow not only offers automated spectral library search and spectral library entry curation, but is also integrated with other annotation tools available on the GNPS environment, such as MASST²⁰, while promoting data analysis reproducibility by saving the FBMN jobs on the user’s private online workspace. The GNPS environment conveniently enables the user to evaluate different parameters and share the results via a URL for publication.
[Figure 2: panels a,b, molecular network of 4-deoxyphorbol esters, with EICs for m/z 589.31 ([M+Na]⁺, m/z 589.313) and the corresponding MS² spectra; panels c,d, molecular network of N-acyl amides, with EICs for m/z 330.26 and 344.28; panels e,f, molecular network of EDTA ([M+H]⁺, m/z 293.098), with EICs and EDTA in-source fragments; panels g,h, distribution of the R² for OLS linear regression analysis between the observed and expected ion abundances (number of nodes versus coefficient of determination, for all nodes and for reference compounds). Axes of the chromatograms: time (min) versus relative intensity (AU); axes of the spectra: m/z versus relative intensity.]

Fig. 2 | Comparisons of classical MN and FBMN. a–h, In these examples, the node size corresponds to the relative spectral count in classical MN (orange boxes, left) or to the sum of LC–MS peak area in FBMN (blue boxes, right); diamond shape nodes are spectra annotated by spectral library matching; the edge color gradient indicates the spectral similarity degree (lighter colors correspond to less similarity). a,b, Results from classical MN with LC–MS² data of Euphorbia dendroides plant samples (14 samples; n = 1 LC–MS² experiment per sample); classical MN resulted in one node for the ion at m/z 589.313 (a), while FBMN was able to detect seven isomers (b). AU, arbitrary units; EIC, extracted ion chromatogram. c,d, Classical MN with data from the AGP (201 samples; n = 1 LC–MS² experiment per sample) showed two different N-acyl amides (c), while the use of FBMN allowed the annotation of three different isomers per N-acyl amides (d). e,f, Classical MN (e) and FBMN (f) were used to analyze the network of EDTA in plasma (373 samples; n = 1 LC–MS² experiment per sample). By merging MS² spectra of EDTA eluting over 2.5 min into one representative MS² spectrum, FBMN recovered the molecular similarity of in-source fragments observed for EDTA. g,h, Evaluation of quantitative performance using multiple dilutions of a reference serum sample (5 dilutions; n = 3 LC–MS² experiments per sample). The plots show the distribution of the coefficient of determination (R²) from the ordinary least squares (OLS) linear regression analysis between the observed and expected relative ion abundances for molecular network nodes in classical MN (g) or FBMN (h). The upper charts present the distribution of the R² for the network nodes with classical MN (n = 3,367) and FBMN (n = 877), and the lower charts show the R² distribution for the annotated reference compounds with classical MN (n = 49) and FBMN (n = 54).
Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41592-020-0933-6.
Received: 18 October 2019; Accepted: 22 July 2020; Published online: 24 August 2020
References
1. Watrous, J. etal. Mass spectral molecular networking of living microbial
colonies. Proc. Natl Acad. Sci. USA 109, E1743–E1752 (2012).
2. Quinn, R. A. etal. Molecular networking as a drug discovery, drug
metabolism and precision medicine strategy. Trends Pharmacol. Sci. 38,
143–154 (2017).
3. Traxler, M. F. & Kolter, R. A massively spectacular view of the chemical lives
of microbes. Proc. Natl Acad. Sci. USA 109, 10128–10129 (2012).
4. Wang, M. etal. Sharing and community curation of mass spectrometry data
with Global Natural Products Social Molecular Networking. Nat. Biotechnol.
34, 828–837 (2016).
5. Frank, A. M. etal. Clustering millions of tandem mass spectra. J. Proteome
Res. 7, 113–122 (2008).
6. Homann, N. etal. mzTab-M: a data standard for sharing quantitative results
in mass spectrometry metabolomics. Anal. Chem. 91, 3302–3310 (2019).
7. Nothias, L.-F. etal. Bioactivity-based molecular networking for the discovery
of drug leads in natural product bioassay-guided fractionation. J. Nat. Prod.
81, 758–767 (2018).
8. Cohen, L. J. etal. Functional metagenomic discovery of bacterial eectors in
the human microbiome and isolation of commendamide, a GPCR G2A/132
agonist. Proc. Natl Acad. Sci. USA. 112, E4825–E4834 (2015).
9. McDonald, D. etal. American Gut: an open platform for citizen-science
microbiome research. mSystems 3, e0031–18 (2018).
10. Röst, H. L. etal. OpenMS: a exible open-source soware platform for mass
spectrometry data analysis. Nat. Methods 13, 741–748 (2016).
11. Pluskal, T., Castillo, S., Villar-Briones, A. & Oresic, M. MZmine 2: modular
framework for processing, visualizing and analyzing mass spectrometry-based
molecular prole data. BMC Bioinformatics 11, 395 (2010).
12. Bolyen, E. etal. Reproducible, interactive, scalable and extensible microbiome
data science using QIIME 2. Nat. Biotechnol. 37, 852–857 (2019).
13. Xia, J., Sinelnikov, I. V., Han, B. & Wishart, D. S. MetaboAnalyst 3.0—making
metabolomics more meaningful. Nucleic Acids Res. 43, W251–W257 (2015).
14. Protsyuk, I., Melnik, A. V., Nothias, L. F. & Rappez, L. 3D molecular
cartography using LC–MS facilitated by Optimus and’ili soware. Nat. Protoc.
13, 134–154 (2018).
15. Dührkop, K. etal. SIRIUS 4: a rapid tool for turning tandem mass spectra
into metabolite structure information. Nat. Methods 16, 299–302 (2019).
16. Mohimani, H. etal. Dereplication of peptidic natural products through
database search of mass spectra. Nat. Chem. Biol. 13, 30–37 (2017).
17. vander Hoo, J. J. J., Wandy, J., Barrett, M. P., Burgess, K. E. V. & Rogers, S.
Topic modeling for untargeted substructure exploration in metabolomics.
Proc. Natl Acad. Sci. USA. 113, 13738–13743 (2016).
18. Tripathi, A. etal. Chemically-informed analyses of metabolomics mass
spectrometry data with qemistree. Preprint at bioRxiv 2020.05.04.077636
(2020) https://doi.org/10.1101/2020.05.04.077636.
19. Tsugawa, H. etal. A lipidome atlas in MS-DIAL 4. Nat. Biotechnol. https://
doi.org/10.1038/s41587-020-0531-2 (2020).
20. Wang, M. etal. Mass spectrometry searches using MASST. Nat. Biotechnol.
38, 23–26 (2020).
21. Chambers, M. C. etal. A cross-platform toolkit for mass spectrometry and
proteomics. Nat. Biotechnol. 30, 918–920 (2012).
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
© The Author(s), under exclusive licence to Springer Nature America, Inc. 2020
¹Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA.
²Collaborative Mass Spectrometry Innovation Center, University of California San Diego, La Jolla, CA, USA.
³Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA.
⁴Institute of Inorganic and Analytical Chemistry, University of Münster, Münster, Germany.
⁵Chair for Bioinformatics, Friedrich-Schiller University, Jena, Germany.
⁶Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Bolzano, Italy.
⁷Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.
⁸Section for Clinical Mass Spectrometry, Department of Congenital Disorders, Danish Center for Neonatal Screening, Statens Serum Institut, Copenhagen, Denmark.
⁹RIKEN Center for Sustainable Resource Science, Yokohama, Japan.
¹⁰RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
¹¹Applied Bioinformatics, Department of Computer Science, University of Tübingen, Tübingen, Germany.
¹²Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, Germany.
¹³Department of Phytochemistry and Bioactive Natural Products, University of Geneva, Geneva, Switzerland.
¹⁴Bruker Daltonics, Bremen, Germany.
¹⁵Equipe PNAS, UMR 8038 CiTCoM CNRS, Faculté de Pharmacie de Paris, Université Paris Descartes, Paris, France.
¹⁶Department of Physics and Chemistry, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, Brazil.
¹⁷Institute of Chemistry, Technische Universität Berlin, Berlin, Germany.
¹⁸School of Chemistry and Biochemistry, Center for Microbial Dynamics and Infection, Georgia Institute of Technology, Atlanta, GA, USA.
¹⁹Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, St. Petersburg State University, St. Petersburg, Russia.
²⁰Waters Corporation, Milford, MA, USA.
²¹Institute of Microbiology of the Czech Academy of Sciences, Prague, Czech Republic.
²²College of Pharmacy, Sookmyung Women’s University, Seoul, Republic of Korea.
²³Univ. Grenoble Alpes, CNRS, Grenoble INP, CHU Grenoble Alpes, TIMC-IMAG, Grenoble, France.
²⁴Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Panama, Republic of Panama.
²⁵Department of Chemistry and Biochemistry, Department of Microbiology and Plant Biology and Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma, Norman, OK, USA.
²⁶Nonlinear Dynamics, Milford, MA, USA.
²⁷Computational Biology Department, School of Computer Sciences, Carnegie Mellon University, Pittsburgh, PA, USA.
²⁸Department of Biological and Environmental Sciences, University of West Alabama, Livingston, AL, USA.
²⁹Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
³⁰Bioinformatics and Scientific Data, Leibniz Institute of Plant Biochemistry, Halle, Germany.
³¹German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany.
³²Laboratoire de Chimie des Produits Naturels, UMR CNRS SPE, Université de Corse Pascal Paoli, Corte, France.
³³Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Denver, Aurora, CO, USA.
³⁴Whitehead Institute for Biomedical Research, Cambridge, MA, USA.
³⁵Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA.
³⁶School of Computing Science, University of Glasgow, Glasgow, UK.
³⁷Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA.
³⁸Bioinformatics Group, Wageningen University, Wageningen, the Netherlands.
³⁹Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA.
⁴⁰Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, München, Germany.
⁴¹College of Pharmacy, Kangwon National University, Chuncheon-si, Republic of Korea.
⁴²Institute for Translational Bioinformatics, University Hospital Tübingen, Tübingen, Germany.
⁴³Biomolecular Interactions, Max Planck Institute for Developmental Biology, Tübingen, Germany.
⁴⁴Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
⁴⁵These authors contributed equally: Louis-Félix Nothias, Daniel Petras, Robin Schmid.
✉ e-mail: miw023@ucsd.edu; pdorrestein@ucsd.edu
Citations
More filters
Journal ArticleDOI

Mass spectrometry-based metabolomics: a guide for annotation, quantification and best reporting practices

TL;DR: In this article, the authors present guidelines covering sample preparation, replication and randomization, quantification, recovery and recombination, ion suppression and peak misidentification, as a means to enable high-quality reporting of liquid chromatography and gas chromatography-mass spectrometry-derived data.
Journal ArticleDOI

Systematic classification of unknown metabolites using high-resolution fragmentation mass spectra

TL;DR: The broad utility of CANOPUS is demonstrated by investigating the effect of microbial colonization in the mouse digestive system, through analysis of the chemodiversity of different Euphorbia plants and regarding the discovery of a marine natural product, revealing biological insights at the compound class level.

Recurrent Topics in Mass Spectrometry-Based Metabolomics and Lipidomics-Standardization, Coverage, and Throughput.

TL;DR: This work states that improving Coverage, Selectivity, and Reliability in Metabolomics and Lipidomics is a major goal and should be focused on in the coming academic year.

Software tools, databases and resources in metabolomics: updates from 2018 to 2019.

TL;DR: This review introduces and briefly presents around 100 metabolomics software resources, tools, databases, and other utilities that have surfaced or have improved in 2019.
References

Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks

TL;DR: Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2

Evan Bolyen, +123 more
- 01 Aug 2019 - 
TL;DR: QIIME 2 development was primarily funded by NSF Awards 1565100 to J.G.C. and R.K.P. and partial support was also provided by the following: grants NIH U54CA143925 and U54MD012388.

MZmine 2: Modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data

TL;DR: A new generation of a popular open-source data processing toolbox, MZmine 2 is introduced, suitable for processing large batches of data and has been applied to both targeted and non-targeted metabolomic analyses.

MetaboAnalyst 3.0—making metabolomics more meaningful

TL;DR: By completely re-implementing the MetaboAnalyst suite using the latest web framework technologies, the server has been able to substantially improve its performance, capacity and user interactivity.
Related Papers (5)

Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking

Mingxun Wang, +135 more
- 01 Aug 2016 - 
Frequently Asked Questions (13)
Q1. What contributions have the authors mentioned in the paper "Feature-based molecular networking in the GNPS analysis environment"?

This publication is distributed under The Association of Universities in the Netherlands (VSNU) 'Article 25fa implementation' project. In this project research outputs of researchers employed by Dutch Universities that comply with the legal requirements of Article 25fa of the Dutch Copyright Act are distributed online and free of cost or other barriers in institutional repositories.

The FBMN workflow also supports the mzTab-M format6, a standardized output format designed for reporting metabolomics MS data-processing results.

Results from SIRIUS can be mapped on the molecular networks, which is essential since spectral library matching usually results in only a 1–5% annotation rate.

Experimental spectral ‘clustering’ methods for the creation of the representative MS2 spectrum in FBMN are implemented in MZmine, OpenMS and XCMS. 

MS2LDA uses the latent Dirichlet allocation algorithm to mine for motifs (Mass2Motifs) of co-occurring fragments and neutral losses in MS2 spectra17,43. 
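
The input side of this approach can be illustrated with a minimal sketch of how an MS2 spectrum might be turned into a bag of fragment and neutral-loss "words" before topic modelling. This is not the MS2LDA implementation: the function name `spectrum_to_words`, the two-decimal binning and the choice to count each peak once (rather than weighting by intensity) are assumptions made for illustration, and the latent Dirichlet allocation step itself is not shown.

```python
from collections import Counter


def spectrum_to_words(precursor_mz, peaks, decimals=2):
    """Turn one MS2 spectrum into a bag of 'words'.

    Each peak yields a fragment word, and a neutral-loss word when the
    loss (precursor m/z minus fragment m/z) is positive. The binning to
    `decimals` places is an illustrative assumption.
    """
    words = Counter()
    for mz, _intensity in peaks:
        words[f"fragment_{mz:.{decimals}f}"] += 1
        loss = precursor_mz - mz
        if loss > 0:
            words[f"loss_{loss:.{decimals}f}"] += 1
    return words


# Two hypothetical spectra that share a fragment and a neutral loss.
doc_a = spectrum_to_words(285.08, [(153.02, 100.0), (267.07, 50.0)])
doc_b = spectrum_to_words(301.07, [(153.02, 80.0), (283.06, 40.0)])

# Words occurring in both documents; co-occurrence patterns of this kind
# are what a topic model would group into motifs.
shared_words = sorted(set(doc_a) & set(doc_b))
```

With these made-up values, both spectra contain the fragment near m/z 153.02 and a loss near 18.01, so those two words co-occur across the documents.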

The computational cost of the data-processing part depends on (1) the software employed, (2) the number of samples in the dataset and (3) the parameters set. 

FBMN is ideally suited for advanced molecular networking analysis, enabling the characterization of isomers, incorporation of relative quantification and integration of ion mobility data. 

XCMS (for the most recent version, see https://github.com/sneumann/xcms/) is one of the most widely used software packages for processing of MS-based metabolomics data27. 

MS2LDA can be run on the GNPS web platform (https://ccms-ucsd.github.io/GNPSDocumentation/ms2lda/) and/or in the MS2LDA web application43.

For these large datasets, tools that were designed to operate on a cluster/cloud computer are preferred (XCMS27, OpenMS10 and, to some extent, MZmine). 

While FBMN improves upon many aspects of molecular networking analysis, classical MN remains essential for meta-analysis of large-scale datasets and is convenient for rapid analysis of LC–MS2 data with fewer user-defined parameters. One important aspect of molecular networks obtained with FBMN is the use of adequate processing steps and parameters; inadequate choices could negatively affect the resulting molecular networks.

along with DEREPLICATOR VarQuest41, is a collection of computational MS tools specialized in the annotation of peptidic small molecules often produced by microorganisms endowed with various biological activities. 

The MS2 spectral summary file (.MGF format) generated for the FBMN is compatible with SIRIUS, either running locally or with the dedicated GNPS workflow (https://ccms-ucsd.github.io/GNPSDocumentation/sirius/).
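
To make the .MGF format more concrete, the following is a minimal sketch of reading such a file in plain Python. It assumes the common MGF layout of `BEGIN IONS` / `END IONS` blocks containing `KEY=value` lines followed by m/z and intensity pairs. The function name `parse_mgf`, the example field names (`FEATURE_ID`, `PEPMASS`, `RTINSECONDS`) and all values are illustrative assumptions, not taken from the paper or from a real export.

```python
def parse_mgf(text):
    """Parse MGF text into a list of spectra.

    Each spectrum is a dict with 'params' (KEY -> string value) and
    'peaks' (list of (mz, intensity) float tuples).
    """
    spectra = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None:
            if "=" in line:
                # Header line such as PEPMASS=285.0757
                key, value = line.split("=", 1)
                current["params"][key.strip().upper()] = value.strip()
            else:
                # Peak line: m/z and intensity separated by whitespace
                parts = line.split()
                if len(parts) >= 2:
                    current["peaks"].append((float(parts[0]), float(parts[1])))
    return spectra


# Hypothetical two-spectrum file with made-up values.
EXAMPLE_MGF = """BEGIN IONS
FEATURE_ID=1
PEPMASS=285.0757
RTINSECONDS=312.4
153.0182 1200.0
257.0808 350.5
END IONS

BEGIN IONS
FEATURE_ID=2
PEPMASS=301.0707
RTINSECONDS=298.1
151.0026 900.0
END IONS
"""

spectra = parse_mgf(EXAMPLE_MGF)
feature_ids = [s["params"]["FEATURE_ID"] for s in spectra]
```

Keeping a per-spectrum feature identifier is what would let annotations computed from the MS2 spectra be joined back to rows of a feature quantification table; the exact field name used by a given tool should be checked against its documentation.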