Open access | Journal Article | DOI: 10.1039/D0NP00090F

The year 2020 in natural product bioinformatics: an overview of the latest tools and databases.

04 Mar 2021 | Natural Product Reports (Royal Society of Chemistry (RSC)) | Vol. 38, Iss. 2, pp. 301-306
Abstract: Covering: 2020. Bioinformatic approaches to document and analyse chemical structures, biosynthetic gene clusters and analytical data play an important role in the study of natural products. Every year, so many new algorithms, tools and databases are released that it is difficult to keep track of all the latest developments. The aim of this short article is to provide a concise overview of, and reference to, the major tools, methods and databases that have been released in the past year.

Citations

11 results found


Open access | Journal Article | DOI: 10.3390/FOODS10061308
07 Jun 2021 | Foods
Abstract: The valorization of agri-food by-products is essential from both economic and sustainability perspectives. The large quantity of such materials causes problems for the environment; however, they can also generate new valuable ingredients and products which promote beneficial effects on human health. It is estimated that soybean production, the major oilseed crop worldwide, will leave about 597 million metric tons of branches, leaves, pods, and roots on the ground post-harvesting in 2020/21. An alternative for the use of soy-related by-products arises from the several bioactive compounds found in this plant. Metabolomics studies have already identified isoflavonoids, saponins, and organic and fatty acids, among other metabolites, in all soy organs. The present review aims to show the application of metabolomics for identifying high-added-value compounds in underused parts of the soy plant, listing the main bioactive metabolites identified up to now, as well as the factors affecting their production.

2 Citations


Open access | Journal Article | DOI: 10.1099/MIC.0.001084
13 Sep 2021 | Microbiology
Abstract: Last year ActinoBase, a Wiki-style initiative supported by the UK Microbiology Society, published a review highlighting research of particular interest to the actinomycete community. Here, we present the second ActinoBase review showcasing selected reports published in 2020 and early 2021, integrating perspectives in the actinomycete field. Actinomycetes are well-known for their unsurpassed ability to produce specialised metabolites, many of which are used as therapeutic agents with antibacterial, antifungal, or immunosuppressive activities. Much research is carried out to understand the purpose of these metabolites in the environment, either within communities or in host interactions. Moreover, much effort has gone into developing computational tools to handle big data, simplify experimental design, and find new biosynthetic gene cluster prioritisation strategies. Alongside this, synthetic biology has provided advances in tools to elucidate the biosynthesis of these metabolites. Additionally, there are still mysteries to be uncovered in understanding the fundamentals of the developmental cycle of filamentous actinomycetes and the regulation of their metabolism. This review focuses on research using integrative methodologies and approaches to understand the bigger picture of actinomycete biology, covering four research areas: i) technology and methodology; ii) specialised metabolites; iii) development and regulation; and iv) ecology and host interactions.

1 Citation


Open access | Journal Article | DOI: 10.1128/MSYSTEMS.00489-21
Yoon-Hee Chung, Hiyoung Kim, Chang-Hun Ji, Hyun-Woo Je, +4 more authors (4 institutions)
24 Aug 2021
Abstract: The genus Streptomyces is one of the richest sources of secondary metabolite biosynthetic gene clusters (BGCs). Sequencing of a large number of genomes has provided evidence that this well-known bacterial genus still harbors a large number of cryptic BGCs, and their metabolites are yet to be discovered. When taking a gene-first approach for new natural product discovery, BGC prioritization would be the most crucial step for the discovery of novel chemotypes. We hypothesized that strains with a greater number of BGCs would also contain a greater number of silent unique BGCs due to the presence of complex regulatory systems. Based on this hypothesis, we employed a comparative genomics approach to identify a specific Streptomyces phylogenetic lineage with the highest and yet-uncharacterized biosynthetic potential. A comparison of BGC abundance and genome size across 158 phylogenetically diverse Streptomyces type strains identified that members of the phylogenetic group characterized by the formation of rugose-ornamented spores possess the greatest number of BGCs (average, 50 BGCs) and also the largest genomes (average, 11.5 Mb). The study of genetic and biosynthetic diversities using comparative genomics of 11 sequenced genomes and a genetic similarity network analysis of BGCs suggested that members of this group carry a large number of unique BGCs, the majority of which are cryptic and not associated with any known natural product. We believe that members of this Streptomyces phylogenetic group possess a remarkable biosynthetic potential and thus would be a good target for a metabolite characterization study that could lead to the discovery of novel chemotypes.
Importance: It is now well recognized that members of the genus Streptomyces still harbor a large number of cryptic BGCs in their genomes, which are mostly silent under laboratory culture conditions. Activation of transcriptionally silent BGCs is technically challenging and thus forms a bottleneck when taking a gene-first approach for the discovery of new natural products. Thus, it is important to focus activation efforts on strains with BGCs that have the potential to produce novel metabolites. The clade-level analysis of biosynthetic diversity could provide insights into the relationship between phylogenetic lineage and biosynthetic diversity. By exploring BGC abundance in relation to Streptomyces phylogeny, we identified a specific monophyletic lineage associated with the highest BGC abundance. Then, using a combined analysis of comparative genomics and a genetic network, we demonstrated that members of this lineage are genetically and biosynthetically diverse, contain a large number of cryptic BGCs with novel genotypes, and thus would be a good target for metabolite characterization studies.

1 Citation


Open access | Journal Article | DOI: 10.1039/D0SC06919A
05 May 2021 | Chemical Science
Abstract: Antibiotic development based on natural products has faced a long-lasting decline since the 1970s, while both the speed and the extent of antimicrobial resistance (AMR) development have been severely underestimated. The discovery of antimicrobial natural products of bacterial and fungal origin featuring new chemistry and previously unknown modes of action is increasingly challenged by rediscovery issues. Natural products that are abundantly produced by the corresponding wild type organisms, often featuring strong UV signals, have been extensively characterized, especially the ones produced by extensively screened microbial genera such as streptomycetes. Purely synthetic chemistry approaches aiming to replace the declining supply from natural products as starting materials to develop novel antibiotics largely failed to provide significant numbers of antibiotic drug leads. To cope with this fundamental issue, microbial natural products science is being transformed from a ‘grind-and-find’ study to an integrated approach based on bacterial genomics and metabolomics. Novel technologies in instrumental analytics are increasingly employed to lower detection limits and expand the space of detectable substance classes, while broadening the scope of accessible and potentially bioactive natural products. Furthermore, the almost exponential increase in publicly available bacterial genome data has shown that the biosynthetic potential of the investigated strains by far exceeds the amount of detected metabolites. This can be judged by the discrepancy between the number of biosynthetic gene clusters (BGCs) encoded in the genome of each microbial strain and the number of secondary metabolites actually detected, even when considering the increased sensitivity provided by novel analytical instrumentation. In silico annotation tools for biosynthetic gene cluster classification and analysis allow fast prioritization in BGC-to-compound workflows, which is highly important to be able to process the enormous underlying data volumes. BGC prioritization is currently accompanied by novel molecular biology-based approaches to access the so-called orphan BGCs not yet correlated with a secondary metabolite. Integration of metabolomics, in silico genomics and molecular biology approaches into the mainstream of natural product research will critically influence future success and impact the natural product field in pharmaceutical, nutritional and agrochemical applications and especially in anti-infective research.

1 Citation


Journal Article | DOI: 10.1039/D1NP00036E
Abstract: Covering: 2010 to 2021. Organisms in nature have evolved into proficient synthetic chemists, utilizing specialized enzymatic machinery to biosynthesize an inspiring diversity of secondary metabolites. Often serving to boost competitive advantage for their producers, these secondary metabolites have widespread human impacts as antibiotics, anti-inflammatories, and antifungal drugs. The natural products discovery field has begun a shift away from traditional activity-guided approaches and is beginning to take advantage of increasingly available metabolomics and genomics datasets to explore undiscovered chemical space. Major strides have been made and now enable -omics-informed prioritization of chemical structures for discovery, including the prospect of confidently linking metabolites to their biosynthetic pathways. Over the last decade, more integrated strategies now provide researchers with pipelines for simultaneous identification of expressed secondary metabolites and their biosynthetic machinery. However, continuous collaboration by the natural products community will be required to optimize strategies for effective evaluation of natural product biosynthetic gene clusters to accelerate discovery efforts. Here, we provide an evaluative guide to scientific literature as it relates to studying natural product biosynthesis using genomics, metabolomics, and their integrated datasets. Particular emphasis is placed on the unique insights that can be gained from large-scale integrated strategies, and we provide source organism-specific considerations to evaluate the gaps in our current knowledge.

Topics: Chemical space (50%)

1 Citation


References

56 results found


Open access | Journal Article | DOI: 10.1038/S41589-019-0400-9
Abstract: Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the ‘biosynthetic gene similarity clustering and prospecting engine’ (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the ‘core analysis of syntenic orthologues to prioritize natural product gene clusters’ (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues. Two bioinformatic tools, BiG-SCAPE and CORASON, enable sequence similarity network and phylogenetic analysis of gene clusters and their families across hundreds of strains and in large datasets, leading to the discovery of new natural products.

Topics: Gene cluster (50%)

201 Citations


Open access | Journal Article | DOI: 10.1038/S41592-020-0933-6
24 Aug 2020 | Nature Methods
Abstract: Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present feature-based molecular networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. FBMN enables quantitative analysis and resolution of isomers, including from ion mobility spectrometry.

163 Citations


Open access | Journal Article | DOI: 10.1021/ACSCENTSCI.9B00806
Abstract: Despite rapid evolution in the area of microbial natural products chemistry, there is currently no open access database containing all microbially produced natural product structures. Lack of availability of these data is preventing the implementation of new technologies in natural products science. Specifically, development of new computational strategies for compound characterization and identification is being hampered by the lack of a comprehensive database of known compounds against which to compare experimental data. The creation of an open access, community-maintained database of microbial natural product structures would enable the development of new technologies in natural products discovery and improve the interoperability of existing natural products data resources. However, these data are spread unevenly throughout the historical scientific literature, including both journal articles and international patents. These documents have no standard format, are often not digitized as machine readable text, and are not publicly available. Further, none of these documents have associated structure files (e.g., MOL, InChI, or SMILES), instead containing images of structures. This makes extraction and formatting of relevant natural products data a formidable challenge. Using a combination of manual curation and automated data mining approaches we have created a database of microbial natural products (The Natural Products Atlas, www.npatlas.org) that includes 24 594 compounds and contains referenced data for structure, compound names, source organisms, isolation references, total syntheses, and instances of structural reassignment. This database is accompanied by an interactive web portal that permits searching by structure, substructure, and physical properties. The Web site also provides mechanisms for visualizing natural products chemical space and dashboards for displaying author and discovery timeline data. These interactive tools offer a powerful knowledge base for natural products discovery with a central interface for structure and property-based searching and present new viewpoints on structural diversity in natural products. The Natural Products Atlas has been developed under FAIR principles (Findable, Accessible, Interoperable, and Reusable) and is integrated with other emerging natural product databases, including the Minimum Information About a Biosynthetic Gene Cluster (MIBiG) repository, and the Global Natural Products Social Molecular Networking (GNPS) platform. It is designed as a community-supported resource to provide a central repository for known natural product structures from microorganisms and is the first comprehensive, open access resource of this type. It is expected that the Natural Products Atlas will enable the development of new natural products discovery modalities and accelerate the process of structural characterization for complex natural products libraries.

116 Citations


Open access | Journal Article | DOI: 10.1186/S13321-020-00424-9
Abstract: Natural products (NPs) have been the centre of attention of the scientific community in recent decades, and interest in them continues to grow. As a consequence, the last 20 years have seen a rapid multiplication of databases and collections serving as general or thematic resources for NP information. In this review, we establish a complete overview of these resources, and the numbers are overwhelming: over 120 different NP databases and collections have been published and re-used since 2000. 98 of them are still accessible in some form and only 50 are open access. The latter include not only databases but also big collections of NPs published as supplementary material in scientific publications and collections that were backed up in the ZINC database for commercially-available compounds. Some databases, even ones published relatively recently, are no longer accessible, which leads to a dramatic loss of data on NPs. The data sources are presented in this manuscript, together with a comparison of the content of the open ones. With this review, we also compiled the open-access natural compounds into one single dataset, a COlleCtion of Open NatUral producTs (COCONUT), which is available on Zenodo and contains structures and sparse annotations for over 400,000 non-redundant NPs, which makes it the biggest open collection of NPs available to date.

81 Citations


Journal Article | DOI: 10.1038/S41587-020-0740-8
Abstract: Metabolomics using nontargeted tandem mass spectrometry can detect thousands of molecules in a biological sample. However, structural molecule annotation is limited to structures present in libraries or databases, restricting analysis and interpretation of experimental data. Here we describe CANOPUS (class assignment and ontology prediction using mass spectrometry), a computational tool for systematic compound class annotation. CANOPUS uses a deep neural network to predict 2,497 compound classes from fragmentation spectra, including all biologically relevant classes. CANOPUS explicitly targets compounds for which neither spectral nor structural reference data are available and predicts classes lacking tandem mass spectrometry training data. In evaluation using reference data, CANOPUS reached very high prediction performance (average accuracy of 99.7% in cross-validation) and outperformed four baseline methods. We demonstrate the broad utility of CANOPUS by investigating the effect of microbial colonization in the mouse digestive system, through analysis of the chemodiversity of different Euphorbia plants and regarding the discovery of a marine natural product, revealing biological insights at the compound class level. Unknown metabolites are classified from mass spectrometry data.

68 Citations


Performance
Metrics
No. of citations received by the paper in previous years:
Year    Citations
2021    11