
Showing papers by "Conrad L. Schoch" published in 2017


Journal Article
TL;DR: To realize the full potential of fungal SBCI it will be necessary to make advances in multiple areas, including changes to nomenclatural rules to enable valid publication of sequence-based taxon descriptions.
Abstract: Fungal taxonomy and ecology have been revolutionized by the application of molecular methods and both have increasing connections to genomics and functional biology. However, data streams from traditional specimen- and culture-based systematics are not yet fully integrated with those from metagenomic and metatranscriptomic studies, which limits understanding of the taxonomic diversity and metabolic properties of fungal communities. This article reviews current resources, needs, and opportunities for sequence-based classification and identification (SBCI) in fungi as well as related efforts in prokaryotes. To realize the full potential of fungal SBCI it will be necessary to make advances in multiple areas. Improvements in sequencing methods, including long-read and single-cell technologies, will empower fungal molecular ecologists to look beyond ITS and current shotgun metagenomics approaches. Data quality and accessibility will be enhanced by attention to data and metadata standards and rigorous enforcement of policies for deposition of data and workflows. Taxonomic communities will need to develop best practices for molecular characterization in their focal clades, while also contributing to globally useful datasets including ITS. Changes to nomenclatural rules are needed to enable valid publication of sequence-based taxon descriptions. Finally, cultural shifts are necessary to promote adoption of SBCI and to accord professional credit to individuals who contribute to community resources.

159 citations


Journal Article
01 Jan 2017-Database
TL;DR: The recent taxonomic information was applied to do a complete taxonomic audit for the genus Trichoderma in the NCBI Taxonomy database, and a list of quality records of the RPB2 gene obtained from type material in GenBank that could help validate future submissions.
Abstract: The ITS (nuclear ribosomal internal transcribed spacer) RefSeq database at the National Center for Biotechnology Information (NCBI) is dedicated to the clear association between name, specimen and sequence data. This database is focused on sequences obtained from type material stored in public collections. While the initial ITS sequence curation effort together with numerous fungal taxonomy experts attempted to cover as many orders as possible, we extended our latest focus to the family and genus ranks. We focused on Trichoderma for several reasons, mainly because the asexual and sexual synonyms were well documented, and a list of proposed names and type material were recently proposed and published. In this case study the recent taxonomic information was applied to do a complete taxonomic audit for the genus Trichoderma in the NCBI Taxonomy database. A name status report is available here: https://www.ncbi.nlm.nih.gov/Taxonomy/TaxIdentifier/tax_identifier.cgi. As a result, the ITS RefSeq Targeted Loci database at NCBI has been augmented with more sequences from type and verified material from Trichoderma species. Additionally, to aid in the cross referencing of data from single loci and genomes we have collected a list of quality records of the RPB2 gene obtained from type material in GenBank that could help validate future submissions. During the process of curation misidentified genomes were discovered, and sequence records from type material were found hidden under previous classifications. Source metadata curation, although more cumbersome, proved to be useful as confirmation of the type material designation. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353

26 citations


Journal Article
01 Dec 2017
TL;DR: This article focuses on one such step by proposing a standard way for publications to flag papers with novel taxonomic information, so that the potential for automated searches of publication aggregators is improved, as well as the accurate curation of taxonomic information.
Abstract: The combination of manual curation and the reliance on updates from submitters to the public sequence databases is currently inefficient and impedes the comprehensive and timely release of records with new taxonomic names. This should be improved by making several steps during data release more efficient. This article focuses on one such step by proposing a standard way for publications to flag papers with novel taxonomic information. As a result, the potential for automated searches of publication aggregators is improved, as well as the accurate curation of taxonomic information.

10 citations



Posted Content
22 Nov 2017
TL;DR: A new NCBI policy is outlined that normalizes Influenza virus taxonomy processing but maintains features supported by the previous approach, and will reduce the amount of manual handling necessary for flu submissions and pave the way for increased automation of the submissions process.
Abstract: Currently the National Center for Biotechnology Information (NCBI) assigns individual taxonomy identifiers to each distinct influenza virus isolate submitted to GenBank. To support this practice, individual flu isolates must be manually added to the NCBI taxonomy database and unique taxonomy identifiers generated. This added layer of manual processing is unique to influenza virus and prevents automatization of the flu sequence submission process. Here we outline a new NCBI policy that normalizes Influenza virus taxonomy processing but maintains features supported by the previous approach. This change will reduce the amount of manual handling necessary for flu submissions and pave the way for increased automation of the submissions process. While this automation may disrupt some historic practices, it will better align influenza virus data processing with other viruses and ultimately lower the submission burden on data providers.

Introduction

GenBank is a member of the International Nucleotide Sequence Database Collaboration (INSDC) (Cochrane et al. 2016) data repositories dedicated to providing public access to biological sequence data. Viral taxonomy within INSDC databases follows the guidelines provided by the International Committee on the Taxonomy of Viruses (ICTV). The scope of the ICTV mandate extends from species to higher level taxa, and no subspecific taxa are maintained by the ICTV (Adams et al. 2017).

All viral sequences submitted to GenBank and other INSDC repositories are assigned to a species. Sequences from characterized viruses are assigned to their pre-existing species. Sequences from novel viruses are assigned to newly created, unclassified species. Typically, subspecific taxonomic ranks are not created at the time of submission, though some formally unranked subspecific taxa are made during post-submission taxonomic revisions. Creation of new viral taxa within the NCBI taxonomy database, whether families, species, or subspecific ranks, requires manual validation and database operations.

There are currently more than 550,000 Influenzavirus A, B, and C nucleotide sequences in GenBank, nearly twenty percent of the entire viral nucleotide sequence content of this database (see Table 1). These sequences represent a coordinated effort by the international scientific community to share critical public health data (Bao et al. 2008), and it is imperative that GenBank provides efficient data distribution pathways to support this and similar efforts. Given the number of influenza virus sequences generated by the scientific community, efficient distribution to GenBank can only be sustained through increased automation of the submissions process.

PeerJ Preprints | https://doi.org/10.7287/peerj.preprints.3428v1 | CC BY 4.0 Open Access | rec: 22 Nov 2017, publ: 22 Nov 2017