Author

David Bass

Bio: David Bass is an academic researcher from the Natural History Museum. The author has contributed to research in topics: Cercozoa & Phylogenetic tree. The author has an h-index of 45 and has co-authored 116 publications receiving 9,420 citations. Previous affiliations of David Bass include Centre for Environment, Fisheries and Aquaculture Science & American Museum of Natural History.


Papers
Journal Article
TL;DR: This revision of the classification of eukaryotes retains an emphasis on the protists and incorporates changes since 2005 that have resolved nodes and branches in phylogenetic trees.
Abstract: This revision of the classification of eukaryotes, which updates that of Adl et al. [J. Eukaryot. Microbiol. 52 (2005) 399], retains an emphasis on the protists and incorporates changes since 2005 that have resolved nodes and branches in phylogenetic trees. Whereas the previous revision was successful in re-introducing name stability to the classification, this revision provides a classification for lineages that were then still unresolved. The supergroups have withstood phylogenetic hypothesis testing with some modifications, but despite some progress, problematic nodes at the base of the eukaryotic tree still remain to be statistically resolved. Looking forward, subsequent transformations to our understanding of the diversity of life will be from the discovery of novel lineages in previously under-sampled areas and from environmental genomic information.

1,414 citations

Journal Article
TL;DR: The presence of both rRNA and rDNA sequences, the handling of introns (crucial for eukaryotic sequences), a normalized eight-term ranked taxonomy, and updates with new GenBank releases were made possible by a long-term collaboration between experts in taxonomy and computer scientists.
Abstract: The interrogation of genetic markers in environmental meta-barcoding studies is currently seriously hindered by the lack of taxonomically curated reference data sets for the targeted genes. The Protist Ribosomal Reference database (PR2, http://ssurrna.org/) provides a unique access to eukaryotic small sub-unit (SSU) ribosomal RNA and DNA sequences, with curated taxonomy. The database mainly consists of nuclear-encoded protistan sequences. However, metazoans, land plants, macrosporic fungi and eukaryotic organelles (mitochondrion, plastid and others) are also included because they are useful for the analysis of high-throughput sequencing data sets. Introns and putative chimeric sequences have also been carefully checked. Taxonomic assignation of sequences consists of eight unique taxonomic fields. In total,

1,175 citations

Journal Article
TL;DR: It is suggested that current understanding of the ecological complexity of protist communities, genetic diversity, and global species richness is severely limited by the sequence data hitherto available, and long-tailed rank abundance curves suggest that the 454 sequencing approach provides improved access to rare genotypes.
Abstract: Sequencing of ribosomal DNA clone libraries amplified from environmental DNA has revolutionized our understanding of microbial eukaryote diversity and ecology. The results of these analyses have shown that protist groups are far more genetically heterogeneous than their morphological diversity suggests. However, the clone library approach is labour-intensive, relatively expensive, and methodologically biased. Therefore, even the most intensive rDNA library analyses have recovered only small samples of much larger assemblages, indicating that global environments harbour a vast array of unexplored biodiversity. High-throughput parallel tag 454 sequencing offers an unprecedented scale of sampling for molecular detection of microbial diversity. Here, we report a 454 protocol for sampling and characterizing assemblages of eukaryote microbes. We use this approach to sequence two SSU rDNA diversity markers (the variable V4 and V9 regions) from 10 L of anoxic Norwegian fjord water. We identified 38 116 V4 and 15 156 V9 unique sequences. Both markers detect a wide range of taxonomic groups, but in both cases the diversity detected was dominated by dinoflagellates and close relatives. Long-tailed rank abundance curves suggest that the 454 sequencing approach provides improved access to rare genotypes. Most tags detected represent genotypes not currently in GenBank, although many are similar to database sequences. We suggest that current understanding of the ecological complexity of protist communities, genetic diversity, and global species richness is severely limited by the sequence data hitherto available, and we discuss the biological significance of this high amplicon diversity.

1,026 citations
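
The rank abundance curves mentioned in the abstract above can be illustrated with a minimal Python sketch. The reads, the function names `rank_abundance` and `singleton_fraction`, and the use of exact-match counting are assumptions made for illustration; they are not the authors' pipeline.

```python
from collections import Counter


def rank_abundance(reads):
    """Return (unique sequence, count) pairs sorted from most to least abundant."""
    counts = Counter(reads)
    # Sort by descending count; break ties alphabetically so output is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def singleton_fraction(ranked):
    """Fraction of unique sequences seen exactly once (the rare 'long tail')."""
    if not ranked:
        return 0.0
    return sum(1 for _, n in ranked if n == 1) / len(ranked)


# Hypothetical toy reads standing in for sequence tags.
reads = ["ACGT", "ACGT", "ACGT", "ACGA", "ACGA", "TTGA", "CCGA"]
ranked = rank_abundance(reads)
for rank, (seq, n) in enumerate(ranked, start=1):
    print(rank, seq, n)
print("singleton fraction:", singleton_fraction(ranked))
```

Plotting count against rank for such a list gives the rank abundance curve; a long tail corresponds to many unique sequences with very low counts.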

Journal Article
TL;DR: This revision confirms that eukaryotes form at least two domains, reports the loss of monophyly in the Excavata and robust support for the Haptista and Cryptista, and provides suggested primer sets for DNA sequences from environmental samples that are effective for each clade.
Abstract: This revision of the classification of eukaryotes follows that of Adl et al., 2012 [J. Euk. Microbiol. 59(5)] and retains an emphasis on protists. Changes since have improved the resolution of many ...

750 citations

Journal Article
TL;DR: A group of protist experts proposes a two-step DNA barcoding approach, comprising a universal eukaryotic pre-barcode followed by group-specific barcodes, to unveil the hidden biodiversity of microbial eukaryotes.
Abstract: Animals, plants, and fungi—the three traditional kingdoms of multicellular eukaryotic life—make up almost all of the visible biosphere, and they account for the majority of catalogued species on Earth [1]. The remaining eukaryotes have been assembled for convenience into the protists, a group composed of many diverse lineages, single-celled for the most part, that diverged after Archaea and Bacteria evolved but before plants, animals, or fungi appeared on Earth. Given their single-celled nature, discovering and describing new species has been difficult, and many protistan lineages contain a relatively small number of formally described species (Figure 1A), despite the critical importance of several groups as pathogens, environmental quality indicators, and markers of past environmental changes. It would seem natural to apply molecular techniques such as DNA barcoding to the taxonomy of protists to compensate for the lack of diagnostic morphological features, but this has been hampered by the extreme diversity within the group. The genetic divergence observed between and within major protistan groups greatly exceeds that found in each of the three multicellular kingdoms. No single set of molecular markers has been identified that will work in all lineages, but an international working group is now close to a solution. A universal DNA barcode for protists coupled with group-specific barcodes will enable an explosion of taxonomic research that will catalyze diverse applications.

458 citations


Cited by
Journal Article
18 Oct 2016-PeerJ
TL;DR: VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging and dereplication.
Abstract: Background: VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar, 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. Methods: When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences; a similar strategy is probably used in USEARCH. VSEARCH then performs optimal global sequence alignment of the query against potential target sequences, using full dynamic programming instead of the seed-and-extend heuristic used by USEARCH. Pairwise alignments are computed in parallel using vectorisation and multiple threads. Results: VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering by similarity (using length pre-sorting, abundance pre-sorting or a user-defined order), chimera detection (reference-based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e., format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion. VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging. VSEARCH is slower than USEARCH when performing clustering and chimera detection, but significantly faster when performing paired-end reads merging and dereplication. VSEARCH is available at https://github.com/torognes/vsearch under either the BSD 2-clause license or the GNU General Public License version 3.0. Discussion: VSEARCH has been shown to be a fast, accurate and full-fledged alternative to USEARCH. A free and open-source versatile tool for sequence analysis is now available to the metagenomics community.

5,850 citations
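
The two-stage search described in the abstract above (a shared-word prefilter followed by optimal global alignment using full dynamic programming) can be sketched in Python as follows. This is an illustrative sketch only, not VSEARCH's implementation: the function names, the word length `k=3`, the linear gap penalty, and the scoring values are all assumptions chosen for the example, and the real tool is vectorised and multithreaded.

```python
def shared_words(query, target, k=3):
    """Count distinct k-length words present in both sequences (prefilter heuristic)."""
    q_words = {query[i:i + k] for i in range(len(query) - k + 1)}
    t_words = {target[i:i + k] for i in range(len(target) - k + 1)}
    return len(q_words & t_words)


def global_alignment_score(a, b, match=2, mismatch=-4, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch) with a linear gap penalty.

    Scoring values are illustrative assumptions, not VSEARCH defaults.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: align the two characters, gap in b, or gap in a.
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


# Hypothetical query and candidate targets.
query = "ACGTAC"
targets = ["CGTA", "TTTTTT"]
# Stage 1: keep only targets sharing at least one word with the query.
candidates = [t for t in targets if shared_words(query, t) > 0]
# Stage 2: score the surviving candidates with full dynamic programming.
for t in candidates:
    print(t, global_alignment_score(query, t))
```

The prefilter is cheap and discards unrelated targets before the quadratic-time alignment runs, which is the general motivation for combining a word-based heuristic with exact alignment.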

Journal Article
TL;DR: Among the regions of the ribosomal cistron, the internal transcribed spacer (ITS) region has the highest probability of successful identification for the broadest range of fungi, with the most clearly defined barcode gap between inter- and intraspecific variation.
Abstract: Six DNA regions were evaluated as potential DNA barcodes for Fungi, the second largest kingdom of eukaryotic life, by a multinational, multilaboratory consortium. The region of the mitochondrial cytochrome c oxidase subunit 1 used as the animal barcode was excluded as a potential marker, because it is difficult to amplify in fungi, often includes large introns, and can be insufficiently variable. Three subunits from the nuclear ribosomal RNA cistron were compared together with regions of three representative protein-coding genes (largest subunit of RNA polymerase II, second largest subunit of RNA polymerase II, and minichromosome maintenance protein). Although the protein-coding gene regions often had a higher percent of correct identification compared with ribosomal markers, low PCR amplification and sequencing success eliminated them as candidates for a universal fungal barcode. Among the regions of the ribosomal cistron, the internal transcribed spacer (ITS) region has the highest probability of successful identification for the broadest range of fungi, with the most clearly defined barcode gap between inter- and intraspecific variation. The nuclear ribosomal large subunit, a popular phylogenetic marker in certain groups, had superior species resolution in some taxonomic groups, such as the early diverging lineages and the ascomycete yeasts, but was otherwise slightly inferior to the ITS. The nuclear ribosomal small subunit has poor species-level resolution in fungi. ITS will be formally proposed for adoption as the primary fungal barcode marker to the Consortium for the Barcode of Life, with the possibility that supplementary barcodes may be developed for particular narrowly circumscribed taxonomic groups.

4,116 citations

Journal Article
TL;DR: All fungal species represented by at least two ITS sequences in the international nucleotide sequence databases are now given a unique, stable name of the accession number type, and the term ‘species hypothesis’ (SH) is introduced for the taxa discovered in clustering on different similarity thresholds.
Abstract: The nuclear ribosomal internal transcribed spacer (ITS) region is the formal fungal barcode and in most cases the marker of choice for the exploration of fungal diversity in environmental samples. Two problems are particularly acute in the pursuit of satisfactory taxonomic assignment of newly generated ITS sequences: (i) the lack of an inclusive, reliable public reference data set and (ii) the lack of means to refer to fungal species, for which no Latin name is available in a standardized stable way. Here, we report on progress in these regards through further development of the UNITE database (http://unite.ut.ee) for molecular identification of fungi. All fungal species represented by at least two ITS sequences in the international nucleotide sequence databases are now given a unique, stable name of the accession number type (e.g. Hymenoscyphus pseudoalbidus|GU586904|SH133781.05FU), and their taxonomic and ecological annotations were corrected as far as possible through a distributed, third-party annotation effort. We introduce the term ‘species hypothesis’ (SH) for the taxa discovered in clustering on different similarity thresholds (97–99%). An automatically or manually designated sequence is chosen to represent each such SH. These reference sequences are released (http://unite.ut.ee/repository.php) for use by the scientific community in, for example, local sequence similarity searches and in the QIIME pipeline. The system and the data will be updated automatically as the number of public fungal ITS sequences grows. We invite everybody in the position to improve the annotation or metadata associated with their particular fungal lineages of expertise to do so through the new Web-based sequence management system in UNITE.

2,605 citations

Journal Article
28 Nov 2014-Science
TL;DR: Diversity of most fungal groups peaked in tropical ecosystems, but ectomycorrhizal fungi and several fungal classes were most diverse in temperate or boreal ecosystems, and many fungal groups exhibited distinct preferences for specific edaphic conditions (such as pH, calcium, or phosphorus).
Abstract: Fungi play major roles in ecosystem processes, but the determinants of fungal diversity and biogeographic patterns remain poorly understood. Using DNA metabarcoding data from hundreds of globally distributed soil samples, we demonstrate that fungal richness is decoupled from plant diversity. The plant-to-fungus richness ratio declines exponentially toward the poles. Climatic factors, followed by edaphic and spatial variables, constitute the best predictors of fungal richness and community composition at the global scale. Fungi show similar latitudinal diversity gradients to other organisms, with several notable exceptions. These findings advance our understanding of global fungal diversity patterns and permit integration of fungi into a general macroecological framework.

2,346 citations

01 Jan 2011
TL;DR: The sheer volume and scope of diverse genome-wide data pose a significant challenge to the development of efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

2,187 citations