Showing papers in "Molecular Ecology Resources in 2015"

Distance, flow and PCR inhibition: eDNA dynamics in two headwater streams

TL;DR: Evaluates the level of replication required for accurate detection of targeted taxa in different contexts, and tests whether statistical approaches developed to estimate occupancy in the presence of observational errors can successfully estimate true prevalence, detection probability and false-positive rates.

Abstract: Environmental DNA (eDNA) metabarcoding is increasingly used to study the present and past biodiversity. eDNA analyses often rely on amplification of very small quantities or degraded DNA. To avoid missing detection of taxa that are actually present (false negatives), multiple extractions and amplifications of the same samples are often performed. However, the level of replication needed for reliable estimates of the presence/absence patterns remains an unaddressed topic. Furthermore, degraded DNA and PCR/sequencing errors might produce false positives. We used simulations and empirical data to evaluate the level of replication required for accurate detection of targeted taxa in different contexts and to assess the performance of methods used to reduce the risk of false detections. Furthermore, we evaluated whether statistical approaches developed to estimate occupancy in the presence of observational errors can successfully estimate true prevalence, detection probability and false-positive rates. Replications reduced the rate of false negatives; the optimal level of replication was strongly dependent on the detection probability of taxa. Occupancy models successfully estimated true prevalence, detection probability and false-positive rates, but their performance increased with the number of replicates. At least eight PCR replicates should be performed if detection probability is not high, such as in ancient DNA studies. Multiple DNA extractions from the same sample yielded consistent results; in some cases, collecting multiple samples from the same locality allowed detecting more species. The optimal level of replication for accurate species detection strongly varies among studies and could be explicitly estimated to improve the reliability of results.
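
The dependence of the required replication on detection probability can be illustrated with a deliberately simplified calculation. The sketch below assumes replicates are independent, share one per-replicate detection probability `p`, and produce no false positives; it is not the occupancy modelling used in the study, and the function names and the 0.95 target are illustrative assumptions.

```python
def detection_probability(p: float, k: int) -> float:
    """Probability of at least one positive among k independent replicates,
    each detecting a present taxon with probability p."""
    return 1.0 - (1.0 - p) ** k


def replicates_needed(p: float, target: float = 0.95) -> int:
    """Smallest number of replicates whose cumulative detection probability
    reaches the target (hypothetical helper, simple linear search)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    k = 1
    while detection_probability(p, k) < target:
        k += 1
    return k


if __name__ == "__main__":
    for p in (0.8, 0.5, 0.3):
        print(p, replicates_needed(p))
```

Under these assumptions, a taxon that is hard to detect needs many more replicates than one that is easy to detect, which is the qualitative pattern the abstract describes.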

490 citations

Journal Article

Stephen F. Jane¹, Taylor M. Wilcox², Kevin S. McKelvey³, Michael K. Young³, Michael K. Schwartz³, Winsor H. Lowe², Benjamin H. Letcher⁴, Andrew R. Whiteley¹

University of Massachusetts Amherst¹, University of Montana², United States Forest Service³, United States Geological Survey⁴

Tag jumps illuminated--reducing sequence-to-sample misidentifications in metabarcoding studies.

TL;DR: During high leaf deposition periods, the presence of inhibitors resulted in no amplification for high copy number samples in the absence of an inhibition‐releasing strategy, demonstrating the necessity to carefully consider inhibition in eDNA analysis.

Abstract: Environmental DNA (eDNA) detection has emerged as a powerful tool for monitoring aquatic organisms, but much remains unknown about the dynamics of aquatic eDNA over a range of environmental conditions. DNA concentrations in streams and rivers will depend not only on the equilibrium between DNA entering the water and DNA leaving the system through degradation, but also on downstream transport. To improve understanding of the dynamics of eDNA concentration in lotic systems, we introduced caged trout into two fishless headwater streams and took eDNA samples at evenly spaced downstream intervals. This was repeated 18 times from mid-summer through autumn, over flows ranging from approximately 1-96 L/s. We used quantitative PCR to relate DNA copy number to distance from source. We found that regardless of flow, there were detectable levels of DNA at 239.5 m. The main effect of flow on eDNA counts was in opposite directions in the two streams. At the lowest flows, eDNA counts were highest close to the source and quickly trailed off over distance. At the highest flows, DNA counts were relatively low both near and far from the source. Biomass was positively related to eDNA copy number in both streams. A combination of cell settling, turbulence and dilution effects is probably responsible for our observations. Additionally, during high leaf deposition periods, the presence of inhibitors resulted in no amplification for high copy number samples in the absence of an inhibition-releasing strategy, demonstrating the necessity to carefully consider inhibition in eDNA analysis.

415 citations

Journal Article

Ida Bærholm Schnell¹,², Kristine Bohmann²,³, M. Thomas P. Gilbert²,⁴

Copenhagen Zoo¹, University of Copenhagen², University of Bristol³, Curtin University⁴

Restriction site-associated DNA sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference

TL;DR: It is argued that tag jumping and contamination between libraries represents a considerable challenge for Illumina‐based metabarcoding studies, and measures to avoid false assignment of tag jumping‐derived sequences to samples are suggested.

Abstract: Metabarcoding of environmental samples on second-generation sequencing platforms has rapidly become a valuable tool for ecological studies. A fundamental assumption of this approach is the reliance on being able to track tagged amplicons back to the samples from which they originated. In this study, we address the problem of sequences in metabarcoding sequencing outputs with false combinations of used tags (tag jumps). Unless these sequences can be identified and excluded from downstream analyses, tag jumps creating sequences with false, but already used tag combinations, can cause incorrect assignment of sequences to samples and artificially inflate diversity. In this study, we document and investigate tag jumping in metabarcoding studies on Illumina sequencing platforms by amplifying mixed-template extracts obtained from bat droppings and leech gut contents with tagged generic arthropod and mammal primers, respectively. We found that an average of 2.6% and 2.1% of sequences had tag combinations, which could be explained by tag jumping in the leech and bat diet study, respectively. We suggest that tag jumping can happen during blunt-ending of pools of tagged amplicons during library build and as a consequence of chimera formation during bulk amplification of tagged amplicons during library index PCR. We argue that tag jumping and contamination between libraries represents a considerable challenge for Illumina-based metabarcoding studies, and suggest measures to avoid false assignment of tag jumping-derived sequences to samples.

409 citations

Journal Article

Alicia Mastretta-Yanes¹, Nils Arrigo², Nadir Alvarez², Tove H. Jorgensen³, Daniel Piñero⁴, Brent C. Emerson¹,⁵

University of East Anglia¹, University of Lausanne², Aarhus University³, National Autonomous University of Mexico⁴, Spanish National Research Council⁵

METAXA2: improved identification and taxonomic classification of small and large subunit rRNA in metagenomic data.

TL;DR: Individual sample replicates are used, under the expectation of identical genotypes, to quantify genotyping error in the absence of a reference genome and optimize de novo assembly parameters within the program Stacks, by minimizing error and maximizing the retrieval of informative loci.

Abstract: Restriction site-associated DNA sequencing (RADseq) provides researchers with the ability to record genetic polymorphism across thousands of loci for nonmodel organisms, potentially revolutionizing the field of molecular ecology. However, as with other genotyping methods, RADseq is prone to a number of sources of error that may have consequential effects for population genetic inferences, and these have received only limited attention in terms of the estimation and reporting of genotyping error rates. Here we use individual sample replicates, under the expectation of identical genotypes, to quantify genotyping error in the absence of a reference genome. We then use sample replicates to (i) optimize de novo assembly parameters within the program Stacks, by minimizing error and maximizing the retrieval of informative loci; and (ii) quantify error rates for loci, alleles and single-nucleotide polymorphisms. As an empirical example, we use a double-digest RAD data set of a nonmodel plant species, Berberis alpina, collected from high-altitude mountains in Mexico.
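
The idea of using replicates to quantify genotyping error can be sketched as a simple mismatch rate between two replicate runs of the same individual. This is a minimal illustration only: the study distinguishes locus, allele and SNP error rates, whereas the hypothetical `genotype_error_rate` below computes just one genotype-level mismatch rate, and the dictionary data layout is an assumption.

```python
def genotype_error_rate(rep_a, rep_b):
    """Fraction of loci called in both replicates whose genotypes differ.

    rep_a, rep_b: dicts mapping locus name -> tuple of alleles,
    or None when the locus was not genotyped in that replicate.
    """
    compared = 0
    mismatched = 0
    for locus in rep_a.keys() & rep_b.keys():
        ga, gb = rep_a[locus], rep_b[locus]
        if ga is None or gb is None:
            continue  # missing data are skipped, not counted as errors
        compared += 1
        # Genotypes are unphased, so allele order is ignored.
        if sorted(ga) != sorted(gb):
            mismatched += 1
    if compared == 0:
        raise ValueError("no loci genotyped in both replicates")
    return mismatched / compared


if __name__ == "__main__":
    rep1 = {"L1": ("A", "A"), "L2": ("A", "G"), "L3": ("C", "C"), "L4": None}
    rep2 = {"L1": ("A", "A"), "L2": ("G", "A"), "L3": ("C", "T"), "L4": ("T", "T")}
    print(genotype_error_rate(rep1, rep2))
```

Since replicates of one individual are expected to carry identical genotypes, any mismatch at a locus called in both runs is attributed to error.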

353 citations

Journal Article

Johan Bengtsson-Palme¹, Martin Hartmann, Karl Martin Eriksson², Chandan Pal¹, Kaisa Thorell¹, Dan Göran Joakim Larsson¹, Rolf Henrik Nilsson¹

University of Gothenburg¹, Chalmers University of Technology²

Genotyping-in-Thousands by sequencing (GT-seq): A cost effective SNP genotyping method based on custom amplicon sequencing.

TL;DR: A comprehensive update to metaxa is described, introducing support for the LSU rRNA gene, a greatly improved classifier allowing classification down to genus or species level, as well as enhanced support for short‐read (100 bp) and paired‐end sequences, among other changes.

Abstract: The ribosomal rRNA genes are widely used as genetic markers for taxonomic identification of microbes. Particularly the small subunit (SSU; 16S/18S) rRNA gene is frequently used for species- or genus-level identification, but also the large subunit (LSU; 23S/28S) rRNA gene is employed in taxonomic assignment. The metaxa software tool is a popular utility for extracting partial rRNA sequences from large sequencing data sets and assigning them to an archaeal, bacterial, nuclear eukaryote, mitochondrial or chloroplast origin. This study describes a comprehensive update to metaxa – metaxa2 – that extends the capabilities of the tool, introducing support for the LSU rRNA gene, a greatly improved classifier allowing classification down to genus or species level, as well as enhanced support for short-read (100 bp) and paired-end sequences, among other changes. The performance of metaxa2 was compared to other commonly used taxonomic classifiers, showing that metaxa2 often outperforms previous methods in terms of making correct predictions while maintaining a low misclassification rate. metaxa2 is freely available from http://microbiology.se/software/metaxa2/

345 citations

Journal Article

Nathan R. Campbell, Stephanie A. Harmon, Shawn R. Narum

Universal and blocking primer mismatches limit the use of high-throughput DNA sequencing for the quantitative metabarcoding of arthropods

TL;DR: This study demonstrates amplicon sequencing with GT‐seq greatly reduces the cost of genotyping hundreds of targeted SNPs relative to existing methods by utilizing a simple library preparation method and massive efficiency of scale.

Abstract: Genotyping-in-Thousands by sequencing (GT-seq) is a method that uses next-generation sequencing of multiplexed PCR products to generate genotypes from relatively small panels (50-500) of targeted single-nucleotide polymorphisms (SNPs) for thousands of individuals in a single Illumina HiSeq lane. This method uses only unlabelled oligos and PCR master mix in two thermal cycling steps for amplification of targeted SNP loci. During this process, sequencing adapters and dual barcode sequence tags are incorporated into the amplicons enabling thousands of individuals to be pooled into a single sequencing library. Post sequencing, reads from individual samples are split into individual files using their unique combination of barcode sequences. Genotyping is performed with a simple perl script which counts amplicon-specific sequences for each allele, and allele ratios are used to determine the genotypes. We demonstrate this technique by genotyping 2068 individual steelhead trout (Oncorhynchus mykiss) samples with a set of 192 SNP markers in a single library sequenced in a single Illumina HiSeq lane. Genotype data were 99.9% concordant to previously collected TaqMan™ genotypes at the same 192 loci, but call rates were slightly lower with GT-seq (96.4%) relative to TaqMan (99.0%). Of the 192 SNPs, 187 were genotyped in ≥90% of the individual samples and only 3 SNPs were genotyped in <70% of samples. This study demonstrates amplicon sequencing with GT-seq greatly reduces the cost of genotyping hundreds of targeted SNPs relative to existing methods by utilizing a simple library preparation method and massive efficiency of scale.
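
Calling a genotype from allele-specific read counts can be sketched in a few lines. The abstract states only that allele ratios determine the genotype; the function name, the minimum read depth, the ratio cut-offs and the pseudo-count below are all illustrative assumptions, not values taken from the GT-seq perl script.

```python
def call_genotype(count_a1, count_a2, min_reads=10,
                  hom_ratio=10.0, het_low=0.2, het_high=5.0):
    """Call a biallelic SNP genotype from read counts for each allele.

    Returns "A1/A1", "A1/A2", "A2/A2", or None when the locus has too
    few reads or an ambiguous allele ratio. All thresholds are
    hypothetical defaults chosen for illustration.
    """
    if count_a1 + count_a2 < min_reads:
        return None  # not enough depth to call
    # A small pseudo-count avoids division by zero for homozygotes.
    ratio = (count_a1 + 0.1) / (count_a2 + 0.1)
    if ratio >= hom_ratio:
        return "A1/A1"
    if ratio <= 1.0 / hom_ratio:
        return "A2/A2"
    if het_low <= ratio <= het_high:
        return "A1/A2"
    return None  # ratio falls between heterozygote and homozygote bands


if __name__ == "__main__":
    for counts in [(100, 0), (48, 52), (0, 80), (70, 10), (3, 2)]:
        print(counts, call_genotype(*counts))
```

Leaving a no-call band between the heterozygote and homozygote ranges trades a slightly lower call rate for fewer wrong calls, which is one plausible reason a ratio-based caller would report a call rate below 100%.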

314 citations

Journal Article

Josep Piñol, Gisela Mir¹,², Priscila Gomez-Polo, Nuria Agustí

Spanish National Research Council¹, Peter MacCallum Cancer Centre²

TL;DR: The results show that reads of all species were recovered after PCR enrichment at the authors' control conditions and high‐throughput sequencing, and show that the four factors considered biased the final proportions of the species to some degree.

Abstract: The quantification of the biological diversity in environmental samples using high-throughput DNA sequencing is hindered by the PCR bias caused by variable primer-template mismatches of the individual species. In some dietary studies, there is the added problem that samples are enriched with predator DNA, so often a predator-specific blocking oligonucleotide is used to alleviate the problem. However, specific blocking oligonucleotides could coblock nontarget species to some degree. Here, we accurately estimate the extent of the PCR biases induced by universal and blocking primers on a mock community prepared with DNA of twelve species of terrestrial arthropods. We also compare universal and blocking primer biases with those induced by variable annealing temperature and number of PCR cycles. The results show that reads of all species were recovered after PCR enrichment at our control conditions (no blocking oligonucleotide, 45 °C annealing temperature and 40 cycles) and high-throughput sequencing. They also show that the four factors considered biased the final proportions of the species to some degree. Among these factors, the number of primer-template mismatches of each species had a disproportionate effect (up to five orders of magnitude) on the amplification efficiency. In particular, the number of primer-template mismatches explained most of the variation (~3/4) in the amplification efficiency of the species. The effect of blocking oligonucleotide concentration on nontarget species relative abundance was also significant, but less important (below one order of magnitude). Considering the results reported here, the quantitative potential of the technique is limited, and only qualitative results (the species list) are reliable, at least when targeting the barcoding COI region.

313 citations

Journal Article

Jack Pew¹, Paul H. Muir¹, Jinliang Wang², Timothy R. Frasier¹

University of Saint Mary¹, Zoological Society of London²

The room temperature preservation of filtered environmental DNA samples and assimilation into a phenol–chloroform–isoamyl alcohol DNA extraction

TL;DR: A new R package is presented, called related, that can calculate relatedness based on seven estimators, can account for genotyping errors, missing data and inbreeding, and can estimate 95% confidence intervals.

Abstract: Analyses of pairwise relatedness represent a key component to addressing many topics in biology. However, such analyses have been limited because most available programs provide a means to estimate relatedness based on only a single estimator, making comparison across estimators difficult. Second, all programs to date have been platform specific, working only on a specific operating system. This has the undesirable outcome of making choice of relatedness estimator limited by operating system preference, rather than being based on scientific rationale. Here, we present a new R package, called related, that can calculate relatedness based on seven estimators, can account for genotyping errors, missing data and inbreeding, and can estimate 95% confidence intervals. Moreover, simulation functions are provided that allow for easy comparison of the performance of different estimators and for analyses of how much resolution to expect from a given data set. Because this package works in R, it is platform independent. Combined, this functionality should allow for more appropriate analyses and interpretation of pairwise relatedness and will also allow for the integration of relatedness data into larger R workflows.

296 citations

Journal Article

Mark A. Renshaw¹, Brett P. Olds¹, Christopher L. Jerde¹, Margaret M. McVeigh¹, David M. Lodge¹

University of Notre Dame¹

Target enrichment of ultraconserved elements from arthropods provides a genomic perspective on relationships among Hymenoptera.

TL;DR: This study demonstrates the successful preservation of eDNA at room temperature (20 °C) in two lysis buffers, CTAB and Longmire's, over a 2‐week period of time and suggests that for many kinds of studies recently reported on macrobial eDNA, detection probabilities could have been increased, and at a lower cost, by utilizing the Longmire’s preservation buffer with a PCI DNA extraction.

Abstract: Current research targeting filtered macrobial environmental DNA (eDNA) often relies upon cold ambient temperatures at various stages, including the transport of water samples from the field to the laboratory and the storage of water and/or filtered samples in the laboratory. This poses practical limitations for field collections in locations where refrigeration and frozen storage is difficult or where samples must be transported long distances for further processing and screening. This study demonstrates the successful preservation of eDNA at room temperature (20 °C) in two lysis buffers, CTAB and Longmire's, over a 2-week period of time. Moreover, the preserved eDNA samples were seamlessly integrated into a phenol–chloroform–isoamyl alcohol (PCI) DNA extraction protocol. The successful application of the eDNA extraction to multiple filter membrane types suggests the methods evaluated here may be broadly applied in future eDNA research. Our results also suggest that for many kinds of studies recently reported on macrobial eDNA, detection probabilities could have been increased, and at a lower cost, by utilizing the Longmire's preservation buffer with a PCI DNA extraction.

258 citations

Journal Article

Brant C. Faircloth¹,², Michael G. Branstetter³, Noor D. White³,⁴, Seán G. Brady³

University of California, Los Angeles¹, Louisiana State University², National Museum of Natural History³, University of Maryland, College Park⁴

Metabarcoding vs. morphological identification to assess diatom diversity in environmental studies

TL;DR: A large set (n = 1510) of ultraconserved elements (UCEs) shared among the insect order Hymenoptera are identified and used to reconstruct phylogenetic relationships spanning very old to very young divergences among hymenopteran lineages with complete support.

Abstract: Gaining a genomic perspective on phylogeny requires the collection of data from many putatively independent loci across the genome. Among insects, an increasingly common approach to collecting this class of data involves transcriptome sequencing, because few insects have high-quality genome sequences available; assembling new genomes remains a limiting factor; the transcribed portion of the genome is a reasonable, reduced subset of the genome to target; and the data collected from transcribed portions of the genome are similar in composition to the types of data with which biologists have traditionally worked (e.g. exons). However, molecular techniques requiring RNA as a template, including transcriptome sequencing, are limited to using very high-quality source materials, which are often unavailable from a large proportion of biologically important insect samples. Recent research suggests that DNA-based target enrichment of conserved genomic elements offers another path to collecting phylogenomic data across insect taxa, provided that conserved elements are present in and can be collected from insect genomes. Here, we identify a large set (n = 1510) of ultraconserved elements (UCEs) shared among the insect order Hymenoptera. We used in silico analyses to show that these loci accurately reconstruct relationships among genome-enabled hymenoptera, and we designed a set of RNA baits (n = 2749) for enriching these loci that researchers can use with DNA templates extracted from a variety of sources. We used our UCE bait set to enrich an average of 721 UCE loci from 30 hymenopteran taxa, and we used these UCE loci to reconstruct phylogenetic relationships spanning very old (≥220 Ma) to very young (≤1 Ma) divergences among hymenopteran lineages. In contrast to a recent study addressing hymenopteran phylogeny using transcriptome data, we found ants to be sister to all remaining aculeate lineages with complete support, although this result could be explained by factors such as taxon sampling. We discuss this approach and our results in the context of elucidating the evolutionary history of one of the most diverse and speciose animal orders.

Journal Article

Jonas Zimmermann¹,², Gernot Glöckner³,⁴, Regine Jahn¹, Neela Enke¹, Birgit Gemeinholzer²

Free University of Berlin¹, University of Giessen², Leibniz Association³, University of Cologne⁴

A comprehensive DNA barcode database for Central European beetles with a focus on Germany: adding more than 3500 identified species to BOLD.

TL;DR: Evidence is provided that metabarcoding of diatoms via NGS sequencing of the V4 region (18S) has a great potential for water quality assessments and could complement and maybe even improve the identification via light microscopy.

Abstract: Diatoms are frequently used for water quality assessments; however, identification to species level is difficult, time-consuming and needs in-depth knowledge of the organisms under investigation, as nonhomoplastic species-specific morphological characters are scarce. We here investigate how identification methods based on DNA (metabarcoding using NGS platforms) perform in comparison to morphological diatom identification and propose a workflow to optimize diatom fresh water quality assessments. Diatom diversity at seven different sites along the course of the river system Odra and Lusatian Neisse from the source to the mouth is analysed with DNA and morphological methods, which are compared. The NGS technology almost always leads to a higher number of identified taxa (270 via NGS vs. 103 by light microscopy LM), whose presence could subsequently be verified by LM. The sequence-based approach allows for a much more graduated insight into the taxonomic diversity of the environmental samples. Taxa retrieval varies considerably throughout the river system, depending on species occurrences and the taxonomic depth of the reference databases. Mostly rare taxa from oligotrophic parts of the river systems are less well represented in the reference database used. A workflow for DNA-based NGS diatom identification is presented. 28 000 diatom sequences were evaluated. Our findings provide evidence that metabarcoding of diatoms via NGS sequencing of the V4 region (18S) has a great potential for water quality assessments and could complement and maybe even improve the identification via light microscopy.

Journal Article

Lars Hendrich, Jérôme Morinière, Gerhard Haszprunar¹, Paul D. N. Hebert², Axel Hausmann¹, Frank Köhler, Michael Balke¹

Ludwig Maximilian University of Munich¹, University of Guelph²

DNA barcoding largely supports 250 years of classical taxonomy: identifications for Central European bees (Hymenoptera, Apoidea partim)

TL;DR: This study provides the globally largest DNA barcode reference library for Coleoptera for 15 948 individuals belonging to 3514 well‐identified species (53% of the German fauna) with representatives from 97 of 103 families with a focus on Germany.

Abstract: Beetles are the most diverse group of animals and are crucial for ecosystem functioning. In many countries, they are well established for environmental impact assessment, but even in the well-studied Central European fauna, species identification can be very difficult. A comprehensive and taxonomically well-curated DNA barcode library could remedy this deficit and could also link hundreds of years of traditional knowledge with next generation sequencing technology. However, such a beetle library is missing to date. This study provides the globally largest DNA barcode reference library for Coleoptera for 15 948 individuals belonging to 3514 well-identified species (53% of the German fauna) with representatives from 97 of 103 families (94%). This study is the first comprehensive regional test of the efficiency of DNA barcoding for beetles with a focus on Germany. Sequences ≥500 bp were recovered from 63% of the specimens analysed (15 948 of 25 294) with short sequences from another 997 specimens. Whereas most specimens (92.2%) could be unambiguously assigned to a single known species by sequence diversity at CO1, 1089 specimens (6.8%) were assigned to more than one Barcode Index Number (BIN), creating 395 BINs which need further study to ascertain if they represent cryptic species, mitochondrial introgression, or simply regional variation in widespread species. We found 409 specimens (2.6%) that shared a BIN assignment with another species, most involving a pair of closely allied species as 43 BINs were involved. Most of these taxa were separated by barcodes although sequence divergences were low. Only 155 specimens (0.97%) show identical or overlapping clusters.

Journal Article

Stefan Schmidt, Christian Schmid-Egger, Jérôme Morinière, Gerhard Haszprunar, Paul D. N. Hebert¹

University of Guelph¹

Efficient and sensitive identification and quantification of airborne pollen using next-generation DNA sequencing

TL;DR: The barcode data contributed to clarifying the status of nearly half the examined taxonomically problematic species of bees in the German fauna, and the role of DNA barcoding as a tool for current and future taxonomic work is discussed.

Abstract: This study presents DNA barcode records for 4118 specimens representing 561 species of bees belonging to the six families of Apoidea (Andrenidae, Apidae, Colletidae, Halictidae, Megachilidae and Melittidae) found in Central Europe. These records provide fully compliant barcode sequences for 503 of the 571 bee species in the German fauna and partial sequences for 43 more. The barcode results are largely congruent with traditional taxonomy as only five closely allied pairs of species could not be discriminated by barcodes. As well, 90% of the species possessed sufficiently deep sequence divergence to be assigned to a different Barcode Index Number (BIN). In fact, 56 species (11%) were assigned to two or more BINs reflecting the high levels of intraspecific divergence among their component specimens. Fifty other species (9.7%) shared the same Barcode Index Number with one or more species, but most of these species belonged to a distinct barcode cluster within a particular BIN. The barcode data contributed to clarifying the status of nearly half the examined taxonomically problematic species of bees in the German fauna. Based on these results, the role of DNA barcoding as a tool for current and future taxonomic work is discussed.

Journal Article

Ken Kraaijeveld¹,², Letty A. de Weger², Marina Ventayol Garcia², Henk P. J. Buermans², Jeroen Frank², Pieter S. Hiemstra², Johan T. den Dunnen²

University of Applied Sciences Leiden¹, Leiden University Medical Center²

The development and characterization of a 57K single nucleotide polymorphism array for rainbow trout.

TL;DR: This method for identification and quantification of airborne pollen using DNA sequencing is presented and it is shown that it provides an accurate qualitative and quantitative view of the species composition of samples of pollen grains.

Abstract: Pollen monitoring is an important and widely used tool in allergy research and creation of awareness in pollen-allergic patients. Current pollen monitoring methods are microscope-based, labour intensive and cannot identify pollen to the genus level in some relevant allergenic plant groups. Therefore, a more efficient, cost-effective and sensitive method is needed. Here, we present a method for identification and quantification of airborne pollen using DNA sequencing. Pollen is collected from ambient air using standard techniques. DNA is extracted from the collected pollen, and a fragment of the chloroplast gene trnL is amplified using PCR. The PCR product is subsequently sequenced on a next-generation sequencing platform (Ion Torrent). Amplicon molecules are sequenced individually, allowing identification of different sequences from a mixed sample. We show that this method provides an accurate qualitative and quantitative view of the species composition of samples of airborne pollen grains. We also show that it correctly identifies the individual grass genera present in a mixed sample of grass pollen, which cannot be achieved using microscopic pollen identification. We conclude that our method is more efficient and sensitive than current pollen monitoring techniques and therefore has the potential to increase the throughput of pollen monitoring.

Journal Article

Yniv Palti¹, Guangtu Gao¹, Sixin Liu¹, Matthew Peter Kent², Sigbjørn Lien², Michael R. Miller³, Caird E. Rexroad¹, Thomas Moen

United States Department of Agriculture¹, Norwegian University of Life Sciences², University of California, Davis³

DNA barcoding gap: reliable species identification over morphological and geographical scales.

TL;DR: The development and characterization of the first high‐density single nucleotide polymorphism (SNP) genotyping array for rainbow trout is described and strong evidence for a wide distribution throughout the genome with good representation in all 29 chromosomes is provided.

Abstract: In this study, we describe the development and characterization of the first high-density single nucleotide polymorphism (SNP) genotyping array for rainbow trout. The SNP array is publicly available from a commercial vendor (Affymetrix). The SNP genotyping quality was high, and validation rate was close to 90%. This is comparable to other farm animals and is much higher than previous smaller scale SNP validation studies in rainbow trout. High quality and integrity of the genotypes are evident from sample reproducibility and from nearly 100% agreement in genotyping results from other methods. The array is very useful for rainbow trout aquaculture populations with more than 40 900 polymorphic markers per population. For wild populations that were confounded by a smaller sample size, the number of polymorphic markers was between 10 577 and 24 330. Comparison between genotypes from individual populations suggests good potential for identifying candidate markers for populations' traceability. Linkage analysis and mapping of the SNPs to the reference genome assembly provide strong evidence for a wide distribution throughout the genome with good representation in all 29 chromosomes. A total of 68% of the genome scaffolds and contigs were anchored through linkage analysis using the SNP array genotypes, including ~20% of the genome assembly that has not been previously anchored to chromosomes.

Journal Article

[...]

Klemen Candek¹, Matjaž Kuntner², Matjaž Kuntner¹, Matjaž Kuntner³•Institutions (3)

Slovenian Academy of Sciences and Arts¹, Hubei University², National Museum of Natural History³

01 Mar 2015-Molecular Ecology Resources

TL;DR: The results support models of independent patterns of morphological and molecular evolution by showing that DNA barcodes are effective in species identification regardless of their morphological diagnosability and that the size of the barcoding gap strongly depends on taxonomic groups and practices.

Abstract: The philosophical basis and utility of DNA barcoding have been a subject of numerous debates. While most literature embraces it, some studies continue to question its use in dipterans, butterflies and marine gastropods. Here, we explore the utility of DNA barcoding in identifying spider species that vary in taxonomic affiliation, morphological diagnosability and geographic distribution. Our first test searched for a 'barcoding gap' by comparing intra- and interspecific means, medians and overlap in more than 75,000 computed Kimura 2-parameter (K2P) genetic distances in three families. Our second test compared K2P distances of congeneric species with high vs. low morphological distinctness in 20 genera of 11 families. Our third test explored the effect of enlarging geographical sampling area at a continental scale on genetic variability in DNA barcodes within 20 species of nine families. Our results generally point towards a high utility of DNA barcodes in identifying spider species. However, the size of the barcoding gap strongly depends on taxonomic groups and practices. It is becoming critical to define the barcoding gap statistically more consistently and to document its variation over taxonomic scales. Our results support models of independent patterns of morphological and molecular evolution by showing that DNA barcodes are effective in species identification regardless of their morphological diagnosability. We also show that DNA barcodes represent an effective tool for identifying spider species over geographic scales, yet their variation contains useful biogeographic information.
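The Kimura 2-parameter distance mentioned in this abstract can be illustrated with a short sketch. This is not the authors' pipeline: the function names are hypothetical, sequences are assumed to be pre-aligned, and the gap is taken here as the smallest interspecific distance minus the largest intraspecific distance, which is only one of several definitions in use (the abstract itself compares means, medians and overlap).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # Raises ValueError (log of a non-positive number) for saturated pairs.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def barcoding_gap(distances):
    """Gap from (distance, same_species) pairs: min inter minus max intra."""
    intra = [d for d, same in distances if same]
    inter = [d for d, same in distances if not same]
    return min(inter) - max(intra)
```

A positive value from `barcoding_gap` means every interspecific distance exceeds every intraspecific one in the supplied set; a negative value indicates overlap.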

Journal Article

PhytoREF: a reference database of the plastidial 16S rRNA gene of photosynthetic eukaryotes with curated taxonomy

[...]

Johan Decelle, Sarah Romac, Rowena Stern, El Mahdi Bendif, Adriana Zingone¹, Stéphane Audic, Michel D Guiry², Laure Guillou, Désiré Tessier, Florence Le Gall, Priscillia Gourvil, Adriana Lopes dos Santos, Ian Probert, Daniel Vaulot, Colomban de Vargas, Richard Christen•Institutions (2)

Stazione Zoologica Anton Dohrn¹, National University of Ireland²

WFABC: a Wright–Fisher ABC‐based approach for inferring effective population sizes and selection coefficients from time‐sampled data

TL;DR: The PhytoREF database was built; it contains 6490 plastidial 16S rDNA reference sequences from a large diversity of eukaryotes representing all known major photosynthetic lineages. It focuses mainly on marine microalgae, but sequences from land plants and freshwater taxa were also included to broaden the applicability of PhytoREF to different aquatic and terrestrial habitats.

Abstract: Photosynthetic eukaryotes have a critical role as the main producers in most ecosystems of the biosphere. The ongoing environmental metabarcoding revolution opens the perspective for holistic ecosystems biological studies of these organisms, in particular the unicellular microalgae that often lack distinctive morphological characters and have complex life cycles. To interpret environmental sequences, metabarcoding necessarily relies on taxonomically curated databases containing reference sequences of the targeted gene (or barcode) from identified organisms. To date, no such reference framework exists for photosynthetic eukaryotes. In this study, we built the PhytoREF database that contains 6490 plastidial 16S rDNA reference sequences that originate from a large diversity of eukaryotes representing all known major photosynthetic lineages. We compiled 3333 amplicon sequences available from public databases and 879 sequences extracted from plastidial genomes, and generated 411 novel sequences from cultured marine microalgal strains belonging to different eukaryotic lineages. A total of 1867 environmental Sanger 16S rDNA sequences were also included in the database. Stringent quality filtering and a phylogeny-based taxonomic classification were applied for each 16S rDNA sequence. The database mainly focuses on marine microalgae, but sequences from land plants (representing half of the PhytoREF sequences) and freshwater taxa were also included to broaden the applicability of PhytoREF to different aquatic and terrestrial habitats. PhytoREF, accessible via a web interface (http://phytoref.fr), is a new resource in molecular ecology to foster the discovery, assessment and monitoring of the diversity of photosynthetic eukaryotes using high-throughput sequencing.

Journal Article

[...]

Matthieu Foll¹, Matthieu Foll², Hyunjin Shim², Hyunjin Shim¹, Jeffrey D. Jensen¹, Jeffrey D. Jensen²•Institutions (2)

Swiss Institute of Bioinformatics¹, École Polytechnique Fédérale de Lausanne²

ITS1: a DNA barcode better than ITS2 in eukaryotes?

TL;DR: An approximate Bayesian computation (ABC)-based method is used to infer the genome-wide average effective population size from time-sampled data; this estimate is then set as a prior for inferring per-site selection coefficients accurately and precisely.

Abstract: With novel developments in sequencing technologies, time-sampled data are becoming more available and accessible. Naturally, there have been efforts in parallel to infer population genetic parameters from these data sets. Here, we compare and analyse four recent approaches based on the Wright-Fisher model for inferring selection coefficients (s) given effective population size (Nₑ), with simulated temporal data sets. Furthermore, we demonstrate the advantage of a recently proposed approximate Bayesian computation (ABC)-based method that is able to correctly infer genomewide average Nₑ from time-serial data, which is then set as a prior for inferring per-site selection coefficients accurately and precisely. We implement this ABC method in new software and apply it to a classical time-serial data set of the medionigra genotype in the moth Panaxia dominula. We show that a recessive lethal model is the best explanation for the observed variation in allele frequency by implementing an estimator of the dominance ratio (h).
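As a rough illustration of the forward model that an ABC procedure of this kind would simulate, the sketch below steps a diploid Wright-Fisher population through selection and drift. It is not the WFABC software. The function names are hypothetical, and the fitness parameterization (1 + s, 1 + sh and 1 for the three genotypes) is an assumption chosen for the example rather than taken from the abstract.

```python
import random


def expected_next_freq(p, s, h):
    """Allele frequency after one round of selection (diploid, random mating).

    Assumed genotype fitnesses: AA = 1 + s, Aa = 1 + s*h, aa = 1.
    """
    q = 1.0 - p
    w_hom, w_het, w_ref = 1.0 + s, 1.0 + s * h, 1.0
    mean_w = p * p * w_hom + 2 * p * q * w_het + q * q * w_ref
    return (p * p * w_hom + p * q * w_het) / mean_w


def simulate_trajectory(p0, s, h, n_e, generations, rng):
    """Allele-frequency trajectory with selection followed by binomial drift."""
    freqs = [p0]
    p = p0
    for _ in range(generations):
        p_sel = expected_next_freq(p, s, h)
        # Drift: sample 2 * n_e gene copies from the post-selection frequency.
        copies = sum(rng.random() < p_sel for _ in range(2 * n_e))
        p = copies / (2 * n_e)
        freqs.append(p)
    return freqs
```

Under this parameterization a recessive lethal corresponds to s = -1 and h = 0, for which the deterministic step reduces to p' = p / (1 + p). An ABC rejection scheme would draw s (and h) from a prior, simulate trajectories like these, and keep the draws whose summary statistics fall close to the observed ones.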

Journal Article

[...]

Xin-Cun Wang¹, Chang Liu¹, Liang Huang¹, Johan Bengtsson-Palme², Haimei Chen¹, Jianhui Zhang¹, Dayong Cai¹, Jianqin Li¹•Institutions (2)

Peking Union Medical College¹, University of Gothenburg²

Comparing the effectiveness of metagenomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrix nemaeus).

TL;DR: A large‐scale meta‐analysis was carried out to compare ITS1 and ITS2 from three aspects: PCR amplification, DNA sequencing and species discrimination, and found that ITS1 represents a better DNA barcode than ITS2 for eukaryotic species.

Abstract: A DNA barcode is a short piece of DNA sequence used for species determination and discovery. The internal transcribed spacer (ITS/ITS2) region has been proposed as the standard DNA barcode for fungi and seed plants and has been widely used in DNA barcoding analyses for other biological groups, for example algae, protists and animals. The ITS region consists of both ITS1 and ITS2 regions. Here, a large-scale meta-analysis was carried out to compare ITS1 and ITS2 from three aspects: PCR amplification, DNA sequencing and species discrimination, in terms of the presence of DNA barcoding gaps, species discrimination efficiency, sequence length distribution, GC content distribution and primer universality. In total, 85 345 sequence pairs in 10 major groups of eukaryotes, including ascomycetes, basidiomycetes, liverworts, mosses, ferns, gymnosperms, monocotyledons, eudicotyledons, insects and fishes, covering 611 families, 3694 genera, and 19 060 species, were analysed. Using similarity-based methods, we calculated species discrimination efficiencies for ITS1 and ITS2 in all major groups, families and genera. Using Fisher's exact test, we found that ITS1 has significantly higher efficiencies than ITS2 in 17 of the 47 families and 20 of the 49 genera, which are sample-rich. By in silico PCR amplification evaluation, primer universality of the extensively applied ITS1 primers was found to be superior to that of ITS2 primers. Additionally, the shorter length of the amplification product and the lower GC content were found to be two other advantages of ITS1 for sequencing. In summary, ITS1 represents a better DNA barcode than ITS2 for eukaryotic species.
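The kind of comparison described here, testing whether one marker discriminates species more often than another within a family, can be sketched with a two-sided Fisher's exact test on a 2×2 table. The implementation below is a minimal standard-library version with a hypothetical function name. How the authors built their tables is not stated in the abstract, so the layout (rows for the two markers, columns for discriminated and not discriminated) is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum all tables at least as extreme (no more probable) as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))
```

For example, `fisher_exact_two_sided(3, 1, 1, 3)` evaluates the classic 4-versus-4 table and gives 34/70, roughly 0.486.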

Journal Article

[...]

Amrita Srivathsan¹, Amrita Srivathsan², John Chih Mun Sha³, Alfried P. Vogler², Alfried P. Vogler⁴, Rudolf Meier¹•Institutions (4)

National University of Singapore¹, Imperial College London², Wildlife Reserves Singapore³, American Museum of Natural History⁴

01 Mar 2015-Molecular Ecology Resources

TL;DR: Read numbers for diet species in metagenomic and metabarcoding data were correlated, indicating that both are useful for determining relative sequence abundance; if chloroplast genomes were used as reference, the precision of identifications and species recovery would improve further.

Abstract: Faecal samples are of great value as a non-invasive means to gather information on the genetics, distribution, demography, diet and parasite infestation of endangered species. Direct shotgun sequencing of faecal DNA could give information on these simultaneously, but this approach is largely untested. Here, we used two faecal samples to characterize the diet of two red-shanked douc langurs (Pygathrix nemaeus) that were fed known foliage, fruits, vegetables and cereals. Illumina HiSeq produced ~74 and 67 million paired reads for these samples, of which ~10,000 (0.014%) and ~44,000 (0.066%), respectively, were of chloroplast origin. Sequences were matched against a database of available chloroplast 'barcodes' for angiosperms. The results were compared with 'metabarcoding' using PCR amplification of the P6 loop of trnL. Metagenomics identified seven and nine of the likely 16 diet plants while six and five were identified by metabarcoding. Metabarcoding produced thousands of reads consistent with the known diet, but the barcodes were too short to identify several plant species to genus. Metagenomics utilized multiple, longer barcodes that combined had greater power of identification. However, rare diet items were not recovered. Read numbers for diet species in metagenomic and metabarcoding data were correlated, indicating that both are useful for determining relative sequence abundance. Metagenomic reads were uniformly distributed across the chloroplast genomes; thus, if chloroplast genomes were used as reference, the precision of identifications and species recovery would improve further. Metagenomics also recovered the host mitochondrial genome and numerous intestinal parasite sequences in addition to generating data useful for characterizing the microbiome.

Journal Article

Impacts of degraded DNA on restriction enzyme associated DNA sequencing (RADSeq).

[...]

Carly F. Graham¹, Travis C. Glenn², Andrew G. McArthur³, Douglas R. Boreham⁴, Troy J. Kieran², Stacey L. Lance², Richard G. Manzon¹, Jessica A. Martino¹, Todd W. Pierson², Sean M. Rogers⁵, Joanna Y. Wilson³, Christopher M. Somers¹•Institutions (5)

University of Regina¹, University of Georgia², McMaster University³, Northern Ontario School of Medicine⁴, University of Calgary⁵

New primers for DNA barcoding of digeneans and cestodes (Platyhelminthes).

TL;DR: It is concluded that starting DNA quality is an important consideration for RADSeq; however, the approach remains robust until genomic DNA is extensively degraded.

Abstract: Degraded DNA from suboptimal field sampling is common in molecular ecology. However, its impact on techniques that use restriction site associated next-generation DNA sequencing (RADSeq, GBS) is unknown. We experimentally examined the effects of in situ DNA degradation on data generation for a modified double-digest RADSeq approach (3RAD). We generated libraries using genomic DNA serially extracted from the muscle tissue of 8 individual lake whitefish (Coregonus clupeaformis) following 0-, 12-, 48- and 96-h incubation at room temperature posteuthanasia. This treatment of the tissue resulted in input DNA that ranged in quality from nearly intact to highly sheared. All samples were sequenced as a multiplexed pool on an Illumina MiSeq. Libraries created from low to moderately degraded DNA (12-48 h) performed well. In contrast, the number of RADtags per individual, number of variable sites, and percentage of identical RADtags retained were all dramatically reduced when libraries were made using highly degraded DNA (96-h group). This reduction in performance was largely due to a significant and unexpected loss of raw reads as a result of poor quality scores. Our findings remained consistent after changes in restriction enzymes, modified fold coverage values (2- to 16-fold), and additional read-length trimming. We conclude that starting DNA quality is an important consideration for RADSeq; however, the approach remains robust until genomic DNA is extensively degraded.

Journal Article

[...]

Niels Van Steenkiste¹, Sean A. Locke², Magalie Castelin¹, David J. Marcogliese², Cathryn L. Abbott¹•Institutions (2)

Fisheries and Oceans Canada¹, Environment Canada²

Towards a universal barcode of oomycetes--a comparison of the cox1 and cox2 loci.

TL;DR: New degenerate primers were developed that enabled acquisition of the COI barcode region from 100% of specimens tested, representing 23 families of digeneans and 6 orders of cestodes, and represent an improvement over existing methods.

Abstract: Digeneans and cestodes are species-rich taxa and can seriously impact human health, fisheries, aqua- and agriculture, and wildlife conservation and management. DNA barcoding using the COI Folmer region could be applied for species detection and identification, but both 'universal' and taxon-specific COI primers fail to amplify in many flatworm taxa. We found that high levels of nucleotide variation at priming sites made it unrealistic to design primers targeting all flatworms. We developed new degenerate primers that enabled acquisition of the COI barcode region from 100% of specimens tested (n = 46), representing 23 families of digeneans and 6 orders of cestodes. This high success rate represents an improvement over existing methods. Primers and methods provided here are critical pieces towards redressing the current paucity of COI barcodes for these taxa in public databases.

Journal Article

[...]

Young Joon Choi¹, Gordon W. Beakes², Sally L. Glockling, Julia Kruse¹, Bora Nam¹, Lisa Nigrelli¹, Sebastian Ploch, Hyeon Dong Shin³, Roger G. Shivas, Sabine Telle¹, Hermann Voglmayr⁴, Hermann Voglmayr⁵, Marco Thines•Institutions (5)

Goethe University Frankfurt¹, Newcastle University², Korea University³, University of Natural Resources and Life Sciences, Vienna⁴, University of Vienna⁵

DNA barcoding of Rhododendron (Ericaceae), the largest Chinese plant genus in biodiversity hotspots of the Himalaya-Hengduan Mountains

TL;DR: The cox1 and cox2 loci are compared to determine which is better suited as a universal oomycete barcode, in terms of PCR efficiency for 31 representative genera and for historic herbarium specimens, sequence polymorphism, and intra‐ and interspecific divergence.

Abstract: Oomycetes are a diverse group of eukaryotes in terrestrial, limnic and marine habitats worldwide and include several devastating plant pathogens, for example Phytophthora infestans (potato late blight). The cytochrome c oxidase subunit 2 gene (cox2) has been widely used for identification, taxonomy and phylogeny of various oomycete groups. However, recently the cox1 gene was proposed as a DNA barcode marker instead, together with ITS rDNA. The cox1 locus has been used in some studies of Pythium and Phytophthora, but has rarely been used for other oomycetes, as amplification success of cox1 varies with different lineages and sample ages. To determine which out of cox1 or cox2 is best suited as a universal oomycete barcode, we compared these two genes in terms of (i) PCR efficiency for 31 representative genera, as well as for historic herbarium specimens, and (ii) sequence polymorphism, intra- and interspecific divergence. The primer sets for cox2 successfully amplified all oomycete genera tested, while cox1 failed to amplify three genera. In addition, cox2 exhibited higher PCR efficiency for historic herbarium specimens, providing easier access to barcoding-type material. Sequence data for several historic type specimens exist for cox2, but there are none for cox1. In addition, cox2 yielded higher species identification success, with higher interspecific and lower intraspecific divergences than cox1. Therefore, cox2 is suggested as a partner DNA barcode along with ITS rDNA instead of cox1. The cox2-1 spacer could be a useful marker below species level. Improved protocols and universal primers are presented for all genes to facilitate future barcoding efforts.

Journal Article

[...]

Li-Jun Yan¹, Jie Liu¹, Michael Möller², Michael Möller¹, Lin Zhang¹, Xue-Mei Zhang³, De-Zhu Li¹, Lian-Ming Gao¹•Institutions (3)

Chinese Academy of Sciences¹, Royal Botanic Garden Edinburgh², Yunnan Agricultural University³

Simulating and detecting autocorrelation of molecular evolutionary rates among lineages.

TL;DR: Taking the morphology, distribution range and habitat of the species into account, DNA barcoding provided additional information for species identification and delivered a preliminary assessment of biodiversity for the large genus Rhododendron in the biodiversity hotspots of the Himalaya–Hengduan Mountains.

Abstract: The Himalaya–Hengduan Mountains encompass two global biodiversity hotspots with high levels of biodiversity and endemism. This area is one of the diversification centres of the genus Rhododendron, which is recognized as one of the most taxonomically challenging plant taxa due to recent adaptive radiations and rampant hybridization. In this study, four DNA barcodes were evaluated on 531 samples representing 173 species of seven sections of four subgenera in Rhododendron, with a high sampling density from the Himalaya–Hengduan Mountains employing three analytical methods. The varied approaches (NJ, PWG and BLAST) had different species identification powers with BLAST performing best. With the PWG analysis, the discrimination rates for single barcodes varied from 12.21% to 25.19% with ITS < rbcL < matK < psbA-trnH. Combinations of ITS + psbA-trnH + matK and the four barcodes showed the highest discrimination ability (both 41.98%) among all possible combinations. As a single barcode, psbA-trnH performed best with a relatively high performance (25.19%). Overall, the three-marker combination of ITS + psbA-trnH + matK was found to be the best DNA barcode for identifying Rhododendron species. The relatively low discriminative efficiency of DNA barcoding in this genus (~42%) may possibly be attributable to too low sequence divergences as a result of a long generation time of Rhododendron and complex speciation patterns involving recent radiations and hybridizations. Taking the morphology, distribution range and habitat of the species into account, DNA barcoding provided additional information for species identification and delivered a preliminary assessment of biodiversity for the large genus Rhododendron in the biodiversity hotspots of the Himalaya–Hengduan Mountains.

Journal Article

[...]

Simon Y. W. Ho¹, Sebastián Duchêne¹, David A. Duchêne²•Institutions (2)

University of Sydney¹, Australian National University²

Does complete plastid genome sequencing improve species discrimination and phylogenetic resolution in Araucaria

TL;DR: An R package that allows the evolution of DNA sequences to be simulated according to a range of clock models is presented and the ability of two Bayesian phylogenetic methods to distinguish among different relaxed‐clock models and to quantify rate variation among lineages is assessed.

Abstract: Evolutionary timescales can be estimated from genetic data using phylogenetic methods based on the molecular clock. To account for molecular rate variation among lineages, a number of relaxed-clock models have been developed. Some of these models assume that rates vary among lineages in an autocorrelated manner, so that closely related species share similar rates. In contrast, uncorrelated relaxed clocks allow all of the branch-specific rates to be drawn from a single distribution, without assuming any correlation between rates along neighbouring branches. There is uncertainty about which of these two classes of relaxed-clock models is more appropriate for biological data. We present an R package, NELSI, that allows the evolution of DNA sequences to be simulated according to a range of clock models. Using data generated by this package, we assessed the ability of two Bayesian phylogenetic methods to distinguish among different relaxed-clock models and to quantify rate variation among lineages. The results of our analyses show that rate autocorrelation is typically difficult to detect, even when there is complete taxon sampling. This provides a potential explanation for past failures to detect rate autocorrelation in a range of data sets.
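The contrast between the two classes of relaxed clock can be made concrete with a small simulation sketch. This is not NELSI, which is an R package, and it does not reproduce its interface. The function names, the parent-index tree encoding and the lognormal choices are assumptions made for the example, and the autocorrelated version ignores branch durations, which fuller models use to scale the variance.

```python
import math
import random


def uncorrelated_rates(n_branches, mean_log, sd_log, rng):
    """Each branch rate is an independent draw from one lognormal distribution."""
    return [rng.lognormvariate(mean_log, sd_log) for _ in range(n_branches)]


def autocorrelated_rates(parent, root_rate, sd_log, rng):
    """Each branch rate is its parent branch's rate times a lognormal deviate.

    parent[i] is the index of the parent branch of branch i, or None when the
    branch descends directly from the root. Parents must precede children.
    """
    rates = []
    for par in parent:
        base = root_rate if par is None else rates[par]
        rates.append(base * math.exp(rng.gauss(0.0, sd_log)))
    return rates
```

With `sd_log` set to 0 both functions collapse to a strict clock, which gives a simple sanity check. With larger values, sister branches stay similar under the autocorrelated model but not under the uncorrelated one.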

Journal Article

[...]

Markus Ruhsam¹, Hardeep S. Rai², Sarah Mathews³, T. Gregory Ross⁴, Sean W. Graham⁴, Linda A. Raubeson⁵, Wenbin Mei⁵, Wenbin Mei⁶, Philip Thomas¹, Martin F. Gardner¹, Richard A. Ennos⁷, Peter M. Hollingsworth¹•Institutions (7)

Royal Botanic Garden Edinburgh¹, Utah State University², Harvard University³, University of British Columbia⁴, Central Washington University⁵, University of Florida⁶, University of Edinburgh⁷

01 Sep 2015-Molecular Ecology Resources

TL;DR: Modest gains in discrimination are possible, but using complete plastid genomes or a small number of nuclear genes in DNA barcoding may not substantially raise species discriminatory power in many evolutionarily young lineages.

Abstract: Obtaining accurate phylogenies and effective species discrimination using a small standardized set of plastid genes is challenging in evolutionarily young lineages. Complete plastid genome sequencing offers an increasingly easy-to-access source of characters that helps address this. The usefulness of this approach, however, depends on the extent to which plastid haplotypes track morphological species boundaries. We have tested the power of complete plastid genomes to discriminate among multiple accessions of 11 of 13 New Caledonian Araucaria species, an evolutionarily young lineage where the standard DNA barcoding approach has so far failed and phylogenetic relationships have remained elusive. Additionally, 11 nuclear gene regions were Sanger sequenced for all accessions to ascertain the success of species discrimination using a moderate number of nuclear genes. Overall, fewer than half of the New Caledonian Araucaria species with multiple accessions were monophyletic in the plastid or nuclear trees. However, the plastid data retrieved a phylogeny with a higher resolution compared to any previously published tree of this clade and supported the monophyly of about twice as many species and nodes compared to the nuclear data set. Modest gains in discrimination thus are possible, but using complete plastid genomes or a small number of nuclear genes in DNA barcoding may not substantially raise species discriminatory power in many evolutionarily young lineages. The big challenge therefore remains to develop techniques that allow routine access to large numbers of nuclear markers scaleable to thousands of individuals from phylogenetically disparate sample sets.

Journal Article

A comparison of single nucleotide polymorphism and microsatellite markers for analysis of parentage and kinship in a cooperatively breeding bird

[...]

Lucia R. Weinman¹, Joseph W. Solomon¹, Dustin R. Rubenstein¹, Dustin R. Rubenstein²•Institutions (2)

Columbia University¹, American Museum of Natural History²