Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies

doi:10.1093/NAR/GKS808

Journal Article•DOI•

Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies

Anna Klindworth¹, Elmar Pruesse², Timmy Schweer², Jörg Peplies², Christian Quast², Matthias Horn², Frank Oliver Glöckner²

Max Planck Society¹, Jacobs University Bremen²

01 Jan 2013-Nucleic Acids Research (Oxford University Press)-Vol. 41, Iss: 1, pp 1-11

TL;DR: The results of this study may be used as a guideline for selecting primer pairs with the best overall coverage and phylum spectrum for specific applications, therefore reducing the bias in PCR-based microbial diversity studies.

Abstract: 16S ribosomal RNA gene (rDNA) amplicon analysis remains the standard approach for the cultivation-independent investigation of microbial diversity. The accuracy of these analyses depends strongly on the choice of primers. The overall coverage and phylum spectrum of 175 primers and 512 primer pairs were evaluated in silico with respect to the SILVA 16S/18S rDNA non-redundant reference dataset (SSURef 108 NR). Based on this evaluation a selection of 'best available' primer pairs for Bacteria and Archaea for three amplicon size classes (100-400, 400-1000, ≥ 1000 bp) is provided. The most promising bacterial primer pair (S-D-Bact-0341-b-S-17/S-D-Bact-0785-a-A-21), with an amplicon size of 464 bp, was experimentally evaluated by comparing the taxonomic distribution of the 16S rDNA amplicons with 16S rDNA fragments from directly sequenced metagenomes. The results of this study may be used as a guideline for selecting primer pairs with the best overall coverage and phylum spectrum for specific applications, therefore reducing the bias in PCR-based microbial diversity studies.
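The in silico coverage evaluation described in the abstract can be illustrated with a minimal sketch. It assumes exact matching (zero mismatches) of degenerate primers against a handful of made-up reference strings; it is not the authors' pipeline, and the function names and toy sequences are hypothetical. The two primer sequences are the ones quoted elsewhere on this page for S-D-Bact-0341-b-S-17 and S-D-Bact-0785-a-A-21.

```python
import re

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table that also covers the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a degenerate primer into a regex with one character class per base."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in primer))


def pair_coverage(forward, reverse, references):
    """Fraction of references that contain the forward primer site followed,
    further downstream, by the reverse complement of the reverse primer."""
    if not references:
        raise ValueError("need at least one reference sequence")
    fwd = primer_regex(forward)
    rev = primer_regex(reverse_complement(reverse))
    hits = 0
    for seq in references:
        site = fwd.search(seq)
        if site and rev.search(seq, site.end()):
            hits += 1
    return hits / len(references)


FORWARD = "CCTACGGGNGGCWGCAG"      # S-D-Bact-0341-b-S-17
REVERSE = "GACTACHVGGGTATCTAATCC"  # S-D-Bact-0785-a-A-21

# Hypothetical toy references, not real 16S sequences.
SPACER = "ACGT" * 5
TOY_REFS = [
    "TT" + "CCTACGGGAGGCAGCAG" + SPACER + "GGATTAGATACCCGTGTAGTC" + "AA",  # both sites match
    "TT" + "CCTACGGGAGGCAGCAG" + SPACER + "GGATTAGATACCCGTGTAGTT" + "AA",  # reverse site altered
    "TT" + "CCTTCGGGAGGCAGCAG" + SPACER + "GGATTAGATACCCGTGTAGTC" + "AA",  # forward site altered
]

print(f"pair coverage on toy set: {pair_coverage(FORWARD, REVERSE, TOY_REFS):.2f}")
```

A real evaluation would run against an aligned reference database such as the SILVA dataset named above and would typically report coverage per phylum and with a mismatch allowance; both are omitted here for brevity.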

Citations

Journal Article•DOI•

Reagent and laboratory contamination can critically impact sequence-based microbiome analyses

Susannah J. Salter¹, Michael J. Cox², Elena M. Turek², Szymon T. Calus³, William O.C.M. Cookson², Miriam F. Moffatt², Paul Turner⁴,⁵, Julian Parkhill¹, Nicholas J. Loman³, Alan W. Walker¹,⁶

Wellcome Trust Sanger Institute¹, National Institutes of Health², University of Birmingham³, Mahidol University⁴, University of Oxford⁵, University of Aberdeen⁶

12 Nov 2014-BMC Biology

TL;DR: It is demonstrated that contaminating DNA is ubiquitous in commonly used DNA extraction kits and other laboratory reagents, varies greatly in composition between different kits and kit batches, and that this contamination critically impacts results obtained from samples containing a low microbial biomass.

Abstract: The study of microbial communities has been revolutionised in recent years by the widespread adoption of culture independent analytical techniques such as 16S rRNA gene sequencing and metagenomics. One potential confounder of these sequence-based approaches is the presence of contamination in DNA extraction kits and other laboratory reagents. In this study we demonstrate that contaminating DNA is ubiquitous in commonly used DNA extraction kits and other laboratory reagents, varies greatly in composition between different kits and kit batches, and that this contamination critically impacts results obtained from samples containing a low microbial biomass. Contamination impacts both PCR-based 16S rRNA gene surveys and shotgun metagenomics. We provide an extensive list of potential contaminating genera, and guidelines on how to mitigate the effects of contamination. These results suggest that caution should be advised when applying sequence-based techniques to the study of microbiota present in low biomass environments. Concurrent sequencing of negative control samples is strongly advised.

2,459 citations

Cites methods from "Evaluation of general 16S ribosomal..."

...Primers used were: S-D-Bact-0564-a-S-15, 5′-AYTGGGYDTAAAGNG, and S-D-Bact-0785-b-A-18, 5′-TACNVGGGTATCTAATCC [65], generating a 253 bp amplicon....

Journal Article•DOI•

Every base matters: assessing small subunit rRNA primers for marine microbiomes with mock communities, time series and global field samples

Alma E. Parada¹, David M. Needham¹, Jed A. Fuhrman¹

University of Southern California¹

01 May 2016-Environmental Microbiology

TL;DR: It is shown that beyond in silico predictions, testing with mock communities and field samples is important in primer selection, and a single mismatch can strongly bias amplification, but even perfectly matched primers can exhibit preferential amplification.

Abstract: Microbial community analysis via high-throughput sequencing of amplified 16S rRNA genes is an essential microbiology tool. We found the popular primer pair 515F (515F-C) and 806R greatly underestimated (e.g. SAR11) or overestimated (e.g. Gammaproteobacteria) common marine taxa. We evaluated marine samples and mock communities (containing 11 or 27 marine 16S clones), showing alternative primers 515F-Y (5′-GTGYCAGCMGCCGCGGTAA) and 926R (5′-CCGYCAATTYMTTTRAGTTT) yield more accurate estimates of mock community abundances, produce longer amplicons that can differentiate taxa unresolvable with 515F-C/806R, and amplify eukaryotic 18S rRNA. Mock communities amplified with 515F-Y/926R yielded closer observed community composition versus expected (r² = 0.95) compared with 515F-Y/806R (r² ∼ 0.5). Unexpectedly, biases with 515F-Y/806R against SAR11 in field samples (∼4–10-fold) were stronger than in mock communities (∼2-fold). Correcting a mismatch to Thaumarchaea in the 515F-C increased their apparent abundance in field samples, but not as much as using 926R rather than 806R. With plankton samples rich in eukaryotic DNA (> 1 μm size fraction), 18S sequences averaged ∼17% of all sequences. A single mismatch can strongly bias amplification, but even perfectly matched primers can exhibit preferential amplification. We show that beyond in silico predictions, testing with mock communities and field samples is important in primer selection.
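The r² values quoted in this abstract compare observed with expected mock-community composition. The abstract does not say how they were computed; the sketch below assumes r² is the squared Pearson correlation between the expected and observed relative abundances. The function name and the example abundance vectors are hypothetical and are not data from the paper.

```python
def r_squared(expected, observed):
    """Squared Pearson correlation between two equally long abundance vectors."""
    n = len(expected)
    if n != len(observed) or n < 2:
        raise ValueError("need two vectors of equal length with at least 2 values")
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in observed)
    if sxx == 0 or syy == 0:
        raise ValueError("r squared is undefined for a constant vector")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical mock community: expected vs. observed relative abundances.
expected = [0.40, 0.30, 0.20, 0.10]
observed = [0.35, 0.33, 0.22, 0.10]
print(f"r^2 = {r_squared(expected, observed):.3f}")
```

A value near 1 means the primer pair recovers the known proportions well; a low value, as reported for 515F-Y/806R, points to taxon-specific amplification bias.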

2,077 citations

Journal Article•DOI•

Uniting the classification of cultured and uncultured bacteria and archaea using 16S rRNA gene sequences

Pablo Yarza¹, Pelin Yilmaz², Elmar Pruesse², Frank Oliver Glöckner³, Wolfgang Ludwig⁴, Karl-Heinz Schleifer⁴, William B. Whitman⁵, Jean Euzeby⁶, Rudolf Amann², Ramon Rosselló-Móra¹

University of the Balearic Islands¹, Max Planck Society², Jacobs University Bremen³, Technische Universität München⁴, University of Georgia⁵, École nationale vétérinaire de Toulouse⁶

01 Sep 2014-Nature Reviews Microbiology

TL;DR: This article proposes rational taxonomic boundaries for high taxa of bacteria and archaea on the basis of 16S rRNA gene sequence identities and suggests a rationale for the circumscription of uncultured taxa that is compatible with the taxonomy of cultured bacteria and archaea.

Abstract: Publicly available sequence databases of the small subunit ribosomal RNA gene, also known as 16S rRNA in bacteria and archaea, are growing rapidly, and the number of entries currently exceeds 4 million. However, a unified classification and nomenclature framework for all bacteria and archaea does not yet exist. In this Analysis article, we propose rational taxonomic boundaries for high taxa of bacteria and archaea on the basis of 16S rRNA gene sequence identities and suggest a rationale for the circumscription of uncultured taxa that is compatible with the taxonomy of cultured bacteria and archaea. Our analyses show that only nearly complete 16S rRNA sequences give accurate measures of taxonomic diversity. In addition, our analyses suggest that most of the 16S rRNA sequences of the high taxa will be discovered in environmental surveys by the end of the current decade.

1,755 citations

Journal Article•DOI•

A new view of the tree of life

Laura A. Hug¹,², Brett J. Baker³, Karthik Anantharaman², Christopher T. Brown⁴, Alexander J. Probst², Cindy J. Castelle², Cristina N. Butterfield², Alex W. Hernsdorf⁴, Yuki Amano⁵, Kotaro Ise⁵, Yohey Suzuki⁶, Natasha Dudek⁷, David A. Relman⁸,⁹, Kari M. Finstad⁴, Ronald Amundson⁴, Brian C. Thomas², Jillian F. Banfield²,⁴

University of Waterloo¹, Planetary Science Institute², University of Texas at Austin³, University of California, Berkeley⁴, Japan Atomic Energy Agency⁵, University of Tokyo⁶, University of California, Santa Cruz⁷, Stanford University⁸, Veterans Health Administration⁹

11 Apr 2016-Nature microbiology

TL;DR: New genomic data from over 1,000 uncultivated and little known organisms, together with published sequences, are used to infer a dramatically expanded version of the tree of life, with Bacteria, Archaea and Eukarya included.

Abstract: The tree of life is one of the most important organizing principles in biology [1]. Gene surveys suggest the existence of an enormous number of branches [2], but even an approximation of the full scale of the tree has remained elusive. Recent depictions of the tree of life have focused either on the nature of deep evolutionary relationships [3–5] or on the known, well-classified diversity of life with an emphasis on eukaryotes [6]. These approaches overlook the dramatic change in our understanding of life's diversity resulting from genomic sampling of previously unexamined environments. New methods to generate genome sequences illuminate the identity of organisms and their metabolic capacities, placing them in community and ecosystem contexts [7,8]. Here, we use new genomic data from over 1,000 uncultivated and little known organisms, together with published sequences, to infer a dramatically expanded version of the tree of life, with Bacteria, Archaea and Eukarya included. The depiction is both a global overview and a snapshot of the diversity within each major lineage. The results reveal the dominance of bacterial diversification and underline the importance of organisms lacking isolated representatives, with substantial evolution concentrated in a major radiation of such organisms. This tree highlights major lineages currently underrepresented in biogeochemical models and identifies radiations that are probably important for future evolutionary analyses. An update to the ‘tree of life’ has revealed a dominance of bacterial diversity in many ecosystems and extensive evolution in some branches of the tree. It also highlights how few organisms we have been able to cultivate for further investigation.

1,614 citations

Journal Article•DOI•

Population-level analysis of gut microbiome variation

Gwen Falony¹, Marie Joossens¹,², Sara Vieira-Silva¹, Jun Wang¹, Youssef Darzi¹,², Karoline Faust¹,², Alexander Kurilshikov³, Marc Jan Bonder⁴, Mireia Valles-Colomer¹, Doris Vandeputte¹,², Raul Y. Tito¹,², Samuel Chaffron¹,², Leen Rymenans¹,², Chloë Verspecht¹, Lise De Sutter¹,², Gipsi Lima-Mendez¹, Kevin D'hoe¹,², Karl Jonckheere¹,², Daniel Homola¹,², Roberto Garcia¹,², Ettje F. Tigchelaar⁴, Linda Eeckhaudt¹,², Jingyuan Fu⁴, Liesbet Henckaerts¹, Alexandra Zhernakova⁴, Cisca Wijmenga⁴, Jeroen Raes¹,²

Katholieke Universiteit Leuven¹, Vrije Universiteit Brussel², Novosibirsk State University³, University Medical Center Groningen⁴

29 Apr 2016-Science

TL;DR: Stool consistency showed the largest effect size, whereas medication explained the largest total variance and interacted with other covariate-microbiota associations; sixty-nine covariates were associated with microbiota compositional variation with a 92% replication rate, and proposed disease marker genera were associated with host covariates.

Abstract: Fecal microbiome variation in the average, healthy population has remained under-investigated. Here, we analyzed two independent, extensively phenotyped cohorts: the Belgian Flemish Gut Flora Project (FGFP; discovery cohort; N = 1106) and the Dutch LifeLines-DEEP study (LLDeep; replication; N = 1135). Integration with global data sets (N combined = 3948) revealed a 14-genera core microbiota, but the 664 identified genera still underexplore total gut diversity. Sixty-nine clinical and questionnaire-based covariates were found associated to microbiota compositional variation with a 92% replication rate. Stool consistency showed the largest effect size, whereas medication explained largest total variance and interacted with other covariate-microbiota associations. Early-life events such as birth mode were not reflected in adult microbiota composition. Finally, we found that proposed disease marker genera associated to host covariates, urging inclusion of the latter in study design.

1,562 citations

References

Journal Article•DOI•

Search and clustering orders of magnitude faster than BLAST

Robert C. Edgar

01 Oct 2010-Bioinformatics

TL;DR: UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters and offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.

Abstract: Motivation: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification. Results: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets. Availability: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch. Supplementary information: Supplementary data are available at Bioinformatics online.

17,301 citations

Journal Article•DOI•

Profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA

Gerard Muyzer¹, E. C. de Waal¹, A. G. Uitterlinden¹

Leiden University¹

01 Mar 1993-Applied and Environmental Microbiology

TL;DR: Analysis of the genomic DNA from a bacterial biofilm grown under aerobic conditions suggests that sulfate-reducing bacteria, despite their anaerobicity, were present in this environment.

Abstract: We describe a new molecular approach to analyzing the genetic diversity of complex microbial populations. This technique is based on the separation of polymerase chain reaction-amplified fragments of genes coding for 16S rRNA, all the same length, by denaturing gradient gel electrophoresis (DGGE). DGGE analysis of different microbial communities demonstrated the presence of up to 10 distinguishable bands in the separation pattern, which were most likely derived from as many different species constituting these populations, and thereby generated a DGGE profile of the populations. We showed that it is possible to identify constituents which represent only 1% of the total population. With an oligonucleotide probe specific for the V3 region of 16S rRNA of sulfate-reducing bacteria, particular DNA fragments from some of the microbial populations could be identified by hybridization analysis. Analysis of the genomic DNA from a bacterial biofilm grown under aerobic conditions suggests that sulfate-reducing bacteria, despite their anaerobicity, were present in this environment. The results we obtained demonstrate that this technique will contribute to our understanding of the genetic diversity of uncharacterized microbial populations.

11,380 citations

Additional excerpts

...Primer pairs were: (i): S-D-Bact-0341-b-S-17, 5′-CCTACGGGNGGCWGCAG-3′ (32), and S-D-Bact-0785-a-A-21, 5′-GACTACHVGGGTATCTAATCC-3′ (32); and (ii): S-D-Bact-0008-a-S-16, 5′-AGAGTTTGATCMTGGC-3′ (33), and S-D-Bact-0907-a-A-20, 5′-CCGTCAATTCMTTTGAGTTT-3′ (34)....
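The primers in this excerpt contain IUPAC ambiguity codes (N, W, H, V, M), so each one denotes a small pool of concrete oligonucleotides. As a minimal sketch, the degeneracy of a primer can be counted and the pool enumerated as follows; the function names are hypothetical, and the sequences are taken from the excerpt with extraction spaces removed.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct concrete sequences a degenerate primer represents."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


primers = {
    "S-D-Bact-0341-b-S-17": "CCTACGGGNGGCWGCAG",
    "S-D-Bact-0785-a-A-21": "GACTACHVGGGTATCTAATCC",
    "S-D-Bact-0008-a-S-16": "AGAGTTTGATCMTGGC",
    "S-D-Bact-0907-a-A-20": "CCGTCAATTCMTTTGAGTTT",
}
for name, seq in primers.items():
    print(name, degeneracy(seq))
```

Enumerating the pool is only practical for low degeneracy; the count grows multiplicatively with every ambiguous position.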

Journal Article•DOI•

Greengenes, a Chimera-Checked 16S rRNA Gene Database and Workbench Compatible with ARB

Todd Z. DeSantis¹, Philip Hugenholtz², Neils Larsen, Mark Rojas³, Eoin L. Brodie¹, Keith Keller⁴, Thomas Huber⁵, Daniel Dalevi⁶, Ping Hu¹, Gary L. Andersen¹

Lawrence Berkeley National Laboratory¹, Joint Genome Institute², Baylor University³, University of California, Berkeley⁴, University of Queensland⁵, Chalmers University of Technology⁶

01 Jul 2006-Applied and Environmental Microbiology

TL;DR: A 16S rRNA gene database (http://greengenes.lbl.gov) addresses limitations of public repositories by providing chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies.

Abstract: A 16S rRNA gene database (http://greengenes.lbl.gov) addresses limitations of public repositories by providing chimera screening, standard alignment, and taxonomic classification using multiple published taxonomies. It was found that there is incongruent taxonomic nomenclature among curators even at the phylum level. Putative chimeras were identified in 3% of environmental sequences and in 0.2% of records derived from isolates. Environmental sequences were classified into 100 phylum-level lineages in the Archaea and Bacteria.

9,593 citations

Journal Article•DOI•

Genome sequencing in microfabricated high-density picolitre reactors

Marcel Margulies, Michael Egholm, William E. Altman, Said Attiya, Joel S. Bader, Lisa A. Bemben, Jan Berka, Michael S. Braverman, Yi-Ju Chen, Zhoutao Chen, Scott Dewell, Lei Du, J. M. Fierro, Xavier V. Gomes, Brian C. Godwin, Wen He, Scott Edward Helgesen, Chun Heen Ho, Gerard P. Irzyk, Szilveszter C. Jando, Maria L. I. Alenquer, Thomas P. Jarvie, Kshama B. Jirage, Jong-Bum Kim, James R. Knight, Janna R. Lanza, John H. Leamon, Steven Lefkowitz, Ming Lei, Jing Li, Kenton Lohman, Hong Lu, Vinod Makhijani, Keith Mcdade, Michael P. McKenna, Eugene W. Myers¹, Elizabeth Nickerson, John Nobile, Ramona Plant, Bernard P. Puc, Michael T. Ronan, George T. Roth, Gary J. Sarkis, Jan Fredrik Simons, John Simpson, Maithreyan Srinivasan, Karrie R. Tartaro, Alexander Tomasz², Kari A. Vogt, Greg A. Volkmer, Shally H. Wang, Yong Wang, Michael P. Weiner³, Pengguang Yu, Richard F. Begley, Jonathan M. Rothberg

University of California, Berkeley¹, Rockefeller University², Rothberg Institute For Childhood Diseases³

15 Sep 2005-Nature

TL;DR: A scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments is described and demonstrated by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.

Abstract: The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.

8,434 citations

"Evaluation of general 16S ribosomal..." refers background in this paper

...For example, ‘0338’ stands for start position 338 in the Escherichia coli system of nomenclature (23); (5) A single lowercase letter indicating the version of the probe....
...In 2006, Roche’s 454 GS 20 pyrosequencing (5) became the first high-throughput sequencing technology to be successfully applied for large scale biodiversity analysis and was key to uncovering the ‘rare biosphere’ (6)....
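The first excerpt above describes part of the primer naming scheme: a four-digit start position in E. coli numbering and a lowercase version letter. As a minimal sketch, names such as S-D-Bact-0341-b-S-17 can be split into their fields as shown below. Only the position and version fields are confirmed by the excerpt; reading the remaining fields as target gene, taxonomic level, target group, strand (S for sense, A for antisense) and primer length is an assumption based on the names that appear on this page, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple


class PrimerName(NamedTuple):
    target_gene: str    # assumed: S for the small subunit rRNA gene
    taxon_level: str    # assumed: D for domain
    target_group: str   # assumed: abbreviation of the target group, e.g. Bact
    position: int       # start position in E. coli numbering (from the excerpt)
    version: str        # lowercase version letter (from the excerpt)
    strand: str         # assumed: S = sense, A = antisense
    length: int         # assumed: primer length in nucleotides


# One hyphen-separated field per component of the name.
NAME_PATTERN = re.compile(
    r"([A-Z])-([A-Za-z*]+)-([A-Za-z]+)-(\d+)-([a-z])-([AS])-(\d+)"
)


def parse_primer_name(name):
    """Split a primer name into its fields, or raise ValueError if malformed."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a recognised primer name: {name!r}")
    gene, level, group, position, version, strand, length = match.groups()
    return PrimerName(gene, level, group, int(position), version, strand, int(length))


print(parse_primer_name("S-D-Bact-0341-b-S-17"))
```

Strict parsing of this kind also catches extraction damage in names, for example a missing hyphen between the position and the version letter.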

Journal Article•DOI•

Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms

J. Gregory Caporaso¹, Christian L. Lauber², William A. Walters³, Donna Berg-Lyons², James Huntley³, Noah Fierer²,³, Sarah M. Owens⁴, Jason Betley⁵, Louise Fraser⁵, Markus J. Bauer⁵, Niall Anthony Gormley⁵, Jack A. Gilbert⁴,⁶, Geoff Smith⁵, Rob Knight

Northern Arizona University¹, Cooperative Institute for Research in Environmental Sciences², University of Colorado Boulder³, Argonne National Laboratory⁴, Illumina⁵, University of Chicago⁶

01 Aug 2012-The ISME Journal

TL;DR: It is shown that the protocol developed for these instruments successfully recaptures known biological results, and additionally that biological conclusions are consistent across sequencing platforms (the HiSeq2000 versus the MiSeq) and across the sequenced regions of amplicons.

Abstract: DNA sequencing continues to decrease in cost with the Illumina HiSeq2000 generating up to 600 Gb of paired-end 100 base reads in a ten-day run. Here we present a protocol for community amplicon sequencing on the HiSeq2000 and MiSeq Illumina platforms, and apply that protocol to sequence 24 microbial communities from host-associated and free-living environments. A critical question as more sequencing platforms become available is whether biological conclusions derived on one platform are consistent with what would be derived on a different platform. We show that the protocol developed for these instruments successfully recaptures known biological results, and additionally that biological conclusions are consistent across sequencing platforms (the HiSeq2000 versus the MiSeq) and across the sequenced regions of amplicons.

6,840 citations

"Evaluation of general 16S ribosomal..." refers background in this paper

...The attractiveness of Illumina (8) lies in the reduced per base costs and comparatively high sequencing depth (9), despite having short read lengths....