
Reagent contamination can critically impact sequence-based microbiome analyses


RESEARCH ARTICLE Open Access
Reagent and laboratory contamination can critically impact sequence-based microbiome analyses
Susannah J Salter¹*, Michael J Cox², Elena M Turek², Szymon T Calus³, William O Cookson², Miriam F Moffatt², Paul Turner⁴,⁵, Julian Parkhill¹, Nicholas J Loman³ and Alan W Walker¹,⁶*
Abstract
Background: The study of microbial communities has been revolutionised in recent years by the widespread
adoption of culture-independent analytical techniques such as 16S rRNA gene sequencing and metagenomics. One
potential confounder of these sequence-based approaches is the presence of contamination in DNA extraction kits
and other laboratory reagents.
Results: In this study we demonstrate that contaminating DNA is ubiquitous in commonly used DNA extraction
kits and other laboratory reagents, varies greatly in composition between different kits and kit batches, and that this
contamination critically impacts results obtained from samples containing a low microbial biomass. Contamination
impacts both PCR-based 16S rRNA gene surveys and shotgun metagenomics. We provide an extensive list of
potential contaminating genera, and guidelines on how to mitigate the effects of contamination.
Conclusions: These results suggest that caution should be advised when applying sequence-based techniques to
the study of microbiota present in low biomass environments. Concurrent sequencing of negative control samples
is strongly advised.
Keywords: Contamination, Microbiome, Microbiota, Metagenomics, 16S rRNA
Background
Culture-independent studies of microbial communities
are revolutionising our understanding of microbiology
and revealing exquisite interactions between microbes, an-
imals and plants. Two widely used techniques are deep se-
quence surveying of PCR-amplified marker genes such as
16S rRNA, or whole-genome shotgun metagenomics,
where the entire complement of community DNA is se-
quenced en masse. While both of these approaches are
powerful, they have important technical caveats and limi-
tations, which may distort taxonomic distributions and
frequencies observed in the sequence dataset. Such limita-
tions, which have been well reported in the literature, in-
clude choices relating to sample collection, sample storage
and preservation, DNA extraction, amplifying primers,
sequencing technology, read length and depth and bio-
informatics analysis techniques [1,2].
A related additional problem is the introduction of
contaminating microbial DNA during sample preparation.
Possible sources of DNA contamination include
molecular biology grade water [3-9], PCR reagents
[10-15] and DNA extraction kits themselves [16].
Contaminating sequences matching water- and soil-associated
bacterial genera including Acinetobacter,
Alcaligenes, Bacillus, Bradyrhizobium, Herbaspirillum,
Legionella, Leifsonia, Mesorhizobium, Methylobacterium,
Microbacterium, Novosphingobium, Pseudomonas, Ralstonia,
Sphingomonas, Stenotrophomonas and Xanthomonas
have been reported previously [3-15,17,18]. The presence
of contaminating DNA is a particular challenge for
researchers working with samples containing a low
microbial biomass. In these cases, the low amount of
starting material may be effectively swamped by the
contaminating DNA and generate misleading results.
* Correspondence: sb18@sanger.ac.uk; alan.walker@abdn.ac.uk
¹Pathogen Genomics Group, Wellcome Trust Sanger Institute, Hinxton, UK
⁶Microbiology Group, Rowett Institute of Nutrition and Health, University of Aberdeen, Aberdeen, UK
Full list of author information is available at the end of the article
© 2014 Salter et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain
Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article,
unless otherwise stated.
Salter et al. BMC Biology 2014, 12:87
http://www.biomedcentral.com/1741-7007/12/87

Although the presence of such contaminating DNA has
been reported in the literature, usually associated with
PCR-based studies, its possible impact on high-throughput
16S rRNA gene-based profiling and shotgun metagenomics
studies has not been reported. In our laboratories we rou-
tinely sequence negative controls, consisting of blank
DNA extractions and subsequent PCR amplifications. Des-
pite adding no sample template at the DNA extraction step,
these negative control samples often yield a range of con-
taminating bacterial species (see Table 1), which are often
also visible in the human-derived samples that are proc-
essed concomitantly with the same batch of DNA extrac-
tion kits. The presence of contaminating sequences is
greater in low-biomass samples (such as from blood or the
lung) than in high-biomass samples (such as from faeces),
suggesting that there is a critical tipping point where con-
taminating DNA becomes dominant in sequence libraries.
Many recent publications [19-37] describe important
or core microbiota members, often members that are
biologically unexpected, which overlap with previously-described
contaminant genera. Spurred by this and by
the results from negative control samples in our own
laboratories when dealing with low-input DNA samples,
we investigated the impact of contamination on microbiota
studies and explored methods to limit the impact
of such contamination. In this study we identify the
range of contaminants present in commonly used DNA
extraction reagents and demonstrate the significant impact
they can have on microbiota studies.
Results
16S rRNA gene sequencing of a pure Salmonella bongori culture
To demonstrate the presence of contaminating DNA and
its impact on high and low biomass samples, we used 16S
rRNA gene sequence profiling of a pure culture of Sal-
monella bongori that had undergone five rounds of serial
ten-fold dilutions (equating to a range of approximately
10⁸ cells as input for DNA extraction in the original
undiluted sample, to 10³ cells in dilution five). S. bongori was
chosen because we have not observed it as a contaminant
in any of our previous studies and it can be differentiated
from other Salmonella species by 16S rRNA gene sequen-
cing. As a pure culture was used as starting template, re-
gardless of starting biomass, any organisms other than S.
bongori observed in subsequent DNA sequencing results
must therefore be derived from contamination. Aliquots
from the dilution series were sent to three institutes
(Imperial College London, ICL; University of Birmingham,
UB; Wellcome Trust Sanger Institute, WTSI) and proc-
essed with different batches of the FastDNA SPIN Kit for
Soil (kit FP). 16S rRNA gene amplicons were generated
using both 20 and 40 PCR cycles and returned to WTSI
for Illumina MiSeq sequencing.
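The arithmetic of the dilution series described above can be sketched in a few lines of Python. This is only an illustration under the stated figures (roughly 10⁸ cells undiluted, five ten-fold dilutions); the function name `dilution_series` and its parameters are hypothetical and are not part of the study's methods.

```python
# Illustrative sketch only: expected input cells at each step of a
# ten-fold serial dilution. Names here are hypothetical, not from the paper.

def dilution_series(start_cells: float, n_dilutions: int, factor: float = 10.0) -> list:
    """Return approximate cell input for the undiluted sample and each dilution."""
    return [start_cells / factor ** step for step in range(n_dilutions + 1)]


# Approximate values quoted in the text: ~10^8 cells undiluted, five dilutions.
cells = dilution_series(1e8, 5)

for step, n in enumerate(cells):
    label = "undiluted" if step == 0 else f"dilution {step}"
    print(f"{label}: ~{n:.0e} cells")
```

Because any fixed amount of reagent-derived DNA stays constant while the template falls ten-fold per step, the last entries in this list are the ones most exposed to contamination.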
Table 1 List of contaminant genera detected in sequenced negative blank controls

Phylum: List of constituent contaminant genera

Proteobacteria:
  Alpha-proteobacteria: Afipia, Aquabacterium[e], Asticcacaulis, Aurantimonas, Beijerinckia, Bosea, Bradyrhizobium[d], Brevundimonas[c], Caulobacter, Craurococcus, Devosia, Hoeflea[e], Mesorhizobium, Methylobacterium[c], Novosphingobium, Ochrobactrum, Paracoccus, Pedomicrobium, Phyllobacterium[e], Rhizobium[c,d], Roseomonas, Sphingobium, Sphingomonas[c,d,e], Sphingopyxis
  Beta-proteobacteria: Acidovorax[c,e], Azoarcus[e], Azospira, Burkholderia[d], Comamonas[c], Cupriavidus[c], Curvibacter, Delftia[e], Duganella[a], Herbaspirillum[a,c], Janthinobacterium[e], Kingella, Leptothrix[a], Limnobacter[e], Massilia[c], Methylophilus, Methyloversatilis[e], Oxalobacter, Pelomonas, Polaromonas[e], Ralstonia[b,c,d,e], Schlegelella, Sulfuritalea, Undibacterium[e], Variovorax
  Gamma-proteobacteria: Acinetobacter[a,c,d], Enhydrobacter, Enterobacter, Escherichia[a,c,d,e], Nevskia[e], Pseudomonas[b,d,e], Pseudoxanthomonas, Psychrobacter, Stenotrophomonas[a,b,c,d,e], Xanthomonas[b]
Actinobacteria: Aeromicrobium, Arthrobacter, Beutenbergia, Brevibacterium, Corynebacterium, Curtobacterium, Dietzia, Geodermatophilus, Janibacter, Kocuria, Microbacterium, Micrococcus, Microlunatus, Patulibacter, Propionibacterium[e], Rhodococcus, Tsukamurella
Firmicutes: Abiotrophia, Bacillus[b], Brevibacillus, Brochothrix, Facklamia, Paenibacillus, Streptococcus
Bacteroidetes: Chryseobacterium, Dyadobacter, Flavobacterium[d], Hydrotalea, Niastella, Olivibacter, Pedobacter, Wautersiella
Deinococcus-Thermus: Deinococcus
Acidobacteria: Predominantly unclassified Acidobacteria Gp2 organisms

The listed genera were all detected in sequenced negative controls that were processed alongside human-derived samples in our laboratories (WTSI, ICL and UB) over a period of four years. A variety of DNA extraction and PCR kits were used over this period, although DNA was primarily extracted using the FastDNA SPIN Kit for Soil. Genus names followed by a letter in square brackets indicate those that have also been independently reported as contaminants previously.
[a] also reported by Tanner et al. [12]; [b] also reported by Grahn et al. [14]; [c] also reported by Barton et al. [17]; [d] also reported by Laurence et al. [18]; [e] also detected as contaminants of multiple displacement amplification kits (information provided by Paul Scott, Wellcome Trust Sanger Institute).
ICL, Imperial College London; UB, University of Birmingham; WTSI, Wellcome Trust Sanger Institute.

S. bongori was the sole organism identified in the ori-
ginal undiluted culture but with subsequent dilutions a
range of contaminating bacterial groups increased in rela-
tive abundance while the proportion of S. bongori reads
concurrently decreased (Figure 1). By the fifth serial dilution,
equivalent to an input biomass of roughly 10³
Salmonella cells, contamination was the dominant feature
of the sequencing results. This pattern was consistent
across all three sites and was most pronounced with 40 cy-
cles of PCR. These results highlight a key problem with
low biomass samples. The most diluted 20-PCR cycle
samples resulted in low PCR product yields, leading to
under-representation in the multiplexed pool of samples
for sequencing as an equimolar mix could not be achieved
(read counts for each sample are listed in Additional file 1:
Table S1a). Conversely, using 40 PCR cycles generated
enough PCR products for effective sequencing (a minimum
of 14,000 reads per sample was returned,
see Additional file 1: Table S1a), but a significant proportion
of the resulting sequence data was derived from
Figure 1 Summary of 16S rRNA gene sequencing taxonomic assignment from ten-fold diluted pure cultures and controls. Undiluted DNA
extractions contained approximately 10⁸ cells, and controls (annotated in the Figure with 'con') were template-free PCRs. DNA was extracted at ICL, UB
and WTSI laboratories and amplified with 40 PCR cycles. Each column represents a single sample; sections (a) and (b) describe the same samples at
different taxonomic levels. a) Proportion of S. bongori sequence reads in black. The proportional abundance of non-Salmonella reads at the Class level
is indicated by other colours. As the sample becomes more dilute, the proportion of the sequenced bacterial amplicons from the cultured
microorganism decreases and contaminants become more dominant. b) Abundance of genera which make up >0.5% of the results from at least one
laboratory, excluding S. bongori. The profiles of the non-Salmonella reads within each laboratory/kit batch are consistent but differ between sites.

contaminating, non-Salmonella DNA. It should be noted
though that even when using only 20 PCR cycles contam-
ination was still predominant with the lowest input bio-
mass [see Additional file 1: Figure S1].
Sequence profiles revealed some similar taxonomic clas-
sifications between all sites, including Acidobacteria Gp2,
Microbacterium, Propionibacterium and Pseudomonas
(Figure 1b). Differences between sites were observed, how-
ever, with Chryseobacterium, Enterobacter and Massilia
more dominant at WTSI, Sphingomonas at UB, and Cor-
ynebacterium, Facklamia and Streptococcus at ICL, along
with a greater proportion of Actinobacteria in general
(Figure 1a). This illustrates that there is variation in con-
taminant content between laboratories, which may be due
to differences between reagent/kit batches or contami-
nants introduced from the wider laboratory environment.
Many of the contaminating operational taxonomic units
(OTUs) represent bacterial genera normally found in soil
and water, for example Arthrobacter, Burkholderia, Chry-
seobacterium, Ochrobactrum, Pseudomonas, Ralstonia,
Rhodococcus and Sphingomonas, while others, such as
Corynebacterium, Propionibacterium and Streptococcus,
are common human skin-associated organisms. By se-
quencing PCR blank negative controls, specifically PCR-
amplified ultrapure water with no template DNA added,
we were able to distinguish between taxa that had origi-
nated from the DNA extraction kits as opposed to DNA
from other sources (such as PCR kit reagents, laboratory
consumables or laboratory personnel). Sixty-three taxa
were absent from all PCR blank controls but present at
>0.1% proportional abundance in one or more serially-
diluted S. bongori samples [see Additional file 1: Figure
S2], suggesting that they were introduced to the samples
at the DNA extraction stage. These include several abun-
dant genera observed at all three sites, such as Acidobac-
teria Gp2, Burkholderia, unclassified Burkholderiaceae
and Mesorhizobium. It also includes taxa, such as Hydro-
talea and Bradyrhizobium, that were only abundant in
samples processed by one or two sites, possibly indicative
of variation in contaminants between different batches of
the same type of DNA extraction kit.
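The filtering logic described above (taxa absent from all PCR blank controls but present at >0.1% proportional abundance in at least one diluted sample) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function `kit_derived_taxa`, the data layout and all abundance values in the toy example are assumptions made for demonstration.

```python
# Illustrative sketch only. Data layout (sample -> taxon -> proportion) and
# all numbers below are invented; the 0.1% threshold is taken from the text.

def kit_derived_taxa(sample_props, blank_props, target, threshold=0.001):
    """Return taxa above `threshold` in any sample and absent from every PCR blank.

    `target` is the cultured organism, which is excluded from the candidates.
    """
    seen_in_blanks = {
        taxon
        for props in blank_props.values()
        for taxon, p in props.items()
        if p > 0
    }
    candidates = set()
    for props in sample_props.values():
        for taxon, p in props.items():
            if taxon != target and p > threshold and taxon not in seen_in_blanks:
                candidates.add(taxon)
    return sorted(candidates)


# Hypothetical toy data, not values from the study.
samples = {
    "dilution_4": {"Salmonella": 0.60, "Burkholderia": 0.25,
                   "Pseudomonas": 0.1495, "Hydrotalea": 0.0005},
    "dilution_5": {"Salmonella": 0.20, "Burkholderia": 0.50,
                   "Pseudomonas": 0.28, "Mesorhizobium": 0.02},
}
pcr_blanks = {"pcr_blank_1": {"Pseudomonas": 0.9, "Propionibacterium": 0.1}}

likely_kit_taxa = kit_derived_taxa(samples, pcr_blanks, target="Salmonella")
print(likely_kit_taxa)
```

In this toy case Pseudomonas is excluded because it also appears in the PCR blank, so it cannot be attributed to the extraction kit alone, and Hydrotalea is excluded because it never exceeds the threshold.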
Quantitative PCR of bacterial biomass
To assess how much background bacterial DNA was
present in the samples, we performed qPCR of bacterial
16S rRNA genes and calculated the copy number of
genes present with reference to a standard curve.
Assuming a complete absence of contamination, copy
number of the 16S rRNA genes present should correlate
with dilution of S. bongori and reduce in a linear manner.
However, at the third dilution copy number
remained stable and did not reduce further, indicating
the presence of background DNA at approximately 500
copies per μl of elution volume from the DNA extraction
kit (Figure 2).
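The plateau described above is what a simple additive model would predict: the signal from the culture falls ten-fold per dilution while a constant background is added at extraction. The sketch below assumes such a model purely for illustration; the background of 500 copies per μl is the approximate figure from the text, whereas the undiluted copy number used in the example is a hypothetical value chosen to make the arithmetic easy to follow.

```python
# Illustrative sketch only: an assumed additive model of qPCR copy number,
# observed = culture signal / 10^dilution + constant background.
# The model and the undiluted value are assumptions, not taken from the paper.

def expected_copies(undiluted_copies, dilution, background=500.0):
    """Expected 16S copies per microlitre at a given ten-fold dilution step."""
    return undiluted_copies / 10 ** dilution + background


def contaminant_fraction(undiluted_copies, dilution, background=500.0):
    """Share of the expected copies that comes from the background."""
    return background / expected_copies(undiluted_copies, dilution, background)


# Hypothetical undiluted signal of 5e5 copies per microlitre.
UNDILUTED = 5e5

for d in range(6):
    total = expected_copies(UNDILUTED, d)
    frac = contaminant_fraction(UNDILUTED, d)
    print(f"dilution {d}: {total:.0f} copies/ul, background share {frac:.1%}")
```

Under these assumptions the curve flattens once the diluted signal drops to the same order as the background, and further dilution changes the total very little while the background share approaches 100%.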
Shotgun metagenomics of a pure S. bongori culture processed with four commercial DNA extraction kits
Having established that 16S rRNA gene sequencing results
can be confounded by contaminating DNA, we
next investigated whether similar patterns emerge in
shotgun metagenomics studies, which do not involve a
targeted PCR step. We hypothesised that if contamination
arises from the DNA extraction kit, it should also
be present in metagenomic sequencing results. DNA
extraction kits from four different manufacturers were
used in order to investigate whether or not the problem
was limited to a single manufacturer. Aliquots from the
same S. bongori dilution series were processed at UB
with the FastDNA SPIN Kit for Soil (FP), MoBio UltraClean
Microbial DNA Isolation Kit (MB), QIAamp DNA
Stool Mini Kit (QIA) and PSP Spin Stool DNA Plus kit
(PSP). As with 16S rRNA gene sequencing, it was found
that as the sample dilution increased, the proportion of
reads mapping to the S. bongori reference genome sequence
decreased (Figure 3a). Regardless of kit, contamination
was always the predominant feature of the
sequence data by the fourth serial dilution, which
equated to an input of around 10⁴ Salmonella cells.
Samples were processed concurrently within the same
laboratory. If the contamination was derived from the la-
boratory environment then similar bacterial compositions
would be expected in each of the results. Instead, a range
of environmental bacteria was observed, which were of a
different profile in each kit (Figure 3b). FP had a stable kit
profile dominated by Burkholderia, PSP was dominated by
Bradyrhizobium, while the QIA kit had the most complex
mix of bacterial DNA. Bradyrhizobiaceae, Burkholderiaceae,
Chitinophagaceae, Comamonadaceae, Propionibacteriaceae
and Pseudomonadaceae were present in at least
three quarters of the dilutions from PSP, FP and QIA kits.
However, relative abundances of taxa at the Family level
varied according to kit: FP was marked by Burkholderia-
ceae and Enterobacteriaceae, PSP was marked by Bradyr-
hizobiaceae and Chitinophagaceae. The contamination in
the QIA kit was relatively diverse in comparison to the
other kits, and included higher proportions of Aerococca-
ceae, Bacillaceae, Flavobacteriaceae, Microbacteriaceae,
Paenibacillaceae, Planctomycetaceae and Polyangiaceae
than the other kits. Kit MB did not have a distinct con-
taminant profile. This was likely a result of the very low
number of reads sequenced, with 210 reads in dilution 2,
79 reads in dilution 3 and fewer than 20 reads in subsequent
dilutions [see Additional file 1: Table S1b]. Although
read count is only a semi-quantitative measure of DNA
concentration, this may indicate that levels of background

contamination from this kit were comparatively lower than
the other kits tested.
Comparatively few of the contaminant taxa detected
in the blank water control, which was dominated
by Pseudomonas, were detected in the serially diluted
metagenomic samples. This provided further evidence
that the observed contamination was likely to have originated
in large part from the DNA extraction kits themselves.
These metagenomic results, therefore, clearly
show that contamination becomes the dominant feature
of sequence data from low biomass samples, and that
the kit used to extract DNA can have an impact on the
observed bacterial diversity, even in the absence of a
PCR amplification step. Reducing input biomass again
increases the impact of these contaminants upon the observed
microbiota.
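The reasoning in this section rests on comparing contaminant profiles between kits: a shared laboratory source would give similar profiles, whereas kit-specific contamination gives dissimilar ones. One way to put a number on that comparison is the Bray-Curtis dissimilarity, sketched below. Note that this measure is introduced here only as an illustration and is not stated in the text as the authors' method; the function name and the family-level proportions are hypothetical.

```python
# Illustrative sketch only: Bray-Curtis dissimilarity between two
# contaminant profiles given as taxon -> proportion dictionaries.
# The profiles below are invented; only the family names echo the text.

def bray_curtis(p, q):
    """Return 0.0 for identical profiles and 1.0 for profiles sharing no taxa."""
    taxa = set(p) | set(q)
    numerator = sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in taxa)
    denominator = sum(p.get(t, 0.0) + q.get(t, 0.0) for t in taxa)
    return numerator / denominator if denominator else 0.0


# Hypothetical non-Salmonella profiles for two kits.
fp_profile = {"Burkholderiaceae": 0.6, "Enterobacteriaceae": 0.3,
              "Propionibacteriaceae": 0.1}
psp_profile = {"Bradyrhizobiaceae": 0.5, "Chitinophagaceae": 0.4,
               "Propionibacteriaceae": 0.1}

between_kits = bray_curtis(fp_profile, psp_profile)
within_kit = bray_curtis(fp_profile, fp_profile)
print(f"between kits: {between_kits:.2f}, within kit: {within_kit:.2f}")
```

A high between-kit value alongside low values among dilutions processed with the same kit would be consistent with the kit, rather than the shared laboratory environment, being the main source of the contaminants.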
Impact of contaminated extraction kits on a study of low-biomass microbiota
Having established that the contamination in different lots
of DNA extraction kits is not constant or predictable, we
next show the impact that this can have on real datasets. A
recent study in a refugee camp on the border between
Thailand and Burma used an existing nasopharyngeal swab
archive [38] to examine the development of the infant
nasopharyngeal microbiota. A cohort of 20 children born
in 2007/2008 were sampled every month until two years of
age, and the 16S rRNA gene profiles of these swabs were
sequenced by 454 pyrosequencing.
Principal coordinate analysis (PCoA) showed two dis-
tinct clusters distinguishing samples taken during early life
from those taken from subsequent sampling time points,
suggesting an early, founder nasopharyngeal microbiota
(Figure 4a). Four batches of FP kits had been used to ex-
tract the samples and a record was made of which kit was
used for each sample. Further analysis of the OTUs
present indicated that samples possessed different com-
munities depending on which kit had been used for DNA
extraction (Figure 4b,d,e) and that the OTUs associated with
the first two kits made up the majority of the reads in their samples
(Figure 4d). As samples had been extracted in chrono-
logical order, rather than random order, this led to the
false conclusion that OTUs from the first two kits were
associated with age. OTUs driving clustering to the left in
Figure 4a and b (P value of <0.01), were classified as
Achromobacter, Aminobacter, Brevundimonas, Herbaspir-
illum, Ochrobactrum, Pedobacter, Pseudomonas, Rhodo-
coccus, Sphingomonas and Stenotrophomonas. OTUs
driving data points to the right (P value of <0.01) included
Acidaminococcus and Ralstonia. A full list of significant
OTUs is shown in Additional file 1: Table S2. Once the
contaminants were identified and removed, the PCoA
clustering of samples from the run no longer had a dis-
cernible pattern, showing that the contamination was the
biggest driver of sample ordination (Figure 4c). New ali-
quots were obtained from the original sample archive and
were reprocessed using a different kit lot and sequenced.
The previously observed contaminant OTUs were not
detected, further confirming their absence in the original
Figure 2 Copy number of total 16S rRNA genes present in a dilution series of S. bongori culture. Total bacterial DNA present in serial
ten-fold dilutions of a pure S. bongori culture was quantified using qPCR. While the copy number initially reduces in tandem with increased
dilution, plateauing after four dilutions indicates consistent background levels of contaminating DNA. Error bars indicate standard deviation of
triplicate reactions. The broken red line indicates the detection limit of 45 copies of 16S rRNA genes. The no template internal control for the
qPCR reactions (shown in blue) was below the cycle threshold selected for interpreting the fluorescence values (that is, less than 0), indicating
the contamination did not come from the qPCR reagents themselves.
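The contaminant-removal step described for the nasopharyngeal dataset (identify contaminant OTUs, remove them, then re-examine the ordination) can be sketched as a simple filter followed by renormalisation. This is a hedged illustration rather than the study's actual workflow: the function `remove_contaminants`, the data layout and the read counts are hypothetical, and the contaminant set would in practice come from sequenced negative controls.

```python
# Illustrative sketch only: drop flagged contaminant OTUs from per-sample
# read counts and renormalise the remainder to proportions.
# Sample names and counts are invented for demonstration.

def remove_contaminants(counts, contaminants):
    """counts: sample -> OTU -> read count. Returns sample -> OTU -> proportion."""
    cleaned = {}
    for sample, otus in counts.items():
        kept = {otu: n for otu, n in otus.items() if otu not in contaminants}
        total = sum(kept.values())
        # A sample left with no reads is returned empty rather than dropped,
        # so the caller can see that it consisted entirely of contaminants.
        cleaned[sample] = {otu: n / total for otu, n in kept.items()} if total else {}
    return cleaned


# Hypothetical read counts for two swabs.
read_counts = {
    "swab_early": {"Streptococcus": 30, "Ralstonia": 60, "Moraxella": 10},
    "swab_blank_like": {"Ralstonia": 80, "Herbaspirillum": 20},
}
flagged = {"Ralstonia", "Herbaspirillum"}

cleaned_profiles = remove_contaminants(read_counts, flagged)
print(cleaned_profiles)
```

Samples that come back empty or with very few remaining reads are the ones where contamination dominated, and they deserve particular caution in any downstream ordination.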
