scispace - formally typeset
Author

Sarah Maman

Bio: Sarah Maman is an academic researcher from the University of Toulouse. The author has contributed to research in topics: Metagenomics & Illumina dye sequencing. The author has an h-index of 7 and has co-authored 8 publications receiving 501 citations. Previous affiliations of Sarah Maman include Institut national de la recherche agronomique.

Papers
Journal ArticleDOI
TL;DR: This Galaxy‐supported pipeline, called FROGS, is designed to analyze large sets of amplicon sequences and produce abundance tables of Operational Taxonomic Units (OTUs) and their taxonomic affiliation, with a multi-affiliation output that highlights database conflicts and uncertainties.
Abstract: Motivation: Metagenomics leads to major advances in microbial ecology and biologists need user-friendly tools to analyze their data on their own. Results: This Galaxy-supported pipeline, called FROGS, is designed to analyze large sets of amplicon sequences and produce abundance tables of Operational Taxonomic Units (OTUs) and their taxonomic affiliation. The clustering uses Swarm. The chimera removal uses VSEARCH, combined with original cross-sample validation. The taxonomic affiliation returns an innovative multi-affiliation output to highlight database conflicts and uncertainties. Statistical results and numerous graphical illustrations are produced along the way to monitor the pipeline. FROGS was tested for the detection and quantification of OTUs on real and in silico datasets and proved to be rapid, robust and highly sensitive. It compares favorably with the widespread mothur, UPARSE and QIIME. Availability and implementation: Source code and instructions for installation: https://github.com/geraldinepascal/FROGS.git. A companion website: http://frogs.toulouse.inra.fr. Contact: geraldine.pascal@inra.fr. Supplementary information: Supplementary data are available at Bioinformatics online.

527 citations

Journal ArticleDOI
TL;DR: The results confirm that Serratia symbiotica is often present in Cinara species, in addition to the primary symbiont, Buchnera aphidicola, and reveal new symbiotic associations with Erwinia‐ and Sodalis‐related bacteria.
Abstract: The bacterial communities inhabiting arthropods are generally dominated by a few endosymbionts that play an important role in the ecology of their hosts. Rather than comparing bacterial species richness across samples, ecological studies on arthropod endosymbionts often seek to identify the main bacterial strains associated with each specimen studied. The filtering out of contaminants from the results and the accurate taxonomic assignment of sequences are therefore crucial in arthropod microbiome studies. We aimed here to validate an Illumina 16S rRNA gene sequencing protocol and analytical pipeline for investigating endosymbiotic bacteria associated with aphids. Using replicate DNA samples from 12 species (Aphididae: Lachninae, Cinara) and several controls, we removed individual sequences not meeting a minimum threshold number of reads in each sample and carried out taxonomic assignment for the remaining sequences. With this approach, we show that (i) contaminants accounted for a negligible proportion of the bacteria identified in our samples; (ii) the taxonomic composition of our samples and the relative abundance of reads assigned to a taxon were very similar across PCR and DNA replicates for each aphid sample; in particular, bacterial DNA concentration had no impact on the results. Furthermore, by analysing the distribution of unique sequences across samples rather than aggregating them into operational taxonomic units (OTUs), we gained insight into the specificity of endosymbionts for their hosts. Our results confirm that Serratia symbiotica is often present in Cinara species, in addition to the primary symbiont, Buchnera aphidicola. Furthermore, our findings reveal new symbiotic associations with Erwinia- and Sodalis-related bacteria. We conclude with suggestions for generating and analysing 16S rRNA gene sequences for arthropod-endosymbiont studies.

61 citations
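
The per-sample read threshold described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `filter_low_count_sequences`, the nested-dictionary layout, the sample and sequence labels, and the threshold value are all hypothetical and chosen only for illustration.

```python
from typing import Dict

# counts[sample][sequence] = number of reads of that unique sequence in that sample
ReadCounts = Dict[str, Dict[str, int]]


def filter_low_count_sequences(counts: ReadCounts, min_reads: int) -> ReadCounts:
    """Within each sample, keep only sequences with at least `min_reads` reads."""
    return {
        sample: {seq: n for seq, n in seqs.items() if n >= min_reads}
        for sample, seqs in counts.items()
    }


# Hypothetical toy data: two samples, three unique sequences.
toy_counts = {
    "aphid_A_rep1": {"seq_1": 950, "seq_2": 40, "seq_3": 3},
    "aphid_A_rep2": {"seq_1": 1020, "seq_2": 2, "seq_3": 0},
}

# Hypothetical threshold of 10 reads per sequence per sample.
filtered = filter_low_count_sequences(toy_counts, min_reads=10)
print(filtered)
```

Because the filter is applied independently to each sample, a sequence can be retained in one replicate and dropped in another, which is one reason replicate comparisons are informative.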

Journal ArticleDOI
TL;DR: It is concluded that RNA-Seq and 16S-MiSeq are equally sensitive in detecting bacteria, and the number of bacterial reads obtained with the 16S-MiSeq could be a good proxy for bacterial prevalence.
Abstract: Background: Rodents are major reservoirs of pathogens responsible for numerous zoonotic diseases in humans and livestock. Assessing their microbial diversity at both the individual and population level is crucial for monitoring endemic infections and revealing microbial association patterns within reservoirs. Recently, NGS approaches have been employed to characterize microbial communities of different ecosystems. Yet, their relative efficacy has not been assessed. Here, we compared two NGS approaches, RNA-Sequencing (RNA-Seq) and 16S-metagenomics, assessing their ability to survey neglected zoonotic bacteria in rodent populations. Methodology/Principal Findings: We first extracted nucleic acids from the spleens of 190 voles collected in France. RNA extracts were pooled, randomly retro-transcribed, then RNA-Seq was performed using HiSeq. Assembled bacterial sequences were assigned to the closest taxon registered in GenBank. DNA extracts were analyzed via a 16S-metagenomics approach using two sequencers: the 454 GS-FLX and the MiSeq. The V4 region of the gene coding for 16S rRNA was amplified for each sample using barcoded universal primers. Amplicons were multiplexed and processed on the distinct sequencers. The resulting datasets were de-multiplexed, and each read was processed through a pipeline to be taxonomically classified using the Ribosomal Database Project. Altogether, 45 pathogenic bacterial genera were detected. The bacteria identified by RNA-Seq were comparable to those detected by the 16S-metagenomics approach processed with MiSeq (16S-MiSeq). In contrast, 21 of these pathogens went unnoticed when the 16S-metagenomics approach was processed via 454-pyrosequencing (16S-454). In addition, the 16S-metagenomics approaches revealed a high level of coinfection in bank voles. Conclusions/Significance: We concluded that RNA-Seq and 16S-MiSeq are equally sensitive in detecting bacteria. However, only the 16S-MiSeq method enabled identification of bacteria in each individual reservoir, with subsequent derivation of bacterial prevalence in host populations, and generation of intra-reservoir patterns of bacterial interactions. Lastly, the number of bacterial reads obtained with the 16S-MiSeq could be a good proxy for bacterial prevalence.

58 citations

Journal ArticleDOI
TL;DR: The data demonstrate that NGS allows having a rather complete screening of pathogenic bacteria present in animal reservoirs without any a priori on their presence, while having a price compatible with cohort studies.
Abstract: Rodents represent one of the major sources of pathogens; most of them are vectored by ticks. Tick-borne diseases are very diverse and cause a wide range of diseases in livestock and human populations. Rodents, carrying ticks, are distributed across a vast range of natural habitats and they often live in close contact with humans and their domestic animals, exposing them to zoonoses circulating in natural ecosystems. In this study, we analyse the potential of Next-Generation Sequencing (NGS) technologies as a tool for large-scale survey of bacterial zoonotic pathogens carried by rodents. We combined two NGS approaches in order to establish a list of zoonotic bacteria and to identify their distribution in individual rodents in natural populations. Briefly, RNA/DNA were extracted from the spleen of 192 rodents collected in Northeast France. RNA from all samples was pooled and submitted to high throughput RNA sequencing (RNAseq). Following de novo assembly, bacterial contigs were assigned to the closest already-known taxa, revealing a list of zoonotic bacteria for the whole sample. In parallel, DNA samples were submitted to a meta-barcoding approach: each sample was amplified by PCR using universal primers tagged at the V4 region of the 16S rRNA. The amplified templates were multiplexed and submitted to 454-pyrosequencing. The resulting dataset was demultiplexed using a home-made pipeline that assigns each read to a sample using the tagged primers; the reads were then processed using the Mothur pipeline to construct OTUs and classify them using the RDP database. These methods allowed us to list the bacteria detected in each rodent and so derive the prevalence, coinfections and bacterial interactions. DNA/RNA of the following bacterial genera were detected by both approaches, RNAseq and DNA 16S-metabarcoding: Bartonella, Leptospira, Borrelia, Rickettsia, Treponema, Neisseria, Spiroplasma, Klebsiella, Listeria and Shigella. Some unexpected genera were detected, such as Orientia, up to now only found in Asia, or Helicobacter, generally thought to be restricted to animal guts. Several bacterial pathogens explored by RNAseq passed undetected by 16S-metabarcoding: Anaplasma, (Neo)Ehrlichia, Wolbachia, Brucella, Coxiella, Campylobacter, Mycoplasma, Salmonella, Yersinia, and Francisella. Furthermore, 16S meta-barcoding allowed us to specify the prevalence of bacteria within our sample, and revealed a high level of coinfection in wild rodents. Our data demonstrate that NGS allows having a rather complete screening of pathogenic bacteria present in animal reservoirs without any a priori on their presence, while having a price compatible with cohort studies. NGS approaches are becoming the new routine approaches in large-scale epidemiological studies.

27 citations

Journal ArticleDOI
13 May 2020-Viruses
TL;DR: Little evidence of genetic mixing between the spatially segregated lineages was found, suggesting that BCoV genetic diversity is a result of a global transmission pathway that occurred during the last century; however, evolution rates varied between the European and non-European lineages, indicating differences in virus ecology.
Abstract: Bovine coronavirus (BCoV) is widespread in cattle and wild ruminant populations throughout the world. The virus causes neonatal calf diarrhea and winter dysentery in adult cattle, as well as upper and lower respiratory tract infection in young cattle. We isolated and deep sequenced whole genomes of BCoV from calves with respiratory distress in the south-west of France and conducted a comparative genome analysis using globally collected BCoV sequences to provide insights into the genomic characteristics, evolutionary origins, and global diversity of BCoV. Molecular clock analyses allowed us to estimate that the BCoV ancestor emerged in the 1940s, and that two geographically distinct lineages diverged from the 1960s-1970s. A recombination event in the spike gene (breakpoint at nt 1100) may be at the origin of the genetic divergence sixty years ago. Little evidence of genetic mixing between the spatially segregated lineages was found, suggesting that BCoV genetic diversity is a result of a global transmission pathway that occurred during the last century. However, we found variation in evolution rates between the European and non-European lineages indicating differences in virus ecology.

20 citations


Cited by
Journal Article
TL;DR: FastTree stores sequence profiles of internal nodes in the tree, uses them to implement neighbor-joining with heuristics that quickly identify candidate joins, and then uses nearest-neighbor interchanges to reduce the length of the tree.
Abstract: Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement neighbor-joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest-neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N^2) space and O(N^2 L) time, but FastTree requires just O( NLa + N sqrt(N) ) memory and O( N sqrt(N) log(N) L a ) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 hours and 2.4 gigabytes of memory. Just computing pairwise Jukes-Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 hours and 50 gigabytes of memory. In simulations, FastTree was slightly more accurate than neighbor joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

2,436 citations
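
The 50-gigabyte figure quoted in the FastTree abstract above for storing pairwise distances between 158,022 sequences can be checked with a short back-of-the-envelope sketch. Two assumptions are made here that the abstract does not state: one distance is stored per unordered pair of sequences, and each distance takes 4 bytes (a single-precision float). The function name `distance_matrix_bytes` is hypothetical.

```python
def distance_matrix_bytes(n_sequences: int, bytes_per_entry: int = 4) -> int:
    """Bytes needed to store one distance per unordered pair of sequences.

    Assumes only the upper triangle is stored, i.e. N * (N - 1) / 2 entries,
    which still grows as O(N^2).
    """
    n_pairs = n_sequences * (n_sequences - 1) // 2
    return n_pairs * bytes_per_entry


n = 158_022  # number of distinct 16S rRNA sequences cited in the abstract
size_gb = distance_matrix_bytes(n) / 1e9  # decimal gigabytes
print(f"{size_gb:.1f} GB")  # prints "49.9 GB"
```

Under these assumptions the storage comes to roughly 50 GB, in line with the abstract, whereas the 2.4 GB reported for FastTree reflects its profile-based representation rather than a full distance matrix.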

Journal ArticleDOI
TL;DR: The application of decontam to two recently published datasets corroborated and extended their conclusions that little evidence existed for an indigenous placenta microbiome and that some low-frequency taxa seemingly associated with preterm birth were contaminants.
Abstract: The accuracy of microbial community surveys based on marker-gene and metagenomic sequencing (MGS) suffers from the presence of contaminants—DNA sequences not truly present in the sample. Contaminants come from various sources, including reagents. Appropriate laboratory practices can reduce contamination, but do not eliminate it. Here we introduce decontam ( https://github.com/benjjneb/decontam ), an open-source R package that implements a statistical classification procedure that identifies contaminants in MGS data based on two widely reproduced patterns: contaminants appear at higher frequencies in low-concentration samples and are often found in negative controls. Decontam classified amplicon sequence variants (ASVs) in a human oral dataset consistently with prior microscopic observations of the microbial taxa inhabiting that environment and previous reports of contaminant taxa. In metagenomics and marker-gene measurements of a dilution series, decontam substantially reduced technical variation arising from different sequencing protocols. The application of decontam to two recently published datasets corroborated and extended their conclusions that little evidence existed for an indigenous placenta microbiome and that some low-frequency taxa seemingly associated with preterm birth were contaminants. Decontam improves the quality of metagenomic and marker-gene sequencing by identifying and removing contaminant DNA sequences. Decontam integrates easily with existing MGS workflows and allows researchers to generate more accurate profiles of microbial communities at little to no additional cost.

1,287 citations

Posted ContentDOI
25 Jul 2018-bioRxiv
TL;DR: The application of decontam to two recently published datasets corroborated and extended their conclusions that little evidence existed for an indigenous placenta microbiome, and that some low-frequency taxa seemingly associated with preterm birth were run-specific contaminants.
Abstract: Background: The accuracy of microbial community surveys based on marker-gene and metagenomic sequencing (MGS) suffers from the presence of contaminants - DNA sequences not truly present in the sample. Contaminants come from various sources, including reagents. Appropriate laboratory practices can reduce contamination, but do not eliminate it. Here we introduce decontam (https://github.com/benjjneb/decontam), an open-source R package that implements a statistical classification procedure that identifies contaminants in MGS data based on two widely reproduced patterns: contaminants appear at higher frequencies in low-concentration samples, and are often found in negative controls. Results: decontam classified amplicon sequence variants (ASVs) in a human oral dataset consistently with prior microscopic observations of the microbial taxa inhabiting that environment and previous reports of contaminant taxa. In metagenomics and marker-gene measurements of a dilution series, decontam substantially reduced technical variation arising from different sequencing protocols. The application of decontam to two recently published datasets corroborated and extended their conclusions that little evidence existed for an indigenous placenta microbiome, and that some low-frequency taxa seemingly associated with preterm birth were contaminants. Conclusions: decontam improves the quality of metagenomic and marker-gene sequencing by identifying and removing contaminant DNA sequences. decontam integrates easily with existing MGS workflows, and allows researchers to generate more accurate profiles of microbial communities at little to no additional cost.

584 citations

Journal ArticleDOI
TL;DR: It is demonstrated that fecal microbiota transplants and chronic treatment with phenylacetic acid, a microbial product of aromatic amino acids metabolism, successfully trigger steatosis and branched-chain amino acid metabolism.
Abstract: Hepatic steatosis is a multifactorial condition that is often observed in obese patients and is a prelude to non-alcoholic fatty liver disease. Here, we combine shotgun sequencing of fecal metagenomes with molecular phenomics (hepatic transcriptome and plasma and urine metabolomes) in two well-characterized cohorts of morbidly obese women recruited to the FLORINASH study. We reveal molecular networks linking the gut microbiome and the host phenome to hepatic steatosis. Patients with steatosis have low microbial gene richness and increased genetic potential for the processing of dietary lipids and endotoxin biosynthesis (notably from Proteobacteria), hepatic inflammation and dysregulation of aromatic and branched-chain amino acid metabolism. We demonstrated that fecal microbiota transplants and chronic treatment with phenylacetic acid, a microbial product of aromatic amino acid metabolism, successfully trigger steatosis and branched-chain amino acid metabolism. Molecular phenomic signatures were predictive (area under the curve = 87%) and consistent with the gut microbiome having an effect on the steatosis phenome (>75% shared variation) and, therefore, actionable via microbiome-based therapies. Metabolic activity of specific human gut microorganisms contributes to liver steatosis in obese women.

396 citations

Journal ArticleDOI
TL;DR: Microplastics collected in the Bay of Brest were characterized by manual sorting followed by Raman spectroscopy, and their associated bacterial assemblages were studied using 16S amplicon high-throughput sequencing, to understand the role of microplastics in pathogen population transport and ultimate disease emergence.

239 citations