scispace - formally typeset
Search or ask a question
Author

Jason W. Sahl

Bio: Jason W. Sahl is an academic researcher from Northern Arizona University. The author has contributed to research in topics: Genome & Burkholderia pseudomallei. The author has an hindex of 36, co-authored 136 publications receiving 19338 citations. Previous affiliations of Jason W. Sahl include Reed College & Colorado School of Mines.


Papers
More filters
Journal ArticleDOI
TL;DR: M mothur is used as a case study to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the α and β diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments.
Abstract: mothur aims to be a comprehensive software package that allows users to use a single piece of software to analyze community sequence data. It builds upon previous tools to provide a flexible and powerful software package for analyzing sequencing data. As a case study, we used mothur to trim, screen, and align sequences; calculate distances; assign sequences to operational taxonomic units; and describe the alpha and beta diversity of eight marine samples previously characterized by pyrosequencing of 16S rRNA gene fragments. This analysis of more than 222,000 sequences was completed in less than 2 h with a laptop computer.

17,350 citations

Journal ArticleDOI
TL;DR: The findings suggest that horizontal genetic exchange allowed for the emergence of the highly virulent Shiga-toxin-producing enteroaggregative E. coli O104:H4 strain that caused the German outbreak, and highlight the way in which the plasticity of bacterial genomes facilitates the emerged of new pathogens.
Abstract: Background A large outbreak of diarrhea and the hemolytic–uremic syndrome caused by an unusual serotype of Shiga-toxin–producing Escherichia coli (O104:H4) began in Germany in May 2011. As of July 22, a large number of cases of diarrhea caused by Shiga-toxin–producing E. coli have been reported — 3167 without the hemolytic–uremic syndrome (16 deaths) and 908 with the hemolytic–uremic syndrome (34 deaths) — indicating that this strain is notably more virulent than most of the Shiga-toxin–producing E. coli strains. Preliminary genetic characterization of the outbreak strain suggested that, unlike most of these strains, it should be classified within the enteroaggregative pathotype of E. coli. Methods We used third-generation, single-molecule, real-time DNA sequencing to determine the complete genome sequence of the German outbreak strain, as well as the genome sequences of seven diarrhea-associated enteroaggregative E. coli serotype O104:H4 strains from Africa and four enteroaggregative E. coli reference st...

840 citations

Journal ArticleDOI
TL;DR: It is concluded that the Y pestis lineages that caused the Plague of Justinian and the Black Death 800 years later were independent emergences from rodents into human beings.
Abstract: Summary Background Yersinia pestis has caused at least three human plague pandemics. The second (Black Death, 14–17th centuries) and third (19–20th centuries) have been genetically characterised, but there is only a limited understanding of the first pandemic, the Plague of Justinian (6–8th centuries). To address this gap, we sequenced and analysed draft genomes of Y pestis obtained from two individuals who died in the first pandemic. Methods Teeth were removed from two individuals (known as A120 and A76) from the early medieval Aschheim-Bajuwarenring cemetery (Aschheim, Bavaria, Germany). We isolated DNA from the teeth using a modified phenol-chloroform method. We screened DNA extracts for the presence of the Y pestis -specific pla gene on the pPCP1 plasmid using primers and standards from an established assay, enriched the DNA, and then sequenced it. We reconstructed draft genomes of the infectious Y pestis strains, compared them with a database of genomes from 131 Y pestis strains from the second and third pandemics, and constructed a maximum likelihood phylogenetic tree. Findings Radiocarbon dating of both individuals (A120 to 533 AD [plus or minus 98 years]; A76 to 504 AD [plus or minus 61 years]) places them in the timeframe of the first pandemic. Our phylogeny contains a novel branch (100% bootstrap at all relevant nodes) leading to the two Justinian samples. This branch has no known contemporary representatives, and thus is either extinct or unsampled in wild rodent reservoirs. The Justinian branch is interleaved between two extant groups, 0.ANT1 and 0.ANT2, and is distant from strains associated with the second and third pandemics. Interpretation We conclude that the Y pestis lineages that caused the Plague of Justinian and the Black Death 800 years later were independent emergences from rodents into human beings. These results show that rodent species worldwide represent important reservoirs for the repeated emergence of diverse lineages of Y pestis into human populations. Funding McMaster University, Northern Arizona University, Social Sciences and Humanities Research Council of Canada, Canada Research Chairs Program, US Department of Homeland Security, US National Institutes of Health, Australian National Health and Medical Research Council.

354 citations

Journal ArticleDOI
01 Apr 2014-PeerJ
TL;DR: The large-scale BLAST score ratio (LS-BSR) pipeline is presented, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed.
Abstract: Background. As whole genome sequence data from bacterial isolates becomes cheaper to generate, computational methods are needed to correlate sequence data with biological observations. Here we present the large-scale BLAST score ratio (LS-BSR) pipeline, which rapidly compares the genetic content of hundreds to thousands of bacterial genomes, and returns a matrix that describes the relatedness of all coding sequences (CDSs) in all genomes surveyed. This matrix can be easily parsed in order to identify genetic relationships between bacterial genomes. Although pipelines have been published that group peptides by sequence similarity, no other software performs the rapid, large-scale, full-genome comparative analyses carried out by LS-BSR. Results. To demonstrate the utility of the method, the LS-BSR pipeline was tested on 96 Escherichia coli and Shigella genomes; the pipeline ran in 163 min using 16 processors, which is a greater than 7-fold speedup compared to using a single processor. The BSR values for each CDS, which indicate a relative level of relatedness, were then mapped to each genome on an independent core genome single nucleotide polymorphism (SNP) based phylogeny. Comparisons were then used to identify clade specific CDS markers and validate the LS-BSR pipeline based on molecular markers that delineate between classical E. coli pathogenic variant (pathovar) designations. Scalability tests demonstrated that the LS-BSR pipeline can process 1,000 E. coli genomes in 27-57 h, depending upon the alignment method, using 16 processors. Conclusions. LS-BSR is an open-source, parallel implementation of the BSR algorithm, enabling rapid comparison of the genetic content of large numbers of genomes. The results of the pipeline can be used to identify specific markers between user-defined phylogenetic groups, and to identify the loss and/or acquisition of genetic information between bacterial isolates. Taxa-specific genetic markers can then be translated into clinical diagnostics, or can be used to identify broadly conserved putative therapeutic candidates.

213 citations

Journal ArticleDOI
25 Aug 2016
TL;DR: This study demonstrates how NASP compares with other tools in the analysis of two real bacterial genomics datasets and one simulated dataset and demonstrates differences in results based on the choice of the reference genome and choice of inferring phylogenies from concatenated SNPs or alignments including monomorphic positions.
Abstract: Whole-genome sequencing (WGS) of bacterial isolates has become standard practice in many laboratories. Applications for WGS analysis include phylogeography and molecular epidemiology, using single nucleotide polymorphisms (SNPs) as the unit of evolution. NASP was developed as a reproducible method that scales well with the hundreds to thousands of WGS data typically used in comparative genomics applications. In this study, we demonstrate how NASP compares with other tools in the analysis of two real bacterial genomics datasets and one simulated dataset. Our results demonstrate that NASP produces similar, and often better, results in comparison with other pipelines, but is much more flexible in terms of data input types, job management systems, diversity of supported tools and output formats. We also demonstrate differences in results based on the choice of the reference genome and choice of inferring phylogenies from concatenated SNPs or alignments including monomorphic positions. NASP represents a source-available, version-controlled, unit-tested method and can be obtained from tgennorth.github.io/NASP.

192 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: An overview of the analysis pipeline and links to raw data and processed output from the runs with and without denoising are provided.
Abstract: Supplementary Figure 1 Overview of the analysis pipeline. Supplementary Table 1 Details of conventionally raised and conventionalized mouse samples. Supplementary Discussion Expanded discussion of QIIME analyses presented in the main text; Sequencing of 16S rRNA gene amplicons; QIIME analysis notes; Expanded Figure 1 legend; Links to raw data and processed output from the runs with and without denoising.

28,911 citations

28 Jul 2005
TL;DR: PfPMP1)与感染红细胞、树突状组胞以及胎盘的单个或多个受体作用,在黏附及免疫逃避中起关键的作�ly.
Abstract: 抗原变异可使得多种致病微生物易于逃避宿主免疫应答。表达在感染红细胞表面的恶性疟原虫红细胞表面蛋白1(PfPMP1)与感染红细胞、内皮细胞、树突状细胞以及胎盘的单个或多个受体作用,在黏附及免疫逃避中起关键的作用。每个单倍体基因组var基因家族编码约60种成员,通过启动转录不同的var基因变异体为抗原变异提供了分子基础。

18,940 citations

Journal ArticleDOI
TL;DR: The extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.
Abstract: SILVA (from Latin silva, forest, http://www.arb-silva.de) is a comprehensive web resource for up to date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The referred database release 111 (July 2012) contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.

18,256 citations

Journal ArticleDOI
TL;DR: The open-source software package DADA2 for modeling and correcting Illumina-sequenced amplicon errors is presented, revealing a diversity of previously undetected Lactobacillus crispatus variants.
Abstract: We present the open-source software package DADA2 for modeling and correcting Illumina-sequenced amplicon errors (https://github.com/benjjneb/dada2). DADA2 infers sample sequences exactly and resolves differences of as little as 1 nucleotide. In several mock communities, DADA2 identified more real variants and output fewer spurious sequences than other methods. We applied DADA2 to vaginal samples from a cohort of pregnant women, revealing a diversity of previously undetected Lactobacillus crispatus variants.

14,505 citations

Journal Article
Fumio Tajima1
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations