Institution

J. Craig Venter Institute

Nonprofit · La Jolla, California, United States
About: J. Craig Venter Institute is a nonprofit organization based in La Jolla, California, United States. It is known for research contributions in the topics Genome and Gene. The organization has 1268 authors who have published 2300 publications receiving 304083 citations. The organization is also known as JCVI and The Institute for Genomic Research.
Topics: Genome, Gene, Genomics, Population, Microbiome


Papers
Journal Article
TL;DR: Population-level genome analysis suggests that gene transfer, including recombination, also contributes to the evolutionary dynamics of Acinetobacter baumannii (Ab). The study also provides insight into Ab transmission dynamics and into the identification of patients with repeat infections.
Abstract: Limited treatment options are available for patients infected with multidrug (MDR)- or pan-drug (PDR)-resistant bacterial pathogens, resulting in infections that can persist for weeks or months. In order to better understand transmission and evolutionary dynamics of MDR Acinetobacter baumannii (Ab) during long-term infection, we analyzed genomes from a series of isolates from individual patients at isolate-specific, patient-specific, and population levels. Whole genome analysis of longitudinal isolates (range 2-10 isolates per patient spanning 0-829 days) from 40 patients included detection of single-nucleotide variants (SNVs), insertion sequence (IS) mapping, and gene content changes. Phylogenetic analysis revealed that a significant fraction of apparently persistent infections are in fact due to re-infection with new strains. SNVs primarily resulted in protein coding changes, and IS events primarily interrupted genes or were in an orientation such that the adjacent gene would be over-expressed. Mutations acquired during infection were over-represented in transcriptional regulators, notably pmrAB and adeRS, which can mediate resistance to the last line therapies colistin and tigecycline, respectively, as well as transporters, surface structures, and iron acquisition genes. Most SNVs and IS events were isolate-specific indicating these mutations did not become fixed on the time scale investigated, yet over-representation of independent mutations in some genes or functional categories suggests that they are under selective pressure. Genome analysis at the population-level suggests that gene transfer including recombination also contributes to Ab evolutionary dynamics. These findings provide important insight into the transmission dynamics of Ab and the identification of patients with repeat infections has implications for infection control programs targeted to this pathogen.

65 citations

Patent
27 Mar 2002
TL;DR: The invention provides proteins and nucleic acid sequences from Streptococcus pneumoniae, together with a genome sequence, which are useful for the development of vaccines, diagnostics, and antibiotics.
Abstract: The invention provides proteins and nucleic acid sequences from Streptococcus pneumoniae, together with a genome sequence. These are useful for the development of vaccines, diagnostics, and antibiotics.

65 citations

Journal Article
TL;DR: This study focuses on the recovery of a nearly complete genome representing a novel strain of the periodontal pathogen Porphyromonas gingivalis using the single-cell assembly tool SPAdes and shows for the first time that it enables comparative genomic analysis of strain variation in a pathogen captured from complex biofilm samples in a healthcare facility.
Abstract: Ongoing efforts to understand the genomic diversity of microbes in nature and in human health are hampered by the limited availability of cultivated organisms and their genomes (The Human Microbiome Jumpstart Reference Strains Consortium 2010). Only 1%–10% of known bacterial species (Rappe and Giovannoni 2003) are thought to be currently cultivated, although great progress is being made for some bacterial communities; for example, about half of bacterial species within the human oral cavity have been cultivated (Dewhirst et al. 2010). The recent advancements in DNA sequencing of single bacterial cells (Raghunathan et al. 2005) have accelerated the discovery of uncultivated microbes (Lasken 2012), providing genomic assemblies for species previously known only from 16S rRNA clone libraries and metagenomic data (Marcy et al. 2007; Podar et al. 2007; Binga et al. 2008; Eloe et al. 2011; Youssef et al. 2011; Dupont et al. 2012). This newly developed methodology provides a culture-independent approach to capture the genomes of uncultivated organisms, which can then be integrated into many intensive genomics-based studies. A high-throughput strategy was recently established to sequence and assemble single-cell genomes of bacteria (Chitsaz et al. 2011) and viruses (Allen et al. 2011), including novel uncultivated bacteria from environmental samples (Chitsaz et al. 2011; Eloe et al. 2011; Dupont et al. 2012). The workflow consists of (1) delivery of single bacterial cells into 384-well microtiter wells by fluorescence activated cell sorting (FACS); (2) use of a robotic platform to perform 384-well automated cell lysis and amplification of DNA by the multiple displacement amplification (MDA) method (Dean et al. 2001, 2002; Hosono et al. 
2003) to create libraries of genomic DNA derived from single cells; (3) PCR and cycle sequencing of 16S rRNA genes to profile the taxonomy and diversity of the libraries; (4) selection of candidate amplified genomes for whole-genome sequencing; and (5) sequencing and assembly of selected genomes using assembly tools designed specifically for MDA-amplified single cells (Chitsaz et al. 2011; Bankevich et al. 2012). A highly integrated robotic platform, described in this study for the first time, was used to increase the throughput, ease, and overall cost of processing single cells. Here we have focused this approach on the indoor environment. Despite the fact that a typical person spends ∼90% of their time indoors (Klepeis et al. 2001), there is little known about the microbial diversity of this environment. Of particular interest is the prevalence of species affecting human health, including both opportunistic and primary pathogens. Recent studies of indoor environments using culture-independent molecular methods indicate an unexpectedly high bacterial diversity on surfaces within daycare facilities and public bathroom facilities (Lee et al. 2007; Flores et al. 2011), where the majority of organisms in the latter environment were human associated (Flores et al. 2011). Another study shows that bacterial diversity is lower in indoor air at a healthcare facility compared with outdoor air; however, the indoor air contained a higher number of potential human pathogens as shown by 16S rRNA gene sequence analyses (Kembel et al. 2012). Biofilms in particular are thought to be reservoirs of disease-causing organisms in both outdoor and indoor environments. Several pathogens, including Escherichia coli, Vibrio cholerae (Shikuma and Hadfield 2010), and Helicobacter pylori (Percival and Thomas 2009; Linke et al. 2010), have been detected in biofilms within water distribution systems. 
In addition, the long-term persistence of Legionella pneumophila, the causative agent of Legionnaire's disease, in biofilms within natural and human-impacted freshwater environments is well known (Walker et al. 1993; Murga et al. 2001; Declerck 2010; Giao et al. 2011). Recent 16S rRNA gene molecular surveys have revealed a significant load of Mycobacterium avium in showerhead biofilms (Feazel et al. 2009), and studies on biofilms growing on shower curtains suggest that these communities also harbor potential opportunistic pathogens that can threaten immune-compromised patients (Kelley et al. 2004). In another study, the source of a deadly outbreak of a multidrug-resistant strain of Pseudomonas aeruginosa was traced to biofilms in hand hygiene sink drains, where its viable cells could be identified (Hota et al. 2009). There is great interest, therefore, to investigate biofilms as reservoirs of pathogens at higher resolution than allowed by the most commonly used detection and identification methodologies. Culture-independent surveys using the 16S rRNA gene as a marker are currently the most widely used approach; however, genetic strain differences reflecting pathogenicity are often difficult to resolve due to this gene being highly conserved among many bacterial strains. Quantitative PCR and direct culturing are focused on either a handful of predetermined pathogens or what can be readily cultivated. Metagenomic surveys are becoming common, but so far, our ability is limited to accurately predicting taxonomic affiliation at species or strain levels from highly diverse and complex data sets. Additionally, a whole-genome comparative genomic study on the evolution and transmission of a pathogen requires substantial amounts of DNA or a cultured strain, which often cannot be obtained. 
It has been demonstrated in a controlled experiment with 10 pg of extracted DNA provided as a template that MDA-amplified genotyping call and accuracy rates were only slightly lower than those for genomic DNA isolated directly from cultured cells (Giardina et al. 2009). Using single-cell genomic approaches, partial to near complete genomes should be obtainable without cultivation, from difficult samples within critical indoor environments such as healthcare facilities. In-depth analyses of these genomic data can then provide accurate and detailed information of strain-specific pathogen-gene signatures and other virulence factors. The aim of this study was to investigate for the first time the bacteria present in a healthcare facility with a high-throughput single-cell genomics approach. Based on the known prevalence of pathogens in biofilms, we focused on a sink drain biofilm from a public restroom adjoining an emergency waiting room. Sequencing 16S rRNA genes PCR-amplified from 416 single-cell MDA reactions, we found 18 candidate commensal and potentially pathogenic species that were selected for 454 shallow sequencing. Initial read mapping and de novo assembly of the low-coverage 454 sequence data confirmed that we had obtained genomic sequences for the pathogen Streptococcus pneumoniae as well as bacterial species highly similar to and those reported to be potentially pathogenic, including Sphingobacterium spiritivorum (Tronel et al. 2003; Kampfer et al. 2005), Leptotrichia buccalis (Hammann et al. 1993; Hot et al. 2008), as well as the host-associated oral bacteria, Streptococcus mitis and Veillonella parvula. Of particular note, we found three MDA products with sequences for the oral pathogen Porphyromonas gingivalis, which is a periodontal pathogen involved in periodontal bone loss that has also been linked to progression of atherosclerotic disease (Pussinen et al. 2007; Yilmaz 2008). P. 
gingivalis possesses many virulence factors, including functions that allow it to survive intracellularly and to be transmitted between different types of host cells (Li et al. 2008). Despite being detected at a very low abundance in the oral cavity, P. gingivalis can strongly disrupt the host–microbial homeostasis (Hajishengallis et al. 2011). As with many pathogens, the environmental reservoirs and mode(s) of transmission of P. gingivalis are not fully understood, yet it is a globally important pathogen with only three sequenced genomes available at the time of this report. It was recently stated by a CDC report that nearly 50% of American adults have mild, moderate, or severe periodontitis, and this percentage rises to 70% in adults greater than age 65 (Eke et al. 2012). To our knowledge, there are no previous reports detecting P. gingivalis outside of a host. Three MDA-amplified genomes with 16S rRNA gene sequences identified as P. gingivalis were chosen for additional deep sequencing on the Illumina GA IIx platform, and the resulting reads were mapped to P. gingivalis genomes. One MDA-read data set had ∼90% sequence coverage to P. gingivalis strain TDC60, which was isolated from a patient in Japan with severe periodontitis (Watanabe et al. 2011). A new single-cell de novo assembly algorithm, SPAdes (Bankevich et al. 2012), was used to generate contigs of the highest-coverage MDA product, which produced a 2.35-Mb draft genome (PG JCVI SC001). Comparative genomics and pangenome analyses were performed with the three other available P. gingivalis genomes; virulent strains W83 (Nelson et al. 2003) and TDC60 (Watanabe et al. 2011), and the less virulent strain ATCC 33277 (Naito et al. 2008). We demonstrate that single-cell genomics is a powerful approach that can produce highly accurate sequence data, enabling comparative genomic studies of pathogens obtained from a complex heterogeneous environmental sample.

65 citations

Journal Article
TL;DR: This study indicates that highly similar K. pneumoniae subpopulations coexist within the same hospitals over time and supports a division of sequence type 258 (ST258) into two distinct groups.
Abstract: Genome sequencing of carbapenem-resistant Klebsiella pneumoniae isolates from regional U.S. hospitals was used to characterize strain diversity and the bla(KPC) genetic context. A phylogeny based on core single-nucleotide variants (SNVs) supports a division of sequence type 258 (ST258) into two distinct groups. The primary differences between the groups are in the capsular polysaccharide locus (cps) and their plasmid contents. A strict association between clade and KPC variant was found. The bla(KPC) gene was found on variants of two plasmid backbones. This study indicates that highly similar K. pneumoniae subpopulations coexist within the same hospitals over time.

65 citations

Journal Article
TL;DR: It is shown that a significant portion of the duplicated genes in rice show divergent expression although a correlation between sequence divergence and correlation of expression could be seen in very young genes.
Abstract: High gene numbers in plant genomes reflect polyploidy and major gene duplication events. Oryza sativa, cultivated rice, is a diploid monocotyledonous species with a ~390 Mb genome that has undergone segmental duplication of a substantial portion of its genome. This, coupled with other genetic events such as tandem duplications, has resulted in a substantial number of its genes, and resulting proteins, occurring in paralogous families. Using a computational pipeline that utilizes Pfam and novel protein domains, we characterized paralogous families in rice and compared these with paralogous families in the model dicotyledonous diploid species, Arabidopsis thaliana. Arabidopsis, which has undergone genome duplication as well, has a substantially smaller genome (~120 Mb) and gene complement compared to rice. Overall, 53% and 68% of the non-transposable element-related rice and Arabidopsis proteins could be classified into paralogous protein families, respectively. Singleton and paralogous family genes differed substantially in their likelihood of encoding a protein of known or putative function; 26% and 66% of singleton genes compared to 73% and 96% of the paralogous family genes encode a known or putative protein in rice and Arabidopsis, respectively. Furthermore, a major skew in the distribution of specific gene function was observed; a total of 17 Gene Ontology categories in both rice and Arabidopsis were statistically significant in their differential distribution between paralogous family and singleton proteins. In contrast to mammalian organisms, we found that duplicated genes in rice and Arabidopsis tend to have more alternative splice forms. Using data from Massively Parallel Signature Sequencing, we show that a significant portion of the duplicated genes in rice show divergent expression although a correlation between sequence divergence and correlation of expression could be seen in very young genes. 
Collectively, these data suggest that while co-regulation and conserved function are present in some paralogous protein family members, evolutionary pressures have resulted in functional divergence with differential expression patterns.

65 citations


Authors


Name                     H-index  Papers  Citations
John R. Yates            177      1036    129029
Anders M. Dale           156      823     133891
Ronald W. Davis          155      644     151276
Steven L. Salzberg       147      407     231756
Mark Raymond Adams       147      1187    135038
Nicholas J. Schork       125      587     62131
William R. Jacobs        118      490     48638
Ian T. Paulsen           112      354     69460
Michael B. Brenner       111      393     44771
Kenneth H. Nealson       108      483     51100
Claire M. Fraser         108      352     76292
Stephen L. Hoffman       104      458     38597
Michael J. Brownstein    102      274     47929
Amalio Telenti           102      421     40509
John Quackenbush         99       427     67029

Network Information
Related Institutions (5)
Wellcome Trust Sanger Institute: 9.6K papers, 1.2M citations, 94% related
Broad Institute: 11.6K papers, 1.5M citations, 92% related
Cold Spring Harbor Laboratory: 6.6K papers, 1M citations, 92% related
Pasteur Institute: 50.3K papers, 2.5M citations, 92% related
Howard Hughes Medical Institute: 34.6K papers, 5.2M citations, 92% related

Performance
Metrics
No. of papers from the Institution in previous years
Year    Papers
2023    3
2022    11
2021    116
2020    141
2019    154
2018    157