Institution

J. Craig Venter Institute

Nonprofit, La Jolla, California, United States
About: J. Craig Venter Institute is a nonprofit organization based in La Jolla, California, United States. It is known for research contributions on the topics Genome and Gene. The organization has 1268 authors who have published 2300 publications receiving 304083 citations. The organization is also known as JCVI and The Institute for Genomic Research.
Topics: Genome, Gene, Genomics, Population, Microbiome


Papers
Journal Article
21 Sep 2007-Science
TL;DR: It is shown that some of these inserted Wolbachia genes are transcribed within eukaryotic cells lacking endosymbionts, potentially providing a mechanism for acquisition of new genes and functions.
Abstract: Although common among bacteria, lateral gene transfer (the movement of genes between distantly related organisms) is thought to occur only rarely between bacteria and multicellular eukaryotes. However, the presence of endosymbionts, such as Wolbachia pipientis, within some eukaryotic germlines may facilitate bacterial gene transfers to eukaryotic host genomes. We therefore examined host genomes for evidence of gene transfer events from Wolbachia bacteria to their hosts. We found and confirmed transfers into the genomes of four insect and four nematode species that range from nearly the entire Wolbachia genome (>1 megabase) to short (<500 base pairs) insertions. Potential Wolbachia-to-host transfers were also detected computationally in three additional sequenced insect genomes. We also show that some of these inserted Wolbachia genes are transcribed within eukaryotic cells lacking endosymbionts. Therefore, heritable lateral gene transfer occurs into eukaryotic hosts from their prokaryote symbionts, potentially providing a mechanism for acquisition of new genes and functions.

772 citations

01 Jun 2012
TL;DR: This work determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads, enabling the determination of community roles in the HMP cohort and in future metagenomic studies.
Abstract: Microbial communities carry out the majority of the biochemical activity on the planet, and they play integral roles in processes including metabolism and immune homeostasis in the human microbiome. Shotgun sequencing of such communities' metagenomes provides information complementary to organismal abundances from taxonomic markers, but the resulting data typically comprise short reads from hundreds of different organisms and are at best challenging to assemble comparably to single-organism genomes. Here, we describe an alternative approach to infer the functional and metabolic potential of a microbial community metagenome. We determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads. We validated this methodology using a collection of synthetic metagenomes, recovering the presence and abundance both of large pathways and of small functional modules with high accuracy. We subsequently applied this method, HUMAnN, to the microbial communities of 649 metagenomes drawn from seven primary body sites on 102 individuals as part of the Human Microbiome Project (HMP). This provided a means to compare functional diversity and organismal ecology in the human microbiome, and we determined a core of 24 ubiquitously present modules. Core pathways were often implemented by different enzyme families within different body sites, and 168 functional modules and 196 metabolic pathways varied in metagenomic abundance specifically to one or more niches within the microbiome. These included glycosaminoglycan degradation in the gut, as well as phosphate and amino acid transport linked to host phenotype (vaginal pH) in the posterior fornix. An implementation of our methodology is available at http://huttenhower.sph.harvard.edu/humann. 
This provides a means to accurately and efficiently characterize microbial metabolic pathways and functional modules directly from high-throughput sequencing reads, enabling the determination of community roles in the HMP cohort and in future metagenomic studies.

769 citations

Journal Article
TL;DR: This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further the understanding of the biological processes of this plant model but also of other species.
Abstract: The flowering plant Arabidopsis thaliana is a dicot model organism for research in many aspects of plant biology. A comprehensive annotation of its genome paves the way for understanding the functions and activities of all types of transcripts, including mRNA, the various classes of non-coding RNA, and small RNA. The TAIR10 annotation update had a profound impact on Arabidopsis research but was released more than 5 years ago. Maintaining the accuracy of the annotation continues to be a prerequisite for future progress. Using an integrative annotation pipeline, we assembled tissue-specific RNA-Seq libraries from 113 datasets and constructed 48 359 transcript models of protein-coding genes in eleven tissues. In addition, we annotated various classes of non-coding RNA including microRNA, long intergenic RNA, small nucleolar RNA, natural antisense transcript, small nuclear RNA, and small RNA using published datasets and in-house analytic results. Altogether, we identified 635 novel protein-coding genes, 508 novel transcribed regions, 5178 non-coding RNAs, and 35 846 small RNA loci that were formerly unannotated. Analysis of the splicing events and RNA-Seq based expression profiles revealed the landscapes of gene structures, untranslated regions, and splicing activities to be more intricate than previously appreciated. Furthermore, we present 692 uniformly expressed housekeeping genes, 43% of whose human orthologs are also housekeeping genes. This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further our understanding of the biological processes of this plant model but also of other species.

769 citations

Journal Article
TL;DR: The EcoCyc database contains carefully curated information that can be used as training sets for bioinformatics prediction of entities such as promoters, operons, genetic networks, transcription factor binding sites, metabolic pathways, functionally related genes, protein complexes and protein–ligand interactions.
Abstract: The EcoCyc database (http://EcoCyc.org/) is a comprehensive source of information on the biology of the prototypical model organism Escherichia coli K12. The mission for EcoCyc is to contain both computable descriptions of, and detailed comments describing, all genes, proteins, pathways and molecular interactions in E. coli. Through ongoing manual curation, extensive information such as summary comments, regulatory information, literature citations and evidence types has been extracted from 8862 publications and added to Version 8.5 of the EcoCyc database. The EcoCyc database can be accessed through a World Wide Web interface, while the downloadable Pathway Tools software and data files enable computational exploration of the data and provide enhanced querying capabilities that web interfaces cannot support. For example, EcoCyc contains carefully curated information that can be used as training sets for bioinformatics prediction of entities such as promoters, operons, genetic networks, transcription factor binding sites, metabolic pathways, functionally related genes, protein complexes and protein-ligand interactions.

768 citations

Journal Article
15 Jul 2005-Science
TL;DR: The analysis reveals a conserved core proteome of about 6200 genes in large syntenic polycistronic gene clusters, and no evidence that these species are descended from an ancestor that contained a photosynthetic endosymbiont.
Abstract: A comparison of gene content and genome architecture of Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major, three related pathogens with different life cycles and disease pathology, revealed a conserved core proteome of about 6200 genes in large syntenic polycistronic gene clusters. Many species-specific genes, especially large surface antigen families, occur at nonsyntenic chromosome-internal and subtelomeric regions. Retroelements, structural RNAs, and gene family expansion are often associated with syntenic discontinuities that (along with gene divergence, acquisition and loss, and rearrangement within the syntenic regions) have shaped the genomes of each parasite. Contrary to recent reports, our analyses reveal no evidence that these species are descended from an ancestor that contained a photosynthetic endosymbiont.

761 citations


Authors


Name                    H-index  Papers  Citations
John R. Yates           177      1036    129029
Anders M. Dale          156      823     133891
Ronald W. Davis         155      644     151276
Steven L. Salzberg      147      407     231756
Mark Raymond Adams      147      1187    135038
Nicholas J. Schork      125      587     62131
William R. Jacobs       118      490     48638
Ian T. Paulsen          112      354     69460
Michael B. Brenner      111      393     44771
Kenneth H. Nealson      108      483     51100
Claire M. Fraser        108      352     76292
Stephen L. Hoffman      104      458     38597
Michael J. Brownstein   102      274     47929
Amalio Telenti          102      421     40509
John Quackenbush        99       427     67029
Network Information
Related Institutions (5)
Wellcome Trust Sanger Institute
9.6K papers, 1.2M citations

94% related

Broad Institute
11.6K papers, 1.5M citations

92% related

Cold Spring Harbor Laboratory
6.6K papers, 1M citations

92% related

Pasteur Institute
50.3K papers, 2.5M citations

92% related

Howard Hughes Medical Institute
34.6K papers, 5.2M citations

92% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  3
2022  11
2021  116
2020  141
2019  154
2018  157