Institution
J. Craig Venter Institute
Nonprofit • La Jolla, California, United States
About: J. Craig Venter Institute is a nonprofit organization based in La Jolla, California, United States. It is known for research contributions on the topics Genome and Gene. The organization has 1268 authors who have published 2300 publications that have received 304083 citations. The organization is also known as JCVI and The Institute for Genomic Research.
Topics: Genome, Gene, Genomics, Population, Microbiome
Papers
TL;DR: It is shown that some of these inserted Wolbachia genes are transcribed within eukaryotic cells lacking endosymbionts, potentially providing a mechanism for acquisition of new genes and functions.
Abstract: Although common among bacteria, lateral gene transfer (the movement of genes between distantly related organisms) is thought to occur only rarely between bacteria and multicellular eukaryotes. However, the presence of endosymbionts, such as Wolbachia pipientis, within some eukaryotic germlines may facilitate bacterial gene transfers to eukaryotic host genomes. We therefore examined host genomes for evidence of gene transfer events from Wolbachia bacteria to their hosts. We found and confirmed transfers into the genomes of four insect and four nematode species that range from nearly the entire Wolbachia genome (>1 megabase) to short (<500 base pairs) insertions. Potential Wolbachia-to-host transfers were also detected computationally in three additional sequenced insect genomes. We also show that some of these inserted Wolbachia genes are transcribed within eukaryotic cells lacking endosymbionts. Therefore, heritable lateral gene transfer occurs into eukaryotic hosts from their prokaryote symbionts, potentially providing a mechanism for acquisition of new genes and functions.
772 citations
01 Jun 2012
TL;DR: This work determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads, enabling the determination of community roles in the HMP cohort and in future metagenomic studies.
Abstract: Microbial communities carry out the majority of the biochemical activity on the planet, and they play integral roles in processes including metabolism and immune homeostasis in the human microbiome. Shotgun sequencing of such communities' metagenomes provides information complementary to organismal abundances from taxonomic markers, but the resulting data typically comprise short reads from hundreds of different organisms and are at best challenging to assemble comparably to single-organism genomes. Here, we describe an alternative approach to infer the functional and metabolic potential of a microbial community metagenome. We determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads. We validated this methodology using a collection of synthetic metagenomes, recovering the presence and abundance both of large pathways and of small functional modules with high accuracy. We subsequently applied this method, HUMAnN, to the microbial communities of 649 metagenomes drawn from seven primary body sites on 102 individuals as part of the Human Microbiome Project (HMP). This provided a means to compare functional diversity and organismal ecology in the human microbiome, and we determined a core of 24 ubiquitously present modules. Core pathways were often implemented by different enzyme families within different body sites, and 168 functional modules and 196 metabolic pathways varied in metagenomic abundance specifically to one or more niches within the microbiome. These included glycosaminoglycan degradation in the gut, as well as phosphate and amino acid transport linked to host phenotype (vaginal pH) in the posterior fornix. An implementation of our methodology is available at http://huttenhower.sph.harvard.edu/humann. This provides a means to accurately and efficiently characterize microbial metabolic pathways and functional modules directly from high-throughput sequencing reads, enabling the determination of community roles in the HMP cohort and in future metagenomic studies.
769 citations
TL;DR: This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further the understanding of the biological processes of this plant model but also of other species.
Abstract: The flowering plant Arabidopsis thaliana is a dicot model organism for research in many aspects of plant biology. A comprehensive annotation of its genome paves the way for understanding the functions and activities of all types of transcripts, including mRNA, the various classes of non-coding RNA, and small RNA. The TAIR10 annotation update had a profound impact on Arabidopsis research but was released more than 5 years ago. Maintaining the accuracy of the annotation continues to be a prerequisite for future progress. Using an integrative annotation pipeline, we assembled tissue-specific RNA-Seq libraries from 113 datasets and constructed 48 359 transcript models of protein-coding genes in eleven tissues. In addition, we annotated various classes of non-coding RNA including microRNA, long intergenic RNA, small nucleolar RNA, natural antisense transcript, small nuclear RNA, and small RNA using published datasets and in-house analytic results. Altogether, we identified 635 novel protein-coding genes, 508 novel transcribed regions, 5178 non-coding RNAs, and 35 846 small RNA loci that were formerly unannotated. Analysis of the splicing events and RNA-Seq based expression profiles revealed the landscapes of gene structures, untranslated regions, and splicing activities to be more intricate than previously appreciated. Furthermore, we present 692 uniformly expressed housekeeping genes, 43% of whose human orthologs are also housekeeping genes. This updated Arabidopsis genome annotation with a substantially increased resolution of gene models will not only further our understanding of the biological processes of this plant model but also of other species.
769 citations
TL;DR: The EcoCyc database contains carefully curated information that can be used as training sets for bioinformatics prediction of entities such as promoters, operons, genetic networks, transcription factor binding sites, metabolic pathways, functionally related genes, protein complexes and protein–ligand interactions.
Abstract: The EcoCyc database (http://EcoCyc.org/) is a comprehensive source of information on the biology of the prototypical model organism Escherichia coli K12. The mission for EcoCyc is to contain both computable descriptions of, and detailed comments describing, all genes, proteins, pathways and molecular interactions in E. coli. Through ongoing manual curation, extensive information such as summary comments, regulatory information, literature citations and evidence types has been extracted from 8862 publications and added to Version 8.5 of the EcoCyc database. The EcoCyc database can be accessed through a World Wide Web interface, while the downloadable Pathway Tools software and data files enable computational exploration of the data and provide enhanced querying capabilities that web interfaces cannot support. For example, EcoCyc contains carefully curated information that can be used as training sets for bioinformatics prediction of entities such as promoters, operons, genetic networks, transcription factor binding sites, metabolic pathways, functionally related genes, protein complexes and protein-ligand interactions.
768 citations
George Washington University, University of Washington, Seattle Biomed, J. Craig Venter Institute, Wellcome Trust Sanger Institute, Karolinska Institutet, Newcastle University, Centre national de la recherche scientifique, Universidade Federal de Minas Gerais, Medical Research Council, University of Cambridge, University of Iowa
TL;DR: A comparison of Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major reveals a conserved core proteome of about 6200 genes in large syntenic polycistronic gene clusters, and no evidence that these species are descended from an ancestor that contained a photosynthetic endosymbiont.
Abstract: A comparison of gene content and genome architecture of Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major, three related pathogens with different life cycles and disease pathology, revealed a conserved core proteome of about 6200 genes in large syntenic polycistronic gene clusters. Many species-specific genes, especially large surface antigen families, occur at nonsyntenic chromosome-internal and subtelomeric regions. Retroelements, structural RNAs, and gene family expansion are often associated with syntenic discontinuities that (along with gene divergence, acquisition and loss, and rearrangement within the syntenic regions) have shaped the genomes of each parasite. Contrary to recent reports, our analyses reveal no evidence that these species are descended from an ancestor that contained a photosynthetic endosymbiont.
761 citations
Authors
Showing all 1274 results
Name | H-index | Papers | Citations |
---|---|---|---|
John R. Yates | 177 | 1036 | 129029 |
Anders M. Dale | 156 | 823 | 133891 |
Ronald W. Davis | 155 | 644 | 151276 |
Steven L. Salzberg | 147 | 407 | 231756 |
Mark Raymond Adams | 147 | 1187 | 135038 |
Nicholas J. Schork | 125 | 587 | 62131 |
William R. Jacobs | 118 | 490 | 48638 |
Ian T. Paulsen | 112 | 354 | 69460 |
Michael B. Brenner | 111 | 393 | 44771 |
Kenneth H. Nealson | 108 | 483 | 51100 |
Claire M. Fraser | 108 | 352 | 76292 |
Stephen L. Hoffman | 104 | 458 | 38597 |
Michael J. Brownstein | 102 | 274 | 47929 |
Amalio Telenti | 102 | 421 | 40509 |
John Quackenbush | 99 | 427 | 67029 |