Showing papers by "J. Craig Venter Institute" published in 2007

Open Access

Journal Article

A mammalian microRNA expression atlas based on small RNA library sequencing.

Pablo Landgraf¹, Mirabela Rusu², Robert L. Sheridan³, Alain Sewer²,⁴, Nicola Iovino¹, Alexei A. Aravin¹, Sébastien Pfeffer¹, Amanda J. Rice¹, Alice O. Kamphorst¹, Markus Landthaler¹, Carolina Lin¹, Nicholas D. Socci³, Leandro C. Hermida², Valerio Fulci⁵, Sabina Chiaretti⁵, Robin Foà⁵, Julia Schliwka⁶, Uta Fuchs⁷, Astrid Novosel⁷, Roman-Ulrich Müller¹,⁸, Bernhard Schermer⁸, Ute Bissels⁹, Jason M. Inman¹⁰, Quang Phan¹⁰, Minchen Chien¹¹, David B. Weir¹¹, Ruchi Choksi¹¹, Gabriella De Vita¹², Daniela Frezzetti¹², Hans Ingo Trompeter¹³, Veit Hornung⁷, Grace Teng¹, Gunther Hartmann¹⁴, Miklós Palkovits¹⁵, Roberto Di Lauro, Peter Wernet¹³, Giuseppe Macino⁵, Charles E. Rogler¹⁶, James W. Nagle¹⁷, Jingyue Ju¹¹, F. Nina Papavasiliou¹, Thomas Benzing⁸, Peter Lichter, Wayne Tam¹⁸, Michael J. Brownstein¹⁰, Andreas Bosio⁹, Arndt Borkhardt⁷, James J. Russo¹¹, Chris Sander³, Mihaela Zavolan²,⁴, Thomas Tuschl¹

Rockefeller University¹, University of Basel², Memorial Sloan Kettering Cancer Center³, Swiss Institute of Bioinformatics⁴, Sapienza University of Rome⁵, German Cancer Research Center⁶, Ludwig Maximilian University of Munich⁷, University of Freiburg⁸, Miltenyi Biotec⁹, J. Craig Venter Institute¹⁰, Columbia University¹¹, University of Naples Federico II¹², University of Düsseldorf¹³, University of Bonn¹⁴, Semmelweis University¹⁵, Yeshiva University¹⁶, National Institutes of Health¹⁷, Cornell University¹⁸

29 Jun 2007-Cell

TL;DR: A relatively small set of miRNAs, many of which are ubiquitously expressed, accounts for most of the differences in miRNA profiles between cell lineages and tissues.

3,687 citations

Journal Article

Evolution of genes and genomes on the Drosophila phylogeny.

Andrew G. Clark¹, Michael B. Eisen²,³, Douglas Smith, +426 more • Institutions (70)

08 Nov 2007-Nature

TL;DR: These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution.

Abstract: Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.

2,057 citations

Journal Article

The Sorcerer II Global Ocean Sampling Expedition: Northwest Atlantic through Eastern Tropical Pacific

Douglas B. Rusch¹, Aaron L. Halpern¹, Granger G. Sutton¹, Karla B. Heidelberg¹,², Shannon J. Williamson¹, Shibu Yooseph¹, Dongying Wu¹,³, Jonathan A. Eisen¹,³, Jeff Hoffman¹, Karin A. Remington¹, Karen Beeson¹, Bao Duc Tran¹, Hamilton O. Smith¹, Holly Baden-Tillson¹, Clare Stewart¹, Joyce Thorpe¹, Jason Freeman¹, Cynthia Andrews-Pfannkoch¹, Joseph E. Venter¹, Kelvin Li¹, Saul A. Kravitz¹, John F. Heidelberg¹,², T. Utterback¹, Yu-Hui Rogers¹, Luisa I. Falcón⁴, Valeria Souza⁴, Germán Bonilla-Rosso⁴, Luis E. Eguiarte⁴, David M. Karl⁵, Shubha Sathyendranath⁶, Trevor Platt⁶, Eldredge Bermingham⁷, Victor A. Gallardo⁸, Giselle Tamayo-Castillo⁹, Michael Ferrari¹⁰, Robert L. Strausberg¹, Kenneth H. Nealson¹,², Robert Friedman¹, Marvin Frazier¹, J. Craig Venter¹

J. Craig Venter Institute¹, University of Southern California², University of California, Davis³, National Autonomous University of Mexico⁴, University of Hawaii⁵, Bedford Institute of Oceanography⁶, Smithsonian Tropical Research Institute⁷, University of Concepción⁸, University of Costa Rica⁹, Rutgers University¹⁰

13 Mar 2007-PLOS Biology

TL;DR: A metagenomic study of the marine planktonic microbiota, analyzing surface (mostly marine) water samples collected during the Sorcerer II Global Ocean Sampling expedition, yielded an extensive dataset of 7.7 million sequencing reads.

Abstract: The world's oceans contain a complex mixture of micro-organisms that are, for the most part, uncharacterized both genetically and biochemically. We report here a metagenomic study of the marine planktonic microbiota in which surface (mostly marine) water samples were analyzed as part of the Sorcerer II Global Ocean Sampling expedition. These samples, collected across a several-thousand km transect from the North Atlantic through the Panama Canal and ending in the South Pacific, yielded an extensive dataset consisting of 7.7 million sequencing reads (6.3 billion bp). Though a few major microbial clades dominate the planktonic marine niche, the dataset contains great diversity, with 85% of the assembled sequence and 57% of the unassembled data being unique at a 98% sequence identity cutoff. Using the metadata associated with each sample and sequencing library, we developed new comparative genomic and assembly methods. One comparative genomic method, termed "fragment recruitment," addressed questions of genome structure, evolution, and taxonomic or phylogenetic diversity, as well as the biochemical diversity of genes and gene families. A second method, termed "extreme assembly," made possible the assembly and reconstruction of large segments of abundant but clearly nonclonal organisms. Within all abundant populations analyzed, we found extensive intra-ribotype diversity in several forms: (1) extensive sequence variation within orthologous regions throughout a given genome; despite coverage of individual ribotypes approaching 500-fold, most individual sequencing reads are unique; (2) numerous changes in gene content, some with direct adaptive implications; and (3) hypervariable genomic islands that are too variable to assemble. The intra-ribotype diversity is organized into genetically isolated populations that have overlapping but independent distributions, implying distinct environmental preference. We present novel methods for measuring the genomic similarity between metagenomic samples and show how they may be grouped into several community types. Specific functional adaptations can be identified both within individual ribotypes and across the entire community, including proteorhodopsin spectral tuning and the presence or absence of the phosphate-binding gene PstS.
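As a quick sanity check on the dataset size quoted in the abstract above, the mean read length implied by 7.7 million reads and 6.3 billion bp can be worked out directly. This is only a back-of-the-envelope sketch using the rounded figures from the abstract; the variable names are illustrative and the result is approximate.

```python
# Rounded figures quoted in the abstract above.
total_reads = 7.7e6   # sequencing reads
total_bases = 6.3e9   # base pairs

# Mean read length implied by the two rounded figures (approximate).
mean_read_length = total_bases / total_reads

print(f"implied mean read length: {mean_read_length:.0f} bp")
```

The result, a little over 800 bp per read, is in the range typical of Sanger-era shotgun reads.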

1,982 citations

Journal Article

The Diploid Genome Sequence of an Individual Human

Samuel Levy¹, Granger G. Sutton¹, Pauline C. Ng¹, Lars Feuk², Aaron L. Halpern¹, Brian P. Walenz¹, Nelson Axelrod¹, Jiaqi Huang¹, Ewen F. Kirkness¹, Gennady Denisov¹, Yuan Lin¹, Jeffrey R. MacDonald², Andy Wing Chun Pang², Mary Shago², Timothy B. Stockwell¹, Alexia Tsiamouri¹, Vineet Bafna³, Vikas Bansal³, Saul A. Kravitz¹, Dana A. Busam¹, Karen Beeson¹, Tina C McIntosh¹, Karin A. Remington¹, Josep F. Abril⁴, John Gill¹, Jon Borman¹, Yu-Hui Rogers¹, Marvin Frazier¹, Stephen W. Scherer², Robert L. Strausberg¹, J. Craig Venter¹

J. Craig Venter Institute¹, University of Toronto², University of California, San Diego³, University of Barcelona⁴

04 Sep 2007-PLOS Biology

TL;DR: A modified version of the Celera assembler is developed to facilitate the identification and comparison of alternate alleles within this individual diploid genome, and a novel haplotype assembly strategy is used to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome.

Abstract: Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, these events involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
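The variant counts in the abstract above can be cross-checked against the stated totals ("more than 4.1 million DNA variants", non-SNP variation at 22% of events). The sketch below simply sums the five itemized categories; it assumes that segmental duplications and copy number variation regions, for which the abstract gives no counts, are not part of the event total. The dictionary keys are illustrative labels, not identifiers from the paper.

```python
# Itemized event counts as reported in the abstract above.
variant_counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

# Sum all itemized events, then separate out everything that is not a SNP.
total_events = sum(variant_counts.values())
non_snp_events = total_events - variant_counts["SNPs"]
non_snp_share = non_snp_events / total_events

print(f"total events:   {total_events:,}")
print(f"non-SNP events: {non_snp_events:,}")
print(f"non-SNP share:  {non_snp_share:.1%}")
```

Under that assumption the itemized categories sum to 4,118,889 events, of which 905,488 are non-SNP, consistent with the "more than 4.1 million" and "22%" figures quoted in the abstract.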

1,843 citations

Journal Article

Evolutionary and biomedical insights from the rhesus macaque genome

Richard A. Gibbs¹, Jeffrey Rogers², Michael G. Katze³, Roger E. Bumgarner³, +174 more • Institutions (28)

13 Apr 2007-Science

TL;DR: The genome sequence of an Indian-origin Macaca mulatta female is determined and compared with chimpanzees and humans to reveal the structure of ancestral primate genomes and to identify evidence for positive selection and lineage-specific expansions and contractions of gene families.

Abstract: The rhesus macaque (Macaca mulatta) is an abundant primate species that diverged from the ancestors of Homo sapiens about 25 million years ago. Because they are genetically and physiologically similar to humans, rhesus monkeys are the most widely used nonhuman primate in basic and applied biomedical research. We determined the genome sequence of an Indian-origin Macaca mulatta female and compared the data with chimpanzees and humans to reveal the structure of ancestral primate genomes and to identify evidence for positive selection and lineage-specific expansions and contractions of gene families. A comparison of sequences from individual animals was used to investigate their underlying genetic diversity. The complete description of the macaque genome blueprint enhances the utility of this animal model for biomedical research and improves our understanding of the basic biology of the species.

1,297 citations

Journal Article

Genome sequence of Aedes aegypti, a major arbovirus vector

Vishvanath Nene¹, Jennifer R. Wortman¹, Daniel Lawson, Brian J. Haas¹, Chinnappa D. Kodira², Zhijian Jake Tu³, Brendan J. Loftus, Zhiyong Xi⁴, Karyn Megy, Manfred Grabherr², Qinghu Ren¹, Evgeny M. Zdobnov, Neil F. Lobo⁵, Kathryn S. Campbell⁶, Susan E. Brown⁷, Maria de Fatima Bonaldo⁸, Jingsong Zhu⁹, Steven P. Sinkins¹⁰, David G. Hogenkamp¹¹, Paolo Amedeo¹, Peter Arensburger⁹, Peter W. Atkinson⁹, Shelby L. Bidwell¹, Jim Biedler³, Ewan Birney, Robert V. Bruggner⁵, Javier Costas, Monique R. Coy³, Jonathan Crabtree¹, Matt Crawford², Becky deBruyn⁵, David DeCaprio², Karin Eiglmeier¹², Eric Eisenstadt¹, Hamza El-Dorry¹³, William M. Gelbart⁶, Suely Lopes Gomes¹³, Martin Hammond, Linda Hannick¹, James R. Hogan⁵, Michael H. Holmes¹, David M. Jaffe², J. Spencer Johnston, Ryan C. Kennedy⁵, Hean Koo¹, Saul A. Kravitz, Evgenia V. Kriventseva¹⁴, David Kulp¹⁵, Kurt LaButti², Eduardo Lee¹, Song Li³, Diane D. Lovin⁵, Chunhong Mao³, Evan Mauceli², Carlos Frederico Martins Menck¹³, Jason R. Miller¹, Philip Montgomery², Akio Mori⁵, Ana L. T. O. Nascimento¹⁶, Horacio Naveira¹⁷, Chad Nusbaum², Sinéad B. O'Leary², Joshua Orvis¹, Mihaela Pertea, Hadi Quesneville, Kyanne R. Reidenbach¹¹, Yu-Hui Rogers, Charles Roth¹², Jennifer R. Schneider⁵, Michael C. Schatz, Martin Shumway¹, Mario Stanke, Eric O. Stinson⁵, Jose M. C. Tubio, Janice P. Vanzee¹¹, Sergio Verjovski-Almeida¹³, Doreen Werner¹⁸, Owen White¹, Stefan Wyder¹⁴, Qiandong Zeng², Qi Zhao¹, Yongmei Zhao¹, Catherine A. Hill¹¹, Alexander S. Raikhel⁹, Marcelo B. Soares⁸, Dennis L. Knudson⁷, Norman H. Lee, James E. Galagan², Steven L. Salzberg, Ian T. Paulsen¹, George Dimopoulos⁴, Frank H. Collins⁵, Bruce W. Birren², Claire M. Fraser-Liggett, David W. Severson⁵

J. Craig Venter Institute¹, Broad Institute², Virginia Tech³, Johns Hopkins University⁴, University of Notre Dame⁵, Harvard University⁶, Colorado State University⁷, Northwestern University⁸, University of California, Riverside⁹, University of Oxford¹⁰, Purdue University¹¹, Pasteur Institute¹², University of São Paulo¹³, University of Geneva¹⁴, University of Massachusetts Amherst¹⁵, Instituto Butantan¹⁶, University of A Coruña¹⁷, University of Göttingen¹⁸

22 Jun 2007-Science

TL;DR: This paper presents a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at approximately 1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae.

Abstract: We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at approximately 1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of approximately 4 to 6 increase in average gene length and in sizes of intergenic regions relative to An. gambiae and Drosophila melanogaster. Nonetheless, chromosomal synteny is generally maintained among all three insects, although conservation of orthologous gene order is higher (by a factor of approximately 2) between the mosquito species than between either of them and the fruit fly. An increase in genes encoding odorant binding, cytochrome P450, and cuticle domains relative to An. gambiae suggests that members of these protein families underpin some of the biological differences between the two mosquito species.

1,107 citations

Journal Article

The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families.

Shibu Yooseph¹, Granger G. Sutton¹, Douglas B. Rusch¹, Aaron L. Halpern¹, Shannon J. Williamson¹, Karin A. Remington¹, Jonathan A. Eisen¹,², Karla B. Heidelberg¹, Gerard Manning³, Weizhong Li⁴, Lukasz Jaroszewski⁴, Piotr Cieplak⁴, Christopher S. Miller⁵, Huiying Li⁵, Susan T. Mashiyama⁶, Marcin P. Joachimiak⁶, Christopher van Belle⁶, John-Marc Chandonia⁶,⁷, David A W Soergel⁶, Yufeng Zhai³, Kannan Natarajan⁸, Shaun W. Lee⁸, Benjamin J. Raphael⁹, Vineet Bafna⁸, Robert Friedman¹, Steven E. Brenner⁶, Adam Godzik⁴, David Eisenberg⁵, Jack E. Dixon⁸, Susan S. Taylor⁸, Robert L. Strausberg¹, Marvin Frazier¹, J. Craig Venter¹

J. Craig Venter Institute¹, University of California, Davis², Salk Institute for Biological Studies³, Sanford-Burnham Institute for Medical Research⁴, University of California, Los Angeles⁵, University of California, Berkeley⁶, Lawrence Berkeley National Laboratory⁷, University of California, San Diego⁸, Brown University⁹

01 Mar 2007-PLOS Biology

TL;DR: This work used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling sequences to add a great deal of diversity to known protein families and shed light on their evolution.

Abstract: Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.

871 citations

Journal Article

Widespread Lateral Gene Transfer from Intracellular Bacteria to Multicellular Eukaryotes

Julie C. Dunning Hotopp¹, Michael E. Clark², Deodoro C. S. G. Oliveira², Jeremy M. Foster³, Peter Fischer⁴, Monica C. Muñoz Torres⁵, Jonathan D. Giebel², Nikhil Kumar¹, Nadeeza Ishmael¹, Shiliang Wang¹, Jessica Ingram³, Rahul V. Nene¹, Jessica Shepard¹, Jeffrey P. Tomkins⁵, Stephen Richards⁶, David J. Spiro¹, Elodie Ghedin¹,⁷, Barton E. Slatko³, Hervé Tettelin¹, John H. Werren²

J. Craig Venter Institute¹, University of Rochester², New England Biolabs³, Washington University in St. Louis⁴, Clemson University⁵, Baylor College of Medicine⁶, University of Pittsburgh⁷

21 Sep 2007-Science

TL;DR: It is shown that some of these inserted Wolbachia genes are transcribed within eukaryotic cells lacking endosymbionts, potentially providing a mechanism for acquisition of new genes and functions.

Abstract: Although common among bacteria, lateral gene transfer (the movement of genes between distantly related organisms) is thought to occur only rarely between bacteria and multicellular eukaryotes. However, the presence of endosymbionts, such as Wolbachia pipientis, within some eukaryotic germlines may facilitate bacterial gene transfers to eukaryotic host genomes. We therefore examined host genomes for evidence of gene transfer events from Wolbachia bacteria to their hosts. We found and confirmed transfers into the genomes of four insect and four nematode species that range from nearly the entire Wolbachia genome (>1 megabase) to short (<500 base pairs) insertions. Potential Wolbachia-to-host transfers were also detected computationally in three additional sequenced insect genomes. We also show that some of these inserted Wolbachia genes are transcribed within eukaryotic cells lacking endosymbionts. Therefore, heritable lateral gene transfer occurs into eukaryotic hosts from their prokaryote symbionts, potentially providing a mechanism for acquisition of new genes and functions.

772 citations

Journal Article

Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis

Jane M. Carlton, Robert P. Hirt¹, Joana C. Silva², Arthur L. Delcher³, Michael C. Schatz³, Qi Zhao¹, Jennifer R. Wortman², Shelby L. Bidwell², U. Cecilia M. Alsmark¹, Sébastien Besteiro⁴, Thomas Sicheritz-Pontén⁵, Christophe Noël¹, Joel B. Dacks⁶, Peter G. Foster⁷, Cedric Simillion⁸, Yves Van de Peer⁸, Diego Miranda-Saavedra⁹, Geoffrey J. Barton⁹, Gareth D. Westrop⁴, Sylke Müller⁴, Daniele Dessì¹⁰, Pier Luigi Fiori¹⁰, Qinghu Ren², Ian T. Paulsen², Hanbang Zhang², Felix D. Bastida-Corcuera¹¹, Augusto Simoes-Barbosa¹¹, Mark T. Brown¹¹, Richard D. Hayes¹¹, Mandira Mukherjee¹¹, Cheryl Y. M. Okumura¹¹, Rachel E. Schneider¹¹, Alias J. Smith¹¹, Stepanka Vanacova¹¹, Maria Villalvazo¹¹, Brian J. Haas², Mihaela Pertea³, Tamara Feldblyum², T. Utterback, Chung-Li Shu¹², Kazutoyo Osoegawa¹², Pieter J. de Jong¹², Ivan Hrdy¹³, Lenka Horváthová¹³, Zuzana Zubáčová¹³, Pavel Dolezal¹³, Shehre-Banoo Malik¹⁴, John M. Logsdon¹⁴, Katrin Henze¹⁵, Arti Gupta¹⁶, Ching C. Wang¹⁶, R. L. Dunne¹⁷, Jacqueline A. Upcroft¹⁸, Peter Upcroft¹⁸, Owen White², Steven L. Salzberg³, Petrus Tang¹⁹, Cheng-Hsun Chiu¹⁹, Ying-Shiung Lee¹⁹, T. Martin Embley¹, Graham H. Coombs²⁰, Jeremy C. Mottram⁴, Jan Tachezy¹³, Claire M. Fraser-Liggett², Patricia J. Johnson¹¹ • Institutions (20)

12 Jan 2007-Science

TL;DR: The genome sequence of the protist Trichomonas vaginalis predicts previously unknown functions for the hydrogenosome, which support a common evolutionary origin of this unusual organelle with mitochondria.

Abstract: We describe the genome sequence of the protist Trichomonas vaginalis, a sexually transmitted human pathogen. Repeats and transposable elements comprise about two-thirds of the ∼160-megabase genome, reflecting a recent massive expansion of genetic material. This expansion, in conjunction with the shaping of metabolic pathways that likely transpired through lateral gene transfer from bacteria, and amplification of specific gene families implicated in pathogenesis and phagocytosis of host proteins, may exemplify adaptations of the parasite during its transition to a urogenital environment. The genome sequence predicts previously unknown functions for the hydrogenosome, which support a common evolutionary origin of this unusual organelle with mitochondria.

751 citations

Journal Article

The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation

Brian Palenik¹, Jane Grimwood², Andrea Aerts³, Pierre Rouzé⁴, Asaf Salamov³, Nicholas H. Putnam³, Christopher L. Dupont¹, Richard A. Jorgensen⁵, Evelyne Derelle, Stephane Rombauts⁴, Kemin Zhou³, Robert Otillar³, Sabeeha S. Merchant⁶, Sheila Podell¹, Terry Gaasterland¹, Carolyn A. Napoli⁵, Karla C Gendler⁵, Andrea L Manuell⁷, Vera Tai¹, Olivier Vallon, Gwenael Piganeau, Séverine Jancek, Marc Heijde⁸, Kamel Jabbari⁸, Chris Bowler⁸, Martin Lohr⁹, Steven Robbens⁴, Gregory Werner³, Inna Dubchak³, Gregory J. Pazour¹⁰, Qinghu Ren¹¹, Ian T. Paulsen¹¹, Charles F. Delwiche¹², Jeremy Schmutz², Daniel S. Rokhsar³, Yves Van de Peer⁴, Hervé Moreau, Igor V. Grigoriev³

University of California, San Diego¹, Stanford University², United States Department of Energy³, Ghent University⁴, University of Arizona⁵, University of California, Los Angeles⁶, Scripps Research Institute⁷, Centre national de la recherche scientifique⁸, University of Mainz⁹, University of Massachusetts Medical School¹⁰, J. Craig Venter Institute¹¹, University of Maryland, College Park¹²

01 May 2007-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: It is speculated that horizontal gene transfer may be involved in altering the cell-surface characteristics of each species, and that selenoenzymes, novel fusion proteins, and loss of some major protein families, including ones associated with chromatin, are likely important adaptations for achieving a small cell size.

Abstract: The smallest known eukaryotes, at ≈1-μm diameter, are Ostreococcus tauri and related species of marine phytoplankton. The genome of Ostreococcus lucimarinus has been completed and compared with that of O. tauri. This comparison reveals surprising differences across orthologous chromosomes in the two species, from highly syntenic chromosomes in most cases to chromosomes with almost no similarity. Species divergence in these phytoplankton is occurring through multiple mechanisms acting differently on different chromosomes and likely including acquisition of new genes through horizontal gene transfer. We speculate that this latter process may be involved in altering the cell-surface characteristics of each species. In addition, the genome of O. lucimarinus provides insights into the unique metal metabolism of these organisms, which are predicted to have a large number of selenocysteine-containing proteins. Selenoenzymes are more catalytically active than similar enzymes lacking selenium, and thus the cell may require less of that protein. As reported here, selenoenzymes, novel fusion proteins, and loss of some major protein families, including ones associated with chromatin, are likely important adaptations for achieving a small cell size.

612 citations

Journal Article

Draft Genome of the Filarial Nematode Parasite Brugia malayi

Elodie Ghedin¹,², Shiliang Wang¹, David J. Spiro¹, Elisabet Caler¹, Qi Zhao¹, Jonathan Crabtree¹, Jonathan E. Allen¹, Arthur L. Delcher¹, David B. Guiliano³, Diego Miranda-Saavedra⁴, Samuel V. Angiuoli¹, Todd Creasy¹, Paolo Amedeo¹, Brian J. Haas¹, Najib M. El-Sayed¹, Jennifer R. Wortman¹, Tamara Feldblyum¹, Luke J. Tallon¹, Michael C. Schatz¹, Martin Shumway¹, Hean Koo¹, Steven L. Salzberg¹, Seth Schobel¹, Mihaela Pertea¹, Mihai Pop¹, Owen White¹, Geoffrey J. Barton⁴, Clotilde K. S. Carlow⁵, Michael J. Crawford, Jennifer Daub⁶, Matt W. Dimmic, Chris F. Estes⁷, Jeremy M. Foster⁵, Mehul B. Ganatra⁵, William F. Gregory⁶, Nicholas M. Johnson⁸, Jinming Jin⁵, Richard Komuniecki⁹, Ian F Korf¹⁰, Sanjay Kumar⁵, Sandra J. Laney¹¹, Ben-Wen Li¹², Wen Li¹¹, Tim H. Lindblom⁷, Sara Lustigman¹³, Dong Ma⁵, Claude V. Maina⁵, David M. A. Martin⁴, James P. McCarter¹², Larry A. McReynolds⁵, Makedonka Mitreva¹², Thomas B. Nutman¹⁴, John Parkinson, José M. Peregrín-Alvarez², Catherine B. Poole⁵, Qinghu Ren¹, Lori Saunders¹¹, Ann E. Sluder, Katherine A. Smith⁹, Mario Stanke¹⁵, Thomas R. Unnasch¹⁶, Jenna Ware⁵, Aguan Wei¹², Gary J. Weil¹², Deryck J. Williams⁶, Yinhua Zhang⁵, Steven A. Williams¹¹, Claire M. Fraser-Liggett¹, Barton E. Slatko⁵, Mark Blaxter⁶, Alan L. Scott¹⁷

J. Craig Venter Institute¹, University of Pittsburgh², Imperial College London³, University of Dundee⁴, New England Biolabs⁵, University of Edinburgh⁶, Lyon College⁷, Australian National University⁸, University of Toledo⁹, University of California, Davis¹⁰, Smith College¹¹, Washington University in St. Louis¹², New York Blood Center¹³, National Institutes of Health¹⁴, University of Göttingen¹⁵, University of Alabama at Birmingham¹⁶, Johns Hopkins University¹⁷

21 Sep 2007-Science

TL;DR: In this article, the authors sequenced the ∼90 megabase (Mb) genome of the human filarial parasite Brugia malayi and predicted ∼11,500 protein coding genes in 71 Mb of robustly assembled sequence.

Abstract: Parasitic nematodes that cause elephantiasis and river blindness threaten hundreds of millions of people in the developing world. We have sequenced the ∼90 megabase (Mb) genome of the human filarial parasite Brugia malayi and predict ∼11,500 protein coding genes in 71 Mb of robustly assembled sequence. Comparative analysis with the free-living, model nematode Caenorhabditis elegans revealed that, despite these genes having maintained little conservation of local synteny during ∼350 million years of evolution, they largely remain in linkage on chromosomal units. More than 100 conserved operons were identified. Analysis of the predicted proteome provides evidence for adaptations of B. malayi to niches in its human and vector hosts and insights into the molecular basis of a mutualistic relationship with its Wolbachia endosymbiont. These findings offer a foundation for rational drug design.

Journal Article

Assembling millions of short DNA sequences using SSAKE

René L. Warren, Granger G. Sutton¹, Steven J.M. Jones, Robert A. Holt

J. Craig Venter Institute¹

01 Feb 2007-Bioinformatics

TL;DR: SSAKE is a tool for aggressively assembling millions of short nucleotide sequences by progressively searching through a prefix tree for the longest possible overlap between any two sequences. It is designed to help leverage the information from short sequence reads by stringently assembling them into contiguous sequences that can be used to characterize novel sequencing targets.

Abstract: Summary: Novel DNA sequencing technologies with the potential for up to three orders of magnitude more sequence throughput than conventional Sanger sequencing are emerging. The instrument now available from Solexa Ltd produces millions of short DNA sequences of 25 nt each. Due to ubiquitous repeats in large genomes and the inability of short sequences to uniquely and unambiguously characterize them, the short read length limits applicability for de novo sequencing. However, given the sequencing depth and the throughput of this instrument, stringent assembly of highly identical sequences can be achieved. We describe SSAKE, a tool for aggressively assembling millions of short nucleotide sequences by progressively searching through a prefix tree for the longest possible overlap between any two sequences. SSAKE is designed to help leverage the information from short sequence reads by stringently assembling them into contiguous sequences that can be used to characterize novel sequencing targets. Availability: http://www.bcgsc.ca/bioinfo/software/ssake
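The general idea described in the abstract above, greedily extending a sequence by looking up the longest overlapping read through its prefix, can be illustrated with a toy sketch. This is not SSAKE's implementation: it assumes error-free reads, uses a plain dictionary of read prefixes as a stand-in for the prefix tree, extends in the 3' direction only, and ignores reverse complements. All function names and the `min_overlap` parameter are hypothetical.

```python
from collections import defaultdict


def build_prefix_index(reads, min_overlap):
    """Map each proper prefix of length >= min_overlap to the reads that start with it."""
    index = defaultdict(list)
    for read_id, read in enumerate(reads):
        for k in range(min_overlap, len(read)):
            index[read[:k]].append(read_id)
    return index


def extend_right(seed_id, reads, index, used, min_overlap):
    """Greedily extend a seed read to the right, trying the longest overlap first."""
    contig = reads[seed_id]
    used.add(seed_id)
    longest_read = max(len(r) for r in reads)
    extended = True
    while extended:
        extended = False
        max_k = min(len(contig), longest_read - 1)
        for k in range(max_k, min_overlap - 1, -1):
            suffix = contig[-k:]
            # Unused reads whose first k bases equal the contig's last k bases.
            candidates = [i for i in index.get(suffix, []) if i not in used]
            if candidates:
                read_id = candidates[0]
                contig += reads[read_id][k:]  # append only the non-overlapping tail
                used.add(read_id)
                extended = True
                break
    return contig


def assemble(reads, min_overlap=4):
    """Seed a contig from each unused read in input order and extend it greedily."""
    index = build_prefix_index(reads, min_overlap)
    used = set()
    contigs = []
    for seed_id in range(len(reads)):
        if seed_id not in used:
            contigs.append(extend_right(seed_id, reads, index, used, min_overlap))
    return contigs


# Three overlapping 8-nt reads tiled across a made-up 14-nt sequence.
toy_reads = ["ATGCGTAC", "CGTACGTT", "ACGTTAGC"]
toy_contigs = assemble(toy_reads, min_overlap=4)
print(toy_contigs)
```

Because the sketch only extends to the right, the result depends on which read is picked as the seed; a read from the middle of a target would leave the upstream reads in a separate contig.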

Journal Article

Current Production and Metal Oxide Reduction by Shewanella oneidensis MR-1 Wild Type and Mutants

Orianna Bretschger, Anna Obraztsova¹, Carter A. Sturm², In Seop Chang³,⁴, Yuri A. Gorby⁵, Samantha B. Reed⁶, David E. Culley⁶, Catherine L. Reardon⁶, Soumitra Barua⁷,⁸, Margaret F. Romine⁶, Jizhong Zhou⁷,⁸, Alexander S. Beliaev⁶, Rachida Bouhenni⁹, Daad A. Saffarini⁹, Florian Mansfeld, Byung Hong Kim¹,³, James K. Fredrickson⁶, Kenneth H. Nealson¹

University of Southern California¹, Rice University², Korea Institute of Science and Technology³, Gwangju Institute of Science and Technology⁴, J. Craig Venter Institute⁵, Pacific Northwest National Laboratory⁶, Oak Ridge National Laboratory⁷, University of Oklahoma⁸, University of Wisconsin–Milwaukee⁹

01 Nov 2007-Applied and Environmental Microbiology

TL;DR: The results showed that a few key cytochromes play a role in all of the processes but that their degrees of participation in each process are very different, suggesting a very complex picture of electron transfer to solid and soluble substrates by S. oneidensis MR-1.

Abstract: Shewanella oneidensis MR-1 is a gram-negative facultative anaerobe capable of utilizing a broad range of electron acceptors, including several solid substrates. S. oneidensis MR-1 can reduce Mn(IV) and Fe(III) oxides and can produce current in microbial fuel cells. The mechanisms that are employed by S. oneidensis MR-1 to execute these processes have not yet been fully elucidated. Several different S. oneidensis MR-1 deletion mutants were generated and tested for current production and metal oxide reduction. The results showed that a few key cytochromes play a role in all of the processes but that their degrees of participation in each process are very different. Overall, these data suggest a very complex picture of electron transfer to solid and soluble substrates by S. oneidensis MR-1.

Journal Article•DOI•

Genome Sequence of Avery's Virulent Serotype 2 Strain D39 of Streptococcus pneumoniae and Comparison with That of Unencapsulated Laboratory Strain R6

Joel A. Lanie¹, Wai-Leung Ng¹, Krystyna M. Kazmierczak¹, Tiffany M. Andrzejewski¹, Tanja M. Davidsen, Kyle J. Wayne¹, Hervé Tettelin, John I. Glass², Malcolm E. Winkler¹•Institutions (2)

Indiana University¹, J. Craig Venter Institute²

01 Jan 2007-Journal of Bacteriology

TL;DR: The genome sequences and new annotation of two different isolates of strain D39 and the corrected sequence of strain R6 are reported and the implications of the D39 genome sequences to studies of pneumococcal physiology and pathogenesis are presented and discussed.

Abstract: Streptococcus pneumoniae (pneumococcus) is a leading human respiratory pathogen that causes a variety of serious mucosal and invasive diseases. D39 is an historically important serotype 2 strain that was used in experiments by Avery and coworkers to demonstrate that DNA is the genetic material. Although isolated nearly a century ago, D39 remains extremely virulent in murine infection models and is perhaps the strain used most frequently in current studies of pneumococcal pathogenesis. To date, the complete genome sequences have been reported for only two S. pneumoniae strains: TIGR4, a recent serotype 4 clinical isolate, and laboratory strain R6, an avirulent, unencapsulated derivative of strain D39. We report here the genome sequences and new annotation of two different isolates of strain D39 and the corrected sequence of strain R6. Comparisons of these three related sequences allowed deduction of the likely sequence of the D39 progenitor and mutations that arose in each isolate. Despite its numerous repeated sequences and IS elements, the serotype 2 genome has remained remarkably stable during cultivation, and one of the D39 isolates contains only five relatively minor mutations compared to the deduced D39 progenitor. In contrast, laboratory strain R6 contains 71 single-base-pair changes, six deletions, and four insertions and has lost the cryptic pDP1 plasmid compared to the D39 progenitor strain. Many of these mutations are in or affect the expression of genes that play important roles in regulation, metabolism, and virulence. The nature of the mutations that arose spontaneously in these three strains, the relative global transcription patterns determined by microarray analyses, and the implications of the D39 genome sequences to studies of pneumococcal physiology and pathogenesis are presented and discussed.

Journal Article•DOI•

CAMERA: a community resource for metagenomics.

Rekha Seshadri¹, Saul A. Kravitz, Larry Smarr, Paul Gilna, Marvin Frazier•Institutions (1)

J. Craig Venter Institute¹

13 Mar 2007-PLOS Biology

TL;DR: The CAMERA (Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis) community database for metagenomic data deposition is an important first step in developing methods for monitoring microbial communities.

Abstract: Microbes are responsible for most of the chemical transformations that are crucial to sustaining life on Earth. Their ability to inhabit almost any environmental niche suggests that they possess an incredible diversity of physiological capabilities. However, we have little to no information on a majority of the millions of microbial species that are predicted to exist, mainly because of our inability to culture them in the laboratory. A growing discipline called metagenomics allows us to study these uncultured organisms by deciphering their genetic information from DNA that is extracted directly from their environment, thus effectively bypassing the laboratory culture step. Metagenomics allows us to address the questions “who's there?”, “what are they doing?”, and “how are they doing it?”, offering insights into the evolutionary history as well as previously unrecognized physiological abilities of uncultured communities. Studies such as the J. Craig Venter Institute's Global Ocean Sampling (GOS) expedition (in this issue) reveal a remarkable breadth and depth of microbial diversity in the oceans. To date, researchers have made significant but largely preliminary inroads into understanding the biogeography of microbial populations across ecosystems. We know even less about the dynamic physiological processes and complex interactions that impact global carbon cycles and ocean productivity. Marine microbes are thought to act as part of the biological conduit that transports carbon dioxide from the surface to the deep oceanic realms. By removing carbon from the atmosphere and sequestering it (in the form of organic matter), marine microorganisms may significantly affect global climate. Although we now have numerous global and real-time methods to measure physical and chemical parameters within the ocean, few methods or concepts have been developed to measure important microbial processes on a global scale. 
Even if the technology to make such measurements existed, we would presently not know what to measure or how to interpret those measurements. We need a systematic way to explore the structure and function of ocean ecosystems, and their impact on global carbon processing and climate. Metagenomics has the potential to shed light on the genetic controls of these processes by investigating the key players, their roles, and community compositions that may change as a function of time, climate, nutrients, carbon dioxide, and anthropogenic factors. These studies include a substantial informatics component, requiring researchers to take on complex computational and mathematical challenges. Nonetheless, microbiologists have been quick to seize upon this modern technique, resulting in a deluge of sequence data, and an ever-widening gap between the rates of collecting data and interpreting it. The Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis (CAMERA) project [1] is an important first step in attempting to bridge these gaps and in developing global methods for monitoring microbial communities in the ocean and their response to environmental changes. The aim is to create a rich, distinctive data repository and bioinformatics tools resource that will address many of the unique challenges of metagenomics and enable researchers to unravel the biology of environmental microorganisms (Figure 1). CAMERA's database includes environmental metagenomic and genomic sequence data, associated environmental parameters (“metadata”), precomputed search results, and software tools to support powerful cross-analysis of environmental samples. 
Figure 1. Schematic of Intended Core Functions of the CAMERA Project

The initial release will include data and tools associated with the companion set of GOS expedition publications [2–4]; metagenome data from the Hawaii Ocean Time Series Station ALOHA [5] and marine viromes from four different oceanic regions [6]; standard nonredundant sequence databases (e.g., nrnt for nucleotides and nraa for amino acids [7]); and collections of microbial genome sequences, including a set of 155 marine microbial genomes funded by the Gordon and Betty Moore Foundation. The focal point for the CAMERA project is its Web site: http://camera.calit2.net. We invite the research community to submit its metagenomics data to CAMERA, and are establishing mechanisms to streamline this process. Here we describe some of the key challenges and features of the CAMERA project.

Journal Article•DOI•

Genome transplantation in bacteria: changing one species to another.

Carole Lartigue¹, John I. Glass¹, Nina Alperovich¹, Rembert Pieper¹, Prashanth P. Parmar¹, Clyde A. Hutchison¹, Hamilton O. Smith¹, J. Craig Venter¹•Institutions (1)

J. Craig Venter Institute¹

03 Aug 2007-Science

TL;DR: This work completely replaced the genome of a bacterial cell with one from another species by transplanting a whole genome as naked DNA into Mycoplasma capricolum cells by polyethylene glycol–mediated transformation.

Abstract: As a step toward propagation of synthetic genomes, we completely replaced the genome of a bacterial cell with one from another species by transplanting a whole genome as naked DNA. Intact genomic DNA from Mycoplasma mycoides large colony (LC), virtually free of protein, was transplanted into Mycoplasma capricolum cells by polyethylene glycol-mediated transformation. Cells selected for tetracycline resistance, carried by the M. mycoides LC chromosome, contain the complete donor genome and are free of detectable recipient genomic sequences. These cells that result from genome transplantation are phenotypically identical to the M. mycoides LC donor strain as judged by several criteria.

Journal Article•DOI•

Toward Sequencing Cotton (Gossypium) Genomes

Z. Jeffrey Chen, Brian E. Scheffler¹, Elizabeth S. Dennis², Barbara A. Triplett¹, Tianzhen Zhang³, Wangzhen Guo³, Xiao-Ya Chen, David M. Stelly⁴, Pablo D. Rabinowicz⁵, Christopher D. Town⁶, Tony Arioli⁷, Curt L. Brubaker⁷, Roy G. Cantrell⁸, Jean Marc Lacape, Mauricio Ulloa¹, Peng W. Chee⁹, Alan R. Gingle⁹, Candace H. Haigler¹⁰, Richard G. Percy¹, Sukumar Saha¹, Thea A. Wilkins¹¹, Robert J. Wright¹¹, Allen Van Deynze¹², Yu-Xian Zhu¹³, Shuxun Yu, Ibrokhim Y. Abdurakhmonov¹⁴, Ishwarappa S. Katageri¹⁵, P. Ananda Kumar¹⁶, Mehboob-ur-Rahman¹⁷, Yusuf Zafar¹⁷, John Z. Yu¹, Russell J. Kohel¹, Jonathan F. Wendel¹⁸, Andrew H. Paterson⁹•Institutions (18)

United States Department of Agriculture¹, Commonwealth Scientific and Industrial Research Organisation², Nanjing Agricultural University³, Texas A&M University⁴, University of Maryland, Baltimore⁵, J. Craig Venter Institute⁶, Bayer⁷, Monsanto⁸, University of Georgia⁹, North Carolina State University¹⁰, Texas Tech University¹¹, University of California, Davis¹², Peking University¹³, Academy of Sciences of Uzbekistan¹⁴, University of Agricultural Sciences, Dharwad¹⁵, Indian Agricultural Research Institute¹⁶, National Institute for Biotechnology and Genetic Engineering¹⁷, Iowa State University¹⁸

01 Dec 2007-Plant Physiology

TL;DR: Despite rapidly decreasing costs and innovative technologies, sequencing of angiosperm genomes is not yet undertaken lightly, and generating larger amounts of sequence data more quickly does not address the difficulties of sequencing and assembling complex genomes de novo.

Abstract: Despite rapidly decreasing costs and innovative technologies, sequencing of angiosperm genomes is not yet undertaken lightly. Generating larger amounts of sequence data more quickly does not address the difficulties of sequencing and assembling complex genomes de novo. The cotton (Gossypium spp.)

Journal Article•DOI•

Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

Byrappa Venkatesh¹, Ewen F. Kirkness, Yong-Hwee E. Loh¹, Aaron L. Halpern², Alison P. Lee¹, Justin Johnson², Nidhi Dandona¹, Lakshmi D. Viswanathan², Alice Tay¹, J. Craig Venter², Robert L. Strausberg², Sydney Brenner¹•Institutions (2)

Institute of Molecular and Cell Biology¹, J. Craig Venter Institute²

03 Apr 2007-PLOS Biology

TL;DR: Survey sequencing and comparative analysis of the elephant shark genome are described, showing that the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes.

Abstract: Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.
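
As a rough guide to what 1.4× survey coverage means, the short sketch below applies the standard Poisson (Lander-Waterman style) approximation, under which the expected fraction of bases covered by at least one read at average depth c is 1 − e^(−c). This model is an assumption brought in for illustration and is not taken from the paper; it treats reads as landing uniformly at random and ignores repeats and cloning bias. The function name `expected_fraction_covered` is hypothetical.

```python
import math


def expected_fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read at a given average depth.

    Assumes reads are placed uniformly at random (Poisson approximation).
    """
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    # At 1.4x average depth, roughly three quarters of the genome is expected
    # to be sampled at least once under this idealized model.
    print(f"{expected_fraction_covered(1.4):.3f}")
```

Under these assumptions, a 1.4× survey leaves about a quarter of bases unsampled, which is consistent with the abstract's description of gene fragments rather than complete genes.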

Journal Article•DOI•

Nanoliter reactors improve multiple displacement amplification of genomes from single cells.

Yann Marcy¹, Thomas Ishoey², Roger S. Lasken², Timothy B. Stockwell², Brian P. Walenz², Aaron L. Halpern², Karen Beeson², Susanne M. D. Goldberg², Stephen R. Quake¹, Stephen R. Quake³•Institutions (3)

Stanford University¹, J. Craig Venter Institute², Howard Hughes Medical Institute³

21 Sep 2007-PLOS Genetics

TL;DR: Single-cell amplicons from both microliter and nanoliter volumes provided high-quality sequence data by high-throughput pyrosequencing, thereby demonstrating a straightforward route to sequencing genomes from single cells.

Abstract: Since only a small fraction of environmental bacteria are amenable to laboratory culture, there is great interest in genomic sequencing directly from single cells. Sufficient DNA for sequencing can be obtained from one cell by the Multiple Displacement Amplification (MDA) method, thereby eliminating the need to develop culture methods. Here we used a microfluidic device to isolate individual Escherichia coli and amplify genomic DNA by MDA in 60-nl reactions. Our results confirm a report that reduced MDA reaction volume lowers nonspecific synthesis that can result from contaminant DNA templates and unfavourable interaction between primers. The quality of the genome amplification was assessed by qPCR and compared favourably to single-cell amplifications performed in standard 50-μl volumes. Amplification bias was greatly reduced in nanoliter volumes, thereby providing a more even representation of all sequences. Single-cell amplicons from both microliter and nanoliter volumes provided high-quality sequence data by high-throughput pyrosequencing, thereby demonstrating a straightforward route to sequencing genomes from single cells.

Journal Article•DOI•

DNA sequencing: bench to bedside and beyond

Clyde A. Hutchison¹•Institutions (1)

J. Craig Venter Institute¹

01 Sep 2007-Nucleic Acids Research

TL;DR: New ‘massively parallel’ sequencing methods are greatly increasing sequencing capacity, but further innovations are needed to achieve the ‘thousand dollar genome’ that many feel is prerequisite to personalized genomic medicine.

Abstract: Fifteen years elapsed between the discovery of the double helix (1953) and the first DNA sequencing (1968). Modern DNA sequencing began in 1977, with development of the chemical method of Maxam and Gilbert and the dideoxy method of Sanger, Nicklen and Coulson, and with the first complete DNA sequence (phage φX174), which demonstrated that sequence could give profound insights into genetic organization. Incremental improvements allowed sequencing of molecules >200 kb (human cytomegalovirus) leading to an avalanche of data that demanded computational analysis and spawned the field of bioinformatics. The US Human Genome Project spurred sequencing activity. By 1992 the first ‘sequencing factory’ was established, and others soon followed. The first complete cellular genome sequences, from bacteria, appeared in 1995 and other eubacterial, archaebacterial and eukaryotic genomes were soon sequenced. Competition between the public Human Genome Project and Celera Genomics produced working drafts of the human genome sequence, published in 2001, but refinement and analysis of the human genome sequence will continue for the foreseeable future. New ‘massively parallel’ sequencing methods are greatly increasing sequencing capacity, but further innovations are needed to achieve the ‘thousand dollar genome’ that many feel is prerequisite to personalized genomic medicine. These advances will also allow new approaches to a variety of problems in biology, evolution and the environment.

Journal Article•DOI•

Structural and functional diversity of the microbial kinome.

Natarajan Kannan¹, Susan S. Taylor¹, Yufeng Zhai², J. Craig Venter³, Gerard Manning²•Institutions (3)

University of California, San Diego¹, Salk Institute for Biological Studies², J. Craig Venter Institute³

13 Mar 2007-PLOS Biology

TL;DR: This huge phylogenetic and functional space is explored to cast light on the ancient evolution of this superfamily of enzymes built on a common protein kinase–like (PKL) fold and serves as a model for further structural and functional analysis of enzyme evolution.

Abstract: The eukaryotic protein kinase (ePK) domain mediates the majority of signaling and coordination of complex events in eukaryotes. By contrast, most bacterial signaling is thought to occur through structurally unrelated histidine kinases, though some ePK-like kinases (ELKs) and small molecule kinases are known in bacteria. Our analysis of the Global Ocean Sampling (GOS) dataset reveals that ELKs are as prevalent as histidine kinases and may play an equally important role in prokaryotic behavior. By combining GOS and public databases, we show that the ePK is just one subset of a diverse superfamily of enzymes built on a common protein kinase–like (PKL) fold. We explored this huge phylogenetic and functional space to cast light on the ancient evolution of this superfamily, its mechanistic core, and the structural basis for its observed diversity. We cataloged 27,677 ePKs and 18,699 ELKs, and classified them into 20 highly distinct families whose known members suggest regulatory functions. GOS data more than tripled the count of ELK sequences and enabled the discovery of novel families and classification and analysis of all ELKs. Comparison between and within families revealed ten key residues that are highly conserved across families. However, all but one of the ten residues has been eliminated in one family or another, indicating great functional plasticity. We show that loss of a catalytic lysine in two families is compensated by distinct mechanisms both involving other key motifs. This diverse superfamily serves as a model for further structural and functional analysis of enzyme evolution.

Journal Article•DOI•

Mechanism of chimera formation during the Multiple Displacement Amplification reaction

Roger S. Lasken¹, Timothy B. Stockwell¹•Institutions (1)

J. Craig Venter Institute¹

12 Apr 2007-BMC Biotechnology

TL;DR: Identification of the mechanism for chimera formation provides new insight into the MDA reaction and suggests methods to reduce chimeras, particularly for whole genome sequencing.

Abstract: Multiple Displacement Amplification (MDA) is a method used for amplifying limiting DNA sources. The high molecular weight amplified DNA is ideal for DNA library construction. While this has enabled genomic sequencing from one or a few cells of unculturable microorganisms, the process is complicated by the tendency of MDA to generate chimeric DNA rearrangements in the amplified DNA. Determining the source of the DNA rearrangements would be an important step towards reducing or eliminating them. Here, we characterize the major types of chimeras formed by carrying out an MDA whole genome amplification from a single E. coli cell and sequencing by the 454 Life Sciences method. Analysis of 475 chimeras revealed the predominant reaction mechanisms that create the DNA rearrangements. The highly branched DNA synthesized in MDA can assume many alternative secondary structures. DNA strands extended on an initial template can be displaced, becoming available to prime on a second template and creating the chimeras. Evidence supports a model in which branch migration can displace 3'-ends, freeing them to prime on the new templates. More than 85% of the resulting DNA rearrangements were inverted sequences with intervening deletions that the model predicts. Intramolecular rearrangements were favored, with displaced 3'-ends reannealing to single-stranded 5'-strands contained within the same branched DNA molecule. In over 70% of the chimeric junctions, the 3' termini had initiated priming at complementary sequences of 2–21 nucleotides (nts) in the new templates. Formation of chimeras is an important limitation to the MDA method, particularly for whole genome sequencing. Identification of the mechanism for chimera formation provides new insight into the MDA reaction and suggests methods to reduce chimeras. The 454 sequencing approach used here will provide a rapid method to assess the utility of reaction modifications.

Journal Article•DOI•

Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

Takeshi Itoh¹, Takeshi Itoh², Tsuyoshi Tanaka¹, Roberto A. Barrero, Chisato Yamasaki², Yasuyuki Fujii², Phillip Hilton², Baltazar A. Antonio¹, Hideo Aono, Rolf Apweiler, Richard Bruskiewich³, Thomas E. Bureau⁴, Frances A. Burr⁵, Antonio Costa de Oliveira⁶, Galina Fuks⁷, Takuya Habara², Georg Haberer, Bin Han, Erimi Harada², Aiko T. Hiraki², Hirohiko Hirochika¹, Douglas R. Hoen⁴, Hiroki Hokari², Satomi Hosokawa, Yue-Ie C. Hsing⁸, Hiroshi Ikawa⁹, Kazuho Ikeo, Tadashi Imanishi², Tadashi Imanishi¹⁰, Yukiyo Ito, Pankaj Jaiswal¹¹, Masako Kanno², Yoshihiro Kawahara¹², Yoshihiro Kawahara², Toshiyuki Kawamura², Hiroaki Kawashima², Jitendra P. Khurana¹³, Shoshi Kikuchi¹, Setsuko Komatsu¹, Kanako O. Koyanagi¹⁰, Hiromi Kubooka², Damien Lieberherr¹⁴, Yao-Cheng Lin⁸, David M. Lonsdale, Takashi Matsumoto¹, Akihiro Matsuya², W. Richard McCombie¹⁵, Joachim Messing⁷, Akio Miyao¹, Nicola Mulder, Yoshiaki Nagamura¹, Jongmin Nam¹⁶, Jongmin Nam¹⁷, Nobukazu Namiki, Hisataka Numa¹, Shin Nurimoto², Claire O'Donovan, Hajime Ohyanagi⁹, Toshihisa Okido, Satoshi Oota, Naoki Osato, Lance E. Palmer¹⁵, Lance E. Palmer¹⁸, Francis Quetier¹⁹, Saurabh Raghuvanshi¹³, Naomi Saichi², Hiroaki Sakai², Hiroaki Sakai¹, Yasumichi Sakai⁹, Katsumi Sakata⁹, Tetsuya Sakurai, Fumihiko Sato², Yoshiharu Sato², Heiko Schoof²⁰, Heiko Schoof²¹, Motoaki Seki, Michie Shibata, Yuji Shimizu⁹, Kazuo Shinozaki, Yuji Shinso², Nagendra K. Singh²², Brian Smith-White²³, Jun-ichi Takeda², Motohiko Tanino², Tatiana Tatusova²³, Supat Thongjuea²⁴, Fusano Todokoro², Mika Tsugane, Akhilesh K. Tyagi¹³, Apichart Vanavichit²⁴, Aihui Wang²⁵, Rod A. Wing, Kaori Yamaguchi², Mayu Yamamoto, Naoyuki Yamamoto², Yeisoo Yu²⁶, Hao Zhang², Qiang Zhao, Kenichi Higo¹, Benjamin Burr⁵, Takashi Gojobori², Takuji Sasaki¹•Institutions (26)

01 Feb 2007-Genome Research

TL;DR: The results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene.

Abstract: We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene.

Journal Article•DOI•

Phenotypic and Transcriptomic Changes Associated with Potato Autopolyploidization

Robert M. Stupar, Pudota B. Bhaskar¹, Brian S. Yandell¹, Willem Albert Rensink², Amy L. Hart², Shu Ouyang², Richard E. Veilleux³, James S. Busse⁴, Robert J. Erhardt¹, C. Robin Buell², Jiming Jiang¹•Institutions (4)

University of Wisconsin-Madison¹, J. Craig Venter Institute², Virginia Tech³, United States Department of Agriculture⁴

01 Aug 2007-Genetics

TL;DR: Alteration of ploidy caused subtle expression changes in a substantial percentage of genes in the potato genome, and there are few genes, if any, whose expression is linearly correlated with ploidy and can be dramatically changed because of ploidy alteration.

Abstract: Polyploidy is remarkably common in the plant kingdom and polyploidization is a major driving force for plant genome evolution. Polyploids may contain genomes from different parental species (allopolyploidy) or include multiple sets of the same genome (autopolyploidy). Genetic and epigenetic changes associated with allopolyploidization have been a major research subject in recent years. However, we know little about the genetic impact imposed by autopolyploidization. We developed a synthetic autopolyploid series in potato (Solanum phureja) that includes one monoploid (1x) clone, two diploid (2x) clones, and one tetraploid (4x) clone. Cell size and organ thickness were positively correlated with the ploidy level. However, the 2x plants were generally the most vigorous and the 1x plants exhibited less vigor compared to the 2x and 4x individuals. We analyzed the transcriptomic variation associated with this autopolyploid series using a potato cDNA microarray containing ∼9000 genes. Statistically significant expression changes were observed among the ploidies for ∼10% of the genes in both leaflet and root tip tissues. However, most changes were associated with the monoploid and were within the twofold level. Thus, alteration of ploidy caused subtle expression changes of a substantial percentage of genes in the potato genome. We demonstrated that there are few genes, if any, whose expression is linearly correlated with the ploidy and can be dramatically changed because of ploidy alteration.

Journal Article•DOI•

Single-cell genomic sequencing using Multiple Displacement Amplification.

Roger S. Lasken¹•Institutions (1)

J. Craig Venter Institute¹

01 Oct 2007-Current Opinion in Microbiology

TL;DR: Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction, which will greatly accelerate the pace of sequencing from uncultured microbes.

Journal Article•DOI•

Assessing diversity and biogeography of aerobic anoxygenic phototrophic bacteria in surface waters of the Atlantic and Pacific Oceans using the Global Ocean Sampling expedition metagenomes.

Natalya Yutin¹, Marcelino T. Suzuki², Hanno Teeling³, M. Weber³, J. Craig Venter⁴, Douglas B. Rusch⁴, Oded Béjà¹•Institutions (4)

Technion – Israel Institute of Technology¹, University of Maryland Center for Environmental Science², Max Planck Society³, J. Craig Venter Institute⁴

01 Jun 2007-Environmental Microbiology

TL;DR: The results support the notion that marine AAnP populations are complex and dynamic, and compose an important fraction of bacterioplankton assemblages in certain oceanic areas.

Abstract: Summary: Aerobic anoxygenic photosynthetic bacteria (AAnP) were recently proposed to be significant contributors to global oceanic carbon and energy cycles. However, AAnP abundance, spatial distribution, diversity and potential ecological importance remain poorly understood. Here we present metagenomic data from the Global Ocean Sampling expedition indicating that AAnP diversity and abundance vary in different oceanic regions. Furthermore, we show for the first time that the composition of AAnP assemblages changes between different oceanic regions, with specific bacterial assemblages adapted to open ocean or coastal areas, respectively. Our results support the notion that marine AAnP populations are complex and dynamic, and compose an important fraction of bacterioplankton assemblages in certain oceanic areas.

Journal Article•DOI•

Origin and distribution of epipolythiodioxopiperazine (ETP) gene clusters in filamentous ascomycetes.

Nicola J. Patron¹, Nicola J. Patron², Ross F. Waller¹, Anton Cozijnsen¹, David C. Straney³, Donald M. Gardiner⁴, Donald M. Gardiner¹, William C. Nierman⁵, William C. Nierman⁶, Barbara J. Howlett¹•Institutions (6)

University of Melbourne¹, University of British Columbia², University of Maryland, College Park³, Commonwealth Scientific and Industrial Research Organisation⁴, George Washington University⁵, J. Craig Venter Institute⁶

26 Sep 2007-BMC Evolutionary Biology

TL;DR: ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages, suggesting that a progenitor ETP gene cluster assembled within an ancestral taxon.


Abstract: Genes responsible for biosynthesis of fungal secondary metabolites are usually tightly clustered in the genome and co-regulated with metabolite production. Epipolythiodioxopiperazines (ETPs) are a class of secondary metabolite toxins produced by disparate ascomycete fungi and implicated in several animal and plant diseases. Gene clusters responsible for their production have previously been defined in only two fungi. Fungal genome sequence data have been surveyed for the presence of putative ETP clusters and cluster data have been generated from several fungal taxa where genome sequences are not available. Phylogenetic analysis of cluster genes has been used to investigate the assembly and heredity of these gene clusters. Putative ETP gene clusters are present in 14 ascomycete taxa, but absent in numerous other ascomycetes examined. These clusters are discontinuously distributed in ascomycete lineages. Gene content is not absolutely fixed; however, common genes are identified and phylogenies of six of these are separately inferred. In each phylogeny almost all cluster genes form monophyletic clades with non-cluster fungal paralogues being the nearest outgroups. This relatedness of cluster genes suggests that a progenitor ETP gene cluster assembled within an ancestral taxon. Within each of the cluster clades, the cluster genes group together in consistent subclades; however, these relationships do not always reflect the phylogeny of ascomycetes. Micro-synteny of several of the genes within the clusters provides further support for these subclades. ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages. This progenitor cluster has given rise to a small number of distinct phylogenetic classes of clusters that are represented in a discontinuous pattern throughout ascomycetes. The disjunct heredity of these clusters is discussed with consideration to multiple instances of independent cluster loss and lateral transfer of gene clusters between lineages.


Journal Article•DOI•

Breed relationships facilitate fine-mapping studies: A 7.8-kb deletion cosegregates with Collie eye anomaly across multiple dog breeds


Heidi G. Parker¹, Anna V. Kukekova², Dayna Akey³, Orly Goldstein⁴, Ewen F. Kirkness⁵, Kathleen C. Baysac¹, Dana S. Mosher¹, Gustavo D. Aguirre⁶, Gregory M. Acland⁴, Elaine A. Ostrander¹•Institutions (6)

National Institutes of Health¹, University of Illinois at Urbana–Champaign², University of Washington³, Cornell University⁴, J. Craig Venter Institute⁵, University of Pennsylvania⁶

01 Nov 2007-Genome Research

TL;DR: This work establishes that the primary cea mutation arose as a single disease allele in a common ancestor of herding breeds and highlights the value of comparative population analysis for refining regions of linkage.


Abstract: The features of modern dog breeds that increase the ease of mapping common diseases, such as reduced heterogeneity and extensive linkage disequilibrium, may also increase the difficulty associated with fine mapping and identifying causative mutations. One way to address this problem is by combining data from multiple breeds segregating the same trait after initial linkage has been determined. The multibreed approach increases the number of potentially informative recombination events and reduces the size of the critical haplotype by taking advantage of shortened linkage disequilibrium distances found across breeds. In order to identify breeds that likely share a trait inherited from the same ancestral source, we have used cluster analysis to divide 132 breeds of dog into five primary breed groups. We then use the multibreed approach to fine-map Collie eye anomaly (cea), a complex disorder of ocular development that was initially mapped to a 3.9-cM region on canine chromosome 37. Combined genotypes from affected individuals from four breeds of a single breed group significantly narrowed the candidate gene region to a 103-kb interval spanning only four genes. Sequence analysis revealed that all affected dogs share a homozygous deletion of 7.8 kb in the NHEJ1 gene. This intronic deletion spans a highly conserved binding domain to which several developmentally important proteins bind. This work both establishes that the primary cea mutation arose as a single disease allele in a common ancestor of herding breeds and highlights the value of comparative population analysis for refining regions of linkage.


Journal Article•DOI•

Synthetic Genomics: Options for Governance


Michele S. Garfinkel¹, Drew Endy, Gerald L. Epstein, Robert M. Friedman•Institutions (1)

J. Craig Venter Institute¹

14 Dec 2007-Biosecurity and Bioterrorism: Biodefense Strategy, Practice, and Science

Journal Article•DOI•

Transcriptional Regulation of Multi-Drug Tolerance and Antibiotic-Induced Responses by the Histone-Like Protein Lsr2 in M. tuberculosis


Roberto Colangeli¹, Danica Helb¹, Catherine Vilchèze², Manzour Hernando Hazbón¹, Chee Gun Lee¹, Hassan Safi¹, Brendan Sayers¹, Irene Sardone¹, Marcus B. Jones³, Robert D. Fleischmann³, Scott N. Peterson³, William R. Jacobs², David Alland¹•Institutions (3)

University of Medicine and Dentistry of New Jersey¹, Albert Einstein College of Medicine², J. Craig Venter Institute³

22 Jun 2007-PLOS Pathogens

TL;DR: Lsr2 appears to regulate several important pathways in mycobacteria by preferentially binding to AT-rich sequences, including genes induced by antibiotics and those associated with inducible multi-drug tolerance.


Abstract: Multi-drug tolerance is a key phenotypic property that complicates the sterilization of mammals infected with Mycobacterium tuberculosis. Previous studies have established that iniBAC, an operon that confers multi-drug tolerance to M. bovis BCG through an associated pump-like activity, is induced by the antibiotics isoniazid (INH) and ethambutol (EMB). An improved understanding of the functional role of antibiotic-induced genes and the regulation of drug tolerance may be gained by studying the factors that regulate antibiotic-mediated gene expression. An M. smegmatis strain containing a lacZ gene fused to the promoter of M. tuberculosis iniBAC (PiniBAC) was subjected to transposon mutagenesis. Mutants with constitutive expression and increased EMB-mediated induction of PiniBAC::lacZ mapped to the lsr2 gene (MSMEG6065), which encodes a small basic protein of unknown function that is highly conserved among mycobacteria. These mutants had a marked change in colony morphology and generated a new polar lipid. Complementation with multi-copy M. tuberculosis lsr2 (Rv3597c) returned PiniBAC expression to baseline, reversed the observed morphological and lipid changes, and repressed PiniBAC induction by EMB to below that of the control M. smegmatis strain. Microarray analysis of an lsr2 knockout confirmed upregulation of M. smegmatis iniA and demonstrated upregulation of genes involved in cell wall and metabolic functions. Fully 121 of 584 genes induced by EMB treatment in wild-type M. smegmatis were upregulated (“hyperinduced”) to even higher levels by EMB in the M. smegmatis lsr2 knockout. The most highly upregulated genes and gene clusters had adenine-thymine (AT)–rich 5-prime untranslated regions. In M. tuberculosis, overexpression of lsr2 repressed INH-mediated induction of all three iniBAC genes, as well as another annotated pump, efpA. The low molecular weight and basic properties of Lsr2 (pI 10.69) suggested that it was a histone-like protein, although it did not exhibit sequence homology with other proteins in this class. Consistent with other histone-like proteins, Lsr2 bound DNA with a preference for circular DNA, forming large oligomers, inhibited DNase I activity, and introduced a modest degree of supercoiling into relaxed plasmids. Lsr2 also inhibited in vitro transcription and topoisomerase I activity. Lsr2 represents a novel class of histone-like proteins that inhibit a wide variety of DNA-interacting enzymes. Lsr2 appears to regulate several important pathways in mycobacteria by preferentially binding to AT-rich sequences, including genes induced by antibiotics and those associated with inducible multi-drug tolerance. An improved understanding of the role of lsr2 may provide important insights into the mechanisms of action of antibiotics and the way that mycobacteria adapt to stresses such as antibiotic treatment.
