Showing papers on "Genome" published in 2007

Open Access

Journal Article•DOI•

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project

Ewan Birney, John A. Stamatoyannopoulos¹, Anindya Dutta², Roderic Guigó³ +317 more•Institutions (44)

14 Jun 2007-Nature

TL;DR: Functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project are reported, providing convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts.

Abstract: We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

5,091 citations

Journal Article•DOI•

A mammalian microRNA expression atlas based on small RNA library sequencing.

Pablo Landgraf¹, Mirabela Rusu², Robert L. Sheridan³, Alain Sewer², Alain Sewer⁴, Nicola Iovino¹, Alexei A. Aravin¹, Sébastien Pfeffer¹, Amanda J. Rice¹, Alice O. Kamphorst¹, Markus Landthaler¹, Carolina Lin¹, Nicholas D. Socci³, Leandro C. Hermida², Valerio Fulci⁵, Sabina Chiaretti⁵, Robin Foà⁵, Julia Schliwka⁶, Uta Fuchs⁷, Astrid Novosel⁷, Roman-Ulrich Müller⁸, Roman-Ulrich Müller¹, Bernhard Schermer⁸, Ute Bissels⁹, Jason M. Inman¹⁰, Quang Phan¹⁰, Minchen Chien¹¹, David B. Weir¹¹, Ruchi Choksi¹¹, Gabriella De Vita¹², Daniela Frezzetti¹², Hans Ingo Trompeter¹³, Veit Hornung⁷, Grace Teng¹, Gunther Hartmann¹⁴, Miklós Palkovits¹⁵, Roberto Di Lauro, Peter Wernet¹³, Giuseppe Macino⁵, Charles E. Rogler¹⁶, James W. Nagle¹⁷, Jingyue Ju¹¹, F. Nina Papavasiliou¹, Thomas Benzing⁸, Peter Lichter, Wayne Tam¹⁸, Michael J. Brownstein¹⁰, Andreas Bosio⁹, Arndt Borkhardt⁷, James J. Russo¹¹, Chris Sander³, Mihaela Zavolan², Mihaela Zavolan⁴, Thomas Tuschl¹•Institutions (18)

Rockefeller University¹, University of Basel², Memorial Sloan Kettering Cancer Center³, Swiss Institute of Bioinformatics⁴, Sapienza University of Rome⁵, German Cancer Research Center⁶, Ludwig Maximilian University of Munich⁷, University of Freiburg⁸, Miltenyi Biotec⁹, J. Craig Venter Institute¹⁰, Columbia University¹¹, University of Naples Federico II¹², University of Düsseldorf¹³, University of Bonn¹⁴, Semmelweis University¹⁵, Yeshiva University¹⁶, National Institutes of Health¹⁷, Cornell University¹⁸

29 Jun 2007-Cell

TL;DR: A relatively small set of miRNAs, many of which are ubiquitously expressed, account for most of the differences in miRNA profiles between cell lineages and tissues.

3,687 citations

Journal Article•DOI•

DNA–DNA hybridization values and their relationship to whole-genome sequence similarities

Johan Goris¹, Konstantinos T. Konstantinidis¹, Joel A. Klappenbach¹, Tom Coenye², Peter Vandamme², James M. Tiedje¹•Institutions (2)

Michigan State University¹, Ghent University²

01 Jan 2007-International Journal of Systematic and Evolutionary Microbiology

TL;DR: It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available; the results also reveal extensive gene diversity within the current concept of "species".

Abstract: DNA-DNA hybridization (DDH) values have been used by bacterial taxonomists since the 1960s to determine relatedness between strains and are still the most important criterion in the delineation of bacterial species. Since the extent of hybridization between a pair of strains is ultimately governed by their respective genomic sequences, we examined the quantitative relationship between DDH values and genome sequence-derived parameters, such as the average nucleotide identity (ANI) of common genes and the percentage of conserved DNA. A total of 124 DDH values were determined for 28 strains for which genome sequences were available. The strains belong to six important and diverse groups of bacteria for which the intra-group 16S rRNA gene sequence identity was greater than 94 %. The results revealed a close relationship between DDH values and ANI and between DNA-DNA hybridization and the percentage of conserved DNA for each pair of strains. The recommended cut-off point of 70 % DDH for species delineation corresponded to 95 % ANI and 69 % conserved DNA. When the analysis was restricted to the protein-coding portion of the genome, 70 % DDH corresponded to 85 % conserved genes for a pair of strains. These results reveal extensive gene diversity within the current concept of "species". Examination of reciprocal values indicated that the level of experimental error associated with the DDH method is too high to reveal the subtle differences in genome size among the strains sampled. It is concluded that ANI can accurately replace DDH values for strains for which genome sequences are available.
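
The reported correspondence between 70 % DDH and 95 % ANI can be illustrated with a minimal Python sketch. It assumes the per-fragment percent identities between two genomes have already been computed by some alignment tool; the function names and the input values are hypothetical, and the sketch does not reproduce the pipeline used in the paper.

```python
# Minimal sketch: average nucleotide identity (ANI) from precomputed
# per-fragment percent identities, then a comparison with the 95 % ANI
# value that the abstract reports as corresponding to 70 % DDH.
# Function names and inputs are illustrative assumptions.

ANI_SPECIES_CUTOFF = 95.0  # percent, taken from the abstract


def average_nucleotide_identity(fragment_identities):
    """Return the mean percent identity over the conserved fragments."""
    if not fragment_identities:
        raise ValueError("no conserved fragments to average")
    return sum(fragment_identities) / len(fragment_identities)


def same_species_by_ani(fragment_identities, cutoff=ANI_SPECIES_CUTOFF):
    """True if the pair reaches the ANI cut-off for species delineation."""
    return average_nucleotide_identity(fragment_identities) >= cutoff


if __name__ == "__main__":
    pair = [96.0, 98.0, 97.0]  # hypothetical identities for one strain pair
    print(average_nucleotide_identity(pair), same_species_by_ani(pair))
```

In practice the identities would come from aligning the genes or fragments shared by the two genomes; only the averaging and the threshold comparison are shown here.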

3,471 citations

Journal Article•DOI•

The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla.

Olivier Jaillon¹, Jean-Marc Aury, Benjamin Noel, Alberto Policriti, Christian Clepet, Alberto Casagrande, Nathalie Choisne, Sébastien Aubourg, Nicola Vitulo, Claire Jubin, Alessandro Vezzi, Fabrice Legeai, Philippe Hugueney, Corinne Dasilva, David S. Horner, Erica Mica, Delphine Jublot, Julie Poulain, Clémence Bruyère, Alain Billault, Béatrice Segurens, Michel Gouyvenoux, Edgardo Ugarte, Federica Cattonaro, Véronique Anthouard, Virginie Vico, Cristian Del Fabbro, Michael Alaux, Gabriele Di Gaspero, Vincent Dumas, Nicoletta Felice, Sophie Paillard, Irena Juman, Marco Moroldo, Simone Scalabrin, Aurélie Canaguier, Isabelle Le Clainche, G Malacrida, Eléonore Durand, Graziano Pesole, Valérie Laucou, Philippe Chatelet, Didier Merdinoglu, Massimo Delledonne, Mario Pezzotti, Alain Lecharny, Claude Scarpelli, François Artiguenave, M. Enrico Pè, Giorgio Valle, Michele Morgante, Michel Caboche, Anne-Françoise Adam-Blondon, Jean Weissenbach, Francis Quetier, Patrick Wincker•Institutions (1)

University of Évry Val d'Essonne¹

26 Aug 2007-Nature

TL;DR: A high-quality draft of the genome sequence of grapevine is obtained from a highly homozygous genotype, revealing the contribution of three ancestral genomes to the grapevine haploid content and explaining the chronology of previously described whole-genome duplication events in the evolution of flowering plants.

Abstract: The analysis of the first plant genomes provided unexpected evidence for genome duplication events in species that had previously been considered as true diploids on the basis of their genetics. These polyploidization events may have had important consequences in plant evolution, in particular for species radiation and adaptation and for the modulation of functional capacities. Here we report a high-quality draft of the genome sequence of grapevine (Vitis vinifera) obtained from a highly homozygous genotype. The draft sequence of the grapevine genome is the fourth one produced so far for flowering plants, the second for a woody species and the first for a fruit crop (cultivated for both fruit and beverage). Grapevine was selected because of its important place in the cultural heritage of humanity beginning during the Neolithic period. Several large expansions of gene families with roles in aromatic features are observed. The grapevine genome has not undergone recent genome duplication, thus enabling the discovery of ancestral traits and features of the genetic organization of flowering plants. This analysis reveals the contribution of three ancestral genomes to the grapevine haploid content. This ancestral arrangement is common to many dicotyledonous plants but is absent from the genome of rice, which is a monocotyledon. Furthermore, we explain the chronology of previously described whole-genome duplication events in the evolution of flowering plants.

3,311 citations

Journal Article•DOI•

Genome-Wide Mapping of in Vivo Protein-DNA Interactions

David S. Johnson¹, Ali Mortazavi², Ali Mortazavi¹, Richard M. Myers¹, Richard M. Myers², Barbara J. Wold², Barbara J. Wold¹•Institutions (2)

Stanford University¹, California Institute of Technology²

08 Jun 2007-Science

TL;DR: A large-scale chromatin immunoprecipitation assay based on direct ultrahigh-throughput DNA sequencing was developed, which was then used to map in vivo binding of the neuron-restrictive silencer factor (NRSF; also known as REST) to 1946 locations in the human genome.

Abstract: In vivo protein-DNA interactions connect each transcription factor with its direct targets to form a gene network scaffold. To map these protein-DNA interactions comprehensively across entire mammalian genomes, we developed a large-scale chromatin immunoprecipitation assay (ChIPSeq) based on direct ultrahigh-throughput DNA sequencing. This sequence census method was then used to map in vivo binding of the neuron-restrictive silencer factor (NRSF; also known as REST, for repressor element–1 silencing transcription factor) to 1946 locations in the human genome. The data display sharp resolution of binding position [±50 base pairs (bp)], which facilitated our finding motifs and allowed us to identify noncanonical NRSF-binding motifs. These ChIPSeq data also have high sensitivity and specificity [ROC (receiver operator characteristic) area ≥ 0.96] and statistical confidence (P <10^(–4)), properties that were important for inferring new candidate interactions. These include key transcription factors in the gene network that regulates pancreatic islet cell development.

2,789 citations

Journal Article•DOI•

Identifying bacterial genes and endosymbiont DNA with Glimmer

Arthur L. Delcher¹, Kirsten A. Bratke², Edwin C. Powers³, Steven L. Salzberg¹•Institutions (3)

University of Maryland, College Park¹, Trinity College, Dublin², Johns Hopkins University³

20 Feb 2007-Bioinformatics

TL;DR: The interpolated Markov model (IMM) DNA discriminator correctly separated 99% of the sequences in a recent genome project that produced a mixture of sequences from the bacterium Prochloron didemni and its sea squirt host, Lissoclinum patella.

Abstract: Motivation: The Glimmer gene-finding software has been successfully used for finding genes in bacteria, archaea and viruses representing hundreds of species. We describe several major changes to the Glimmer system, including improved methods for identifying both coding regions and start codons. We also describe a new module of Glimmer that can distinguish host and endosymbiont DNA. This module was developed in response to the discovery that eukaryotic genome sequencing projects sometimes inadvertently capture the DNA of intracellular bacteria living in the host. Results: The new methods dramatically reduce the rate of false-positive predictions, while maintaining Glimmer's 99% sensitivity rate at detecting genes in most species, and they find substantially more correct start sites, as measured by comparisons to known and well-curated genes. We show that our interpolated Markov model (IMM) DNA discriminator correctly separated 99% of the sequences in a recent genome project that produced a mixture of sequences from the bacterium Prochloron didemni and its sea squirt host, Lissoclinum patella. Availability: Glimmer is OSI Certified Open Source and available at http://cbcb.umd.edu/software/glimmer Contact: adelcher@umiacs.umd.edu
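
The idea of separating host from endosymbiont DNA with a sequence model can be sketched in Python. This is a deliberately simplified stand-in, not Glimmer's method: it uses a fixed-order Markov model with add-one smoothing rather than an interpolated Markov model, and every class, function, and training sequence below is a hypothetical illustration.

```python
# Simplified sketch of a two-class DNA discriminator. Assumption: a
# fixed-order Markov model with add-one smoothing replaces the
# interpolated Markov model (IMM) described in the abstract.
import math
from collections import defaultdict


class MarkovModel:
    def __init__(self, order=2):
        self.order = order
        # counts[context][next_base] = number of times seen in training
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequences):
        for seq in sequences:
            for i in range(self.order, len(seq)):
                context = seq[i - self.order:i]
                self.counts[context][seq[i]] += 1

    def log_likelihood(self, seq):
        """Log-probability of seq given the model (add-one smoothing)."""
        total = 0.0
        for i in range(self.order, len(seq)):
            context = seq[i - self.order:i]
            ctx_counts = self.counts.get(context, {})
            numerator = ctx_counts.get(seq[i], 0) + 1
            denominator = sum(ctx_counts.values()) + 4  # four bases
            total += math.log(numerator / denominator)
        return total


def classify(seq, host_model, symbiont_model):
    """Assign seq to whichever model gives the higher log-likelihood."""
    host_score = host_model.log_likelihood(seq)
    symbiont_score = symbiont_model.log_likelihood(seq)
    return "host" if host_score >= symbiont_score else "symbiont"


# Toy training data, purely illustrative.
host_model = MarkovModel(order=2)
host_model.train(["ATATATATATAT"])
symbiont_model = MarkovModel(order=2)
symbiont_model.train(["GCGCGCGCGCGC"])
```

With these toy models, `classify("ATATAT", host_model, symbiont_model)` favours the host model because its training data contain the same dinucleotide contexts. A real discriminator would be trained on large sets of sequences of known origin.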

2,738 citations

Journal Article•

Patterns of Somatic Mutation in Human Cancer Genomes

Michael R. Stratton¹•Institutions (1)

Wellcome Trust Sanger Institute¹

15 Nov 2007-Clinical Cancer Research

TL;DR: In this paper, the coding exons of the family of 518 protein kinases were sequenced in 210 cancers of diverse histological types to explore the nature of the information that will be derived from cancer genome sequencing.

Abstract: AACR Centennial Conference: Translational Cancer Medicine, Nov 4-8, 2007, Singapore (PL02-05). All cancers are due to abnormalities in DNA. The availability of the human genome sequence has led to the proposal that resequencing of cancer genomes will reveal the full complement of somatic mutations and hence all the cancer genes. To explore the nature of the information that will be derived from cancer genome sequencing we have sequenced the coding exons of the family of 518 protein kinases, ~1.3 Mb DNA per cancer sample, in 210 cancers of diverse histological types. Despite the screen being directed toward the coding regions of a gene family that has previously been strongly implicated in oncogenesis, the results indicate that the majority of somatic mutations detected are “passengers”. There is considerable variation in the number and pattern of these mutations between individual cancers, indicating substantial diversity of processes of molecular evolution between cancers. The imprints of exogenous mutagenic exposures, mutagenic treatment regimes and DNA repair defects can all be seen in the distinctive mutational signatures of individual cancers. This systematic mutation screen and others have previously yielded a number of cancer genes that are frequently mutated in one or more cancer types and which are now anticancer drug targets (for example BRAF, PIK3CA, and EGFR). However, detailed analyses of the data from our screen additionally suggest that there exist a large number of additional “driver” mutations which are distributed across a substantial number of genes. It therefore appears that cells may be able to utilise mutations in a large repertoire of potential cancer genes to acquire the neoplastic phenotype. However, many of these genes are employed only infrequently. These findings may have implications for future anticancer drug development.

2,737 citations

Journal Article•DOI•

Patterns of somatic mutation in human cancer genomes

Christopher Greenman¹, Philip J. Stephens¹, Raffaella Smith¹, Gillian L. Dalgliesh¹, Christopher I. Hunter¹, Graham R. Bignell¹, Helen Davies¹, Jon W. Teague¹, Adam Butler¹, Claire Stevens¹, Sarah Edkins¹, Sarah O’Meara¹, Imre Vastrik², Esther Schmidt², Tim Avis¹, Syd Barthorpe¹, Gurpreet Bhamra¹, Gemma Buck¹, Bhudipa Choudhury¹, Jody Clements¹, Jennifer Cole¹, Ed Dicks¹, Simon A. Forbes¹, Kris Gray¹, Kelly Halliday¹, Rachel Harrison¹, Katy Hills¹, Jon Hinton¹, Andy Jenkinson¹, David T. Jones¹, Andy Menzies¹, Tatiana Mironenko¹, Janet Perry¹, Keiran Raine¹, Dave Richardson¹, Rebecca Shepherd¹, Alexandra Small¹, Calli Tofts¹, Jennifer Varian¹, Tony Webb¹, Sofie West¹, Sara Widaa¹, Andrew D. Yates¹, Daniel P. Cahill³, David N. Louis³, Peter Goldstraw, Andrew G. Nicholson, Francis Brasseur⁴, Leendert H. J. Looijenga⁵, Barbara L. Weber⁶, Yoke Eng Chiew⁷, Anna deFazio⁷, Mel Greaves⁸, Anthony R. Green⁹, Peter J. Campbell¹, Ewan Birney², Douglas F. Easton⁹, Georgia Chenevix-Trench¹⁰, Min-Han Tan¹¹, Sok Kean Khoo¹¹, Bin Tean Teh¹¹, Siu Tsan Yuen¹², Suet Yi Leung¹², Richard Wooster¹, P. Andrew Futreal¹, Michael R. Stratton⁸, Michael R. Stratton¹•Institutions (12)

Wellcome Trust Sanger Institute¹, European Bioinformatics Institute², Harvard University³, Ludwig Institute for Cancer Research⁴, Erasmus University Rotterdam⁵, University of Pennsylvania⁶, University of Sydney⁷, Institute of Cancer Research⁸, University of Cambridge⁹, QIMR Berghofer Medical Research Institute¹⁰, Van Andel Institute¹¹, University of Hong Kong¹²

08 Mar 2007-Nature

TL;DR: More than 1,000 somatic mutations found in 274 megabases of DNA corresponding to the coding exons of 518 protein kinase genes in 210 diverse human cancers reveal the evolutionary diversity of cancers and implicate a larger repertoire of cancer genes than previously anticipated.

Abstract: Cancers arise owing to mutations in a subset of genes that confer growth advantage. The availability of the human genome sequence led us to propose that systematic resequencing of cancer genomes for mutations would lead to the discovery of many additional cancer genes. Here we report more than 1,000 somatic mutations found in 274 megabases (Mb) of DNA corresponding to the coding exons of 518 protein kinase genes in 210 diverse human cancers. There was substantial variation in the number and pattern of mutations in individual cancers reflecting different exposures, DNA repair defects and cellular origins. Most somatic mutations are likely to be 'passengers' that do not contribute to oncogenesis. However, there was evidence for 'driver' mutations contributing to the development of the cancers studied in approximately 120 genes. Systematic sequencing of cancer genomes therefore reveals the evolutionary diversity of cancers and implicates a larger repertoire of cancer genes than previously anticipated.
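
The headline numbers in this abstract lend themselves to a quick back-of-the-envelope check, sketched here in Python. The figure of 1,000 mutations is treated as a lower bound because the abstract says "more than 1,000"; the ~1.3 Mb per sample figure comes from the conference abstract of the same title listed on this page. Variable and function names are illustrative.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# 1,000 is a lower bound ("more than 1,000 somatic mutations").

def mutations_per_mb(n_mutations, megabases):
    """Average number of mutations per megabase screened."""
    return n_mutations / megabases


# Lower bound on the average somatic mutation density across the screen.
density_lower_bound = mutations_per_mb(1000, 274)

# Cross-check of the total DNA screened: ~1.3 Mb per sample x 210 cancers.
screened_mb_estimate = 1.3 * 210

if __name__ == "__main__":
    print(f"at least {density_lower_bound:.2f} mutations per Mb on average")
    print(f"about {screened_mb_estimate:.0f} Mb screened in total")
```

The estimate of about 273 Mb agrees with the 274 Mb reported. The average density hides the substantial variation between individual cancers that the abstract emphasises.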

2,732 citations

Journal Article•DOI•

A genome-wide transgenic RNAi library for conditional gene inactivation in Drosophila

Georg Dietzl¹, Doris Chen¹, Frank Schnorrer², Kuan-Chung Su¹, Yulia Barinova¹, Michaela Fellner¹, Michaela Fellner², Beate Gasser¹, Kaolin Kinsey¹, Kaolin Kinsey², Silvia Oppel², Silvia Oppel¹, Susanne Scheiblauer¹, Africa Couto², Vincent Marra¹, Krystyna Keleman¹, Krystyna Keleman², Barry J. Dickson², Barry J. Dickson¹•Institutions (2)

Austrian Academy of Sciences¹, Research Institute of Molecular Pathology²

12 Jul 2007-Nature

TL;DR: Reports the generation and validation of a genome-wide library of Drosophila melanogaster RNAi transgenes, enabling the conditional inactivation of gene function in specific tissues of the intact organism and opening up the prospect of systematically analysing gene functions in any tissue and at any stage of the Drosophila lifespan.

Abstract: Forward genetic screens in model organisms have provided important insights into numerous aspects of development, physiology and pathology. With the availability of complete genome sequences and the introduction of RNA-mediated gene interference (RNAi), systematic reverse genetic screens are now also possible. Until now, such genome-wide RNAi screens have mostly been restricted to cultured cells and ubiquitous gene inactivation in Caenorhabditis elegans. This powerful approach has not yet been applied in a tissue-specific manner. Here we report the generation and validation of a genome-wide library of Drosophila melanogaster RNAi transgenes, enabling the conditional inactivation of gene function in specific tissues of the intact organism. Our RNAi transgenes consist of short gene fragments cloned as inverted repeats and expressed using the binary GAL4/UAS system. We generated 22,270 transgenic lines, covering 88% of the predicted protein-coding genes in the Drosophila genome. Molecular and phenotypic assays indicate that the majority of these transgenes are functional. Our transgenic RNAi library thus opens up the prospect of systematically analysing gene functions in any tissue and at any stage of the Drosophila lifespan.

2,721 citations

Journal Article•DOI•

The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions

Sabeeha S. Merchant¹, Simon E. Prochnik², Olivier Vallon³, Elizabeth H. Harris⁴, Steven J. Karpowicz¹, George B. Witman⁵, Astrid Terry², Asaf Salamov², Lillian K. Fritz-Laylin⁶, Laurence Maréchal-Drouard⁷, Wallace F. Marshall⁸, Liang-Hu Qu⁹, David R. Nelson¹⁰, Anton A. Sanderfoot¹¹, Martin H. Spalding¹², Vladimir V. Kapitonov¹³, Qinghu Ren, Patrick J. Ferris¹⁴, Erika Lindquist², Harris Shapiro², Susan Lucas², Jane Grimwood¹⁵, Jeremy Schmutz¹⁵, Pierre Cardol³, Pierre Cardol¹⁶, Heriberto Cerutti¹⁷, Guillaume Chanfreau¹, Chun-Long Chen⁹, Valérie Cognat⁷, Martin T. Croft¹⁸, Rachel M. Dent⁶, Susan K. Dutcher¹⁹, Emilio Fernández²⁰, Hideya Fukuzawa²¹, David González-Ballester²², Diego González-Halphen²³, Armin Hallmann, Marc Hanikenne¹⁶, Michael Hippler²⁴, William Inwood⁶, Kamel Jabbari²⁵, Ming Kalanon²⁶, Richard Kuras³, Paul A. Lefebvre¹¹, Stéphane D. Lemaire²⁷, Alexey V. Lobanov¹⁷, Martin Lohr²⁸, Andrea L Manuell²⁹, Iris Meier³⁰, Laurens Mets³¹, Maria Mittag³², Telsa M. Mittelmeier³³, James V. Moroney³⁴, Jeffrey L. Moseley²², Carolyn A. Napoli³³, Aurora M. Nedelcu³⁵, Krishna K. Niyogi⁶, Sergey V. Novoselov¹⁷, Ian T. Paulsen, Greg Pazour⁵, Saul Purton³⁶, Jean-Philippe Ral⁷, Diego Mauricio Riaño-Pachón³⁷, Wayne R. Riekhof, Linda A. Rymarquis³⁸, Michael Schroda, David B. Stern³⁹, James G. Umen¹⁴, Robert D. Willows⁴⁰, Nedra F. Wilson⁴¹, Sara L. Zimmer³⁹, Jens Allmer⁴², Janneke Balk¹⁸, Katerina Bisova⁴³, Chong-Jian Chen⁹, Marek Eliáš⁴⁴, Karla C Gendler³³, Charles R. Hauser⁴⁵, Mary Rose Lamb⁴⁶, Heidi K. Ledford⁶, Joanne C. Long¹, Jun Minagawa⁴⁷, M. Dudley Page¹, Junmin Pan⁴⁸, Wirulda Pootakham²², Sanja Roje⁴⁹, Annkatrin Rose⁵⁰, Eric Stahlberg³⁰, Aimee M. Terauchi¹, Pinfen Yang⁵¹, Steven G. Ball⁷, Chris Bowler²⁵, Carol L. Dieckmann³³, Vadim N. Gladyshev¹⁷, Pamela J. Green³⁸, Richard A. Jorgensen³³, Stephen P. Mayfield²⁹, Bernd Mueller-Roeber³⁷, Sathish Rajamani³⁰, Richard T. Sayre³⁰, Peter Brokstein², Inna Dubchak², David Goodstein², Leila Hornick², Y. Wayne Huang², Jinal Jhaveri², Yigong Luo², Diego Martinez², Wing Chi Abby Ngau², Bobby Otillar², Alexander Poliakov², Aaron Porter², Lukasz Szajkowski², Gregory Werner², Kemin Zhou², Igor V. Grigoriev², Daniel S. Rokhsar², Daniel S. Rokhsar⁶, Arthur R. Grossman²²•Institutions (51)

University of California, Los Angeles¹, United States Department of Energy², University of Paris³, Duke University⁴, University of Massachusetts Medical School⁵, University of California, Berkeley⁶, Centre national de la recherche scientifique⁷, University of California, San Francisco⁸, Sun Yat-sen University⁹, University of Tennessee Health Science Center¹⁰, University of Minnesota¹¹, Iowa State University¹², Genetic Information Research Institute¹³, Salk Institute for Biological Studies¹⁴, Stanford University¹⁵, University of Liège¹⁶, University of Nebraska–Lincoln¹⁷, University of Cambridge¹⁸, Washington University in St. Louis¹⁹, University of Córdoba (Spain)²⁰, Kyoto University²¹, Carnegie Institution for Science²², National Autonomous University of Mexico²³, University of Münster²⁴, École Normale Supérieure²⁵, University of Melbourne²⁶, University of Paris-Sud²⁷, University of Mainz²⁸, Scripps Research Institute²⁹, Ohio State University³⁰, University of Chicago³¹, University of Jena³², University of Arizona³³, Louisiana State University³⁴, University of New Brunswick³⁵, University College London³⁶, University of Potsdam³⁷, Delaware Biotechnology Institute³⁸, Boyce Thompson Institute for Plant Research³⁹, Macquarie University⁴⁰, Oklahoma State University Center for Health Sciences⁴¹, İzmir University of Economics⁴², Academy of Sciences of the Czech Republic⁴³, Charles University in Prague⁴⁴, St. Edward's University⁴⁵, University of Puget Sound⁴⁶, Hokkaido University⁴⁷, Tsinghua University⁴⁸, Washington State University⁴⁹, Appalachian State University⁵⁰, Marquette University⁵¹

12 Oct 2007-Science

TL;DR: Analyses of the Chlamydomonas genome advance the understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella.

Abstract: Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the approximately 120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella.

2,554 citations

Journal Article•DOI•

Evolution of genes and genomes on the Drosophila phylogeny.

Andrew G. Clark¹, Michael B. Eisen², Michael B. Eisen³, Douglas Smith +426 more•Institutions (70)

08 Nov 2007-Nature

TL;DR: These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution.

Abstract: Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.

Journal Article•DOI•

The mammalian epigenome.

Bradley E. Bernstein¹, Bradley E. Bernstein², Alexander Meissner³, Eric S. Lander³, Eric S. Lander¹•Institutions (3)

Broad Institute¹, Harvard University², Massachusetts Institute of Technology³

23 Feb 2007-Cell

TL;DR: Current research efforts are reviewed, with an emphasis on large-scale studies, emerging technologies, and challenges ahead.

Journal Article•DOI•

CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes.

Genís Parra¹, Keith Bradnam¹, Ian F Korf¹•Institutions (1)

University of California, Davis¹

01 May 2007-Bioinformatics

TL;DR: This study reports a computational method, CEGMA (Core Eukaryotic Genes Mapping Approach), for building a highly reliable set of gene annotations in the absence of experimental data, and defines a set of conserved protein families that occur in a wide range of eukaryotes and presents a mapping procedure that accurately identifies their exon-intron structures in a novel genomic sequence.

Abstract: Motivation: The numbers of finished and ongoing genome projects are increasing at a rapid rate, and providing the catalog of genes for these new genomes is a key challenge. Obtaining a set of well-characterized genes is a basic requirement in the initial steps of any genome annotation process. An accurate set of genes is needed in order to learn about species-specific properties, to train gene-finding programs, and to validate automatic predictions. Unfortunately, many new genome projects lack comprehensive experimental data to derive a reliable initial set of genes. Results: In this study, we report a computational method, CEGMA (Core Eukaryotic Genes Mapping Approach), for building a highly reliable set of gene annotations in the absence of experimental data. We define a set of conserved protein families that occur in a wide range of eukaryotes, and present a mapping procedure that accurately identifies their exon-intron structures in a novel genomic sequence. CEGMA includes the use of profile-hidden Markov models to ensure the reliability of the gene structures. Our procedure allows one to build an initial set of reliable gene annotations in potentially any eukaryotic genome, even those in draft stages. Availability: Software and data sets are available online at http://korflab.ucdavis.edu/Datasets.

Journal Article•DOI•

The Diploid Genome Sequence of an Individual Human

Samuel Levy¹, Granger G. Sutton¹, Pauline C. Ng¹, Lars Feuk², Aaron L. Halpern¹, Brian P. Walenz¹, Nelson Axelrod¹, Jiaqi Huang¹, Ewen F. Kirkness¹, Gennady Denisov¹, Yuan Lin¹, Jeffrey R. MacDonald², Andy Wing Chun Pang², Mary Shago², Timothy B. Stockwell¹, Alexia Tsiamouri¹, Vineet Bafna³, Vikas Bansal³, Saul A. Kravitz¹, Dana A. Busam¹, Karen Beeson¹, Tina C McIntosh¹, Karin A. Remington¹, Josep F. Abril⁴, John Gill¹, Jon Borman¹, Yu-Hui Rogers¹, Marvin Frazier¹, Stephen W. Scherer², Robert L. Strausberg¹, J. Craig Venter¹•Institutions (4)

J. Craig Venter Institute¹, University of Toronto², University of California, San Diego³, University of Barcelona⁴

04 Sep 2007-PLOS Biology

TL;DR: A modified version of the Celera assembler is developed to facilitate the identification and comparison of alternate alleles within this individual diploid genome, and a novel haplotype assembly strategy spans 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome.

Abstract: Presented here is a genome sequence of an individual human. It was produced from ∼32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels)(1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor, however they involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
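
The variant counts quoted in this abstract can be tallied to check the statement that non-SNP variation accounts for 22% of all events. The Python sketch below uses only the categories for which the abstract gives explicit counts; segmental duplications and copy number variation regions are left out because no numbers are given for them. The dictionary keys are illustrative labels.

```python
# Tally of the variant classes with explicit counts in the abstract.
# Segmental duplications and copy number variation regions are omitted
# because the abstract gives no counts for them.
variant_counts = {
    "SNP": 3_213_401,
    "block substitution": 53_823,
    "heterozygous indel": 292_102,
    "homozygous indel": 559_473,
    "inversion": 90,
}

total_events = sum(variant_counts.values())
non_snp_events = total_events - variant_counts["SNP"]
non_snp_fraction = non_snp_events / total_events

if __name__ == "__main__":
    print(f"total events: {total_events:,}")
    print(f"non-SNP events: {non_snp_events:,} ({non_snp_fraction:.1%})")
```

The tally gives 4,118,889 events, in line with "more than 4.1 million DNA variants", and a non-SNP share that rounds to the 22% stated in the abstract.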

Journal Article•DOI•

Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: the tortoise and the hare III

Joey Shaw¹, Edgar B. Lickey, Edward E. Schilling, Randall L. Small•Institutions (1)

University of Tennessee at Chattanooga¹

01 Mar 2007-American Journal of Botany

TL;DR: Nine newly explored regions of the chloroplast genome offer levels of variation better than the best regions identified in an earlier study and are therefore likely to be the best choices for molecular studies at low taxonomic levels.

Abstract: Although the chloroplast genome contains many noncoding regions, relatively few have been exploited for interspecific phylogenetic and intraspecific phylogeographic studies. In our recent evaluation of the phylogenetic utility of 21 noncoding chloroplast regions, we found the most widely used noncoding regions are among the least variable, but the more variable regions have rarely been employed. That study led us to conclude that there may be unexplored regions of the chloroplast genome that have even higher relative levels of variability. To explore the potential variability of previously unexplored regions, we compared three pairs of single-copy chloroplast genome sequences in three disparate angiosperm lineages: Atropa vs. Nicotiana (asterids); Lotus vs. Medicago (rosids); and Saccharum vs. Oryza (monocots). These three separate sequence alignments highlighted 13 mutational hotspots that may be more variable than the best regions of our former study. These 13 regions were then selected for a more detailed analysis. Here we show that nine of these newly explored regions (rpl32-trnL(UAG), trnQ(UUG)-5'rps16, 3'trnV(UAC)-ndhC, ndhF-rpl32, psbD-trnT(GGU), psbJ-petA, 3'rps16-5'trnK(UUU), atpI-atpH, and petL-psbE) offer levels of variation better than the best regions identified in our earlier study and are therefore likely to be the best choices for molecular studies at low taxonomic levels.

Journal Article•DOI•

Transposable elements and the epigenetic regulation of the genome.

R. Keith Slotkin¹, Robert A. Martienssen¹•Institutions (1)

Cold Spring Harbor Laboratory¹

01 Apr 2007-Nature Reviews Genetics

TL;DR: New insights have been gained into how silencing in eukaryotic cells has been co-opted to serve essential functions in 'host' cells, highlighting the importance of TEs in the epigenetic regulation of the genome.

Abstract: Overlapping epigenetic mechanisms have evolved in eukaryotic cells to silence the expression and mobility of transposable elements (TEs). Owing to their ability to recruit the silencing machinery, TEs have served as building blocks for epigenetic phenomena, both at the level of single genes and across larger chromosomal regions. Important progress has been made recently in understanding these silencing mechanisms. In addition, new insights have been gained into how this silencing has been co-opted to serve essential functions in 'host' cells, highlighting the importance of TEs in the epigenetic regulation of the genome.

Journal Article•DOI•

Sea Anemone Genome Reveals Ancestral Eumetazoan Gene Repertoire and Genomic Organization

Nicholas H. Putnam¹, Mansi Srivastava², Uffe Hellsten¹, Bill Dirks², Jarrod Chapman¹, Asaf Salamov¹, Astrid Terry¹, Harris Shapiro¹, Erika Lindquist¹, Vladimir V. Kapitonov³, Jerzy Jurka³, Grigory Genikhovich⁴, Igor V. Grigoriev¹, Susan Lucas¹, Robert Steele⁵, John R. Finnerty⁶, Ulrich Technau⁴, Mark Q. Martindale⁷, Daniel S. Rokhsar¹,²•Institutions (7)

Joint Genome Institute¹, University of California, Berkeley², Genetic Information Research Institute³, University of Bergen⁴, University of California, Irvine⁵, Boston University⁶, University of Hawaii⁷

06 Jul 2007-Science

TL;DR: A comparative analysis of the draft genome of an emerging cnidarian model, the starlet sea anemone Nematostella vectensis, suggests that gene “inventions” along the lineage leading to animals were likely already well integrated with preexisting eukaryotic genes in the eumetazoan progenitor.

Abstract: Sea anemones are seemingly primitive animals that, along with corals, jellyfish, and hydras, constitute the oldest eumetazoan phylum, the Cnidaria. Here, we report a comparative analysis of the draft genome of an emerging cnidarian model, the starlet sea anemone Nematostella vectensis. The sea anemone genome is complex, with a gene repertoire, exon-intron structure, and large-scale gene linkage more similar to vertebrates than to flies or nematodes, implying that the genome of the eumetazoan ancestor was similarly complex. Nearly one-fifth of the inferred genes of the ancestor are eumetazoan novelties, which are enriched for animal functions like cell signaling, adhesion, and synaptic transmission. Analysis of diverse pathways suggests that these gene "inventions" along the lineage leading to animals were likely already well integrated with preexisting eukaryotic genes in the eumetazoan progenitor.

Journal Article•DOI•

Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing

Gordon Robertson, Martin Hirst, Matthew N. Bainbridge, Misha Bilenky, Yongjun Zhao, Thomas Zeng, Ghia Euskirchen¹, Bridget Bernier, Richard Varhol, Allen Delaney, Nina Thiessen, Obi L. Griffith, A He, Marco A. Marra, Michael Snyder¹, Steven J.M. Jones•Institutions (1)

Yale University¹

11 Jun 2007-Nature Methods

TL;DR: ChIP-seq identified 41,582 and 11,004 putative STAT1-binding regions in stimulated and unstimulated cells, respectively, recovered 24 of the 34 loci known to contain STAT1 interferon-responsive binding sites, and yielded targets enriched in sequences similar to known STAT1 binding motifs.

Abstract: We developed a method, ChIP-sequencing (ChIP-seq), combining chromatin immunoprecipitation (ChIP) and massively parallel sequencing to identify mammalian DNA sequences bound by transcription factors in vivo. We used ChIP-seq to map STAT1 targets in interferon-γ (IFN-γ)–stimulated and unstimulated human HeLa S3 cells, and compared the method's performance to ChIP-PCR and to ChIP-chip for four chromosomes. By ChIP-seq, using 15.1 and 12.9 million uniquely mapped sequence reads, and an estimated false discovery rate of less than 0.001, we identified 41,582 and 11,004 putative STAT1-binding regions in stimulated and unstimulated cells, respectively. Of the 34 loci known to contain STAT1 interferon-responsive binding sites, ChIP-seq found 24 (71%). ChIP-seq targets were enriched in sequences similar to known STAT1 binding motifs. Comparisons with two ChIP-PCR data sets suggested that ChIP-seq sensitivity was between 70% and 92% and specificity was at least 95%.
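
The recovery rate and the sensitivity and specificity figures quoted above follow the usual confusion-matrix definitions. The sketch below only illustrates those definitions and applies them to the one pair of counts the abstract states (24 of 34 known loci). The function names are hypothetical, and the counts behind the ChIP-PCR comparisons are not given in the abstract, so they are not reproduced here.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real binding sites that the assay recovered."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-sites that the assay correctly left uncalled."""
    return true_negatives / (true_negatives + false_positives)


# Counts stated in the abstract: 24 of 34 known loci were found.
known_loci = 34
recovered = 24
recovery_rate = sensitivity(recovered, known_loci - recovered)

print(f"recovered {recovered}/{known_loci} known loci = {recovery_rate:.0%}")
```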

Journal Article•DOI•

Evolutionary and biomedical insights from the rhesus macaque genome

Richard A. Gibbs¹, Jeffrey Rogers², Michael G. Katze³, Roger E. Bumgarner³ +174 more•Institutions (28)

13 Apr 2007-Science

TL;DR: The genome sequence of an Indian-origin Macaca mulatta female is determined and compared with chimpanzees and humans to reveal the structure of ancestral primate genomes and to identify evidence for positive selection and lineage-specific expansions and contractions of gene families.

Abstract: The rhesus macaque (Macaca mulatta) is an abundant primate species that diverged from the ancestors of Homo sapiens about 25 million years ago. Because they are genetically and physiologically similar to humans, rhesus monkeys are the most widely used nonhuman primate in basic and applied biomedical research. We determined the genome sequence of an Indian-origin Macaca mulatta female and compared the data with chimpanzees and humans to reveal the structure of ancestral primate genomes and to identify evidence for positive selection and lineage-specific expansions and contractions of gene families. A comparison of sequences from individual animals was used to investigate their underlying genetic diversity. The complete description of the macaque genome blueprint enhances the utility of this animal model for biomedical research and improves our understanding of the basic biology of the species.

Journal Article•DOI•

Paired-end mapping reveals extensive structural variation in the human genome.

19 Oct 2007-Science

TL;DR: High-throughput and massive paired-end mapping (PEM) was used to map SVs in an African and in a putatively European individual and identified shared and divergent SVs relative to the reference genome, documenting that the number of SVs among humans is much larger than initially hypothesized; many of the SVs potentially affect gene function.

Abstract: Structural variation of the genome involves kilobase- to megabase-sized deletions, duplications, insertions, inversions, and complex combinations of rearrangements. We introduce high-throughput and massive paired-end mapping (PEM), a large-scale genome-sequencing method to identify structural variants (SVs) ∼3 kilobases (kb) or larger that combines the rescue and capture of paired ends of 3-kb fragments, massive 454 sequencing, and a computational approach to map DNA reads onto a reference genome. PEM was used to map SVs in an African and in a putatively European individual and identified shared and divergent SVs relative to the reference genome. Overall, we fine-mapped more than 1000 SVs and documented that the number of SVs among humans is much larger than initially hypothesized; many of the SVs potentially affect gene function. The breakpoint junction sequences of more than 200 SVs were determined with a novel pooling strategy and computational analysis. Our analysis provided insights into the mechanisms of SV formation in humans.
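
The idea behind paired-end mapping can be sketched in a few lines: the two ends of a fragment of known size should map to the reference at roughly that distance and in a consistent orientation, and pairs that do not are candidate structural variants. The sketch below is a simplified illustration, not the authors' pipeline. The `MappedPair` type, the 3,000 bp expected span (taken from the 3-kb fragments mentioned above) and the 1,000 bp tolerance are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class MappedPair:
    left_pos: int                  # reference coordinate of the left end
    right_pos: int                 # reference coordinate of the right end
    concordant_orientation: bool   # True if the ends face each other as expected


def classify_pair(pair, expected_span=3000, tolerance=1000):
    """Label one mapped pair by comparing its reference span to the fragment size.

    expected_span and tolerance are illustrative values, not published cutoffs.
    """
    if not pair.concordant_orientation:
        return "inversion candidate"
    span = pair.right_pos - pair.left_pos
    if span > expected_span + tolerance:
        # Ends map too far apart: the sample lacks sequence present in the reference.
        return "deletion candidate"
    if span < expected_span - tolerance:
        # Ends map too close: the sample carries sequence absent from the reference.
        return "insertion candidate"
    return "concordant"


pairs = [
    MappedPair(10_000, 13_100, True),
    MappedPair(50_000, 58_000, True),
    MappedPair(90_000, 91_200, True),
    MappedPair(120_000, 123_000, False),
]
labels = [classify_pair(p) for p in pairs]
print(labels)
```

A real analysis would require several independent pairs to support each call before reporting it. This sketch labels single pairs only.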

Journal Article•DOI•

G-quadruplexes in promoters throughout the human genome

Julian L. Huppert¹, Shankar Balasubramanian¹•Institutions (1)

University of Cambridge¹

01 Jan 2007-Nucleic Acids Research

TL;DR: It is shown that the promoter regions (1 kb upstream of the transcription start site, TSS) of genes are significantly enriched in quadruplex motifs relative to the rest of the genome, with >40% of human gene promoters containing one or more quadruplex motifs.

Abstract: Certain G-rich DNA sequences readily form four-stranded structures called G-quadruplexes. These sequence motifs are located in telomeres as a repeated unit, and elsewhere in the genome, where their function is currently unknown. It has been proposed that G-quadruplexes may be directly involved in gene regulation at the level of transcription. In support of this hypothesis, we show that the promoter regions (1 kb upstream of the transcription start site, TSS) of genes are significantly enriched in quadruplex motifs relative to the rest of the genome, with >40% of human gene promoters containing one or more quadruplex motifs. Furthermore, these promoter quadruplexes strongly associate with nuclease hypersensitive sites identified throughout the genome via biochemical measurement. Regions of the human genome that are both nuclease hypersensitive and within promoters show a remarkable (230-fold) enrichment of quadruplex elements, compared to the rest of the genome. These quadruplex motifs identified in promoter regions also show an interesting structural bias towards more stable forms. These observations support the proposal that promoter G-quadruplexes are directly involved in the regulation of gene expression.
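
A promoter scan of the kind described above can be illustrated with a regular expression. The abstract does not state which motif definition was used, so the pattern below is an assumption: a commonly used folding rule of four runs of at least three guanines separated by loops of one to seven bases. The function names are hypothetical, and only the given strand is searched (a quadruplex on the opposite strand would appear as runs of C).

```python
import re

# Assumed motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (not taken from the abstract).
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start, end) spans of non-overlapping putative quadruplex motifs."""
    seq = sequence.upper()
    return [(m.start(), m.end()) for m in G4_PATTERN.finditer(seq)]


def promoter_has_g4(promoter_seq):
    """True if the promoter sequence contains at least one putative motif."""
    return bool(find_g4_motifs(promoter_seq))


example_promoter = "TTGGGAGGGTGGGAGGGTT"
print(find_g4_motifs(example_promoter))
print(promoter_has_g4("ATATATATAT"))
```

Applying `promoter_has_g4` to the 1 kb upstream of each annotated TSS and counting the hits would give a figure comparable in spirit to the >40% reported, although the exact number depends on the motif definition and gene set.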

Journal Article•DOI•

Beyond the sequence: Cellular organization of genome function

Tom Misteli¹•Institutions (1)

National Institutes of Health¹

23 Feb 2007-Cell

TL;DR: The functional relevance of spatial and temporal genome organization is discussed at three hierarchical levels: the organization of nuclear processes, the higher-order organization of the chromatin fiber, and the spatial arrangement of genomes within the cell nucleus.

Journal Article•DOI•

The TIGR Rice Genome Annotation Resource: Improvements and New Features

Shu Ouyang, Wei Zhu, John A. Hamilton, Haining Lin, Matthew Campbell, Kevin L. Childs, Françoise Thibaud-Nissen, Renae L. Malek, Yuandan Lee, Li Zheng, Joshua Orvis, Brian J. Haas, Jennifer R. Wortman, C. Robin Buell

01 Jan 2007-Nucleic Acids Research

TL;DR: Through incorporation of multiple transcript and proteomic expression data sets, the Institute for Genomic Research has been able to annotate 24,799 genes (31,739 gene models), representing ∼50% of the total gene models, as expressed in the rice genome.

Abstract: In The Institute for Genomic Research Rice Genome Annotation project (http://rice.tigr.org), we have continued to update the rice genome sequence with new data and improve the quality of the annotation. In our current release of annotation (Release 4.0; January 12, 2006), we have identified 42,653 non-transposable element-related genes encoding 49,472 gene models as a result of the detection of alternative splicing. We have refined our identification methods for transposable element-related genes resulting in 13,237 genes that are related to transposable elements. Through incorporation of multiple transcript and proteomic expression data sets, we have been able to annotate 24,799 genes (31,739 gene models), representing approximately 50% of the total gene models, as expressed in the rice genome. All structural and functional annotation is viewable through our Rice Genome Browser which currently supports 59 tracks. Enhanced data access is available through web interfaces, FTP downloads and a Data Extractor tool developed in order to support discrete dataset downloads.

Journal Article•DOI•

Genome sequence of Aedes aegypti, a major arbovirus vector

Vishvanath Nene¹, Jennifer R. Wortman¹, Daniel Lawson, Brian J. Haas¹, Chinnappa D. Kodira², Zhijian Jake Tu³, Brendan J. Loftus, Zhiyong Xi⁴, Karyn Megy, Manfred Grabherr², Quinghu Ren¹, Evgeny M. Zdobnov, Neil F. Lobo⁵, Kathryn S. Campbell⁶, Susan E. Brown⁷, Maria de Fatima Bonaldo⁸, Jingsong Zhu⁹, Steven P. Sinkins¹⁰, David G. Hogenkamp¹¹, Paolo Amedeo¹, Peter Arensburger⁹, Peter W. Atkinson⁹, Shelby L. Bidwell¹, Jim Biedler³, Ewan Birney, Robert V. Bruggner⁵, Javier Costas, Monique R. Coy³, Jonathan Crabtree¹, Matt Crawford², Becky deBruyn⁵, David DeCaprio², Karin Eiglmeier¹², Eric Eisenstadt¹, Hamza El-Dorry¹³, William M. Gelbart⁶, Suely Lopes Gomes¹³, Martin Hammond, Linda Hannick¹, James R. Hogan⁵, Michael H. Holmes¹, David M. Jaffe², J. Spencer Johnston, Ryan C. Kennedy⁵, Hean Koo¹, Saul A. Kravitz, Evgenia V. Kriventseva¹⁴, David Kulp¹⁵, Kurt LaButti², Eduardo Lee¹, Song Li³, Diane D. Lovin⁵, Chunhong Mao³, Evan Mauceli², Carlos Frederico Martins Menck¹³, Jason R. Miller¹, Philip Montgomery², Akio Mori⁵, Ana L. T. O. Nascimento¹⁶, Horacio Naveira¹⁷, Chad Nusbaum², Sinéad B. O'Leary², Joshua Orvis¹, Mihaela Pertea, Hadi Quesneville, Kyanne R. Reidenbach¹¹, Yu-Hui Rogers, Charles Roth¹², Jennifer R. Schneider⁵, Michael C. Schatz, Martin Shumway¹, Mario Stanke, Eric O. Stinson⁵, Jose M. C. Tubio, Janice P. Vanzee¹¹, Sergio Verjovski-Almeida¹³, Doreen Werner¹⁸, Owen White¹, Stefan Wyder¹⁴, Qiandong Zeng², Qi Zhao¹, Yongmei Zhao¹, Catherine A. Hill¹¹, Alexander S. Raikhel⁹, Marcelo B. Soares⁸, Dennis L. Knudson⁷, Norman H. Lee, James E. Galagan², Steven L. Salzberg, Ian T. Paulsen¹, George Dimopoulos⁴, Frank H. Collins⁵, Bruce W. Birren², Claire M. Fraser-Liggett, David W. Severson⁵•Institutions (18)

J. Craig Venter Institute¹, Broad Institute², Virginia Tech³, Johns Hopkins University⁴, University of Notre Dame⁵, Harvard University⁶, Colorado State University⁷, Northwestern University⁸, University of California, Riverside⁹, University of Oxford¹⁰, Purdue University¹¹, Pasteur Institute¹², University of São Paulo¹³, University of Geneva¹⁴, University of Massachusetts Amherst¹⁵, Instituto Butantan¹⁶, University of A Coruña¹⁷, University of Göttingen¹⁸

22 Jun 2007-Science

TL;DR: A draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, is presented; at approximately 1376 million base pairs, it is about 5 times the size of the genome of the malaria vector Anopheles gambiae.

Abstract: We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at approximately 1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of approximately 4 to 6 increase in average gene length and in sizes of intergenic regions relative to An. gambiae and Drosophila melanogaster. Nonetheless, chromosomal synteny is generally maintained among all three insects, although conservation of orthologous gene order is higher (by a factor of approximately 2) between the mosquito species than between either of them and the fruit fly. An increase in genes encoding odorant binding, cytochrome P450, and cuticle domains relative to An. gambiae suggests that members of these protein families underpin some of the biological differences between the two mosquito species.

Journal Article•DOI•

The UCSC genome browser database: update 2007

01 Jan 2007-Nucleic Acids Research

TL;DR: The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up to a full chromosome and includes assembly data, genes and gene predictions, mRNA and EST alignments, and comparative genomics, regulation, expression and variation data.

Abstract: The University of California, Santa Cruz Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up to a full chromosome and includes assembly data, genes and gene predictions, mRNA and EST alignments, and comparative genomics, regulation, expression and variation data. The database is optimized for fast interactive performance with web tools that provide powerful visualization and querying capabilities for mining the data. In the past year, 22 new assemblies and several new sets of human variation annotation have been released. New features include VisiGene, a fully integrated in situ hybridization image browser; phyloGif, for drawing evolutionary tree diagrams; a redesigned Custom Track feature; an expanded SNP annotation track; and many new display options. The Genome Browser, other tools, downloadable data files and links to documentation and other information can be found at http://genome.ucsc.edu/.

Journal Article•DOI•

The medaka draft genome and insights into vertebrate genome evolution

Masahiro Kasahara¹, Kiyoshi Naruse¹, Shin Sasaki¹, Yoichiro Nakatani¹, Wei Qu¹, Budrul Ahsan¹, Tomoyuki Yamada¹, Yukinobu Nagayasu¹, Koichiro Doi¹, Yasuhiro Kasai¹, Tomoko Jindo¹, Daisuke Kobayashi¹, Atsuko Shimada¹, Atsushi Toyoda, Yoko Kuroki, Asao Fujiyama², Takashi Sasaki³, Atsushi Shimizu³, Shuichi Asakawa³, Nobuyoshi Shimizu³, Shin-ichi Hashimoto¹, Jun Yang¹, Yongjun Lee¹, Kouji Matsushima¹, Sumio Sugano¹, Mitsuru Sakaizumi⁴, Takanori Narita¹,⁵, Kazuko Ohishi⁵, Shinobu Haga⁵, Fumiko Ohta⁵, Hisayo Nomoto⁵, Keiko Nogata⁵, Tomomi Morishita⁵, Tomoko Endo⁵, Tadasu Shin-I⁵, Hiroyuki Takeda¹, Shinichi Morishita¹, Yuji Kohara⁵•Institutions (5)

University of Tokyo¹, National Institute of Informatics², Keio University³, Niigata University⁴, National Institute of Genetics⁵

07 Jun 2007-Nature

TL;DR: A high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka, revealed that eight major interchromosomal rearrangements took place in a remarkably short period of ∼50 Myr after the whole-genome duplication event in the teleost ancestor and afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.

Abstract: The medaka fish (Oryzias latipes) is a popular pet in Japan and more recently a laboratory model organism for developmental genetics and evolutionary biology. Now the medaka's genome has been sequenced and analysed by a large Japanese consortium. Cichlids and stickleback, which are emerging model systems for understanding the genetic basis of vertebrate speciation, are evolutionarily closer to medaka than zebrafish, so the medaka's genome sequence will yield valuable insights into 400 million years of vertebrate genome evolution. Teleosts comprise more than half of all vertebrate species and have adapted to a variety of marine and freshwater habitats. Their genome evolution and diversification are important subjects for the understanding of vertebrate evolution. Although draft genome sequences of two pufferfishes have been published, analysis of more fish genomes is desirable. Here we report a high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka (Oryzias latipes). Medaka is native to East Asia and an excellent model system for a wide range of biology, including ecotoxicology, carcinogenesis, sex determination and developmental genetics. In the assembled medaka genome (700 megabases), which is less than half of the zebrafish genome, we predicted 20,141 genes, including ∼2,900 new genes, using 5′-end serial analysis of gene expression tag information. We found single nucleotide polymorphisms (SNPs) at an average rate of 3.42% between the two inbred strains derived from two regional populations; this is the highest SNP rate seen in any vertebrate species. Analyses based on the dense SNP information show a strict genetic separation of 4 million years (Myr) between the two populations, and suggest that differential selective pressures acted on specific gene categories. Four-way comparisons with the human, pufferfish (Tetraodon), zebrafish and medaka genomes revealed that eight major interchromosomal rearrangements took place in a remarkably short period of ∼50 Myr after the whole-genome duplication event in the teleost ancestor and afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.

Book•

The Origins of Genome Architecture

Michael Lynch

01 Jan 2007

TL;DR: The Origin of Eukaryotes; Genome Size and Organismal Complexity; The Human Genome; Why Population Size Matters; Three Keys to Chromosomal Integrity; The Nucleotide-composition Landscape; Mobile Genetic Elements; Genomic Expansion by Gene Duplication; Genes in Pieces; Transcription and Regulatory-region Complexity; Expansion and Contraction of Organelle Genomes

Abstract: The Origin of Eukaryotes; Genome Size and Organismal Complexity; The Human Genome; Why Population Size Matters; Three Keys to Chromosomal Integrity; The Nucleotide-composition Landscape; Mobile Genetic Elements; Genomic Expansion by Gene Duplication; Genes in Pieces; Transcription and Regulatory-region Complexity; Expansion and Contraction of Organelle Genomes; Sex Chromosome Evolution; Genomfart

Journal Article•DOI•

DNA Transposons and the Evolution of Eukaryotic Genomes

Cédric Feschotte¹, Ellen J. Pritham¹•Institutions (1)

University of Texas at Arlington¹

12 Dec 2007-Annual Review of Genetics

TL;DR: This review focuses on DNA-mediated or class 2 transposons and emphasizes how this class of elements is distinguished from other types of mobile elements in terms of their structure, amplification dynamics, and genomic effect.

Abstract: Transposable elements are mobile genetic units that exhibit broad diversity in their structure and transposition mechanisms. Transposable elements occupy a large fraction of many eukaryotic genomes and their movement and accumulation represent a major force shaping the genes and genomes of almost all organisms. This review focuses on DNA-mediated or class 2 transposons and emphasizes how this class of elements is distinguished from other types of mobile elements in terms of their structure, amplification dynamics, and genomic effect. We provide an up-to-date outlook on the diversity and taxonomic distribution of all major types of DNA transposons in eukaryotes, including Helitrons and Mavericks. We discuss some of the evolutionary forces that influence their maintenance and diversification in various genomic environments. Finally, we highlight how the distinctive biological features of DNA transposons have contributed to shape genome architecture and led to the emergence of genetic innovations in different eukaryotic lineages.

Journal Article•DOI•

A High Quality Draft Consensus Sequence of the Genome of a Heterozygous Grapevine Variety

Riccardo Velasco, Andrey Zharkikh¹, Michela Troggio, Dustin Cartwright¹, Alessandro Cestaro, Dmitry Pruss¹, Massimo Pindo, Lisa M. Fitzgerald¹, Silvia Vezzulli, Julia Reid¹, Giulia Malacarne, Diana Iliev¹, G. Coppola, Bryan Wardell¹, Diego Micheletti, Teresita Macalma¹, Marco Facci, J.T. Mitchell¹, Michele Perazzolli, Glenn Eldredge¹, Pamela Gatto, Rozan Oyzerski¹, Marco Moretto, N. Gutin¹, Marco Stefanini, Yang Chen¹, C. Segala, Christine Davenport¹, Lorenzo Dematte, Amy Mraz, Juri Battilana, Keith E. Stormo, Fabrizio Costa, Quanzhou Tao, Azeddine Si-Ammour, Tim Harkins², Angie Lackey², Clotilde Perbost, Bruce E Taillon, Alessandra Stella, Victor V. Solovyev³, Jeffrey A. Fawcett⁴, Lieven Sterck⁴, Klaas Vandepoele⁴, Stella M. Grando, Stefano Toppo, Claudio Moser, Jerry S. Lanchbury¹, Robert Bogden, Mark H. Skolnick¹, Vittorio Sgaramella, Satish Bhatnagar¹, Paolo Fontana, Alexander Gutin¹, Yves Van de Peer⁴, Francesco Salamini, Roberto Viola•Institutions (4)

Myriad Genetics¹, Roche Applied Science², Royal Holloway, University of London³, Ghent University⁴

19 Dec 2007-PLOS ONE

TL;DR: A high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens.

Abstract: Background. Worldwide, grapes and their derived products have a large market. The cultivated grape species Vitis vinifera has potential to become a model for fruit tree genetics. Like many plant species, it is highly heterozygous, which is an additional challenge to modern whole genome shotgun sequencing. In this paper a high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir is presented. Principal Findings. We estimate the genome size of V. vinifera to be 504.6 Mb. Genomic sequences corresponding to 477.1 Mb were assembled in 2,093 metacontigs and 435.1 Mb were anchored to the 19 linkage groups (LGs). The number of predicted genes is 29,585, of which 96.1% were assigned to LGs. This assembly of the grape genome provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens. Single nucleotide polymorphism (SNP) distribution was consistent with a diffuse haplotype structure across the genome. Of around 2,000,000 SNPs, 1,751,176 were mapped to chromosomes and one or more of them were identified in 86.7% of anchored genes. The relative age of grape duplicated genes was estimated, and this made it possible to reveal a relatively recent Vitis-specific large scale duplication event concerning at least 10 chromosomes (duplication not reported before). Conclusions. Sanger shotgun sequencing and highly efficient sequencing by synthesis (SBS), together with dedicated assembly programs, resolved a complex heterozygous genome. A consensus sequence of the genome and a set of mapped marker loci were generated. Homologous chromosomes of Pinot Noir differ by 11.2% of their DNA (hemizygous DNA plus chromosomal gaps). SNP markers are offered as a tool with the potential of introducing a new era in the molecular breeding of grape.

Journal Article•DOI•

Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns.

Robert K. Jansen¹, Zhengqiu Cai¹, Linda A. Raubeson², Henry Daniell³, Claude W. dePamphilis, Jim Leebens-Mack⁴, Kai F. Müller⁵, Mary Guisinger-Bellian¹, Rosemarie C. Haberle¹, Anne K. Hansen¹, Timothy W. Chumley¹, Seung Bum Lee³, Rhiannon M. Peery², Joel R. McNeal⁴, Jennifer V. Kuehl⁶, Jeffrey L. Boore⁶,⁷•Institutions (7)

University of Texas at Austin¹, Central Washington University², University of Central Florida³, University of Georgia⁴, University of Bonn⁵, Lawrence Berkeley National Laboratory⁶, University of California, Berkeley⁷

04 Dec 2007-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: Phylogenetic trees from multiple methods provide strong support for the position of Amborella as the earliest diverging lineage of flowering plants, followed by Nymphaeales and Austrobaileyales. The plastid genome trees also provide strong support for a sister relationship between eudicots and monocots, and this group is sister to a clade that includes Chloranthales and magnoliids.

Abstract: Angiosperms are the largest and most successful clade of land plants with >250,000 species distributed in nearly every terrestrial habitat. Many phylogenetic studies have been based on DNA sequences of one to several genes, but, despite decades of intensive efforts, relationships among early diverging lineages and several of the major clades remain either incompletely resolved or weakly supported. We performed phylogenetic analyses of 81 plastid genes in 64 sequenced genomes, including 13 new genomes, to estimate relationships among the major angiosperm clades, and the resulting trees are used to examine the evolution of gene and intron content. Phylogenetic trees from multiple methods, including model-based approaches, provide strong support for the position of Amborella as the earliest diverging lineage of flowering plants, followed by Nymphaeales and Austrobaileyales. The plastid genome trees also provide strong support for a sister relationship between eudicots and monocots, and this group is sister to a clade that includes Chloranthales and magnoliids. Resolution of relationships among the major clades of angiosperms provides the necessary framework for addressing numerous evolutionary questions regarding the rapid diversification of angiosperms. Gene and intron content are highly conserved among the early diverging angiosperms and basal eudicots, but 62 independent gene and intron losses are limited to the more derived monocot and eudicot clades. Moreover, a lineage-specific correlation was detected between rates of nucleotide substitutions, indels, and genomic rearrangements.
