The African coelacanth genome provides insights into tetrapod evolution
Virginia Mason Medical Center1, University of Washington2, Broad Institute3, Agency for Science, Technology and Research4, University of Konstanz5, Université de Montréal6, University of Oregon7, Federal University of Pará8, Harvard University9, University of Utah10, École normale supérieure de Lyon11, University of Kentucky12, Rhodes University13, University of Trieste14, Wellcome Trust Sanger Institute15, Marche Polytechnic University16, University of Liège17, Victoria University, Australia18, University of Hamburg19, University of South Florida20, University of the Western Cape21, Woods Hole Oceanographic Institution22, University of Oxford23, Leipzig University24, Keio University25, Johns Hopkins University26, University of Tennessee Health Science Center27, Graduate University for Advanced Studies28, National Institute of Genetics29, University of Chicago30, University of Würzburg31, Uppsala University32
TL;DR: A phylogenomic analysis of the African coelacanth genome concludes that the lungfish, and not the coelacanth, is the closest living relative of tetrapods.
Abstract: The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.
Citations
TL;DR: circBase is a database and website where merged and unified data sets of circRNAs, and the evidence supporting their expression, can be accessed, downloaded, and browsed within the genomic context.
Abstract: Recently, several laboratories have reported thousands of circular RNAs (circRNAs) in animals. Numerous circRNAs are highly stable and have specific spatiotemporal expression patterns. Even though a function for circRNAs is unknown, these features make circRNAs an interesting class of RNAs as possible biomarkers and for further research. We developed a database and website, "circBase," where merged and unified data sets of circRNAs and the evidence supporting their expression can be accessed, downloaded, and browsed within the genomic context. circBase also provides scripts to identify known and novel circRNAs in sequencing data. The database is freely accessible through the web server at http://www.circbase.org/.
1,285 citations
TL;DR: The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects and generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets.
Abstract: The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based ann ...
849 citations
Agency for Science, Technology and Research1, National University of Singapore2, Max Planck Society3, University of Maryland, Baltimore4, Hokkaido University5, San Francisco State University6, University of Toronto7, Catalan Institution for Research and Advanced Studies8, Spanish National Research Council9, University of California, Santa Cruz10, Washington University in St. Louis11
TL;DR: The whole-genome analysis of a cartilaginous fish, the elephant shark (Callorhinchus milii), finds that the C. milii genome is the slowest evolving of all known vertebrates, and features extensive synteny conservation with tetrapod genomes, making it a good model for comparative analyses of gnathostome genomes.
Abstract: The emergence of jawed vertebrates (gnathostomes) from jawless vertebrates was accompanied by major morphological and physiological innovations, such as hinged jaws, paired fins and immunoglobulin-based adaptive immunity. Gnathostomes subsequently diverged into two groups, the cartilaginous fishes and the bony vertebrates. Here we report the whole-genome analysis of a cartilaginous fish, the elephant shark (Callorhinchus milii). We find that the C. milii genome is the slowest evolving of all known vertebrates, including the ‘living fossil’ coelacanth, and features extensive synteny conservation with tetrapod genomes, making it a good model for comparative analyses of gnathostome genomes. Our functional studies suggest that the lack of genes encoding secreted calcium-binding phosphoproteins in cartilaginous fishes explains the absence of bone in their endoskeleton. Furthermore, the adaptive immune system of cartilaginous fishes is unusual: it lacks the canonical CD4 co-receptor and most transcription factors, cytokines and cytokine receptors related to the CD4 lineage, despite the presence of polymorphic major histocompatibility complex class II molecules. It thus presents a new model for understanding the origin of adaptive immunity.
Whole-genome analysis of the elephant shark, a cartilaginous fish, shows that it is the slowest evolving of all known vertebrates, lacks critical bone formation genes and has an unusual adaptive immune system.
The elephant shark (Callorhinchus milii) is a cartilaginous fish native to the temperate waters off southern Australia and New Zealand, living at depths of 200 to 500 metres and migrating into shallow waters during spring for breeding. The genome sequence is published in this issue of Nature. Comparison with other vertebrate genomes shows that it is the slowest evolving genome of all known vertebrates, coelacanth included. Genome analysis points to an unusual adaptive immune system lacking the CD4 receptor and some associated cytokines, indicating that cartilaginous fishes possess a primordial gnathostome adaptive immune system. Also absent are genes encoding secreted calcium-binding phosphoproteins, in line with the absence of bone in cartilaginous fish.
616 citations
TL;DR: High-throughput sequencing technologies are revolutionizing the life sciences, and the past 12 months have seen a burst of genome sequences from non-model organisms, in each case representing a fundamental source of data of significant importance to biological research.
Abstract: High-throughput sequencing technologies are revolutionizing the life sciences. The past 12 months have seen a burst of genome sequences from non-model organisms, in each case representing a fundamental source of data of significant importance to biological research. This has bearing on several aspects of evolutionary biology, and we are now beginning to see patterns emerging from these studies. These include significant heterogeneity in the rate of recombination that affects adaptive evolution and base composition, the role of population size in adaptive evolution, and the importance of expansion of gene families in lineage-specific adaptation. Moreover, resequencing of population samples (population genomics) has enabled the identification of the genetic basis of critical phenotypes and cast light on the landscape of genomic divergence during speciation.
607 citations
University of Oregon1, University of Chicago2, University of Kentucky3, Pennsylvania State University4, Institut national de la recherche agronomique5, University of Illinois at Urbana–Champaign6, Broad Institute7, University of Utah8, European Bioinformatics Institute9, Wellcome Trust Sanger Institute10, University of Oxford11, Bangor University12, Agency for Science, Technology and Research13, École normale supérieure de Lyon14, University of Konstanz15, North Carolina State University16, University of Barcelona17, University of Victoria18, Soochow University (Suzhou)19, Leipzig University20, The Nippon Dental University21, University of South Florida22, Graduate University for Advanced Studies23, Benaroya Research Institute24, Nicholls State University25, Federal University of Pará26, Science for Life Laboratory27
TL;DR: In this article, the authors sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD).
Abstract: To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.
494 citations
References
TL;DR: The Trinity method for de novo assembly of full-length transcripts is presented and evaluated on samples from fission yeast, mouse and whitefly (whose reference genome is not yet available), providing a unified solution for transcriptome reconstruction in any sample.
Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
15,665 citations
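The abstract above mentions constructing de Bruijn graphs from reads. As a minimal sketch of that data structure only (not the Trinity pipeline, which involves far more than this), the Python below links each (k-1)-mer prefix to the (k-1)-mer suffix of every k-mer found in the reads. The function name `de_bruijn`, the toy reads and the choice of k are illustrative assumptions, not taken from the source.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a simple de Bruijn graph: nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # The k-mer connects its prefix node to its suffix node.
            graph[kmer[:-1]].add(kmer[1:])
    # Return plain sorted lists so the result is easy to inspect.
    return {node: sorted(successors) for node, successors in graph.items()}


# Hypothetical overlapping reads, for illustration only.
toy_reads = ["ACGTC", "CGTCA"]
for node, successors in de_bruijn(toy_reads, k=3).items():
    print(node, "->", successors)
```

Because the two toy reads overlap, their shared k-mers collapse onto the same edges, which is the property an assembler exploits when it walks paths through the graph.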
TL;DR: It is found that the patterns of evolution in human and chimpanzee protein-coding genes are highly correlated and dominated by the fixation of neutral and slightly deleterious alleles.
Abstract: Here we present a draft genome sequence of the common chimpanzee (Pan troglodytes). Through comparison with the human genome, we have generated a largely complete catalogue of the genetic differenc ...
2,267 citations
TL;DR: There are 481 segments longer than 200 base pairs that are absolutely conserved between orthologous regions of the human, rat, and mouse genomes. They represent a class of genetic elements whose functions and evolutionary origins are yet to be determined, but which are more highly conserved between these species than are proteins.
Abstract: There are 481 segments longer than 200 base pairs (bp) that are absolutely conserved (100% identity with no insertions or deletions) between orthologous regions of the human, rat, and mouse genomes. Nearly all of these segments are also conserved in the chicken and dog genomes, with an average of 95 and 99% identity, respectively. Many are also significantly conserved in fish. These ultraconserved elements of the human genome are most often located either overlapping exons in genes involved in RNA processing or in introns or nearby genes involved in the regulation of transcription and development. Along with more than 5000 sequences of over 100 bp that are absolutely conserved among the three sequenced mammals, these represent a class of genetic elements whose functions and evolutionary origins are yet to be determined, but which are more highly conserved between these species than are proteins and appear to be essential for the ontogeny of mammals and other vertebrates.
1,690 citations
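The criterion in the abstract above (100% identity with no insertions or deletions over more than 200 bp) can be illustrated as a scan over alignment columns. The sketch below assumes the input is a set of equal-length, already-aligned uppercase sequences with `-` marking gaps; the function name `conserved_runs`, the toy sequences and the small `min_len` are hypothetical, and this is not the authors' actual pipeline.

```python
def conserved_runs(aligned, min_len):
    """Return half-open (start, end) column ranges where all sequences agree with no gap."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    reference = aligned[0]
    runs = []
    start = None
    # Iterate one column past the end so a run touching the last column is closed.
    for col in range(width + 1):
        identical = (
            col < width
            and reference[col] != "-"
            and all(seq[col] == reference[col] for seq in aligned)
        )
        if identical:
            if start is None:
                start = col
        else:
            if start is not None and col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


# Toy alignment (hypothetical); real data would use a threshold near 200 bp.
human = "ACGTACGT-ACGTT"
rat = "ACGTACGTAACGTA"
mouse = "ACGAACGT-ACGTT"
print(conserved_runs([human, rat, mouse], min_len=4))
```

A gap or a single mismatch in any sequence ends the current run, which mirrors the "no insertions or deletions" requirement.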
TL;DR: An algorithm for genome assembly, ALLPATHS-LG, is developed and applied to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome.
Abstract: Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.
1,616 citations
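The abstract above reports scaffold N50 sizes. For readers unfamiliar with the statistic, here is a minimal sketch of the usual definition: the largest length L such that scaffolds of length at least L together cover at least half of the total assembled length. The function name `n50` and the toy lengths are illustrative assumptions; they are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Accumulate from the longest scaffold down until half the total is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in arbitrary units.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

Since N50 depends only on the length distribution, it measures contiguity rather than correctness, which is why the abstract reports base accuracy separately.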
TL;DR: A high-quality reference genome assembly for threespine stickleback fish is developed. The results indicate that reuse of globally shared standing genetic variation has an important role in repeated evolution of distinct marine and freshwater sticklebacks, and in the maintenance of divergent ecotypes during early stages of reproductive isolation.
Abstract: Marine stickleback fish have colonized and adapted to thousands of streams and lakes formed since the last ice age, providing an exceptional opportunity to characterize genomic mechanisms underlying repeated ecological adaptation in nature. Here we develop a high-quality reference genome assembly for threespine sticklebacks. By sequencing the genomes of twenty additional individuals from a global set of marine and freshwater populations, we identify a genome-wide set of loci that are consistently associated with marine-freshwater divergence. Our results indicate that reuse of globally shared standing genetic variation, including chromosomal inversions, has an important role in repeated evolution of distinct marine and freshwater sticklebacks, and in the maintenance of divergent ecotypes during early stages of reproductive isolation. Both coding and regulatory changes occur in the set of loci underlying marine-freshwater evolution, but regulatory changes appear to predominate in this well known example of repeated adaptive evolution in nature.
1,557 citations