Journal ArticleDOI

The Lungfish Transcriptome: A Glimpse into Molecular Evolution Events at the Transition from Water to Land.

TL;DR: A high-coverage lungfish reference transcriptome indicates that lungfish, not coelacanths, are the closest relatives to land-adapted vertebrates; transposable elements appear to be active and highly diverse, suggesting a role for them in the remarkable expansion of the lungfish genome.
Abstract: Lungfish and coelacanths are the only living sarcopterygian fish. The phylogenetic relationship of lungfish to the last common ancestor of tetrapods and their close morphological similarity to their fossil ancestors make this species uniquely interesting. However, their genome size, the largest among vertebrates, is hampering the generation of a whole genome sequence. To provide a partial solution to the problem, a high-coverage lungfish reference transcriptome was generated and assembled. The present findings indicate that lungfish, not coelacanths, are the closest relatives to land-adapted vertebrates. Whereas protein-coding genes evolve at a very slow rate, possibly reflecting a “living fossil” status, transposable elements appear to be active and show high diversity, suggesting a role for them in the remarkable expansion of the lungfish genome. Analyses of single genes and gene families documented changes connected to the water-to-land transition and demonstrated the value of the lungfish reference transcriptome for comparative studies of vertebrate evolution.


Citations
Journal ArticleDOI
TL;DR: This review covers the current understanding of vertebrate TE diversity and evolution, argues that the current bottleneck in genome analyses lies in the proper annotation of TEs, and provides examples where superficial analyses led to misleading conclusions about genome evolution.
Abstract: Transposable elements (TEs) are selfish genetic elements that mobilize in genomes via transposition or retrotransposition and often make up large fractions of vertebrate genomes. Here, we review the current understanding of vertebrate TE diversity and evolution in the context of recent advances in genome sequencing and assembly techniques. TEs make up 4-60% of assembled vertebrate genomes, and deeply branching lineages such as ray-finned fishes and amphibians generally exhibit a higher TE diversity than the more recent radiations of birds and mammals. Furthermore, the list of taxa with exceptional TE landscapes is growing. We emphasize that the current bottleneck in genome analyses lies in the proper annotation of TEs and provide examples where superficial analyses led to misleading conclusions about genome evolution. Finally, recent advances in long-read sequencing will soon permit access to TE-rich genomic regions that previously resisted assembly including the gigantic, TE-rich genomes of salamanders and lungfishes.

197 citations

Journal ArticleDOI
TL;DR: The current situation represents a balance between insertion and amplification of transposons and the mechanisms responsible for their deletion or for decreasing their activity, and methylation and the silencing action of small RNAs likely represent the most frequent mechanisms.
Abstract: The relationship between genome size and the percentage of transposons in 161 animal species evidenced that variations in genome size are linked to the amplification or the contraction of transposable elements. The activity of transposable elements could represent a response to environmental stressors. Indeed, although with different trends in protostomes and deuterostomes, comprehensive changes in genome size were recorded in concomitance with particular periods of evolutionary history or adaptations to specific environments. During evolution, genome size and the presence of transposable elements have influenced structural and functional parameters of genomes and cells. Changes of these parameters have had an impact on morphological and functional characteristics of the organism on which natural selection directly acts. Therefore, the current situation represents a balance between insertion and amplification of transposons and the mechanisms responsible for their deletion or for decreasing their activity. Among the latter, methylation and the silencing action of small RNAs likely represent the most frequent mechanisms.

120 citations


Cites background from "The Lungfish Transcriptome: A Glimp..."

  • ...The extant lobe-finned fishes include the coelacanths, very popular in the Devonian, and are represented today only by the Latimeria genus with 2 species dwelling in the deep waters of Africa and Indonesia and the lungfishes, which are shown by molecular studies to be the direct ancestors of the tetrapods [Amemiya et al., 2013; Biscotti et al., 2016]....


Journal ArticleDOI
TL;DR: The genome of the endangered European eel is sequenced using the MinION by Oxford Nanopore and assembled using a novel algorithm specifically designed for large eukaryotic genomes, improving on a previous draft based on short reads only.
Abstract: We have sequenced the genome of the endangered European eel using the MinION by Oxford Nanopore, and assembled these data using a novel algorithm specifically designed for large eukaryotic genomes. For this 860 Mbp genome, the entire computational process takes two days on a single CPU. The resulting genome assembly significantly improves on a previous draft based on short reads only, both in terms of contiguity (N50 1.2 Mbp) and structural quality. This combination of affordable nanopore sequencing and light weight assembly promises to make high-quality genomic resources accessible for many non-model plants and animals.

102 citations


Cites background from "The Lungfish Transcriptome: A Glimp..."

  • ...As scaling behavior is approximately quadratic with genome size, assembling a salamander [27] or lungfish [28] genome dozens of gigabases long would require several years on a cluster....
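
The quadratic-scaling argument in this excerpt can be checked with rough arithmetic. The sketch below assumes the figures quoted in the eel abstract above (an 860 Mbp genome assembled in two days on a single CPU) as the reference point and a 40 Gbp genome as an illustrative target; the function name, the target size, and the exact exponent of 2 are assumptions made for this example, not values taken from the citing paper.

```python
def extrapolate_cpu_days(ref_size_bp, ref_cpu_days, target_size_bp, exponent=2.0):
    """Scale a reference run time by (target / reference) ** exponent.

    exponent=2.0 models the "approximately quadratic" behavior described
    in the excerpt; it is an idealization, not a measured value.
    """
    return ref_cpu_days * (target_size_bp / ref_size_bp) ** exponent


# Reference point from the abstract: 860 Mbp in two days on a single CPU.
# Target: a hypothetical 40 Gbp genome (illustrative assumption).
cpu_days = extrapolate_cpu_days(860e6, 2.0, 40e9)
cpu_years = cpu_days / 365.25
print(f"{cpu_days:.0f} CPU-days (about {cpu_years:.1f} CPU-years)")
```

Under these assumptions the estimate comes to roughly 4,300 CPU-days, on the order of a decade of single-CPU time, which is consistent with the excerpt's point that such an assembly would be impractical without substantial parallel hardware.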


Journal ArticleDOI
04 Mar 2021-Cell
TL;DR: In this paper, the authors reported a 40-Gb chromosome-level assembly of the African lungfish (Protopterus annectens) genome, which is the largest genome assembly ever reported and has a contig and chromosome N50 of 1.60 Mb and 2.81 Gb, respectively.

77 citations

References
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations

Journal ArticleDOI
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.

88,255 citations

Journal ArticleDOI
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.
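
As a rough illustration of what "distance estimation using kmer counting" can look like, the sketch below computes an alignment-free distance from shared k-mer counts. It is a generic measure chosen for this example and is not claimed to be the exact formula MUSCLE uses; the function names, the default `k=3`, and the toy sequences are all assumptions.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k=3):
    """Return 1 - (shared k-mers / maximum possible shared k-mers).

    Shared k-mers are counted with multiplicity (minimum of the two counts).
    The result is 0.0 for identical sequences and 1.0 when no k-mer is shared.
    """
    if min(len(a), len(b)) < k:
        raise ValueError("both sequences must be at least k residues long")
    shared = sum((kmer_counts(a, k) & kmer_counts(b, k)).values())
    max_shared = min(len(a), len(b)) - k + 1
    return 1.0 - shared / max_shared


# Toy protein fragments (hypothetical); they differ at one residue.
d = kmer_distance("MKVLAAGIV", "MKVLSAGIV")
print(round(d, 3))
```

Because no alignment is needed, a distance of this kind can be computed for every sequence pair quickly, which is what makes it attractive as a first step before building a guide tree.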

37,524 citations

Journal ArticleDOI
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations

Journal ArticleDOI
TL;DR: A computerized method is presented that reduces to a certain extent the necessity of manually editing multiple alignments, makes the automation of phylogenetic analysis of large data sets feasible, and facilitates the reproduction of the final alignment by other researchers.
Abstract: The use of some multiple-sequence alignments in phylogenetic analysis, particularly those that are not very well conserved, requires the elimination of poorly aligned positions and divergent regions, since they may not be homologous or may have been saturated by multiple substitutions. A computerized method that eliminates such positions and at the same time tries to minimize the loss of informative sites is presented here. The method is based on the selection of blocks of positions that fulfill a simple set of requirements with respect to the number of contiguous conserved positions, lack of gaps, and high conservation of flanking positions, making the final alignment more suitable for phylogenetic analysis. To illustrate the efficiency of this method, alignments of 10 mitochondrial proteins from several completely sequenced mitochondrial genomes belonging to diverse eukaryotes were used as examples. The percentages of removed positions were higher in the most divergent alignments. After removing divergent segments, the amino acid composition of the different sequences was more uniform, and pairwise distances became much smaller. Phylogenetic trees show that topologies can be different after selecting conserved blocks, particularly when there are several poorly resolved nodes. Strong support was found for the grouping of animals and fungi but not for the position of more basal eukaryotes. The use of a computerized method such as the one presented here reduces to a certain extent the necessity of manually editing multiple alignments, makes the automation of phylogenetic analysis of large data sets feasible, and facilitates the reproduction of the final alignment by other researchers.
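
The block-selection idea in this abstract can be sketched in a much simplified form. The code below keeps only runs of gap-free, sufficiently conserved columns that reach a minimum length; it deliberately omits the flanking-position requirement and is not the published method's actual rule set. The thresholds, function names, and the toy alignment are assumptions, and all sequences are assumed to have equal aligned length.

```python
from collections import Counter


def conserved_columns(alignment, min_identity=0.5):
    """Flag columns that contain no gaps and whose most common residue
    reaches the min_identity fraction."""
    flags = []
    for col in zip(*alignment):
        if "-" in col:
            flags.append(False)
            continue
        top_count = Counter(col).most_common(1)[0][1]
        flags.append(top_count / len(col) >= min_identity)
    return flags


def select_blocks(alignment, min_identity=0.5, min_block=3):
    """Return (start, end) half-open ranges of contiguous conserved
    columns that are at least min_block columns long."""
    flags = conserved_columns(alignment, min_identity)
    blocks, start = [], None
    for i, ok in enumerate(flags + [False]):  # sentinel closes a trailing run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_block:
                blocks.append((start, i))
            start = None
    return blocks


def trim(alignment, blocks):
    """Concatenate the selected blocks for every sequence."""
    return ["".join(seq[s:e] for s, e in blocks) for seq in alignment]


# Toy alignment (hypothetical) with one gapped column at index 5.
aln = ["MKVLA-GTWQR", "MKVLAAGTWQR", "MRVLS-GSWQK"]
blocks = select_blocks(aln)
trimmed = trim(aln, blocks)
print(blocks, trimmed)
```

Because the selection depends only on explicit thresholds, rerunning it with the same parameters reproduces the same trimmed alignment, which is the reproducibility benefit the abstract emphasizes over manual editing.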

8,757 citations

Related Papers (5)
18 Apr 2013-Nature
Chris T. Amemiya, Jessica Alföldi, Alison P. Lee, Shaohua Fan, Hervé Philippe, Iain MacCallum, Ingo Braasch, Tereza Manousaki, Igor Schneider, Nicolas Rohner, Chris L. Organ, Domitille Chalopin, J. Joshua Smith, Mark Robinson, Rosemary A. Dorrington, Marco Gerdol, Bronwen Aken, Maria Assunta Biscotti, Marco Barucca, Denis Baurain, Aaron M. Berlin, Gregory L. Blatch, Francesco Buonocore, Thorsten Burmester, Michael S. Campbell, Adriana Canapa, John P. Cannon, Alan Christoffels, Gianluca De Moro, Adrienne L. Edkins, Lin Fan, Anna Maria Fausto, Nathalie Feiner, Mariko Forconi, Junaid Gamieldien, Sante Gnerre, Andreas Gnirke, Jared V. Goldstone, Wilfried Haerty, Mark E. Hahn, Uljana Hesse, Steve Hoffmann, Jeremy Johnson, Sibel I. Karchner, Shigehiro Kuraku, Marcia Lara, Joshua Z. Levin, Gary W. Litman, Evan Mauceli, Tsutomu Miyake, M. Gail Mueller, David R. Nelson, Anne Nitsche, Ettore Olmo, Tatsuya Ota, Alberto Pallavicini, Sumir Panji, Barbara Picone, Chris P. Ponting, Sonja J. Prohaska, Dariusz Przybylski, Nil Ratan Saha, Vydianathan Ravi, Filipe J. Ribeiro, Tatjana Sauka-Spengler, Giuseppe Scapigliati, Stephen M. J. Searle, Ted Sharpe, Oleg Simakov, Peter F. Stadler, John J. Stegeman, Kenta Sumiyama, Diana Tabbaa, Hakim Tafer, Jason Turner-Maier, Peter van Heusden, Simon D. M. White, Louise Williams, Mark Yandell, Henner Brinkmann, Jean Nicolas Volff, Clifford J. Tabin, Neil H. Shubin, Manfred Schartl, David B. Jaffe, John H. Postlethwait, Byrappa Venkatesh, Federica Di Palma, Eric S. Lander, Axel Meyer, Kerstin Lindblad-Toh