Journal Article. DOI: 10.1016/J.CELL.2021.01.046

Tracing the genetic footprints of vertebrate landing in non-teleost ray-finned fishes.

04 Mar 2021-Cell (Elsevier)-Vol. 184, Iss: 5
Abstract: Rich fossil evidence suggests that many traits and functions related to terrestrial evolution were present long before the ancestor of lobe- and ray-finned fishes. Here, we present genome sequences of the bichir, paddlefish, bowfin, and alligator gar, covering all major early divergent lineages of ray-finned fishes. Our analyses show that these species exhibit many mosaic genomic features of lobe- and ray-finned fishes. In particular, many regulatory elements for limb development are present in these fishes, supporting the hypothesis that the relevant ancestral regulation networks emerged before the origin of tetrapods. Transcriptome analyses confirm the homology between the lung and swim bladder and reveal the presence of functional lung-related genes in early ray-finned fishes. Furthermore, we functionally validate the essential role of a jawed vertebrate highly conserved element for cardiovascular development. Our results imply the ancestors of jawed vertebrates already had the potential gene networks for cardio-respiratory systems supporting air breathing.

Journal Article. DOI: 10.1016/J.CELL.2021.01.047
Kun Wang, Jun Wang, Chenglong Zhu, Liandong Yang, +34 more (7 institutions)
04 Mar 2021-Cell
Abstract: Lungfishes are the closest extant relatives of tetrapods and preserve ancestral traits linked with the water-to-land transition. However, their huge genome sizes have hindered understanding of this key transition in evolution. Here, we report a 40-Gb chromosome-level assembly of the African lungfish (Protopterus annectens) genome, which is the largest genome assembly ever reported and has a contig and chromosome N50 of 1.60 Mb and 2.81 Gb, respectively. The large size of the lungfish genome is due mainly to retrotransposons. Genes with ultra-long length show similar expression levels to other genes, indicating that lungfishes have evolved high transcription efficacy to keep gene expression balanced. Together with transcriptome and experimental data, we identified potential genes and regulatory elements related to such terrestrial adaptation traits as pulmonary surfactant, anxiolytic ability, pentadactyl limbs, and pharyngeal remodeling. Our results provide insights and key resources for understanding the evolutionary pathway leading from fishes to humans.
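
The contig and chromosome N50 values quoted in this abstract are assembly contiguity statistics: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sort lengths from longest to
    shortest and return the length at which the running sum first
    reaches half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (made up, arbitrary units): total 30, half 15.
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))  # 10 + 6 = 16 >= 15, so this prints 6
```

The same function applies to contigs, scaffolds, or chromosomes; only the input list of lengths changes.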

Open access. Journal Article. DOI: 10.1038/S41588-021-00914-Y
30 Aug 2021-Nature Genetics
Abstract: The bowfin (Amia calva) is a ray-finned fish that possesses a unique suite of ancestral and derived phenotypes, which are key to understanding vertebrate evolution. The phylogenetic position of bowfin as a representative of neopterygian fishes, its archetypical body plan and its unduplicated and slowly evolving genome make bowfin a central species for the genomic exploration of ray-finned fishes. Here we present a chromosome-level genome assembly for bowfin that enables gene-order analyses, settling long-debated neopterygian phylogenetic relationships. We examine chromatin accessibility and gene expression through bowfin development to investigate the evolution of immune, scale, respiratory and fin skeletal systems and identify hundreds of gene-regulatory loci conserved across vertebrates. These resources connect developmental evolution among bony fishes, further highlighting the bowfin's importance for illuminating vertebrate biology and diversity in the genomic era.

Open access. DOI: 10.1016/J.WATBS.2021.11.001
Jian-Fang Gui, Li Zhou, Xi-Yin Li (1 institution)
22 Nov 2021
Abstract: Fish biology has been developed for more than 100 years, but some important breakthroughs have been made in the last decade. Early studies commonly concentrated on morphology, phylogenetics, development, growth, reproduction manipulation, and disease control. Recent studies have mostly focused on genetics, molecular biology, genomics, and genome biotechnologies, which have provided a solid foundation for enhancing aquaculture to ensure food security and improving aquatic environments to sustain ecosystem health. Here, we review research advances in five major areas: (1) biological innovations and genomic evolution of four significant fish lineages including non-teleost ray-finned fishes, northern hemisphere sticklebacks, East African cichlid fishes, and East Asian cyprinid fishes; (2) evolutionary fates and consequences of natural polyploid fishes; (3) biological consequences of fish domestication and selection; (4) development and innovation of fish breeding biotechnologies; and (5) applicable approaches and potential of fish genetic breeding biotechnologies. Moreover, five precision breeding biotechniques are examined and discussed in detail including gene editing for the introgression or removal of beneficial or detrimental alleles, use of sex-specific markers for the production of mono-sex populations, controllable primordial germ cell on-off strategy for producing sterile offspring, surrogate broodstock-based strategies to accelerate breeding, and genome incorporation and sexual reproduction regain-based approach to create synthetic polyploids. Based on these scientific and technological advances, we propose a blueprint for genetic improvement and new breed creation for aquaculture species and analyze the potential of these new breeding strategies for improving aquaculture seed industry and strengthening food security.

Journal Article. DOI: 10.1144/JGS2020-245
Douglas H. Erwin (2 institutions)
Abstract: Disentangling the factors underlying the appearance of macroscopic, often skeletonized, bilaterians during the Ediacaran–Cambrian diversification of animals requires carefully parsing the contributions of ecological opportunity, environmental potential and developmental capacity. The early evolution of animals involved the introduction of genomic, developmental, morphologic and behavioural novelties, identified as the individuation of new characters, which led to the construction of new ecological networks (innovation). Here I employ a recently introduced conceptual framework for novelty and individuation that distinguishes between potentiation, novelty, innovation and adaptive adjustments to the Ediacaran–Cambrian Radiation, and focus on the roles of potentiation and novelty in the expansion of developmental capacity. Comparative developmental studies combined with molecular clock estimates and data from the fossil record suggest that developmental capacity, the potential to generate a range of morphologies, may expand rapidly through developmental novelties without leading directly to morphological novelties, or to innovation. The expected patterns from this framework are markedly different from those in adaptive radiation scenarios. Thematic collection: This article is part of the Advances in the Cambrian Explosion collection available at:

Journal Article. DOI: 10.1006/METH.2001.1262
01 Dec 2001-Methods
Abstract: The two most commonly used methods to analyze data from real-time, quantitative PCR experiments are absolute quantification and relative quantification. Absolute quantification determines the input copy number, usually by relating the PCR signal to a standard curve. Relative quantification relates the PCR signal of the target transcript in a treatment group to that of another sample such as an untreated control. The $2^{-\Delta\Delta C_T}$ method is a convenient way to analyze the relative changes in gene expression from real-time quantitative PCR experiments. The purpose of this report is to present the derivation, assumptions, and applications of the $2^{-\Delta\Delta C_T}$ method. In addition, we present the derivation and applications of two variations of the $2^{-\Delta\Delta C_T}$ method that may be useful in the analysis of real-time, quantitative PCR data.
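
The relative quantification described in this abstract can be illustrated with a short calculation. The sketch below assumes the usual form of the method: the target transcript's threshold cycle is normalized to a reference (housekeeping) gene in both the treated and control samples, and amplification is assumed to roughly double the product each cycle. The function name, parameter names, and C_T values are hypothetical and are not taken from the paper.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript, treated vs. control.

    Illustrative sketch only. Assumes near-100% amplification
    efficiency for both target and reference genes.
    """
    # Delta C_T: target normalized to the reference gene in each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Delta-delta C_T: treated sample relative to the control sample.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Made-up C_T values: the target crosses threshold 3 cycles earlier
# (relative to the reference gene) in the treated sample.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold)  # delta-delta C_T = (22-18) - (25-18) = -3, so 2**3 = 8.0
```

A fold change above 1 indicates higher relative expression in the treated sample, and a value below 1 indicates lower expression.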

Open access. Journal Article. DOI: 10.1093/BIOINFORMATICS/BTU033
Alexandros Stamatakis (1 institution)
01 May 2014-Bioinformatics
Abstract: Motivation: Phylogenies are increasingly used in all fields of medical and biological research. Moreover, because of the next-generation sequencing revolution, datasets used for conducting phylogenetic analyses grow at an unprecedented pace. RAxML (Randomized Axelerated Maximum Likelihood) is a popular program for phylogenetic analyses of large datasets under maximum likelihood. Since the last RAxML paper in 2006, it has been continuously maintained and extended to accommodate the increasingly growing input datasets and to serve the needs of the user community. Results: I present some of the most notable new features and extensions of RAxML, such as a substantial extension of substitution models and supported data types, the introduction of SSE3, AVX and AVX2 vector intrinsics, techniques for reducing the memory requirements of the code and a plethora of operations for conducting postanalyses on sets of trees. In addition, an up-to-date 50-page user manual covering all new RAxML options is available. Availability and implementation: The code is available under GNU

Open access. Journal Article. DOI: 10.1038/NBT.1621
Cole Trapnell, Brian A. Williams, Geo Pertea, +6 more (4 institutions)
Abstract: High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

Open access. Journal Article. DOI: 10.1093/BIOINFORMATICS/BTP120
01 May 2009-Bioinformatics
Abstract: Motivation: A new protocol for sequencing the messenger RNA in a cell, known as RNA-Seq, generates millions of short sequence fragments in a single run. These fragments, or ‘reads’, can be used to measure levels of gene expression and to identify novel splice variants of genes. However, current software for aligning RNA-Seq data to a genome relies on known splice junctions and cannot identify novel ones. TopHat is an efficient read-mapping algorithm designed to align reads from an RNA-Seq experiment to a reference genome without relying on known splice sites. Results: We mapped the RNA-Seq reads from a recent mammalian RNA-Seq experiment and recovered more than 72% of the splice junctions reported by the annotation-based software from that study, along with nearly 20 000 previously unreported junctions. The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer. We describe several challenges unique to ab initio splice site discovery from RNA-Seq reads that will require further algorithm development. Availability: TopHat is free, open-source software available from Contact: Supplementary information: Supplementary data are available at Bioinformatics online.

Open access. Journal Article. DOI: 10.1093/MOLBEV/MSM088
Ziheng Yang (1 institution)
Abstract: PAML, currently in version 4, is a package of programs for phylogenetic analyses of DNA and protein sequences using maximum likelihood (ML). The programs may be used to compare and test phylogenetic trees, but their main strengths lie in the rich repertoire of evolutionary models implemented, which can be used to estimate parameters in models of sequence evolution and to test interesting biological hypotheses. Uses of the programs include estimation of synonymous and nonsynonymous rates ($d_N$ and $d_S$) between two protein-coding DNA sequences, inference of positive Darwinian selection through phylogenetic comparison of protein-coding genes, reconstruction of ancestral genes and proteins for molecular restoration studies of extinct life forms, combined analysis of heterogeneous data sets from multiple gene loci, and estimation of species divergence times incorporating uncertainties in fossil calibrations. This note discusses some of the major applications of the package, which includes example data sets to demonstrate their use. The package is written in ANSI C, and runs under Windows, Mac OSX, and UNIX systems. It is available at -- (

