
Showing papers in "Genome Biology and Evolution in 2016"


Journal ArticleDOI
TL;DR: This work shows that the root of Placentalia lies between Atlantogenata and Boreoeutheria, and finds evidence for ILS in early placental evolution, and is able to reject previous conclusions that the placental root is a hard polytomy that cannot be resolved.
Abstract: Placental mammals comprise three principal clades: Afrotheria (e.g., elephants and tenrecs), Xenarthra (e.g., armadillos and sloths), and Boreoeutheria (all other placental mammals), the relationships among which are the subject of controversy and a touchstone for debate on the limits of phylogenetic inference. Previous analyses have found support for all three hypotheses, leading some to conclude that this phylogenetic problem might be impossible to resolve due to the compounded effects of incomplete lineage sorting (ILS) and a rapid radiation. Here we show, using a genome scale nucleotide data set, microRNAs, and the reanalysis of the three largest previously published amino acid data sets, that the root of Placentalia lies between Atlantogenata and Boreoeutheria. Although we found evidence for ILS in early placental evolution, we are able to reject previous conclusions that the placental root is a hard polytomy that cannot be resolved. Reanalyses of previous data sets recover Atlantogenata + Boreoeutheria and show that contradictory results are a consequence of poorly fitting evolutionary models; instead, when the evolutionary process is better-modeled, all data sets converge on Atlantogenata. Our Bayesian molecular clock analysis estimates that marsupials diverged from placentals 157–170 Ma, crown Placentalia diverged 86–100 Ma, and crown Atlantogenata diverged 84–97 Ma. Our results are compatible with placental diversification being driven by dispersal rather than vicariance mechanisms, postdating early phases in the protracted opening of the Atlantic Ocean.

185 citations


Journal ArticleDOI
TL;DR: It is shown that laterally transferred genes into arthropods underpin many adaptations to phytophagy, including efficient assimilation and detoxification of plant produced metabolites.
Abstract: Within animals, evolutionary transition toward herbivory is severely limited by the hostile characteristics of plants. Arthropods have nonetheless counteracted many nutritional and defensive barriers imposed by plants and are currently considered as the most successful animal herbivores in terrestrial ecosystems. We gather a body of evidence showing that genomes of various plant feeding insects and mites possess genes whose presence can only be explained by horizontal gene transfer (HGT). HGT is the asexual transmission of genetic information between reproductively isolated species. Although HGT is known to have great adaptive significance in prokaryotes, its impact on eukaryotic evolution remains obscure. Here, we show that laterally transferred genes into arthropods underpin many adaptations to phytophagy, including efficient assimilation and detoxification of plant produced metabolites. Horizontally acquired genes and the traits they encode often functionally diversify within arthropod recipients, enabling the colonization of more host plant species and organs. We demonstrate that HGT can drive metazoan evolution by uncovering its prominent role in the adaptations of arthropods to exploit plants.

151 citations


Journal ArticleDOI
TL;DR: How the association between methanogenesis and the Wood–Ljungdahl pathway appears to be much more flexible than previously thought, and might provide information on the processes that led to loss of this metabolism in many archaeal lineages is discussed.
Abstract: Methanogenesis coupled to the Wood–Ljungdahl pathway is one of the most ancient metabolisms for energy generation and carbon fixation in the Archaea. Recent results are sensibly changing our view on the diversity of methane-cycling capabilities in this Domain of Life. The availability of genomic sequences from uncharted branches of the archaeal tree has highlighted the existence of novel methanogenic lineages phylogenetically distant to previously known ones, such as the Methanomassiliicoccales. At the same time, phylogenomic analyses have suggested a methanogenic ancestor for all Archaea, implying multiple independent losses of this metabolism during archaeal diversification. This prediction has been strengthened by the report of genes involved in methane cycling in members of the Bathyarchaeota (a lineage belonging to the TACK clade), representing the first indication of the presence of methanogenesis outside of the Euryarchaeota. In light of these new data, we discuss how the association between methanogenesis and the Wood–Ljungdahl pathway appears to be much more flexible than previously thought, and might provide information on the processes that led to loss of this metabolism in many archaeal lineages. The combination of environmental microbiology, experimental characterization and phylogenomics opens up exciting avenues of research to unravel the diversity and evolutionary history of fundamental metabolic pathways.

143 citations


Journal ArticleDOI
TL;DR: The study shows that, although compositional heterogeneity is not universal, it cannot be eliminated for some mitochondrial genes, but dense taxon sampling and the use of appropriate Bayesian analyses can still produce robust phylogenetic trees.
Abstract: Mitochondrial genomes are readily sequenced with recent technology and thus evolutionary lineages can be densely sampled. This permits better phylogenetic estimates and assessment of potential biases resulting from heterogeneity in nucleotide composition and rate of change. We gathered 245 mitochondrial sequences for the Coleoptera representing all 4 suborders, 15 superfamilies of Polyphaga, and altogether 97 families, including 159 newly sequenced full or partial mitogenomes. Compositional heterogeneity greatly affected 3rd codon positions, and to a lesser extent the 1st and 2nd positions, even after RY coding. Heterogeneity also affected the encoded protein sequence, in particular in the nad2, nad4, nad5, and nad6 genes. Credible tree topologies were obtained with the nhPhyML ("nonhomogeneous") algorithm implementing a model for branch-specific equilibrium frequencies. Likelihood searches using RAxML were improved by data partitioning by gene and codon position. Finally, the PhyloBayes software, which allows different substitution processes for amino acid replacement at various sites, produced a tree that best matched known higher level taxa and defined basal relationships in Coleoptera. After rooting with Neuropterida outgroups, suborder relationships were resolved as (Polyphaga (Myxophaga (Archostemata + Adephaga))). The infraorder relationships in Polyphaga were (Scirtiformia (Elateriformia ((Staphyliniformia + Scarabaeiformia) (Bostrichiformia (Cucujiformia))))). Polyphagan superfamilies were recovered as monophyla except Staphylinoidea (paraphyletic for Scarabaeiformia) and Cucujoidea, which can no longer be considered a valid taxon. The study shows that, although compositional heterogeneity is not universal, it cannot be eliminated for some mitochondrial genes, but dense taxon sampling and the use of appropriate Bayesian analyses can still produce robust phylogenetic trees.
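
The abstract above mentions RY coding without defining it. As a minimal sketch (not the authors' pipeline), RY coding replaces each purine (A, G) with R and each pyrimidine (C, T) with Y, which removes transition differences and dampens base-composition bias. The function name `ry_code` is hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: recode nucleotides as purine (R) or pyrimidine (Y).
# The name ry_code is an assumption, not taken from the paper or any library.
RY_MAP = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_code(seq: str) -> str:
    """Return seq with A/G -> R and C/T/U -> Y.

    Input is upper-cased first; any other symbol (gaps, N, ambiguity
    codes) is passed through unchanged.
    """
    return seq.upper().translate(RY_MAP)


if __name__ == "__main__":
    print(ry_code("acgtn-"))
```

Applied column-wise to an alignment, this leaves only transversions as observable changes, which is why it is often tried on fast-evolving 3rd codon positions.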

141 citations


Journal ArticleDOI
TL;DR: The newly discovered diversity in animal mitochondrial genome organization allows a better understanding of the evolutionary plasticity and conservation of animal mtDNA and provides insights into the molecular and evolutionary mechanisms shaping mitochondrial genomes.
Abstract: Animal mitochondrial DNA (mtDNA) is commonly described as a small, circular molecule that is conserved in size, gene content, and organization. Data collected in the last decade have challenged this view by revealing considerable diversity in animal mitochondrial genome organization. Much of this diversity has been found in nonbilaterian animals (phyla Cnidaria, Ctenophora, Placozoa, and Porifera), which, from a phylogenetic perspective, form the main branches of the animal tree along with Bilateria. Within these groups, mt-genomes are characterized by varying numbers of both linear and circular chromosomes, extra genes (e.g. atp9, polB, tatC), large variation in the number of encoded mitochondrial transfer RNAs (tRNAs) (0-25), at least seven different genetic codes, presence/absence of introns, tRNA and mRNA editing, fragmented ribosomal RNA genes, translational frameshifting, highly variable substitution rates, and a large range of genome sizes. This newly discovered diversity allows a better understanding of the evolutionary plasticity and conservation of animal mtDNA and provides insights into the molecular and evolutionary mechanisms shaping mitochondrial genomes.

140 citations


Journal ArticleDOI
TL;DR: It is found that the pathogenicity of host-adapted fungi evolved multiple times, and that both divergent and convergent evolutions occurred during pathogen–host cospeciation thus resulting in protein families with similar features in each fungal group.
Abstract: Fungal pathogens of plants and animals have multifarious effects; they cause devastating damage to agriculture, lead to life-threatening diseases in humans, or induce beneficial effects by reducing insect pest populations. Many virulence factors have been determined in different fungal pathogens; however, the molecular determinants contributing to fungal host selection and adaptation are largely unknown. In this study, we sequenced the genomes of seven ascomycete insect pathogens and performed genome-wide analyses of 33 species of filamentous ascomycete pathogenic fungi that infect insects (12 species), plants (12), and humans (9). Our results revealed that the genomes of plant pathogens encode more proteins and protein families than the insect and human pathogens. Unexpectedly, more common orthologous protein groups are shared between the insect and plant pathogens than between the two animal group pathogens. We also found that the pathogenicity of host-adapted fungi evolved multiple times, and that both divergent and convergent evolution occurred during pathogen-host cospeciation, thus resulting in protein families with similar features in each fungal group. However, the role of phylogenetic relatedness in the evolution of protein families and therefore pathotype formation could not be ruled out due to the effect of common ancestry. The evolutionary correlation analyses led to the identification of different protein families that correlated with alternate pathotypes. Particularly, the effector-like proteins identified in plant and animal pathogens were strongly linked to fungal host adaptation, suggesting the existence of similar gene-for-gene relationships in fungus-animal interactions that have not been established before. These results advance our understanding of the evolution of fungal pathogenicity and the factors that contribute to fungal pathotype formation.

134 citations


Journal ArticleDOI
TL;DR: This study assembled the richest taxon sampling of Holometabola to date, and analyzed both nucleotide and amino acid data sets using several methods, finding the standard Bayesian inference and maximum-likelihood analyses were strongly affected by systematic biases, but the site-heterogeneous mixture model implemented in PhyloBayes avoided the false grouping of unrelated taxa exhibiting similar base composition and accelerated evolutionary rate.
Abstract: After decades of debate, a mostly satisfactory resolution of relationships among the 11 recognized holometabolan orders of insects has been reached based on nuclear genes, resolving one of the most substantial branches of the tree-of-life, but the relationships are still not well established with mitochondrial genome data. The main reasons have been the absence of sufficient data in several orders and lack of appropriate phylogenetic methods that avoid the systematic errors from compositional and mutational biases in insect mitochondrial genomes. In this study, we assembled the richest taxon sampling of Holometabola to date (199 species in 11 orders), and analyzed both nucleotide and amino acid data sets using several methods. We find the standard Bayesian inference and maximum-likelihood analyses were strongly affected by systematic biases, but the site-heterogeneous mixture model implemented in PhyloBayes avoided the false grouping of unrelated taxa exhibiting similar base composition and accelerated evolutionary rate. The inclusion of rRNA genes and removal of fast-evolving sites with the observed variability sorting method for identifying sites deviating from the mean rates improved the phylogenetic inferences under a site-heterogeneous model, correctly recovering most deep branches of the Holometabola phylogeny. We suggest that the use of mitochondrial genome data for resolving deep phylogenetic relationships requires an assessment of the potential impact of substitutional saturation and compositional biases through data deletion strategies and by using site-heterogeneous mixture models. Our study suggests a practical approach for how to use densely sampled mitochondrial genome data in phylogenetic analyses.

134 citations


Journal ArticleDOI
TL;DR: The utility and novel features of Lep-MAP2 in assembling high-density linkage maps, and their usefulness in revealing evolutionarily interesting properties of genomes, such as strong genome-wide sex bias in recombination rates are illustrated.
Abstract: High-density linkage maps are important tools for genome biology and evolutionary genetics by quantifying the extent of recombination, linkage disequilibrium, and chromosomal rearrangements across chromosomes, sexes, and populations. They provide one of the best ways to validate and refine de novo genome assemblies, with the power to identify errors in assemblies increasing with marker density. However, assembly of high-density linkage maps is still challenging due to software limitations. We describe Lep-MAP2, a software for ultradense genome-wide linkage map construction. Lep-MAP2 can handle various family structures and can account for achiasmatic meiosis to gain linkage map accuracy. Simulations show that Lep-MAP2 outperforms other available mapping software both in computational efficiency and accuracy. When applied to two large F2-generation recombinant crosses between two nine-spined stickleback (Pungitius pungitius) populations, it produced two high-density (∼6 markers/cM) linkage maps containing 18,691 and 20,054 single nucleotide polymorphisms. The two maps showed a high degree of synteny, but female maps were 1.5-2 times longer than male maps in all linkage groups, suggesting genome-wide recombination suppression in males. Comparison with the genome sequence of the three-spined stickleback (Gasterosteus aculeatus) revealed a high degree of interspecific synteny with a low frequency (<5%) of interchromosomal rearrangements. However, a fairly large (ca. 10 Mb) translocation from autosome to sex chromosome was detected in both maps. These results illustrate the utility and novel features of Lep-MAP2 in assembling high-density linkage maps, and their usefulness in revealing evolutionarily interesting properties of genomes, such as strong genome-wide sex bias in recombination rates.

130 citations


Journal ArticleDOI
TL;DR: Improved resolution of the timing of WGD events in monocot history provides evidence for the influence of polyploidization on functional evolution and species diversification.
Abstract: Comparisons of flowering plant genomes reveal multiple rounds of ancient polyploidy characterized by large intragenomic syntenic blocks. Three such whole-genome duplication (WGD) events, designated as rho (ρ), sigma (σ), and tau (τ), have been identified in the genomes of cereal grasses. Precise dating of these WGD events is necessary to investigate how they have influenced diversification rates, evolutionary innovations, and genomic characteristics such as the GC profile of protein-coding sequences. The timing of these events has remained uncertain due to the paucity of monocot genome sequence data outside the grass family (Poaceae). Phylogenomic analysis of protein-coding genes from sequenced genomes and transcriptome assemblies from 35 species, including representatives of all families within the Poales, has resolved the timing of rho and sigma relative to speciation events and placed tau prior to divergence of Asparagales and the commelinids but after divergence with eudicots. Examination of gene family phylogenies indicates that rho occurred just prior to the diversification of Poaceae and sigma occurred before early diversification of Poales lineages but after the Poales-commelinid split. Additional lineage-specific WGD events were identified on the basis of the transcriptome data. Gene families exhibiting high GC content are underrepresented among those with duplicate genes that persisted following these genome duplications. However, genome duplications had little overall influence on lineage-specific changes in the GC content of coding genes. Improved resolution of the timing of WGD events in monocot history provides evidence for the influence of polyploidization on functional evolution and species diversification.

123 citations


Journal ArticleDOI
TL;DR: The genome of the western orchard predatory mite improves genomic sampling of chelicerates and provides invaluable new resources for functional genomic analyses of this family of agriculturally important mites.
Abstract: Metaseiulus occidentalis is an eyeless phytoseiid predatory mite employed for the biological control of agricultural pests including spider mites. Despite appearances, these predator and prey mites are separated by some 400 Myr of evolution and radically different lifestyles. We present a 152-Mb draft assembly of the M. occidentalis genome: Larger than that of its favored prey, Tetranychus urticae, but considerably smaller than those of many other chelicerates, enabling an extremely contiguous and complete assembly to be built, the best arachnid to date. Aided by transcriptome data, genome annotation cataloged 18,338 protein-coding genes and identified large numbers of Helitron transposable elements. Comparisons with other arthropods revealed a particularly dynamic and turbulent genomic evolutionary history. Its genes exhibit elevated molecular evolution, with strikingly high numbers of intron gains and losses, in stark contrast to the deer tick Ixodes scapularis. Uniquely among examined arthropods, this predatory mite's Hox genes are completely atomized, dispersed across the genome, and it encodes five copies of the normally single-copy RNA processing Dicer-2 gene. Examining gene families linked to characteristic biological traits of this tiny predator provides initial insights into processes of sex determination, development, immune defense, and how it detects, disables, and digests its prey. As the first reference genome for the Phytoseiidae, and for any species with the rare sex determination system of parahaploidy, the genome of the western orchard predatory mite improves genomic sampling of chelicerates and provides invaluable new resources for functional genomic analyses of this family of agriculturally important mites.

110 citations


Journal ArticleDOI
TL;DR: It is shown that the genome of S. scitamineum, a smut fungus parasitizing sugar cane with a phylogenetic position intermediate to the two previously sequenced species U. maydis and Sporisorium reilianum, contains more and larger gene clusters encoding secreted effectors than any previously described species in this group.
Abstract: Smut fungi are plant pathogens mostly parasitizing wild species of grasses as well as domesticated cereal crops. Genome analysis of several smut fungi including Ustilago maydis revealed a singular clustered organization of genes encoding secreted effectors. In U. maydis, many of these clusters have a role in virulence. Reconstructing the evolutionary history of clusters of effector genes is difficult because of their intrinsically fast evolution, which erodes the phylogenetic signal and homology relationships. Here, we describe the use of comparative evolutionary analyses of quality draft assemblies of genomes to study the mechanisms of this evolution. We report the genome sequence of a South African isolate of Sporisorium scitamineum, a smut fungus parasitizing sugar cane with a phylogenetic position intermediate to the two previously sequenced species U. maydis and Sporisorium reilianum. We show that the genome of S. scitamineum contains more and larger gene clusters encoding secreted effectors than any previously described species in this group. We trace back the origin of the clusters and find that their evolution is mainly driven by tandem gene duplication. In addition, transposable elements play a major role in the evolution of the clustered genes. Transposable elements are significantly associated with clusters of genes encoding fast evolving secreted effectors. This suggests that such clusters represent a case of genome compartmentalization that restrains the activity of transposable elements on genes under diversifying selection for which this activity is potentially beneficial, while protecting the rest of the genome from its deleterious effect.

Journal ArticleDOI
TL;DR: The kernels of extant opsin diversity arose much earlier in animal history than previously known and can be used to understand how and when opsins were incorporated into complex traits like eyes and extraocular sensors.
Abstract: The opsin gene family encodes key proteins animals use to sense light and has expanded dramatically as it originated early in animal evolution. Understanding the origins of opsin diversity can offer clues to how separate lineages of animals have repurposed different opsin paralogs for different light-detecting functions. However, the more we look for opsins outside of eyes and from additional animal phyla, the more opsins we uncover, suggesting we still do not know the true extent of opsin diversity, nor the ancestry of opsin diversity in animals. To estimate the number of opsin paralogs present in both the last common ancestor of the Nephrozoa (bilaterians excluding Xenoacoelomorpha), and the ancestor of Cnidaria + Bilateria, we reconstructed a reconciled opsin phylogeny using sequences from 14 animal phyla, especially the traditionally poorly-sampled echinoderms and molluscs. Our analysis strongly supports a repertoire of at least nine opsin paralogs in the bilaterian ancestor and at least four opsin paralogs in the last common ancestor of Cnidaria + Bilateria. Thus, the kernels of extant opsin diversity arose much earlier in animal history than previously known. Further, opsins likely duplicated and were lost many times, with different lineages of animals maintaining different repertoires of opsin paralogs. This phylogenetic information can inform hypotheses about the functions of different opsin paralogs and can be used to understand how and when opsins were incorporated into complex traits like eyes and extraocular sensors.

Journal ArticleDOI
TL;DR: Comparisons revealed species-specific and isolate-specific differences in the composition and expression of genes involved in SM production including those for phytohormone biosynthesis, suggesting that SMs might be determinants of host specificity.
Abstract: Species of the Fusarium fujikuroi species complex (FFC) cause a wide spectrum of often devastating diseases on diverse agricultural crops, including coffee, fig, mango, maize, rice, and sugarcane. Although species within the FFC are difficult to distinguish by morphology, and their genes often share 90% sequence similarity, they can differ in host plant specificity and life style. FFC species can also produce structurally diverse secondary metabolites (SMs), including the mycotoxins fumonisins, fusarins, fusaric acid, and beauvericin, and the phytohormones gibberellins, auxins, and cytokinins. The spectrum of SMs produced can differ among closely related species, suggesting that SMs might be determinants of host specificity. To date, genomes of only a limited number of FFC species have been sequenced. Here, we provide draft genome sequences of three more members of the FFC: a single isolate of F. mangiferae, the cause of mango malformation, and two isolates of F. proliferatum, one a pathogen of maize and the other an orchid endophyte. We compared these genomes to publicly available genome sequences of three other FFC species. The comparisons revealed species-specific and isolate-specific differences in the composition and expression (in vitro and in planta) of genes involved in SM production including those for phytohormone biosynthesis. Such differences have the potential to impact host specificity and, as in the case of F. proliferatum, the pathogenic versus endophytic life style.

Journal ArticleDOI
TL;DR: Elevated genetic differentiation in these genomic regions has previously been described on both sides of the Atlantic Ocean, and it is suggested that these polymorphisms are involved in adaptive divergence across the species distributional range.
Abstract: In several species genetic differentiation across environmental gradients or between geographically separate populations has been reported to center at "genomic islands of divergence," resulting in heterogeneous differentiation patterns across genomes. Here, genomic regions of elevated divergence were observed on three chromosomes of the highly mobile fish Atlantic cod (Gadus morhua) within geographically fine-scaled coastal areas. The "genomic islands" extended at least 5, 9.5, and 13 megabases on linkage groups 2, 7, and 12, respectively, and coincided with large blocks of linkage disequilibrium. For each of these three chromosomes, pairs of segregating, highly divergent alleles were identified, with little or no gene exchange between them. These patterns of recombination and divergence mirror genomic signatures previously described for large polymorphic inversions, which have been shown to repress recombination across extensive chromosomal segments. The lack of genetic exchange permits divergence between noninverted and inverted chromosomes in spite of gene flow. For the rearrangements on linkage groups 2 and 12, allelic frequency shifts between coastal and oceanic environments suggest a role in ecological adaptation, in agreement with recently reported associations between molecular variation within these genomic regions and temperature, oxygen, and salinity levels. Elevated genetic differentiation in these genomic regions has previously been described on both sides of the Atlantic Ocean, and we therefore suggest that these polymorphisms are involved in adaptive divergence across the species distributional range.

Journal ArticleDOI
TL;DR: A novel method to measure the local GC-content bias in genomes and a survey of published fungal species identified species containing distinct AT-rich regions, supporting the hypothesis that these regions play an important role in fungal evolution.
Abstract: We present a novel method to measure the local GC-content bias in genomes and a survey of published fungal species. The method, enacted as “OcculterCut” (https://sourceforge.net/projects/occultercut, last accessed April 30, 2016), identified species containing distinct AT-rich regions. In most fungal taxa, AT-rich regions are a signature of repeat-induced point mutation (RIP), which targets repetitive DNA and decreases GC-content through the conversion of cytosine to thymine bases. RIP has in turn been identified as a driver of fungal genome evolution, as RIP mutations can also occur in single-copy genes neighboring repeat-rich regions. Over time RIP perpetuates “two speeds” of gene evolution in the GC-equilibrated and AT-rich regions of fungal genomes. In this study, genomes showing evidence of this process are found to be common, particularly among the Pezizomycotina. Further analysis highlighted differences in amino acid composition and putative functions of genes from these regions, supporting the hypothesis that these regions play an important role in fungal evolution. OcculterCut can also be used to identify genes undergoing RIP-assisted diversifying selection, such as small, secreted effector proteins that mediate host-microbe disease interactions.
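
To make the idea of "local GC-content" concrete, here is a minimal fixed-window sketch. It is not OcculterCut's algorithm, which the abstract does not describe; the function names and the default window size are assumptions chosen only for illustration.

```python
# Illustrative sketch only: GC fraction in fixed windows along a sequence.
# This is NOT the OcculterCut method; names and defaults are hypothetical.
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0  # no unambiguous bases, e.g. a run of N
    return (seq.count("G") + seq.count("C")) / called


def windowed_gc(seq: str, window: int = 1000, step: int = 1000) -> List[Tuple[int, float]]:
    """Return (start, GC fraction) for each window of the given size.

    A sequence shorter than one window yields a single partial window.
    """
    last_start = max(len(seq) - window + 1, 1)
    return [(start, gc_fraction(seq[start:start + window]))
            for start in range(0, last_start, step)]


if __name__ == "__main__":
    for start, gc in windowed_gc("GGCCAATT", window=4, step=4):
        print(start, gc)
```

In a genome with distinct AT-rich regions, a histogram of these per-window values would be expected to show two modes rather than one.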

Journal ArticleDOI
TL;DR: It is found that across a variety of taxa, the ability to accurately identify TEs based solely on homology decreased as the phylogenetic distance between the queried genome and a reference increased.
Abstract: Transposable elements (TEs) are mobile genetic elements with the ability to replicate themselves throughout the host genome. In some taxa TEs reach copy numbers in hundreds of thousands and can occupy more than half of the genome. The increasing number of reference genomes from nonmodel species has begun to outpace efforts to identify and annotate TE content and methods that are used vary significantly between projects. Here, we demonstrate variation that arises in TE annotations when less than optimal methods are used. We found that across a variety of taxa, the ability to accurately identify TEs based solely on homology decreased as the phylogenetic distance between the queried genome and a reference increased. Next we annotated repeats using homology alone, as is often the case in new genome analyses, and a combination of homology and de novo methods as well as an additional manual curation step. Reannotation using these methods identified a substantial number of new TE subfamilies in previously characterized genomes, recognized a higher proportion of the genome as repetitive, and decreased the average genetic distance within TE families, implying recent TE accumulation. Finally, these findings (increased recognition of younger TEs) were confirmed via an analysis of the postman butterfly (Heliconius melpomene). These observations imply that complete TE annotation relies on a combination of homology and de novo-based repeat identification, manual curation, and classification and that relying on simple, homology-based methods is insufficient to accurately describe the TE landscape of a newly sequenced genome.

Journal ArticleDOI
TL;DR: There are large differences in evolutionary rates and gene turnover between pathways, and that paralogs of Ago2, Ago3, and Piwi/Aub show contrasting rates of evolution after duplication, which suggests that Argonautes undergo frequent evolutionary expansions that facilitate functional divergence.
Abstract: Genetic studies of Drosophila melanogaster have provided a paradigm for RNA interference (RNAi) in arthropods, in which the microRNA and antiviral pathways are each mediated by a single Argonaute (Ago1 and Ago2) and germline suppression of transposable elements is mediated by a trio of Piwi-subfamily Argonaute proteins (Ago3, Aub, and Piwi). Without a suitable evolutionary context, deviations from this can be interpreted as derived or idiosyncratic. Here we analyze the evolution of Argonaute genes across the genomes and transcriptomes of 86 Dipteran species, showing that variation in copy number can occur rapidly, and that there is constant flux in some RNAi mechanisms. The lability of the RNAi pathways is illustrated by the divergence of Aub and Piwi (182-156 Ma), independent origins of multiple Piwi-family genes in Aedes mosquitoes (less than 25Ma), and the recent duplications of Ago2 and Ago3 in the tsetse fly Glossina morsitans. In each case the tissue specificity of these genes has altered, suggesting functional divergence or innovation, and consistent with the action of dynamic selection pressures across the Argonaute gene family. We find there are large differences in evolutionary rates and gene turnover between pathways, and that paralogs of Ago2, Ago3, and Piwi/Aub show contrasting rates of evolution after duplication. This suggests that Argonautes undergo frequent evolutionary expansions that facilitate functional divergence.

Journal ArticleDOI
TL;DR: The higher plastome degeneration in both these families of endoparasites, Rafflesiaceae and Apodanthaceae, of similar high age, compared with exoparasites points to a difference of plastome function between those two modes of parasitic life.
Abstract: The 23 species of mycoheterotrophic or exoparasitic land plants (from 15 genera and 6 families) studied so far all retain a minimal set of 17 of the normally 116 plastome genes. Only Rafflesia lagascae, an endoparasite concealed in its host except when flowering, has been reported as perhaps lacking a plastome, although it still possesses plastid-like compartments. We analyzed two other endoparasites, the African Apodanthaceae Pilostyles aethiopica and the Australian Pilostyles hamiltonii, both living inside Fabaceae. Illumina and 454 data and Sanger resequencing yielded circularized plastomes of 11,348 and 15,167 bp length, with both species containing five possibly functional genes (accD, rps3, rps4, rrn16, rrn23) and two/three pseudogenes (rpoC2 in P. aethiopica and rpl2 and rps12 in both species; rps12 may be functional in P. hamiltonii). Previously known smallest land plant plastomes contain 27-29 genes, making these Apodanthaceae plastomes the most reduced in size and gene content. A similar extent of divergence might have caused the plastome of Rafflesia to escape detection. The higher plastome degeneration in both these families of endoparasites, Rafflesiaceae and Apodanthaceae, of similar high age, compared with exoparasites points to a difference of plastome function between those two modes of parasitic life.

Journal ArticleDOI
TL;DR: The evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products, is recapitulated to establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms.
Abstract: Natural products from microbes have provided humans with beneficial antibiotics for millennia. However, a decline in the pace of antibiotic discovery exerts pressure on human health as antibiotic resistance spreads, a challenge that may be better faced by unveiling chemical diversity produced by microbes. Current microbial genome mining approaches have revitalized research into antibiotics, but the empirical nature of these methods limits the chemical space that is explored. Here, we address the problem of finding novel pathways by incorporating evolutionary principles into genome mining. We recapitulated the evolutionary history of twenty-three enzyme families previously uninvestigated in the context of natural product biosynthesis in Actinobacteria, the most proficient producers of natural products. Our genome evolutionary analyses were based on the assumption that expanded, repurposed enzyme families from central metabolism occur frequently and thus have the potential to catalyze new conversions in the context of natural products biosynthesis. Our analyses led to the discovery of biosynthetic gene clusters coding for hidden chemical diversity, as validated by comparing our predictions with those from state-of-the-art genome mining tools, as well as by experimentally demonstrating the existence of a biosynthetic pathway for arseno-organic metabolites in Streptomyces coelicolor and Streptomyces lividans, using a combined gene knockout and metabolite profile strategy. As our approach does not rely solely on sequence similarity searches of previously identified biosynthetic enzymes, these results establish the basis for the development of an evolutionary-driven genome mining tool termed EvoMining that complements current platforms. We anticipate that by doing so real 'chemical dark matter' will be unveiled.

Journal ArticleDOI
TL;DR: The transcriptomes of four species within one Symbiodinium clade (Clade B) at ∼20,000 orthologous genes, as well as multiple isoclonal cell lines within species (i.e., cultured strains), expand the genomic resources available for this important symbiont group and emphasize the power of comparative transcriptomics as a method for studying speciation processes and interindividual variation in nonmodel organisms.
Abstract: Reef-building corals depend on symbiotic mutualisms with photosynthetic dinoflagellates in the genus Symbiodinium. This large microalgal group comprises many highly divergent lineages ("Clades A-I") and hundreds of undescribed species. Given their ecological importance, efforts have turned to genomic approaches to characterize the functional ecology of Symbiodinium. To date, investigators have only compared gene expression between representatives from separate clades (the equivalent of contrasting genera or families in other dinoflagellate groups), making it impossible to distinguish between clade-level and species-level functional differences. Here, we examined the transcriptomes of four species within one Symbiodinium clade (Clade B) at ∼20,000 orthologous genes, as well as multiple isoclonal cell lines within species (i.e., cultured strains). These species span two major adaptive radiations within Clade B, each encompassing both host-specialized and ecologically cryptic taxa. Species-specific expression differences were consistently enriched for photosynthesis-related genes, likely reflecting selection pressures driving niche diversification. Transcriptional variation among strains involved fatty acid metabolism and biosynthesis pathways. Such differences among individuals are potentially a major source of physiological variation, contributing to the functional diversity of coral holobionts composed of unique host-symbiont genotype pairings. Our findings expand the genomic resources available for this important symbiont group and emphasize the power of comparative transcriptomics as a method for studying speciation processes and interindividual variation in nonmodel organisms.

Journal ArticleDOI
TL;DR: RNA sequencing revealed that several differentially expressed genes were enriched in gene ontology terms related to blood vessel and respiratory system development, and showed an evident genetic admixture in Tibetan chickens, suggesting a history of introgression from lowland gene pools.
Abstract: Tibetan chickens, unlike their lowland counterparts, exhibit specific adaptations to high-altitude conditions. The genetic mechanisms of such adaptations in highland chickens were determined by resequencing the genomes of four highland (Tibetan and Lhasa White) and four lowland (White Leghorn, Lindian, and Chahua) chicken populations. Our results showed an evident genetic admixture in Tibetan chickens, suggesting a history of introgression from lowland gene pools. Genes showing positive selection in highland populations were related to cardiovascular and respiratory system development, DNA repair, response to radiation, inflammation, and immune responses, indicating a strong adaptation to oxygen scarcity and high-intensity solar radiation. The distribution of allele frequencies of nonsynonymous single nucleotide polymorphisms between highland and lowland populations was analyzed using a chi-square test, which showed that several differentially distributed genes with missense mutations were enriched in several functional categories, especially in blood vessel development and adaptations to hypoxia and intense radiation. RNA sequencing revealed that several differentially expressed genes were enriched in gene ontology terms related to blood vessel and respiratory system development. Several candidate genes involved in the development of the cardiorespiratory system (FGFR1, CTGF, ADAM9, JPH2, SATB1, BMP4, LOX, LPR, ANGPTL4, and HYAL1), inflammation and immune responses (AIRE, MYO1F, ZAP70, DDX60, CCL19, CD47, JSC, and FAS), DNA repair, and responses to radiation (VCP, ASH2L, and FANCG) were identified to play key roles in the adaptation to high-altitude conditions. Our data provide new insights into the unique adaptations of highland animals to extreme environments.

Journal ArticleDOI
TL;DR: The analysis characterizes P. polycephalum as a prototypical eukaryote with features attributed to the last common ancestor of Amorphea, that is, the Amoebozoa and Opisthokonts, and argues against the later emergence of tyrosine kinase signaling in the opisthokont lineage.
Abstract: Physarum polycephalum is a well-studied microbial eukaryote with unique experimental attributes relative to other experimental model organisms. It has a sophisticated life cycle with several distinct stages including amoebal, flagellated, and plasmodial cells. It is unusual in switching between open and closed mitosis according to specific life-cycle stages. Here we present the analysis of the genome of this enigmatic and important model organism and compare it with closely related species. The genome is littered with simple and complex repeats and the coding regions are frequently interrupted by introns with a mean size of 100 bases. Complemented with extensive transcriptome data, we define approximately 31,000 gene loci, providing unexpected insights into early eukaryote evolution. We describe extensive use of histidine kinase-based two-component systems and tyrosine kinase signaling, the presence of bacterial and plant type photoreceptors (phytochromes, cryptochrome, and phototropin) and of plant-type pentatricopeptide repeat proteins, as well as metabolic pathways, and a cell cycle control system typically found in more complex eukaryotes. Our analysis characterizes P. polycephalum as a prototypical eukaryote with features attributed to the last common ancestor of Amorphea, that is, the Amoebozoa and Opisthokonts. Specifically, the presence of tyrosine kinases in Acanthamoeba and Physarum as representatives of two distantly related subdivisions of Amoebozoa argues against the later emergence of tyrosine kinase signaling in the opisthokont lineage and also against the acquisition by horizontal gene transfer.

Journal ArticleDOI
Quan Lei, Cong Li, Zhixiang Zuo, Chunhua Huang, Hanhua Cheng, Rongjia Zhou
TL;DR: Pre-RNA splicing is an essential step in generating mature mRNA and RNA trans-splicing combines two separate pre-mRNA molecules to form a chimeric non-co-linear RNA, which may exert a function distinct from its original molecules.
Abstract: Pre-RNA splicing is an essential step in generating mature mRNA. RNA trans-splicing combines two separate pre-mRNA molecules to form a chimeric non-co-linear RNA, which may exert a function distinct from its original molecules. Trans-spliced RNAs may encode novel proteins or serve as noncoding or regulatory RNAs. These novel RNAs not only increase the complexity of the proteome but also provide new regulatory mechanisms for gene expression. An increasing amount of evidence indicates that trans-splicing occurs frequently in both physiological and pathological processes. In addition, mRNA reprogramming based on trans-splicing has been successfully applied in RNA-based therapies for human genetic diseases. Nevertheless, clarifying the extent and evolution of trans-splicing in vertebrates and developing detection methods for trans-splicing remain challenging. In this review, we summarize previous research, highlight recent advances in trans-splicing, and discuss possible splicing mechanisms and functions from an evolutionary viewpoint.

Journal ArticleDOI
TL;DR: The authors showed that microbial gene pool distributions are not influenced nearly as much by geography as by ecology, thus extending the Baas Becking hypothesis from whole organisms to microbial genes and finding that gene pools are shaped by their broad ecological niche (such as sea water, fresh water, host, and airborne).
Abstract: The spatial distribution of microbes on our planet is famously formulated in the Baas Becking hypothesis as “everything is everywhere but the environment selects.” While this hypothesis does not strictly rule out patterns caused by geographical effects on ecology and historical founder effects, it does propose that the remarkable dispersal potential of microbes leads to distributions generally shaped by environmental factors rather than geographical distance. By constructing sequence similarity networks from uncultured environmental samples, we show that microbial gene pool distributions are not influenced nearly as much by geography as by ecology, thus extending the Baas Becking hypothesis from whole organisms to microbial genes. We find that gene pools are shaped by their broad ecological niche (such as sea water, fresh water, host, and airborne). We find that freshwater habitats act as a gene exchange bridge between otherwise disconnected habitats. Finally, certain antibiotic resistance genes deviate from the general trend of habitat specificity by exhibiting a high degree of cross-habitat mobility. The strong cross-habitat mobility of antibiotic resistance genes is a cause for concern and provides a paradigmatic example of the rate by which genes colonize new habitats when new selective forces emerge.

Journal ArticleDOI
TL;DR: The greatly reduced and highly divergent, yet functional, plastome of the nonphotosynthetic holoparasite Hydnora visseri (Hydnoraceae, Piperales) is presented and a four-stage model of gene reduction is proposed to account for the range of plastid genomes in nonphotosynthetic plants.
Abstract: Plastid genomes of photosynthetic flowering plants are usually highly conserved in both structure and gene content. However, the plastomes of parasitic and mycoheterotrophic plants may be released from selective constraint due to the reduction or loss of photosynthetic ability. Here we present the greatly reduced and highly divergent, yet functional, plastome of the nonphotosynthetic holoparasite Hydnora visseri (Hydnoraceae, Piperales). The plastome is 27 kb in length, with 24 genes encoding ribosomal proteins, ribosomal RNAs, tRNAs, and a few nonbioenergetic genes, but no genes related to photosynthesis. The inverted repeat and the small single copy region are only approximately 1.5 kb, and intergenic regions have been drastically reduced. Despite extreme reduction, gene order and orientation are highly similar to the plastome of Piper cenocladum, a related photosynthetic plant in Piperales. Gene sequences in Hydnora are highly divergent and several complementary approaches using the highest possible sensitivity were required for identification and annotation of this plastome. Active transcription is detected for all of the protein-coding genes in the plastid genome, and one of two introns is appropriately spliced out of rps12 transcripts. The whole-genome shotgun read depth is 1,400× coverage for the plastome, whereas the mitochondrial genome is covered at 40× and the nuclear genome at 2×. Despite the extreme reduction of the genome and high sequence divergence, the presence of syntenic, long transcriptionally active open-reading frames with distant similarity to other plastid genomes and a high plastome stoichiometry relative to the mitochondrial and nuclear genomes suggests that the plastome remains functional in H. visseri. A four-stage model of gene reduction, including the potential for complete plastome loss, is proposed to account for the range of plastid genomes in nonphotosynthetic plants.

Journal ArticleDOI
TL;DR: A converging body of evidence points to a scenario in which ESX-encoding plasmids have been a major driving force for acquisition and diversification of type VII systems in mycobacteria, which likely played (and possibly still play) important roles in the adaptation to new environments and hosts during evolution of mycobacterial pathogenesis.
Abstract: In mycobacteria, various type VII secretion systems corresponding to different ESX (ESAT-6 secretory) types contribute to pathogenicity, iron acquisition, and/or conjugation. In addition to the known chromosomal ESX loci, the existence of plasmid-encoded ESX systems was recently reported. To investigate the potential role of ESX-encoding plasmids in mycobacterial evolution we analysed a large representative collection of mycobacterial genomes, including both chromosomal and plasmid-borne sequences. Data obtained for chromosomal ESX loci confirmed the previous 5 classical ESX types and identified a novel mycobacterial ESX-4-like type, termed ESX-4-bis. Moreover, analysis of the plasmid-encoded ESX loci showed extensive diversification, with at least 7 new ESX profiles identified. Three of them (ESX-P clusters 1, 2 and 3) were found in multiple plasmids, while four corresponded to singletons. Our phylogenetic and gene-order analyses revealed two main groups of ESX types: i) ancestral types, including ESX-4 and ESX-4-like systems from mycobacterial and non-mycobacterial actinobacteria, and ii) mycobacteria-specific ESX systems, including ESX-1-2-3-5 systems and the plasmid-encoded ESX types. Synteny analysis revealed that ESX-P systems are part of phylogenetic groups that derived from a common ancestor, which diversified and resulted in the different ESX types through extensive gene rearrangements. A converging body of evidence, derived from composition bias, phylogenetic, and synteny analyses, points to a scenario in which ESX-encoding plasmids have been a major driving force for acquisition and diversification of type VII systems in mycobacteria, which likely played (and possibly still play) important roles in the adaptation to new environments and hosts during evolution of mycobacterial pathogenesis.

Journal ArticleDOI
TL;DR: The genome of both Buchnera and S. symbiotica is sequenced to provide further evidence for the previously proposed establishment of a secondary co-obligate endosymbiont in the common ancestor of the Lachninae aphids, and it is proposed that the putative convergent split of the tryptophan biosynthetic role between Buchnera and S. symbiotica could be behind the establishment of S
Abstract: Virtually all aphids (Aphididae) harbor Buchnera aphidicola as an obligate endosymbiont to compensate nutritional deficiencies arising from their phloem diet. Many species within the Lachninae subfamily seem to be consistently associated also with Serratia symbiotica. We have previously shown that both Cinara (Cinara) cedri and Cinara (Cupressobium) tujafilina (Lachninae: Eulachnini tribe) have indeed established co-obligate associations with both Buchnera and S. symbiotica. However, while Buchnera genomes of both Cinara species are similar, genome degradation differs greatly between the two S. symbiotica strains. To gain insight into the essentiality and degree of integration of S. symbiotica within the Lachninae, we sequenced the genome of both Buchnera and S. symbiotica endosymbionts from the distantly related aphid Tuberolachnus salignus (Lachninae: Tuberolachnini tribe). We found a striking level of similarity between the endosymbiotic system of this aphid and that of C. cedri. In both aphid hosts, S. symbiotica possesses a highly reduced genome and is found exclusively intracellularly inside bacteriocytes. Interestingly, T. salignus' endosymbionts present the same tryptophan biosynthetic metabolic complementation as C. cedri's, which is not present in C. tujafilina's. Moreover, we corroborate the riboflavin-biosynthetic-role take-over/rescue by S. symbiotica in T. salignus, and therefore, provide further evidence for the previously proposed establishment of a secondary co-obligate endosymbiont in the common ancestor of the Lachninae aphids. Finally, we propose that the putative convergent split of the tryptophan biosynthetic role between Buchnera and S. symbiotica could be behind the establishment of S. symbiotica as an obligate intracellular symbiont and the triggering of further genome degradation.

Journal ArticleDOI
TL;DR: Tests for positive selection in relation to photic niche reveal evidence for adaptive evolution in UV-sensitive opsins in day-flying insects in general, and in LWS opsins of day-flying Lepidoptera specifically.
Abstract: Opsin proteins covalently bind to small molecular chromophores and each protein-chromophore complex is sensitive to particular wavelengths of light. Multiple opsins with different wavelength absorbance peaks are required for color vision. Comparing opsin responses is challenging at low light levels, explaining why color vision is often lost in nocturnal species. Here, we investigated opsin evolution in 27 phylogenetically diverse insect species including several transitions between photic niches (nocturnal, diurnal, and crepuscular). We find widespread conservation of five distinct opsin genes, more than commonly considered. These comprise one c-opsin plus four r-opsins (long wavelength sensitive or LWS, blue sensitive, ultra violet [UV] sensitive and the often overlooked Rh7 gene). Several recent opsin gene duplications are also detected. The diversity of opsin genes is consistent with color vision in diurnal, crepuscular, and nocturnal insects. Tests for positive selection in relation to photic niche reveal evidence for adaptive evolution in UV-sensitive opsins in day-flying insects in general, and in LWS opsins of day-flying Lepidoptera specifically.

Journal ArticleDOI
TL;DR: It is hypothesized that cells cannot tune their horizontal transfer rates to be below the threshold required for parasite persistence without experiencing highly detrimental side-effects, and a numerical assessment suggests that microbial populations cannot purge parasites while escaping Muller’s ratchet.
Abstract: Almost all cellular life forms are hosts to diverse genetic parasites with various levels of autonomy including plasmids, transposons and viruses. Theoretical modeling of the evolution of primordial replicators indicates that parasites (cheaters) necessarily evolve in such systems and can be kept at bay primarily via compartmentalization. Given the (near) ubiquity, abundance and diversity of genetic parasites, the question becomes pertinent: are such parasites intrinsic to life? At least in prokaryotes, the persistence of parasites is linked to the rate of horizontal gene transfer (HGT). We mathematically derive the threshold value of the minimal transfer rate required for selfish element persistence, depending on the element duplication and loss rates as well as the cost to the host. Estimation of the characteristic gene duplication, loss and transfer rates for transposons, plasmids and virus-related elements in multiple groups of diverse bacteria and archaea indicates that most of these rates are compatible with the long term persistence of parasites. Notably, a small but non-zero rate of HGT is also required for the persistence of non-parasitic genes. We hypothesize that cells cannot tune their horizontal transfer rates to be below the threshold required for parasite persistence without experiencing highly detrimental side-effects. As a lower boundary to the minimum DNA transfer rate that a cell can withstand, we consider the process of genome degradation and mutational meltdown of populations through Muller's ratchet. A numerical assessment of this hypothesis suggests that microbial populations cannot purge parasites while escaping Muller's ratchet. Thus, genetic parasites appear to be virtually inevitable in cellular organisms.

Journal ArticleDOI
TL;DR: This study provides insights into the genome structure, genome organization, molecular basis of variation, and pathogenicity of P. triticina, and demonstrates that Race77 has more virulence factors than Race106, which may be responsible for the greater degree of adaptation of this pathogen.
Abstract: Leaf rust is one of the most important diseases of wheat and is caused by Puccinia triticina, a highly variable rust pathogen prevalent worldwide. Decoding the genome of this pathogen will help in unraveling the molecular basis of its evolution and in the identification of genes responsible for its various biological functions. We generated high quality draft genome sequences (approximately 100-106 Mb) of two races of P. triticina: the variable and virulent Race77 and the old, avirulent Race106. The genomes of Race77 and Race106 had 33X and 27X coverage, respectively. We predicted 27,678 and 26,384 genes, with average lengths of 1,129 and 1,086 bases in Race77 and Race106, respectively, and found that the genomes consisted of 37.49% and 39.99% repetitive sequences. Genome wide comparative analysis revealed that Race77 differs substantially from Race106 with regard to segmental duplication (SD), repeat element, and SNP/InDel characteristics. Comparative analyses showed that Race77 is a recent, highly variable, and adapted race compared with Race106. Further sequence analyses of 13 additional pathotypes of Race77 clearly differentiated the recent, active, and virulent pathotypes from the older ones. Average densities of 2.4 SNPs and 0.32 InDels per kb were obtained for all P. triticina pathotypes. Secretome analysis demonstrated that Race77 has more virulence factors than Race106, which may be responsible for the greater degree of adaptation of this pathogen. We also found that genes under greater selection pressure were conserved in the genomes of both races, and may affect functions crucial for the higher levels of virulence factors in Race77. This study provides insights into the genome structure, genome organization, molecular basis of variation, and pathogenicity of P. triticina. The genome sequence data generated in this study have been submitted to public domain databases and will be an important resource for comparative genomics studies of the more than 4000 existing Puccinia species.