
Showing papers in "Genetics in 2017"


Journal ArticleDOI
01 May 2017-Genetics
TL;DR: An overview of the currently available information on the natural environment of Caenorhabditis elegans focuses on the biotic environment, which is usually less predictable and thus can create high selective constraints that are likely to have had a strong impact on C. elegans evolution.
Abstract: Organisms evolve in response to their natural environment. Consideration of natural ecological parameters is thus of key importance for our understanding of an organism’s biology. Curiously, the natural ecology of the model species Caenorhabditis elegans has long been neglected, even though this nematode has become one of the most intensively studied models in biological research. This lack of interest changed ∼10 yr ago. Since then, an increasing number of studies have focused on the nematode’s natural ecology. Yet many unknowns still remain. Here, we provide an overview of the currently available information on the natural environment of C. elegans. We focus on the biotic environment, which is usually less predictable and thus can create high selective constraints that are likely to have had a strong impact on C. elegans evolution. This nematode is particularly abundant in microbe-rich environments, especially rotting plant matter such as decomposing fruits and stems. In this environment, it is part of a complex interaction network, which is particularly shaped by a species-rich microbial community. These microbes can be food, part of a beneficial gut microbiome, parasites and pathogens, and possibly competitors. C. elegans is additionally confronted with predators; it interacts with vector organisms that facilitate dispersal to new habitats, and also with competitors for similar food environments, including competitors from congeneric and also the same species. Full appreciation of this nematode’s biology warrants further exploration of its natural environment and subsequent integration of this information into the well-established laboratory-based research approaches.

305 citations


Journal ArticleDOI
01 Apr 2017-Genetics
TL;DR: Flies remain a valuable tool for both discovery of novel molecules and deep mechanistic understanding of sleep and circadian rhythms, and researchers are beginning to understand how the identified molecules and neurons interact with each other, and with the environment, to regulate sleep.
Abstract: The advantages of the model organism Drosophila melanogaster, including low genetic redundancy, functional simplicity, and the ability to conduct large-scale genetic screens, have been essential for understanding the molecular nature of circadian (∼24 hr) rhythms, and continue to be valuable in discovering novel regulators of circadian rhythms and sleep. In this review, we discuss the current understanding of these interrelated biological processes in Drosophila and the wider implications of this research. Clock genes period and timeless were first discovered in large-scale Drosophila genetic screens developed in the 1970s. Feedback of period and timeless on their own transcription forms the core of the molecular clock, and accurately timed expression, localization, post-transcriptional modification, and function of these genes is thought to be critical for maintaining the circadian cycle. Regulators, including several phosphatases and kinases, act on different steps of this feedback loop to ensure strong and accurately timed rhythms. Approximately 150 neurons in the fly brain that contain the core components of the molecular clock act together to translate this intracellular cycling into rhythmic behavior. We discuss how different groups of clock neurons serve different functions in allowing clocks to entrain to environmental cues, driving behavioral outputs at different times of day, and allowing flexible behavioral responses in different environmental conditions. The neuropeptide PDF provides an important signal thought to synchronize clock neurons, although the details of how PDF accomplishes this function are still being explored. Secreted signals from clock neurons also influence rhythms in other tissues. Sleep is, in part, regulated by the circadian clock, which ensures appropriate timing of sleep, but the amount and quality of sleep are also determined by other mechanisms that ensure a homeostatic balance between sleep and wake.
Flies have been useful for identifying a large set of genes, molecules, and neuroanatomic loci important for regulating sleep amount. Conserved aspects of sleep regulation in flies and mammals include wake-promoting roles for catecholamine neurotransmitters and involvement of hypothalamus-like regions, although other neuroanatomic regions implicated in sleep in flies have less clear parallels. Sleep is also subject to regulation by factors such as food availability, stress, and social environment. We are beginning to understand how the identified molecules and neurons interact with each other, and with the environment, to regulate sleep. Drosophila researchers can also take advantage of increasing mechanistic understanding of other behaviors, such as learning and memory, courtship, and aggression, to understand how sleep loss impacts these behaviors. Flies thus remain a valuable tool for both discovery of novel molecules and deep mechanistic understanding of sleep and circadian rhythms.

287 citations


Journal ArticleDOI
01 Nov 2017-Genetics
TL;DR: This work focuses on discoveries that were made using C. elegans of cell autonomous and nonautonomous pathways controlling the mitochondrial unfolded protein response, as well as mechanisms for degradation of paternal mitochondria after fertilization.
Abstract: Mitochondria are best known for harboring pathways involved in ATP synthesis through the tricarboxylic acid cycle and oxidative phosphorylation. Major advances in understanding these roles were made with Caenorhabditis elegans mutants affecting key components of the metabolic pathways. These mutants have not only helped elucidate some of the intricacies of metabolism pathways, but they have also served as jumping off points for pharmacology, toxicology, and aging studies. The field of mitochondria research has also undergone a renaissance, with the increased appreciation of the role of mitochondria in cell processes other than energy production. Here, we focus on discoveries that were made using C. elegans, with a few excursions into areas that were studied more thoroughly in other organisms, like mitochondrial protein import in yeast. Advances in mitochondrial biogenesis and membrane dynamics were made through the discoveries of novel functions in mitochondrial fission and fusion proteins. Some of these functions were only apparent through the use of diverse model systems, such as C. elegans. Studies of stress responses, exemplified by mitophagy and the mitochondrial unfolded protein response, have also benefitted greatly from the use of model organisms. Recent developments include the discoveries in C. elegans of cell autonomous and nonautonomous pathways controlling the mitochondrial unfolded protein response, as well as mechanisms for degradation of paternal mitochondria after fertilization. The evolutionary conservation of many, if not all, of these pathways ensures that results obtained with C. elegans are equally applicable to studies of human mitochondria in health and disease.

237 citations


Journal ArticleDOI
01 Feb 2017-Genetics
TL;DR: It is shown that resistance to standard CGD approaches should evolve almost inevitably in most natural populations, unless repair of CGD-induced cleavage via NHEJ can be effectively suppressed, or resistance costs are on par with those of the driver.
Abstract: CRISPR/Cas9 gene drive (CGD) promises to be a highly adaptable approach for spreading genetically engineered alleles throughout a species, even if those alleles impair reproductive success. CGD has been shown to be effective in laboratory crosses of insects, yet it remains unclear to what extent potential resistance mechanisms will affect the dynamics of this process in large natural populations. Here we develop a comprehensive population genetic framework for modeling CGD dynamics, which incorporates potential resistance mechanisms as well as random genetic drift. Using this framework, we calculate the probability that resistance against CGD evolves from standing genetic variation, de novo mutation of wild-type alleles, or cleavage repair by nonhomologous end joining (NHEJ), a likely by-product of CGD itself. We show that resistance to standard CGD approaches should evolve almost inevitably in most natural populations, unless repair of CGD-induced cleavage via NHEJ can be effectively suppressed, or resistance costs are on par with those of the driver. The key factor determining the probability that resistance evolves is the overall rate at which resistance alleles arise at the population level by mutation or NHEJ. By contrast, the conversion efficiency of the driver, its fitness cost, and its introduction frequency have only minor impact. Our results shed light on strategies that could facilitate the engineering of drivers with lower resistance potential, and motivate the possibility to embrace resistance as a possible mechanism for controlling a CGD approach. This study highlights the need for careful modeling of the population dynamics of CGD prior to the actual release of a driver construct into the wild.

236 citations


Journal ArticleDOI
01 Oct 2017-Genetics
TL;DR: This review outlines lipid and carbohydrate structures as well as biosynthesis and breakdown pathways that have been characterized in C. elegans and brings attention to functional studies using mutant strains that reveal physiological roles for specific lipids and carbohydrates during development, aging, and adaptation to changing environmental conditions.
Abstract: Lipid and carbohydrate metabolism are highly conserved processes that affect nearly all aspects of organismal biology. Caenorhabditis elegans eat bacteria, which consist of lipids, carbohydrates, and proteins that are broken down during digestion into fatty acids, simple sugars, and amino acid precursors. With these nutrients, C. elegans synthesizes a wide range of metabolites that are required for development and behavior. In this review, we outline lipid and carbohydrate structures as well as biosynthesis and breakdown pathways that have been characterized in C. elegans. We bring attention to functional studies using mutant strains that reveal physiological roles for specific lipids and carbohydrates during development, aging, and adaptation to changing environmental conditions.

187 citations


Journal ArticleDOI
01 Jun 2017-Genetics
TL;DR: The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.
Abstract: The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.

180 citations


Journal ArticleDOI
01 May 2017-Genetics
TL;DR: This work introduces topology weighting, a method for quantifying relationships between taxa that are not necessarily monophyletic and for visualizing how these relationships change across the genome; the approach is suitable for exploring relationships in almost any genomic dataset.
Abstract: We introduce the concept of topology weighting, a method for quantifying relationships between taxa that are not necessarily monophyletic, and visualizing how these relationships change across the genome. A given set of taxa can be related in a limited number of ways, but if each taxon is represented by multiple sequences, the number of possible topologies becomes very large. Topology weighting reduces this complexity by quantifying the contribution of each taxon topology to the full tree. We describe our method for topology weighting by iterative sampling of subtrees (Twisst), and test it on both simulated and real genomic data. Overall, we show that this is an informative and versatile approach, suitable for exploring relationships in almost any genomic dataset. Scripts to implement the method described are available at http://github.com/simonhmartin/twisst.

173 citations
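
The counting argument in the topology weighting abstract above can be made concrete with a small sketch. It uses the standard result that n labelled tips admit (2n - 5)!! unrooted bifurcating topologies. The helper names and the match counts below are hypothetical illustrations, not part of the Twisst scripts, and the weights are computed from complete counts rather than by the iterative sampling the abstract describes.

```python
from math import prod


def num_unrooted_topologies(n_tips: int) -> int:
    """Number of unrooted, fully bifurcating topologies for n labelled tips: (2n - 5)!!."""
    if n_tips < 3:
        raise ValueError("need at least 3 tips")
    # Product of the odd numbers 1, 3, ..., 2n - 5.
    return prod(range(1, 2 * n_tips - 4, 2))


def num_subtrees(samples_per_taxon):
    """Number of ways to pick exactly one sampled sequence from each taxon."""
    return prod(samples_per_taxon)


def topology_weights(match_counts):
    """Fraction of subtrees matching each taxon topology (the fractions sum to 1)."""
    total = sum(match_counts)
    return [count / total for count in match_counts]


# Four taxa can be related in only 3 ways, but a full tree with 10 sequences
# per taxon has 40 tips and an enormous number of possible topologies.
taxon_topologies = num_unrooted_topologies(4)
full_tree_topologies = num_unrooted_topologies(40)
subtrees = num_subtrees([10, 10, 10, 10])

# Hypothetical counts of subtrees matching each of the 3 taxon topologies.
weights = topology_weights([6000, 3000, 1000])
```

Summarising the full tree by three weights, instead of naming one of the many 40-tip topologies, is the reduction in complexity that the abstract refers to.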


Journal ArticleDOI
01 Jul 2017-Genetics
TL;DR: A tractable model of ordinary differential equations for the evolution of allele frequencies that is closely related to the diffusion approximation but avoids many of its limitations and approximations is proposed.
Abstract: Understanding variation in allele frequencies across populations is a central goal of population genetics. Classical models for the distribution of allele frequencies, using forward simulation, coalescent theory, or the diffusion approximation, have been applied extensively for demographic inference, medical study design, and evolutionary studies. Here we propose a tractable model of ordinary differential equations for the evolution of allele frequencies that is closely related to the diffusion approximation but avoids many of its limitations and approximations. We show that the approach is typically faster, more numerically stable, and more easily generalizable than the state-of-the-art software implementation of the diffusion approximation. We present a number of applications to human sequence data, including demographic inference with a five-population joint frequency spectrum and a discussion of the robustness of the out-of-Africa model inference to the choice of modern population.

168 citations


Journal ArticleDOI
01 Aug 2017-Genetics
TL;DR: This review summarizes the large body of genetic and biochemical experiments in Drosophila on Polycomb group (PcG) and Trithorax group (TrxG) genes, which encode important regulators of development and differentiation in metazoans.
Abstract: Polycomb group (PcG) and Trithorax group (TrxG) genes encode important regulators of development and differentiation in metazoans. These two groups of genes were discovered in Drosophila by their opposing effects on homeotic gene (Hox) expression. PcG genes collectively behave as genetic repressors of Hox genes, while the TrxG genes are necessary for HOX gene expression or function. Biochemical studies showed that many PcG proteins are present in two protein complexes, Polycomb repressive complexes 1 and 2, which repress transcription via chromatin modifications. TrxG proteins activate transcription via a variety of mechanisms. Here we summarize the large body of genetic and biochemical experiments in Drosophila on these two important groups of genes.

160 citations


Journal ArticleDOI
01 Mar 2017-Genetics
TL;DR: This review describes the main concepts, methods, and landmarks of molecular population genetics, using the Drosophila model as a reference, and describes the different genetic data sets made available by advances in molecular technologies, and the theoretical developments fostered by these data.
Abstract: Molecular population genetics aims to explain genetic variation and molecular evolution from population genetics principles. The field was born 50 years ago with the first measures of genetic variation in allozyme loci, continued with the nucleotide sequencing era, and is currently in the era of population genomics. During this period, molecular population genetics has been revolutionized by progress in data acquisition and theoretical developments. The conceptual elegance of the neutral theory of molecular evolution or the footprint carved by natural selection on the patterns of genetic variation are two examples of the vast number of inspiring findings of population genetics research. Since the inception of the field, Drosophila has been the prominent model species: molecular variation in populations was first described in Drosophila and most of the population genetics hypotheses were tested in Drosophila species. In this review, we describe the main concepts, methods, and landmarks of molecular population genetics, using the Drosophila model as a reference. We describe the different genetic data sets made available by advances in molecular technologies, and the theoretical developments fostered by these data. Finally, we review the results and new insights provided by the population genomics approach, and conclude by enumerating challenges and new lines of inquiry posed by increasingly large population scale sequence data.

159 citations


Journal ArticleDOI
01 May 2017-Genetics
TL;DR: A flexible and computationally tractable method, called Fit∂a∂i, is presented to estimate the DFE of new mutations using the site frequency spectrum from a large number of individuals, suggesting that nearly neutral forces play a larger role in human evolution than previously thought.
Abstract: The distribution of fitness effects (DFE) has considerable importance in population genetics. To date, estimates of the DFE come from studies using a small number of individuals. Thus, estimates of the proportion of moderately to strongly deleterious new mutations may be unreliable because such variants are unlikely to be segregating in the data. Additionally, the true functional form of the DFE is unknown, and estimates of the DFE differ significantly between studies. Here we present a flexible and computationally tractable method, called Fit∂a∂i, to estimate the DFE of new mutations using the site frequency spectrum from a large number of individuals. We apply our approach to the frequency spectrum of 1300 Europeans from the Exome Sequencing Project ESP6400 data set, 1298 Danes from the LuCamp data set, and 432 Europeans from the 1000 Genomes Project to estimate the DFE of deleterious nonsynonymous mutations. We infer significantly fewer (0.38-0.84 fold) strongly deleterious mutations with selection coefficient |s| > 0.01 and more (1.24-1.43 fold) weakly deleterious mutations with selection coefficient |s| < 0.001 compared to previous estimates. Furthermore, a DFE that is a mixture distribution of a point mass at neutrality plus a gamma distribution fits better than a gamma distribution in two of the three data sets. Our results suggest that nearly neutral forces play a larger role in human evolution than previously thought.

Journal ArticleDOI
01 Sep 2017-Genetics
TL;DR: It is discussed how merging human genetics with model organism research guides experimental studies to solve these medical mysteries, gain new insights into disease pathogenesis, and uncover new therapeutic strategies.
Abstract: Efforts to identify the genetic underpinnings of rare undiagnosed diseases increasingly involve the use of next-generation sequencing and comparative genomic hybridization methods. These efforts are limited by a lack of knowledge regarding gene function, and an inability to predict the impact of genetic variation on the encoded protein function. Diagnostic challenges posed by undiagnosed diseases have solutions in model organism research, which provides a wealth of detailed biological information. Model organism geneticists are by necessity experts in particular genes, gene families, specific organs, and biological functions. Here, we review the current state of research into undiagnosed diseases, highlighting large efforts in North America and internationally, including the Undiagnosed Diseases Network (UDN) (Supplemental Material, File S1) and UDN International (UDNI), the Centers for Mendelian Genomics (CMG), and the Canadian Rare Diseases Models and Mechanisms Network (RDMM). We discuss how merging human genetics with model organism research guides experimental studies to solve these medical mysteries, gain new insights into disease pathogenesis, and uncover new therapeutic strategies.

Journal ArticleDOI
01 Oct 2017-Genetics
TL;DR: Multivariable Mendelian randomization using summarized genetic data provides a rapid and accessible analytic strategy that can be undertaken using publicly available data to better understand causal mechanisms.
Abstract: Mendelian randomization is the use of genetic variants as instrumental variables to estimate causal effects of risk factors on outcomes. The total causal effect of a risk factor is the change in the outcome resulting from intervening on the risk factor. This total causal effect may potentially encompass multiple mediating mechanisms. For a proposed mediator, the direct effect of the risk factor is the change in the outcome resulting from a change in the risk factor, keeping the mediator constant. A difference between the total effect and the direct effect indicates that the causal pathway from the risk factor to the outcome acts at least in part via the mediator (an indirect effect). Here, we show that Mendelian randomization estimates of total and direct effects can be obtained using summarized data on genetic associations with the risk factor, mediator, and outcome, potentially from different data sources. We perform simulations to test the validity of this approach when there is unmeasured confounding and/or bidirectional effects between the risk factor and mediator. We illustrate this method using the relationship between age at menarche and risk of breast cancer, with body mass index (BMI) as a potential mediator. We show an inverse direct causal effect of age at menarche on risk of breast cancer (independent of BMI), and a positive indirect effect via BMI. In conclusion, multivariable Mendelian randomization using summarized genetic data provides a rapid and accessible analytic strategy that can be undertaken using publicly available data to better understand causal mechanisms.
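
The relationship between total, direct, and indirect effects described above can be sketched with summarized data. This is a minimal illustration, assuming inverse-variance weighted regression through the origin of the variant-outcome associations on the variant-exposure associations. The function name `ivw_regression` and all numbers are hypothetical, and the data are synthetic and noise-free, so this is not the authors' implementation or their breast cancer analysis.

```python
import numpy as np


def ivw_regression(beta_exposures, beta_outcome, se_outcome):
    """Inverse-variance weighted regression through the origin.

    beta_exposures: per-variant associations with one or more exposures,
                    shape (n_variants,) or (n_variants, n_exposures).
    beta_outcome:   per-variant associations with the outcome.
    se_outcome:     standard errors of the outcome associations.
    Returns one causal effect estimate per exposure column.
    """
    y = np.asarray(beta_outcome, dtype=float)
    X = np.asarray(beta_exposures, dtype=float).reshape(len(y), -1)
    sqrt_w = 1.0 / np.asarray(se_outcome, dtype=float)  # weights are 1 / se^2
    coef, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef


# Synthetic summarized data (assumed values): variants 0-9 act on the risk
# factor and reach the mediator only through it; variants 10-19 act on the
# mediator alone.
rng = np.random.default_rng(0)
bx = np.concatenate([rng.normal(0.1, 0.02, 10), np.zeros(10)])
bm = np.concatenate([0.4 * bx[:10], rng.normal(0.1, 0.02, 10)])
by = 0.5 * bx + 0.3 * bm  # direct effect 0.5, mediator effect 0.3
se = np.full(20, 0.01)

# Total effect: univariable analysis using the risk-factor variants only.
total = ivw_regression(bx[:10], by[:10], se[:10])[0]
# Direct effect: multivariable analysis adjusting for the mediator.
direct, mediator_effect = ivw_regression(np.column_stack([bx, bm]), by, se)
# A nonzero difference indicates a pathway acting via the mediator.
indirect = total - direct
```

In this toy setup the indirect effect equals the risk factor's effect on the mediator (0.4) times the mediator's effect on the outcome (0.3). Real analyses also need standard errors for the estimates and checks of the instrumental variable assumptions.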

Journal ArticleDOI
01 Jun 2017-Genetics
TL;DR: Joint linkage mapping for two major adaptive traits, flowering time and plant height, validate the NAM resource for trait mapping in sorghum, and demonstrate the value of NAM for dissection of adaptive traits.
Abstract: Adaptation of domesticated species to diverse agroclimatic regions has led to abundant trait diversity. However, the resulting population structure and genetic heterogeneity confounds association mapping of adaptive traits. To address this challenge in sorghum [Sorghum bicolor (L.) Moench], a widely adapted cereal crop, we developed a nested association mapping (NAM) population using 10 diverse global lines crossed with an elite reference line RTx430. We characterized the population of 2214 recombinant inbred lines at 90,000 SNPs using genotyping-by-sequencing. The population captures ∼70% of known global SNP variation in sorghum, and 57,411 recombination events. Notably, recombination events were four- to fivefold enriched in coding sequences and 5' untranslated regions of genes. To test the power of the NAM population for trait dissection, we conducted joint linkage mapping for two major adaptive traits, flowering time and plant height. We precisely mapped several known genes for these two traits, and identified several additional QTL. Considering all SNPs simultaneously, genetic variation accounted for 65% of flowering time variance and 75% of plant height variance. Further, we directly compared NAM to genome-wide association mapping (using panels of the same size) and found that flowering time and plant height QTL were more consistently identified with the NAM population. Finally, for simulated QTL under strong selection in diversity panels, the power of QTL detection was up to three times greater for NAM vs. association mapping with a diverse panel. These findings validate the NAM resource for trait mapping in sorghum, and demonstrate the value of NAM for dissection of adaptive traits.

Journal ArticleDOI
01 Jul 2017-Genetics
TL;DR: The results identify additional components of the RNAi inheritance machinery whose conservation provides insights into the molecular mechanism of RNAi inheritance, further the understanding of how the RNAi inheritance machinery promotes germline immortality, and show that HRDE-2 couples the inheritance Ago HRDE-1 with the small RNAs it needs to direct RNAi inheritance and germline immortality.
Abstract: Gene silencing mediated by dsRNA (RNAi) can persist for multiple generations in Caenorhabditis elegans (termed RNAi inheritance). Here we describe the results of a forward genetic screen in C. elegans that has identified six factors required for RNAi inheritance: GLH-1/VASA, PUP-1/CDE-1, MORC-1, SET-32, and two novel nematode-specific factors that we term here (heritable RNAi defective) HRDE-2 and HRDE-4. The new RNAi inheritance factors exhibit mortal germline (Mrt) phenotypes, which we show is likely caused by epigenetic deregulation in germ cells. We also show that HRDE-2 contributes to RNAi inheritance by facilitating the binding of small RNAs to the inheritance Argonaute (Ago) HRDE-1. Together, our results identify additional components of the RNAi inheritance machinery whose conservation provides insights into the molecular mechanism of RNAi inheritance, further our understanding of how the RNAi inheritance machinery promotes germline immortality, and show that HRDE-2 couples the inheritance Ago HRDE-1 with the small RNAs it needs to direct RNAi inheritance and germline immortality.

Journal ArticleDOI
01 Mar 2017-Genetics
TL;DR: An approach to estimate the nonlinear scales of arbitrary genotype-phenotype maps and extract high-order epistasis is developed, which provides strong evidence for extensive high-order epistasis, even after nonlinear scale is taken into account.
Abstract: High-order epistasis has been observed in many genotype-phenotype maps. These multi-way interactions between mutations may be useful for dissecting complex traits and could have profound implications for evolution. Alternatively, they could be a statistical artifact. High-order epistasis models assume the effects of mutations should add, when they could in fact multiply or combine in some other nonlinear way. A mismatch in the “scale” of the epistasis model and the scale of the underlying map would lead to spurious epistasis. In this article, we develop an approach to estimate the nonlinear scales of arbitrary genotype-phenotype maps. We can then linearize these maps and extract high-order epistasis. We investigated seven experimental genotype-phenotype maps for which high-order epistasis had been reported previously. We find that five of the seven maps exhibited nonlinear scales. Interestingly, even after accounting for nonlinearity, we found statistically significant high-order epistasis in all seven maps. The contributions of high-order epistasis to the total variation ranged from 2.2 to 31.0%, with an average across maps of 12.7%. Our results provide strong evidence for extensive high-order epistasis, even after nonlinear scale is taken into account. Further, we describe a simple method to estimate and account for nonlinearity in genotype-phenotype maps.
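
The scale mismatch described above can be illustrated in its simplest (pairwise) form. The sketch assumes a toy two-mutation map whose effects multiply, and a logarithm as the linearizing transform. Both are hypothetical choices for illustration; the abstract's method estimates the nonlinear scale from data and addresses high-order terms, which this example does not attempt.

```python
import math


def pairwise_epistasis(p_wt, p_a, p_b, p_ab, transform=lambda p: p):
    """Deviation of the double mutant from additivity on the chosen scale."""
    t = transform
    return t(p_ab) - t(p_a) - t(p_b) + t(p_wt)


# Toy phenotypes (assumed): mutation A doubles and mutation B triples the
# wild-type phenotype, and the double mutant is exactly multiplicative.
phenotypes = dict(p_wt=1.0, p_a=2.0, p_b=3.0, p_ab=6.0)

# An additive model applied to the raw values reports epistasis...
on_linear_scale = pairwise_epistasis(**phenotypes)
# ...which disappears once the map is linearized with the matching transform.
on_log_scale = pairwise_epistasis(**phenotypes, transform=math.log)
```

Here the apparent interaction on the raw scale is entirely spurious. Epistasis that persists after an appropriate linearization is the kind the abstract reports as remaining in all seven maps.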

Journal ArticleDOI
01 Nov 2017-Genetics
TL;DR: The data suggest that amh might act as a guardian to control the balance between proliferation and differentiation of male germ cells, whereas dmrt1 might be required for the maintenance, self-renewal, and differentiation of male germ cells.
Abstract: Spermatogenesis is a fundamental process in male reproductive biology and depends on precise balance between self-renewal and differentiation of male germ cells. However, the regulative factors for controlling the balance are poorly understood. In this study, we examined the roles of amh and dmrt1 in male germ cell development by generating their mutants with CRISPR/Cas9 technology in zebrafish. Amh mutant zebrafish displayed a female-biased sex ratio, and both male and female amh mutants developed hypertrophic gonads due to uncontrolled proliferation and impaired differentiation of germ cells. A large number of proliferating spermatogonium-like cells were observed within testicular lobules of the amh-mutated testes, and they were demonstrated to be both Vasa- and PH3-positive. Moreover, the average number of Sycp3- and Vasa-positive cells in the amh mutants was significantly lower than in wild-type testes, suggesting a severely impaired differentiation of male germ cells. Conversely, all the dmrt1-mutated testes displayed severe testicular developmental defects and gradual loss of all Vasa-positive germ cells by inhibiting their self-renewal and inducing apoptosis. In addition, several germ cell and Sertoli cell marker genes were significantly downregulated, whereas a prominent increase of Insl3-positive Leydig cells was revealed by immunohistochemical analysis in the disorganized dmrt1-mutated testes. Our data suggest that amh might act as a guardian to control the balance between proliferation and differentiation of male germ cells, whereas dmrt1 might be required for the maintenance, self-renewal, and differentiation of male germ cells. Significantly, this study unravels novel functions of amh gene in fish.

Journal ArticleDOI
01 Jul 2017-Genetics
TL;DR: Since the pig is a well-suited animal for modeling the human digestive tract, M-BLUP might be beneficial for predicting human predispositions to some diseases, and, consequently, for preventative and personalized medicine.
Abstract: The aim of the present study was to analyze the interplay between gastrointestinal tract (GIT) microbiota, host genetics, and complex traits in pigs using extended quantitative-genetic methods. The study design consisted of 207 pigs that were housed and slaughtered under standardized conditions, and phenotyped for daily gain, feed intake, and feed conversion rate. The pigs were genotyped with a standard 60 K SNP chip. The GIT microbiota composition was analyzed by 16S rRNA gene amplicon sequencing technology. Eight of the 49 investigated bacterial genera showed significant narrow-sense host heritability, ranging from 0.32 to 0.57. Microbial mixed linear models were applied to estimate the microbiota variance for each complex trait. The fraction of phenotypic variance explained by the microbial variance was 0.28, 0.21, and 0.16 for daily gain, feed conversion, and feed intake, respectively. The SNP data and the microbiota composition were used to predict the complex traits using genomic best linear unbiased prediction (G-BLUP) and microbial best linear unbiased prediction (M-BLUP) methods, respectively. The prediction accuracies of G-BLUP were 0.35, 0.23, and 0.20 for daily gain, feed conversion, and feed intake, respectively. The corresponding prediction accuracies of M-BLUP were 0.41, 0.33, and 0.33. Thus, in addition to SNP data, microbiota abundances are an informative source of complex trait predictions. Since the pig is a well-suited animal for modeling the human digestive tract, M-BLUP, in addition to G-BLUP, might be beneficial for predicting human predispositions to some diseases, and, consequently, for preventative and personalized medicine.
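The microbial mixed model and M-BLUP described above both rest on a matrix that records how similar animals are in their microbiota composition. The sketch below shows one plausible way to build such a matrix. The log transform, the pseudocount, the per-genus standardization, and the function name `microbial_relationship_matrix` are assumptions made for illustration and are not taken from the abstract.

```python
import math

def microbial_relationship_matrix(abundances):
    """Build an animals x animals similarity matrix from an
    animals x genera table of relative abundances (assumed layout)."""
    n = len(abundances)
    p = len(abundances[0])
    # Log-transform with a small pseudocount (hypothetical choice).
    logged = [[math.log(a + 1e-6) for a in row] for row in abundances]
    # Standardize each genus (column) to mean 0 and variance 1.
    z = [[0.0] * p for _ in range(n)]
    for j in range(p):
        col = [logged[i][j] for i in range(n)]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i in range(n):
            z[i][j] = (col[i] - mean) / sd if sd > 0 else 0.0
    # M = Z Z^T / p, analogous in form to a SNP-based relationship matrix.
    return [[sum(z[i][k] * z[j][k] for k in range(p)) / p for j in range(n)]
            for i in range(n)]

# Toy abundances for 4 animals and 3 genera (made-up numbers).
abund = [
    [0.50, 0.30, 0.20],
    [0.10, 0.60, 0.30],
    [0.25, 0.25, 0.50],
    [0.40, 0.10, 0.50],
]
M = microbial_relationship_matrix(abund)
```

In a BLUP setting, a matrix of this kind would occupy the place that the SNP-based genomic relationship matrix occupies in G-BLUP. With standardized columns its diagonal averages to 1, which keeps the associated variance component on the scale of the phenotype.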

Journal ArticleDOI
01 Jul 2017-Genetics
TL;DR: It is proved for the first time that, even when the population is not in HWE, the multiple-loci NOIA method is equivalent to constructing epistatic genomic relationship matrices for higher-order interactions using Hadamard products of additive and dominant genomic orthogonal relationships.
Abstract: Genomic prediction methods based on multiple markers have potential to include nonadditive effects in prediction and analysis of complex traits. However, most developments assume a Hardy–Weinberg equilibrium (HWE). Statistical approaches for genomic selection that account for dominance and epistasis in a general context, without assuming HWE (e.g., crosses or homozygous lines), are therefore needed. Our method expands the natural and orthogonal interactions (NOIA) approach, which builds incidence matrices based on genotypic (not allelic) frequencies, to include genome-wide epistasis for an arbitrary number of interacting loci in a genomic evaluation context. This results in an orthogonal partition of the variances, which is not warranted otherwise. We also present the partition of variance as a function of genotypic values and frequencies following Cockerham’s orthogonal contrast approach. Then we prove for the first time that, even when the population is not in HWE, the multiple-loci NOIA method is equivalent to constructing epistatic genomic relationship matrices for higher-order interactions using Hadamard products of additive and dominant genomic orthogonal relationships. A standardization based on the trace of the relationship matrices is, however, needed. We illustrate these results with two simulated F1 (not in HWE) populations, either in linkage equilibrium (LE), or in linkage disequilibrium (LD) and divergent selection, and pure biological dominant pairwise epistasis. In the LE case, correct and orthogonal estimates of variances were obtained using NOIA genomic relationships but not if relationships were constructed assuming HWE. For the LD simulation, differences were smaller, due to the smaller deviation of the F1 from HWE. Wrongly assuming HWE to build genomic relationships and estimate variance components yields biased estimates that are not empirically orthogonal and inflates the total genetic variance. The NOIA method to build genomic relationships, coupled with the use of Hadamard products for epistatic terms, yields correct estimates in populations either in HWE or not in HWE, and extends to any order of epistatic interactions.
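The Hadamard-product construction with trace standardization mentioned in this abstract can be sketched in a few lines. The example assumes that additive and dominance relationship matrices are already available; the matrices `G_A` and `G_D` below are made-up values, and the function names are hypothetical rather than taken from the paper or any package.

```python
def hadamard(a, b):
    """Element-wise (Hadamard) product of two equally sized square matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def trace_standardize(k):
    """Divide a matrix by its mean diagonal, i.e. by trace(k) / n."""
    n = len(k)
    mean_diag = sum(k[i][i] for i in range(n)) / n
    return [[v / mean_diag for v in row] for row in k]

def epistatic_relationship(*matrices):
    """Hadamard product of any number of relationship matrices,
    rescaled so that the mean diagonal equals 1."""
    out = matrices[0]
    for m in matrices[1:]:
        out = hadamard(out, m)
    return trace_standardize(out)

# Made-up additive and dominance relationships for 3 individuals.
G_A = [[1.2, 0.3, -0.1], [0.3, 0.9, 0.2], [-0.1, 0.2, 0.9]]
G_D = [[0.8, 0.1, 0.0], [0.1, 1.1, -0.2], [0.0, -0.2, 1.1]]

G_AA = epistatic_relationship(G_A, G_A)        # additive x additive
G_AD = epistatic_relationship(G_A, G_D)        # additive x dominance
G_AAD = epistatic_relationship(G_A, G_A, G_D)  # a third-order term
```

Because the function accepts any number of matrices, the same call pattern covers higher-order interactions, which mirrors the abstract's statement that the approach extends to any order of epistasis.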

Journal ArticleDOI
25 Sep 2017-Genetics
TL;DR: A hierarchical probabilistic framework is developed that extends previous methods to infer DFE and α from polymorphism data alone; it is compared with one of the most widely used inference methods available and applied to a recently published chimpanzee exome data set.
Abstract: The distribution of fitness effects (DFE) encompasses the fraction of deleterious, neutral, and beneficial mutations. It conditions the evolutionary trajectory of populations, as well as the rate o ...

Journal ArticleDOI
01 Aug 2017-Genetics
TL;DR: Treatment of population structure, relatedness, and inbreeding are recast to make explicit that the parameters of interest involve the differences in degrees of allelic dependence between the target and the reference sets of alleles, and so can be negative.
Abstract: Many population genetic activities, ranging from evolutionary studies to association mapping, to forensic identification, rely on appropriate estimates of population structure or relatedness. All applications require recognition that quantities with an underlying meaning of allelic dependence are not defined in an absolute sense, but instead are made "relative to" some set of alleles other than the target set. The 1984 Weir and Cockerham FST estimate made explicit that the reference set of alleles was across populations, whereas standard kinship estimates do not make the reference explicit. Weir and Cockerham stated that their FST estimates were for independent populations, and standard kinship estimates have an implicit assumption that pairs of individuals in a study sample, other than the target pair, are unrelated or are not inbred. However, populations lose independence when there is migration between them, and dependencies between pairs of individuals in a population exist for more than one target pair. We have therefore recast our treatments of population structure, relatedness, and inbreeding to make explicit that the parameters of interest involve the differences in degrees of allelic dependence between the target and the reference sets of alleles, and so can be negative. We take the reference set to be the population from which study individuals have been sampled. We provide simple moment estimates of these parameters, phrased in terms of allelic matching within and between individuals for relatedness and inbreeding, or within and between populations for population structure. A multi-level hierarchy of alleles within individuals, alleles between individuals within populations, and alleles between populations, allows a unified treatment of relatedness and population structure. We expect our new measures to have a wide range of applications, but we note that their estimates are sensitive to rare or private variants: some population-characterization applications suggest exploiting those sensitivities, whereas estimation of relatedness may best use all genetic markers without filtering on minor allele frequency.
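A matching-based moment estimate for population structure can be illustrated with a small sketch. It assumes biallelic loci, treats the supplied allele frequencies as known (so it omits any small-sample correction), and uses hypothetical names such as `beta_estimates`; it is an illustration of the idea of comparing within-population matching to between-population matching, not the authors' estimator in full.

```python
from itertools import combinations

def matching(p, q):
    """Probability that two alleles match at a biallelic locus when drawn
    from populations with reference-allele frequencies p and q."""
    return p * q + (1 - p) * (1 - q)

def beta_estimates(freqs):
    """freqs[i][l] is the reference-allele frequency in population i at locus l.
    Returns (population-specific values, their average)."""
    n_pops = len(freqs)
    n_loci = len(freqs[0])
    # Average matching of allele pairs drawn within each population.
    within = [sum(matching(f[l], f[l]) for l in range(n_loci)) / n_loci
              for f in freqs]
    # Average matching of allele pairs drawn from different populations.
    pairs = list(combinations(range(n_pops), 2))
    between = sum(matching(freqs[i][l], freqs[j][l])
                  for i, j in pairs for l in range(n_loci)) / (len(pairs) * n_loci)
    # Within-population matching expressed relative to the between-population reference.
    betas = [(m - between) / (1 - between) for m in within]
    return betas, sum(betas) / n_pops

# Made-up frequencies: populations 0 and 2 are similar, population 1 is intermediate.
freqs = [[0.9, 0.9], [0.5, 0.5], [0.9, 0.9]]
betas, overall = beta_estimates(freqs)
```

In this toy input, alleles within population 1 match less often than alleles drawn from different populations do on average, so its value comes out negative, which is the behavior the abstract highlights.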

Journal ArticleDOI
01 Dec 2017-Genetics
TL;DR: Drosophila is emerging as an important model system for research on carbohydrate metabolism, due to the high degree of conservation of relevant regulatory pathways, as well as vast possibilities for the analysis of gene–nutrient interactions and tissue-specific gene function.
Abstract: Carbohydrate metabolism is essential for cellular energy balance as well as for the biosynthesis of new cellular building blocks. As animal nutrient intake displays temporal fluctuations and each cell type within the animal possesses specific metabolic needs, elaborate regulatory systems are needed to coordinate carbohydrate metabolism in time and space. Carbohydrate metabolism is regulated locally through gene regulatory networks and signaling pathways, which receive inputs from nutrient sensors as well as other pathways, such as developmental signals. Superimposed on cell-intrinsic control, hormonal signaling mediates intertissue information to maintain organismal homeostasis. Misregulation of carbohydrate metabolism is causative for many human diseases, such as diabetes and cancer. Recent work in Drosophila melanogaster has uncovered new regulators of carbohydrate metabolism and introduced novel physiological roles for previously known pathways. Moreover, genetically tractable Drosophila models to study carbohydrate metabolism-related human diseases have provided new insight into the mechanisms of pathogenesis. Due to the high degree of conservation of relevant regulatory pathways, as well as vast possibilities for the analysis of gene–nutrient interactions and tissue-specific gene function, Drosophila is emerging as an important model system for research on carbohydrate metabolism.

Journal ArticleDOI
Yan-Jing Yang1, Yang Wang1, Zhi Li1, Li Zhou1, Jian-Fang Gui1 
01 Apr 2017-Genetics
TL;DR: A model in which foxl2a and foxl2b cooperate to regulate zebrafish ovary development and maintenance is proposed, with foxl2b potentially having a dominant role in preventing the ovary from differentiating as testis, as compared to foxl2a.
Abstract: Foxl2 is essential for mammalian ovary maintenance. Although sexually dimorphic expression of foxl2 was observed in many teleosts, its role and regulative mechanism in fish remained largely unclear. In this study, we first identified two transcript variants of foxl2a and its homologous gene foxl2b in zebrafish, and revealed their specific expression in follicular layer cells in a sequential and divergent fashion during ovary differentiation, maturation, and maintenance. Then, homozygous foxl2a mutants (foxl2a−/−) and foxl2b mutants (foxl2b−/−) were constructed and detailed comparisons, such as sex ratio, gonadal histological structure, transcriptome profiling, and dynamic expression of gonadal development-related genes, were carried out. Initial ovarian differentiation and oocyte development occur normally both in foxl2a−/− and foxl2b−/− mutants, but foxl2a and foxl2b disruptions result in premature ovarian failure and partial sex reversal, respectively, in adult females. In foxl2a−/− female mutants, sox9a-amh/cyp19a1a signaling was upregulated at 150 days postfertilization (dpf) and subsequently oocyte apoptosis was triggered after 180 dpf. In contrast, dmrt1 expression was greater at 105 dpf and increased several 100-fold in foxl2b−/− mutated ovaries at 270 dpf, along with other testis-related genes. Finally, homozygous foxl2a−/−/foxl2b−/− double mutants were constructed in which complete sex reversal occurs early and testis-differentiation genes robustly increase at 60 dpf. Given mutual compensation between foxl2a and foxl2b in foxl2b−/− and foxl2a−/− mutants, we proposed a model in which foxl2a and foxl2b cooperate to regulate zebrafish ovary development and maintenance, with foxl2b potentially having a dominant role in preventing the ovary from differentiating as testis, as compared to foxl2a.

Journal ArticleDOI
01 Dec 2017-Genetics
TL;DR: This work identifies a single locus at which both independent mutation events and selection on an allele shared via gene flow, either slightly before or during selection, play a role in adaptation across the species’ range.
Abstract: Geographically separated populations can convergently adapt to the same selection pressure. Convergent evolution at the level of a gene may arise via three distinct modes. The selected alleles can (1) have multiple independent mutational origins, (2) be shared due to shared ancestral standing variation, or (3) spread throughout subpopulations via gene flow. We present a model-based, statistical approach that utilizes genomic data to detect cases of convergent adaptation at the genetic level, identify the loci involved and distinguish among these modes. To understand the impact of convergent positive selection on neutral diversity at linked loci, we make use of the fact that hitchhiking can be modeled as an increase in the variance in neutral allele frequencies around a selected site within a population. We build on coalescent theory to show how shared hitchhiking events between subpopulations act to increase covariance in allele frequencies between subpopulations at loci near the selected site, and extend this theory under different models of migration and selection on the same standing variation. We incorporate this hitchhiking effect into a multivariate normal model of allele frequencies that also accounts for population structure. Based on this theory, we present a composite-likelihood-based approach that utilizes genomic data to identify loci involved in convergence, and distinguishes among alternate modes of convergent adaptation. We illustrate our method on genome-wide polymorphism data from two distinct cases of convergent adaptation. First, we investigate the adaptation for copper toxicity tolerance in two populations of the common yellow monkey flower, Mimulus guttatus. We show that selection has occurred on an allele that has been standing in these populations prior to the onset of copper mining in this region. Lastly, we apply our method to data from four populations of the killifish, Fundulus heteroclitus, that show very rapid convergent adaptation for tolerance to industrial pollutants. Here, we identify a single locus at which both independent mutation events and selection on an allele shared via gene flow, either slightly before or during selection, play a role in adaptation across the species’ range.

Journal ArticleDOI
01 Jan 2017-Genetics
TL;DR: This work proposes a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses and shows how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS.
Abstract: With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated to multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the North Finland Birth Cohort 66 cohort study.
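The counting problem described in this abstract can be made concrete with a small sketch. It uses the standard Benjamini-Hochberg step-up procedure as a baseline, which is not the prescreening approach the paper proposes, and then groups rejected SNPs into loci after the fact. The p-values and the SNP-to-locus assignment are made-up, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the indices of hypotheses rejected by the
    Benjamini-Hochberg step-up procedure at target level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    # Find the largest rank k with p_(k) <= q * k / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= q * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])

# Made-up SNP-level p-values and the locus each SNP is assigned to.
pvals = [0.001, 0.002, 0.003, 0.04, 0.20, 0.60]
snp_locus = ["locusA", "locusA", "locusA", "locusB", "locusB", "locusC"]

rejected = benjamini_hochberg(pvals, q=0.05)
# A posteriori aggregation: neighboring rejected SNPs are reported as one locus.
loci_found = {snp_locus[i] for i in rejected}
```

Here several SNP-level rejections collapse into a single reported locus. The error rate was controlled over SNP hypotheses, yet discoveries are counted over loci, and that mismatch is why the abstract argues the relevant FDR is no longer guaranteed after aggregation.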

Journal ArticleDOI
01 Sep 2017-Genetics
TL;DR: This work analyzed 34,373 mutations in 14 proteins whose effects were measured using large-scale mutagenesis approaches and found that several substitutions, including histidine and asparagine, best recapitulated the effects of other substitutions when the identity of the wild-type amino acid was considered.
Abstract: Mutagenesis is a widely used method for identifying protein positions that are important for function or ligand binding. Advances in high-throughput DNA sequencing and mutagenesis techniques have enabled measurement of the effects of nearly all possible amino acid substitutions in many proteins. The resulting large-scale mutagenesis data sets offer a unique opportunity to draw general conclusions about the effects of different amino acid substitutions. Thus, we analyzed 34,373 mutations in 14 proteins whose effects were measured using large-scale mutagenesis approaches. Methionine was the most tolerated substitution, while proline was the least tolerated. We found that several substitutions, including histidine and asparagine, best recapitulated the effects of other substitutions, even when the identity of the wild-type amino acid was considered. The effects of histidine and asparagine substitutions also correlated best with the effects of other substitutions in different structural contexts. Furthermore, highly disruptive substitutions like aspartic and glutamic acid had the most discriminatory power for detecting ligand interface positions. Our work highlights the utility of large-scale mutagenesis data, and our conclusions can help guide future single substitution mutational scans.

Journal ArticleDOI
01 Apr 2017-Genetics
TL;DR: These results point to previously undescribed mechanisms for modulating the color of specific wing pattern elements in butterflies, and provide an expanded portrait of the insect melanin pathway.
Abstract: Despite the variety, prominence, and adaptive significance of butterfly wing patterns, surprisingly little is known about the genetic basis of wing color diversity. Even though there is intense interest in wing pattern evolution and development, the technical challenge of genetically manipulating butterflies has slowed efforts to functionally characterize color pattern development genes. To identify candidate wing pigmentation genes, we used RNA sequencing to characterize transcription across multiple stages of butterfly wing development, and between different color pattern elements, in the painted lady butterfly Vanessa cardui. This allowed us to pinpoint genes specifically associated with red and black pigment patterns. To test the functions of a subset of genes associated with presumptive melanin pigmentation, we used clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 genome editing in four different butterfly genera. pale, Ddc, and yellow knockouts displayed reduction of melanin pigmentation, consistent with previous findings in other insects. Interestingly, however, yellow-d, ebony, and black knockouts revealed that these genes have localized effects on tuning the color of red, brown, and ochre pattern elements. These results point to previously undescribed mechanisms for modulating the color of specific wing pattern elements in butterflies, and provide an expanded portrait of the insect melanin pathway.

Journal ArticleDOI
01 Jun 2017-Genetics
TL;DR: Questions about basic mechanisms of evolution, speciation, hybridization, domestication, as well as about the molecular machineries underlying them are examined at two distinct levels offered by the broad evolutionary range of yeasts: inside the best-studied Saccharomyces species complex, and across the entire and diversified subphylum of Saccharomycotina.
Abstract: Considerable progress in our understanding of yeast genomes and their evolution has been made over the last decade with the sequencing, analysis, and comparisons of numerous species, strains, or isolates of diverse origins. The role played by yeasts in natural environments as well as in artificial manufactures, combined with the importance of some species as model experimental systems sustained this effort. At the same time, their enormous evolutionary diversity (there are yeast species in every subphylum of Dikarya) sparked curiosity but necessitated further efforts to obtain appropriate reference genomes. Today, yeast genomes have been very informative about basic mechanisms of evolution, speciation, hybridization, domestication, as well as about the molecular machineries underlying them. They are also irreplaceable for investigating in detail the complex relationship between genotypes and phenotypes with both theoretical and practical implications. This review examines these questions at two distinct levels offered by the broad evolutionary range of yeasts: inside the best-studied Saccharomyces species complex, and across the entire and diversified subphylum of Saccharomycotina. While obviously revealing evolutionary histories at different scales, data converge to a remarkably coherent picture in which one can estimate the relative importance of intrinsic genome dynamics, including gene birth and loss, vs. horizontal genetic accidents in the making of populations. The facility with which novel yeast genomes can now be studied, combined with the already numerous available reference genomes, offers privileged perspectives to further examine these fundamental biological questions using yeasts both as eukaryotic models and as fungi of practical importance.

Journal ArticleDOI
01 Feb 2017-Genetics
TL;DR: In a surprising number of cases, however, DNA repair genes whose products play important roles in these pathways in other organisms are missing from the Drosophila genome, raising interesting questions for continued investigations.
Abstract: The numerous processes that damage DNA are counterbalanced by a complex network of repair pathways that, collectively, can mend diverse types of damage. Insights into these pathways have come from studies in many different organisms, including Drosophila melanogaster. Indeed, the first ideas about chromosome and gene repair grew out of Drosophila research on the properties of mutations produced by ionizing radiation and mustard gas. Numerous methods have been developed to take advantage of Drosophila genetic tools to elucidate repair processes in whole animals, organs, tissues, and cells. These studies have led to the discovery of key DNA repair pathways, including synthesis-dependent strand annealing, and DNA polymerase theta-mediated end joining. Drosophila appear to utilize other major repair pathways as well, such as base excision repair, nucleotide excision repair, mismatch repair, and interstrand crosslink repair. In a surprising number of cases, however, DNA repair genes whose products play important roles in these pathways in other organisms are missing from the Drosophila genome, raising interesting questions for continued investigations.

Journal ArticleDOI
01 Sep 2017-Genetics
TL;DR: Phylogenetic and population genomic analyses of isolates from Brazil reveal that the previously “African” VNB lineage occurs naturally in the South American environment, which suggests migration of the VNB lineage between Africa and South America prior to its diversification.
Abstract: Cryptococcus neoformans var. grubii is the causative agent of cryptococcal meningitis, a significant source of mortality in immunocompromised individuals, typically human immunodeficiency virus/AIDS patients from developing countries. Despite the worldwide emergence of this ubiquitous infection, little is known about the global molecular epidemiology of this fungal pathogen. Here we sequence the genomes of 188 diverse isolates and characterize the major subdivisions, their relative diversity, and the level of genetic exchange between them. While most isolates of C. neoformans var. grubii belong to one of three major lineages (VNI, VNII, and VNB), some haploid isolates show hybrid ancestry including some that appear to have recently interbred, based on the detection of large blocks of each ancestry across each chromosome. Many isolates display evidence of aneuploidy, which was detected for all chromosomes. In diploid isolates of C. neoformans var. grubii (serotype AA) and of hybrids with C. neoformans var. neoformans (serotype AD) such aneuploidies have resulted in loss of heterozygosity, where a chromosomal region is represented by the genotype of only one parental isolate. Phylogenetic and population genomic analyses of isolates from Brazil reveal that the previously "African" VNB lineage occurs naturally in the South American environment. This suggests migration of the VNB lineage between Africa and South America prior to its diversification, supported by finding ancestral recombination events between isolates from different lineages and regions. The results provide evidence of substantial population structure, with all lineages showing multi-continental distributions, demonstrating the highly dispersive nature of this pathogen.