
Showing papers in "G3: Genes, Genomes, Genetics in 2014"


Journal ArticleDOI
TL;DR: The genome of the budding yeast Saccharomyces cerevisiae, the first completely sequenced from a eukaryote, was updated recently in its first major update since 1996 and serves as the anchor for further innovations in yeast genomic science.
Abstract: The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called “S288C 2010,” was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science.

375 citations


Journal ArticleDOI
TL;DR: A complete analytical pipeline for genetic mapping in DO mice is presented, including algorithms for probabilistic reconstruction of founder haplotypes from genotyping array intensity data, and mapping methods that accommodate multiple founder haplotypes and account for relatedness among animals.
Abstract: Genetic mapping studies in the mouse and other model organisms are used to search for genes underlying complex phenotypes. Traditional genetic mapping studies that employ single-generation crosses have poor mapping resolution and limit discovery to loci that are polymorphic between the two parental strains. Multiparent outbreeding populations address these shortcomings by increasing the density of recombination events and introducing allelic variants from multiple founder strains. However, multiparent crosses present new analytical challenges and require specialized software to take full advantage of these benefits. Each animal in an outbreeding population is genetically unique and must be genotyped using a high-density marker set; regression models for mapping must accommodate multiple founder alleles, and complex breeding designs give rise to polygenic covariance among related animals that must be accounted for in mapping analysis. The Diversity Outbred (DO) mice combine the genetic diversity of eight founder strains in a multigenerational breeding design that has been maintained for >16 generations. The large population size and randomized mating ensure the long-term genetic stability of this population. We present a complete analytical pipeline for genetic mapping in DO mice, including algorithms for probabilistic reconstruction of founder haplotypes from genotyping array intensity data, and mapping methods that accommodate multiple founder haplotypes and account for relatedness among animals. Power analysis suggests that studies with as few as 200 DO mice can detect loci with large effects, but loci that account for <5% of trait variance may require a sample size of up to 1000 animals. The methods described here are implemented in the freely available R package DOQTL.

194 citations


Journal ArticleDOI
TL;DR: Analyses show this large, diverse, and highly recombined MAGIC population to be a powerful resource for the genetic dissection of target traits in wheat, and it is well-placed to efficiently exploit ongoing advances in phenomics and genomics.
Abstract: MAGIC populations represent one of a new generation of crop genetic mapping resources combining high genetic recombination and diversity. We describe the creation and validation of an eight-parent MAGIC population consisting of 1091 F7 lines of winter-sown wheat (Triticum aestivum L.). Analyses based on genotypes from a 90,000-single nucleotide polymorphism (SNP) array find the population to be well-suited as a platform for fine-mapping quantitative trait loci (QTL) and gene isolation. Patterns of linkage disequilibrium (LD) show the population to be highly recombined; genetic marker diversity among the founders was 74% of that captured in a larger set of 64 wheat varieties, and 54% of SNPs segregating among the 64 lines also segregated among the eight founder lines. In contrast, a commonly used reference bi-parental population had only 54% of the diversity of the 64 varieties with 27% of SNPs segregating. We demonstrate the potential of this MAGIC resource by identifying a highly diagnostic marker for the morphological character "awn presence/absence" and independently validate it in an association-mapping panel. These analyses show this large, diverse, and highly recombined MAGIC population to be a powerful resource for the genetic dissection of target traits in wheat, and it is well-placed to efficiently exploit ongoing advances in phenomics and genomics. Genetic marker and trait data, together with instructions for access to seed, are available at http://www.niab.com/MAGIC/.

191 citations


Journal ArticleDOI
TL;DR: It is shown that the Hi-C signal can be used to create scaffolded genome assemblies of individual eukaryotic species present within the microbial community, with higher levels of contiguity than some of the species’ published reference genomes.
Abstract: Microbial communities consist of mixed populations of organisms, including unknown species in unknown abundances. These communities are often studied through metagenomic shotgun sequencing, but standard library construction methods remove long-range contiguity information; thus, shotgun sequencing and de novo assembly of a metagenome typically yield a collection of contigs that cannot readily be grouped by species. Methods for generating chromatin-level contact probability maps, e.g., as generated by the Hi-C method, provide a signal of contiguity that is completely intracellular and contains both intrachromosomal and interchromosomal information. Here, we demonstrate how this signal can be exploited to reconstruct the individual genomes of microbial species present within a mixed sample. We apply this approach to two synthetic metagenome samples, successfully clustering the genome content of fungal, bacterial, and archaeal species with more than 99% agreement with published reference genomes. We also show that the Hi-C signal can secondarily be used to create scaffolded genome assemblies of individual eukaryotic species present within the microbial community, with higher levels of contiguity than some of the species’ published reference genomes.

191 citations
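
One simple way to picture how a contact signal can group contigs by species is to link contigs that share many Hi-C contacts and take the connected components. The sketch below is illustrative only and is not the authors' algorithm: the contig names, contact counts, and the `MIN_CONTACTS` threshold are all invented, and a real analysis would normalize counts (for example by contig length) before clustering.

```python
# Hypothetical Hi-C contact counts between contig pairs (all values invented).
contacts = {
    ("contig1", "contig2"): 120,
    ("contig2", "contig3"): 95,
    ("contig4", "contig5"): 140,
    ("contig3", "contig4"): 2,  # weak link, treated here as noise
}
MIN_CONTACTS = 10  # assumed threshold, chosen only for illustration


def cluster_contigs(contact_counts, min_contacts):
    """Group contigs into connected components over links >= min_contacts."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for (a, b), count in contact_counts.items():
        root_a, root_b = find(a), find(b)
        if count >= min_contacts and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


clusters = cluster_contigs(contacts, MIN_CONTACTS)
print(clusters)
```

With these toy inputs the weak contig3/contig4 link falls below the threshold, so the five contigs separate into two groups.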


Journal ArticleDOI
TL;DR: Parametric methods were unable to predict phenotypic values when the underlying genetic architecture was based entirely on epistasis, and were slightly better than nonparametric methods for additive genetic architectures.
Abstract: Parametric and nonparametric methods have been developed for purposes of predicting phenotypes. These methods are based on retrospective analyses of empirical data consisting of genotypic and phenotypic scores. Recent reports have indicated that parametric methods are unable to predict phenotypes of traits with known epistatic genetic architectures. Herein, we review parametric methods including least squares regression, ridge regression, Bayesian ridge regression, least absolute shrinkage and selection operator (LASSO), Bayesian LASSO, best linear unbiased prediction (BLUP), Bayes A, Bayes B, Bayes C, and Bayes Cπ. We also review nonparametric methods including Nadaraya-Watson estimator, reproducing kernel Hilbert space, support vector machine regression, and neural networks. We assess the relative merits of these 14 methods in terms of accuracy and mean squared error (MSE) using simulated genetic architectures consisting of completely additive or two-way epistatic interactions in an F2 population derived from crosses of inbred lines. Each simulated genetic architecture explained either 30% or 70% of the phenotypic variability. The greatest impact on estimates of accuracy and MSE was due to genetic architecture. Parametric methods were unable to predict phenotypic values when the underlying genetic architecture was based entirely on epistasis. Parametric methods were slightly better than nonparametric methods for additive genetic architectures. Distinctions among parametric methods for additive genetic architectures were incremental. Heritability, i.e., proportion of phenotypic variability, had the second greatest impact on estimates of accuracy and MSE.

155 citations
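
The failure of additive models on purely epistatic architectures can be seen in a toy calculation. The sketch below is not the paper's simulation: it assumes two unlinked loci in an F2 population, genotypes coded -1, 0, 1 at 1:2:1 frequencies, and a phenotype equal either to the sum of the two genotypes (additive) or to their product (additive-by-additive epistasis). It computes the exact least-squares slopes of the phenotype on each locus.

```python
from itertools import product

# Hypothetical toy setup (not the paper's simulation): two unlinked loci in an
# F2 population, genotypes coded -1, 0, 1 with expected frequencies 1:2:1.
GENOTYPE_FREQ = {-1: 0.25, 0: 0.5, 1: 0.25}


def expectation(fn):
    """Expected value of fn(x1, x2) over the two-locus genotype distribution."""
    return sum(
        GENOTYPE_FREQ[x1] * GENOTYPE_FREQ[x2] * fn(x1, x2)
        for x1, x2 in product(GENOTYPE_FREQ, repeat=2)
    )


def additive_slopes(phenotype):
    """Least-squares slopes of the phenotype on x1 and x2.

    Because the loci are independent and centered at zero, each slope is
    simply cov(x, y) / var(x).
    """
    mean_y = expectation(phenotype)
    var_x = expectation(lambda x1, x2: x1 * x1)  # identical for both loci
    slope1 = expectation(lambda x1, x2: x1 * (phenotype(x1, x2) - mean_y)) / var_x
    slope2 = expectation(lambda x1, x2: x2 * (phenotype(x1, x2) - mean_y)) / var_x
    return slope1, slope2


def additive(x1, x2):
    return x1 + x2


def epistatic(x1, x2):
    return x1 * x2  # interaction only, no marginal effect of either locus


additive_fit = additive_slopes(additive)
epistatic_fit = additive_slopes(epistatic)
print("additive architecture slopes:", additive_fit)
print("epistatic architecture slopes:", epistatic_fit)
```

Under these assumptions the additive phenotype is recovered with nonzero slopes, while both slopes vanish for the product phenotype, so a model with only main effects has nothing to predict from.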


Journal ArticleDOI
TL;DR: De novo assemblies of the transcriptomes of both a clonal line of symbiotic anemones and their endogenous dinoflagellate symbionts are reported, and comparison of transcript abundances in animals with and without dinoflagellates yields testable hypotheses about the cellular functions affected by symbiosis establishment.
Abstract: Coral reefs provide habitats for a disproportionate number of marine species relative to the small area of the oceans that they occupy. The mutualism between the cnidarian animal hosts and their intracellular dinoflagellate symbionts provides the nutritional foundation for coral growth and formation of reef structures, because algal photosynthesis can provide >90% of the total energy of the host. Disruption of this symbiosis (“coral bleaching”) is occurring on a large scale due primarily to anthropogenic factors and poses a major threat to the future of coral reefs. Despite the importance of this symbiosis, the cellular mechanisms involved in its establishment, maintenance, and breakdown remain largely unknown. We report our continued development of genomic tools to study these mechanisms in Aiptasia, a small sea anemone with great promise as a model system for studies of cnidarian–dinoflagellate symbiosis. Specifically, we have generated de novo assemblies of the transcriptomes of both a clonal line of symbiotic anemones and their endogenous dinoflagellate symbionts. We then compared transcript abundances in animals with and without dinoflagellates. This analysis identified >900 differentially expressed genes and allowed us to generate testable hypotheses about the cellular functions affected by symbiosis establishment. The differentially regulated transcripts include >60 encoding proteins that may play roles in transporting various nutrients between the symbiotic partners; many more encoding proteins functioning in several metabolic pathways, providing clues regarding how the transported nutrients may be used by the partners; and several encoding proteins that may be involved in host recognition and tolerance of the dinoflagellate.

152 citations


Journal ArticleDOI
TL;DR: This work uses RNA-Seq to identify genes expressed in isolated XX gonads, which are approximately 95% germline and 5% somatic gonadal tissue, and suggests that this new dataset will prove useful for studies focusing on C. elegans germ cell biology.
Abstract: The nematode Caenorhabditis elegans is an important model for studies of germ cell biology, including the meiotic cell cycle, gamete specification as sperm or oocyte, and gamete development. Fundamental to those studies is a genome-level knowledge of the germline transcriptome. Here, we use RNA-Seq to identify genes expressed in isolated XX gonads, which are approximately 95% germline and 5% somatic gonadal tissue. We generate data from mutants making either sperm [fem-3(q96)] or oocytes [fog-2(q71)], both grown at 22°. Our dataset identifies a total of 10,754 mRNAs in the polyadenylated transcriptome of XX gonads, with 2748 enriched in spermatogenic gonads, 1732 enriched in oogenic gonads, and the remaining 6274 not enriched in either. These spermatogenic, oogenic, and gender-neutral gene datasets compare well with those of previous studies, but double the number of genes identified. A comparison of the additional genes found in our study with in situ hybridization patterns in the Kohara database suggests that most are expressed in the germline. We also query our RNA-Seq data for differential exon usage and find 351 mRNAs with sex-enriched isoforms. We suggest that this new dataset will prove useful for studies focusing on C. elegans germ cell biology.

151 citations
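
As a quick consistency check on the counts quoted in the abstract above, the three gene categories should partition the 10,754 detected mRNAs. The short sketch below uses only the numbers given in the abstract; the variable names are illustrative.

```python
# Gene counts quoted in the abstract for the XX gonad transcriptome.
total_mrnas = 10754
category_counts = {
    "spermatogenic-enriched": 2748,
    "oogenic-enriched": 1732,
    "gender-neutral": 6274,
}

# The three categories should add up to the full set.
counts_sum = sum(category_counts.values())
assert counts_sum == total_mrnas

# Share of each category, as a percentage of all detected mRNAs.
category_share = {
    name: round(100 * count / total_mrnas, 1)
    for name, count in category_counts.items()
}
for name, share in category_share.items():
    print(f"{name}: {share}% of detected mRNAs")
```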


Journal ArticleDOI
TL;DR: A uniform analysis of vertebrate transcription factor ChIP-seq datasets in the Gene Expression Omnibus (GEO) repository as of April 1, 2012 is reported, and it is discovered that a significant subset of control datasets display an enrichment structure similar to successful ChIP-seq data.
Abstract: ChIP-seq has become the primary method for identifying in vivo protein-DNA interactions on a genome-wide scale, with nearly 800 publications involving the technique in PubMed as of December 2012. Individually and in aggregate, these data are an important and information-rich resource. However, uncertainties about data quality confound their use by the wider research community. Recently, the Encyclopedia Of DNA Elements (ENCODE) project developed and applied metrics to objectively measure ChIP-seq data quality. The ENCODE quality analysis was useful for flagging datasets for closer inspection, eliminating or replacing poor data, and for driving changes in experimental pipelines. There had been no similarly systematic quality analysis of the large and disparate body of published ChIP-seq profiles. Here we report a uniform analysis of vertebrate transcription factor ChIP-seq datasets in the Gene Expression Omnibus (GEO) repository as of April 1, 2012. The majority (55%) of datasets scored as highly successful, but a substantial minority (20%) were of apparently poor quality, and another ~25% were of intermediate quality. We discuss how different uses of ChIP-seq data are affected by specific aspects of data quality, and we highlight exceptional instances for which the metric values should not be taken at face value. Unexpectedly, we discovered that a significant subset of control datasets (i.e., no-immunoprecipitation and mock-immunoprecipitation samples) display an enrichment structure similar to successful ChIP-seq data. This can, in turn, affect peak calling and data interpretation. Published datasets identified here as high quality comprise a large group that users can draw on for large-scale integrated analysis. In the future, ChIP-seq quality assessment similar to that used here could guide experimentalists at early stages in a study, provide useful input in the publication process, and be used to stratify ChIP-seq data for different community-wide uses.

131 citations


Journal ArticleDOI
TL;DR: It is demonstrated that GBS is a cost-effective method for generating genome-wide SNP data suitable for genetic mapping in a highly diverse and heterozygous agricultural species, and future improvements to the GBS analysis pipeline presented here are anticipated to enhance the utility of next-generation DNA sequence data for genetic mapping across diverse species.
Abstract: Next-generation DNA sequencing (NGS) produces vast amounts of DNA sequence data, but it is not specifically designed to generate data suitable for genetic mapping. Recently developed DNA library preparation methods for NGS have helped solve this problem, however, by combining the use of reduced representation libraries with DNA sample barcoding to generate genome-wide genotype data from a common set of genetic markers across a large number of samples. Here we use such a method, called genotyping-by-sequencing (GBS), to produce a data set for genetic mapping in an F1 population of apples (Malus × domestica) segregating for skin color. We show that GBS produces a relatively large, but extremely sparse, genotype matrix: over 270,000 SNPs were discovered but most SNPs have too much missing data across samples to be useful for genetic mapping. After filtering for genotype quality and missing data, only 6% of the 85 million DNA sequence reads contributed to useful genotype calls. Despite this limitation, using existing software and a set of simple heuristics, we generated a final genotype matrix containing 3967 SNPs from 89 DNA samples from a single lane of Illumina HiSeq and used it to create a saturated genetic linkage map and to identify a known QTL underlying apple skin color. We therefore demonstrate that GBS is a cost-effective method for generating genome-wide SNP data suitable for genetic mapping in a highly diverse and heterozygous agricultural species. We anticipate future improvements to the GBS analysis pipeline presented here that will enhance the utility of next-generation DNA sequence data for the purposes of genetic mapping across diverse species.

127 citations
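
The sparse-matrix problem described in the abstract above can be illustrated with a minimal missing-data filter. This is a sketch under stated assumptions, not the paper's pipeline: the SNP names, genotype calls, and the `MAX_MISSING_RATE` cutoff are hypothetical. The last lines reproduce the read-usage arithmetic from the abstract (6% of 85 million reads).

```python
# Hypothetical toy genotype matrix: keys are SNPs, list entries are samples.
# Genotypes are coded 0/1/2; None marks a missing call.
genotypes = {
    "snp_a": [0, 1, None, 1, 2, 1],
    "snp_b": [None, None, 1, None, None, 0],
    "snp_c": [2, 2, 1, 1, None, 0],
    "snp_d": [None, None, None, None, None, 1],
}

MAX_MISSING_RATE = 0.2  # assumed cutoff, chosen only for illustration


def missing_rate(calls):
    """Fraction of samples with no genotype call at this SNP."""
    return sum(call is None for call in calls) / len(calls)


def filter_snps(matrix, max_missing):
    """Keep SNPs whose missing-call fraction does not exceed max_missing."""
    return {
        snp: calls
        for snp, calls in matrix.items()
        if missing_rate(calls) <= max_missing
    }


kept = filter_snps(genotypes, MAX_MISSING_RATE)
print("kept SNPs:", sorted(kept))

# Read-usage figure quoted in the abstract: 6% of 85 million reads.
useful_reads = round(0.06 * 85_000_000)
print("reads contributing to useful calls:", useful_reads)
```

In this toy matrix only the two SNPs with a single missing call pass the filter, mirroring how most discovered SNPs drop out when missingness is high.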


Journal ArticleDOI
TL;DR: A population genetic model of prokaryotes undergoing HGT via homologous recombination indicates that HGT can prevent the operation of Muller’s ratchet even when the source of transferred genes is eDNA that comes from dead cells and on average carries more deleterious mutations than the DNA of recipient live cells.
Abstract: Horizontal gene transfer (HGT) is a major factor in the evolution of prokaryotes. An intriguing question is whether HGT is maintained during evolution of prokaryotes owing to its adaptive value or is a byproduct of selection driven by other factors such as consumption of extracellular DNA (eDNA) as a nutrient. One hypothesis posits that HGT can restore genes inactivated by mutations and thereby prevent stochastic, irreversible deterioration of genomes in finite populations known as Muller’s ratchet. To examine this hypothesis, we developed a population genetic model of prokaryotes undergoing HGT via homologous recombination. Analysis of this model indicates that HGT can prevent the operation of Muller’s ratchet even when the source of transferred genes is eDNA that comes from dead cells and on average carries more deleterious mutations than the DNA of recipient live cells. Moreover, if HGT is sufficiently frequent and eDNA diffusion sufficiently rapid, a subdivided population is shown to be more resistant to Muller’s ratchet than an undivided population of an equal overall size. Thus, to maintain genomic information in the face of Muller’s ratchet, it is more advantageous to partition individuals into multiple subpopulations and let them “cross-reference” each other’s genetic information through HGT than to collect all individuals in one population and thereby maximize the efficacy of natural selection. Taken together, the results suggest that HGT could be an important condition for the long-term maintenance of genomic information in prokaryotes through the prevention of Muller’s ratchet.

126 citations


Journal ArticleDOI
TL;DR: In this article, the authors sequenced S. carlsbergensis using next generation sequencing technologies and showed that the 19.5-Mb genome is substantially larger than the S. cerevisiae genome.
Abstract: Lager yeast beer production was revolutionized by the introduction of pure culture strains. The first established lager yeast strain is known as the bottom fermenting Saccharomyces carlsbergensis, which was originally termed Unterhefe No. 1 by Emil Chr. Hansen and has been used in production since 1883. S. carlsbergensis belongs to group I/Saaz-type lager yeast strains and is better adapted to cold growth conditions than group II/Frohberg-type lager yeasts, e.g., the Weihenstephan strain WS34/70. Here, we sequenced S. carlsbergensis using next generation sequencing technologies. Lager yeasts are descendants of hybrids formed between a S. cerevisiae parent and a parent similar to S. eubayanus. Accordingly, the S. carlsbergensis 19.5-Mb genome is substantially larger than the 12-Mb S. cerevisiae genome. Based on the sequence scaffolds, synteny to the S. cerevisiae genome, and by using directed polymerase chain reaction for gap closure, we generated a chromosomal map of S. carlsbergensis consisting of 29 unique chromosomes. We present evidence for genome and chromosome evolution within S. carlsbergensis via chromosome loss and loss of heterozygosity specifically of parts derived from the S. cerevisiae parent. Based on our sequence data and via fluorescence-activated cell-sorting analysis, we determined the ploidy of S. carlsbergensis. From this, we inferred that this strain is basically triploid with a diploid S. eubayanus and haploid S. cerevisiae genome content. In contrast, the Weihenstephan strain, which we resequenced, is essentially tetraploid, composed of two diploid S. cerevisiae and S. eubayanus genomes. Based on conserved translocations between the parental genomes in S. carlsbergensis and the Weihenstephan strain, we propose a joint evolutionary ancestry for lager yeast strains.

Journal ArticleDOI
TL;DR: This work presents gitter, an image analysis tool for robust and accurate processing of images from colony-based screens, and shows that gitter produces comparable colony sizes to other tools in simple cases but outperforms them by being able to handle a wider variety of screens and more accurately quantify colony sizes from difficult images.
Abstract: Colony-based screens that quantify the fitness of clonal populations on solid agar plates are perhaps the most important source of genome-scale functional information in microorganisms. The images of ordered arrays of mutants produced by such experiments can be difficult to process because of laboratory-specific plate features, morphed colonies, plate edges, noise, and other artifacts. Most of the tools developed to address this problem are optimized to handle a single setup and do not work out of the box in other settings. We present gitter, an image analysis tool for robust and accurate processing of images from colony-based screens. gitter works by first finding the grid of colonies from a preprocessed image and then locating the bounds of each colony separately. We show that gitter produces comparable colony sizes to other tools in simple cases but outperforms them by being able to handle a wider variety of screens and more accurately quantify colony sizes from difficult images. gitter is freely available as an R package from http://cran.r-project.org/web/packages/gitter under the LGPL. Tutorials and demos can be found at http://omarwagih.github.io/gitter

Journal ArticleDOI
TL;DR: An efficient two-step strategy to flexibly engineer the fly genome by combining CRISPR with recombinase-mediated cassette exchange (RMCE) is developed, and the data suggest that any fly laboratory can engineer their favorite gene for a broad range of applications within approximately 3 months.
Abstract: The development of clustered, regularly interspaced, short palindromic repeats (CRISPR)/CRISPR-associated (Cas) technologies promises a quantum leap in genome engineering of model organisms. However, CRISPR-mediated gene targeting reports in Drosophila melanogaster are still restricted to a few genes, use variable experimental conditions, and vary in efficiency, questioning the universal applicability of the method. Here, we developed an efficient two-step strategy to flexibly engineer the fly genome by combining CRISPR with recombinase-mediated cassette exchange (RMCE). In the first step, two sgRNAs, whose activity had been tested in cell culture, were co-injected together with a donor plasmid into transgenic Act5C-Cas9, Ligase4 mutant embryos and the homologous integration events were identified by eye fluorescence. In the second step, the eye marker was replaced with DNA sequences of choice using RMCE enabling flexible gene modification. We applied this strategy to engineer four different locations in the genome, including a gene on the fourth chromosome, at comparably high efficiencies. Our data suggest that any fly laboratory can engineer their favorite gene for a broad range of applications within approximately 3 months.

Journal ArticleDOI
TL;DR: The efficient use of nutrient stores to support embryonic development is demonstrated, sequential metabolic transitions during this stage are defined, and striking similarities between the metabolic state of late-stage fly embryos and tumor cells are demonstrated.
Abstract: Rapidly proliferating cells such as cancer cells and embryonic stem cells rely on a specialized metabolic program known as aerobic glycolysis, which supports biomass production from carbohydrates. The fruit fly Drosophila melanogaster also utilizes aerobic glycolysis to support the rapid growth that occurs during larval development. Here we use singular value decomposition analysis of modENCODE RNA-seq data combined with GC-MS-based metabolomic analysis to analyze the changes in gene expression and metabolism that occur during Drosophila embryogenesis, spanning the onset of aerobic glycolysis. Unexpectedly, we find that the most common pattern of co-expressed genes in embryos includes the global switch to glycolytic gene expression that occurs midway through embryogenesis. In contrast to the canonical aerobic glycolytic pathway, however, which is accompanied by reduced mitochondrial oxidative metabolism, the expression of genes involved in the tricarboxylic acid cycle (TCA cycle) and the electron transport chain is also upregulated at this time. Mitochondrial activity, however, appears to be attenuated, as embryos exhibit a block in the TCA cycle that results in elevated levels of citrate, isocitrate, and α-ketoglutarate. We also find that genes involved in lipid breakdown and β-oxidation are upregulated prior to the transcriptional initiation of glycolysis, but are downregulated before the onset of larval development, revealing coordinated use of lipids and carbohydrates during development. These observations demonstrate the efficient use of nutrient stores to support embryonic development, define sequential metabolic transitions during this stage, and demonstrate striking similarities between the metabolic state of late-stage fly embryos and tumor cells.

Journal ArticleDOI
TL;DR: A simple and versatile alternative method for CRISPR-mediated genome editing in Drosophila using bicistronic Cas9/sgRNA expression vectors is reported; it allows the isolation of targeted knock-out and knock-in alleles by molecular screening within 2 months.
Abstract: The CRISPR-associated RNA-guided nuclease Cas9 has emerged as a powerful tool for genome engineering in a variety of organisms. To achieve efficient gene targeting rates in Drosophila, current approaches require either injection of in vitro transcribed RNAs or injection into transgenic Cas9-expressing embryos. We report a simple and versatile alternative method for CRISPR-mediated genome editing in Drosophila using bicistronic Cas9/sgRNA expression vectors. Gene targeting with this single-plasmid injection approach is as efficient as in transgenic nanos-Cas9 embryos and allows the isolation of targeted knock-out and knock-in alleles by molecular screening within 2 months. Our strategy is independent of genetic background and does not require prior establishment of transgenic flies.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the chromosomal evolution in Chinook salmon (Oncorhynchus tshawytscha) after genome duplication by mapping 7146 restriction-site associated DNA loci.
Abstract: Comparisons between the genomes of salmon species reveal that they underwent extensive chromosomal rearrangements following whole genome duplication that occurred in their lineage 58−63 million years ago. Extant salmonids are diploid, but occasional pairing between homeologous chromosomes exists in males. The consequences of re-diploidization can be characterized by mapping the position of duplicated loci in such species. Linkage maps are also a valuable tool for genome-wide applications such as genome-wide association studies, quantitative trait loci mapping or genome scans. Here, we investigated chromosomal evolution in Chinook salmon (Oncorhynchus tshawytscha) after genome duplication by mapping 7146 restriction-site associated DNA loci in gynogenetic haploid, gynogenetic diploid, and diploid crosses. In the process, we developed a reference database of restriction-site associated DNA loci for Chinook salmon comprising 48528 non-duplicated loci and 6409 known duplicated loci, which will facilitate locus identification and data sharing. We created a very dense linkage map anchored to all 34 chromosomes for the species, and all arms were identified through centromere mapping. The map positions of 799 duplicated loci revealed that homeologous pairs have diverged at different rates following whole genome duplication, and that degree of differentiation along arms was variable. Many of the homeologous pairs with high numbers of duplicated markers appear conserved with other salmon species, suggesting that retention of conserved homeologous pairing in some arms preceded species divergence. As chromosome arms are highly conserved across species, the major resources developed for Chinook salmon in this study are also relevant for other related species.

Journal ArticleDOI
TL;DR: Computational methods are presented that efficiently and accurately estimate effect sizes and their statistical significance by adapting existing methods for RNA-seq analysis, and are relevant to a variety of pooled genetic screening methods that use high-throughput quantitative DNA sequencing, including Tn-seq.
Abstract: High-throughput quantitative DNA sequencing enables the parallel phenotyping of pools of thousands of mutants. However, the appropriate analytical methods and experimental design that maximize the efficiency of these methods while maintaining statistical power are currently unknown. Here, we have used Bar-seq analysis of the Saccharomyces cerevisiae yeast deletion library to systematically test the effect of experimental design parameters and sequence read depth on experimental results. We present computational methods that efficiently and accurately estimate effect sizes and their statistical significance by adapting existing methods for RNA-seq analysis. Using simulated variation of experimental designs, we found that biological replicates are critical for statistical analysis of Bar-seq data, whereas technical replicates are of less value. By subsampling sequence reads, we found that when using four-fold biological replication, 6 million reads per condition achieved 96% power to detect a two-fold change (or more) at a 5% false discovery rate. Our guidelines for experimental design and computational analysis enable the study of the yeast deletion collection in up to 30 different conditions in a single sequencing lane. These findings are relevant to a variety of pooled genetic screening methods that use high-throughput quantitative DNA sequencing, including Tn-seq.
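
The read budget implied by the figures in the abstract above can be worked out directly. The sketch below rests on an assumption the abstract does not state explicitly: that the 6 million reads per condition are the total across the four biological replicates. The function and variable names are illustrative.

```python
READS_PER_CONDITION = 6_000_000  # from the abstract
BIOLOGICAL_REPLICATES = 4        # from the abstract
CONDITIONS_PER_LANE = 30         # from the abstract


def lane_budget(conditions, reads_per_condition, replicates):
    """Return (total reads needed, reads per replicate library).

    Assumes reads_per_condition is the total across all replicates of a
    condition; the abstract does not state this explicitly.
    """
    total_reads = conditions * reads_per_condition
    reads_per_replicate = reads_per_condition // replicates
    return total_reads, reads_per_replicate


total_reads, reads_per_replicate = lane_budget(
    CONDITIONS_PER_LANE, READS_PER_CONDITION, BIOLOGICAL_REPLICATES
)
print(f"reads needed for {CONDITIONS_PER_LANE} conditions: {total_reads:,}")
print(f"reads per replicate library: {reads_per_replicate:,}")
```

Under that assumption, 30 conditions correspond to 180 million reads in the lane, or 1.5 million reads per replicate library.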

Journal ArticleDOI
TL;DR: To further the understanding of auxin/ethylene/ABA crosstalk, ABA responsiveness of double mutants of ethylene overproducer1 or ein2 combined with auxin-resistant mutants is examined; the results indicate that auxin and ethylene likely operate in a linear pathway to affect ABA-responsive inhibition of root elongation, whereas these two hormones likely act independently to affect ABA-responsive inhibition of seed germination.
Abstract: Abscisic acid (ABA) regulates many aspects of plant growth and development, including inhibition of root elongation and seed germination. We performed an ABA resistance screen to identify factors required for ABA response in root elongation inhibition. We identified two classes of Arabidopsis thaliana AR mutants that displayed ABA-resistant root elongation: those that displayed resistance to ABA in both root elongation and seed germination and those that displayed resistance to ABA in root elongation but not in seed germination. We used PCR-based genotyping to identify a mutation in ABA INSENSITIVE2 (ABI2), positional information to identify mutations in AUXIN RESISTANT1 (AUX1) and ETHYLENE INSENSITIVE2 (EIN2), and whole genome sequencing to identify mutations in AUX1, AUXIN RESISTANT4 (AXR4), and ETHYLENE INSENSITIVE ROOT1/PIN-FORMED2 (EIR1/PIN2). Identification of auxin and ethylene response mutants among our isolates suggested that auxin and ethylene responsiveness were required for ABA inhibition of root elongation. To further our understanding of auxin/ethylene/ABA crosstalk, we examined ABA responsiveness of double mutants of ethylene overproducer1 (eto1) or ein2 combined with auxin-resistant mutants and found that auxin and ethylene likely operate in a linear pathway to affect ABA-responsive inhibition of root elongation, whereas these two hormones likely act independently to affect ABA-responsive inhibition of seed germination.

Journal ArticleDOI
TL;DR: Quantitative mapping of RNAseq reads to the reference assembly showed that expression of genes with predicted functions in cellular immunity, wound healing, melanization, and the production of reactive oxygen species was transiently induced immediately after immune challenge, indicating possible interactive roles in vivo.
Abstract: The course of microbial infection in insects is shaped by a two-stage process of immune defense. Constitutive defenses, such as engulfment and melanization, act immediately and are followed by inducible defenses, archetypically the production of antimicrobial peptides, which eliminate or suppress the remaining microbes. By applying RNAseq across a 7-day time course, we sought to characterize the long-lasting immune response to bacterial challenge in the mealworm beetle Tenebrio molitor, a model for the biochemistry of insect immunity and persistent bacterial infection. By annotating a hybrid de novo assembly of RNAseq data, we were able to identify putative orthologs for the majority of components of the conserved insect immune system. Compared with Tribolium castaneum, the most closely related species with a reference genome sequence and a manually curated immune system annotation, the T. molitor immune gene count was lower, with lineage-specific expansions of genes encoding serine proteases and their countervailing inhibitors accounting for the majority of the deficit. Quantitative mapping of RNAseq reads to the reference assembly showed that expression of genes with predicted functions in cellular immunity, wound healing, melanization, and the production of reactive oxygen species was transiently induced immediately after immune challenge. In contrast, expression of genes encoding antimicrobial peptides or components of the Toll signaling pathway and iron sequestration response remained elevated for at least 7 days. Numerous genes involved in metabolism and nutrient storage were repressed, indicating a possible cost of immune induction. Strikingly, the expression of almost all antibacterial peptides followed the same pattern of long-lasting induction, regardless of their spectra of activity, signaling possible interactive roles in vivo.

Journal ArticleDOI
TL;DR: The results associate transposon-linked differential methylation with allelic state and gene expression at a major flowering time quantitative trait locus in maize.
Abstract: One of the major quantitative trait loci for flowering time in maize, the Vegetative to generative transition 1 (Vgt1) locus, corresponds to an upstream (70 kb) noncoding regulatory element of ZmRap2.7, a repressor of flowering. At Vgt1, a miniature transposon (MITE) insertion into a conserved noncoding sequence was previously found to be highly associated with early flowering in independent studies. Because cytosine methylation is known to be associated with transposons and to influence gene expression, we aimed to investigate how DNA methylation patterns in wild-type and mutant Vgt1 correlate with ZmRap2.7 expression. The methylation state at Vgt1 was assayed in leaf samples of maize inbred and F1 hybrid samples, and at the syntenic region in sorghum. The Vgt1-linked conserved noncoding sequence was very scarcely methylated both in maize and sorghum. However, in the early maize Vgt1 allele, the region immediately flanking the highly methylated MITE insertion was significantly more methylated and showed features of methylation spreading. Allele-specific expression assays revealed that the presence of the MITE and its heavy methylation appear to be linked to altered ZmRap2.7 transcription. Although not providing proof of causative connection, our results associate transposon-linked differential methylation with allelic state and gene expression at a major flowering time quantitative trait locus in maize.

Journal ArticleDOI
TL;DR: Genotyping-by-sequencing (GBS) enabled us to develop a saturated linkage map for alfalfa that greatly improved genome coverage relative to previous maps and that will facilitate investigation of genome structure.
Abstract: A genetic linkage map is a valuable tool for quantitative trait locus mapping, map-based gene cloning, comparative mapping, and whole-genome assembly. Alfalfa, one of the most important forage crops in the world, is autotetraploid, allogamous, and highly heterozygous, characteristics that have impeded the construction of a high-density linkage map using traditional genetic marker systems. Using genotyping-by-sequencing (GBS), we constructed low-cost, reasonably high-density linkage maps for both maternal and paternal parental genomes of an autotetraploid alfalfa F1 population. The resulting maps contain 3591 single-nucleotide polymorphism markers on 64 linkage groups across both parents, with an average density of one marker per 1.5 and 1.0 cM for the maternal and paternal haplotype maps, respectively. Chromosome assignments were made based on homology of markers to the M. truncatula genome. Four linkage groups representing the four haplotypes of each alfalfa chromosome were assigned to each of the eight Medicago chromosomes in both the maternal and paternal parents. The alfalfa linkage groups were highly syntenous with M. truncatula, and clearly identified the known translocation between Chromosomes 4 and 8. In addition, a small inversion on Chromosome 1 was identified between M. truncatula and M. sativa. GBS enabled us to develop a saturated linkage map for alfalfa that greatly improved genome coverage relative to previous maps and that will facilitate investigation of genome structure. GBS could be used in breeding populations to accelerate molecular breeding in alfalfa.
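The count of 64 linkage groups follows directly from the ploidy and chromosome number given in the abstract. The short sketch below only restates that arithmetic; the function names are assumptions, and the marker count passed to `map_length_cm` is hypothetical because the abstract does not report per-parent marker counts.

```python
CHROMOSOMES = 8            # Medicago chromosomes, from the abstract
HAPLOTYPES_PER_PARENT = 4  # autotetraploid: four haplotypes per chromosome
PARENTS = 2                # maternal and paternal maps


def expected_linkage_groups(chromosomes: int = CHROMOSOMES,
                            haplotypes: int = HAPLOTYPES_PER_PARENT,
                            parents: int = PARENTS) -> int:
    """Linkage groups expected when every haplotype of every chromosome is mapped."""
    return chromosomes * haplotypes * parents


def map_length_cm(n_markers: int, cm_per_marker: float) -> float:
    """Approximate map length implied by a marker count and an average spacing."""
    return n_markers * cm_per_marker


if __name__ == "__main__":
    print(expected_linkage_groups())
    # Hypothetical example: 1,000 markers at the maternal spacing of 1.5 cM per marker.
    print(map_length_cm(1000, 1.5))
```

Eight chromosomes times four haplotypes times two parents gives the 64 linkage groups reported.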

Journal ArticleDOI
TL;DR: The authors found significant changes in expression of numerous genes involved in innate and inflammatory responses in infected frogs despite high susceptibility to chytridiomycosis, including increased expression of immunoglobulins and major histocompatibility complex (MHC) genes.
Abstract: The emergence of the disease chytridiomycosis caused by the chytrid fungus Batrachochytrium dendrobatidis (Bd) has been implicated in dramatic global amphibian declines. Although many species have undergone catastrophic declines and/or extinctions, others appear to be unaffected or persist at reduced frequencies after Bd outbreaks. The reasons behind this variance in disease outcomes are poorly understood: differences in host immune responses have been proposed, yet previous studies suggest a lack of robust immune responses to Bd in susceptible species. Here, we sequenced transcriptomes from clutch-mates of a highly susceptible amphibian, Atelopus zeteki, with different infection histories. We found significant changes in expression of numerous genes involved in innate and inflammatory responses in infected frogs despite high susceptibility to chytridiomycosis. We show evidence of acquired immune responses generated against Bd, including increased expression of immunoglobulins and major histocompatibility complex genes. In addition, fungal-killing genes had significantly greater expression in frogs previously exposed to Bd compared with Bd-naive frogs, including chitinase and serine-type proteases. However, our results appear to confirm recent in vitro evidence of immune suppression by Bd, demonstrated by decreased expression of lymphocyte genes in the spleen of infected compared with control frogs. We propose susceptibility to chytridiomycosis is not due to lack of Bd-specific immune responses but instead is caused by failure of those responses to be effective. Ineffective immune pathway activation and timing of antibody production are discussed as potential mechanisms. However, in light of our findings, suppression of key immune responses by Bd is likely an important factor in the lethality of this fungus.

Journal ArticleDOI
TL;DR: By obtaining complete genome sequences for the four parents and analyzing sequence variation in the QTL confidence intervals, 16 candidate genes likely to affect melanization were identified, including PKS1, a polyketide synthase gene known to play a role in the synthesis of dihydroxynaphthalene melanin.
Abstract: Melanin plays an important role in virulence and antimicrobial resistance in several fungal pathogens. The wheat pathogen Zymoseptoria tritici is important worldwide, but little is known about the genetic architecture of pathogenicity, including the production of melanin. Because melanin production can exhibit complex inheritance, we used quantitative trait locus (QTL) mapping in two crosses to identify the underlying genes. Restriction site-associated DNA sequencing was used to genotype 263 (cross 1) and 261 (cross 2) progeny at ~8500 single-nucleotide polymorphisms and construct two dense linkage maps. We measured gray values, representing degrees of melanization, for single-spore colonies growing on Petri dishes by using a novel image-processing approach that enabled high-throughput phenotyping. Because melanin production can be affected by stress, each offspring was grown in two stressful environments and one control environment. We detected six significant QTL in cross 1 and nine in cross 2, with three QTL shared between the crosses. Different QTL were identified in different environments and at different colony ages. By obtaining complete genome sequences for the four parents and analyzing sequence variation in the QTL confidence intervals, we identified 16 candidate genes likely to affect melanization. One of these candidates was PKS1, a polyketide synthase gene known to play a role in the synthesis of dihydroxynaphthalene melanin. Three candidate quantitative trait nucleotides were identified in PKS1. Many of the other candidate genes were not previously associated with melanization.

Journal ArticleDOI
TL;DR: This paper used genome-wide genotyping data to characterize deleterious variants in a large panel of maize inbred lines and found that genes associated with a number of complex traits are enriched for deleterious variants.
Abstract: Most nonsynonymous mutations are thought to be deleterious because of their effect on protein sequence and are expected to be removed or kept at low frequency by the action of natural selection. Nonetheless, the effect of positive selection on linked sites or drift in small or inbred populations may also impact the evolution of deleterious alleles. Despite their potential to affect complex trait phenotypes, deleterious alleles are difficult to study precisely because they are often at low frequency. Here, we made use of genome-wide genotyping data to characterize deleterious variants in a large panel of maize inbred lines. We show that, despite small effective population sizes and inbreeding, most putatively deleterious SNPs are indeed at low frequencies within individual genetic groups. We find that genes associated with a number of complex traits are enriched for deleterious variants. Together, these data are consistent with the dominance model of heterosis, in which complementation of numerous low-frequency, weak deleterious variants contributes to hybrid vigor.

Journal ArticleDOI
TL;DR: This study simulated phenotypes resulting from a range of genetic architectures in soybean, and found that with a heritability of 0.5, ∼100% and ∼33% of the 4 and 20 simulated QTL can be recovered, respectively, with a false-positive rate of less than ∼6 × 10⁻⁵ per marker tested.
Abstract: Soybean oil and meal are major contributors to world-wide food production. Consequently, the genetic basis for soybean seed composition has been intensely studied using family-based mapping. Population-based mapping approaches, in the form of genome-wide association (GWA) scans, have been able to resolve loci controlling moderately complex quantitative traits (QTL) in numerous crop species. Yet, it is still unclear how soybean’s unique population history will affect GWA scans. Using one of the populations in this study, we simulated phenotypes resulting from a range of genetic architectures. We found that with a heritability of 0.5, ∼100% and ∼33% of the 4 and 20 simulated QTL can be recovered, respectively, with a false-positive rate of less than ∼6 × 10⁻⁵ per marker tested. Additionally, we demonstrated that combining information from multi-locus mixed models and compressed linear-mixed models improves QTL identification and interpretation. We applied these insights to exploring seed composition in soybean, refining the linkage group I (chromosome 20) protein QTL and identifying additional oil QTL that may allow some decoupling of highly correlated oil and protein phenotypes. Because the value of protein meal is closely related to its essential amino acid profile, we attempted to identify QTL underlying methionine, threonine, cysteine, and lysine content. Multiple QTL were found that have not been observed in family-based mapping studies, and each trait exhibited associations across multiple populations. Chromosomes 1 and 8 contain strong candidate alleles for essential amino acid increases. Overall, we present these and additional data that will be useful in determining breeding strategies for the continued improvement of soybean’s nutrient portfolio.
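To put the simulation figures in concrete terms, the sketch below converts the quoted recovery fractions and the per-marker false-positive rate into expected counts. It is a back-of-the-envelope illustration: the function names are assumptions, and the marker count in the example is hypothetical because the abstract does not state how many markers were tested.

```python
FALSE_POSITIVE_RATE = 6e-5  # per marker tested; upper bound quoted in the abstract


def expected_false_positives(n_markers: int, rate: float = FALSE_POSITIVE_RATE) -> float:
    """Expected number of false-positive markers for a scan of `n_markers`."""
    return n_markers * rate


def expected_recovered(n_qtl: int, recovery_fraction: float) -> float:
    """Expected number of simulated QTL recovered at a given recovery fraction."""
    return n_qtl * recovery_fraction


if __name__ == "__main__":
    # Recovery fractions from the abstract: ~100% of 4 QTL, ~33% of 20 QTL.
    print(expected_recovered(4, 1.00))
    print(expected_recovered(20, 0.33))
    # Hypothetical scan size of 50,000 markers (not taken from the abstract).
    print(expected_false_positives(50_000))
```

At a heritability of 0.5, recovering about a third of 20 QTL corresponds to roughly six or seven loci, so more loci are recovered in absolute terms under the more complex architecture even though the fraction is lower.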

Journal ArticleDOI
Zhaoyu Xue, Menghua Wu, Kejia Wen, Menda Ren, Li Long, Xuedi Zhang, Guanjun Gao
TL;DR: A CRISPR/Cas9-mediated conditional mutagenesis system provides a simple and effective tool for gene function analysis, and complements the existing RNAi approach.
Abstract: Existing transgenic RNA interference (RNAi) methods greatly facilitate functional genome studies via controlled silencing of targeted mRNA in Drosophila. Although the RNAi approach is extremely powerful, concerns still linger about its low efficiency. Here, we developed a CRISPR/Cas9-mediated conditional mutagenesis system by combining tissue-specific expression of Cas9 driven by the Gal4/upstream activating site system with various ubiquitously expressed guide RNA transgenes to effectively inactivate gene expression in a temporally and spatially controlled manner. Furthermore, by including multiple guide RNAs in a transgenic vector to target a single gene, we achieved a high degree of gene mutagenesis in specific tissues. The CRISPR/Cas9-mediated conditional mutagenesis system provides a simple and effective tool for gene function analysis, and complements the existing RNAi approach.

Journal ArticleDOI
TL;DR: The trajectory of evolution is followed to determine the identity and fate of beneficial mutations in Saccharomyces cerevisiae populations grown in sulfate limitation, with adaptive variants both persisting and replacing one another.
Abstract: Population adaptation to strong selection can occur through the sequential or parallel accumulation of competing beneficial mutations. The dynamics, diversity, and rate of fixation of beneficial mutations within and between populations are still poorly understood. To study how the mutational landscape varies across populations during adaptation, we performed experimental evolution on seven parallel populations of Saccharomyces cerevisiae continuously cultured in limiting sulfate medium. By combining quantitative polymerase chain reaction, array comparative genomic hybridization, restriction digestion and contour-clamped homogeneous electric field gel electrophoresis, and whole-genome sequencing, we followed the trajectory of evolution to determine the identity and fate of beneficial mutations. During a period of 200 generations, the yeast populations displayed parallel evolutionary dynamics that were driven by the coexistence of independent beneficial mutations. Selective amplifications rapidly evolved under this selection pressure, in particular common inverted amplifications containing the sulfate transporter gene SUL1. Compared with single clones, detailed analysis of the populations uncovers a greater complexity whereby multiple subpopulations arise and compete despite a strong selection. The most common evolutionary adaptation to strong selection in these populations grown in sulfate limitation is determined by clonal interference, with adaptive variants both persisting and replacing one another.

Journal ArticleDOI
TL;DR: The results suggest that similar errors exist in pseudomolecules from other large genomes that have been assembled using only linkage maps to predict scaffold arrangement, and these errors can be corrected using FISH and/or optical mapping.
Abstract: The order and orientation (arrangement) of all 91 sequenced scaffolds in the 12 pseudomolecules of the recently published tomato (Solanum lycopersicum, 2n = 2x = 24) genome sequence were positioned based on marker order in a high-density linkage map. Here, we report the arrangement of these scaffolds determined by two independent physical methods, bacterial artificial chromosome–fluorescence in situ hybridization (BAC-FISH) and optical mapping. By localizing BACs at the ends of scaffolds to spreads of tomato synaptonemal complexes (pachytene chromosomes), we showed that 45 scaffolds, representing one-third of the tomato genome, were arranged differently than predicted by the linkage map. These scaffolds occur mostly in pericentric heterochromatin where 77% of the tomato genome is located and where linkage mapping is less accurate due to reduced crossing over. Although useful for only part of the genome, optical mapping results were in complete agreement with scaffold arrangement by FISH but often disagreed with scaffold arrangement based on the linkage map. The scaffold arrangement based on FISH and optical mapping changes the positions of hundreds of markers in the linkage map, especially in heterochromatin. These results suggest that similar errors exist in pseudomolecules from other large genomes that have been assembled using only linkage maps to predict scaffold arrangement, and these errors can be corrected using FISH and/or optical mapping. Of note, BAC-FISH also permits estimates of the sizes of gaps between scaffolds, and unanchored BACs are often visualized by FISH in gaps between scaffolds and thus represent starting points for filling these gaps.

Journal ArticleDOI
TL;DR: The presence of key vertebrate sex-determining genes in a mollusc with expression profiles consistent with expected roles in sex determination suggests that sex determination may be deeply conserved in animals, despite rapid evolution of the regulatory pathways that in C. gigas may involve both genetic and environmental factors.
Abstract: Despite the prevalence of sex in the animal kingdom, we have only limited understanding of how sex is determined and evolved in many taxa. The mollusc Pacific oyster Crassostrea gigas exhibits complex modes of sexual reproduction that consist of protandric dioecy, sex change, and occasional hermaphroditism. This complex system is controlled by both environmental and genetic factors through unknown molecular mechanisms. In this study, we investigated genes related to sex-determining pathways in C. gigas through transcriptome sequencing and analysis of female and male gonads. Our analysis identified or confirmed novel homologs in the oyster of key sex-determining genes (SoxH or Sry-like and FoxL2) that were thought to be vertebrate-specific. Their expression profile in C. gigas is consistent with conserved roles in sex determination, under a proposed model where a novel testis-determining CgSoxH may serve as a primary regulator, directly or indirectly interacting with a testis-promoting CgDsx and an ovary-promoting CgFoxL2. Our findings plus previous results suggest that key vertebrate sex-determining genes such as Sry and FoxL2 may not be inventions of vertebrates. The presence of such genes in a mollusc with expression profiles consistent with expected roles in sex determination suggests that sex determination may be deeply conserved in animals, despite rapid evolution of the regulatory pathways that in C. gigas may involve both genetic and environmental factors.

Journal ArticleDOI
TL;DR: A large class of mRNAs, enriched for transcripts specifying products involved in rRNA metabolism, showed decreased expression in response to light, indicating a heretofore undocumented effect of light on this pathway.
Abstract: The filamentous fungus Neurospora crassa responds to light in complex ways. To thoroughly study the transcriptional response of this organism to light, RNA-seq was used to analyze capped and polyadenylated mRNA prepared from mycelium grown for 24 hr in the dark and then exposed to light for 0 (control), 15, 60, 120, and 240 min. More than three-quarters of all defined protein coding genes (79%) were expressed in these cells. The increased sensitivity of RNA-seq compared with previous microarray studies revealed that the RNA levels for 31% of expressed genes were affected two-fold or more by exposure to light. Additionally, a large class of mRNAs, enriched for transcripts specifying products involved in rRNA metabolism, showed decreased expression in response to light, indicating a heretofore undocumented effect of light on this pathway. Based on measured changes in mRNA levels, light generally increases cellular metabolism and at the same time causes significant oxidative stress to the organism. To deal with this stress, protective photopigments are made, antioxidants are produced, and genes involved in ribosome biogenesis are transiently repressed.
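A "two-fold or more" criterion of the kind described here is usually applied symmetrically, so that both induction and repression count. The sketch below shows one way to express that test; the function names and the requirement of strictly positive expression values are assumptions, and the authors' actual thresholds and statistics are not given in the abstract.

```python
import math

FRACTION_EXPRESSED = 0.79  # share of defined protein-coding genes expressed (abstract)
FRACTION_AFFECTED = 0.31   # share of expressed genes changed two-fold or more (abstract)


def is_light_responsive(dark: float, light: float, min_fold: float = 2.0) -> bool:
    """True if expression changes by `min_fold` or more in either direction."""
    if dark <= 0 or light <= 0:
        raise ValueError("expression values must be positive to form a ratio")
    return abs(math.log2(light / dark)) >= math.log2(min_fold)


def affected_share_of_all_genes(expressed: float = FRACTION_EXPRESSED,
                                affected: float = FRACTION_AFFECTED) -> float:
    """Light-affected genes as a fraction of all defined protein-coding genes."""
    return expressed * affected
```

Combining the two percentages in the abstract, roughly a quarter of all defined protein-coding genes (0.79 × 0.31 ≈ 0.24) would meet the two-fold criterion.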