Showing papers on "Gene" published in 2019


Journal ArticleDOI
TL;DR: The phylogenetic analysis complemented with synteny analyses suggests that Bmp2, -4 and -16 are remnants of a gene quartet that originated during the two rounds of whole-genome duplication (2R-WGD) early in vertebrate evolution.
Abstract: The vertebrate gene repertoire is characterized by “cryptic” genes whose identification has been hampered by their absence from the genomes of well-studied species. One example is the Bmp16 gene, a paralog of the developmental key genes Bmp2 and -4. We focus on the Bmp2/4/16 group of genes to study the evolutionary dynamics following gen(om)e duplications with special emphasis on the poorly studied Bmp16 gene. We reveal the presence of Bmp16 in chondrichthyans in addition to previously reported teleost fishes and reptiles. Using comprehensive, vertebrate-wide gene sampling, our phylogenetic analysis complemented with synteny analyses suggests that Bmp2, -4 and -16 are remnants of a gene quartet that originated during the two rounds of whole-genome duplication (2R-WGD) early in vertebrate evolution. We confirm that Bmp16 genes were lost independently in at least three lineages (mammals, archelosaurs and amphibians) and report that they have elevated rates of sequence evolution. This finding agrees with their more “flexible” deployment during development; while Bmp16 has limited embryonic expression domains in the cloudy catshark, it is broadly expressed in the green anole lizard. Our study illustrates the dynamics of gene family evolution by integrating insights from sequence diversification, gene repertoire changes, and shuffling of expression domains.

1,376 citations


Posted ContentDOI
Konrad J. Karczewski, Laurent C. Francioli, Grace Tiao, Beryl B. Cummings, Jessica Alföldi, Qingbo Wang, Ryan L. Collins, Kristen M. Laricchia, Andrea Ganna, Daniel P. Birnbaum, Laura D. Gauthier, Harrison Brand, Matthew Solomonson, Nicholas A. Watts, Daniel R. Rhodes, Moriel Singer-Berk, Eleanor G. Seaby, Jack A. Kosmicki, Raymond K. Walters, Katherine Tashman, Yossi Farjoun, Eric Banks, Timothy Poterba, Arcturus Wang, Cotton Seed, Nicola Whiffin, Jessica X. Chong, Kaitlin E. Samocha, Emma Pierce-Hoffman, Zachary Zappala, Anne H. O’Donnell-Luria, Eric Vallabh Minikel, Ben Weisburd, Monkol Lek, James S. Ware, Christopher Vittal, Irina M. Armean, Louis Bergelson, Kristian Cibulskis, Kristen M. Connolly, Miguel Covarrubias, Stacey Donnelly, Steven Ferriera, Stacey Gabriel, Jeff Gentry, Namrata Gupta, Thibault Jeandet, Diane Kaplan, Christopher Llanwarne, Ruchi Munshi, Sam Novod, Nikelle Petrillo, David Roazen, Valentin Ruano-Rubio, Andrea Saltzman, Molly Schleicher, Jose Soto, Kathleen Tibbetts, Charlotte Tolonen, Gordon Wade, Michael E. Talkowski, Benjamin M. Neale, Mark J. Daly, Daniel G. MacArthur
30 Jan 2019-bioRxiv
TL;DR: Using an improved human mutation rate model, human protein-coding genes are classified along a spectrum representing tolerance to inactivation; the classification is validated using data from model organisms and engineered human cells and is shown to improve gene discovery power for both common and rare diseases.
Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism’s function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved model of human mutation, we classify human protein-coding genes along a spectrum representing intolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.

1,128 citations


Journal ArticleDOI
TL;DR: The mechanisms and functions of DNA methylation and demethylation in both mice and humans at CpG-rich promoters, gene bodies and transposable elements are discussed and the dynamic erasure and re-establishment in embryonic, germline and somatic cell development is highlighted.
Abstract: DNA methylation is of paramount importance for mammalian embryonic development. DNA methylation has numerous functions: it is implicated in the repression of transposons and genes, but is also associated with actively transcribed gene bodies and, in some cases, with gene activation per se. In recent years, sensitive technologies have been developed that allow the interrogation of DNA methylation patterns from a small number of cells. The use of these technologies has greatly improved our knowledge of DNA methylation dynamics and heterogeneity in embryos and in specific tissues. Combined with genetic analyses, it is increasingly apparent that regulation of DNA methylation erasure and (re-)establishment varies considerably between different developmental stages. In this Review, we discuss the mechanisms and functions of DNA methylation and demethylation in both mice and humans at CpG-rich promoters, gene bodies and transposable elements. We highlight the dynamic erasure and re-establishment of DNA methylation in embryonic, germline and somatic cell development. Finally, we provide insights into DNA methylation gained from studying genetic diseases. DNA methylation is essential for mammalian embryogenesis owing to its repression of transposons and genes, but it is also associated with gene activation. The recent use of sensitive technologies has revealed that DNA methylation dynamics vary considerably between embryonic, germline and somatic cell development, with implications for genetic diseases and cancer.

1,039 citations


Journal ArticleDOI
03 Apr 2019-Nature
TL;DR: Transcriptional adaptation, a genetic compensation process by which organisms respond to mutations by upregulating related genes, is triggered by mRNA decay and involves a sequence-dependent mechanism.
Abstract: Genetic robustness, or the ability of an organism to maintain fitness in the presence of harmful mutations, can be achieved via protein feedback loops. Previous work has suggested that organisms may also respond to mutations by transcriptional adaptation, a process by which related gene(s) are upregulated independently of protein feedback loops. However, the prevalence of transcriptional adaptation and its underlying molecular mechanisms are unknown. Here, by analysing several models of transcriptional adaptation in zebrafish and mouse, we uncover a requirement for mutant mRNA degradation. Alleles that fail to transcribe the mutated gene do not exhibit transcriptional adaptation, and these alleles give rise to more severe phenotypes than alleles displaying mutant mRNA decay. Transcriptome analysis in alleles displaying mutant mRNA decay reveals the upregulation of a substantial proportion of the genes that exhibit sequence similarity with the mutated gene's mRNA, suggesting a sequence-dependent mechanism. These findings have implications for our understanding of disease-causing mutations, and will help in the design of mutant alleles with minimal transcriptional adaptation-derived compensation.

679 citations


Journal ArticleDOI
TL;DR: This Review discusses how the interaction of p53 with DNA and chromatin affects gene expression, and how p53 post-translational modifications, its temporal expression dynamics and its interactions with chromatin regulators and transcription factors influence cell fate.
Abstract: The tumour suppressor p53 has a central role in the response to cellular stress. Activated p53 transcriptionally regulates hundreds of genes that are involved in multiple biological processes, including in DNA damage repair, cell cycle arrest, apoptosis and senescence. In the context of DNA damage, p53 is thought to be a decision-making transcription factor that selectively activates genes as part of specific gene expression programmes to determine cellular outcomes. In this Review, we discuss the multiple molecular mechanisms of p53 regulation and how they modulate the induction of apoptosis or cell cycle arrest following DNA damage. Specifically, we discuss how the interaction of p53 with DNA and chromatin affects gene expression, and how p53 post-translational modifications, its temporal expression dynamics and its interactions with chromatin regulators and transcription factors influence cell fate. These multiple layers of regulation enable p53 to execute cellular responses that are appropriate for specific cellular states and environmental conditions.

611 citations


Journal ArticleDOI
TL;DR: A comprehensive landscape of different modes of gene duplication across the plant kingdom is identified by comparing 141 genomes, which provides a solid foundation for further investigation of the dynamic evolution of duplicate genes.
Abstract: The sharp increase of plant genome and transcriptome data provide valuable resources to investigate evolutionary consequences of gene duplication in a range of taxa, and unravel common principles underlying duplicate gene retention. We survey 141 sequenced plant genomes to elucidate consequences of gene and genome duplication, processes central to the evolution of biodiversity. We develop a pipeline named DupGen_finder to identify different modes of gene duplication in plants. Genes derived from whole-genome, tandem, proximal, transposed, or dispersed duplication differ in abundance, selection pressure, expression divergence, and gene conversion rate among genomes. The number of WGD-derived duplicate genes decreases exponentially with increasing age of duplication events—transposed duplication- and dispersed duplication-derived genes declined in parallel. In contrast, the frequency of tandem and proximal duplications showed no significant decrease over time, providing a continuous supply of variants available for adaptation to continuously changing environments. Moreover, tandem and proximal duplicates experienced stronger selective pressure than genes formed by other modes and evolved toward biased functional roles involved in plant self-defense. The rate of gene conversion among WGD-derived gene pairs declined over time, peaking shortly after polyploidization. To provide a platform for accessing duplicated gene pairs in different plants, we constructed the Plant Duplicate Gene Database. We identify a comprehensive landscape of different modes of gene duplication across the plant kingdom by comparing 141 genomes, which provides a solid foundation for further investigation of the dynamic evolution of duplicate genes.

461 citations


Journal ArticleDOI
TL;DR: This study revealed that METTL3, acting as an oncogene, maintained SOX2 expression through an m6A-IGF2BP2-dependent mechanism in CRC cells, and indicated a potential biomarker panel for prognostic prediction in CRC.
Abstract: Colorectal carcinoma (CRC) is one of the most common malignant tumors, and its main cause of death is tumor metastasis. RNA N6-methyladenosine (m6A) is an emerging regulatory mechanism for gene expression and methyltransferase-like 3 (METTL3) participates in tumor progression in several cancer types. However, its role in CRC remains unexplored. Western blot, quantitative real-time PCR (RT-qPCR) and immunohistochemistry (IHC) were used to detect METTL3 expression in cell lines and patient tissues. Methylated RNA immunoprecipitation sequencing (MeRIP-seq) and transcriptomic RNA sequencing (RNA-seq) were used to screen the target genes of METTL3. The biological functions of METTL3 were investigated in vitro and in vivo. RNA pull-down and RNA immunoprecipitation assays were conducted to explore the specific binding of target genes. RNA stability assay was used to detect the half-lives of the downstream genes of METTL3. Using the TCGA database, higher METTL3 expression was found in CRC metastatic tissues and was associated with a poor prognosis. MeRIP-seq revealed that SRY (sex determining region Y)-box 2 (SOX2) was the downstream gene of METTL3. METTL3 knockdown in CRC cells drastically inhibited cell self-renewal, stem cell frequency and migration in vitro and suppressed CRC tumorigenesis and metastasis in both cell-based models and PDX models. Mechanistically, methylated SOX2 transcripts, specifically the coding sequence (CDS) regions, were subsequently recognized by the specific m6A “reader”, insulin-like growth factor 2 mRNA binding protein 2 (IGF2BP2), to prevent SOX2 mRNA degradation. Further, SOX2 expression positively correlated with METTL3 and IGF2BP2 in CRC tissues. The combined IHC panel, including “writer”, “reader”, and “target”, exhibited a better prognostic value for CRC patients than any of these components individually. Overall, our study revealed that METTL3, acting as an oncogene, maintained SOX2 expression through an m6A-IGF2BP2-dependent mechanism in CRC cells, and indicated a potential biomarker panel for prognostic prediction in CRC.

454 citations


Journal ArticleDOI
29 Nov 2019-Science
TL;DR: The list of genes likely to be influenced by noncoding variants in AD is revised and expanded and the probable cell types in which they function are suggested to help better understand common genetic variation associated with brain diseases.
Abstract: Noncoding genetic variation is a major driver of phenotypic diversity, but functional interpretation is challenging. To better understand common genetic variation associated with brain diseases, we defined noncoding regulatory regions for major cell types of the human brain. Whereas psychiatric disorders were primarily associated with variants in transcriptional enhancers and promoters in neurons, sporadic Alzheimer's disease (AD) variants were largely confined to microglia enhancers. Interactome maps connecting disease-risk variants in cell-type-specific enhancers to promoters revealed an extended microglia gene network in AD. Deletion of a microglia-specific enhancer harboring AD-risk variants ablated BIN1 expression in microglia, but not in neurons or astrocytes. These findings revise and expand the list of genes likely to be influenced by noncoding variants in AD and suggest the probable cell types in which they function.

414 citations


Journal ArticleDOI
TL;DR: The ability to perform spatially resolved, genome-wide RNA profiling with high detection efficiency and accuracy by MERFISH could help address a wide array of questions ranging from the regulation of gene expression in cells to the development of cell fate and organization in tissues.
Abstract: The expression profiles and spatial distributions of RNAs regulate many cellular functions. Image-based transcriptomic approaches provide powerful means to measure both expression and spatial information of RNAs in individual cells within their native environment. Among these approaches, multiplexed error-robust fluorescence in situ hybridization (MERFISH) has achieved spatially resolved RNA quantification at transcriptome scale by massively multiplexing single-molecule FISH measurements. Here, we increased the gene throughput of MERFISH and demonstrated simultaneous measurements of RNA transcripts from ∼10,000 genes in individual cells with ∼80% detection efficiency and ∼4% misidentification rate. We combined MERFISH with cellular structure imaging to determine subcellular compartmentalization of RNAs. We validated this approach by showing enrichment of secretome transcripts at the endoplasmic reticulum, and further revealed enrichment of long noncoding RNAs, RNAs with retained introns, and a subgroup of protein-coding mRNAs in the cell nucleus. Leveraging spatially resolved RNA profiling, we developed an approach to determine RNA velocity in situ using the balance of nuclear versus cytoplasmic RNA counts. We applied this approach to infer pseudotime ordering of cells and identified cells at different cell-cycle states, revealing ∼1,600 genes with putative cell cycle-dependent expression and a gradual transcription profile change as cells progress through cell-cycle stages. Our analysis further revealed cell cycle-dependent and cell cycle-independent spatial heterogeneity of transcriptionally distinct cells. We envision that the ability to perform spatially resolved, genome-wide RNA profiling with high detection efficiency and accuracy by MERFISH could help address a wide array of questions ranging from the regulation of gene expression in cells to the development of cell fate and organization in tissues.

402 citations


Journal ArticleDOI
17 Apr 2019-Nature
TL;DR: A CBE with rat APOBEC1 is shown to cause extensive transcriptome-wide deamination of RNA cytosines in human cells, inducing tens of thousands of C-to-U edits; the results suggest the need to more fully define and characterize the RNA off-target effects of deaminase enzymes in base editor platforms.
Abstract: CRISPR-Cas base-editor technology enables targeted nucleotide alterations, and is being increasingly used for research and potential therapeutic applications [1,2]. The most widely used cytosine base editors (CBEs) induce deamination of DNA cytosines using the rat APOBEC1 enzyme, which is targeted by a linked Cas protein-guide RNA complex [3,4]. Previous studies of the specificity of CBEs have identified off-target DNA edits in mammalian cells [5,6]. Here we show that a CBE with rat APOBEC1 can cause extensive transcriptome-wide deamination of RNA cytosines in human cells, inducing tens of thousands of C-to-U edits with frequencies ranging from 0.07% to 100% in 38-58% of expressed genes. CBE-induced RNA edits occur in both protein-coding and non-protein-coding sequences and generate missense, nonsense, splice site, and 5' and 3' untranslated region mutations. We engineered two CBE variants bearing mutations in rat APOBEC1 that substantially decreased the number of RNA edits (by more than 390-fold and more than 3,800-fold) in human cells. These variants also showed more precise on-target DNA editing than the wild-type CBE and, for most guide RNAs tested, no substantial reduction in editing efficiency. Finally, we show that an adenine base editor [7] can also induce transcriptome-wide RNA edits. These results have implications for the use of base editors in both research and clinical settings, illustrate the feasibility of engineering improved variants with reduced RNA editing activities, and suggest the need to more fully define and characterize the RNA off-target effects of deaminase enzymes in base editor platforms.

394 citations


Journal ArticleDOI
10 Jan 2019-Cell
TL;DR: A multiplex, expression quantitative trait locus (eQTL)-inspired framework for mapping enhancer-gene pairs by introducing random combinations of CRISPR/Cas9-mediated perturbations to each of many cells, followed by single-cell RNA sequencing (RNA-seq).

Journal ArticleDOI
TL;DR: Paddy trials showed that genome-edited SWEET promoters endow rice lines with robust, broad-spectrum resistance to all Xanthomonas bacterial blight strains tested.
Abstract: Bacterial blight of rice is an important disease in Asia and Africa. The pathogen, Xanthomonas oryzae pv. oryzae (Xoo), secretes one or more of six known transcription-activator-like effectors (TALes) that bind specific promoter sequences and induce, at minimum, one of the three host sucrose transporter genes SWEET11, SWEET13 and SWEET14, the expression of which is required for disease susceptibility. We used CRISPR-Cas9-mediated genome editing to introduce mutations in all three SWEET gene promoters. Editing was further informed by sequence analyses of TALe genes in 63 Xoo strains, which revealed multiple TALe variants for SWEET13 alleles. Mutations were also created in SWEET14, which is also targeted by two TALes from an African Xoo lineage. A total of five promoter mutations were simultaneously introduced into the rice line Kitaake and the elite mega varieties IR64 and Ciherang-Sub1. Paddy trials showed that genome-edited SWEET promoters endow rice lines with robust, broad-spectrum resistance.

Journal ArticleDOI
28 Aug 2019-Nature
TL;DR: Structural and microscopy studies of gene transcription underpin a model in which phosphorylation controls the shuttling of RNA polymerase II between promoter and gene-body condensates to regulate transcription initiation and elongation.
Abstract: The regulated transcription of genes determines cell identity and function. Recent structural studies have elucidated mechanisms that govern the regulation of transcription by RNA polymerases during the initiation and elongation phases. Microscopy studies have revealed that transcription involves the condensation of factors in the cell nucleus. A model is emerging for the transcription of protein-coding genes in which distinct transient condensates form at gene promoters and in gene bodies to concentrate the factors required for transcription initiation and elongation, respectively. The transcribing enzyme RNA polymerase II may shuttle between these condensates in a phosphorylation-dependent manner. Molecular principles are being defined that rationalize transcriptional organization and regulation, and that will guide future investigations. Structural and microscopy studies of gene transcription underpin a model in which phosphorylation controls the shuttling of RNA polymerase II between promoter and gene-body condensates to regulate transcription initiation and elongation.

Journal ArticleDOI
TL;DR: A tomato pan-genome constructed using genome sequences of 725 phylogenetically and geographically representative accessions captures 4,873 genes absent from the reference genome and identifies a rare allele of TomLoxC regulating fruit flavor.
Abstract: Modern tomatoes have narrow genetic diversity limiting their improvement potential. We present a tomato pan-genome constructed using genome sequences of 725 phylogenetically and geographically representative accessions, revealing 4,873 genes absent from the reference genome. Presence/absence variation analyses reveal substantial gene loss and intense negative selection of genes and promoters during tomato domestication and improvement. Lost or negatively selected genes are enriched for important traits, especially disease resistance. We identify a rare allele in the TomLoxC promoter selected against during domestication. Quantitative trait locus mapping and analysis of transgenic plants reveal a role for TomLoxC in apocarotenoid production, which contributes to desirable tomato flavor. In orange-stage fruit, accessions harboring both the rare and common TomLoxC alleles (heterozygotes) have higher TomLoxC expression than those homozygous for either and are resurgent in modern tomatoes. The tomato pan-genome adds depth and completeness to the reference genome, and is useful for future biological discovery and breeding.

Journal ArticleDOI
TL;DR: Tumors with TP53 mutations differ from their non-mutated counterparts in RNA, miRNA, and protein expression patterns, with mutant TP53 tumors displaying enhanced expression of cell cycle progression genes and proteins.

Journal ArticleDOI
18 Mar 2019-Nature
TL;DR: In this paper, optical reconstruction of chromatin architecture (ORCA) is used to trace the DNA path in single cells with nanoscale accuracy and genomic resolution reaching two kilobases.
Abstract: The establishment of cell types during development requires precise interactions between genes and distal regulatory sequences. We have a limited understanding of how these interactions look in three dimensions, vary across cell types in complex tissue, and relate to transcription. Here we describe optical reconstruction of chromatin architecture (ORCA), a method that can trace the DNA path in single cells with nanoscale accuracy and genomic resolution reaching two kilobases. We used ORCA to study a Hox gene cluster in cryosectioned Drosophila embryos and labelled around 30 RNA species in parallel. We identified cell-type-specific physical borders between active and Polycomb-repressed DNA, and unexpected Polycomb-independent borders. Deletion of Polycomb-independent borders led to ectopic enhancer-promoter contacts, aberrant gene expression, and developmental defects. Together, these results illustrate an approach for high-resolution, single-cell DNA domain analysis in vivo, identify domain structures that change with cell identity, and show that border elements contribute to the formation of physical domains in Drosophila.

Journal ArticleDOI
01 May 2019-Nature
TL;DR: The number of codons used to encode the canonical amino acids can be reduced through the genome-wide substitution of target codons by defined synonyms, as demonstrated in an Escherichia coli variant whose synthetic genome was built by high-fidelity convergent total synthesis.
Abstract: Nature uses 64 codons to encode the synthesis of proteins from the genome, and chooses 1 sense codon (out of up to 6 synonyms) to encode each amino acid. Synonymous codon choice has diverse and important roles, and many synonymous substitutions are detrimental. Here we demonstrate that the number of codons used to encode the canonical amino acids can be reduced, through the genome-wide substitution of target codons by defined synonyms. We create a variant of Escherichia coli with a four-megabase synthetic genome through a high-fidelity convergent total synthesis. Our synthetic genome implements a defined recoding and refactoring scheme (with simple corrections at just seven positions) to replace every known occurrence of two sense codons and a stop codon in the genome. Thus, we recode 18,214 codons to create an organism with a 61-codon genome; this organism uses 59 codons to encode the 20 amino acids, and enables the deletion of a previously essential transfer RNA.

Journal ArticleDOI
10 Jun 2019-Nature
TL;DR: The deaminases that are integral to commonly used DNA base editors often bind to RNA; this paper quantitatively evaluates the RNA single nucleotide variations (SNVs) induced by CBEs or ABEs.
Abstract: Recently developed DNA base editing methods enable the direct generation of desired point mutations in genomic DNA without generating any double-strand breaks [1-3], but the issue of off-target edits has limited the application of these methods. Although several previous studies have evaluated off-target mutations in genomic DNA [4-8], it is now clear that the deaminases that are integral to commonly used DNA base editors often bind to RNA [9-13]. For example, the cytosine deaminase APOBEC1, which is used in cytosine base editors (CBEs), targets both DNA and RNA [12], and the adenine deaminase TadA, which is used in adenine base editors (ABEs), induces site-specific inosine formation on RNA [9,11]. However, any potential RNA mutations caused by DNA base editors have not been evaluated. Adeno-associated viruses are the most common delivery system for gene therapies that involve DNA editing; these viruses can sustain long-term gene expression in vivo, so the extent of potential RNA mutations induced by DNA base editors is of great concern [14-16]. Here we quantitatively evaluated RNA single nucleotide variations (SNVs) that were induced by CBEs or ABEs. Both the cytosine base editor BE3 and the adenine base editor ABE7.10 generated tens of thousands of off-target RNA SNVs. Subsequently, by engineering deaminases, we found that three CBE variants and one ABE variant showed a reduction in off-target RNA SNVs to the baseline while maintaining efficient DNA on-target activity. This study reveals a previously overlooked aspect of off-target effects in DNA editing and also demonstrates that such effects can be eliminated by engineering deaminases.

Journal ArticleDOI
TL;DR: The first annotated chromosome-level reference genome assembly for pea, Gregor Mendel’s original genetic model, provides insights into legume genome evolution and the molecular basis of agricultural traits for pea improvement.
Abstract: We report the first annotated chromosome-level reference genome assembly for pea, Gregor Mendel’s original genetic model. Phylogenetics and paleogenomics show genomic rearrangements across legumes and suggest a major role for repetitive elements in pea genome evolution. Compared to other sequenced Leguminosae genomes, the pea genome shows intense gene dynamics, most likely associated with genome size expansion when the Fabeae diverged from its sister tribes. During Pisum evolution, translocation and transposition differentially occurred across lineages. This reference sequence will accelerate our understanding of the molecular basis of agronomically important traits and support crop improvement.

Journal ArticleDOI
TL;DR: The use of a commercially available droplet-based microfluidics platform for high-throughput scRNA-seq to obtain single-cell transcriptomes from protoplasts of more than 10,000 Arabidopsis (Arabidopsis thaliana) root cells demonstrates the feasibility and utility of scRNA-seq in plants and provides a first-generation gene expression map of the Arabidopsis root at single-cell resolution.
Abstract: Single-cell RNA sequencing (scRNA-seq) has been used extensively to study cell-specific gene expression in animals, but it has not been widely applied to plants. Here, we describe the use of a commercially available droplet-based microfluidics platform for high-throughput scRNA-seq to obtain single-cell transcriptomes from protoplasts of more than 10,000 Arabidopsis (Arabidopsis thaliana) root cells. We find that all major tissues and developmental stages are represented in this single-cell transcriptome population. Further, distinct subpopulations and rare cell types, including putative quiescent center cells, were identified. A focused analysis of root epidermal cell transcriptomes defined developmental trajectories for individual cells progressing from meristematic through mature stages of root-hair and nonhair cell differentiation. In addition, single-cell transcriptomes were obtained from root epidermis mutants, enabling a comparative analysis of gene expression at single-cell resolution and providing an unprecedented view of the impact of the mutated genes. Overall, this study demonstrates the feasibility and utility of scRNA-seq in plants and provides a first-generation gene expression map of the Arabidopsis root at single-cell resolution.

Journal ArticleDOI
TL;DR: It is shown that following viral infection or stimulation of cells with an inactivated virus, deletion of the m6A ‘writer’ METTL3 or ‘reader’ YTHDF2 led to an increase in the induction of interferon-stimulated genes, and propagation of different viruses was suppressed in an interferon-signaling-dependent manner.
Abstract: N6-methyladenosine (m6A) is the most common mRNA modification. Recent studies have revealed that depletion of m6A machinery leads to alterations in the propagation of diverse viruses. These effects were proposed to be mediated through dysregulated methylation of viral RNA. Here we show that following viral infection or stimulation of cells with an inactivated virus, deletion of the m6A 'writer' METTL3 or 'reader' YTHDF2 led to an increase in the induction of interferon-stimulated genes. Consequently, propagation of different viruses was suppressed in an interferon-signaling-dependent manner. Significantly, the mRNA of IFNB, the gene encoding the main cytokine that drives the type I interferon response, was m6A modified and was stabilized following repression of METTL3 or YTHDF2. Furthermore, we show that m6A-mediated regulation of interferon genes was conserved in mice. Together, our findings uncover the role m6A serves as a negative regulator of interferon response by dictating the fast turnover of interferon mRNAs and consequently facilitating viral propagation.

Journal ArticleDOI
20 Dec 2019-Science
TL;DR: The cellular distribution of genes known to cause primary immunodeficiencies in humans is shown, and many of these genes are found to be expressed in cells not currently implicated in these diseases, illustrating how this global atlas can help us better understand the function of specific genes across cells and tissues in humans.
Abstract: INTRODUCTION Blood is the predominant source for molecular analyses in humans, both in clinical and research settings, and is the target for many therapeutic strategies, emphasizing the need for comprehensive molecular maps of the cells constituting human blood. The Human Protein Atlas program (www.proteinatlas.org) is an open-access database that aims to map all human proteins by integrating various omics technologies, including antibody-based imaging. Previously, the Human Protein Atlas included gene expression information from peripheral blood mononuclear cells but not the many subpopulations of blood cells within this cell type. To increase the resolution, we performed an in-depth characterization of the constituent cells in blood to provide a detailed view of the gene expression in individual human blood cells and relate these to the other tissues in the body. RATIONALE A quantitative transcriptomics-based expression analysis was performed in 18 canonical immune cell populations (Fig. 1) isolated by flow cytometric sorting. The blood cell expression profiles are presented in combination with expression profiles of tissues, including transcriptomics data from external sources to expand the number of tissue types as well as brain regions included in the database. A genome-wide classification of the protein-coding genes has been performed in terms of expression specificity and distribution, both in blood cells and tissues. RESULTS We present an atlas of the expression of all protein-coding genes in human blood cells, integrated with a classification of the specificity and distribution of all protein-coding genes in all major tissues and organs in the human body. A genome-wide analysis of blood cell RNA expression profiles allowed the identification of genes with elevated expression in various immune cells, confirming well-known protein markers, but also identified novel targets for in-depth analysis.
There are 1448 protein-coding genes that have enriched expression in a single immune cell type. It will be interesting to study the corresponding proteins further to explore the biological functions linked to the respective cell phenotypes. A network plot of all cell type–enriched and group-enriched genes (Fig. 1B) reveals that many of the cell type–enriched genes are in neutrophils, eosinophils, and plasmacytoid dendritic cells, while many of the elevated genes in T and B cells are group-enriched across subpopulations of these lymphocytes. To illustrate the usefulness of this resource, we show the cellular distribution of genes known to cause primary immunodeficiencies in humans and find that many of these genes are expressed in cells not currently implicated in these diseases, illustrating how this global atlas can help us better understand the function of specific genes across cells and tissues in humans. CONCLUSION In this study, we have performed a genome-wide transcriptomic analysis of protein-coding genes in sorted blood immune cell populations to characterize the expression levels of each individual gene across all cell types. All data are presented in an interactive, open-access Blood Atlas as part of the Human Protein Atlas and are integrated with expression profiles across all major tissues to provide spatial classification of all protein-coding genes. This allows for a genome-wide exploration of the expression profiles across human immune cell populations and all major human tissues and organs.

Journal ArticleDOI
TL;DR: This study proposed a complex KIAA1429-GATA3 regulatory model based on m6A modification and provided insights into the epi-transcriptomic dysregulation in hepatocarcinogenesis and metastasis.
Abstract: N6-methyladenosine (m6A) modification, the most abundant internal methylation of eukaryotic RNA transcripts, is critically implicated in RNA processing. As the largest known component in the m6A methyltransferase complex, KIAA1429 plays a vital role in m6A methylation. However, its function and mechanism in hepatocellular carcinoma (HCC) remain poorly defined. Quantitative PCR, western blot and immunohistochemistry were used to measure the expression of KIAA1429 in HCC. The effects of KIAA1429 on the malignant phenotypes of hepatoma cells were examined in vitro and in vivo. MeRIP-seq, RIP-seq and RNA-seq were performed to identify the target genes of KIAA1429. KIAA1429 was considerably upregulated in HCC tissues. High expression of KIAA1429 was associated with poor prognosis among HCC patients. Silencing KIAA1429 suppressed cell proliferation and metastasis in vitro and in vivo. GATA3 was identified as the direct downstream target of KIAA1429-mediated m6A modification. KIAA1429 induced m6A methylation on the 3′ UTR of GATA3 pre-mRNA, leading to the separation of the RNA-binding protein HuR and the degradation of GATA3 pre-mRNA. Strikingly, a long noncoding RNA (lncRNA) GATA3-AS, transcribed from the antisense strand of the GATA3 gene, functioned as a cis-acting element for the preferential interaction of KIAA1429 with GATA3 pre-mRNA. Accordingly, we found that the tumor growth and metastasis driven by KIAA1429 or GATA3-AS were mediated by GATA3. Our study proposed a complex KIAA1429-GATA3 regulatory model based on m6A modification and provided insights into the epi-transcriptomic dysregulation in hepatocarcinogenesis and metastasis.

Journal ArticleDOI
TL;DR: It is proposed that the transcription compartment is part of the regulatory architecture of embryonic nuclei and offers a transcriptionally competent environment to facilitate early escape from repression before global genome activation.
Abstract: Most metazoan embryos commence development with rapid, transcriptionally silent cell divisions, with genome activation delayed until the mid-blastula transition (MBT). However, a set of genes escapes global repression and gets activated before MBT. Here we describe the formation and the spatio-temporal dynamics of a pair of distinct transcription compartments, which encompasses the earliest gene expression in zebrafish. 4D imaging of pri-miR430 and zinc-finger-gene activities by a novel, native transcription imaging approach reveals transcriptional sharing of nuclear compartments, which are regulated by homologous chromosome organisation. These compartments carry the majority of nascent-RNAs and active Polymerase II, are chromatin-depleted and represent the main sites of detectable transcription before MBT. Transcription occurs during the S-phase of increasingly permissive cleavage cycles. It is proposed that the transcription compartment is part of the regulatory architecture of embryonic nuclei and offers a transcriptionally competent environment to facilitate early escape from repression before global genome activation. Transcription is globally repressed in the early stages of embryo development, but a set of genes including pri-miR-430 and zinc finger genes is known to escape the repression. Here the authors image the very first transcriptional activities in the living zebrafish embryo, demonstrating a cell cycle-coordinated polymerase II transcription compartment.

Journal ArticleDOI
TL;DR: Mapping long-range chromatin interactions in 27 human cell/tissue types identifies candidate target genes of 70,329 putative regulatory elements and suggests potential regulatory function for 27,325 noncoding sequence variants associated with 2,117 physiological traits and diseases.
Abstract: A large number of putative cis-regulatory sequences have been annotated in the human genome, but the genes they control remain poorly defined. To bridge this gap, we generate maps of long-range chromatin interactions centered on 18,943 well-annotated promoters for protein-coding genes in 27 human cell/tissue types. We use this information to infer the target genes of 70,329 candidate regulatory elements and suggest potential regulatory function for 27,325 noncoding sequence variants associated with 2,117 physiological traits and diseases. Integrative analysis of these promoter-centered interactome maps reveals widespread enhancer-like promoters involved in gene regulation and common molecular pathways underlying distinct groups of human traits and diseases.

Journal ArticleDOI
TL;DR: Systematic analysis of highly rearranged balancer chromosomes in Drosophila shows that extensive changes to chromatin topology affect the expression of only a subset of genes, and suggests that properties other than chromatin topology ensure productive enhancer–promoter interactions.
Abstract: Chromatin topology is intricately linked to gene expression, yet its functional requirement remains unclear. Here, we comprehensively assessed the interplay between genome topology and gene expression using highly rearranged chromosomes (balancers) spanning ~75% of the Drosophila genome. Using transheterozygote (balancer/wild-type) embryos, we measured allele-specific changes in topology and gene expression in cis, while minimizing trans effects. Through genome sequencing, we resolved eight large nested inversions, smaller inversions, duplications and thousands of deletions. These extensive rearrangements caused many changes to chromatin topology, disrupting long-range loops, topologically associating domains (TADs) and promoter interactions, yet these are not predictive of changes in expression. Gene expression is generally not altered around inversion breakpoints, indicating that mis-appropriate enhancer–promoter activation is a rare event. Similarly, shuffling or fusing TADs, changing intra-TAD connections and disrupting long-range inter-TAD loops does not alter expression for the majority of genes. Our results suggest that properties other than chromatin topology ensure productive enhancer–promoter interactions. Systematic analysis of highly rearranged balancer chromosomes in Drosophila shows that extensive changes to chromatin topology affect the expression of only a subset of genes.

Journal ArticleDOI
TL;DR: It is shown that DNA sequences encoding TF binding site number, density, and affinity above sharply defined thresholds drive condensation of TFs and coactivators, which helps to understand how the genome can scaffold transcriptional condensates at specific loci and how the universal phenomenon of phase separation might regulate this process.

Journal ArticleDOI
TL;DR: This review summarizes the mechanisms of miRNA-mediated transcriptional and post-transcriptional gene regulation and elaborates on the synergistic effects among these actions, which form a regulatory network of a miRNA on its target.
Abstract: MicroRNAs (miRNAs) are a class of endogenous small noncoding RNAs that participate in a majority of biological processes via regulating target gene expression. The post-transcriptional repression through miRNA seed region binding to 3' UTR of target mRNA is considered as the canonical mode of miRNA-mediated gene regulation. However, emerging evidence suggests that other regulatory modes exist beyond the canonical mechanism. In particular, the function of intranuclear miRNA in gene transcriptional regulation is gradually revealed, with evidence showing their contribution to gene silencing or activating. Therefore, miRNA-mediated regulation of gene transcription not only expands our understanding of the molecular mechanism underlying miRNA regulatory function, but also provides new evidence to explain its ability in the sophisticated regulation of many bioprocesses. In this review, mechanisms of miRNA-mediated gene transcriptional and post-transcriptional regulation are summarized, and the synergistic effects among these actions which form a regulatory network of a miRNA on its target are particularly elaborated. With these discussions, we aim to emphasize the importance of miRNA regulatory network on target gene regulation and further highlight the potential application of the network mode in the achievement of a more effective and stable modulation of the target gene expression.

Journal ArticleDOI
TL;DR: The results demonstrate that single cell transcriptomics holds promise for studying plant development and plant physiology with unprecedented resolution and address the longstanding question of possible heterogeneity among cell types in the response to an abiotic stress.
Abstract: Single cell RNA sequencing can yield high-resolution cell-type–specific expression signatures that reveal new cell types and the developmental trajectories of cell lineages. Here, we apply this approach to Arabidopsis (Arabidopsis thaliana) root cells to capture gene expression in 3,121 root cells. We analyze these data with Monocle 3, which orders single cell transcriptomes in an unsupervised manner and uses machine learning to reconstruct single cell developmental trajectories along pseudotime. We identify hundreds of genes with cell-type–specific expression, with pseudotime analysis of several cell lineages revealing both known and novel genes that are expressed along a developmental trajectory. We identify transcription factor motifs that are enriched in early and late cells, together with the corresponding candidate transcription factors that likely drive the observed expression patterns. We assess and interpret changes in total RNA expression along developmental trajectories and show that trajectory branch points mark developmental decisions. Finally, by applying heat stress to whole seedlings, we address the longstanding question of possible heterogeneity among cell types in the response to an abiotic stress. Although the response of canonical heat-shock genes dominates expression across cell types, subtle but significant differences in other genes can be detected among cell types. Taken together, our results demonstrate that single cell transcriptomics holds promise for studying plant development and plant physiology with unprecedented resolution.

Journal ArticleDOI
TL;DR: A single-cell framework that integrates highly multiplexed protein quantification, transcriptome profiling and analysis of chromatin accessibility is presented that uncovers cancer-regulatory programs in single cells.
Abstract: Identifying the causes of human diseases requires deconvolution of abnormal molecular phenotypes spanning DNA accessibility, gene expression and protein abundance [1–3]. We present a single-cell framework that integrates highly multiplexed protein quantification, transcriptome profiling and analysis of chromatin accessibility. Using this approach, we establish a normal epigenetic baseline for healthy blood development, which we then use to deconvolve aberrant molecular features within blood from patients with mixed-phenotype acute leukemia [4,5]. Despite widespread epigenetic heterogeneity within the patient cohort, we observe common malignant signatures across patients as well as patient-specific regulatory features that are shared across phenotypic compartments of individual patients. Integrative analysis of transcriptomic and chromatin-accessibility maps identified 91,601 putative peak-to-gene linkages and transcription factors that regulate leukemia-specific genes, such as RUNX1-linked regulatory elements proximal to the marker gene CD69. These results demonstrate how integrative, multiomic analysis of single cells within the framework of normal development can reveal both distinct and shared molecular mechanisms of disease from patient samples. Analyzing DNA accessibility, transcriptome and protein expression in single cells uncovers cancer-regulatory programs.