Journal ArticleDOI

Genome-wide quantitative enhancer activity maps identified by STARR-seq.

01 Mar 2013-Science (American Association for the Advancement of Science)-Vol. 339, Iss: 6123, pp 1074-1077
TL;DR: STARR-seq identifies thousands of cell type–specific enhancers across a broad continuum of strengths, links differential gene expression to differences in enhancer activity, and creates a genome-wide quantitative enhancer map, revealing the highly complex regulation of transcription.
Abstract: Genomic enhancers are important regulators of gene expression, but their identification is a challenge, and methods depend on indirect measures of activity. We developed a method termed STARR-seq to directly and quantitatively assess enhancer activity for millions of candidates from arbitrary sources of DNA, which enables screens across entire genomes. When applied to the Drosophila genome, STARR-seq identifies thousands of cell type–specific enhancers across a broad continuum of strengths, links differential gene expression to differences in enhancer activity, and creates a genome-wide quantitative enhancer map. This map reveals the highly complex regulation of transcription, with several independent enhancers for both developmental regulators and ubiquitously expressed genes. STARR-seq can be used to identify and quantify enhancer activity in other eukaryotes, including humans.
Citations
Journal ArticleDOI
TL;DR: An overview of enhancer-associated modifications of histones and DNA is given, the enzymatic activities involved in their dynamic deposition and removal are discussed, and potential downstream effectors of these marks are described.

1,215 citations


Cites background from "Genome-wide quantitative enhancer a..."

  • ...Moreover, although limited in scope of validation, existing studies strongly support predictive power of epigenomic annotation in enhancer discovery (Arnold et al., 2013; Blow et al., 2010; Bonn et al., 2012; May et al., 2012; Visel et al., 2009)....

Journal ArticleDOI
TL;DR: This Review discusses how properties of enhancer sequences and chromatin are used to predict enhancers in genome-wide studies, and covers recently developed high-throughput methods that allow the direct testing and identification of enhancers on the basis of their activity.
Abstract: Cellular development, morphology and function are governed by precise patterns of gene expression. These are established by the coordinated action of genomic regulatory elements known as enhancers or cis-regulatory modules. More than 30 years after the initial discovery of enhancers, many of their properties have been elucidated; however, despite major efforts, we only have an incomplete picture of enhancers in animal genomes. In this Review, we discuss how properties of enhancer sequences and chromatin are used to predict enhancers in genome-wide studies. We also cover recently developed high-throughput methods that allow the direct testing and identification of enhancers on the basis of their activity. Finally, we discuss recent technological advances and current challenges in the field of regulatory genomics.

1,163 citations

20 Nov 2014
TL;DR: The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways.
Abstract: The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain

1,020 citations

Journal ArticleDOI
TL;DR: The latest understanding of long-range enhancer–promoter crosstalk is discussed, including target-gene specificity, interaction dynamics, protein and RNA architects of interactions, roles of 3D genome organization and the pathological consequences of regulatory rewiring.
Abstract: Spatiotemporal gene expression programmes are orchestrated by transcriptional enhancers, which are key regulatory DNA elements that engage in physical contacts with their target-gene promoters, often bridging considerable genomic distances. Recent progress in genomics, genome editing and microscopy methodologies has enabled the genome-wide mapping of enhancer-promoter contacts and their functional dissection. In this Review, we discuss novel concepts on how enhancer-promoter interactions are established and maintained, how the 3D architecture of mammalian genomes both facilitates and constrains enhancer-promoter contacts, and the role they play in gene expression control during normal development and disease.

646 citations

Journal ArticleDOI
29 Sep 2016-Science
TL;DR: A high-throughput approach that uses clustered regularly interspaced short palindromic repeats (CRISPR) interference (CRISPRi) to discover regulatory elements and identify their target genes, and that can be applied to dissect transcriptional networks and interpret the contributions of noncoding genetic variation to human disease.
Abstract: Gene expression in mammals is regulated by noncoding elements that can affect physiology and disease, yet the functions and target genes of most noncoding elements remain unknown. We present a high-throughput approach that uses clustered regularly interspaced short palindromic repeats (CRISPR) interference (CRISPRi) to discover regulatory elements and identify their target genes. We assess >1 megabase of sequence in the vicinity of two essential transcription factors, MYC and GATA1, and identify nine distal enhancers that control gene expression and cellular proliferation. Quantitative features of chromatin state and chromosome conformation distinguish the seven enhancers that regulate MYC from other elements that do not, suggesting a strategy for predicting enhancer-promoter connectivity. This CRISPRi-based approach can be applied to dissect transcriptional networks and interpret the contributions of noncoding genetic variation to human disease.

500 citations

References
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations

Journal ArticleDOI
TL;DR: A new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format, which allows the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks.
Abstract: Motivation: Testing for correlations between different sets of genomic features is a fundamental task in genomics research. However, searching for overlaps between features with existing web-based methods is complicated by the massive datasets that are routinely produced with current sequencing technologies. Fast and flexible tools are therefore required to ask complex questions of these data in an efficient manner. Results: This article introduces a new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) format. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. BEDTools can be combined with one another as well as with standard UNIX commands, thus facilitating routine genomics tasks as well as pipelines that can quickly answer intricate questions of large genomic datasets. Availability and implementation: BEDTools was written in C++. Source code and a comprehensive user manual are freely available at http://code.google.com/p/bedtools

18,858 citations


Additional excerpts

  • ...For all libraries we computed the read coverage at each position for the non-repetitive euchromatic genome using BEDTools (34)....

  • ...To intersect genomic coordinates such as STARR-seq enhancer elements and DHS-seq open regions, or to find the closest TSS to a specific element, we used the BEDTools (34) suite of programs....
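
The two operations named in these excerpts, intersecting coordinate sets and finding the closest TSS, can be illustrated with a minimal pure-Python sketch. This is not BEDTools itself and does not reproduce its options or its distance and tie-breaking rules; the `Interval` type, the function names, and the example coordinates are all hypothetical, and BED-style half-open coordinates (0-based start, exclusive end) are assumed.

```python
from typing import List, NamedTuple, Optional, Tuple


class Interval(NamedTuple):
    """Hypothetical BED-style interval: 0-based start, exclusive end."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    # Two half-open intervals overlap if each starts before the other ends.
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def intersect(features: List[Interval], regions: List[Interval]) -> List[Interval]:
    # Keep every feature that overlaps at least one region (quadratic scan,
    # fine for a toy example but far slower than a dedicated tool).
    return [f for f in features if any(overlaps(f, r) for r in regions)]


def distance_to_position(element: Interval, pos: int) -> int:
    # 0 if the position lies inside the element, otherwise the gap in bases.
    if pos < element.start:
        return element.start - pos
    if pos >= element.end:
        return pos - element.end + 1
    return 0


def closest_tss(element: Interval,
                tss_sites: List[Tuple[str, int]]) -> Optional[Tuple[str, int]]:
    # Only TSSs on the same chromosome are candidates.
    same_chrom = [t for t in tss_sites if t[0] == element.chrom]
    if not same_chrom:
        return None
    return min(same_chrom, key=lambda t: distance_to_position(element, t[1]))


# Made-up coordinates, for illustration only.
enhancers = [Interval("chr2L", 100, 200), Interval("chr2L", 500, 600)]
open_regions = [Interval("chr2L", 150, 300)]
tss_sites = [("chr2L", 50), ("chr2L", 700), ("chr3R", 550)]

accessible_enhancers = intersect(enhancers, open_regions)
nearest = closest_tss(enhancers[1], tss_sites)
```

For genome-scale data a sorted sweep or an interval tree would replace the quadratic scan; the sketch only makes the coordinate logic explicit.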

Journal ArticleDOI
06 Sep 2012-Nature
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of the authors' genes and genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

13,548 citations

Journal ArticleDOI
TL;DR: This work presents Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer, and uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions.
Abstract: We present Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer. MACS empirically models the shift size of ChIP-Seq tags, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, and is freely available.

13,008 citations


Additional excerpts

  • ...We also called accessible regions genome-wide by MACS (32) with default parameters (-m 5,50 -g dm -p 1e-5) on the merged replicates for S2 and OSC (Table S4-S5)....
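
The "dynamic Poisson distribution" mentioned in the MACS abstract, together with the `-p 1e-5` cutoff quoted above, can be illustrated roughly as follows. This is a hedged sketch of the general idea, not MACS code: it assumes the local background rate is taken as the maximum of a genome-wide rate and rates estimated in windows around the candidate region, and the function names and example numbers are hypothetical.

```python
import math
from typing import Sequence


def poisson_upper_tail(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam) by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def local_lambda(genome_rate: float, window_rates: Sequence[float]) -> float:
    # Assumption: use the largest background estimate so that locally
    # elevated background makes enrichment harder to call.
    return max([genome_rate, *window_rates])


def is_enriched(observed_reads: int, genome_rate: float,
                window_rates: Sequence[float], p_cutoff: float = 1e-5) -> bool:
    lam = local_lambda(genome_rate, window_rates)
    return poisson_upper_tail(observed_reads, lam) < p_cutoff


# Made-up numbers: 40 reads observed where about 5 to 8 are expected.
call = is_enriched(40, genome_rate=5.0, window_rates=[6.5, 8.0])
```

Computing the tail as one minus the lower-tail sum loses precision for extremely small p-values; a production implementation would work in log space.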

Journal ArticleDOI
TL;DR: Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.
Abstract: We have mapped and quantified mouse transcriptomes by deeply sequencing them and recording how frequently each gene is represented in the sequence sample (RNA-Seq). This provides a digital measure of the presence and prevalence of transcripts from known and previously unknown genes. We report reference measurements composed of 41–52 million mapped 25-base-pair reads for poly(A)-selected RNA from adult mouse brain, liver and skeletal muscle tissues. We used RNA standards to quantify transcript prevalence and to test the linear range of transcript detection, which spanned five orders of magnitude. Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors. RNA splice events, which are not readily measured by standard gene expression microarray or serial analysis of gene expression methods, were detected directly by mapping splice-crossing sequence reads. We observed 1.45 × 10^5 distinct splices, and alternative splices were prominent, with 3,500 different genes expressing one or more alternate internal splices. The mRNA population specifies a cell’s identity and helps to govern its present and future activities. This has made transcriptome analysis a general phenotyping method, with expression microarrays of many kinds in routine use. Here we explore the possibility that transcriptome analysis, transcript discovery and transcript refinement can be done effectively in large and complex mammalian genomes by ultra-high-throughput sequencing. Expression microarrays are currently the most widely used methodology for transcriptome analysis, although some limitations persist. These include hybridization and cross-hybridization artifacts (1–3), dye-based detection issues and design constraints that preclude or seriously limit the detection of RNA splice patterns and previously unmapped genes. These issues have made it difficult for standard array designs to provide full sequence comprehensiveness (coverage of all possible genes, including unknown ones, in large genomes) or transcriptome comprehensiveness (reliable detection of all RNAs of all prevalence classes, including the least abundant ones that are physiologically relevant). Other

12,293 citations


Additional excerpts

  • ...31 (29) and calculated RPKM values (Reads Per Kilobase of exon model per Million mapped reads) (30)....
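
The RPKM definition spelled out in this excerpt (Reads Per Kilobase of exon model per Million mapped reads) translates directly into a small calculation. The sketch below follows only that definition; the function name and the example numbers are hypothetical, and it does not reproduce the read-counting steps of the original pipeline.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of exon model per Million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    #   = reads * 1e9 / (length in bp * mapped reads)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Made-up example: 500 reads on a 2,000 bp exon model, 10 million mapped reads.
example_value = rpkm(500, 2_000, 10_000_000)
```

Normalizing by both exon length and library size is what makes values comparable across genes of different lengths and across samples sequenced to different depths.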

Related Papers (5)
19 Feb 2015-Nature
Anshul Kundaje, Wouter Meuleman, Jason Ernst, Misha Bilenky, Angela Yen, Alireza Heravi-Moussavi, Pouya Kheradpour, Zhizhuo Zhang, Jianrong Wang, Michael J. Ziller, Viren Amin, John W. Whitaker, Matthew D. Schultz, Lucas D. Ward, Abhishek Sarkar, Gerald Quon, Richard Sandstrom, Matthew L. Eaton, Yi-Chieh Wu, Andreas R. Pfenning, Xinchen Wang, Melina Claussnitzer, Yaping Liu, Cristian Coarfa, R. Alan Harris, Noam Shoresh, Charles B. Epstein, Elizabeta Gjoneska, Danny Leung, Wei Xie, R. David Hawkins, Ryan Lister, Chibo Hong, Philippe Gascard, Andrew J. Mungall, Richard A. Moore, Eric Chuah, Angela Tam, Theresa K. Canfield, R. Scott Hansen, Rajinder Kaul, Peter J. Sabo, Mukul S. Bansal, Annaick Carles, Jesse R. Dixon, Kai How Farh, Soheil Feizi, Rosa Karlic, Ah Ram Kim, Ashwinikumar Kulkarni, Daofeng Li, Rebecca F. Lowdon, Ginell Elliott, Tim R. Mercer, Shane Neph, Vitor Onuchic, Paz Polak, Nisha Rajagopal, Pradipta R. Ray, Richard C Sallari, Kyle Siebenthall, Nicholas A Sinnott-Armstrong, Michael Stevens, Robert E. Thurman, Jie Wu, Bo Zhang, Xin Zhou, Arthur E. Beaudet, Laurie A. Boyer, Philip L. De Jager, Peggy J. Farnham, Susan J. Fisher, David Haussler, Steven J.M. Jones, Wei Li, Marco A. Marra, Michael T. McManus, Shamil R. Sunyaev, James A. Thomson, Thea D. Tlsty, Li-Huei Tsai, Wei Wang, Robert A. Waterland, Michael Q. Zhang, Lisa Helbling Chadwick, Bradley E. Bernstein, Joseph F. Costello, Joseph R. Ecker, Martin Hirst, Alexander Meissner, Aleksandar Milosavljevic, Bing Ren, John A. Stamatoyannopoulos, Ting Wang, Manolis Kellis