Author

Sam John

Other affiliations: National Institutes of Health
Bio: Sam John is an academic researcher at the University of Washington. The author has contributed to research on chromatin and DNA methylation, has an h-index of 9, and has co-authored 12 publications receiving 3,721 citations. Previous affiliations of Sam John include the National Institutes of Health.

Papers
Journal Article
06 Sep 2012-Nature
TL;DR: The first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types is presented, revealing novel relationships between chromatin accessibility, transcription, DNA methylation and regulatory factor occupancy patterns.
Abstract: DNase I hypersensitive sites (DHSs) are markers of regulatory DNA and have underpinned the discovery of all classes of cis-regulatory elements including enhancers, promoters, insulators, silencers and locus control regions. Here we present the first extensive map of human DHSs identified through genome-wide profiling in 125 diverse cell and tissue types. We identify ∼2.9 million DHSs that encompass virtually all known experimentally validated cis-regulatory sequences and expose a vast trove of novel elements, most with highly cell-selective regulation. Annotating these elements using ENCODE data reveals novel relationships between chromatin accessibility, transcription, DNA methylation and regulatory factor occupancy patterns. We connect ∼580,000 distal DHSs with their target promoters, revealing systematic pairing of different classes of distal DHSs and specific promoter types. Patterning of chromatin accessibility at many regulatory regions is organized with dozens to hundreds of co-activated elements, and the transcellular DNase I sensitivity pattern at a given region can predict cell-type-specific functional behaviours. The DHS landscape shows signatures of recent functional evolutionary constraint. However, the DHS compartment in pluripotent and immortalized cells exhibits higher mutation rates than that in highly differentiated cells, exposing an unexpected link between chromatin accessibility, proliferative potential and patterns of human variation.

2,628 citations

Journal Article
06 Sep 2012-Nature
TL;DR: A stereotyped 50-base-pair footprint is identified that precisely defines the site of transcript origination within thousands of human promoters, and a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function are described.
Abstract: Regulatory factor binding to genomic DNA protects the underlying sequence from cleavage by DNase I, leaving nucleotide-resolution footprints. Using genomic DNase I footprinting across 41 diverse cell and tissue types, we detected 45 million transcription factor occupancy events within regulatory regions, representing differential binding to 8.4 million distinct short sequence elements. Here we show that this small genomic sequence compartment, roughly twice the size of the exome, encodes an expansive repertoire of conserved recognition sequences for DNA-binding proteins that nearly doubles the size of the human cis-regulatory lexicon. We find that genetic variants affecting allelic chromatin states are concentrated in footprints, and that these elements are preferentially sheltered from DNA methylation. High-resolution DNase I cleavage patterns mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein-DNA interfaces, indicating that transcription factor structure has been evolutionarily imprinted on the human genome sequence. We identify a stereotyped 50-base-pair footprint that precisely defines the site of transcript origination within thousands of human promoters. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation and pluripotency.

846 citations

Journal Article
TL;DR: The results suggest that DNA methylation is not a primary groundskeeper of genomic TF landscapes, but rather a specialized mechanism for stabilizing intrinsically labile sites, which are characterized by highly variable CTCF occupancy across cell types.

242 citations

Journal Article
TL;DR: It is shown that select exonic regions are demarcated within the three-dimensional structure of the human genome, connecting local genome topography, chromatin structure and cis-regulatory landscapes with the generation of human transcriptional complexity by cotranscriptional splicing.
Abstract: The precise splicing of genes confers an enormous transcriptional complexity to the human genome. The majority of gene splicing occurs cotranscriptionally, permitting epigenetic modifications to affect splicing outcomes. Here we show that select exonic regions are demarcated within the three-dimensional structure of the human genome. We identify a subset of exons that exhibit DNase I hypersensitivity and are accompanied by 'phantom' signals in chromatin immunoprecipitation and sequencing (ChIP-seq) that result from cross-linking with proximal promoter- or enhancer-bound factors. The capture of structural features by ChIP-seq is confirmed by chromatin interaction analysis that resolves local intragenic loops that fold exons close to cognate promoters while excluding intervening intronic sequences. These interactions of exons with promoters and enhancers are enriched for alternative splicing events, an effect reflected in cell type-specific periexonic DNase I hypersensitivity patterns. Collectively, our results connect local genome topography, chromatin structure and cis-regulatory landscapes with the generation of human transcriptional complexity by cotranscriptional splicing.

128 citations

Reference Entry
TL;DR: Methods are described for nuclei isolation, digestion of nuclei with limiting concentrations of DNase I, and the biochemical fractionation of DNase I hypersensitive sites in preparation for high-throughput sequencing.
Abstract: DNaseI-seq is a global and high-resolution method that uses the non-specific endonuclease DNaseI to map chromatin accessibility. These accessible regions, designated as DNaseI hypersensitive sites (DHSs), define the regulatory features (e.g., promoters, enhancers, insulators, locus control regions) of complex genomes. In this unit, we describe systematic methods for nuclei isolation, digestion of nuclei with limiting concentrations of DNaseI, and the biochemical fractionation of DNaseI hypersensitive sites in preparation for high-throughput sequencing. DNaseI-seq is an unbiased and robust method that is not predicated on an a priori understanding of regulatory patterns or chromatin features.

96 citations


Cited by
Journal Article
06 Sep 2012-Nature
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of human genes and the genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

13,548 citations

Journal Article
15 Feb 2013-Science
TL;DR: The type II bacterial CRISPR system is engineered to function with custom guide RNA (gRNA) in human cells to establish an RNA-guided editing tool for facile, robust, and multiplexable human genome engineering.
Abstract: Bacteria and archaea have evolved adaptive immune defenses, termed clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems, that use short RNA to direct degradation of foreign nucleic acids. Here, we engineer the type II bacterial CRISPR system to function with custom guide RNA (gRNA) in human cells. For the endogenous AAVS1 locus, we obtained targeting rates of 10 to 25% in 293T cells, 13 to 8% in K562 cells, and 2 to 4% in induced pluripotent stem cells. We show that this process relies on CRISPR components; is sequence-specific; and, upon simultaneous introduction of multiple gRNAs, can effect multiplex editing of target loci. We also compute a genome-wide resource of ~190 K unique gRNAs targeting ~40.5% of human exons. Our results establish an RNA-guided editing tool for facile, robust, and multiplexable human genome engineering.

8,197 citations

Journal Article
01 Jan 2012-Nature
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of human genes and the genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

8,106 citations

Journal Article
Anshul Kundaje, Wouter Meuleman, Jason Ernst, Misha Bilenky, Angela Yen, Alireza Heravi-Moussavi, Pouya Kheradpour, Zhizhuo Zhang, Jianrong Wang, Michael J. Ziller, Viren Amin, John W. Whitaker, Matthew D. Schultz, Lucas D. Ward, Abhishek Sarkar, Gerald Quon, Richard Sandstrom, Matthew L. Eaton, Yi-Chieh Wu, Andreas R. Pfenning, Xinchen Wang, Melina Claussnitzer, Yaping Liu, Cristian Coarfa, R. Alan Harris, Noam Shoresh, Charles B. Epstein, Elizabeta Gjoneska, Danny Leung, Wei Xie, R. David Hawkins, Ryan Lister, Chibo Hong, Philippe Gascard, Andrew J. Mungall, Richard A. Moore, Eric Chuah, Angela Tam, Theresa K. Canfield, R. Scott Hansen, Rajinder Kaul, Peter J. Sabo, Mukul S. Bansal, Annaick Carles, Jesse R. Dixon, Kai How Farh, Soheil Feizi, Rosa Karlic, Ah Ram Kim, Ashwinikumar Kulkarni, Daofeng Li, Rebecca F. Lowdon, Ginell Elliott, Tim R. Mercer, Shane Neph, Vitor Onuchic, Paz Polak, Nisha Rajagopal, Pradipta R. Ray, Richard C Sallari, Kyle Siebenthall, Nicholas A Sinnott-Armstrong, Michael Stevens, Robert E. Thurman, Jie Wu, Bo Zhang, Xin Zhou, Arthur E. Beaudet, Laurie A. Boyer, Philip L. De Jager, Peggy J. Farnham, Susan J. Fisher, David Haussler, Steven J.M. Jones, Wei Li, Marco A. Marra, Michael T. McManus, Shamil R. Sunyaev, James A. Thomson, Thea D. Tlsty, Li-Huei Tsai, Wei Wang, Robert A. Waterland, Michael Q. Zhang, Lisa Helbling Chadwick, Bradley E. Bernstein, Joseph F. Costello, Joseph R. Ecker, Martin Hirst, Alexander Meissner, Aleksandar Milosavljevic, Bing Ren, John A. Stamatoyannopoulos, Ting Wang, Manolis Kellis
19 Feb 2015-Nature
TL;DR: It is shown that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

5,037 citations

Journal Article
TL;DR: ATAC-seq is shown to make it feasible to analyze an individual's epigenome on a timescale compatible with clinical decision-making, and it reveals classes of DNA-binding factors that strictly avoid, tolerate or tend to overlap with nucleosomes.
Abstract: We describe an assay for transposase-accessible chromatin using sequencing (ATAC-seq), based on direct in vitro transposition of sequencing adaptors into native chromatin, as a rapid and sensitive method for integrative epigenomic analysis. ATAC-seq captures open chromatin sites using a simple two-step protocol with 500 to 50,000 cells and reveals the interplay between genomic locations of open chromatin, DNA-binding proteins, individual nucleosomes and chromatin compaction at nucleotide resolution. We discovered classes of DNA-binding factors that strictly avoided, could tolerate or tended to overlap with nucleosomes. Using ATAC-seq maps of human CD4+ T cells from a proband obtained on consecutive days, we demonstrated the feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making.

4,984 citations