Author

Martin Hemberg

Bio: Martin Hemberg is an academic researcher from the Wellcome Trust Sanger Institute. The author has contributed to research in topics including Gene and Biology. The author has an h-index of 37 and has co-authored 102 publications that have received 10,274 citations. Previous affiliations of Martin Hemberg include Boston Children's Hospital and the Wellcome Trust/Cancer Research UK Gurdon Institute.
Topics: Gene, Biology, RNA, Promoter, Transcriptome


Papers
Journal Article
13 May 2010-Nature
TL;DR: It is revealed that a widespread mechanism of enhancer activation involves RNAPII binding and eRNA synthesis, which occurs specifically at enhancers that are actively engaged in promoting mRNA synthesis.
Abstract: We used genome-wide sequencing methods to study stimulus-dependent enhancer function in mouse cortical neurons. We identified approximately 12,000 neuronal activity-regulated enhancers that are bound by the general transcriptional co-activator CBP in an activity-dependent manner. A function of CBP at enhancers may be to recruit RNA polymerase II (RNAPII), as we also observed activity-regulated RNAPII binding to thousands of enhancers. Notably, RNAPII at enhancers transcribes bi-directionally a novel class of enhancer RNAs (eRNAs) within enhancer domains defined by the presence of histone H3 monomethylated at lysine 4. The level of eRNA expression at neuronal enhancers positively correlates with the level of messenger RNA synthesis at nearby genes, suggesting that eRNA synthesis occurs specifically at enhancers that are actively engaged in promoting mRNA synthesis. These findings reveal that a widespread mechanism of enhancer activation involves RNAPII binding and eRNA synthesis.

2,177 citations

Journal Article
Aviv Regev, Sarah A. Teichmann, Eric S. Lander, Ido Amit, Christophe Benoist, Ewan Birney, Bernd Bodenmiller, Peter J. Campbell, Piero Carninci, Menna R. Clatworthy, Hans Clevers, Bart Deplancke, Ian Dunham, James Eberwine, Roland Eils, Wolfgang Enard, Andrew Farmer, Lars Fugger, Berthold Göttgens, Nir Hacohen, Muzlifah Haniffa, Martin Hemberg, Seung K. Kim, Paul Klenerman, Arnold R. Kriegstein, Ed S. Lein, Sten Linnarsson, Emma Lundberg, Joakim Lundeberg, Partha P. Majumder, John C. Marioni, Miriam Merad, Musa M. Mhlanga, Martijn C. Nawijn, Mihai G. Netea, Garry P. Nolan, Dana Pe'er, Anthony Phillipakis, Chris P. Ponting, Stephen R. Quake, Wolf Reik, Orit Rozenblatt-Rosen, Joshua R. Sanes, Rahul Satija, Ton N. Schumacher, Alex K. Shalek, Ehud Shapiro, Padmanee Sharma, Jay W. Shin, Oliver Stegle, Michael R. Stratton, Michael J. T. Stubbington, Fabian J. Theis, Matthias Uhlen, Alexander van Oudenaarden, Allon Wagner, Fiona M. Watt, Jonathan S. Weissman, Barbara J. Wold, Ramnik J. Xavier, Nir Yosef, Human Cell Atlas Meeting Participants
05 Dec 2017-eLife
TL;DR: An open comprehensive reference map of the molecular state of cells in healthy human tissues would propel the systematic study of physiological states, developmental trajectories, regulatory circuitry and interactions of cells, and also provide a framework for understanding cellular dysregulation in human disease.
Abstract: The recent advent of methods for high-throughput single-cell molecular profiling has catalyzed a growing sense in the scientific community that the time is ripe to complete the 150-year-old effort to identify all cell types in the human body. The Human Cell Atlas Project is an international collaborative effort that aims to define all human cell types in terms of distinctive molecular profiles (such as gene expression profiles) and to connect this information with classical cellular descriptions (such as location and morphology). An open comprehensive reference map of the molecular state of cells in healthy human tissues would propel the systematic study of physiological states, developmental trajectories, regulatory circuitry and interactions of cells, and also provide a framework for understanding cellular dysregulation in human disease. Here we describe the idea, its potential utility, early proofs-of-concept, and some design considerations for the Human Cell Atlas, including a commitment to open data, code, and community.

1,391 citations

Journal Article
TL;DR: It is demonstrated that SC3 is capable of identifying subclones from the transcriptomes of neoplastic cells collected from patients and achieves high accuracy and robustness by combining multiple clustering solutions through a consensus approach.
Abstract: Single-cell RNA-seq enables the quantitative characterization of cell types based on global transcriptome profiles. We present single-cell consensus clustering (SC3), a user-friendly tool for unsupervised clustering, which achieves high accuracy and robustness by combining multiple clustering solutions through a consensus approach (http://bioconductor.org/packages/SC3). We demonstrate that SC3 is capable of identifying subclones from the transcriptomes of neoplastic cells collected from patients.

1,120 citations

Journal Article
22 May 2008-Nature
TL;DR: Clonal heterogeneity of gene expression level is not due to independent noise in the expression of individual genes, but reflects metastable states of a slowly fluctuating transcriptome that is distinct in individual cells and may govern the reversible, stochastic priming of multipotent progenitor cells in cell fate decision.
Abstract: Phenotypic cell-to-cell variability within clonal populations may be a manifestation of 'gene expression noise', or it may reflect stable phenotypic variants. Such 'non-genetic cell individuality' can arise from the slow fluctuations of protein levels in mammalian cells. These fluctuations produce persistent cell individuality, thereby rendering a clonal population heterogeneous. However, it remains unknown whether this heterogeneity may account for the stochasticity of cell fate decisions in stem cells. Here we show that in clonal populations of mouse haematopoietic progenitor cells, spontaneous 'outlier' cells with either extremely high or low expression levels of the stem cell marker Sca-1 (also known as Ly6a; ref. 9) reconstitute the parental distribution of Sca-1 but do so only after more than one week. This slow relaxation is described by a gaussian mixture model that incorporates noise-driven transitions between discrete subpopulations, suggesting hidden multi-stability within one cell type. Despite clonality, the Sca-1 outliers had distinct transcriptomes. Although their unique gene expression profiles eventually reverted to that of the median cells, revealing an attractor state, they lasted long enough to confer a greatly different proclivity for choosing either the erythroid or the myeloid lineage. Preference in lineage choice was associated with increased expression of lineage-specific transcription factors, such as a >200-fold increase in Gata1 (ref. 10) among the erythroid-prone cells, or a >15-fold increased PU.1 (Sfpi1) (ref. 11) expression among myeloid-prone cells. Thus, clonal heterogeneity of gene expression level is not due to independent noise in the expression of individual genes, but reflects metastable states of a slowly fluctuating transcriptome that is distinct in individual cells and may govern the reversible, stochastic priming of multipotent progenitor cells in cell fate decision.

1,087 citations

Journal Article
TL;DR: This Review discusses the multiple algorithmic options for clustering scRNA-seq data, including various technical, biological and computational considerations.
Abstract: Single-cell RNA sequencing (scRNA-seq) allows researchers to collect large catalogues detailing the transcriptomes of individual cells. Unsupervised clustering is of central importance for the analysis of these data, as it is used to identify putative cell types. However, there are many challenges involved. We discuss why clustering is a challenging problem from a computational point of view and what aspects of the data make it challenging. We also consider the difficulties related to the biological interpretation and annotation of the identified clusters.

741 citations


Cited by
Journal Article
13 Jun 2019-Cell
TL;DR: A strategy to "anchor" diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities.

7,892 citations

Journal Article
TL;DR: An analytical strategy for integrating scRNA-seq data sets based on common sources of variation is introduced, enabling the identification of shared populations across data sets and downstream comparative analysis.
Abstract: Computational single-cell RNA-seq (scRNA-seq) methods have been successfully applied to experiments representing a single condition, technology, or species to discover and define cellular phenotypes. However, identifying subpopulations of cells that are present across multiple data sets remains challenging. Here, we introduce an analytical strategy for integrating scRNA-seq data sets based on common sources of variation, enabling the identification of shared populations across data sets and downstream comparative analysis. We apply this approach, implemented in our R toolkit Seurat (http://satijalab.org/seurat/), to align scRNA-seq data sets of peripheral blood mononuclear cells under resting and stimulated conditions, hematopoietic progenitors sequenced using two profiling technologies, and pancreatic cell 'atlases' generated from human and mouse islets. In each case, we learn distinct or transitional cell states jointly across data sets, while boosting statistical power through integrated analysis. Our approach facilitates general comparisons of scRNA-seq data sets, potentially deepening our understanding of how distinct cell states respond to perturbation, disease, and evolution.

7,741 citations

Journal Article
Sarah Djebali, Carrie A. Davis, Angelika Merkel, Alexander Dobin, Timo Lassmann, Ali Mortazavi, Andrea Tanzer, Julien Lagarde, Wei Lin, Felix Schlesinger, Chenghai Xue, Georgi K. Marinov, Jainab Khatun, Brian A. Williams, Chris Zaleski, Joel Rozowsky, Marion S. Röder, Felix Kokocinski, Rehab F. Abdelhamid, Tyler Alioto, Igor Antoshechkin, Michael T. Baer, Nadav Bar, Philippe Batut, Kimberly Bell, Ian Bell, Sudipto K. Chakrabortty, Xian Chen, Jacqueline Chrast, Joao Curado, Thomas Derrien, Jorg Drenkow, Erica Dumais, Jacqueline Dumais, Radha Duttagupta, Emilie Falconnet, Meagan Fastuca, Kata Fejes-Toth, Pedro G. Ferreira, Sylvain Foissac, Melissa J. Fullwood, Hui Gao, David Gonzalez, Assaf Gordon, Harsha P. Gunawardena, Cédric Howald, Sonali Jha, Rory Johnson, Philipp Kapranov, Brandon King, Colin Kingswood, Oscar Junhong Luo, Eddie Park, Kimberly Persaud, Jonathan B. Preall, Paolo Ribeca, Brian A. Risk, Daniel Robyr, Michael Sammeth, Lorian Schaffer, Lei-Hoon See, Atif Shahab, Jørgen Skancke, Ana Maria Suzuki, Hazuki Takahashi, Hagen Tilgner, Diane Trout, Nathalie Walters, Huaien Wang, John A. Wrobel, Yanbao Yu, Xiaoan Ruan, Yoshihide Hayashizaki, Jennifer Harrow, Mark Gerstein, Tim Hubbard, Alexandre Reymond, Stylianos E. Antonarakis, Gregory J. Hannon, Morgan C. Giddings, Yijun Ruan, Barbara J. Wold, Piero Carninci, Roderic Guigó, Thomas R. Gingeras
06 Sep 2012-Nature
TL;DR: Evidence that three-quarters of the human genome is capable of being transcribed is reported, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs that prompt a redefinition of the concept of a gene.
Abstract: Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

4,450 citations

01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal Article
TL;DR: The most complete human lncRNA annotation to date is presented, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts, and expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes.
Abstract: The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses indicate that lncRNAs are generated through pathways similar to that of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths. In contrast to protein-coding genes, however, lncRNAs display a striking bias toward two-exon transcripts, they are predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs. They are under stronger selective pressure than neutrally evolving sequences, particularly in their promoter regions, which display levels of selection comparable to protein-coding genes. Importantly, about one-third seem to have arisen within the primate lineage. Comprehensive analysis of their expression in multiple human organs and brain regions shows that lncRNAs are generally lower expressed than protein-coding genes, and display more tissue-specific expression patterns, with a large fraction of tissue-specific lncRNAs expressed in the brain. Expression correlation analysis indicates that lncRNAs show particularly striking positive correlation with the expression of antisense coding genes. This GENCODE annotation represents a valuable resource for future studies of lncRNAs.

4,291 citations