Showing papers by "Manolis Kellis" published in 2011

Open Access

Journal Article•DOI•

Mapping and analysis of chromatin state dynamics in nine human cell types

Jason Ernst¹, Pouya Kheradpour², Pouya Kheradpour¹, Tarjei S. Mikkelsen¹, Noam Shoresh¹, Lucas D. Ward¹, Lucas D. Ward², Charles B. Epstein¹, Xiaolan Zhang¹, Li Wang¹, Robbyn Issner¹, Michael Coyne¹, Manching Ku³, Manching Ku⁴, Manching Ku¹, Timothy Durham¹, Manolis Kellis², Manolis Kellis¹, Bradley E. Bernstein⁴, Bradley E. Bernstein¹, Bradley E. Bernstein³ - Show less +17 more•Institutions (4)

Broad Institute¹, Massachusetts Institute of Technology², Howard Hughes Medical Institute³, Harvard University⁴

05 May 2011-Nature

TL;DR: This study presents a general framework for deciphering cis-regulatory connections and their roles in disease, and maps nine chromatin marks across nine cell types to systematically characterize regulatory elements, their cell-type specificities and their functional interactions.

Abstract: Chromatin profiling has emerged as a powerful means of genome annotation and detection of regulatory activity. The approach is especially well suited to the characterization of non-coding portions of the genome, which critically contribute to cellular phenotypes yet remain largely uncharted. Here we map nine chromatin marks across nine cell types to systematically characterize regulatory elements, their cell-type specificities and their functional interactions. Focusing on cell-type-specific patterns of promoters and enhancers, we define multicell activity profiles for chromatin state, gene expression, regulatory motif enrichment and regulator expression. We use correlations between these profiles to link enhancers to putative target genes, and predict the cell-type-specific activators and repressors that modulate them. The resulting annotations and regulatory predictions have implications for the interpretation of genome-wide association studies. Top-scoring disease single nucleotide polymorphisms are frequently positioned within enhancer elements specifically active in relevant cell types, and in some cases affect a motif instance for a predicted regulator, thus suggesting a mechanism for the association. Our study presents a general framework for deciphering cis-regulatory connections and their roles in disease.
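
The enhancer-to-gene linking step described in this abstract can be illustrated with a minimal sketch. It assumes that each enhancer and each gene has one activity value per cell type and that a link is called when the Pearson correlation of the two profiles reaches a cutoff. The names `pearson` and `link_enhancers`, the 0.8 threshold and all numeric values are hypothetical; the study combines several profile types and this sketch does not reproduce its scoring.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no cell-type specificity
    return cov / sqrt(var_x * var_y)

def link_enhancers(enhancer_activity, gene_expression, threshold=0.8):
    """Return (enhancer, gene, r) triples with correlation >= threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    links = []
    for enhancer, activity in enhancer_activity.items():
        for gene, expression in gene_expression.items():
            r = pearson(activity, expression)
            if r >= threshold:
                links.append((enhancer, gene, r))
    # Strongest correlations first.
    return sorted(links, key=lambda triple: -triple[2])

# Hypothetical profiles across nine cell types (all values invented).
enhancer_activity = {
    "enhancer_A": [0.9, 0.1, 0.0, 0.8, 0.1, 0.0, 0.0, 0.1, 0.0],
    "enhancer_B": [0.0, 0.0, 0.9, 0.0, 0.0, 0.8, 0.1, 0.0, 0.0],
}
gene_expression = {
    "gene_X": [8.0, 1.0, 0.5, 7.5, 1.2, 0.4, 0.6, 1.0, 0.3],
    "gene_Y": [0.2, 0.3, 6.0, 0.1, 0.4, 5.5, 1.0, 0.2, 0.3],
}
links = link_enhancers(enhancer_activity, gene_expression)
```

In this toy input each enhancer is linked only to the gene whose expression rises in the same cell types; real data would also need a distance constraint and a significance estimate.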

2,646 citations

Journal Article•

Mapping and Analysis of Chromatin State Dynamics in Nine Human Cell Types

Jason Ernst¹, Pouya Kheradpour¹, Pouya Kheradpour², Tarjei S. Mikkelsen¹, Noam Shoresh¹, Lucas D. Ward¹, Lucas D. Ward², Charles B. Epstein¹, Xiaolan Zhang¹, Li Wang¹, Robbyn Issner¹, Michael Coyne¹, Manching Ku³, Manching Ku⁴, Manching Ku¹, Timothy Durham¹, Manolis Kellis¹, Manolis Kellis², Bradley E. Bernstein³, Bradley E. Bernstein⁴, Bradley E. Bernstein¹ - Show less +17 more•Institutions (4)

Broad Institute¹, Massachusetts Institute of Technology², Howard Hughes Medical Institute³, Harvard University⁴

01 Mar 2011-PubMed Central

TL;DR: This study presents a general framework for deciphering cis-regulatory connections and their roles in disease, and defines multi-cell activity profiles for chromatin state, gene expression, regulatory motif enrichment, and regulator expression.

1,624 citations

Journal Article•DOI•

A User's Guide to the Encyclopedia of DNA Elements (ENCODE)

Richard M. Myers, John A. Stamatoyannopoulos¹, Michael Snyder², Ian Dunham +325 more•Institutions (31)

01 Apr 2011-PLOS Biology

TL;DR: This guide provides an overview of the ENCODE project and the resources it is generating, and illustrates the application of ENCODE data to interpret the human genome.

Abstract: The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome.

1,446 citations

Journal Article•DOI•

A high-resolution map of human evolutionary constraint using 29 mammals.

Kerstin Lindblad-Toh¹, Manuel Garber¹, Or Zuk¹, Michael F. Lin¹, Michael F. Lin², Brian J. Parker³, Stefan Washietl², Pouya Kheradpour¹, Pouya Kheradpour², Jason Ernst², Jason Ernst¹, Gregory E. Jordan⁴, Evan Mauceli¹, Lucas D. Ward², Lucas D. Ward¹, Craig B. Lowe⁵, Craig B. Lowe⁶, Craig B. Lowe⁷, Alisha K. Holloway⁸, Michele Clamp¹, Sante Gnerre¹, Jessica Alföldi¹, Kathryn Beal⁴, Jean Chang¹, Hiram Clawson⁷, James Cuff⁹, Federica Di Palma¹, Stephen Fitzgerald⁴, Paul Flicek⁴, Mitchell Guttman¹, Melissa J. Hubisz¹⁰, David B. Jaffe¹, Irwin Jungreis², W. James Kent⁸, Dennis Kostka⁸, Marcia Lara¹, André L. Martins¹⁰, Tim Massingham⁴, Ida Moltke³, Brian J. Raney⁷, Matthew D. Rasmussen², James Robinson¹, Alexander Stark¹¹, Albert J. Vilella⁴, Jiayu Wen³, Xiaohui Xie¹, Michael C. Zody¹, Kim C. Worley¹², Christie Kovar¹², Donna M. Muzny¹², Richard A. Gibbs¹², Wesley C. Warren¹³, Elaine R. Mardis¹³, George M. Weinstock¹³, George M. Weinstock¹², Richard K. Wilson¹³, Ewan Birney⁴, Elliott H. Margulies¹⁴, Javier Herrero⁴, Eric D. Green¹⁴, David Haussler⁶, David Haussler⁷, Adam Siepel¹⁰, Nick Goldman⁴, Katherine S. Pollard⁸, Jakob Skou Pedersen³, Jakob Skou Pedersen¹⁵, Eric S. Lander¹, Manolis Kellis¹, Manolis Kellis² - Show less +66 more•Institutions (15)

Massachusetts Institute of Technology¹, Vassar College², University of Copenhagen³, Wellcome Trust⁴, Stanford University⁵, Howard Hughes Medical Institute⁶, University of California, Santa Cruz⁷, University of California, San Francisco⁸, Harvard University⁹, Cornell University¹⁰, Research Institute of Molecular Pathology¹¹, Human Genome Sequencing Center¹², Washington University in St. Louis¹³, National Institutes of Health¹⁴, Aarhus University Hospital¹⁵

27 Oct 2011-Nature

TL;DR: Sequencing and comparative analysis of 29 eutherian genomes confirms that at least 5.5% of the human genome has undergone purifying selection, and locates constrained elements covering ∼4.2% of the genome.

Abstract: The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.

1,023 citations

Journal Article•DOI•

Extensive and coordinated transcription of noncoding RNAs within cell-cycle promoters

Tiffany Hung¹, Yulei Wang², Michael F. Lin³, Michael F. Lin⁴, Ashley K. Koegel¹, Yojiro Kotake⁵, Yojiro Kotake⁶, Gavin D. Grant⁷, Hugo M. Horlings⁸, Nilay Shah⁹, Christopher B. Umbricht⁹, Pei Wang¹, Yu Wang², Benjamin Kong², Anita Langerød¹⁰, Anne Lise Børresen-Dale¹⁰, Seung K. Kim¹, Marc J. van de Vijver⁸, Saraswati Sukumar⁹, Michael L. Whitfield⁷, Manolis Kellis³, Manolis Kellis⁴, Yue Xiong⁶, David J. Wong¹, Howard Y. Chang¹ - Show less +21 more•Institutions (10)

Stanford University¹, Life Technologies², Broad Institute³, Massachusetts Institute of Technology⁴, Hamamatsu University School of Medicine⁵, University of North Carolina at Chapel Hill⁶, Dartmouth College⁷, University of Amsterdam⁸, Johns Hopkins University⁹, University of Oslo¹⁰

01 Jul 2011-Nature Genetics

TL;DR: In this article, an ultra-high-density array that tiles the promoters of 56 cell-cycle genes was used to interrogate 108 samples representing diverse perturbations, identifying 216 transcribed regions that encode putative lncRNAs, many with RT-PCR-validated periodic expression during the cell cycle.

Abstract: Transcription of long noncoding RNAs (lncRNAs) within gene regulatory elements can modulate gene activity in response to external stimuli, but the scope and functions of such activity are not known. Here we use an ultrahigh-density array that tiles the promoters of 56 cell-cycle genes to interrogate 108 samples representing diverse perturbations. We identify 216 transcribed regions that encode putative lncRNAs, many with RT-PCR-validated periodic expression during the cell cycle, show altered expression in human cancers and are regulated in expression by specific oncogenic stimuli, stem cell differentiation or DNA damage. DNA damage induces five lncRNAs from the CDKN1A promoter, and one such lncRNA, named PANDA, is induced in a p53-dependent manner. PANDA interacts with the transcription factor NF-YA to limit expression of pro-apoptotic genes; PANDA depletion markedly sensitized human fibroblasts to apoptosis by doxorubicin. These findings suggest potentially widespread roles for promoter lncRNAs in cell-growth control.

969 citations

Extensive and coordinated transcription of noncoding RNAs within cell-cycle promoters

Stanford University¹, Life Technologies², Massachusetts Institute of Technology³, Broad Institute⁴, Hamamatsu University School of Medicine⁵, University of North Carolina at Chapel Hill⁶, Dartmouth College⁷, University of Amsterdam⁸, Johns Hopkins University⁹, University of Oslo¹⁰

01 Jun 2011

TL;DR: This work uses an ultrahigh-density array that tiles the promoters of 56 cell-cycle genes to interrogate 108 samples representing diverse perturbations and identifies 216 transcribed regions that encode putative lncRNAs, many with RT-PCR–validated periodic expression during the cell cycle.

933 citations

Journal Article•

A High-Resolution Map of Human Evolutionary Constraint Using 29 Mammals

Stefan Washietl, Pouya Kheradpour, Jason Ernst, Lucas D. Ward, Irwin Jungreis, Matthew D. Rasmussen, Manolis Kellis - Show less +3 more

01 Oct 2011-PubMed Central

TL;DR: Comparative analysis of 29 eutherian genomes reveals a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons.

926 citations

Journal Article•DOI•

PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions

Michael F. Lin¹, Irwin Jungreis¹, Manolis Kellis¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Jul 2011

TL;DR: PhyloCSF, a novel comparative genomics method that analyzes a multispecies nucleotide sequence alignment to determine whether it is likely to represent a conserved protein-coding region, based on a formal statistical comparison of phylogenetic codon models, is presented.

Abstract: Motivation: As high-throughput transcriptome sequencing provides evidence for novel transcripts in many species, there is a renewed need for accurate methods to classify small genomic regions as protein coding or non-coding. We present PhyloCSF, a novel comparative genomics method that analyzes a multispecies nucleotide sequence alignment to determine whether it is likely to represent a conserved protein-coding region, based on a formal statistical comparison of phylogenetic codon models. Results: We show that PhyloCSF’s classification performance in 12-species Drosophila genome alignments exceeds all other methods we compared in a previous study. We anticipate that this method will be widely applicable as the transcriptomes of many additional species, tissues and subcellular compartments are sequenced, particularly in the context of ENCODE and modENCODE, and as interest grows in long non-coding RNAs, often initially recognized by their lack of protein coding potential rather than conserved RNA secondary structures. Availability and Implementation: The Objective Caml source code and executables for GNU/Linux and Mac OS X are freely available at

854 citations

Journal Article•DOI•

Comprehensive analysis of the chromatin landscape in Drosophila melanogaster

Peter V. Kharchenko¹, Artyom A. Alekseyenko¹, Artyom A. Alekseyenko², Yuri B. Schwartz³, Aki Minoda⁴, Nicole C. Riddle⁵, Jason Ernst⁶, Jason Ernst⁷, Peter J. Sabo⁸, Erica Larschan⁹, Erica Larschan¹, Erica Larschan², Andrey A. Gorchakov², Andrey A. Gorchakov¹, Tingting Gu⁵, Daniela Linder-Basso³, Annette Plachetka¹, Annette Plachetka², Gregory A. Shanower³, Michael Y. Tolstorukov¹, Michael Y. Tolstorukov¹⁰, Lovelace J. Luquette¹, Ruibin Xi¹, Youngsook L. Jung², Youngsook L. Jung¹, Richard W. Park¹, Richard W. Park¹¹, Eric P. Bishop¹¹, Eric P. Bishop¹, Theresa K. Canfield⁸, Richard Sandstrom⁸, Robert E. Thurman⁸, David M. MacAlpine¹², John A. Stamatoyannopoulos⁸, Manolis Kellis⁷, Manolis Kellis⁶, Sarah C. R. Elgin⁵, Mitzi I. Kuroda², Mitzi I. Kuroda¹, Vincenzo Pirrotta³, Gary H. Karpen⁴, Peter J. Park¹, Peter J. Park², Peter J. Park¹⁰ - Show less +40 more•Institutions (12)

Harvard University¹, Brigham and Women's Hospital², Rutgers University³, University of California, Berkeley⁴, Washington University in St. Louis⁵, Massachusetts Institute of Technology⁶, Broad Institute⁷, University of Washington⁸, Brown University⁹, Boston Children's Hospital¹⁰, Boston University¹¹, Duke University¹²

24 Mar 2011-Nature

TL;DR: In this article, the authors present a genome-wide chromatin landscape for Drosophila melanogaster based on eighteen histone modifications, summarized by nine prevalent combinatorial patterns.

Abstract: Chromatin is composed of DNA and a variety of modified histones and non-histone proteins, which have an impact on cell differentiation, gene regulation and other key cellular processes. Here we present a genome-wide chromatin landscape for Drosophila melanogaster based on eighteen histone modifications, summarized by nine prevalent combinatorial patterns. Integrative analysis with other data (non-histone chromatin proteins, DNase I hypersensitivity, GRO-Seq reads produced by engaged polymerase, short/long RNA products) reveals discrete characteristics of chromosomes, genes, regulatory elements and other functional domains. We find that active genes display distinct chromatin signatures that are correlated with disparate gene lengths, exon patterns, regulatory functions and genomic contexts. We also demonstrate a diversity of signatures among Polycomb targets that include a subset with paused polymerase. This systematic profiling and integrative analysis of chromatin signatures provides insights into how genomic elements are regulated, and will serve as a resource for future experimental investigations of genome structure and function.

787 citations

Journal Article•DOI•

A cis-regulatory map of the Drosophila genome

Nicolas Nègre¹, Christopher D. Brown¹, Lijia Ma¹, Christopher A. Bristow², Steven W. Miller³, Ulrich Wagner⁴, Pouya Kheradpour², Matthew L. Eaton⁵, Paul Loriaux³, Rachel Sealfon², Zirong Li⁴, Haruhiko Ishii³, Rebecca Spokony¹, Jia Chen⁶, Lindsay Hwang⁴, Chao Cheng⁷, Richard P. Auburn⁸, Melissa Davis¹, Marc Domanus¹, Parantu K. Shah⁹, Carolyn A. Morrison¹, Jennifer Zieba¹, Sarah Suchy¹, Lionel Senderowicz¹, Alec Victorsen¹, Nicholas A. Bild¹, A. Jason Grundstad¹, David Hanley⁶, David M. MacAlpine⁵, Mattias Mannervik¹⁰, Koen J. T. Venken, Hugo J. Bellen, Robert J. White⁸, Mark Gerstein⁷, Steven Russell⁸, Robert L. Grossman⁶, Robert L. Grossman¹, Bing Ren⁴, James W. Posakony³, Manolis Kellis², Kevin P. White¹ - Show less +37 more•Institutions (10)

University of Chicago¹, Massachusetts Institute of Technology², University of California, San Diego³, Ludwig Institute for Cancer Research⁴, Duke University⁵, University of Illinois at Chicago⁶, Yale University⁷, University of Cambridge⁸, Harvard University⁹, Stockholm University¹⁰

24 Mar 2011-Nature

TL;DR: The modENCODE cis-regulatory annotation project inferred more than 20,000 candidate regulatory elements and validated a subset of predictions for promoters, enhancers and insulators in vivo.

Abstract: Systematic annotation of gene regulatory elements is a major challenge in genome science. Direct mapping of chromatin modification marks and transcriptional factor binding sites genome-wide has successfully identified specific subtypes of regulatory elements. In Drosophila several pioneering studies have provided genome-wide identification of Polycomb response elements, chromatin states, transcription factor binding sites, RNA polymerase II regulation and insulator elements; however, comprehensive annotation of the regulatory genome remains a significant challenge. Here we describe results from the modENCODE cis-regulatory annotation project. We produced a map of the Drosophila melanogaster regulatory genome on the basis of more than 300 chromatin immunoprecipitation data sets for eight chromatin features, five histone deacetylases and thirty-eight site-specific transcription factors at different stages of development. Using these data we inferred more than 20,000 candidate regulatory elements and validated a subset of predictions for promoters, enhancers and insulators in vivo. We identified also nearly 2,000 genomic regions of dense transcription factor binding associated with chromatin activity and accessibility. We discovered hundreds of new transcription factor co-binding relationships and defined a transcription factor network with over 800 potential regulatory relationships.

522 citations

Journal Article•DOI•

Comparative functional genomics of the fission yeasts

Nicholas Rhind¹, Zehua Chen², Moran Yassour³, Moran Yassour², Dawn Thompson², Brian J. Haas², Naomi Habib³, Ilan Wapinski⁴, Ilan Wapinski², Sushmita Roy², Michael F. Lin², David I. Heiman², Sarah Young², Kanji Furuya⁵, Yabin Guo⁶, Alison L. Pidoux⁷, Huei Mei Chen⁸, Barbara Robbertse⁹, Jonathan M. Goldberg², Keita Aoki⁵, Elizabeth H. Bayne⁷, Aaron M. Berlin², Christopher A. Desjardins², Edward Dobbs⁷, Livio Dukaj¹, Lin Fan², Michael Fitzgerald², Courtney French³, Sharvari Gujja², Klavs R. Hansen¹⁰, Daniel Keifenheim¹, Joshua Z. Levin², Rebecca A. Mosher¹¹, Carolin A. Müller¹², Jenna Pfiffner², Margaret Priest², Carsten Russ², Agata Smialowska¹³, Agata Smialowska¹⁴, Peter Swoboda¹³, Sean M. Sykes², Matthew W. Vaughn¹⁰, Sonya Vengrova¹⁵, Ryan J. Yoder⁹, Qiandong Zeng², Robin C. Allshire⁷, David C. Baulcombe¹¹, Bruce W. Birren², William Brown¹², Karl Ekwall¹⁴, Karl Ekwall¹³, Manolis Kellis², Janet Leatherwood⁸, Henry L. Levin⁶, Hanah Margalit³, Robert A. Martienssen¹⁰, Conrad A. Nieduszynski¹², Joseph W. Spatafora⁹, Nir Friedman³, Jacob Z. Dalgaard¹⁵, Peter Baumann¹⁶, Peter Baumann¹⁷, Peter Baumann¹⁸, Hironori Niki⁵, Aviv Regev¹⁷, Aviv Regev², Chad Nusbaum² - Show less +63 more•Institutions (18)

University of Massachusetts Medical School¹, Massachusetts Institute of Technology², Hebrew University of Jerusalem³, Harvard University⁴, National Institute of Genetics⁵, National Institutes of Health⁶, University of Edinburgh⁷, State University of New York System⁸, Oregon State University⁹, Cold Spring Harbor Laboratory¹⁰, University of Cambridge¹¹, University of Nottingham¹², Karolinska Institutet¹³, Södertörn University¹⁴, University of Warwick¹⁵, University of Kansas¹⁶, Howard Hughes Medical Institute¹⁷, Stowers Institute for Medical Research¹⁸

20 May 2011-Science

TL;DR: Differences in gene content and regulation explain why, unlike the budding yeast of Saccharomycotina, fission yeasts cannot use ethanol as a primary carbon source and provide tools for investigation across the Schizosaccharomyces clade.

Abstract: The fission yeast clade--comprising Schizosaccharomyces pombe, S. octosporus, S. cryophilus, and S. japonicus--occupies the basal branch of Ascomycete fungi and is an important model of eukaryote biology. A comparative annotation of these genomes identified a near extinction of transposons and the associated innovation of transposon-free centromeres. Expression analysis established that meiotic genes are subject to antisense transcription during vegetative growth, which suggests a mechanism for their tight regulation. In addition, trans-acting regulators control new genes within the context of expanded functional modules for meiosis and stress response. Differences in gene content and regulation also explain why, unlike the budding yeast of Saccharomycotina, fission yeasts cannot use ethanol as a primary carbon source. These analyses elucidate the genome structure and gene regulation of fission yeast and provide tools for investigation across the Schizosaccharomyces clade.

Journal Article•DOI•

Combinatorial Patterning of Chromatin Regulators Uncovered by Genome-wide Location Analysis in Human Cells

Oren Ram¹, Alon Goren, Ido Amit¹, Ido Amit², Noam Shoresh¹, Nir Yosef¹, Nir Yosef², Jason Ernst¹, Jason Ernst³, Manolis Kellis³, Manolis Kellis¹, Melissa Gymrek, Robbyn Issner¹, Michael Coyne¹, Timothy Durham¹, Xiaolan Zhang¹, Julie Donaghey¹, Charles B. Epstein¹, Aviv Regev², Aviv Regev³, Aviv Regev¹, Bradley E. Bernstein - Show less +18 more•Institutions (3)

Broad Institute¹, Howard Hughes Medical Institute², Massachusetts Institute of Technology³

23 Dec 2011-Cell

TL;DR: This work developed ChIP-string, a meso-scale assay that combines chromatin immunoprecipitation with a signature readout of 487 representative loci that was applied to screen 145 antibodies, thereby identifying effective reagents, which were used to map the genome-wide binding of 29 CRs in two cell types.

Journal Article•DOI•

An Epigenetic Signature for Monoallelic Olfactory Receptor Expression

Angeliki Magklara¹, Angela Yen², Angela Yen³, Bradley M. Colquitt¹, E. Josephine Clowney¹, William E. Allen⁴, Eirene Markenscoff-Papadimitriou¹, Zoe A. Evans¹, Pouya Kheradpour³, Pouya Kheradpour², George Mountoufaris¹, Catriona Carey¹, Gilad Barnea⁴, Manolis Kellis², Manolis Kellis³, Stavros Lomvardas¹ - Show less +12 more•Institutions (4)

University of California, San Francisco¹, Broad Institute², Massachusetts Institute of Technology³, Brown University⁴

13 May 2011-Cell

TL;DR: The data suggest that OR silencing takes place before OR expression, indicating that it is not the product of an OR-elicited feedback signal, and that chromatin-mediated silencing lays a molecular foundation upon which singular and stochastic selection for gene expression can be applied.

Book Chapter•DOI•

Discovery and characterization of chromatin states for systematic annotation of the human genome

Jason Ernst¹, Manolis Kellis¹•Institutions (1)

Massachusetts Institute of Technology¹

28 Mar 2011

TL;DR: A multivariate Hidden Markov Model is used to reveal chromatin states in human T cells, based on recurrent and spatially coherent combinations of chromatin marks, providing a complementary functional annotation of the human genome that reveals the genome-wide locations of diverse classes of epigenetic function.

Abstract: A plethora of epigenetic modifications have been described in the human genome and shown to play diverse roles in gene regulation, cellular differentiation and the onset of disease. Although individual modifications have been linked to the activity levels of various genetic functional elements, their combinatorial patterns are still unresolved and their potential for systematic de novo genome annotation remains untapped. Here, we use a multivariate Hidden Markov Model to reveal chromatin states in human T cells, based on recurrent and spatially coherent combinations of chromatin marks. We define 51 distinct chromatin states, including promoter-associated, transcription-associated, active intergenic, large-scale repressed and repeat-associated states. Each chromatin state shows specific enrichments in functional annotations, sequence motifs and specific experimentally observed characteristics, suggesting distinct biological roles. This approach provides a complementary functional annotation of the human genome that reveals the genome-wide locations of diverse classes of epigenetic function.
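
The idea of a multivariate Hidden Markov Model over chromatin marks can be sketched as follows. This toy example assumes that each genomic bin is observed as a vector of present/absent calls, one per mark, and that each state emits the marks independently with its own Bernoulli probabilities. The two states, the two marks and every parameter value are invented for illustration; the abstract does not give the model parameters, and a real model would learn them from data rather than fix them by hand.

```python
from math import log

def emission_logprob(obs, probs):
    """Log-probability of a binary mark vector under independent Bernoulli emissions."""
    total = 0.0
    for present, p in zip(obs, probs):
        total += log(p if present else 1.0 - p)
    return total

def viterbi(observations, start, trans, emit):
    """Most likely state path for a sequence of binary mark vectors.

    All probabilities must be strictly between 0 and 1 so that log() is defined.
    """
    n_states = len(start)
    score = [log(start[s]) + emission_logprob(observations[0], emit[s])
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans[best][s])
                         + emission_logprob(obs, emit[s]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-state, two-mark model: state 0 is mark-rich, state 1 is mark-poor.
start = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]
emit = [[0.9, 0.8], [0.1, 0.1]]
observations = [(1, 1), (1, 1), (0, 0), (0, 0), (0, 0)]
path = viterbi(observations, start, trans, emit)  # [0, 0, 1, 1, 1]
```

The transition matrix is what makes the resulting states spatially coherent: neighbouring bins tend to share a state unless the marks argue strongly otherwise.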

Journal Article•DOI•

Evidence of abundant stop codon readthrough in Drosophila and other metazoa.

Irwin Jungreis¹, Michael F. Lin, Rebecca Spokony, Clara S. Chan¹, Nicolas Nègre, Alec Victorsen, Kevin P. White, Manolis Kellis¹ - Show less +4 more•Institutions (1)

Massachusetts Institute of Technology¹

01 Dec 2011-Genome Research

TL;DR: An expanded set of 283 readthrough candidates is reported, including 16 double-readthrough candidates; these were manually curated to rule out alternatives such as A-to-I editing, alternative splicing, dicistronic translation, and selenocysteine incorporation.

Abstract: While translational stop codon readthrough is often used by viral genomes, it has been observed for only a handful of eukaryotic genes. We previously used comparative genomics evidence to recognize protein-coding regions in 12 species of Drosophila and showed that for 149 genes, the open reading frame following the stop codon has a protein-coding conservation signature, hinting that stop codon readthrough might be common in Drosophila. We return to this observation armed with deep RNA sequence data from the modENCODE project, an improved higher-resolution comparative genomics metric for detecting protein-coding regions, comparative sequence information from additional species, and directed experimental evidence. We report an expanded set of 283 readthrough candidates, including 16 double-readthrough candidates; these were manually curated to rule out alternatives such as A-to-I editing, alternative splicing, dicistronic translation, and selenocysteine incorporation. We report experimental evidence of translation using GFP tagging and mass spectrometry for several readthrough regions. We find that the set of readthrough candidates differs from other genes in length, composition, conservation, stop codon context, and in some cases, conserved stem–loops, providing clues about readthrough regulation and potential mechanisms. Lastly, we expand our studies beyond Drosophila and find evidence of abundant readthrough in several other insect species and one crustacean, and several readthrough candidates in nematode and human, suggesting that functionally important translational stop codon readthrough is significantly more prevalent in Metazoa than previously recognized. [Supplemental material is available for this article.]

Journal Article•DOI•

Dynamics of the epigenetic landscape during erythroid differentiation after GATA1 restoration.

Weisheng Wu¹, Yong Cheng¹, Cheryl A. Keller, Jason Ernst², Jason Ernst³, Swathi Ashok Kumar¹, Tejaswini Mishra¹, Christapher S. Morrissey¹, Christine M. Dorman¹, Kuan-Bei Chen¹, Daniela I. Drautz¹, Belinda Giardine¹, Yoichiro Shibata⁴, Lingyun Song⁴, Maxim Pimkin⁵, Gregory E. Crawford⁴, Terrence S. Furey⁶, Manolis Kellis³, Manolis Kellis², Webb Miller¹, James Taylor⁷, Stephan C. Schuster¹, Yu Zhang, Francesca Chiaromonte, Gerd A. Blobel⁵, Mitchell J. Weiss⁵, Ross C. Hardison - Show less +23 more•Institutions (7)

Pennsylvania State University¹, Broad Institute², Massachusetts Institute of Technology³, Duke University⁴, University of Pennsylvania⁵, University of North Carolina at Chapel Hill⁶, Emory University⁷

01 Oct 2011-Genome Research

TL;DR: The results indicate that during erythroid differentiation, the broad features of chromatin states are established at the stage of lineage commitment, largely independently of GATA1; these features determine permissiveness for expression, with subsequent induction or repression mediated by distinctive combinations of transcription factors.

Abstract: Interplays among lineage-specific nuclear proteins, chromatin modifying enzymes, and the basal transcription machinery govern cellular differentiation, but their dynamics of action and coordination with transcriptional control are not fully understood. Alterations in chromatin structure appear to establish a permissive state for gene activation at some loci, but they play an integral role in activation at other loci. To determine the predominant roles of chromatin states and factor occupancy in directing gene regulation during differentiation, we mapped chromatin accessibility, histone modifications, and nuclear factor occupancy genome-wide during mouse erythroid differentiation dependent on the master regulatory transcription factor GATA1. Notably, despite extensive changes in gene expression, the chromatin state profiles (proportions of a gene in a chromatin state dominated by activating or repressive histone modifications) and accessibility remain largely unchanged during GATA1-induced erythroid differentiation. In contrast, gene induction and repression are strongly associated with changes in patterns of transcription factor occupancy. Our results indicate that during erythroid differentiation, the broad features of chromatin states are established at the stage of lineage commitment, largely independently of GATA1. These determine permissiveness for expression, with subsequent induction or repression mediated by distinctive combinations of transcription factors.
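
The "chromatin state profile" of a gene, defined in this abstract as the proportions of the gene lying in each chromatin state, can be illustrated with a short sketch. It assumes half-open genomic intervals and a list of non-overlapping `(start, end, state)` segments; the function name `state_profile`, the state labels and the coordinates are hypothetical and do not come from the study.

```python
def state_profile(gene_start, gene_end, segments):
    """Fraction of the gene interval [gene_start, gene_end) covered by each state.

    `segments` is a list of (start, end, state) tuples, assumed non-overlapping.
    """
    length = gene_end - gene_start
    if length <= 0:
        raise ValueError("gene interval must have positive length")
    covered = {}
    for seg_start, seg_end, state in segments:
        # Overlap of the segment with the gene; non-positive means no overlap.
        overlap = min(gene_end, seg_end) - max(gene_start, seg_start)
        if overlap > 0:
            covered[state] = covered.get(state, 0) + overlap
    return {state: bases / length for state, bases in covered.items()}

# Hypothetical segmentation and gene coordinates (values invented).
segments = [(0, 400, "active"), (400, 1000, "repressed"), (1000, 1500, "active")]
profile = state_profile(200, 1200, segments)  # {"active": 0.4, "repressed": 0.6}
```

Computing such a profile for the same gene before and after GATA1 restoration, and comparing the two dictionaries, is one simple way to ask whether the profile changed.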

Journal Article•DOI•

Three Periods of Regulatory Innovation During Vertebrate Evolution

Craig B. Lowe¹, Craig B. Lowe², Manolis Kellis³, Manolis Kellis⁴, Adam Siepel⁵, Brian J. Raney², Michele Clamp³, Sofie R. Salama², David M. Kingsley², David M. Kingsley¹, Kerstin Lindblad-Toh³, Kerstin Lindblad-Toh⁶, David Haussler² - Show less +9 more•Institutions (6)

Stanford University¹, University of California, Santa Cruz², Broad Institute³, Massachusetts Institute of Technology⁴, Cornell University⁵, Science for Life Laboratory⁶

19 Aug 2011-Science

TL;DR: This analysis identified three extended periods in the evolution of gene regulatory elements in vertebrates: early regulatory gains near transcription factors and developmental genes, followed by innovations near extracellular signaling genes, and then innovations near posttranslational protein modifiers.

Abstract: The gain, loss, and modification of gene regulatory elements may underlie a substantial proportion of phenotypic changes on animal lineages. To investigate the gain of regulatory elements throughout vertebrate evolution, we identified genome-wide sets of putative regulatory regions for five vertebrates, including humans. These putative regulatory regions are conserved nonexonic elements (CNEEs), which are evolutionarily conserved yet do not overlap any coding or noncoding mature transcript. We then inferred the branch on which each CNEE came under selective constraint. Our analysis identified three extended periods in the evolution of gene regulatory elements. Early vertebrate evolution was characterized by regulatory gains near transcription factors and developmental genes, but this trend was replaced by innovations near extracellular signaling genes, and then innovations near posttranslational protein modifiers.

Three Periods of Regulatory Innovation During Vertebrate Evolution

Manolis Kellis, Craig B. Lowe, Adam Siepel, Brian J. Raney, Michele Clamp, Sofie R. Salama, David M. Kingsley, Kerstin Lindblad-Toh, David Haussler

01 Aug 2011

Journal Article•DOI•

New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

Brian J. Parker¹, Ida Moltke, Adam Roth, Stefan Washietl, Jiayu Wen, Manolis Kellis, Ronald R. Breaker, Jakob Skou Pedersen²•Institutions (2)

University of Copenhagen¹, Aarhus University²

01 Nov 2011-Genome Research

TL;DR: This work develops a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity, and applies it to a 41-way genomic vertebrate alignment.

Abstract: Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein-coding regions comprising 725 individual structures, including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs, e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family; and cis-regulatory structures, e.g., iron-responsive elements. We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3′-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes, e.g., TNF, FOS, and CTLA4, which include known transcript destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.

Journal Article•DOI•

A Bayesian Approach for Fast and Accurate Gene Tree Reconstruction

Matthew D. Rasmussen¹, Manolis Kellis¹,²•Institutions (2)

Massachusetts Institute of Technology¹, Broad Institute²

01 Jan 2011-Molecular Biology and Evolution

TL;DR: SPIMAP, an efficient Bayesian method for reconstructing gene trees in the presence of a known species tree, is presented, finding that reconstruction inaccuracies of traditional phylogenetic methods overestimate the number of DL events by as much as 2–3-fold, whereas this method achieves significantly higher accuracy.

Abstract: Recent sequencing and computing advances have enabled phylogenetic analyses to expand to both entire genomes and large clades, thus requiring more efficient and accurate methods designed specifically for the phylogenomic context. Here, we present SPIMAP, an efficient Bayesian method for reconstructing gene trees in the presence of a known species tree. We observe many improvements in reconstruction accuracy, achieved by modeling multiple aspects of evolution, including gene duplication and loss (DL) rates, speciation times, and correlated substitution rate variation across both species and loci. We have implemented and applied this method on two clades of fully sequenced species, 12 Drosophila and 16 fungal genomes as well as simulated phylogenies and find dramatic improvements in reconstruction accuracy as compared with the most popular existing methods, including those that take the species tree into account. We find that reconstruction inaccuracies of traditional phylogenetic methods overestimate the number of DL events by as much as 2–3-fold, whereas our method achieves significantly higher accuracy. We feel that the results and methods presented here will have many important implications for future investigations of gene evolution.

Journal Article•DOI•

Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes.

Michael F. Lin¹, Pouya Kheradpour, Stefan Washietl, Brian J. Parker, Jakob Skou Pedersen, Manolis Kellis•Institutions (1)

Massachusetts Institute of Technology¹

01 Nov 2011-Genome Research

TL;DR: This study uses genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species, and collects numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements.

Abstract: The degeneracy of the genetic code allows protein-coding DNA and RNA sequences to simultaneously encode additional, overlapping functional elements. A sequence in which both protein-coding and additional overlapping functions have evolved under purifying selection should show increased evolutionary conservation compared to typical protein-coding genes--especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain ∼2% of their synonymous sites. We collect numerous lines of evidence that the observed synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian genes, despite the vast genomic landscape.
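The idea of scanning an ORF in nine-codon windows for unusually low synonymous rates can be illustrated with a minimal sketch. It assumes that a per-codon synonymous rate estimate is already available as a list of floats, and it flags windows by a simple mean threshold. The function name `low_rate_windows`, the `max_mean` cutoff, and the thresholding rule itself are hypothetical simplifications for illustration; they are not the statistical test used in the study.

```python
from typing import List, Tuple


def low_rate_windows(
    rates: List[float],
    window: int = 9,        # window length in codons (nine-codon resolution)
    max_mean: float = 0.5,  # hypothetical cutoff on the mean relative rate
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) codon intervals whose mean rate is low.

    `rates` holds one synonymous-rate estimate per codon, assumed to be
    scaled so that 1.0 is the typical rate for the gene.
    """
    hits = []
    for start in range(len(rates) - window + 1):
        mean_rate = sum(rates[start:start + window]) / window
        if mean_rate <= max_mean:
            hits.append((start, start + window))
    return hits


# Toy ORF: typical rate everywhere except a fully constrained 9-codon stretch.
toy_rates = [1.0] * 5 + [0.0] * 9 + [1.0] * 5
flagged = low_rate_windows(toy_rates, window=9, max_mean=0.2)
```

Overlapping hits would normally be merged into a single region before downstream analysis; that step is left out to keep the sketch short.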

Journal Article•DOI•

SubMAP: aligning metabolic pathways with subnetwork mappings.

Ferhat Ay¹,², Manolis Kellis², Tamer Kahveci¹•Institutions (2)

University of Florida¹, Massachusetts Institute of Technology²

01 Mar 2011-Journal of Computational Biology

TL;DR: The empirical results demonstrate that SubMAP can identify biologically relevant mappings that are missed by traditional alignment methods and is scalable for metabolic pathways of arbitrary topology, including searching for a query pathway of size 70 against the complete KEGG database of 1,842 pathways.

Abstract: We consider the problem of aligning two metabolic pathways. Unlike traditional approaches, we do not restrict the alignment to one-to-one mappings between the molecules (nodes) of the input pathways (graphs). We follow the observation that, in nature, different organisms can perform the same or similar functions through different sets of reactions and molecules. The number and the topology of the molecules in these alternative sets often vary from one organism to another. With the motivation that an accurate biological alignment should be able to reveal these functionally similar molecule sets across different species, we develop an algorithm that first measures the similarities between different nodes using a mixture of homology and topological similarity. We combine the two metrics by employing an eigenvalue formulation. We then search for an alignment between the two input pathways that maximizes a similarity score, evaluated as the sum of the similarities of the mapped subnetworks of size at most a given integer k, and also does not contain any conflicting mappings. Here we prove that this maximization is NP-hard by a reduction from the maximum weight independent set (MWIS) problem. We then convert our problem to an instance of MWIS and use an efficient vertex-selection strategy to extract the mappings that constitute our alignment. We name our algorithm SubMAP (Subnetwork Mappings in Alignment of Pathways). We evaluate its accuracy and performance on real datasets. Our empirical results demonstrate that SubMAP can identify biologically relevant mappings that are missed by traditional alignment methods. Furthermore, we observe that SubMAP is scalable for metabolic pathways of arbitrary topology, including searching for a query pathway of size 70 against the complete KEGG database of 1,842 pathways. Implementation in C++ is available at http://bioinformatics.cise.ufl.edu/SubMAP.html.
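The reduction to maximum weight independent set described in the abstract can be sketched as follows. Each candidate mapping pairs a subnetwork from one pathway with a subnetwork from the other and carries a similarity score; two candidates conflict when they reuse a node from either pathway. The sketch below uses a plain highest-score-first greedy selection, which is an assumption made for illustration and not necessarily the vertex-selection strategy used by SubMAP. The names `Mapping`, `conflicts`, and `greedy_alignment` are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple


class Mapping(NamedTuple):
    left: FrozenSet[str]   # nodes taken from the first pathway
    right: FrozenSet[str]  # nodes taken from the second pathway
    score: float           # similarity of the two subnetworks


def conflicts(a: Mapping, b: Mapping) -> bool:
    """Two mappings conflict if they share a node in either pathway."""
    return bool(a.left & b.left) or bool(a.right & b.right)


def greedy_alignment(candidates: List[Mapping]) -> List[Mapping]:
    """Pick a conflict-free set of mappings, highest score first.

    This is a heuristic for maximum weight independent set on the
    conflict graph; it does not guarantee an optimal total score.
    """
    chosen: List[Mapping] = []
    for m in sorted(candidates, key=lambda c: c.score, reverse=True):
        if not any(conflicts(m, c) for c in chosen):
            chosen.append(m)
    return chosen


# Toy candidates: one 2-to-1 mapping competes with two 1-to-1 mappings.
candidates = [
    Mapping(frozenset({"r1"}), frozenset({"s1"}), 0.9),
    Mapping(frozenset({"r1", "r2"}), frozenset({"s1"}), 2.0),
    Mapping(frozenset({"r2"}), frozenset({"s2"}), 0.8),
    Mapping(frozenset({"r3"}), frozenset({"s3"}), 0.5),
]
alignment = greedy_alignment(candidates)
```

In this toy input the 2-to-1 mapping is selected first, which excludes both 1-to-1 mappings that reuse its nodes. A bound on subnetwork size (the integer k in the abstract) would be enforced when the candidates are generated, before this selection step.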

SubMAP: Aligning Metabolic Pathways with Subnetwork Mappings

Ferhat Ay¹,², Manolis Kellis², Tamer Kahveci¹•Institutions (2)

University of Florida¹, Massachusetts Institute of Technology²

01 Mar 2011

TL;DR: SubMAP (Subnetwork Mappings in Alignment of Pathways) aligns two metabolic pathways using a mixture of homology and topological similarity to find biologically relevant mappings.

Abstract: We consider the problem of aligning two metabolic pathways. Unlike traditional approaches, we do not restrict the alignment to one-to-one mappings between the molecules (nodes) of the input pathways (graphs). We follow the observation that, in nature, different organisms can perform the same or similar functions through different sets of reactions and molecules. The number and the topology of the molecules in these alternative sets often vary from one organism to another. With the motivation that an accurate biological alignment should be able to reveal these functionally similar molecule sets across different species, we develop an algorithm that first measures the similarities between different nodes using a mixture of homology and topological similarity. We combine the two metrics by employing an eigenvalue formulation. We then search for an alignment between the two input pathways that maximizes a similarity score, evaluated as the sum of the similarities of the mapped subnetworks of size at most a given integer k, and also does not contain any conflicting mappings. Here we prove that this maximization is NP-hard by a reduction from the maximum weight independent set (MWIS) problem. We then convert our problem to an instance of MWIS and use an efficient vertex-selection strategy to extract the mappings that constitute our alignment. We name our algorithm SubMAP (Subnetwork Mappings in Alignment of Pathways). We evaluate its accuracy and performance on real datasets. Our empirical results demonstrate that SubMAP can identify biologically relevant mappings that are missed by traditional alignment methods. Furthermore, we observe that SubMAP is scalable for metabolic pathways of arbitrary topology, including searching for a query pathway of size 70 against the complete KEGG database of 1,842 pathways. Implementation in C++ is available at http://bioinformatics.cise.ufl.edu/SubMAP.html.

Journal Article•DOI•

Error and error mitigation in low-coverage genome assemblies

Melissa J. Hubisz¹, Michael F. Lin², Manolis Kellis², Adam Siepel¹•Institutions (2)

Cornell University¹, Massachusetts Institute of Technology²

14 Feb 2011-PLOS ONE

TL;DR: The extent of sequencing error in these 2× assemblies, and its potential impact in downstream analyses, is examined, finding that most errors are contributed by a small fraction of bases with low quality scores, in particular, by the ends of reads in regions of single-read coverage in the assembly.

Abstract: The recent release of twenty-two new genome sequences has dramatically increased the data available for mammalian comparative genomics, but twenty of these new sequences are currently limited to ∼2× coverage. Here we examine the extent of sequencing error in these 2× assemblies, and its potential impact in downstream analyses. By comparing 2× assemblies with high-quality sequences from the ENCODE regions, we estimate the rate of sequencing error to be 1–4 errors per kilobase. While this error rate is fairly modest, sequencing error can still have surprising effects. For example, an apparent lineage-specific insertion in a coding region is more likely to reflect sequencing error than a true biological event, and the length distribution of coding indels is strongly distorted by error. We find that most errors are contributed by a small fraction of bases with low quality scores, in particular, by the ends of reads in regions of single-read coverage in the assembly. We explore several approaches for automatic sequencing error mitigation (SEM), making use of the localized nature of sequencing error, the fact that it is well predicted by quality scores, and information about errors that comes from comparisons across species. Our automatic methods for error mitigation cannot replace the need for additional sequencing, but they do allow substantial fractions of errors to be masked or eliminated at the cost of modest amounts of over-correction, and they can reduce the impact of error in downstream phylogenomic analyses. Our error-mitigated alignments are available for download.
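Because the abstract reports that most errors come from a small fraction of low-quality bases, the simplest form of mitigation is to mask those bases before downstream analysis. The sketch below assumes per-base Phred-style quality scores are available alongside the sequence. The function name `mask_low_quality` and the default threshold of 20 are hypothetical choices for illustration, and the sketch does not reproduce the authors' SEM methods, which also draw on cross-species comparisons.

```python
from typing import List


def mask_low_quality(seq: str, quals: List[int], min_q: int = 20) -> str:
    """Replace bases whose quality score is below `min_q` with 'N'.

    `quals` is assumed to hold one Phred-style score per base of `seq`.
    Masking trades a small loss of correct bases (over-correction) for
    the removal of most likely sequencing errors.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have equal length")
    return "".join(
        base if q >= min_q else "N" for base, q in zip(seq, quals)
    )


# Toy read: two low-quality positions are masked, the rest are kept.
masked = mask_low_quality("ACGTAC", [40, 35, 10, 30, 5, 25])
```

Raising `min_q` masks more true errors at the cost of discarding more correct bases, which mirrors the over-correction trade-off mentioned in the abstract.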

Supporting Online Material for Three Periods of Regulatory Innovation During Vertebrate Evolution

Craig B. Lowe, Manolis Kellis, Adam Siepel, Brian J. Raney, Michele Clamp, David M. Kingsley, Kerstin Lindblad-Toh, David Haussler¹•Institutions (1)

University of California, Santa Cruz¹

01 Jan 2011

TL;DR: The gain, loss, and modification of gene regulatory elements may underlie a substantial proportion of phenotypic changes on animal lineages. To investigate regulatory gains, the authors identified genome-wide sets of putative regulatory regions for five vertebrates, including humans.

Abstract: Patterns of vertebrate gene regulation have changed during the course of evolution. The gain, loss, and modification of gene regulatory elements may underlie a substantial proportion of phenotypic changes on animal lineages. To investigate the gain of regulatory elements throughout vertebrate evolution, we identified genome-wide sets of putative regulatory regions for five vertebrates, including humans. These putative regulatory regions are conserved nonexonic elements (CNEEs), which are evolutionarily conserved yet do not overlap any coding or noncoding mature transcript. We then inferred the branch on which each CNEE came under selective constraint. Our analysis identified three extended periods in the evolution of gene regulatory elements. Early vertebrate evolution was characterized by regulatory gains near transcription factors and developmental genes, but this trend was replaced by innovations near extracellular signaling genes, and then innovations near posttranslational protein modifiers.

Journal Article•DOI•

Preface: RECOMB Conference on Systems Biology, Regulatory Genomics, and DREAM Challenges 2010 special issue.

Manolis Kellis, Andrea Califano, Ziv Bar-Joseph

01 Feb 2011-Journal of Computational Biology

Evolution at the Subgene Level: Domain Rearrangements in the Drosophila Phylogeny

Yi-Chieh Wu¹, Matthew D. Rasmussen¹, Manolis Kellis¹,²•Institutions (2)

Massachusetts Institute of Technology¹, Broad Institute²

01 Sep 2011

TL;DR: The listed abstract only points to supplementary material (sections, tables, and figures) available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Abstract: Supplementary sections 1–13, tables S1–S10, and figures S1–S9 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
