Predicting A/B compartments from histone modifications using deep learning
Suchen Zheng, Nitya Thakkar, Hannah L. Harris, Megan Zhang, Susanna Liu, Mark Gerstein, Erez Lieberman Aiden, M. Jordan Rowley, William Noble, Gamze Gürsoy, Ritambhara Singh
TLDR
Compartment prediction using Recurrent Neural Network (CoRNN) is a prediction tool that models the relationship between the compartmental organization of the genome and histone modification enrichment. The authors demonstrate the generalizability of the model by predicting compartments in independent tissue samples.
Abstract:
Genomes fold into organizational units in 3D space that can influence critical biological functions. In particular, the organization of chromatin into A and B compartments segregates its active regions from inactive regions. Compartments, evident in Hi-C contact matrices, have been used to describe cell-type-specific changes in the A/B organization. However, obtaining Hi-C data for all cell and tissue types of interest is prohibitively expensive, which has limited the widespread consideration of compartment status. We present a prediction tool called Compartment prediction using Recurrent Neural Network (CoRNN) that models the relationship between the compartmental organization of the genome and histone modification enrichment. Our model predicts A/B compartments, in a cross-cell-type setting, with an average area under the ROC curve of 90.9%. Our cell-type-specific compartment predictions show high overlap with known functional elements. We investigate our predictions by systematically removing combinations of histone marks and find that H3K27ac and H3K36me3 are the most predictive marks. We then perform a detailed analysis of loci where compartment status cannot be accurately predicted from these marks. These regions represent chromatin with ambiguous compartmental status, likely due to variations in status within the population of cells. These ambiguous loci also show highly variable compartmental status between biological replicates of the same GM12878 cell type. Finally, we demonstrate the generalizability of our model by predicting compartments in independent tissue samples. Our software and trained model are publicly available at https://github.com/rsinghlab/CoRNN.
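The abstract reports performance as an area under the ROC curve. As a minimal sketch of what that metric measures for a binary A/B call, the snippet below computes it with the pairwise (Mann-Whitney) identity: the fraction of A/B bin pairs in which the A bin receives the higher score. This is not taken from the CoRNN code base; the function name `auroc`, the encoding of A as 1 and B as 0, and the toy labels and scores are all assumptions made for illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 for compartment A, 0 for compartment B (assumed encoding).
    scores: predicted probability that each genomic bin is in compartment A.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one bin of each class")
    # Count A/B pairs where the A bin outscores the B bin; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: six genomic bins with made-up labels and scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
print(auroc(labels, scores))
```

The pairwise form is quadratic in the number of bins, so it only suits small examples. For genome-wide evaluation, a sort-based implementation from an established library would be the more practical choice.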
Citations
Journal Article
Considerations and caveats for analyzing chromatin compartments
TL;DR: This paper reviews different strategies to identify A/B and sub-compartment intervals, including various machine-learning approaches to predict these features, and examines the strengths and limitations of current strategies and how these aspects of analysis may have affected our understanding of chromatin compartments.
Journal Article
Assignment of the somatic A/B compartments to chromatin domains in giant transcriptionally active lampbrush chromosomes
TL;DR: This article compares the distribution of A/B compartments in chicken somatic cells with chromatin domains in lampbrush chromosomes; the results indicate that gene-poor regions tend to be packed into chromomeres.
Journal Article
PyMEGABASE: Predicting cell-type-specific structural annotations of chromosomes using the epigenome.
Esteban Dodero-Rojas, Matheus F. Mello, Sumitabha Brahmachari, Antonio B Oliveira Junior, Vinícius G. Contessoto, José N. Onuchic
TL;DR: PyMEGABASE (PYMB) is a maximum entropy-based neural network model that predicts (sub)compartment annotations of a locus based solely on the local epigenome, such as ChIP-Seq of histone post-translational modifications.
Posted Content
Assignment of the somatic A/B compartments to chromatin domains in giant transcriptionally active lampbrush chromosomes
TL;DR: This preprint compares the distribution of A/B compartments in chicken somatic cells with chromatin domains in lampbrush chromosomes and shows that gene-poor regions tend to be packed into chromomeres.