Measuring the reproducibility and quality of Hi-C data
Galip Gürkan Yardımcı, Hakan Ozadam, Michael E.G. Sauria, Oana Ursu, Koon-Kiu Yan, Tao Yang, Abhijit Chakraborty, Arya Kaul, Bryan R. Lajoie, Fan Song, Ye Zhan, Ferhat Ay, Mark Gerstein, Anshul Kundaje, Qiang Li, James Taylor, Feng Yue, Job Dekker, William Stafford Noble
TLDR
This work assesses reproducibility and quality measures by varying sequencing depth, resolution and noise levels in Hi-C data from 13 cell lines, with two biological replicates each, as well as 176 simulated matrices, to identify low-quality experiments.
Abstract:
Hi-C is currently the most widely used assay to investigate the 3D organization of the genome and to study its role in gene regulation, DNA replication, and disease. However, Hi-C experiments are costly to perform and involve multiple complex experimental steps; thus, accurate methods for measuring the quality and reproducibility of Hi-C data are essential to determine whether the output should be used further in a study. Using real and simulated data, we profile the performance of several recently proposed methods for assessing reproducibility of population Hi-C data, including HiCRep, GenomeDISCO, HiC-Spector, and QuASAR-Rep. By explicitly controlling noise and sparsity through simulations, we demonstrate the deficiencies of performing simple correlation analysis on pairs of matrices, and we show that methods developed specifically for Hi-C data produce better measures of reproducibility. We also show how to use established measures, such as the ratio of intra- to interchromosomal interactions, and novel ones, such as QuASAR-QC, to identify low-quality experiments. In this work, we assess reproducibility and quality measures by varying sequencing depth, resolution and noise levels in Hi-C data from 13 cell lines, with two biological replicates each, as well as 176 simulated matrices. Through this extensive validation and benchmarking of Hi-C data, we describe best practices for reproducibility and quality assessment of Hi-C experiments. We make all software publicly available at http://github.com/kundajelab/3DChromatin_ReplicateQC
to facilitate adoption in the community.
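The abstract names the ratio of intra- to interchromosomal interactions as an established quality measure. The following is a minimal Python sketch of that ratio, assuming contacts are available as (chromosome, position, chromosome, position) tuples. The function name and record layout are illustrative assumptions and are not taken from the 3DChromatin_ReplicateQC software; no pass/fail threshold is implied.

```python
from typing import Iterable, Tuple

# Assumed record layout (hypothetical): (chrom_a, pos_a, chrom_b, pos_b).
Contact = Tuple[str, int, str, int]


def intra_inter_ratio(contacts: Iterable[Contact]) -> float:
    """Return (# intrachromosomal contacts) / (# interchromosomal contacts).

    Returns infinity when no interchromosomal contact is observed.
    """
    intra = 0
    inter = 0
    for chrom_a, _pos_a, chrom_b, _pos_b in contacts:
        if chrom_a == chrom_b:
            intra += 1  # both ends map to the same chromosome
        else:
            inter += 1  # ends map to different chromosomes
    if inter == 0:
        return float("inf")
    return intra / inter


# Toy input: three intrachromosomal contacts and one interchromosomal contact.
toy = [
    ("chr1", 100, "chr1", 5000),
    ("chr1", 200, "chr1", 90000),
    ("chr2", 300, "chr2", 700),
    ("chr1", 400, "chr2", 800),
]
print(intra_inter_ratio(toy))  # 3.0
```

In practice the counts would come from the filtered read pairs of a single experiment, and the ratio would be compared across experiments rather than judged in isolation.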
Citations
Journal Article
Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations.
Charles P. Fulco, Joseph Nasser, Thouis R. Jones, Glen Munson, Drew T. Bergman, Vidya Subramanian, Sharon R. Grossman, Rockwell Anyoha, Benjamin R. Doughty, Tejal A. Patwardhan, Tung T. Nguyen, Michael Kane, Elizabeth M. Perez, Neva C. Durand, Caleb A. Lareau, Elena K. Stamenova, Erez Lieberman Aiden, Eric S. Lander, Jesse M. Engreitz
TL;DR: A simple activity-by-contact model substantially outperformed previous methods at predicting the complex connections in the CRISPR dataset and allows systematic mapping of enhancer–gene connections in a given cell type, on the basis of chromatin-state measurements.
Journal Article
Resolving the 3D Landscape of Transcription-Linked Mammalian Chromatin Folding.
Tsung-Han S. Hsieh, Claudia Cattoglio, Elena Slobodyanyuk, Anders S. Hansen, Oliver J. Rando, Robert Tjian, Xavier Darzacq
TL;DR: This study uncovers previously obscured finer-scale genome organization, establishing functional links between chromatin folding and gene regulation by using high-resolution Micro-C to probe links between 3D genome organization and transcriptional regulation in mouse stem cells.
Journal Article
Identifying statistically significant chromatin contacts from Hi-C data with FitHiC2.
TL;DR: The FitHiC2 protocol is described, which eliminates indirect/bystander interactions, leading to significant reduction in the number of reported contacts without sacrificing recovery of key loops such as those between convergent CTCF binding sites.
Journal Article
Robust single-cell Hi-C clustering by convolution- and random-walk-based imputation.
Jingtian Zhou, Jianzhu Ma, Yusi Chen, Chuankai Cheng, Bokan Bao, Jian Peng, Terrence J. Sejnowski, Jesse R. Dixon, Joseph R. Ecker
TL;DR: scHiCluster is a single-cell clustering algorithm for Hi-C contact matrices based on imputation using linear convolution and random walk; it significantly improves clustering accuracy on low-coverage datasets compared with existing methods.
Journal Article
GenomeDISCO: a concordance score for chromosome conformation capture experiments using random walks on contact map graphs.
Oana Ursu, Nathan Boley, Maryna Taranova, Y. X. Rachel Wang, Galip Gürkan Yardımcı, William Stafford Noble, Anshul Kundaje
TL;DR: A concordance measure called DIfferences between Smoothed COntact maps (GenomeDISCO) is introduced for assessing the similarity of a pair of contact maps obtained from chromosome conformation capture experiments, which accurately distinguishes biological replicates from samples obtained from different cell types.