Author

Xiaotao Wang

Bio: Xiaotao Wang is an academic researcher from Northwestern University. The author has contributed to research on topics including chromatin and medicine. The author has an h-index of 5 and has co-authored 10 publications receiving 84 citations.
Topics: Chromatin, Medicine, Enhancer, Biology, Epigenomics

Papers
Journal ArticleDOI
25 Nov 2020-Nature
TL;DR: A comprehensive map of transcriptomes, cis-regulatory elements, heterochromatin structure, the methylome and 3D genome organization in the zebrafish enables identification of species-specific and evolutionarily conserved regulatory features, and provides a foundation for modelling studies on human disease and development.
Abstract: The zebrafish (Danio rerio) has been widely used in the study of human disease and development, and about 70% of the protein-coding genes are conserved between the two species. However, studies in zebrafish remain constrained by the sparse annotation of functional control elements in the zebrafish genome. Here we performed RNA sequencing, assay for transposase-accessible chromatin using sequencing (ATAC-seq), chromatin immunoprecipitation with sequencing, whole-genome bisulfite sequencing, and chromosome conformation capture (Hi-C) experiments in up to eleven adult and two embryonic tissues to generate a comprehensive map of transcriptomes, cis-regulatory elements, heterochromatin, methylomes and 3D genome organization in the zebrafish Tubingen reference strain. A comparison of zebrafish, human and mouse regulatory elements enabled the identification of both evolutionarily conserved and species-specific regulatory sequences and networks. We observed enrichment of evolutionary breakpoints at topologically associating domain boundaries, which were correlated with strong histone H3 lysine 4 trimethylation (H3K4me3) and CCCTC-binding factor (CTCF) signals. We performed single-cell ATAC-seq in zebrafish brain, which delineated 25 different clusters of cell types. By combining long-read DNA sequencing and Hi-C, we assembled the sex-determining chromosome 4 de novo. Overall, our work provides an additional epigenomic anchor for the functional annotation of vertebrate genomes and the study of evolutionarily conserved elements of 3D genome organization.

68 citations

Journal ArticleDOI
TL;DR: This work presents Peakachu, a Random Forest classification framework that predicts chromatin loops from genome-wide contact maps, and applies it to systematically predict chromatin loops in 56 Hi-C datasets, with results available at the 3D Genome Browser.
Abstract: Accurately predicting chromatin loops from genome-wide interaction matrices such as Hi-C data is critical to deepening our understanding of proper gene regulation. Current approaches are mainly focused on searching for statistically enriched dots on a genome-wide map. However, given the availability of orthogonal data types such as ChIA-PET, HiChIP, Capture Hi-C, and high-throughput imaging, a supervised learning approach could facilitate the discovery of a comprehensive set of chromatin interactions. Here, we present Peakachu, a Random Forest classification framework that predicts chromatin loops from genome-wide contact maps. We compare Peakachu with current enrichment-based approaches, and find that Peakachu identifies a unique set of short-range interactions. We show that our models perform well in different platforms, across different sequencing depths, and across different species. We apply this framework to predict chromatin loops in 56 Hi-C datasets, and release the results at the 3D Genome Browser.

53 citations

Journal ArticleDOI
09 Jun 2021-eLife
TL;DR: In this paper, the authors used a novel HP1α auxin-inducible degron human cell line to rapidly degrade HP1α (CBX5) and found that HP1α is essential to chromatin-based mechanics and maintains nuclear morphology.
Abstract: Chromatin, which consists of DNA and associated proteins, contains genetic information and is a mechanical component of the nucleus. Heterochromatic histone methylation controls nucleus and chromosome stiffness, but the contribution of heterochromatin protein HP1α (CBX5) is unknown. We used a novel HP1α auxin-inducible degron human cell line to rapidly degrade HP1α. Degradation did not alter transcription, local chromatin compaction, or histone methylation, but did decrease chromatin stiffness. Single-nucleus micromanipulation reveals that HP1α is essential to chromatin-based mechanics and maintains nuclear morphology, separate from histone methylation. Further experiments with dimerization-deficient HP1αI165E indicate that chromatin crosslinking via HP1α dimerization is critical, while polymer simulations demonstrate the importance of chromatin-chromatin crosslinkers in mechanics. In mitotic chromosomes, HP1α similarly bolsters stiffness while aiding in mitotic alignment and faithful segregation. HP1α is therefore a critical chromatin-crosslinking protein that provides mechanical strength to chromosomes and the nucleus throughout the cell cycle and supports cellular functions.

48 citations

Journal ArticleDOI
TL;DR: NeoLoopFinder is a computational framework to identify the chromatin interactions induced by structural variations, including interchromosomal translocations, large deletions and inversions. It enables identification of critical oncogenic regulatory elements that can potentially reveal therapeutic targets.
Abstract: Recent efforts have shown that structural variations (SVs) can disrupt three-dimensional genome organization and induce enhancer hijacking, yet no computational tools exist to identify such events from chromatin interaction data. Here, we develop NeoLoopFinder, a computational framework to identify the chromatin interactions induced by SVs, including interchromosomal translocations, large deletions and inversions. Our framework can automatically resolve complex SVs, reconstruct local Hi-C maps surrounding the breakpoints, normalize copy number variation and allele effects and predict chromatin loops induced by SVs. We applied NeoLoopFinder in Hi-C data from 50 cancer cell lines and primary tumors and identified tens of recurrent genes associated with enhancer hijacking. To experimentally validate NeoLoopFinder, we deleted the hijacked enhancers in prostate adenocarcinoma cells using CRISPR–Cas9, which significantly reduced expression of the target oncogene. In summary, NeoLoopFinder enables identification of critical oncogenic regulatory elements that can potentially reveal therapeutic targets. This work presents NeoLoopFinder, a computational method, for identifying chromatin interactions of structurally rearranged genomes. NeoLoopFinder was applied in 50 cancer datasets and identified genes associated with enhancer-hijacking events.

47 citations

Posted ContentDOI
20 Aug 2019-bioRxiv
TL;DR: This work presents Peakachu, a Random Forest classification framework that predicts chromatin loops from genome-wide contact maps and identifies more meaningful short-range interactions than enrichment-based approaches, and applies this framework to systematically predict chromatin loops in 56 Hi-C datasets.
Abstract: Accurately predicting chromatin loops from genome-wide interaction matrices such as Hi-C data is critical to deepening our understanding of proper gene regulation events. Current approaches are mainly focused on searching for statistically enriched dots on a genome-wide map. However, given the availability of a wide variety of orthogonal data types such as ChIA-PET, GAM, SPRITE, and high-throughput imaging, a supervised learning approach could facilitate the discovery of a comprehensive set of chromatin interactions. Here we present Peakachu, a Random Forest classification framework that predicts chromatin loops from genome-wide contact maps. Compared with current enrichment-based approaches, Peakachu identified more meaningful short-range interactions. We show that our models perform well in different platforms such as Hi-C, Micro-C, and DNA SPRITE, across different sequencing depths, and across different species. We applied this framework to systematically predict chromatin loops in 56 Hi-C datasets, and the results are available at the 3D Genome Browser (www.3dgenome.org).

45 citations


Cited by
01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal Article
TL;DR: In this article, high-resolution spatial proximity maps are consistent with a model in which a complex, including the proteins CCCTC-binding factor (CTCF) and cohesin, mediates the formation of loops by a process of extrusion.
Significance: When the human genome folds up inside the cell nucleus, it is spatially partitioned into numerous loops and contact domains. How these structures form is unknown. Here, we show that data from high-resolution spatial proximity maps are consistent with a model in which a complex, including the proteins CCCTC-binding factor (CTCF) and cohesin, mediates the formation of loops by a process of extrusion. Contact domains form as a byproduct of this process. The model accurately predicts how the genome will fold, using only information about the locations at which CTCF is bound. We demonstrate the ability to reengineer loops and domains in a predictable manner by creating highly targeted mutations, some as small as a single base pair, at CTCF sites.
Abstract: We recently used in situ Hi-C to create kilobase-resolution 3D maps of mammalian genomes. Here, we combine these maps with new Hi-C, microscopy, and genome-editing experiments to study the physical structure of chromatin fibers, domains, and loops. We find that the observed contact domains are inconsistent with the equilibrium state for an ordinary condensed polymer. Combining Hi-C data and novel mathematical theorems, we show that contact domains are also not consistent with a fractal globule. Instead, we use physical simulations to study two models of genome folding. In one, intermonomer attraction during polymer condensation leads to formation of an anisotropic “tension globule.” In the other, CCCTC-binding factor (CTCF) and cohesin act together to extrude unknotted loops during interphase. Both models are consistent with the observed contact domains and with the observation that contact domains tend to form inside loops. However, the extrusion model explains a far wider array of observations, such as why loops tend not to overlap and why the CTCF-binding motifs at pairs of loop anchors lie in the convergent orientation. Finally, we perform 13 genome-editing experiments examining the effect of altering CTCF-binding sites on chromatin folding. The convergent rule correctly predicts the affected loops in every case. Moreover, the extrusion model accurately predicts in silico the 3D maps resulting from each experiment using only the location of CTCF-binding sites in the WT. Thus, we show that it is possible to disrupt, restore, and move loops and domains using targeted mutations as small as a single base pair.

930 citations

01 Dec 2017
TL;DR: In this paper, the ubiquitously expressed transcription factor Yin Yang 1 (YY1) contributes to enhancer-promoter structural interactions in a manner analogous to DNA interactions mediated by CTCF.
Abstract: There is considerable evidence that chromosome structure plays important roles in gene control, but we have limited understanding of the proteins that contribute to structural interactions between gene promoters and their enhancer elements. Large DNA loops that encompass genes and their regulatory elements depend on CTCF-CTCF interactions, but most enhancer-promoter interactions do not employ this structural protein. Here, we show that the ubiquitously expressed transcription factor Yin Yang 1 (YY1) contributes to enhancer-promoter structural interactions in a manner analogous to DNA interactions mediated by CTCF. YY1 binds to active enhancers and promoter-proximal elements and forms dimers that facilitate the interaction of these DNA elements. Deletion of YY1 binding sites or depletion of YY1 protein disrupts enhancer-promoter looping and gene expression. We propose that YY1-mediated enhancer-promoter interactions are a general feature of mammalian gene control.

378 citations

01 Feb 2012
TL;DR: ChromHMM is developed, an automated computational system for learning chromatin states, characterizing their biological functions and correlations with large-scale functional datasets, and visualizing the resulting genome-wide maps of chromatin state annotations.
Abstract: Chromatin state annotation using combinations of chromatin modification patterns has emerged as a powerful approach for discovering regulatory regions and their cell-type-specific activity patterns, and for interpreting disease-association studies. However, the computational challenge of learning chromatin state models from large numbers of chromatin modification datasets in multiple cell types still requires extensive bioinformatics expertise, making it inaccessible to the wider scientific community. To address this challenge, we have developed ChromHMM, an automated computational system for learning chromatin states, characterizing their biological functions and correlations with large-scale functional datasets, and visualizing the resulting genome-wide maps of chromatin state annotations.

365 citations

01 Oct 2017
TL;DR: All loop domains are eliminated, but neither compartment domains nor histone marks are affected, and many megabase-sized loops recovered in under an hour, consistent with a model where loop extrusion is rapid.
Abstract: The human genome folds to create thousands of intervals, called "contact domains," that exhibit enhanced contact frequency within themselves. "Loop domains" form because of tethering between two loci (almost always bound by CTCF and cohesin) lying on the same chromosome. "Compartment domains" form when genomic intervals with similar histone marks co-segregate. Here, we explore the effects of degrading cohesin. All loop domains are eliminated, but neither compartment domains nor histone marks are affected. Loss of loop domains does not lead to widespread ectopic gene activation but does affect a significant minority of active genes. In particular, cohesin loss causes superenhancers to co-localize, forming hundreds of links within and across chromosomes and affecting the regulation of nearby genes. We then restore cohesin and monitor the re-formation of each loop. Although re-formation rates vary greatly, many megabase-sized loops recovered in under an hour, consistent with a model where loop extrusion is rapid.

287 citations