Open Access Journal Article

Comprehensive Human Transcription Factor Binding Site Map for Combinatory Binding Motifs Discovery

TLDR
A computational method is proposed that exploits previously unused TF properties to discover the missing TFBMs and their sites in all human gene promoters. The new TFBMs and their combinatorial patterns are found to convey biological meaning, targeting TFs and genes related to developmental functions.
Abstract
Knowing the map between transcription factors (TFs) and their binding sites is essential for reverse engineering gene regulation. Only about 10%–20% of the transcription factor binding motifs (TFBMs) have been reported, and this lack of data hinders the understanding of gene regulation. To address this gap, we propose a computational method that exploits previously unused TF properties to discover the missing TFBMs and their sites in all human gene promoters. The method starts by predicting a dictionary of regulatory “DNA words.” From this dictionary, it distills 4098 novel predictions. To disclose the crosstalk between motifs, an additional algorithm extracts TF combinatorial binding patterns, creating a collection of TF regulatory syntactic rules. Using these rules, we narrowed the predictions down to a list of 504 novel motifs that appear frequently in syntax patterns. We tested the predictions against 509 known motifs, confirming that our system can reliably predict ab initio motifs with an accuracy of 81%, far higher than previous approaches. We found that, on average, 90% of the discovered combinatorial binding patterns target at least 10 genes, suggesting that supplementary regulatory mechanisms are required to control smaller gene sets independently. Additionally, we discovered that the new TFBMs and their combinatorial patterns convey biological meaning, targeting TFs and genes related to developmental functions. Thus, among all the available targets in the genome, the TFs tend to regulate other TFs and genes involved in developmental functions. We provide a comprehensive resource for regulation analysis that includes a dictionary of “DNA words,” newly predicted motifs, and their corresponding combinatorial patterns. Combinatorial patterns are a useful filter for discovering TFBMs that play a major role in orchestrating other factors and thus are likely to lock or unlock cellular functional clusters.
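
The idea of a combinatorial binding pattern and its target gene set can be illustrated with a toy sketch. This is not the authors' algorithm: exact substring matching stands in for motif scanning, and the promoter sequences, gene names, and function names (`motif_targets`, `combinatorial_patterns`, `fraction_with_min_targets`) are all hypothetical. It only shows how one might count the genes whose promoters contain every motif in a combination, and what share of combinations reach a minimum number of target genes.

```python
from itertools import combinations


def motif_targets(promoters, motifs):
    """Map each motif to the set of genes whose promoter contains it.

    Assumption: a plain substring test replaces real motif scanning.
    """
    return {m: {g for g, seq in promoters.items() if m in seq} for m in motifs}


def combinatorial_patterns(targets, size=2):
    """Map each motif combination to the genes containing all of its motifs."""
    patterns = {}
    for combo in combinations(sorted(targets), size):
        shared = set.intersection(*(targets[m] for m in combo))
        if shared:  # keep only combinations that co-occur in some promoter
            patterns[combo] = shared
    return patterns


def fraction_with_min_targets(patterns, min_genes):
    """Fraction of patterns that target at least `min_genes` genes."""
    if not patterns:
        return 0.0
    hits = sum(len(genes) >= min_genes for genes in patterns.values())
    return hits / len(patterns)


# Hypothetical toy data, for illustration only.
promoters = {
    "geneA": "TTGACGTCAAGGATAAGC",
    "geneB": "CCGATAAGTTTGACGTCA",
    "geneC": "GGGATAAGCCCCCCCCCC",
    "geneD": "AAAAAAAATGACGTCAAA",
}
motifs = ["TGACGTCA", "GATAAG", "CACGTG"]

targets = motif_targets(promoters, motifs)
patterns = combinatorial_patterns(targets, size=2)
```

With this toy input, only the pair `("GATAAG", "TGACGTCA")` co-occurs, in the promoters of `geneA` and `geneB`. On real data the threshold passed to `fraction_with_min_targets` would be 10 to mirror the statistic quoted in the abstract.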

