Transcription factor–DNA binding: beyond binding site motifs

doi:10.1016/J.GDE.2017.02.007

Journal Article•DOI•

Sachi Inukai¹, Kian Hong Kock¹, Martha L. Bulyk¹•Institutions (1)

Brigham and Women's Hospital¹

01 Apr 2017-Current Opinion in Genetics & Development (Curr Opin Genet Dev)-Vol. 43, pp 110-119

TL;DR: This review highlights novel approaches for characterizing functional binding site motifs, which promise to inform the understanding of how TF binding controls gene expression and ultimately contributes to phenotype.

About: This article was published in Current Opinion in Genetics & Development on 2017-04-01 and is currently open access. It has received 219 citations to date. The article focuses on the topics: DNA binding site & Enhancer.

Citations

Journal Article•DOI•

Transcription Factors Associated with Abiotic and Biotic Stress Tolerance and Their Potential for Crops Improvement

Elamin Hafiz Baillo¹, Roy Njoroge Kimotho¹, Zhengbin Zhang¹, Ping Xu¹•Institutions (1)

Chinese Academy of Sciences¹

30 Sep 2019-Genes

TL;DR: The aim of this review is to illustrate the potential application of TF genes for stress tolerance improvement and the engineering of resistant crops, with an emphasis on sorghum.

Abstract: In field conditions, crops are adversely affected by a wide range of abiotic stresses including drought, cold, salt, and heat, as well as biotic stresses including pests and pathogens. These stresses can have a marked effect on crop yield. The present and future effects of climate change necessitate the improvement of crop stress tolerance. Plants have evolved sophisticated stress response strategies, and genes that encode transcription factors (TFs) that are master regulators of stress-responsive genes are excellent candidates for crop improvement. Related examples in recent studies include TF gene modulation and overexpression approaches in crop species to enhance stress tolerance. However, much remains to be discovered about the diverse plant TFs. Of the >80 TF families, only a few, such as NAC, MYB, WRKY, bZIP, and ERF/DREB, with vital roles in abiotic and biotic stress responses have been intensively studied. Moreover, although significant progress has been made in deciphering the roles of TFs in important cereal crops, fewer TF genes have been elucidated in sorghum. As a model drought-tolerant crop, sorghum research warrants further focus. This review summarizes recent progress on major TF families associated with abiotic and biotic stress tolerance and their potential for crop improvement, particularly in sorghum. Other TF families and non-coding RNAs that regulate gene expression are discussed briefly. Despite the emphasis on sorghum, numerous examples from wheat, rice, maize, and barley are included. Collectively, the aim of this review is to illustrate the potential application of TF genes for stress tolerance improvement and the engineering of resistant crops, with an emphasis on sorghum.

285 citations

Cites background from "Transcription factor–DNA binding: b..."

...[12] highlighted recent findings that elaborate how TF interactions, local DNA structure, and genomic features can influence TF binding to DNA....
[...]

Journal Article•DOI•

Assessing sufficiency and necessity of enhancer activities for gene expression and the mechanisms of transcription activation.

Rui R. Catarino¹, Alexander Stark¹•Institutions (1)

Research Institute of Molecular Pathology¹

01 Feb 2018-Genes & Development

TL;DR: This review covers recent developments in the prediction of enhancers based on chromatin characteristics and their identification by functional reporter assays and endogenous DNA perturbations and surveys how these approaches advance the understanding of transcription regulation with respect to promoter specificity and transcriptional bursting.

Abstract: Enhancers are important genomic regulatory elements directing cell type-specific transcription. They assume a key role during development and disease, and their identification and functional characterization have long been the focus of scientific interest. The advent of next-generation sequencing and clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-based genome editing has revolutionized the means by which we study enhancer biology. In this review, we cover recent developments in the prediction of enhancers based on chromatin characteristics and their identification by functional reporter assays and endogenous DNA perturbations. We discuss that the two latter approaches provide different and complementary insights, especially in assessing enhancer sufficiency and necessity for transcription activation. Furthermore, we discuss recent insights into mechanistic aspects of enhancer function, including findings about cofactor requirements and the role of post-translational histone modifications such as monomethylation of histone H3 Lys4 (H3K4me1). Finally, we survey how these approaches advance our understanding of transcription regulation with respect to promoter specificity and transcriptional bursting and provide an outlook covering open questions and promising developments.

176 citations

Journal Article•DOI•

Statistics or biology: the zero-inflation controversy about scRNA-seq data

Ruochen Jiang, Tianyi Sun, Dongyuan Song, Jingyi Jessica Li

21 Jan 2022-Genome Biology

TL;DR: The authors discuss the sources of biological and non-biological zeros in single-cell RNA-seq data; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types (observed counts, imputed counts, and binarized counts); discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.

Abstract: Researchers view vast zeros in single-cell RNA-seq data differently: some regard zeros as biological signals representing no or low gene expression, while others regard zeros as missing data to be corrected. To help address the controversy, here we discuss the sources of biological and non-biological zeros; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types: observed counts, imputed counts, and binarized counts; discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.

176 citations

Journal Article•DOI•

EnhancerAtlas 2.0: an updated resource with enhancer annotation in 586 tissue/cell types across nine species.

Tianshun Gao¹, Jiang Qian¹•Institutions (1)

Johns Hopkins University¹

19 Nov 2019-Nucleic Acids Research

TL;DR: EnhancerAtlas 2.0 greatly expands the first version, which contained only enhancers in human cells, and adds predicted enhancer-target gene relationships in human, mouse and fly.

Abstract: Enhancers are distal cis-regulatory elements that activate the transcription of their target genes. They regulate a wide range of important biological functions and processes, including embryogenesis, development, and homeostasis. As more and more large-scale technologies were developed for enhancer identification, a comprehensive database is highly desirable for enhancer annotation based on various genome-wide profiling datasets across different species. Here, we present an updated database EnhancerAtlas 2.0 (http://www.enhanceratlas.org/indexv2.php), covering 586 tissue/cell types that include a large number of normal tissues, cancer cell lines, and cells at different development stages across nine species. Overall, the database contains 13 494 603 enhancers, which were obtained from 16 055 datasets using 12 high-throughput experiment methods (e.g. H3K4me1/H3K27ac, DNase-seq/ATAC-seq, P300, POLR2A, CAGE, ChIA-PET, GRO-seq, STARR-seq and MPRA). The updated version is a huge expansion of the first version, which only contains the enhancers in human cells. In addition, we predicted enhancer-target gene relationships in human, mouse and fly. Finally, the users can search enhancers and enhancer-target gene relationships through five user-friendly, interactive modules. We believe the new annotation of enhancers in EnhancerAtlas 2.0 will facilitate users to perform useful functional analysis of enhancers in various genomes.

174 citations

Cites background from "Transcription factor–DNA binding: b..."

...Especially for the ‘TF-binding’ track, it could contain dozens of datasets for different TFs....
[...]
...The consensus enhancers in EnhancerAtlas 2.0 were identified based on twelve high-throughput experimental approaches, including P300 (12), Histone (10), POLR2A (13,21), TF-binding (11), DHS (or ATAC) (8,9), FAIRE (16), MNase-seq (14,15), GRO-seq (6), STARR-seq (5), CAGE (2), ChIA-PET (20) and MPRA (17)....
[...]
...The TFs often regulate gene expression by binding to the DNA regulatory elements (11)....

Systematic Localization of Common Disease-Associated Variation in

Matthew T. Maurano, Richard Humbert, Eric Rynes, Robert E. Thurman, Eric Haugen, Hao Wang, Alex Reynolds, Richard Sandstrom, Hongzhu Qu, Jennifer A. Brody, Anthony Shafer, Fidencio Neri, Kristen Lee, Tanya Kutyavin, Sandra Stehling-Sun, Audra K. Johnson, Theresa K. Canfield, Erika Giste, Morgan Diegel, Daniel Bates, R. Scott Hansen, Shane Neph, Peter J. Sabo, Shelly Heimfeld, Antony Raubitschek, Steven F. Ziegler, Chris Cotsapas, Nona Sotoodehnia, Ian A. Glass, Shamil R. Sunyaev, Rajinder Kaul, John A. Stamatoyannopoulos

01 Jan 2013

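TL;DR: The authors found that noncoding variants associated with common diseases and traits are concentrated in regulatory DNA marked by deoxyribonuclease I hypersensitive sites (DHSs), and that 88% of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure-related phenotypes.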

Abstract: Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure–related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn’s disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.

171 citations

References

Journal Article•DOI•

An integrated encyclopedia of DNA elements in the human genome

Principal investigators¹, Nhgri groups², Data production leads³, Lead analysts³•Institutions (3)

Wellcome Trust¹, University of Washington², Pennsylvania State University³

06 Sep 2012-Nature

TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of the authors' genes and genome, and is an expansive resource of functional annotations for biomedical research.

Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

13,548 citations

Journal Article•DOI•

A global reference for human genetic variation.

Adam Auton¹, Gonçalo R. Abecasis², David Altshuler³, Richard Durbin⁴ +514 more•Institutions (90)

01 Oct 2015-Nature

TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. It reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

Journal Article•DOI•

MEME Suite: tools for motif discovery and searching

Timothy L. Bailey¹, Mikael Bodén², Fabian A. Buske², Martin C. Frith², Charles E. Grant², Luca Clementi², Jingyuan Ren², Wilfred W. Li², William Stafford Noble²•Institutions (2)

University of Queensland¹, University of California, San Diego²

01 Jul 2009-Nucleic Acids Research

TL;DR: The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps, and all of the motif-based tools are now implemented as web services via Opal.

Abstract: The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms—MAST, FIMO and GLAM2SCAN—allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm Tomtom. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and Tomtom), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net.

7,733 citations

"Transcription factor–DNA binding: b..." refers methods in this paper

...org/ | Collection of motif databases and web-based tools for motif discovery, enrichment, scanning, and comparison | Eukaryotes and prokaryotes | Collates external motif and sequence databases for multiple species | [104,105]...
[...]

Journal Article•DOI•

Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position

Jason D. Buenrostro¹, Paul G. Giresi¹, Lisa C. Zaba¹, Howard Y. Chang¹, William J. Greenleaf¹•Institutions (1)

Stanford University¹

01 Dec 2013-Nature Methods

TL;DR: The authors describe ATAC-seq, a rapid assay that captures open chromatin sites; discover classes of DNA-binding factors that strictly avoid, can tolerate, or tend to overlap with nucleosomes; and demonstrate the feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making.

Abstract: We describe an assay for transposase-accessible chromatin using sequencing (ATAC-seq), based on direct in vitro transposition of sequencing adaptors into native chromatin, as a rapid and sensitive method for integrative epigenomic analysis. ATAC-seq captures open chromatin sites using a simple two-step protocol with 500-50,000 cells and reveals the interplay between genomic locations of open chromatin, DNA-binding proteins, individual nucleosomes and chromatin compaction at nucleotide resolution. We discovered classes of DNA-binding factors that strictly avoided, could tolerate or tended to overlap with nucleosomes. Using ATAC-seq maps of human CD4(+) T cells from a proband obtained on consecutive days, we demonstrated the feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making.

4,984 citations

Journal Article•DOI•

Sequence logos: a new way to display consensus sequences

Thomas D. Schneider, R. M. Stephens

25 Oct 1990-Nucleic Acids Research

TL;DR: From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content at every position in a site or sequence.

Abstract: A graphical method is presented for displaying the patterns in a set of aligned sequences. The characters representing the sequence are stacked on top of each other for each position in the aligned sequences. The height of each letter is made proportional to its frequency, and the letters are sorted so the most common one is on top. The height of the entire stack is then adjusted to signify the information content of the sequences at that position. From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content (measured in bits) at every position in a site or sequence. The logo displays both significant residues and subtle sequence patterns.

3,232 citations
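
The stacking rule described in the sequence logo abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `logo_heights`, the DNA alphabet default, and the toy input are assumptions made for the example, and the small-sample correction used in practice is deliberately omitted. For each aligned column, the stack height is taken as the information content in bits (log2 of the alphabet size minus the Shannon entropy of the column), and each letter's height is its frequency times that stack height.

```python
import math
from collections import Counter


def logo_heights(aligned_seqs, alphabet="ACGT"):
    """Per-column letter heights (in bits) for a sequence logo.

    Hypothetical helper for illustration. Assumes equal-length sequences
    containing only letters from `alphabet`; no small-sample correction.
    Each column is returned as a list of (letter, height) pairs sorted by
    ascending height, so the most common letter comes last (top of stack).
    """
    n_cols = len(aligned_seqs[0])
    max_bits = math.log2(len(alphabet))  # 2 bits for DNA
    columns = []
    for i in range(n_cols):
        counts = Counter(seq[i] for seq in aligned_seqs)
        total = sum(counts[ch] for ch in alphabet)
        freqs = {ch: counts[ch] / total for ch in alphabet}
        # Shannon entropy of the column, skipping letters that never occur
        entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
        info = max_bits - entropy  # total stack height at this position
        stack = sorted(
            ((ch, f * info) for ch, f in freqs.items() if f > 0),
            key=lambda pair: pair[1],
        )
        columns.append(stack)
    return columns


if __name__ == "__main__":
    # Toy alignment: column 1 is invariant, column 2 is split between two
    # bases, column 3 is uniform over all four bases.
    for position, stack in enumerate(logo_heights(["AAA", "AAC", "ACG", "ACT"]), 1):
        print(position, stack)
```

In this toy alignment the invariant column carries the full 2 bits, the evenly split column carries 1 bit shared between its two letters, and the uniform column carries no information, so its letters have zero height.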