Journal Article

Transcription factor–DNA binding: beyond binding site motifs

01 Apr 2017-Current Opinion in Genetics & Development (Curr Opin Genet Dev)-Vol. 43, pp 110-119
TL;DR: This review highlights novel approaches for characterizing functional binding site motifs that promise to inform the understanding of how TF binding controls gene expression and ultimately contributes to phenotype.
About: This article is published in Current Opinion in Genetics & Development. The article was published on 2017-04-01 and is currently open access. It has received 219 citations to date. The article focuses on the topics: DNA binding site & Enhancer.
Citations
Journal Article
30 Sep 2019-Genes
TL;DR: The aim of this review is to illustrate the potential application of TF genes for stress tolerance improvement and the engineering of resistant crops, with an emphasis on sorghum.
Abstract: In field conditions, crops are adversely affected by a wide range of abiotic stresses including drought, cold, salt, and heat, as well as biotic stresses including pests and pathogens. These stresses can have a marked effect on crop yield. The present and future effects of climate change necessitate the improvement of crop stress tolerance. Plants have evolved sophisticated stress response strategies, and genes that encode transcription factors (TFs) that are master regulators of stress-responsive genes are excellent candidates for crop improvement. Related examples in recent studies include TF gene modulation and overexpression approaches in crop species to enhance stress tolerance. However, much remains to be discovered about the diverse plant TFs. Of the >80 TF families, only a few, such as NAC, MYB, WRKY, bZIP, and ERF/DREB, with vital roles in abiotic and biotic stress responses have been intensively studied. Moreover, although significant progress has been made in deciphering the roles of TFs in important cereal crops, fewer TF genes have been elucidated in sorghum. As a model drought-tolerant crop, sorghum research warrants further focus. This review summarizes recent progress on major TF families associated with abiotic and biotic stress tolerance and their potential for crop improvement, particularly in sorghum. Other TF families and non-coding RNAs that regulate gene expression are discussed briefly. Despite the emphasis on sorghum, numerous examples from wheat, rice, maize, and barley are included. Collectively, the aim of this review is to illustrate the potential application of TF genes for stress tolerance improvement and the engineering of resistant crops, with an emphasis on sorghum.

285 citations


Cites background from "Transcription factor–DNA binding: b..."

  • ...[12] highlighted recent findings that elaborate how TF interactions, local DNA structure, and genomic features can influence TF binding to DNA....

    [...]

Journal Article
TL;DR: This review covers recent developments in the prediction of enhancers based on chromatin characteristics and their identification by functional reporter assays and endogenous DNA perturbations and surveys how these approaches advance the understanding of transcription regulation with respect to promoter specificity and transcriptional bursting.
Abstract: Enhancers are important genomic regulatory elements directing cell type-specific transcription. They assume a key role during development and disease, and their identification and functional characterization have long been the focus of scientific interest. The advent of next-generation sequencing and clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-based genome editing has revolutionized the means by which we study enhancer biology. In this review, we cover recent developments in the prediction of enhancers based on chromatin characteristics and their identification by functional reporter assays and endogenous DNA perturbations. We discuss that the two latter approaches provide different and complementary insights, especially in assessing enhancer sufficiency and necessity for transcription activation. Furthermore, we discuss recent insights into mechanistic aspects of enhancer function, including findings about cofactor requirements and the role of post-translational histone modifications such as monomethylation of histone H3 Lys4 (H3K4me1). Finally, we survey how these approaches advance our understanding of transcription regulation with respect to promoter specificity and transcriptional bursting and provide an outlook covering open questions and promising developments.

176 citations

Journal Article
TL;DR: The authors discuss the sources of biological and non-biological zeros in single-cell RNA-seq data; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types (observed counts, imputed counts, and binarized counts); discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.
Abstract: Researchers view vast zeros in single-cell RNA-seq data differently: some regard zeros as biological signals representing no or low gene expression, while others regard zeros as missing data to be corrected. To help address the controversy, here we discuss the sources of biological and non-biological zeros; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types: observed counts, imputed counts, and binarized counts; discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.

176 citations

Journal Article
TL;DR: The updated EnhancerAtlas 2.0 is a major expansion of the first version, which contained only enhancers in human cells, and it adds predicted enhancer-target gene relationships in human, mouse and fly.
Abstract: Enhancers are distal cis-regulatory elements that activate the transcription of their target genes. They regulate a wide range of important biological functions and processes, including embryogenesis, development, and homeostasis. As more and more large-scale technologies were developed for enhancer identification, a comprehensive database is highly desirable for enhancer annotation based on various genome-wide profiling datasets across different species. Here, we present an updated database EnhancerAtlas 2.0 (http://www.enhanceratlas.org/indexv2.php), covering 586 tissue/cell types that include a large number of normal tissues, cancer cell lines, and cells at different development stages across nine species. Overall, the database contains 13 494 603 enhancers, which were obtained from 16 055 datasets using 12 high-throughput experiment methods (e.g. H3K4me1/H3K27ac, DNase-seq/ATAC-seq, P300, POLR2A, CAGE, ChIA-PET, GRO-seq, STARR-seq and MPRA). The updated version is a huge expansion of the first version, which only contains the enhancers in human cells. In addition, we predicted enhancer-target gene relationships in human, mouse and fly. Finally, the users can search enhancers and enhancer-target gene relationships through five user-friendly, interactive modules. We believe the new annotation of enhancers in EnhancerAtlas 2.0 will facilitate users to perform useful functional analysis of enhancers in various genomes.

174 citations


Cites background from "Transcription factor–DNA binding: b..."

  • ...Especially for the ‘TF-binding’ track, it could contain dozens of datasets for different TFs....

    [...]

  • ...The consensus enhancers in EnhancerAtlas 2.0 were identified based on twelve high-throughput experimental approaches, including P300 (12), Histone (10), POLR2A (13,21), TF-binding (11), DHS (or ATAC) (8,9), FAIRE (16), MNase-seq (14,15), GRO-seq (6), STARR-seq (5), CAGE (2), ChIA-PET (20) and MPRA (17)....

    [...]

  • ...The TFs often regulate gene expression by binding to the DNA regulatory elements (11)....

    [...]


01 Jan 2013
TL;DR: The authors found that noncoding variants from genome-wide association studies are concentrated in deoxyribonuclease I hypersensitive sites (DHSs), that 88% of such DHSs are active during fetal development, and that these DHSs are enriched in variants associated with gestational exposure-related phenotypes.
Abstract: Genome-wide association studies have identified many noncoding variants associated with common diseases and traits. We show that these variants are concentrated in regulatory DNA marked by deoxyribonuclease I (DNase I) hypersensitive sites (DHSs). Eighty-eight percent of such DHSs are active during fetal development and are enriched in variants associated with gestational exposure–related phenotypes. We identified distant gene targets for hundreds of variant-containing DHSs that may explain phenotype associations. Disease-associated variants systematically perturb transcription factor recognition sequences, frequently alter allelic chromatin states, and form regulatory networks. We also demonstrated tissue-selective enrichment of more weakly disease-associated variants within DHSs and the de novo identification of pathogenic cell types for Crohn’s disease, multiple sclerosis, and an electrocardiogram trait, without prior knowledge of physiological mechanisms. Our results suggest pervasive involvement of regulatory DNA variation in common human disease and provide pathogenic insights into diverse disorders.

171 citations

References
Journal Article
06 Sep 2012-Nature
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of the authors' genes and genome, and is an expansive resource of functional annotations for biomedical research.
Abstract: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

13,548 citations

Journal Article
Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, +514 more (90 institutions)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

Journal Article
TL;DR: The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps, and all of the motif-based tools are now implemented as web services via Opal.
Abstract: The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms—MAST, FIMO and GLAM2SCAN—allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm Tomtom. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and Tomtom), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net.

7,733 citations


"Transcription factor–DNA binding: b..." refers methods in this paper

  • ...org/ Collection of motif databases and web-based tools for motif discovery, enrichment, scanning, and comparison Eukaryotes and prokaryotes Collates external motif and sequence databases for multiple species [104,105]...

    [...]

Journal Article
TL;DR: The feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making is demonstrated and classes of DNA-binding factors that strictly avoided, could tolerate or tended to overlap with nucleosomes are discovered.
Abstract: We describe an assay for transposase-accessible chromatin using sequencing (ATAC-seq), based on direct in vitro transposition of sequencing adaptors into native chromatin, as a rapid and sensitive method for integrative epigenomic analysis. ATAC-seq captures open chromatin sites using a simple two-step protocol with 500 to 50,000 cells and reveals the interplay between genomic locations of open chromatin, DNA-binding proteins, individual nucleosomes and chromatin compaction at nucleotide resolution. We discovered classes of DNA-binding factors that strictly avoided, could tolerate or tended to overlap with nucleosomes. Using ATAC-seq maps of human CD4+ T cells from a proband obtained on consecutive days, we demonstrated the feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making.

4,984 citations

Journal Article
TL;DR: From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content at every position in a site or sequence.
Abstract: A graphical method is presented for displaying the patterns in a set of aligned sequences. The characters representing the sequence are stacked on top of each other for each position in the aligned sequences. The height of each letter is made proportional to its frequency, and the letters are sorted so the most common one is on top. The height of the entire stack is then adjusted to signify the information content of the sequences at that position. From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content (measured in bits) at every position in a site or sequence. The logo displays both significant residues and subtle sequence patterns.
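
The stack heights described in this abstract can be illustrated with a short calculation. The following is a minimal sketch, not the authors' implementation: it assumes a DNA alphabet of A, C, G and T, omits the small-sample correction used in practice, and the function name `logo_heights` and the example sites are hypothetical. For each aligned position it computes the information content in bits as log2(4) minus the Shannon entropy of the base frequencies, then scales each letter's height by its frequency.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: DNA sites containing only these four bases


def logo_heights(aligned_sites):
    """Return one (information_bits, {base: letter_height}) pair per column.

    Hypothetical helper for illustration; no small-sample correction is applied.
    """
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all aligned sites must have the same length")

    columns = []
    for i in range(length):
        counts = Counter(site[i] for site in aligned_sites)
        total = sum(counts[b] for b in ALPHABET)
        freqs = {b: counts[b] / total for b in ALPHABET}
        # Shannon entropy of the column, skipping bases that never occur.
        entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
        # Information content: maximum possible entropy minus observed entropy.
        info = math.log2(len(ALPHABET)) - entropy
        # Each letter's height is its frequency times the stack height.
        heights = {b: f * info for b, f in freqs.items()}
        columns.append((info, heights))
    return columns


if __name__ == "__main__":
    example_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]  # made-up sites
    for pos, (info, heights) in enumerate(logo_heights(example_sites), start=1):
        # Sort ascending so the most common letter is listed last (top of stack).
        stack = sorted(heights.items(), key=lambda kv: kv[1])
        shown = ", ".join(f"{b}={h:.2f}" for b, h in stack if h > 0)
        print(f"position {pos}: {info:.2f} bits ({shown})")
```

A fully conserved position reaches the 2-bit maximum for DNA, while a position with mixed bases yields a shorter stack, which is how a logo conveys both the consensus and its strength at a glance.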

3,232 citations


"Transcription factor–DNA binding: b..." refers background in this paper

  • ...These PWMs can be represented graphically as sequence logos [90]....

    [...]