Showing papers by "Alexander Stark" published in 2022


Journal Article
TL;DR: DeepSTARR, the deep-learning model introduced in this paper, predicts the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells, and learns relevant TF motifs and higher-order syntax rules.
Abstract: Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood, and de novo enhancer design has been challenging. Here, we built a deep-learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally nonequivalent instances of the same TF motif that are determined by motif-flanking sequence and intermotif distances. We validated these rules experimentally and demonstrated that they can be generalized to humans by testing more than 40,000 wildtype and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo. A deep-learning model called DeepSTARR quantitatively predicts enhancer activity on the basis of DNA sequence. The model learns relevant motifs and syntax rules, allowing for the design of synthetic enhancers with specific strengths.

63 citations


Journal Article
TL;DR: In this article, the authors categorized human enhancers by their cofactor dependencies and showed that these categories provide a framework to understand the sequence and chromatin diversity of enhancers and their roles in different gene-regulatory programs.
Abstract: All multicellular organisms rely on differential gene transcription regulated by genomic enhancers, which function through cofactors that are recruited by transcription factors [1,2]. Emerging evidence suggests that not all cofactors are required at all enhancers [3–5], yet whether these observations reflect more general principles or distinct types of enhancers remained unknown. Here we categorized human enhancers by their cofactor dependencies and show that these categories provide a framework to understand the sequence and chromatin diversity of enhancers and their roles in different gene-regulatory programmes. We quantified enhancer activities along the entire human genome using STARR-seq [6] in HCT116 cells, following the rapid degradation of eight cofactors. This analysis identified different types of enhancers with distinct cofactor requirements, sequences and chromatin properties. Some enhancers were insensitive to the depletion of the core Mediator subunit MED14 or the bromodomain protein BRD4 and regulated distinct transcriptional programmes. In particular, canonical Mediator [7] seemed dispensable for P53-responsive enhancers, and MED14-depleted cells induced endogenous P53 target genes. Similarly, BRD4 was not required for the transcription of genes that bear CCAAT boxes and a TATA box (including histone genes and LTR12 retrotransposons) or for the induction of heat-shock genes. This categorization of enhancers through cofactor dependencies reveals distinct enhancer types that can bypass broadly utilized cofactors, which illustrates how alternative ways to activate transcription separate gene expression programmes and provide a conceptual framework to understand enhancer function and regulatory specificity. The systematic categorization of human enhancers by their cofactor dependencies provides a conceptual framework to understand the sequence and chromatin diversity of enhancers and their roles in different gene-regulatory programmes.

36 citations


Journal Article
TL;DR: In this paper, the authors demonstrate that wild-type-level housekeeping gene transcription requires the Iswi and Ino80 remodelers to maintain nucleosome positioning and phasing at promoters.

8 citations


Journal Article
27 Dec 2022-bioRxiv
TL;DR: In this paper, the authors used functional genomics to uncover regulatory specificities between co-repressors (CoRs) and enhancers, revealing the existence of TF motif-based regulatory rules that coordinate CoR-enhancer compatibilities.
Abstract: Animal development and homeostasis critically depend on the accurate regulation of gene transcription, which includes the silencing of genes that should not be expressed. Repression is mediated by a specific class of transcription factors (TFs) termed repressors that, via the recruitment of co-repressors (CoRs), can dominantly prevent transcription, even in the presence of activating cues. However, the relationship between specific CoRs and enhancers has remained unclear. Here, we used functional genomics to uncover regulatory specificities between CoRs and enhancers. We show that enhancers can typically be repressed by only a subset of CoRs. Enhancers classified by CoR sensitivity also show distinct biological functions and endogenous chromatin features. Moreover, enhancers that are sensitive or resistant to silencing by specific CoRs differ in TF motif content, and their sensitivity to CoRs can be predicted based on TF motif content. Finally, we identified and validated specific TF motifs that have a direct impact on enhancer sensitivity or resistance towards specific CoRs, using large-scale motif mutagenesis and addition experiments. This study reveals the existence of TF motif-based regulatory rules that coordinate CoR-enhancer compatibilities. These specificities between repressors and activators not only suggest that repression occurs via distinct mechanisms, but also provide an additional layer in transcriptional regulation that allows for differential repression at close genomic distances and offers multiple ways for de-repression.

4 citations


Journal Article
TL;DR: The high-throughput next-generation-sequencing-based method Repressive-Domain (RD)-seq is developed to systematically identify RDs in complex libraries and shows that RDs which contain one of five distinct repressive motifs interact with and depend on different CoRs, including Groucho, CtBP, Sin3A or Smrter.
Abstract: All multicellular life relies on differential gene expression, determined by regulatory DNA elements and DNA-binding transcription factors that mediate activation and repression via cofactor recruitment. While activators have been extensively characterized, repressors are less well studied and their repressive domains (RDs) are typically unknown, as are the RDs’ properties and the co-repressors (CoRs) they recruit. Here, we develop the high-throughput next-generation-sequencing-based method Repressive-Domain (RD)-seq to systematically identify RDs in complex libraries. Screening more than 200,000 fragments covering the coding sequences of all transcription-related proteins in Drosophila melanogaster, we identify 195 RDs in known repressors and in proteins not previously associated with repression. Many RDs contain recurrent short peptide motifs that are required for RD function, as demonstrated by motif mutagenesis, and are conserved between fly and human. Moreover, we show that RDs which contain one of five distinct repressive motifs interact with and depend on different CoRs, including Groucho, CtBP, Sin3A or Smrter. Overall, our work constitutes an invaluable resource and advances our understanding of repressors, their sequences, and the functional impact of sequence-altering mutations.

4 citations


Journal Article
TL;DR: Different promoter classes direct distinct mechanisms of transcription initiation, which relate to focused versus dispersed initiation patterns. The study also identifies factors that can recruit and/or stabilize TFIIA at housekeeping promoters and activate transcription.
Abstract: Recruitment of RNA polymerase II (Pol II) to promoter regions is essential for transcription. Despite conflicting evidence, the Pol II Pre-Initiation Complex (PIC) is often thought to be of uniform composition and assemble at all promoters via an identical mechanism. Here, we show using Drosophila melanogaster S2 cells as a model that promoter classes with distinct functions and initiation patterns function via PICs that display different compositions and dependencies: developmental promoter DNA readily associates with the canonical Pol II PIC, whereas housekeeping promoter DNA does not and instead recruits different factors such as DREF. Consistently, TBP and DREF are required by distinct sets of promoters, and TBP and its paralog TRF2 function at different promoter types, partly exclusively and partly redundantly. In contrast, TFIIA is required for transcription from all promoters, and we identify factors that can recruit and/or stabilize TFIIA at housekeeping promoters and activate transcription. We show that promoter activation by these factors is sufficient to induce the dispersed transcription initiation patterns characteristic of housekeeping promoters. Thus, different promoter classes direct distinct mechanisms of transcription initiation, which relate to different focused versus dispersed initiation patterns.

2 citations


Journal Article
TL;DR: This article presents a proteome-scale functional screen that systematically uncovers human proteins that can activate transcription.

1 citation


Posted Content
01 Sep 2022-bioRxiv
TL;DR: Replacing important motifs with an exhaustive set of all 65,536 possible eight-nucleotide-long sequences, and pasting eight important TF motif types into 763 positions within 496 enhancers, reveals that enhancers display constrained sequence flexibility and context-specific modulation of motif function.
Abstract: The information about when and where each gene is to be expressed is mainly encoded in the DNA sequence of enhancers, sequence elements that comprise binding sites (motifs) for different transcription factors (TFs). Most of the research on enhancer sequences has been focused on TF motif presence, while the enhancer syntax, i.e. the flexibility of important motif positions and how the sequence context modulates the activity of TF motifs, remains poorly understood. Here, we explore the rules of enhancer syntax by a two-pronged approach in Drosophila melanogaster S2 cells: we (1) replace important motifs by an exhaustive set of all possible 65,536 eight-nucleotide-long random sequences and (2) paste eight important TF motif types into 763 positions within 496 enhancers. These complementary strategies reveal that enhancers display constrained sequence flexibility and the context-specific modulation of motif function. Important motifs can be functionally replaced by hundreds of sequences constituting several distinct motif types, but only a fraction of all possible sequences and motif types restore enhancer activity. Moreover, TF motifs contribute with different intrinsic strengths that are strongly modulated by the enhancer sequence context (the flanking sequence, presence and diversity of other motif types, and distance between motifs), such that not all motif types can work in all positions. The context-specific modulation of motif function is also a hallmark of human enhancers and TF motifs, as we demonstrate experimentally. Overall, these two general principles of enhancer sequences are important to understand and predict enhancer function during development, evolution and in disease.

1 citation