Journal ArticleDOI

Genetic effects on promoter usage are highly context-specific and contribute to complex traits.

TL;DR: Promoters, splicing and 3ʹ ends were found to be predominantly controlled by independent genetic variants enriched in distinct genomic features, suggesting that promoter usage might be an underappreciated molecular mechanism mediating complex trait associations in a context-specific manner.
Abstract: Genetic variants regulating RNA splicing and transcript usage have been implicated in both common and rare diseases. Although transcript usage quantitative trait loci (tuQTLs) have been mapped across multiple cell types and contexts, it is challenging to distinguish between the main molecular mechanisms controlling transcript usage: promoter choice, splicing and 3' end choice. Here, we analysed RNA-seq data from human macrophages exposed to three inflammatory and one metabolic stimulus. In addition to conventional gene-level and transcript-level analyses, we also directly quantified promoter usage, splicing and 3' end usage. We found that promoters, splicing and 3' ends were predominantly controlled by independent genetic variants enriched in distinct genomic features. Promoter usage QTLs were also 50% more likely to be context-specific than other tuQTLs and constituted 25% of the transcript-level colocalisations with complex traits. Thus, promoter usage might be an underappreciated molecular mechanism mediating complex trait associations in a context-specific manner.
Citations
28 Nov 2013
TL;DR: Using the ImmunoChip custom genotyping array, this article analyzed 14,498 subjects with multiple sclerosis and 24,091 healthy controls for 161,311 autosomal variants and identified 135 potentially associated regions (P < 1.0 × 10(-4)).
Abstract: Using the ImmunoChip custom genotyping array, we analyzed 14,498 subjects with multiple sclerosis and 24,091 healthy controls for 161,311 autosomal variants and identified 135 potentially associated regions (P < 1.0 × 10(-4)). In a replication phase, we combined these data with previous genome-wide association study (GWAS) data from an independent 14,802 subjects with multiple sclerosis and 26,703 healthy controls. In these 80,094 individuals of European ancestry, we identified 48 new susceptibility variants (P < 5.0 × 10(-8)), 3 of which we found after conditioning on previously identified variants. Thus, there are now 110 established multiple sclerosis risk variants at 103 discrete loci outside of the major histocompatibility complex. With high-resolution Bayesian fine mapping, we identified five regions where one variant accounted for more than 50% of the posterior probability of association. This study enhances the catalog of multiple sclerosis risk variants and illustrates the value of fine mapping in the resolution of GWAS signals.

152 citations

Journal ArticleDOI
TL;DR: This article discusses new technologies that can extend the standard regulatory mapping framework to more diverse, disease-relevant cell types and states, and suggests an alternative strategy drawing on the dynamic and highly context-specific nature of gene regulation.

134 citations

Journal ArticleDOI
TL;DR: The eQTL Catalogue is a resource of quality-controlled, uniformly re-computed gene expression and splicing QTLs from 21 studies, whose summary statistics can be used to gain insight into complex human traits through downstream analyses such as fine mapping and co-localization.
Abstract: Many gene expression quantitative trait locus (eQTL) studies have published their summary statistics, which can be used to gain insight into complex human traits by downstream analyses, such as fine mapping and co-localization. However, technical differences between these datasets are a barrier to their widespread use. Consequently, target genes for most genome-wide association study (GWAS) signals have still not been identified. In the present study, we present the eQTL Catalogue ( https://www.ebi.ac.uk/eqtl ), a resource of quality-controlled, uniformly re-computed gene expression and splicing QTLs from 21 studies. We find that, for matching cell types and tissues, the eQTL effect sizes are highly reproducible between studies. Although most QTLs were shared between most bulk tissues, we identified a greater diversity of cell-type-specific QTLs from purified cell types, a subset of which also manifested as new disease co-localizations. Our summary statistics are freely available to enable the systematic interpretation of human GWAS associations across many cell types and tissues.

122 citations

Journal ArticleDOI
TL;DR: A global multiomics view of IFNα-induced changes in human beta cells at the level of chromatin, mRNA and protein expression is provided.
Abstract: Interferon-α (IFNα), a type I interferon, is expressed in the islets of type 1 diabetic individuals, and its expression and signaling are regulated by T1D genetic risk variants and viral infections associated with T1D. We presently characterize human beta cell responses to IFNα by combining ATAC-seq, RNA-seq and proteomics assays. The initial response to IFNα is characterized by chromatin remodeling, followed by changes in transcriptional and translational regulation. IFNα induces changes in alternative splicing (AS) and first exon usage, increasing the diversity of transcripts expressed by the beta cells. This, combined with changes observed on protein modification/degradation, ER stress and MHC class I, may expand antigens presented by beta cells to the immune system. Beta cells also up-regulate the checkpoint proteins PDL1 and HLA-E that may exert a protective role against the autoimmune assault. Data mining of the present multi-omics analysis identifies two compound classes that antagonize IFNα effects on human beta cells.

74 citations

Posted ContentDOI
29 Jan 2020-bioRxiv
TL;DR: The eQTL Catalogue is presented, a resource containing quality-controlled, uniformly recomputed QTLs from 21 eQTL studies; for matching cell types and tissues, the eQTL effect sizes are highly reproducible between studies, enabling the integrative analysis of these data.
Abstract: An increasing number of gene expression quantitative trait locus (QTL) studies have made summary statistics publicly available, which can be used to gain insight into human complex traits by downstream analyses such as fine-mapping and colocalisation. However, differences between these datasets in their variants tested, allele codings, and in the transcriptional features quantified are a barrier to their widespread use. Here, we present the eQTL Catalogue, a resource which contains quality controlled, uniformly re-computed QTLs from 19 eQTL publications. In addition to gene expression QTLs, we have also identified QTLs at the level of exon expression, transcript usage, and promoter, splice junction and 3ʹ end usage. Our summary statistics can be downloaded by FTP or accessed via a REST API and are also accessible via the Open Targets Genetics Portal. We demonstrate how the eQTL Catalogue and GWAS Catalog APIs can be used to perform colocalisation analysis between GWAS and QTL results without downloading and reformatting summary statistics. New datasets will continuously be added to the eQTL Catalogue, enabling systematic interpretation of human GWAS associations across a large number of cell types and tissues. The eQTL Catalogue is available at https://www.ebi.ac.uk/eqtl/.

62 citations


Cites background or methods or result from "Genetic effects on promoter usage a..."

  • ...Where splicing or transcript-level QTL summary statistics have been released, these are still difficult to compare between studies due to large differences in analysis strategy and the types of transcript-level changes captured by different methods [15]....

    [...]

  • ...This analysis replicated the previously reported colocalisation with CD40 promoter usage in stimulated macrophages [15] (Figure 3B); however, the same analysis applied to monocyte-specific eQTLs strongly supported a model of distinct causal variants underlying the eQTL and GWAS association in all four studies (Figure 3C)....

    [...]

  • ...Transcriptional event usage is quantified with txrevise [15]....

    [...]

  • ...Finally, even though both splicing [1,14] and other transcript-level QTLs [15] contribute to complex traits, these analyses have not been performed on many earlier RNA-seq-based eQTL datasets....

    [...]

  • ...3 core [17] RNA-seq pipeline written in the Nextflow [18] framework and modified it to support the quantification of four different molecular phenotypes: gene expression, exon expression [19], transcript usage, and promoter, splicing and 3ʹ end usage events defined by txrevise [15] (Supplementary Figure 3)....

    [...]

References
Journal ArticleDOI
TL;DR: In this article, a model is described in an lmer call by a formula, in this case including both fixed- and random-effects terms; the formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters.
Abstract: Maximum likelihood or restricted maximum likelihood (REML) estimates of the parameters in linear mixed-effects models can be determined using the lmer function in the lme4 package for R. As for most model-fitting functions in R, the model is described in an lmer call by a formula, in this case including both fixed- and random-effects terms. The formula and data together determine a numerical representation of the model from which the profiled deviance or the profiled REML criterion can be evaluated as a function of some of the model parameters. The appropriate criterion is optimized, using one of the constrained optimization functions in R, to provide the parameter estimates. We describe the structure of the model, the steps in evaluating the profiled deviance or REML criterion, and the structure of classes or types that represents such a model. Sufficient detail is included to allow specialization of these structures by users who wish to write functions to fit specialized linear mixed models, such as models incorporating pedigrees or smoothing splines, that are not easily expressible in the formula language used by lmer.

50,607 citations


"Genetic effects on promoter usage a..." refers methods in this paper

  • ...Furthermore, to take advantage of our profiling of gene expression in overlapping set of donors in the stimulated and naive conditions, we also included the cell line as a random effect and fitted a linear mixed model using the lme4 (Bates et al., 2015) package....

    [...]

Journal ArticleDOI
TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure outperforms other aligners by a factor of >50 in mapping speed.
Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

Journal ArticleDOI
TL;DR: featureCounts is a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments; it implements highly efficient chromosome hashing and feature blocking techniques.
Abstract: MOTIVATION: Next-generation sequencing technologies generate millions of short sequence reads, which are usually aligned to a reference genome. In many applications, the key information required for downstream analysis is the number of reads mapping to each genomic feature, for example to each exon or each gene. The process of counting reads is called read summarization. Read summarization is required for a great variety of genomic analyses but has so far received relatively little attention in the literature. RESULTS: We present featureCounts, a read summarization program suitable for counting reads generated from either RNA or genomic DNA sequencing experiments. featureCounts implements highly efficient chromosome hashing and feature blocking techniques. It is considerably faster than existing methods (by an order of magnitude for gene-level summarization) and requires far less computer memory. It works with either single or paired-end reads and provides a wide range of options appropriate for different sequencing applications. AVAILABILITY AND IMPLEMENTATION: featureCounts is available under GNU General Public License as part of the Subread (http://subread.sourceforge.net) or Rsubread (http://www.bioconductor.org) software packages.

14,103 citations


"Genetic effects on promoter usage a..." refers background or methods in this paper

  • ...0 (Liao et al., 2014) to count the number of uniquely mapping fragments overlapping transcript annotations from Ensembl 87....

    [...]

  • ...count quantified with featureCounts (Liao et al., 2014), (ii) full-length transcript usage quantified with Salmon (Patro et al....

    [...]

  • ...In each condition, we quantified gene expression and transcript usage using the following established quantification approaches: (i) gene-level read count quantified with featureCounts (Liao et al., 2014), (ii) full-length transcript usage quantified with Salmon (Patro et al....

    [...]

01 Jan 2001

8,336 citations


"Genetic effects on promoter usage a..." refers background in this paper

  • ...Jones E, Oliphant T, Peterson P. SciPy: Open source scientific tools for Python [Internet]. citeulike.org; 2001--....

    [...]

  • ...We converted QTLtools p-values to z-scores using the stats.norm.ppf(p/2, loc=0, scale=1) function from SciPy [61], where p is the p-value from QTLtools....

    [...]

  • ...ppf(p/2, loc = 0, scale = 1) function from SciPy (Jones et al., 2001), where p is the nominal p-value from QTLtools....

    [...]
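
The p-value to z-score conversion quoted in the excerpts above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it uses `statistics.NormalDist.inv_cdf` from the Python standard library as a stand-in for `scipy.stats.norm.ppf` (both evaluate the inverse CDF of the standard normal), and the function name `p_to_z` and its optional `effect` argument are hypothetical. Because `p/2` is at most 0.5, the quantile is never positive, so the sketch assumes the sign is restored from the direction of the estimated effect; the excerpts do not say how the sign was handled.

```python
from statistics import NormalDist

# Standard normal distribution (loc=0, scale=1), matching the quoted call.
_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def p_to_z(p, effect=None):
    """Convert a two-sided nominal p-value to a z-score.

    inv_cdf(p / 2) is the lower-tail quantile, so the raw result is <= 0.
    `effect` is a hypothetical argument (the estimated effect size); when it
    is positive, the z-score is flipped to carry the same sign.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    if effect is not None and effect > 0:
        z = -z
    return z
```

For example, `p_to_z(0.05)` is approximately -1.96, and `p_to_z(0.05, effect=0.3)` is approximately +1.96.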

Journal ArticleDOI
Stephan Ripke, Benjamin M. Neale +351 more; Institutions (102)
24 Jul 2014-Nature
TL;DR: Associations at DRD2 and several genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses.
Abstract: Schizophrenia is a highly heritable disorder. Genetic risk is conferred by a large number of alleles, including common alleles of small effect that might be detected by genome-wide association studies. Here we report a multi-stage schizophrenia genome-wide association study of up to 36,989 cases and 113,075 controls. We identify 128 independent associations spanning 108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported. Associations were enriched among genes expressed in brain, providing biological plausibility for the findings. Many findings have the potential to provide entirely new insights into aetiology, but associations at DRD2 and several genes involved in glutamatergic neurotransmission highlight molecules of known and potential therapeutic relevance to schizophrenia, and are consistent with leading pathophysiological hypotheses. Independent of genes expressed in brain, associations were enriched among genes expressed in tissues that have important roles in immunity, providing support for the speculated link between the immune system and schizophrenia.

6,809 citations
