Posted ContentDOI

Brain expression quantitative trait locus and network analysis reveals downstream effects and putative drivers for brain-related diseases

TL;DR: The authors harmonized and integrated 8,727 RNA-seq samples with accompanying genotype data from multiple brain regions across 14 datasets and performed both cis- and trans-expression quantitative trait locus (eQTL) mapping.
Abstract: Gaining insight into the downstream consequences of non-coding variants is an essential step towards the identification of therapeutic targets from genome-wide association study (GWAS) findings. Here we have harmonized and integrated 8,727 RNA-seq samples with accompanying genotype data from multiple brain regions across 14 datasets. This sample size enabled us to perform both cis- and trans-expression quantitative trait locus (eQTL) mapping. Upon comparing the brain cortex cis-eQTLs (for 12,307 unique genes at FDR […] We inferred the brain cell type for 1,515 cis-eQTLs by using cell type proportion information. We conducted Mendelian randomization on 31 brain-related traits using cis-eQTLs as instruments and found 159 significant findings that also passed colocalization. Furthermore, two multiple sclerosis (MS) findings had cell-type-specific signals: a neuron-specific cis-eQTL for CYP24A1 and a macrophage-specific cis-eQTL for CLECL1. To further interpret GWAS hits, we performed trans-eQTL analysis. We identified 2,589 trans-eQTLs (at FDR […] We also generated a brain-specific gene co-regulation network that we used to predict which genes have brain-specific functions, and to perform a novel network analysis of Alzheimer’s disease (AD), amyotrophic lateral sclerosis (ALS), multiple sclerosis (MS) and Parkinson’s disease (PD) GWAS data. This resulted in the identification of distinct sets of genes that show significantly enriched co-regulation with genes inside the associated GWAS loci, and which might reflect drivers of these diseases.
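The abstract describes Mendelian randomization with cis-eQTLs as instruments but does not say which estimator was used. As a minimal sketch only, the following assumes the single-instrument Wald ratio, a common choice when one variant proxies one gene's expression. The function names and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
import math


def wald_ratio(beta_outcome, se_outcome, beta_exposure):
    """Single-instrument MR estimate (assumed estimator, not confirmed by the source).

    beta_outcome:  variant effect on the trait (GWAS effect size)
    se_outcome:    standard error of beta_outcome
    beta_exposure: variant effect on gene expression (cis-eQTL effect size)
    """
    if beta_exposure == 0:
        raise ValueError("the instrument must have a non-zero effect on expression")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: uncertainty in the eQTL effect is ignored.
    standard_error = abs(se_outcome / beta_exposure)
    return estimate, standard_error


def two_sided_p(estimate, standard_error):
    """Two-sided p-value under a normal approximation."""
    z = estimate / standard_error
    return math.erfc(abs(z) / math.sqrt(2))
```

For example, `wald_ratio(0.10, 0.02, 0.5)` gives an estimated effect of expression on the trait of 0.2 with a standard error of 0.04. A result like this would still need a colocalization check, as the abstract notes.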
Citations
Posted ContentDOI
18 Mar 2021-medRxiv
TL;DR: All ALS associated signals combined reveal a role for perturbations in vesicle mediated transport and autophagy, and provide evidence for cell-autonomous disease initiation in glutamatergic neurons.
Abstract: Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease with a life-time risk of 1 in 350 people and an unmet need for disease-modifying therapies. We conducted a cross-ancestry GWAS in ALS including 29,612 ALS patients and 122,656 controls which identified 15 risk loci in ALS. When combined with 8,953 whole-genome sequenced individuals (6,538 ALS patients, 2,415 controls) and the largest cortex-derived eQTL dataset (MetaBrain), analyses revealed locus-specific genetic architectures in which we prioritized genes either through rare variants, repeat expansions or regulatory effects. ALS associated risk loci were shared with multiple traits within the neurodegenerative spectrum, but with distinct enrichment patterns across brain regions and cell-types. Across environmental and life-style risk factors obtained from literature, Mendelian randomization analyses indicated a causal role for high cholesterol levels. All ALS associated signals combined reveal a role for perturbations in vesicle mediated transport and autophagy, and provide evidence for cell-autonomous disease initiation in glutamatergic neurons.

110 citations

Journal ArticleDOI
TL;DR: This article performed an eQTL analysis using single-nuclei RNA sequencing from 192 individuals in eight brain cell types derived from the prefrontal cortex, temporal cortex and white matter, and identified 7,607 eGenes, a substantial fraction (46%, 3,537/7,607) of which show cell-type-specific effects with strongest effects in microglia.
Abstract: To date, most expression quantitative trait loci (eQTL) studies, which investigate how genetic variants contribute to gene expression, have been performed in heterogeneous brain tissues rather than specific cell types. In this study, we performed an eQTL analysis using single-nuclei RNA sequencing from 192 individuals in eight brain cell types derived from the prefrontal cortex, temporal cortex and white matter. We identified 7,607 eGenes, a substantial fraction (46%, 3,537/7,607) of which show cell-type-specific effects, with strongest effects in microglia. Cell-type-level eQTLs affected more constrained genes and had larger effect sizes than tissue-level eQTLs. Integration of brain cell type eQTLs with genome-wide association studies (GWAS) revealed novel relationships between expression and disease risk for neuropsychiatric and neurodegenerative diseases. For most GWAS loci, a single gene co-localized in a single cell type, providing new clues into disease etiology. Our findings demonstrate substantial contrast in genetic regulation of gene expression among brain cell types and reveal potential mechanisms by which disease risk genes influence brain disorders. Bryois et al. mapped genetic variants regulating gene expression in eight major brain cell types. They found a large number of cell-type-specific genetic effects and leveraged their results to identify novel putative risk genes for brain disorders.

48 citations

Journal ArticleDOI
TL;DR: In this paper, the authors map e/sQTLs and allele-specific expression in cultured cells representing two major developmental stages, primary human neural progenitors and their sorted neuronal progeny, identifying numerous loci not detected in either bulk developing cortical wall or adult cortex.
Abstract: Interpretation of the function of non-coding risk loci for neuropsychiatric disorders and brain-relevant traits via gene expression and alternative splicing quantitative trait locus (e/sQTL) analyses is generally performed in bulk post-mortem adult tissue. However, genetic risk loci are enriched in regulatory elements active during neocortical differentiation, and regulatory effects of risk variants may be masked by heterogeneity in bulk tissue. Here, we map e/sQTLs, and allele-specific expression in cultured cells representing two major developmental stages, primary human neural progenitors (n = 85) and their sorted neuronal progeny (n = 74), identifying numerous loci not detected in either bulk developing cortical wall or adult cortex. Using colocalization and genetic imputation via transcriptome-wide association, we uncover cell-type-specific regulatory mechanisms underlying risk for brain-relevant traits that are active during neocortical differentiation. Specifically, we identified a progenitor-specific eQTL for CENPW co-localized with common variant associations for cortical surface area and educational attainment.

27 citations

Journal ArticleDOI
TL;DR: This study uses Mendelian randomisation (MR) to investigate the causal effect of metformin targets on Alzheimer's disease and the potential causal mechanisms in the brain linking the two.
Abstract: Metformin use has been associated with reduced incidence of dementia in diabetic individuals in observational studies. However, the causality between the two in the general population is unclear. This study uses Mendelian randomisation (MR) to investigate the causal effect of metformin targets on Alzheimer's disease and potential causal mechanisms in the brain linking the two. Genetic proxies for the effects of metformin drug targets were identified as variants in the gene for the corresponding target that were associated with HbA1c level (N=344,182) and expression level of the corresponding gene (N≤31,684). The cognitive outcomes were derived from genome-wide association studies comprising 527,138 middle-aged Europeans, including 71,880 with Alzheimer's disease or Alzheimer's disease-by-proxy. MR estimates representing lifelong metformin use on Alzheimer's disease and cognitive function in the general population were generated. The effect of expression level of 22 metformin-related genes in brain cortex (N=6,601 donors) on Alzheimer's disease was further estimated. Genetically proxied metformin use, equivalent to a 6.75 mmol/mol (1.09%) reduction in HbA1c, was associated with 4% lower odds of Alzheimer's disease (OR 0.96 [95% CI 0.95, 0.98], p=1.06×10⁻⁴) in non-diabetic individuals. One metformin target, mitochondrial complex 1 (MCI), showed a robust effect on Alzheimer's disease (OR 0.88, p=4.73×10⁻⁴) that was independent of AMP-activated protein kinase. MR of expression in brain cortex tissue showed that decreased MCI-related gene (NDUFA2) expression was associated with lower Alzheimer's disease risk (OR 0.95, p=4.64×10⁻⁴) and favourable cognitive function. Metformin use may cause reduced Alzheimer's disease risk in the general population. Mitochondrial function and the NDUFA2 gene are plausible mechanisms of action in dementia protection.

11 citations

References
Journal ArticleDOI
TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure outperforms other aligners by a factor of >50 in mapping speed.
Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

Journal ArticleDOI
TL;DR: edgeR is a Bioconductor software package for examining differential expression of replicated count data. It uses an overdispersed Poisson model to account for both biological and technical variability, and empirical Bayes methods to moderate the degree of overdispersion across transcripts, improving the reliability of inference.
Abstract: Summary: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. Availability: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).
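To make "overdispersed Poisson" concrete, the sketch below assumes the negative binomial mean-variance relationship, in which the variance grows quadratically with the mean. This parameterisation is an assumption made for illustration; the function names are hypothetical and this is not edgeR's API.

```python
import math


def overdispersed_variance(mean, dispersion):
    """Variance of a count with the given mean under a negative binomial model.

    Assumed form: variance = mean + dispersion * mean**2.
    A dispersion of 0 recovers the Poisson case (variance == mean).
    """
    if mean < 0 or dispersion < 0:
        raise ValueError("mean and dispersion must be non-negative")
    return mean + dispersion * mean ** 2


def biological_cv(dispersion):
    """Coefficient of variation implied by the dispersion alone (its square root)."""
    if dispersion < 0:
        raise ValueError("dispersion must be non-negative")
    return math.sqrt(dispersion)
```

With a mean count of 100, a dispersion of 0 gives a variance of 100 (pure Poisson), while a dispersion of 0.1 gives a variance of about 1,100. This is why ignoring overdispersion would overstate the evidence for differential expression.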

29,413 citations

Journal ArticleDOI
TL;DR: This work presents HTSeq, a Python library to facilitate the rapid development of custom scripts for high-throughput sequencing data analysis, and htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes.
Abstract: Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability and implementation: HTSeq is released as open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. Contact: sanders@fs.tum.de
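The idea of counting the overlap of reads with genes can be sketched in plain Python. This is not the HTSeq API and does not reproduce htseq-count's actual rules. It assumes half-open intervals, ignores strand and splicing, and counts a read only when it overlaps exactly one gene. The function name and the bucket labels for unassigned reads are illustrative.

```python
from collections import Counter


def count_reads_per_gene(genes, reads):
    """Count reads that overlap exactly one gene.

    genes: dict mapping gene name -> (chrom, start, end), half-open intervals
    reads: iterable of (chrom, start, end), half-open intervals
    Reads overlapping no gene or several genes go to separate buckets.
    """
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in reads:
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and g_start < end
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif not hits:
            counts["__no_feature"] += 1
        else:
            counts["__ambiguous"] += 1
    return counts
```

The linear scan over all genes keeps the sketch short. A real tool would use an interval index so that each read is checked against nearby features only.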

15,744 citations

Journal ArticleDOI
13 Jun 2019-Cell
TL;DR: A strategy to "anchor" diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities.

7,892 citations

Journal ArticleDOI
TL;DR: An analytical strategy for integrating scRNA-seq data sets based on common sources of variation is introduced, enabling the identification of shared populations across data sets and downstream comparative analysis.
Abstract: Computational single-cell RNA-seq (scRNA-seq) methods have been successfully applied to experiments representing a single condition, technology, or species to discover and define cellular phenotypes. However, identifying subpopulations of cells that are present across multiple data sets remains challenging. Here, we introduce an analytical strategy for integrating scRNA-seq data sets based on common sources of variation, enabling the identification of shared populations across data sets and downstream comparative analysis. We apply this approach, implemented in our R toolkit Seurat (http://satijalab.org/seurat/), to align scRNA-seq data sets of peripheral blood mononuclear cells under resting and stimulated conditions, hematopoietic progenitors sequenced using two profiling technologies, and pancreatic cell 'atlases' generated from human and mouse islets. In each case, we learn distinct or transitional cell states jointly across data sets, while boosting statistical power through integrated analysis. Our approach facilitates general comparisons of scRNA-seq data sets, potentially deepening our understanding of how distinct cell states respond to perturbation, disease, and evolution.

7,741 citations
