
Cell-type specific cis-eQTLs in eight brain cell-types identifies novel risk genes for human brain disorders

TLDR
In this paper, the authors performed an eQTL analysis using single nuclei RNA-seq from 196 individuals in eight central nervous system (CNS) cell types and identified 6108 eGenes, a substantial fraction of which show cell-type specific effects, with strongest effects in microglia.
Abstract
Most expression quantitative trait loci (eQTL) studies to date have been performed in heterogeneous brain tissues as opposed to specific cell types. To investigate the genetics of gene expression in adult human cell types from the central nervous system (CNS), we performed an eQTL analysis using single nuclei RNA-seq from 196 individuals in eight CNS cell types. We identified 6108 eGenes, a substantial fraction (43%, 2620 out of 6108) of which show cell-type specific effects, with strongest effects in microglia. Integration of CNS cell-type eQTLs with GWAS revealed novel relationships between expression and disease risk for neuropsychiatric and neurodegenerative diseases. For most GWAS loci, a single gene colocalized in a single cell type providing new clues into disease etiology. Our findings demonstrate substantial contrast in genetic regulation of gene expression among CNS cell types and reveal genetic mechanisms by which disease risk genes influence neurological disorders.

Julien Bryois¹, Daniela Calini¹, Will Macnair¹, Lynette Foo¹, Eduard Urich¹, Ward Ortmann², Victor Alejandro Iglesias², Suresh Selvaraj², Erik Nutma³, Manuel Marzin³, Sandra Amor³,⁴, Anna Williams⁵, Gonçalo Castelo-Branco⁶,⁷, Vilas Menon⁸, Philip De Jager⁸,⁹, Dheeraj Malhotra¹

¹ Neuroscience and Rare Diseases (NRD), F. Hoffmann-La Roche Ltd, Grenzacherstrasse, Basel, Switzerland.
² Genentech, South San Francisco, California, USA
³ Pathology Department, VUmc, Amsterdam UMC, Amsterdam, The Netherlands
⁴ Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, United Kingdom
⁵ Centre for Regenerative Medicine, Institute for Regeneration and Repair, University of Edinburgh, Edinburgh, EH16 4UU, UK
⁶ Laboratory of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
⁷ Ming Wai Lau Centre for Reparative Medicine, Stockholm node, Karolinska Institutet, Stockholm, Sweden
⁸ Center for Translational & Computational Neuroimmunology, Department of Neurology, Columbia University Irving Medical Center, New York, USA
⁹ Columbia Multiple Sclerosis Center, Department of Neurology, Columbia University Irving Medical Center, New York, USA
medRxiv preprint doi: https://doi.org/10.1101/2021.10.09.21264604; this version posted October 14, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Introduction:
Most genetic associations from genome-wide association studies (GWAS) of brain disorders
lie within non-coding regions of the genome, challenging the identification of risk genes [1], and
CNS cell types in which these risk variants regulate gene expression. Expression quantitative
trait loci (eQTLs) (i.e. genomic regions that explain variation in gene expression levels) have
become a powerful tool to uncover the molecular underpinnings of variants associated with
complex traits and diseases in non-coding regions of the genome [2–4].
Importantly, eQTL relationships are highly dependent upon cell type [5], cell state and
developmental stage of the human brain [6–8], consistent with recent transcriptomic studies that
show prominent temporal and cell type specific changes in expression [9,10]. However, most prior
eQTL studies were done using bulk human brain tissues and have been partially successful in
prioritizing disease risk genes by integrating GWAS results with tissue-level eQTLs. A few
recent studies have investigated eQTLs in specific CNS cell types, for example, cis-eQTLs in
dopaminergic neurons derived from induced pluripotent stem cells [11], and in sorted primary
human microglia [12,13]. To dissect the functional genetic variation of late-onset neuropsychiatric
and neurological diseases, we leveraged single-nucleus gene expression analysis from 196
adult human brain tissues (both cortical grey matter and deep white matter) to perform a
systematic eQTL analysis in all major adult human CNS cell types.
Here, we present the first single-cell based map of eQTLs in eight human CNS cell-types. We
show substantial cell-type specific effects in the genetic control of gene expression. Integration
of cell-type eQTLs with GWAS shows that, at most GWAS loci, a single gene colocalizes in a
single CNS cell-type thereby not only providing insights into disease-relevant genes but also
identifying new putative mechanisms of risk genes that have been missed by bulk tissue-level
analyses.
Figure 1: Study summary
We performed single nuclei RNA-seq on brain samples from 196 genotyped donors. We mapped cis-eQTLs for 8
major brain cell types and identified a total of 6108 cis-eQTL genes. We identified cell type specific genetic effects
and leveraged our results to identify risk genes for brain disorders.
[Figure 1 schematic: 196 donors; single nuclei RNA-seq and genotyping; pseudo-bulk expression in 8 brain cell types (oligodendrocytes, excitatory neurons, inhibitory neurons, astrocytes, microglia, OPCs / COPs, endothelial cells, pericytes); eQTL mapping (6108 eGenes in brain cell types); GWAS integration.]
Results
Robust identification, fine mapping and functional characterization of brain cell type cis-eQTLs
Our goal was to identify genetic variants regulating gene expression in CNS cell types.
Therefore, we performed single nuclei RNA-seq and genotyping in 246 human brain samples
from 123 independent individuals (Table S1). We integrated our dataset with 127 human brain
samples from published single-cell transcriptomic studies that were previously genotyped [14–16],
resulting in a total of 373 brain samples from 215 individuals. After quality control,
normalization, sample integration and genotype imputation (Online Methods), we obtained
gene expression data for 7208-10846 genes (protein coding and non-coding) and genotypes for
5.3 million SNPs in 196 individuals for 8 major CNS cell types (Figure 1) expressing clear
canonical markers of cell type identity (Figure S1).
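
The pseudo-bulk step shown in Figure 1 can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes raw counts are summed over all nuclei of one cell type per donor and then expressed as log2(CPM + 1), which is our reading of the "Expression (cpm) (log2+1)" axis label in Figure 2. The function name and argument layout are hypothetical.

```python
import numpy as np

def pseudobulk_log_cpm(counts, donor_ids, cell_types, target_type):
    """Sum nucleus-level counts per donor for one cell type, then log2(CPM + 1).

    counts      : (n_nuclei, n_genes) array of raw counts
    donor_ids   : length n_nuclei sequence of donor labels
    cell_types  : length n_nuclei sequence of cell-type labels
    target_type : cell type to aggregate
    Returns (sorted donor labels, (n_donors, n_genes) expression matrix).
    Assumes every retained donor has at least one non-zero count.
    """
    counts = np.asarray(counts, dtype=float)
    donor_ids = np.asarray(donor_ids)
    keep = np.asarray(cell_types) == target_type
    donors = np.unique(donor_ids[keep])
    # One pseudo-bulk profile per donor: sum over that donor's nuclei.
    summed = np.vstack([
        counts[keep & (donor_ids == d)].sum(axis=0) for d in donors
    ])
    # Counts per million, scaled by each donor's total count in this cell type.
    library_size = summed.sum(axis=1, keepdims=True)
    cpm = summed / library_size * 1e6
    return donors, np.log2(cpm + 1.0)
```

Aggregating to one profile per donor and cell type turns the eQTL problem into the familiar one-observation-per-individual setting, at the cost of noisier estimates for cell types with few nuclei.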
We identified cis-eQTLs by testing all SNPs within a 1 megabase (Mb) window surrounding
the transcription start site (TSS) of each expressed gene while adjusting for known covariates
(study, disease status) and inferred covariates (first principal components (PCs) of the
genotypes and of the expression data) (Online Methods). We discovered 6108 genes with a cis-eQTL at a 5%
false discovery rate (FDR) across the 8 different CNS cell types (Figure 2A, Table S2). This
number represents only a small fraction of the potential cis-eQTL discoveries, as we estimate
that at least 10-50% of the tested genes have an eQTL across the different cell types (Figure
S2, Online Methods). Most cis-eQTLs replicated in a large tissue-level cortical eQTL study
(Metabrain [17]), with 72.1-82.3% of the SNP-gene pairs having a p-value < 0.05 in this larger
study (Figure S3A). Cis-eQTLs that did not replicate (p > 0.05) affected more constrained genes
(Figure S3B) and were located further from the TSS than replicating cis-eQTLs (Figure S3C).
Neuronal cis-eQTLs replicated at a higher rate (80.1%, pi1 = 86%) than glial cis-eQTLs (75%,
pi1 = 77%) (Figure 2C), possibly because some glial cell types are less prevalent than neurons
in the cortex. The number of detected eQTLs varied significantly between cell types (Figure
2A) (e.g. 2114 eQTLs in excitatory neurons but only 23 in pericytes) and was highly
correlated with the total number of nuclei belonging to the cell type (Figure 2B). This suggests
that more nuclei allow for a better quantification of gene expression and, ultimately, the
discovery of more eQTLs.
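
As a rough illustration of the cis-eQTL test described above, the sketch below selects SNPs near a TSS and regresses expression on genotype dosage while adjusting for covariates. It is not the authors' software: it assumes the window extends 1 Mb on either side of the TSS (as the axis of Figure 2D suggests), uses ordinary least squares, and omits the multiple-testing correction. The function names are hypothetical.

```python
import numpy as np
from scipy import stats

def cis_snps(snp_pos, tss, window=1_000_000):
    """Indices of SNPs within `window` bp of the TSS (same chromosome assumed)."""
    snp_pos = np.asarray(snp_pos)
    return np.flatnonzero(np.abs(snp_pos - tss) <= window)

def eqtl_test(expr, dosage, covariates):
    """OLS of expression on genotype dosage plus covariates.

    expr       : (n,) expression values, one per individual
    dosage     : (n,) alternate-allele dosage (0, 1, 2 or imputed)
    covariates : (n, k) matrix, e.g. study, disease status, genotype/expression PCs
    Returns (effect size, two-sided p-value) for the genotype term.
    """
    expr = np.asarray(expr, dtype=float)
    n = expr.shape[0]
    X = np.column_stack([np.ones(n), dosage, covariates])
    coef, _, rank, _ = np.linalg.lstsq(X, expr, rcond=None)
    resid = expr - X @ coef
    dof = n - rank
    sigma2 = resid @ resid / dof
    # Standard error of the genotype coefficient (column 1 of X).
    se = np.sqrt(sigma2 * np.linalg.pinv(X.T @ X)[1, 1])
    t_stat = coef[1] / se
    return coef[1], 2 * stats.t.sf(abs(t_stat), dof)
```

In practice, the best p-value per gene would then be corrected for the number of SNPs tested in the window and for the number of genes, to reach the 5% FDR threshold quoted in the text.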
Figure 2: Cis-eQTL discoveries
A) Number of cis-eQTLs per cell type (5% FDR). B) Number of cis-eQTLs (5% FDR) versus number of single
nuclei belonging to the cell types. C) Replication p-values for each SNP-gene pair of our cis-eQTL discoveries
(aggregated for glia and neurons) in a large cortical eQTL study [17]. D) Enrichment of the discovered cis-eQTLs
around the TSS. E) LOEUF scores [18] for genes with a cis-eQTL (5% FDR) and genes without a cis-eQTL
(constrained genes have low LOEUF scores, unconstrained genes have high LOEUF scores). The black horizontal
bars indicate the medians. F) Examples of cis-eQTLs in astrocytes and microglia. G) Examples of cis-eQTLs with
a fine-mapped SNP (causal probability of 0.94 for GRIN2A and 0.87 for GLUD1).
As expected, cis-eQTLs were enriched around the TSS (Figure 2D) and more frequently found
upstream of the gene or within the gene body than downstream of the gene (Figure S4) (2679
in gene body, 2156 upstream, 1273 downstream). We found that 48-59% of the top cis-eQTL
SNPs affected the closest gene (depending on the cell type) (Figure S5, Online Methods).
Genes with a cis-eQTL were found to be less constrained than genes without a cis-eQTL
(Figure 2E), suggesting that less constrained genes are also more tolerant to variability in gene
expression levels. Genes with an eQTL in glial classes had on average higher expression levels
than genes without an eQTL (Wilcoxon p-value = 5×10^-9), while the opposite was true for
neurons (Wilcoxon p-value = 1×10^-5) (Figure S6). Altogether, these results suggest that cell type
level cis-eQTLs have similar properties to tissue level cis-eQTLs [2].
We next investigated whether any of the genes with a cis-eQTL had additional independent
cis-eQTLs. We found that 126 genes had a secondary cis-eQTL, with the majority being
discovered in excitatory neurons and oligodendrocytes (112 genes) (Figure S7A,B).
Independent cis-eQTLs were enriched around the TSS of the target gene but located at a larger
distance from the TSS than the main cis-eQTLs (mean distance = 243kb vs 113kb, Wilcoxon
p = 2×10^-9) (Figure S7C,D). Larger sample sizes will be necessary to identify more
independent cis-eQTLs for cell types from the CNS.

[Figure 2 panel content. A: number of cis-eQTLs (5% FDR) per cell type. B: number of cis-eQTLs (5% FDR) versus number of cells, both on log scales. C: histograms of Metabrain cortex matched p-values (glia: prop = 74.8%, pi1 = 0.77; neurons: prop = 80.1%, pi1 = 0.86). D: counts of cis-eQTLs by distance to TSS (-1e+06 to 1e+06) for each cell type. F: NSUN2 rs567289 in astrocytes (adjusted p-value = 1.2e-21) and HLA-DRB5 rs9272105 in microglia (adjusted p-value = 1.7e-16). G: GRIN2A rs12932206 in oligodendrocytes (adjusted p-value = 1.1e-26) and GLUD1 rs17096421 in excitatory neurons (adjusted p-value = 7.5e-08). Panels F and G plot expression, log2(cpm + 1), against genotype.]
We also performed fine-mapping to identify putative causal SNPs for our cis-eQTLs (Table
S3, Online Methods). As expected, fine-mapped SNPs with higher probability of being causal
were more likely to overlap epigenomic marks [19,20] (Figure S8A). Overall, we fine-mapped
413 cis-eQTLs (causal probability > 50%) (Figure S8B). Examples of fine-mapped SNPs with
high causal probabilities include rs12932206 as the likely causal SNP for the oligodendrocyte
GRIN2A eQTL (probability = 0.94) (Figure 2G), a gene associated with schizophrenia [21], and
rs17096421 as the likely causal SNP for the GLUD1 excitatory neuron eQTL
(probability = 0.87) (Figure 2G), a gene with an important role in inhibitory synapse formation [22].
We reasoned that CNS cell-type eQTLs could overlap with CNS cell-type specific regulatory
regions defined by snATAC-seq [20] and sought to test this as an external validation (Figure 3A,
Online Methods). We found that ATAC-seq peaks specific to glial cell types (astrocytes,
microglia, oligodendrocytes and OPCs / COPs) were enriched around the cis-eQTLs
discovered in the same cell types but not in the other cell types, suggesting that the discovered
glial eQTLs fall in regions of the genome functionally relevant to cell-type specific gene
regulation. Surprisingly, we did not observe enrichment of neuronal cis-eQTLs in
neuron-specific ATAC-seq peaks, neuron-specific ChIP-seq marks [19] (H3K4me3 and H3K27ac)
(Figure S9) or sorted nuclei bulk ATAC-seq [23] (Figure S10). The lack of enrichment of
neuron-specific regulatory regions around neuronal cis-eQTLs was robust to multiple
attempts at falsification (Online Methods), suggesting that the discovered neuronal cis-eQTLs
might be less cell-type specific than glial cis-eQTLs.
Cell-type specific eQTL effects
We used a negative binomial mixed model to investigate how many of the 6108 cis-eQTLs had
a significantly different effect size in the test cell type compared to a) the average effect size
across the 7 other cell types, b) at least one of the 7 other cell types, and c) all 7 other cell types
(Online Methods). We found 1932, 2620 and 192 genes (5% FDR), respectively, for these
comparisons (Figure 3B, Table S4). eGenes from a) and b) were positionally
enriched around the TSS compared to shared eQTLs (Figure S11A, B). Among all cell-type
specific eGenes, excitatory neuron specific eGenes were significantly less constrained
compared to the shared neuronal eQTL genes (Figure S12A, B). Interestingly, the 192
cell-type specific eGenes whose effect size differs from that in all other cell types are more constrained
(Figure S12C), with the strongest evidence for microglia specific cis-eQTL genes (which
were also enriched in Alzheimer's disease genetic associations (Figure S13)). Examples of these genes
include CHRM5, GRIN2A and MSRA in oligodendrocytes (Figure 3D, Figure S14), or
RNF150 in microglia and CPQ in OPCs / COPs (Figure 3E,F). GRIN2A is associated with
schizophrenia [21], and MSRA was shown to protect dopaminergic neurons from cell death [24].
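
The three definitions of cell-type specificity above (a, b and c) can be made concrete with a deliberately simplified sketch. It does not reproduce the negative binomial mixed model used in the paper: it assumes per-cell-type effect sizes and standard errors are already available, treats them as independent (which they are not, since the same donors contribute to every cell type), and applies an uncorrected z-test at a fixed alpha. All names are hypothetical.

```python
import numpy as np
from scipy import stats

def diff_pvalue(b1, se1, b2, se2):
    """Two-sided z-test that two effect sizes differ (independence assumed)."""
    z = (b1 - b2) / np.sqrt(se1 ** 2 + se2 ** 2)
    return 2 * stats.norm.sf(abs(z))

def specificity_calls(betas, ses, test_idx, alpha=0.05):
    """Compare the effect in one cell type with the effects in all others.

    Returns booleans for: a) differs from the average of the other cell types,
    b) differs from at least one other cell type, c) differs from all others.
    """
    betas = np.asarray(betas, dtype=float)
    ses = np.asarray(ses, dtype=float)
    others = np.delete(np.arange(betas.size), test_idx)
    # a) average effect of the other cell types and its standard error.
    mean_b = betas[others].mean()
    mean_se = np.sqrt(np.sum(ses[others] ** 2)) / others.size
    p_avg = diff_pvalue(betas[test_idx], ses[test_idx], mean_b, mean_se)
    # b) and c) pairwise comparisons against each other cell type.
    p_pair = np.array([
        diff_pvalue(betas[test_idx], ses[test_idx], betas[j], ses[j])
        for j in others
    ])
    return {
        "vs_average": bool(p_avg < alpha),
        "vs_at_least_one": bool((p_pair < alpha).any()),
        "vs_all": bool((p_pair < alpha).all()),
    }
```

Criterion c) is the most stringent because every pairwise comparison must be significant, which is consistent with it yielding the fewest genes (192) in the text.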
Frequently Asked Questions (11)
Q1. What have the authors contributed in "Cell-type specific cis-eQTLs in eight brain cell-types identifies novel risk genes for human brain disorders"?

To investigate the genetics of gene expression in adult human cell types from the central nervous system (CNS), the authors performed an eQTL analysis using single nuclei RNA-seq from 196 individuals in eight CNS cell types. The authors identified 6108 eGenes, a substantial fraction (43%, 2620 out of 6108) of which show cell-type specific effects, with strongest effects in microglia.

In summary, their study provides a systematic investigation of eQTLs in cell-types of the adult human brain, defines a reference data set of CNS cell-type specific eQTLs and provides a foundational resource of high-confidence colocalized genes in disease-relevant cell types for robust future functional studies of neurodegenerative disease mechanisms using appropriate iPSC-based human cell models. 

CTSB plays an essential role in lysosomal degradation of α-synuclein [37], while TOMM7 is a small subunit of the TOM complex that is essential for the binding of PINK1 (a gene associated with monogenic forms of the disease) to the TOM complex.

SNPs with imputation score <0.4 or with missingness greater than 5% were excluded, as well as individuals with more than 2% of missing genotypes. 
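
The genotype filters quoted above can be sketched as boolean masks. This is only an illustration under stated assumptions: missing calls are encoded as NaN, both missingness rates are computed on the unfiltered matrix (the text does not say in which order the filters were applied), and the function name is hypothetical.

```python
import numpy as np

def genotype_qc(genotypes, info_score, min_info=0.4,
                max_snp_missing=0.05, max_ind_missing=0.02):
    """Return (keep_individuals, keep_snps) boolean masks.

    genotypes  : (n_individuals, n_snps) array, np.nan for missing calls
    info_score : (n_snps,) imputation quality score per SNP
    """
    g = np.asarray(genotypes, dtype=float)
    missing = np.isnan(g)
    # Drop SNPs with imputation score < 0.4 or missingness > 5%.
    keep_snps = (np.asarray(info_score) >= min_info) & \
                (missing.mean(axis=0) <= max_snp_missing)
    # Drop individuals with more than 2% missing genotypes.
    keep_individuals = missing.mean(axis=1) <= max_ind_missing
    return keep_individuals, keep_snps
```

The filtered matrix is then `genotypes[keep_individuals][:, keep_snps]`.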

SampleQC allowed us to identify a subset of cells in many samples with both high splice ratios and high mitochondrial proportions (90% of reads being spliced), which were excluded. 

The risk column indicates whether an increase in gene expression leads to an increase in disease risk (red), a decrease in disease risk (blue) or whether the colocalization signal is due to a splicing QTL (orange). 

VM and PDJ provided snucRNAseq and whole genome sequencing data on a subset of AD samples and provided critical inputs on the interpretation of the results. 

INPP5D is a phosphatase that hydrolyzes phosphatidylinositol-3,4,5-trisphosphate into phosphatidylinositol 3,4-diphosphate, which specifically binds to PLEKHA1 [51], a gene recently associated with AD through a proteome-wide association study [52].

Microglia showed the strongest evidence for cell-type specific genetic effects, with an estimate that 60-92% of the discovered cis-eQTLs have a different genetic effect in the other cell types, reflecting their unique developmental origin.

Estimates of the proportion of true alternative hypotheses (i.e. the proportion of genes with a cis-eQTL, Figure S2) were obtained using the pi1 statistic from the qvalue R package [25].
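
For intuition, the pi1 statistic can be approximated as one minus Storey's estimate of the null proportion pi0. The sketch below is a simplification, not the qvalue implementation: it uses a single fixed tuning value lambda = 0.5, whereas the qvalue package by default smooths estimates over a grid of lambda values. The function name is hypothetical.

```python
import numpy as np

def pi1_fixed_lambda(pvalues, lam=0.5):
    """Estimate the proportion of true alternatives, pi1 = 1 - pi0.

    Under the null, p-values are uniform, so the fraction above `lam`
    divided by (1 - lam) estimates the null proportion pi0.
    """
    p = np.asarray(pvalues, dtype=float)
    pi0 = np.mean(p > lam) / (1.0 - lam)
    return 1.0 - min(pi0, 1.0)
```

Applied to the replication p-values of discovered SNP-gene pairs, a value such as pi1 = 0.86 would mean that an estimated 86% of those pairs carry a true signal in the replication dataset, including pairs that do not individually reach p < 0.05.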

Most prior eQTL studies were done using bulk human brain tissues and have been partially successful in prioritizing disease risk genes by integrating GWAS results with tissue-level eQTLs.