
A map of transcriptional heterogeneity and regulatory variation in human microglia

Q: What are the contributions in "A map of transcriptional heterogeneity and regulatory variation in human microglia" ?

The authors mapped expression quantitative trait loci (eQTLs) in human microglia and show that many disease-associated eQTLs in microglia replicate well in a human induced pluripotent stem cell (hIPSC) derived macrophage model system. Using ATAC-seq from 95 individuals in this hIPSC model, the authors fine-map candidate causal variants at risk loci for Alzheimer’s disease, the most prevalent neurodegenerative condition in acute brain injury patients.

Q: How was the iPS cell culture performed?

Samples were added to a 1.5 ml Eppendorf to which 350 µl of RNAlater (Qiagen) was added, and were stored at -80°C prior to sequencing. DNA extraction was performed from the venous blood.

Q: How many ng of gDNA was used for input for the SNP array?

200 ng of gDNA was used for input for the SNP array (Infinium Omni2.5-8 v1.4 Kit) and genotyping was performed according to the manufacturer's instructions.

Q: How many GWAS loci were used to identify independently associated SNPs?

Across 36 associated loci the authors used GCTA to identify independently associated SNPs with a threshold of p < 10⁻⁵, based on LD from 10,000 randomly-sampled UK Biobank individuals.

Q: What did the authors use to keep the posterior genotype dosage identical to the prior genotype dosage?

The authors used the --no-posterior-update option to keep the posterior genotype dosage identical to the prior genotype dosage, which stabilised the convergence of model fitting.

Q: How was the white cell component extracted?

The white cell component was extracted and transferred to a 1.5 ml Eppendorf and stored as a frozen pellet at -80°C prior to sequencing.

Q: How many cells passed the quality control criteria?

In total the authors sequenced 26,496 cells, of which 9,538 cells passed the quality control criteria: the minimum number of sequenced fragments (>10,000 autosomal fragments), the minimum number of expressed genes (>500 autosomal genes), mitochondrial fragment percentage (<20%) and the library complexity (percentage of autosomal fragment counts for the top 100 highly expressed genes<30%).
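As a rough illustration of how the four criteria above combine, the sketch below applies them to per-cell summary statistics. The `CellQC` record, its field names and the example values are hypothetical; only the four thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell summary statistics (hypothetical field names)."""
    autosomal_fragments: int  # sequenced fragments mapped to autosomes
    expressed_genes: int      # autosomal genes with non-zero expression
    mito_percent: float       # % of fragments from mitochondrial genes
    top100_percent: float     # % of autosomal fragments in the top 100 genes


def passes_qc(cell: CellQC) -> bool:
    """Apply the four thresholds quoted in the text."""
    return (
        cell.autosomal_fragments > 10_000
        and cell.expressed_genes > 500
        and cell.mito_percent < 20.0
        and cell.top100_percent < 30.0
    )


# Made-up cells for illustration only.
cells = [
    CellQC(25_000, 3_200, 4.1, 18.0),   # passes every criterion
    CellQC(8_000, 900, 3.0, 15.0),      # too few fragments
    CellQC(40_000, 2_500, 35.0, 22.0),  # mitochondrial fraction too high
]
kept = [cell for cell in cells if passes_qc(cell)]
```

A cell is kept only if it satisfies all four conditions at once, which is how the text's figure of 9,538 retained cells out of 26,496 should be read.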

Q: What was the GWAS for the BIN1 and PTK2B loci?

For the BIN1 and PTK2B loci, the authors used GCTA --cojo-cond to determine summary statistics for each of the two independent signals at each locus, with a window of +/- 500 kb around each lead SNP.

Adam MH Young¹, Natsuhiko Kumasaka¹, Fiona Calvert¹, Timothy R. Hammond², Andrew J Knights¹, Nikolaos Panousis¹, Jeremy Schwartzentruber³, Jimmy Z. Liu⁴, Kousik Kundu¹, Michael Segel⁵, Natalia A. Murphy⁵, Christopher E McMurran⁵, Harry Bulstrode⁵, Jason Correia⁵, Karol P. Budohoski⁵, Alexis J Joannides⁵, Mathew R. Guilfoyle⁵, Rikin A. Trivedi⁵, Ramez W. Kirollos⁵, Robert H. Morris⁵, Matthew R. Garnett⁵, Helen M Fernandes⁵, Ivan Timofeev⁵, Ibrahim Jalloh⁵, Katherine Holland⁵, Richard Mannion⁵, Richard Mair⁵, Colin Watts⁵, Stephen J. Price⁵, Peter J. Kirkpatrick⁵, Thomas Santarius⁵, Nicole Soranzo¹, Beth Stevens², Peter J. Hutchinson⁵, Robin J.M. Franklin⁵, Daniel J. Gaffney¹

Wellcome Trust Sanger Institute¹, Howard Hughes Medical Institute², European Bioinformatics Institute³, Biogen Idec⁴, University of Cambridge⁵

20 Dec 2019 - bioRxiv (Cold Spring Harbor Laboratory)

TL;DR: This study provides the first population-scale transcriptional map of a critically important cell for neurodegenerative disorders and fine-maps candidate causal variants at risk loci for Alzheimer’s disease.


Abstract: Microglia, the tissue resident macrophages of the CNS, are implicated in a broad range of neurological pathologies, from acute brain injury to dementia. Here, we profiled gene expression variation in primary human microglia isolated from 141 patients undergoing neurosurgery. Using single cell and bulk RNA sequencing, we defined distinct cellular populations of acutely in vivo-activated microglia, and characterised a dramatic switch in microglial population composition in patients suffering from acute brain injury. We mapped expression quantitative trait loci (eQTLs) in human microglia and show that many disease-associated eQTLs in microglia replicate well in a human induced pluripotent stem cell (hIPSC) derived macrophage model system. Using ATAC-seq from 95 individuals in this hIPSC model we fine-map candidate causal variants at risk loci for Alzheimer’s disease, the most prevalent neurodegenerative condition in acute brain injury patients. Our study provides the first population-scale transcriptional map of a critically important cell for neurodegenerative disorders.


Summary (3 min read)


Introduction

Microglia are tissue resident macrophages of the central nervous system and play critical roles in neurological immune defence, development and homeostasis (Schafer and Stevens 2015; Q. Li and Barres 2018; Salter and Stevens 2017).
Two populations (C and D) were common in patients with acute brain injury (25-76% of cells) but rare in other pathologies (<5% of cells).
This analysis retrieved colocalisations at other AD GWAS loci, such as CD33 and CASS4.
The authors also used the three-way model to evaluate the extent of sharing between the microglia eQTLs, IPSDM or monocyte eQTLs, and AD risk loci.
The authors identified multiple microglial subpopulations and showed how these populations are shaped by insult, injury and other life history factors.

Tissue sampling

Human brain tissue was obtained with informed consent under protocol REC 16/LO/2168 approved by the NHS Health Research Authority.
Adult brain tissue biopsies were taken from the site of neurosurgery resection for the original clinical indication.
Paired venous blood was sampled at the induction of anaesthesia.

Dissociation of brain tissue

The prepared mix was spun in HBSS+ (Life Technologies) at 300g for 5 mins and supernatant discarded.
The digested tissue was rigorously triturated at 4°C and filtered through a 70 µm nylon cell strainer to remove large cell debris and undigested tissue.
Supernatant was discarded and the pellet was re-suspended in ice cold supplemented HALF.

Fluorescence-activated cell sorting

For single cell smart sequencing, human microglia were isolated using fluorescence-activated cell sorting.
The isolated cell suspension was incubated with conjugated PE anti-human CD11b antibody for 20 mins at 4°C.
Cells were washed twice in ice cold supplemented HALF and stained with Helix NP viability marker.
Cell sorting was performed on BD AriaIII cell sorter (Becton, Dickinson and Company, Franklin Lakes, New Jersey, US) at the University of Cambridge Cell Phenotyping Hub at Cambridge University Hospital, Cambridge, UK.
Cells were sorted into 96 well plates, prepared by the Wellcome Trust Sanger Institute for the purposes of single cell sequencing.

Magnetic-activated cell sorting

To avoid sustained stress on microglia as a result of prolonged sorting times, magnetic-activated cell sorting was used for bulk sequencing.
The isolated cell suspension was incubated with anti-CD11b conjugated magnetic beads for 15 mins at 4°C.
Cells were washed twice with supplemented HALF and passed through an MS column.
Each sample was washed three times in the column and then extracted.

Blood preparation

DNA extraction was performed from the venous blood.
10 ml of whole blood was washed with 1% phosphate buffered saline (PBS) and layered on pancoll human (PAN biotech) and spun at 500g for 25 mins.
The white cell component was extracted and transferred to a 1.5 ml Eppendorf and stored as a frozen pellet at -80°C prior to sequencing.

SNP genotyping

Genomic DNA was extracted from blood using the QIAamp DNA mini and blood mini kit (Qiagen, 51104).
IPS cell culture and macrophage differentiation were carried out as previously described (Alasoo et al. 2018), with some minor modifications (see Supplementary Methods for details).
Tagmentation was quenched with 0.2 % sodium dodecyl sulphate.
Low-input bulk RNA-seq and ATAC-seq library preparation for primary microglia and iPS-derived macrophages

For RNA-seq samples, between 0.3 ng and 10 ng of bulk total RNA from primary microglia cells or iPS-derived macrophage cells was used as input for a modified Smart-seq2 library preparation (Picelli et al. 2014) (see Supplementary Methods for detailed protocol).

Sequencing data preprocessing

All sequence data sets were aligned to human genome assembly GRCh38.
All other RNA-seq data were aligned in the same way as the authors' own RNA-seq data, without adapter trimming.
The authors fit the latent factor linear mixed model in which the three different studies were treated as a random effect (see Supplementary Note Section 1 for details).
The authors processed the data using the provided R script and obtained the cell type annotation for PBMCs.
The count data from two studies were joined by gene IDs and converted into CPM (count per million) along with their primary microglia read count data.
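The join-and-normalise step described above can be sketched as follows. This is a minimal illustration, not the authors' script: the gene IDs and counts are made up, and it assumes that library sizes are computed after restricting to genes shared by all studies, which the text does not specify.

```python
def join_by_gene_id(*studies):
    """Keep genes present in every study and concatenate their samples.

    Each study is a dict mapping gene ID -> list of per-sample read counts.
    """
    shared = sorted(set.intersection(*(set(study) for study in studies)))
    return {gene: [c for study in studies for c in study[gene]] for gene in shared}


def counts_to_cpm(counts_by_gene):
    """Convert raw counts to counts per million within each sample."""
    genes = list(counts_by_gene)
    n_samples = len(counts_by_gene[genes[0]])
    # Library size = total count per sample over the retained genes (assumption).
    library_sizes = [
        sum(counts_by_gene[gene][j] for gene in genes) for j in range(n_samples)
    ]
    return {
        gene: [1e6 * counts_by_gene[gene][j] / library_sizes[j] for j in range(n_samples)]
        for gene in genes
    }


# Hypothetical inputs: two samples in study_a, one sample in study_b.
study_a = {"GENE_A": [10, 30], "GENE_B": [90, 70], "GENE_C": [5, 5]}
study_b = {"GENE_A": [50], "GENE_B": [50]}

joined = join_by_gene_id(study_a, study_b)
cpm = counts_to_cpm(joined)
```

After this step every sample sums to one million across the retained genes, so samples from different studies are on a comparable scale regardless of sequencing depth.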

Variance component analysis

A linear mixed model of log(TPM+1) values across genome-wide genes (whose TPM>0 for 10% of total cells) was used to estimate the transcriptional variation.
The 13 different factors (Patient, the number of expressed genes per cell, pathology, plate ID, ERCC percentage, the number of expressed genes in each cell, 96 well plate position, age of patient, mitochondria RNA percentage, brain region, brain hemisphere, ethnicity and sex) were fitted as random effects with independent variance parameters φ_k².
The variance explained by the factor k was measured by the intraclass correlation φ_k²/(1 + φ_k²), where the other 12 factors were fixed constant.
The standard error of the intraclass correlation was computed by the delta method with the standard error of the variance parameter estimator.
See Supplementary Note Section 1.1 for details.
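To make the intraclass correlation and its delta-method standard error concrete, here is a minimal sketch. It writes v for the variance parameter of factor k, takes the intraclass correlation as v/(1 + v) as stated above, and uses the first-order delta method: the derivative of v/(1 + v) with respect to v is 1/(1 + v)², so the standard error of v is scaled by that factor. The function names and numbers are illustrative, and the authors' exact computation is in their Supplementary Note.

```python
def intraclass_correlation(var_k: float) -> float:
    """ICC for factor k, with the residual variance scaled to 1."""
    return var_k / (1.0 + var_k)


def icc_standard_error(var_k: float, se_var_k: float) -> float:
    """First-order delta method: SE(g(v)) ~ |g'(v)| * SE(v), g(v) = v / (1 + v)."""
    derivative = 1.0 / (1.0 + var_k) ** 2
    return derivative * se_var_k


# Hypothetical variance parameter estimate and its standard error.
var_k, se_var_k = 0.25, 0.05
icc = intraclass_correlation(var_k)
icc_se = icc_standard_error(var_k, se_var_k)
```

Because the derivative shrinks as v grows, a given uncertainty in the variance parameter translates into a smaller uncertainty in the intraclass correlation for factors that explain more variance.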

Detection of microglia subpopulations

The authors used the linear mixed model to estimate the latent factors with the 13 known confounding factors (see Supplementary Note Section 1.2 for details).
There are 2l-1 contrasts which were tested against the null model (removing the focal factor k in the model) to compute Bayes factors.
The fragment counts were GC corrected as described in (Kumasaka, Knights, and Gaffney 2016), normalised into TPM (transcripts per million) and then log transformed (log of TPM+1).
25 principal components (PCs) were calculated and regressed out from the normalised expression levels.
The authors took the minimum BH Q-value for each gene to perform the multiple testing correction genome-wide.
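The normalisation described above (fragment counts to TPM, then log of TPM+1) can be sketched for a single sample as follows. This is an illustration under stated assumptions: GC correction and the regression of principal components are omitted, the natural logarithm is assumed because the text does not give the base, and the gene lengths and counts are made up.

```python
import math


def fragments_to_log_tpm(counts, lengths_kb):
    """Convert one sample's fragment counts to log(TPM + 1).

    counts:     fragment count per gene
    lengths_kb: gene length in kilobases, same order as counts
    """
    # Length-normalised rate per gene.
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    # Rescale so the sample sums to one million.
    tpm = [1e6 * rate / total for rate in rates]
    # The +1 offset keeps unexpressed genes at exactly zero after the log.
    return [math.log(value + 1.0) for value in tpm]


# Hypothetical sample with three genes.
log_tpm = fragments_to_log_tpm([100, 300, 0], [1.0, 3.0, 2.0])
```

In this example the first two genes have the same count per kilobase and therefore the same normalised value, even though their raw counts differ threefold.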

Bayesian hierarchical model

The authors extended a standard Bayesian hierarchical model (Veyrieras et al. 2008) to jointly map eQTLs in three different cell types.
It can provide posterior probability that a gene is an eQTL for each cell type.
See Supplementary Note Section 2 for more details.
Alzheimer’s disease GWAS summary statistics

Summary statistics came from a GWAS of diagnosed AD (Kunkle et al. 2019) and a GWAS for family history of AD that the authors conducted in the UK Biobank (see Liu and Schwartzentruber 2019 for details), across 10,687,126 overlapping variants.
The authors lumped the true and proxy-cases together (53,042 unique affected individuals, 355,900 controls) and performed association tests using BOLT-LMM (Loh et al., 2015).

URL

Welch, Joshua D., Velina Kozareva, Ashley Ferreira, Charles Vanderburg, Carly Martin, and Evan Z. Macosko.
Zhang, Hanrui, Chenyi Xue, Rhia Shah, Kate Bermingham, Christine C. Hinkle, Wenjun Li, Amrith Rodrigues, et al. 2015.


Figures (6)

Figure 2. Transcriptional heterogeneity in human microglia. a. UMAP of 8,662 microglia cells after removing infiltrating cells. b. Microglial subpopulation variation between patient pathologies. Points coloured by the four different colours in Figure 2a illustrate subpopulation compositions for each pathology. Points coloured gray are all other cells. c. Heatmap showing the enrichment (log odds ratio) of microglial subpopulations between pathologies d. Heatmap of averaged, normalised expression level (defined as the posterior mean of pathology random effect term, see Online Methods) of differentially expressed genes at local true sign rate (ltsr) greater than 0.9 (Online Methods). Heatmap is divided into groups based on all possible pairwise groupings of the five cell subpopulations, with the most transcriptionally distinct population at the top. e. Pathway enrichment analysis of differentially expressed genes between different microglial subpopulations. f. Enrichment of disease-associated microglial (DAMs) transcriptional activation signatures in each of the four clusters in our data set. g. UMAP showing average normalised expression DAM genes transcriptional signatures in each cell. Red represents higher expression.

Figure 1. Study design and overview of the data. a. Metadata of 141 neurosurgery patients enrolled in this study. Brain region annotation: Cerebellum (C); Frontal (F); Occipital (O); Parietal (P); Temporal (T); non-dominant (ND); dominant (D). b. Experimental design using Smart-seq2 and low sample bulk RNA-seq with SNP genotyping. c. UMAP of bulk RNA-seq for myeloid cells and brain tissues (see Online Methods). Pink dots show our samples. d. UMAP of single-cell RNA-seq data combined with 68K PBMC scRNA-seq (Zheng et al. 2017) and whole brain DroNc-seq (Habib et al. 2017). Bright red dots represent cells collected in this study. Cell type annotations were obtained from: glutamatergic neurons from the PFC (exPFC); pyramidal neurons from the hip CA region (exCA); GABAergic interneurons (GABA); granule neurons from the hip dentate gyrus region (exDG); astrocytes (ASC); oligodendrocytes (ODC); oligodendrocyte precursor cells (OPC); neuronal stem cells (NSC); endothelial cells (END); dendritic cell (DC); B cell (B); hematopoietic progenitor cell (CD34+); NK T cell (NK). e. Proportions of non microglia for each patient in our data. Each horizontal bar corresponds to one patient. The thickness of each bar is proportional to the number of cells observed for the patient. Patients are stratified by pathology.

Figure 5. Fine-mapping of BIN1 locus. a. Posterior probability of colocalisation with Alzheimer’s disease (Liu and Schwartzentruber 2019). The y-axis is based on the GWAS primary signal of BIN1 locus and the x-axis is based on the secondary signal. b. Sequencing coverage depth of ATAC-seq and RNA-seq stratified by individuals or the three genotype groups at BIN1 lead eQTL SNP (rs6733839:C>T). The first two panels are obtained from the primary microglia and the bottom two panels are obtained from iPS cell derived macrophage. The MEF2A motif overlaps with the lead SNP and the alternative allele (T) increases predicted binding affinity. Tissue type annotation: Artery Tibial (AT), Esophagus Gastroesophageal Junction (EGJ), Colon Sigmoid (CS), Skin Sun Exposed Lower leg (SSELL), Heart Left Ventricle (HLV), Colon Transverse (CT), Esophagus Mucosa (EM), Pituitary (PI).

Table 1. Colocalisation analysis of microglia eQTL with 5 AD GWAS. STH: sub-threshold; below GWAS suggestive threshold of P=1.0x10⁻⁶; NA: conditional analysis result of GWAS is not available except for Liu and Schwartzentruber (2019) (see Online Methods).

Figure 4. Shared genetic architecture between microglia, other myeloid cell types and human complex traits. a. Colocalisation of microglia eQTLs with various GWAS traits. The x-axis shows the sum of posterior probability (PP) for independent causal variants for the microglia eQTL and a GWAS trait. The y-axis is the sum of posterior probabilities for colocalised eQTLs with the trait. b. Colocalisation of AD GWAS (Liu and Schwartzentruber 2019) with various cell and tissues. GTEx brain tissues are coloured yellow. c. The number of shared eQTL genes between microglia and monocyte with AD GWAS. Genes with posterior probability greater than 0.5 for each category are listed. Each gene is coloured by the statistical significance of the lead AD GWAS variant within the 1M cis window: P<5.0x10⁻⁸ (red); 5.0x10⁻⁸<=P<1.0x10⁻⁶ (orange); P>=1.0x10⁻⁶ (purple). d. Numbers of shared eQTL genes between microglia and iPSC-derived macrophage (IPSDM) with AD GWAS.

Figure 3. Single-cell RNA-seq revealing microglia heterogeneity driven by clinical factors. a. Barplot shows the variance explained by each factor. Bars coloured gray are technical factors and coloured pink are clinical factors. Blue bars are partly related to patients’ genetic background. b. Heatmap showing the strength and direction of the age effect for differentially expressed (DE) genes at local true sign rate (ltsr) greater than 0.9 (Online Methods). c. Pathway enrichment for DE genes by age. Pathways coloured red are enriched only for DE genes upregulated by age, and the pathways coloured blue are enriched for DE genes downregulated by age. d-e. Example genes upregulated or downregulated by age. f. Heatmap showing the average normalised expression levels for DE genes by sex at ltsr greater than 0.9 (Online Methods). g. Pathway enrichment by sex. h. Pathway enrichment for combinations of brain regions. The blue bars show pathways upregulated in cerebellum and occipital lobe. The red bars show pathways upregulated in frontal, parietal and temporal lobes.


A map of transcriptional heterogeneity and regulatory variation in human microglia

Adam MH Young¹⁻³*, Natsuhiko Kumasaka, Fiona Calvert, Timothy R. Hammond⁴,⁵, Andrew Knights, Nikolaos Panousis, Jeremy Schwartzentruber, Jimmy Liu, Kousik Kundu, Michael Segel, Natalia Murphy, Christopher E McMurran, Harry Bulstrode, Jason Correia, Karol P Budohoski, Alexis Joannides, Mathew R Guilfoyle, Rikin Trivedi, Ramez Kirollos, Robert Morris, Matthew R Garnett, Helen Fernandes, Ivan Timofeev, Ibrahim Jalloh, Katherine Holland, Richard Mannion, Richard Mair, Colin Watts³,⁸, Stephen J Price, Peter J Kirkpatrick, Thomas Santarius, Nicole Soranzo, Beth Stevens⁴,⁵, Peter J Hutchinson, Robin JM Franklin & Daniel J Gaffney

1. Wellcome Trust MRC Stem Cell Institute, University of Cambridge, Cambridge, UK, CB2 0QQ.
2. Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK, CB10 1SA.
3. Division of Neurosurgery, Department of Clinical Neurosciences, Cambridge University Hospitals, Cambridge, UK, CB2 0QQ.
4. FM Kirby Neurobiology Center, Boston Children's Hospital, Harvard University, Boston, USA.
5. Howard Hughes Medical Institute, Broad Institute of Harvard and MIT, Boston, USA.
6. EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD.
7. Biogen, Cambridge, MA, 02142, USA.
8. Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, Birmingham UK, B15 2TT.

$ Corresponding author

Abstract

Microglia, the tissue resident macrophages of the CNS, are implicated in a broad range of neurological pathologies, from acute brain injury to dementia. Here, we profiled gene expression variation in primary human microglia isolated from 141 patients undergoing neurosurgery. Using single cell and bulk RNA sequencing, we defined distinct cellular populations of acutely in vivo-activated microglia, and characterised a dramatic switch in microglial population composition in patients suffering from acute brain injury. We mapped expression quantitative trait loci (eQTLs) in human microglia and show that many disease-associated eQTLs in microglia replicate well in a human induced pluripotent stem cell (hIPSC) derived macrophage model system. Using ATAC-seq from 95 individuals in this hIPSC model we fine-map candidate causal variants at risk loci for Alzheimer’s disease, the most prevalent neurodegenerative condition in acute brain injury patients. Our study provides the first population-scale transcriptional map of a critically important cell for neurodegenerative disorders.

bioRxiv preprint doi: https://doi.org/10.1101/2019.12.20.874099; this version posted December 20, 2019.

Introduction

Microglia are tissue resident macrophages of the central nervous system and play critical roles in neurological immune defence, development and homeostasis (Schafer and Stevens 2015; Q. Li and Barres 2018; Salter and Stevens 2017). These highly dynamic cells are challenging to study in the laboratory and are strongly influenced by different experimental environments (Gosselin et al. 2017). Genetic studies also strongly implicate microglial dysfunction in neurodegeneration (Guerreiro et al. 2013; Jonsson et al. 2013; Tansey, Cameron, and Hill 2018; Gjoneska et al. 2015) particularly in the context of the injured brain (Johnson and Stewart 2015). Single cell transcriptomics has suggested that microglial function may vary across age, sex and brain region (Olah et al. 2018; Keren-Shaul et al. 2017; Hammond et al. 2019; Masuda et al. 2019; Mrdjen et al. 2018; Mathys et al. 2017). Previous studies have used frozen post-mortem tissue from existing brain banks or fresh surgical samples typically from restricted patient groups, typically temporal lobe resections for epilepsy or peritumoral sampling. However, variability in the post-mortem index produces substantial variation in cellular expression (Welch et al. 2019). Because of this, studies of microglial activation in humans have relied on ex vivo stimulation with no available data from acutely injured human brains. The challenge of sampling also means that large scale genetic studies of microglia have not been attempted to date. Population studies have demonstrated that individuals subject to mild brain trauma are 5-fold more likely to develop Alzheimer’s Disease (Mackay et al. 2019). Consequently, it is of particular importance to understand the activation of human microglia in the context of acute brain injury together with the underpinning genetic contribution to neurodegeneration.

Characterisation of microglial cell populations

Here, we describe the analysis of human microglia isolated from 141 patients undergoing a range of neurosurgical procedures (Figure 1a). We recruited patients from a range of pathologies, including 51 individuals with acute brain injury (haemorrhage and trauma), who sustained substantial parenchymal injury, enabling us to observe in vivo microglial activation. For each individual, we isolated CD11b-positive cells and performed both single cell (SmartSeq2) (Picelli et al. 2014) and bulk RNA-seq on each individual. After QC, we retained 112 bulk RNA-seq samples, and 9,538 single cells from 129 patients (Figure 1b). All but three of our bulk RNA-seq samples formed a single cluster with microglia from two previous studies (Y. Zhang et al. 2016; Gosselin et al. 2017), and were distinct from both GTEx brain and BLUEPRINT monocytes (Figure 1c). We then compared our single cell data to public datasets of 68K PBMCs isolated from a healthy donor (Zheng et al. 2017) and 15K brain cells from 5 GTEx donors (Habib et al. 2017). A total of 8,662 cells formed a cluster with the microglia population found in GTEx samples and distinct from PBMCs (Figure 1d) and expressed a range of known microglial marker genes, including P2RY12, CX3CR1 and TMEM119, to a high level (Extended Data Figure 1a). We defined this population of cells as microglia for the remainder of our analysis. We found three less common populations of cells that closely resembled other blood cell types, including NKT cells, monocytes or B-cells that comprised 8.4%, 0.5% and 0.3% of our single cell dataset, respectively. These cell types may reflect either infiltration of immune cells as a result of blood-brain barrier breakdown or intravascular contamination within the tissue. In support of the former hypothesis, we also found that the abundance of infiltrating cells strongly correlated with patient pathology, with trauma patients in particular enriched (OR=7.6, Fisher exact test P=1.2x10⁻¹⁵⁵) (Figure 1d). We also found a significant effect of age on the abundance of infiltrating cells (3.4% increase per year, Wald test P=0.014) after adjusting for all known confounding factors, which could reflect blood brain barrier degeneration over the lifespan (Extended Data Figure 1b).
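The enrichment statistics quoted above (an odds ratio with a Fisher exact test P-value) come from a 2x2 table of cell counts, for example infiltrating versus microglial cells in trauma versus other patients. The sketch below shows how such a table yields both numbers. The counts are made up, and a two-sided test is assumed because the text does not say which alternative was used.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows = trauma / other, columns = infiltrating / microglia.
table = (8, 2, 1, 5)
enrichment_or = odds_ratio(*table)
p_value = fisher_exact_two_sided(*table)
```

With thousands of cells per group, as in the study, the same calculation can produce extremely small P-values even for moderate odds ratios, which is consistent with the magnitudes reported in the text.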

Within microglia, we found four subpopulations of cells (Figure 2a). Two populations (C and D) were common in patients with acute brain injury (25-76% of cells) but rare in other pathologies (<5% of cells). Population B was enriched in tumor patients (OR=4.9, P=7.6x10⁻¹⁶⁹) while population A was most common in control and hydrocephalus patients (Figure 2b, c). In populations A and B, we observed higher expression of microglial markers including P2RY12 and CX3CR1. Cells from B, C and D also demonstrated an upregulation of general immune response and cell activation (IL1B, CD83 & CCL3) (Figure 2d, e; Extended Data Figure 2; Supplementary Table 1). Cells from C and D exhibited additional upregulation of acute immune response pathways, including NF-kappa B, STAT3, RUNX1 as well as MHC-I expression. Population C also showed differential expression of genes associated with stress induced senescence and DNA damage (HIST1H2BG), while population D expressed genes associated with cell proliferation (FLT1) and chemotaxis (CCL4, CXCL8, CXCL16), the latter of which is shared with population B. Population B additionally showed strong upregulation of catabolic process and metabolism (GPX1) and phagocytosis (TREM2). Our cells partially overlapped with the transcriptional signatures of disease-associated microglia established in previous literature (Keren-Shaul et al. 2017; Xue et al. 2014) (Figure 2f, g). Taken together, these results suggest that our data contain a mix of naive microglia (population A), with three distinct states of activation that, in part, are driven by patient pathology.

Biological drivers of microglial expression

Our sampling design enabled us to explore the relative importance of a wide range of biological factors in driving microglial gene expression while controlling for important technical confounders, using variance components analysis. Of the biological factors we examined, clinical pathology explained more variation than all other factors combined, although all factors except sex, including age, brain region, dominant hemisphere and ethnicity, explained a fraction of variation that was significantly different from zero (LR test FWER 0.05) (Figure 3a). Patient explained the most variability of any single factor in the model. Although this factor captured the contribution of genetic background, it is also likely to reflect unmeasured technical effects, such as variability in cell dissociation and surgical sampling, which are confounded with patient in the model. The cellular pathways that differed between patient pathologies closely resembled the differences we observed between different subpopulations of microglia (Extended Data Figure 3a, b). We also detected 260 genes that varied significantly by patient age, showing upregulation of inflammation (CLEC7A, CIITA and TLR2) and downregulation of cell identity (P2RY12, CX3CR1), motility and proliferation (CSF1R) with increasing age (Figure 3b-e; Extended Data Figure 3c; Supplementary Table 2). Although sex explained little variation globally, we found 97 genes that were differentially expressed between males and females (Figure 3f). These included multiple genes in the complement pathway and synaptic pruning mechanisms (C1QA, C1QC and C3) that were more highly expressed in females than males (Figure 3g; Extended Data Figure 3d; Supplementary Table 3). Anatomical region of sampling also had a subtle effect on transcriptional variation, with cerebellar microglia, which are known to exhibit a distinct, less ramified morphology, upregulating multiple recruitment chemokines (CCL4, CCL3, CCL4L2, CCL3L3) (Figure 3h; Extended Data Figure 3e; Supplementary Table 4).


eQTL mapping in human microglia and neurodegeneration

We constructed a map of expression quantitative trait loci (eQTLs) in primary human microglia. After excluding samples with low genotyping quality or substantial non-European ancestry, we mapped eQTLs using our bulk RNA-seq data from 93 individuals, and detected 401 eQTLs, summing over hierarchical model posteriors (585 eQTLs at FDR 5% using linear model). The low number of eQTLs reflected the high between-sample heterogeneity in microglia, compared with other cell types (Extended Data Figure 4a). We tested for colocalization of risk loci from 18 genome wide association studies (GWAS) with microglia eQTLs (Figure 4a), including five previous studies of Alzheimer’s disease (AD), and our own meta-analysis of these five studies for comparison (Online Methods). Across all AD GWAS, we found up to 11 risk loci with a posterior probability of colocalisation (PP4) greater than 0.5 (Table 1). These included well-known AD loci, such as BIN1, and less well-studied AD associations, for example EPHA1-AS1. We repeated the analysis using microglia eQTLs mapped by RASQUAL to support the colocalisation result using the allele specific expression signature (Supplementary Table 5). This analysis retrieved colocalisations at other AD GWAS loci, such as CD33 and CASS4. However, the test statistics may be inflated due to the additional overdispersion in our data set (Extended Data Figure 4a).
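The eQTL count above is described as a sum over hierarchical model posteriors. One plausible reading, sketched here as an assumption rather than the authors' exact procedure, is that each gene receives a posterior probability of being an eQTL and the reported count is the expected number of eQTL genes, i.e. the sum of those probabilities. The gene names and probabilities below are made up; the 0.5 cut-off mirrors the posterior probability threshold used for listing genes in Figure 4.

```python
def expected_eqtl_count(posteriors_by_gene: dict) -> float:
    """Expected number of eQTL genes = sum of per-gene posterior probabilities."""
    return sum(posteriors_by_gene.values())


def genes_above(posteriors_by_gene: dict, threshold: float = 0.5) -> list:
    """Genes whose posterior probability exceeds the threshold."""
    return sorted(g for g, p in posteriors_by_gene.items() if p > threshold)


# Hypothetical per-gene posterior probabilities of being an eQTL.
posteriors = {"GENE_A": 0.98, "GENE_B": 0.75, "GENE_C": 0.30, "GENE_D": 0.02}

expected = expected_eqtl_count(posteriors)
confident = genes_above(posteriors)
```

Summing posteriors gives a count without committing to a hard threshold, which is why it can differ from the number of genes passing a fixed FDR cut-off under a linear model.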

Next, we compared AD risk loci from our meta-analysis with eQTLs from the GTEx project (v7), in circulating blood monocytes and in a novel dataset of IPSC-derived macrophages (IPSDMs) from 133 healthy individuals (Online Methods) (Figure 4b). We found more colocalised AD eQTLs in microglia than in any GTEx brain region. We also observed many AD risk loci that colocalised with eQTLs in blood monocytes and IPSDMs. To explore the level of cell-type specificity, we mapped eQTLs jointly analysing data from microglia, monocytes and IPSDMs using a three-way Bayesian hierarchical model (Extended Data Figure 4a, b; Online Methods). Using this approach, we discovered 855 eQTLs, of which 108 were microglia-specific, 449 were found in all three cell types, and 192 were shared with IPSDMs but not monocytes. We also used the three-way model to evaluate the extent of sharing between the microglia eQTLs, IPSDM or monocyte eQTLs, and AD risk loci. Many colocalised AD loci, including BIN1, are found in both microglia and IPSDMs, but absent in monocytes (Figure 4c, d). There were also multiple AD loci where an eQTL was only detectable in circulating monocytes (e.g., CASS4 locus), although this is likely to reflect primarily the differences in power between the monocyte (n=193) and microglia data sets.

IPS models of AD risk loci are an invaluable resource for the development of future

therapeutics. We next identified three AD association signals (BIN1, the EPHA1 locus and

PTK2B) that colocalised with both microglia and IPSDM eQTLs (Figure 4d). The association

for EPHA1-AS1 was shared across many cell types (Extended Data Figure 5a, b), while the

direction of effect at the PTK2B locus was inconsistent (Extended Data Figure 5c). To fine-

map causal variants we generated ATAC-seq data from 5 primary microglia and 89 IPSDMs.

Colocalisation analysis revealed that the AD association signal at BIN1 was highly cell type

specific (Figure 5a). The lead SNP of this association signal, rs6733839:C>T, was located in

a region of open chromatin in both microglia and IPSDMs.

rs6733839:C>T was also associated with a significant change in chromatin accessibility

(Figure 5b, P<6.1×10^-10), and the association signal for chromatin also colocalised

(PP4=0.996) with the AD association signal (Figure 5c-f). The AD risk allele at

rs6733839:C>T created a predicted high-affinity binding site for the MEF2A transcription factor

bioRxiv preprint. This version posted December 20, 2019. doi: https://doi.org/10.1101/2019.12.20.874099

(Extended Data Figure 5d). We found that, although BIN1 and MEF2A are broadly expressed

in many tissues, co-expression of both genes was found only in primary microglia and IPSDMs

(Extended Data Figure 5e).
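
One common way to ask whether an allele creates a transcription factor binding site is to score the sequence around the variant against a position probability matrix for each allele and compare the results. The sketch below shows that comparison under loudly stated assumptions: the matrix is a toy built from a made-up consensus and is not the real MEF2A motif, the flanking sequence is hypothetical rather than the genomic context of rs6733839, and the function names are illustrative only.

```python
import math

def toy_ppm(consensus, hit=0.85):
    """Build a hypothetical position probability matrix from a consensus string."""
    miss = (1.0 - hit) / 3.0
    return [{b: (hit if b == c else miss) for b in "ACGT"} for c in consensus]

def window_score(window, ppm, background=0.25):
    """Log2 odds of a window under the motif versus a uniform background."""
    return sum(math.log2(col[base] / background) for base, col in zip(window, ppm))

def best_overlapping_score(flank5, allele, flank3, ppm):
    """Best motif score over all windows that cover the variant position."""
    seq = flank5 + allele + flank3
    width, pos = len(ppm), len(flank5)
    starts = range(max(0, pos - width + 1), min(pos, len(seq) - width) + 1)
    return max(window_score(seq[s:s + width], ppm) for s in starts)

# Toy motif and toy flanks (hypothetical; not real MEF2A or BIN1 sequence).
ppm = toy_ppm("TAAATA")
ref_score = best_overlapping_score("GCTAAA", "C", "AGC", ppm)
alt_score = best_overlapping_score("GCTAAA", "T", "AGC", ppm)
delta = alt_score - ref_score  # positive: the alternate allele strengthens the match
```

A positive difference in this sketch means the alternate allele gives a stronger match to the toy motif. A real analysis would use a curated motif and the true reference sequence.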

Discussion

Here we present a population-level study of human primary microglia. By sampling cells from

living donors, we defined transcriptional signatures of in vivo microglial activation avoiding

artefacts from post mortem index and cell culture. We identified multiple microglial

subpopulations and showed how these populations are shaped by insult, injury and other life

history factors. We also created the first map of eQTLs in microglia, identified high confidence

causal genes and variants underlying risk loci for Alzheimer’s disease, and identified a subset

that replicated in a scalable IPS model system.

Our results underscore the variability between microglia cells from different individuals. One

implication of the variation we observed between different patient pathologies is that the full

spectrum of microglial function cannot be captured by small studies of a single

patient population. The most obvious example of this is the populations of activated microglia

we identified that account for less than 5% of cells in non-trauma patients. Our results also

provide a picture of the function of microglia following severe trauma, producing cell

populations that exhibit a mixture of proinflammatory and chemotactic phenotypes. Notably,

although animal models of acute brain injury suggest rapid expansion of microglia following

trauma (Vela et al. 2002), only one of the populations we identified had a proliferative

phenotype, and both showed downregulation of CSF1R. Also in contrast to previous reports

(Olah et al. 2018), we found relatively subtle effects of age on microglial transcription. The

modest changes we did detect were consistent with increased inflammatory senescence in

microglia over the lifespan. Likewise, differences in microglia expression between males and

females were relatively small, although we did observe increased complement activity in

females, perhaps suggesting a role for complement pathways in the higher incidence of AD in

women.

Our eQTL analysis revealed a number of candidate risk genes for AD that function in microglia.

This included well-known genes, such as BIN1, and a number of less well understood loci.

One example we discovered was the EPHA1-AS1 locus, where AD risk appeared to be driven

by a change in the expression of a long noncoding RNA, rather than the neighbouring protein-

coding gene EPHA1 (Extended Data Figure 5a, b). We did not detect some well-known AD

risk loci, such as CD33, with suspected function in myeloid cells. In the case of CD33, analysis

of splicing patterns did indeed reveal a splice QTL at exon 2 (Extended Data Figure 6a-c),

consistent with previous studies (Raj et al. 2014). In other cases, we found strong

colocalisation between AD risk loci and monocyte, but not microglia, eQTLs. While it is

tempting to conclude that this reflects a monocyte-specific function, we believe it is more

plausible that this reflects lower power in our microglia dataset and that, with an increased

sample size, many of these eQTLs would be found to be shared between the two cell types.

One example of this is the CASS4 locus, where the minor allele frequency of the GWAS lead

variant (rs6014724:A>G) was >10% in the monocyte data, but <5% in our microglia data set

(Extended Data Figure 6d). Other examples of apparently spurious monocyte-specific


