Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism

F. Kyle Satterstrom, +153 more
TLDR
Using an enhanced Bayesian framework to integrate de novo and case-control rare variation, 102 risk genes are identified at a false discovery rate of ≤ 0.1, consistent with multiple paths to an excitatory/inhibitory imbalance underlying ASD.
Abstract
We present the largest exome sequencing study of autism spectrum disorder (ASD) to date (n=35,584 total samples, 11,986 with ASD). Using an enhanced Bayesian framework to integrate de novo and case-control rare variation, we identify 102 risk genes at a false discovery rate ≤ 0.1. Of these genes, 49 show higher frequencies of disruptive de novo variants in individuals ascertained for severe neurodevelopmental delay, while 53 show higher frequencies in individuals ascertained for ASD; comparing ASD cases with mutations in these groups reveals phenotypic differences. Expressed early in brain development, most of the risk genes have roles in regulation of gene expression or neuronal communication (i.e., mutations effect neurodevelopmental and neurophysiological changes), and 13 fall within loci recurrently hit by copy number variants. In human cortex single-cell gene expression data, expression of risk genes is enriched in both excitatory and inhibitory neuronal lineages, consistent with multiple paths to an excitatory/inhibitory imbalance underlying ASD.


Highlights
• 102 genes implicated in risk for autism spectrum disorder (ASD genes, FDR ≤ 0.1)
• Most are expressed and enriched early in excitatory and inhibitory neuronal lineages
• Most affect synapses or regulate other genes; how these roles dovetail is unknown
• Some ASD genes alter early development broadly, others appear more specific to ASD
Authors
F. Kyle Satterstrom, Jack A. Kosmicki,
Jiebiao Wang, ..., Kathryn Roeder,
Mark J. Daly, Joseph D. Buxbaum
Correspondence
joseph.buxbaum@mssm.edu (J.D.B.),
stephan.sanders@ucsf.edu (S.J.S.),
roeder@andrew.cmu.edu (K.R.),
mjdaly@broadinstitute.org (M.J.D.)
In Brief
Large-scale sequencing of patients with
autism allows identification of over 100
putative ASD-associated genes, the
majority of which are neuronally
expressed, and investigation of distinct
genetic influences on ASD compared with
other neurodevelopmental disorders.
Satterstrom et al., 2020, Cell 180, 568–584
February 6, 2020 © 2020 Elsevier Inc.
https://doi.org/10.1016/j.cell.2019.12.036

F. Kyle Satterstrom (1,2,3,37), Jack A. Kosmicki (1,2,3,4,5,37), Jiebiao Wang (6,37), Michael S. Breen (7,8,9), Silvia De Rubeis (7,8,9), Joon-Yong An (10,11), Minshi Peng (6), Ryan Collins (5,12), Jakob Grove (13,14,15), Lambertus Klei (16), Christine Stevens (1,3,4,5), Jennifer Reichert (7,8), Maureen S. Mulhern (7,8), Mykyta Artomov (1,3,4,5), Sherif Gerges (1,3,4,5), Brooke Sheppard (10), Xinyi Xu (7,8), Aparna Bhaduri (17,18), Utku Norman (19), Harrison Brand (5), Grace Schwartz (10), Rachel Nguyen (20), Elizabeth E. Guerrero (21),
(Author list continued below)
SUMMARY
We present the largest exome sequencing study of
autism spectrum disorder (ASD) to date (n = 35,584
total samples, 11,986 with ASD). Using an enhanced
analytical framework to integrate de novo and case-
control rare variation, we identify 102 risk genes at a
false discovery rate of 0.1 or less. Of these genes, 49
show higher frequencies of disruptive de novo vari-
ants in individuals ascertained to have severe neuro-
developmental delay, whereas 53 show higher fre-
quencies in individuals ascertained to have ASD;
comparing ASD cases with mutations in these
groups reveals phenotypic differences. Expressed
early in brain development, most risk genes have
roles in regulation of gene expression or neuronal
communication (i.e., mutations effect neurodevelopmental
and neurophysiological changes), and 13 fall
within loci recurrently hit by copy number variants.
In cells from the human cortex, expression of risk
genes is enriched in excitatory and inhibitory
neuronal lineages, consistent with multiple paths to
an excitatory-inhibitory imbalance underlying ASD.
INTRODUCTION
Rare inherited and de novo variants are major contributors to in-
dividual risk for autism spectrum disorder (ASD) (De Rubeis et al.,
2014; Iossifov et al., 2014; Sanders et al., 2015). When such rare
variation disrupts a gene in individuals with ASD more often than
expected by chance, it implicates that gene in risk (He et al.,
2013). These risk genes provide insight into the underpinnings
1 Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
2 Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
3 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
4 Harvard Medical School, Boston, MA, USA
5 Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
6 Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, USA
7 Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA
8 Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
9 The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
10 Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
11 School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul, Republic of Korea
12 Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, USA
13 The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
14 Center for Genomics and Personalized Medicine, Aarhus, Denmark
15 Department of Biomedicine Human Genetics, Aarhus University, Aarhus, Denmark
16 Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
17 Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
18 The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
19 Computer Engineering Department, Bilkent University, Ankara, Turkey
20 Center for Autism Research and Translation, University of California, Irvine, Irvine, CA, USA
21 MIND (Medical Investigation of Neurodevelopmental Disorders) Institute, University of California, Davis, Davis, CA, USA
22 Division of Genetics, Boston Children’s Hospital, Boston, MA, USA
23 Division of Developmental Medicine, Boston Children’s Hospital, Boston, MA, USA
24 Sorbonne Université, INSERM, CNRS, Neuroscience Paris Seine, Institut de Biologie Paris Seine, Paris, France
(Affiliations continued below)

of ASD both individually (Ben-Shalom et al., 2017; Bernier et al.,
2014) and en masse (De Rubeis et al., 2014; Ruzzo et al., 2019;
Sanders et al., 2015; Willsey et al., 2013). However, fundamental
questions about the altered neurodevelopment and altered
neurophysiology in ASD—including when it occurs, where, and
in what cell types—remain poorly resolved.
Here we present the largest exome sequencing study in ASD
to date. We assembled a cohort of 35,584 samples, including
11,986 with ASD. We introduce an enhanced Bayesian analytic
framework that incorporates recently developed gene- and
variant-level scores of evolutionary constraint of genetic varia-
tion, and we use it to identify 102 ASD-associated genes (false
discovery rate [FDR] ≤ 0.1). Because ASD is often one of a
constellation of symptoms of neurodevelopmental delay (NDD),
we identify subsets of the 102 ASD-associated genes that
have disruptive de novo variants more often in NDD-ascertained
or ASD-ascertained cohorts. We also consider the cellular func-
tion of ASD-associated genes and, by examining extant data
from single cells in the developing human cortex, (1) show that
their expression is enriched in maturing and mature excitatory
and inhibitory neurons from midfetal development onward, (2)
confirm their role in neuronal communication or regulation of
gene expression, and (3) show that these functions are sepa-
rable. Together, these insights form an important step forward
in elucidating the neurobiology of ASD.
RESULTS
Dataset
We analyzed whole-exome sequence (WES) data from 35,584
samples that passed our quality control procedures (STAR
Methods): 21,219 family-based samples (6,430 ASD cases,
2,179 unaffected siblings, and both parents) and 14,365 case-
control samples (5,556 ASD cases, 8,809 controls) (Figure S1;
Table S1). Of these, 6,197 samples were newly sequenced by
our consortium (1,908 cases with parents, 274 additional cases,
25 controls) and 11,265 samples were newly incorporated
(416 cases with parents, plus 4,811 additional cases and 5,214
controls from the Danish iPSYCH study; Satterstrom et al., 2018).
From the family-based data, we identified 9,345 rare de novo
variants in protein-coding exons (allele frequency ≤ 0.1% in our
dataset and non-psychiatric subsets of reference databases):
63% of cases and 59% of unaffected siblings carried at least
one such variant (4,073 of 6,430 and 1,294 of 2,179, respectively;
Table S1; Figure S1). For inherited and case-control analyses, we
included variants with an allele count of no more than five in our
dataset or a reference database (STAR Methods; Kosmicki et al.,
2017; Lek et al., 2016).
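
As a quick consistency check on the cohort described above, the reported sample counts can be tallied in a few lines. This is only an arithmetic sketch using the numbers quoted in the text; the parent count is not stated explicitly and is derived here by subtraction, so it should be read as an implied value rather than a reported one.

```python
# Sample counts as reported in the Dataset paragraph.
family_cases, family_siblings = 6_430, 2_179
family_total = 21_219
cc_cases, cc_controls = 5_556, 8_809

# Parents are not counted explicitly in the text; this value is implied.
implied_parents = family_total - family_cases - family_siblings

total_samples = family_total + cc_cases + cc_controls
total_asd = family_cases + cc_cases

# Share of family-based children carrying at least one rare de novo coding variant.
case_carrier_rate = 4_073 / family_cases
sibling_carrier_rate = 1_294 / family_siblings

print(total_samples, total_asd, implied_parents)
print(f"{case_carrier_rate:.0%} of cases, {sibling_carrier_rate:.0%} of siblings")
```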
Effect of Genetic Variants on ASD Risk
Because protein-truncating variants (PTVs; nonsense, frame-
shift, and essential splice site variants) show a greater difference
in burden between ASD cases and controls than missense vari-
ants, their average effect on liability must be larger (He et al.,
2013). Measures of functional severity assessing evolutionary
constraint against deleterious genetic variation, such as the
‘probability of loss-of-function intolerance’ (pLI) score (Kos-
micki et al., 2017; Lek et al., 2016) and the integrated ‘missense
badness, PolyPhen-2, constraint’ (MPC) score (Samocha et al.,
2017), can further delineate variant classes with higher burden.
Therefore, we divided the list of rare autosomal genetic variants
into seven tiers of predicted functional severity: three tiers for
PTVs by pLI score (≥0.995, 0.5–0.995, 0–0.5) in order of
decreasing expected effect; likewise, three tiers for missense
variants by MPC score (≥2, 1–2, 0–1); and a single tier for synonymous
variants, expected to have minimal effect. We further
divided variants by their inheritance pattern: de novo, inherited,
and case-control. Because ASD is associated with reduced
fecundity (Power et al., 2013), variation associated with it is sub-
ject to natural selection. Inherited variation has survived at least
(Author list continued)
Caroline Dias (22,23), Autism Sequencing Consortium, and iPSYCH-Broad Consortium, Catalina Betancur (24), Edwin H. Cook (25), Louise Gallagher (26), Michael Gill (26), James S. Sutcliffe (27,28), Audrey Thurm (29), Michael E. Zwick (30), Anders D. Børglum (13,14,15,31), Matthew W. State (10), A. Ercument Cicek (6,19), Michael E. Talkowski (5), David J. Cutler (30), Bernie Devlin (16), Stephan J. Sanders (10,38,*), Kathryn Roeder (6,32,38,*), Mark J. Daly (1,2,3,4,5,33,38,*), and Joseph D. Buxbaum (7,8,9,34,35,36,38,39,*)
25 Institute for Juvenile Research, Department of Psychiatry, University of Illinois at Chicago, Chicago, IL, USA
26 Department of Psychiatry, School of Medicine, Trinity College Dublin, Dublin, Ireland
27 Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN, USA
28 Department of Molecular Physiology and Biophysics and Psychiatry, Vanderbilt University School of Medicine, Nashville, TN, USA
29 National Institute of Mental Health, NIH, Bethesda, MD, USA
30 Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
31 Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
32 Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
33 Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
34 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
35 Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
36 Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
37 These authors contributed equally
38 Senior author
39 Lead Contact
*Correspondence: joseph.buxbaum@mssm.edu (J.D.B.), stephan.sanders@ucsf.edu (S.J.S.), roeder@andrew.cmu.edu (K.R.), mjdaly@broadinstitute.org (M.J.D.)
[Figure 1 graphic. Panel A: proportions of PTV, missense, and synonymous variants, shaded by pLI and MPC tier, for family-based de novo variants in cases (7,131 variants), family-based variants transmitted to cases (613,052 variants), and case-control variants in cases (271,837 variants). Panel B: variants per sample in cases versus controls (family-based de novo and case-control) and transmitted versus untransmitted (family-based transmission), with relative risk (RR) and p value per contrast, also shown for males, females, and both sexes. Panel C: variant liability (z-score), mean and 95% CI, for family-based de novo, family-based transmission, and case-control data, by variant tier and sex.]

one generation of viability and fecundity selection in the parental
generation whereas de novo variation in offspring has not. Thus,
on average, de novo mutations are exposed to less selective
pressure and could mediate substantial risk for ASD. This expec-
tation is borne out by the substantially higher proportions of all
three PTV tiers and the two most severe missense variant tiers
in de novo compared with inherited variants (Figure 1A).
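
The seven-tier stratification of variants by predicted functional severity can be sketched as a small lookup function. This is an illustrative sketch, not the pipeline used in the study: the function name `severity_tier` and the consequence labels are hypothetical, the handling of boundary values for MPC (1 ≤ MPC < 2 for the middle tier) is assumed by analogy with the pLI tiers, and raising an error for a missing score is likewise an assumption.

```python
from typing import Optional


def severity_tier(consequence: str,
                  pli: Optional[float] = None,
                  mpc: Optional[float] = None) -> str:
    """Assign a variant to one of seven tiers of predicted functional severity.

    consequence: "PTV", "missense", or "synonymous" (hypothetical labels).
    pli: gene-level pLI score, required for PTVs.
    mpc: variant-level MPC score, required for missense variants.
    """
    if consequence == "synonymous":
        return "synonymous"
    if consequence == "PTV":
        if pli is None:
            raise ValueError("a PTV needs a pLI score (assumed handling)")
        if pli >= 0.995:
            return "PTV, pLI >= 0.995"
        if pli >= 0.5:
            return "PTV, pLI 0.5-0.995"
        return "PTV, pLI 0-0.5"
    if consequence == "missense":
        if mpc is None:
            raise ValueError("a missense variant needs an MPC score (assumed handling)")
        if mpc >= 2:
            return "missense, MPC >= 2"
        if mpc >= 1:  # assumed boundary: 1 <= MPC < 2
            return "missense, MPC 1-2"
        return "missense, MPC 0-1"
    raise ValueError(f"unknown consequence: {consequence!r}")


print(severity_tier("PTV", pli=0.999))
print(severity_tier("missense", mpc=1.4))
```

Each tier would then be crossed with the inheritance pattern (de novo, inherited, or case-control) before burden is compared.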
Comparing family-based cases with unaffected siblings in the
1,447 genes with pLI ≥ 0.995, there is a 3.5-fold enrichment of
de novo PTVs (366 in 6,430 cases versus 35 in 2,179 controls;
0.057 versus 0.016 variants per sample (vps); p = 4 × 10⁻¹⁷,
two-sided Poisson exact test; Figure 1B) and 1.2-fold enrichment
of rare inherited PTVs (695 transmitted versus 557 untransmitted
in 5,869 parents; 0.12 versus 0.10 vps; p = 0.07, binomial
exact test; Figure 1B). The same genes in the case-control data
show an intermediate 1.8-fold enrichment of PTVs (874 in 5,556
cases versus 759 in 8,809 controls; 0.16 versus 0.09 vps;
p = 4 × 10⁻²⁴, binomial exact test; Figure 1B). Analysis of the middle tier
of PTVs (0.5 ≤ pLI < 0.995) shows a similar but muted pattern
(Figure 1B), whereas the lowest tier of PTVs (pLI < 0.5) shows
no enrichment (Table S1).
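
The burden comparison for de novo PTVs in the most constrained genes can be reproduced approximately from the counts given above. The sketch below uses a standard equivalence: conditional on the total number of events, comparing two Poisson rates reduces to an exact binomial test with null probability equal to the case share of samples. The function names are hypothetical, and the two-sided convention (summing all outcomes no more probable than the observed one) is one common choice that may differ from the one used in the study. The resulting p value is uncorrected, whereas the Figure 1 legend states that the displayed p values are corrected for 168 tests, so an exact match is not expected.

```python
import math


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability P(X = k) for X ~ Binomial(n, p)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)


def compare_rates(count_a: int, n_a: int, count_b: int, n_b: int) -> tuple[float, float]:
    """Rate ratio and two-sided exact p value for two per-sample event rates."""
    rate_ratio = (count_a / n_a) / (count_b / n_b)
    total = count_a + count_b
    # Under the null of equal rates, each event falls in group A
    # with probability n_a / (n_a + n_b).
    p_null = n_a / (n_a + n_b)
    log_observed = log_binom_pmf(count_a, total, p_null)
    p_value = 0.0
    for k in range(total + 1):
        log_p = log_binom_pmf(k, total, p_null)
        if log_p <= log_observed + 1e-9:  # outcomes no more likely than observed
            p_value += math.exp(log_p)
    return rate_ratio, min(1.0, p_value)


# De novo PTVs in genes with pLI >= 0.995: 366 in 6,430 cases, 35 in 2,179 siblings.
rate_ratio, p_value = compare_rates(366, 6_430, 35, 2_179)
print(f"rate ratio = {rate_ratio:.2f}, uncorrected p = {p_value:.1e}")
```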
De novo missense variants occur more frequently than de
novo PTVs. Collectively, they show only marginal enrichment
over the rate expected by chance (De Rubeis et al., 2014; Figure 1).
The most severe de novo missense variants (MPC ≥ 2),
however, show a frequency similar to the most severe tier of
de novo PTVs. They yield 2.1-fold case enrichment (354 in
6,430 cases versus 58 in 2,179 controls; 0.055 versus 0.027
vps; p = 3 × 10⁻⁸, two-sided Poisson exact test; Figure 1B)
with consistent 1.2-fold enrichment in case-control data (4,277
in 5,556 cases versus 6,149 in 8,809 controls; 0.80 versus 0.68
vps; p = 4 × 10⁻⁷, binomial exact test; Figure 1B). These variants
show stronger enrichment than the middle tier of PTVs, whereas
the other two tiers of missense variation are not significantly en-
riched (Table S1).
From our data, the proportion of the variance explained by
de novo PTVs is 1.3%, 1.2% of it from the highest pLI category.
The proportion of the variance explained by de novo MPC ≥ 2
missense variants is 0.5%, whereas all remaining missense vari-
ation explains 0.12%. Thus, in total, all exome de novo variants in
the autosomes explain 1.92% of the variance of ASD.
Sex Differences in ASD Risk
ASD is more prevalent in males than females. In line with previous
observations (De Rubeis et al., 2014), we observe a 2-fold enrichment
of de novo PTVs in highly constrained genes in affected females
(n = 1,097) versus affected males (n = 5,333) (p = 3 × 10⁻⁶,
two-sided Poisson exact test; Figure 1B; Table S1). This result is
consistent with the female protective effect model, which postu-
lates that females require an increased genetic load to reach the
threshold for ASD diagnosis (Werling, 2016). The converse hy-
pothesis is that risk variation has larger effects in males than in fe-
males so that females require a higher burden to reach the same
diagnostic threshold as males. Across all classes of genetic var-
iants, we observed no significant sex differences in trait liability,
consistent with the female protective effect model (Figure 1C;
STAR Methods). Thus, we estimated the liability Z scores for
different classes of variants from both sexes together (Figure 1C;
Table S1) and leveraged them to enhance gene discovery.
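
To make the notion of a liability Z score concrete, the sketch below uses a textbook liability threshold model. It is an assumption-laden illustration rather than the procedure in STAR Methods: liability is taken to be standard normal, carriers of a variant class are assumed to have the same unit variance with a shifted mean, and the prevalence of 1 in 68 is the figure cited in the Figure 1 legend. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability assumed standard normal


def liability_threshold(prevalence: float) -> float:
    """Liability value above which a fraction `prevalence` of the population lies."""
    return STD_NORMAL.inv_cdf(1.0 - prevalence)


def liability_shift(relative_risk: float, prevalence: float) -> float:
    """Mean liability shift that gives carriers a risk of relative_risk * prevalence.

    Assumes carriers keep unit variance, so their risk is P(Z > T - shift).
    """
    carrier_risk = relative_risk * prevalence
    if not 0.0 < carrier_risk < 1.0:
        raise ValueError("relative_risk * prevalence must lie strictly between 0 and 1")
    return liability_threshold(prevalence) - liability_threshold(carrier_risk)


prevalence = 1 / 68
print(f"threshold = {liability_threshold(prevalence):.2f}")
for rr in (1.2, 1.8, 3.5):
    print(f"RR = {rr}: liability shift = {liability_shift(rr, prevalence):.2f}")
```

Under these assumptions the threshold for a prevalence of 1 in 68 is about 2.18, matching the Z score quoted in the Figure 1 legend, and larger relative risks map monotonically to larger shifts.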
ASD Gene Discovery
In previous risk gene discovery efforts, we used the transmitted
and de novo association (TADA) model (He et al., 2013) to inte-
grate protein-truncating and missense variants that are de
novo, inherited, or from case-control populations and to stratify
autosomal genes by FDR for association. Here we update the
TADA model to include pLI score as a continuous metric for
PTVs and MPC score as a two-tiered metric (≥2, 1–2) for
missense variants (STAR Methods; Figure S2). From family
data, we include de novo PTVs as well as de novo missense var-
iants, whereas from the case-control, we include only PTVs; we
do not include inherited variants because of the limited liabilities
observed (Figure 1C). Our analyses reveal that these modifications
result in an enhanced TADA model with greater sensitivity
and accuracy than the original model (Figure 2A); no other cova-
riates examined were important after accounting for these
factors (STAR Methods).
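
Stratifying genes by FDR in a Bayesian model of this kind is commonly done by converting per-gene Bayes factors into posterior null probabilities and averaging them down the ranked list. The sketch below illustrates that general idea only; it is not the enhanced TADA implementation. The gene names, Bayes factors, and prior fraction of risk genes are all hypothetical values chosen for illustration.

```python
def bayesian_q_values(bayes_factors: dict[str, float],
                      prior_risk_fraction: float) -> dict[str, float]:
    """Bayesian FDR (q value) per gene from Bayes factors.

    bayes_factors: evidence for association per gene (alternative vs. null).
    prior_risk_fraction: assumed prior probability that a gene is a risk gene.
    """
    pi = prior_risk_fraction
    ranked = sorted(bayes_factors, key=bayes_factors.get, reverse=True)
    q_values: dict[str, float] = {}
    running_null = 0.0
    for rank, gene in enumerate(ranked, start=1):
        # Posterior probability that this gene is NOT a risk gene.
        posterior_null = (1 - pi) / ((1 - pi) + pi * bayes_factors[gene])
        running_null += posterior_null
        # q value: expected share of false discoveries among the top `rank` genes.
        q_values[gene] = running_null / rank
    return q_values


# Hypothetical Bayes factors for three placeholder genes.
example = {"GENE_A": 1000.0, "GENE_B": 50.0, "GENE_C": 1.0}
for gene, q in bayesian_q_values(example, prior_risk_fraction=0.05).items():
    print(f"{gene}: q = {q:.3f}")
```

Because genes are ranked by decreasing Bayes factor, the q values are non-decreasing down the list, so a cutoff such as q ≤ 0.1 selects a contiguous set of top-ranked genes.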
Our refined TADA model identifies 102 ASD risk genes at FDR
≤ 0.1, of which 78 pass FDR ≤ 0.05 and 26 pass Bonferroni-
corrected (p ≤ 0.05) thresholds (Figure 2B; Table S2). Simulation
experiments (STAR Methods) show that the FDR is properly cali-
brated and relatively insensitive to estimates of the total number
of ASD-related genes in the genome (Figure S2). Of the 102 ASD-
associated genes, 60 were not discovered by our earlier ana-
lyses (De Rubeis et al., 2014; Iossifov et al., 2014; Sanders
et al., 2015). These include 30 considered truly novel because
they have not been implicated in autosomal dominant neurode-
velopmental disorders (ASD, developmental delay, epilepsy, and
intellectual disability) and were not significantly enriched for de
novo and/or rare variants in previous studies (Table S2). The pat-
terns of liability seen for the 102 genes are similar to that seen
over all genes (compare Figure 2C with Figure 1C), although
Figure 1. Distribution of Rare Autosomal Protein-Coding Variants in ASD Cases and Controls
(A) The proportion of rare autosomal genetic variants split by predicted functional consequences, represented by color, is displayed for family-based (split into de
novo and inherited variants) and case-control data. PTVs and missense variants are split into three tiers of predicted functional severity, represented by shade,
based on the pLI and MPC metrics, respectively.
(B) The relative difference in variant frequency (i.e., burden) between ASD cases and controls (top and bottom) or transmitted and untransmitted parental variants
(center) is shown for the top two tiers of functional severity for PTVs (left and center) and the top tier of functional severity for missense variants (right). Next to the
bar plot, the same data are shown divided by sex.
(C) The relative difference in variant frequency shown in (B) is converted to a trait liability Z score, split by the same subsets used in (A). For context, a Z score of
2.18 would shift an individual from the population mean to the top 1.69% of the population (equivalent to an ASD threshold based on 1 in 68 children; Christensen
et al., 2016). No significant difference in liability was observed between males and females for any analysis.
Statistical tests: (B) and (C), binomial exact test (BET) for most contrasts; exceptions were ‘both’ and ‘case-control,’ for which Fisher’s method for combining
BET p values for each sex and, for case-control, each population was used; p values corrected for 168 tests are shown.