Author

Seungtai Yoon

Other affiliations: Cold Spring Harbor Laboratory, Stony Brook University

Bio: Seungtai Yoon is an academic researcher at the Icahn School of Medicine at Mount Sinai. The author has contributed to research on copy-number variation and exome sequencing. The author has an h-index of 21 and has co-authored 28 publications that have received 21,749 citations. Previous affiliations of Seungtai Yoon include Cold Spring Harbor Laboratory and Stony Brook University.

Topics: Copy-number variation, Exome sequencing, Exome, Genomics, Epigenetics of autism

Papers

Journal Article

A global reference for human genetic variation.

Adam Auton¹, Gonçalo R. Abecasis², David Altshuler³, Richard Durbin⁴ +514 more•Institutions (90)

01 Oct 2015-Nature

TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

A global reference for human genetic variation

Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin +476 more

01 Oct 2015

TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. This paper reports the completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

3,247 citations

Journal Article

Strong Association of De Novo Copy Number Mutations with Autism

Jonathan Sebat¹, B. Lakshmi¹, Dheeraj Malhotra¹, Jennifer Troge¹, Christa Lese-Martin², Tom Walsh³, Boris Yamrom¹, Seungtai Yoon¹, Alexander Krasnitz¹, Jude Kendall¹, Anthony Leotta¹, Deepa Pai¹, Ray Zhang¹, Yoon-ha Lee¹, James W. Hicks¹, Sarah J. Spence⁴, Annette Lee⁵, Kaija Puura⁶, Terho Lehtimäki, David H. Ledbetter², Peter K. Gregersen⁵, Joel D. Bregman⁵, James S. Sutcliffe⁷, Vaidehi Jobanputra⁸, Wendy K. Chung⁸, Dorothy Warburton⁸, Mary Claire King³, David Skuse⁹, Daniel H. Geschwind¹⁰, T. Conrad Gilliam¹¹, Kenny Ye¹², Michael Wigler¹•Institutions (12)

Cold Spring Harbor Laboratory¹, Emory University², University of Washington³, National Institutes of Health⁴, North Shore-LIJ Health System⁵, University of Tampere⁶, Vanderbilt University⁷, Columbia University⁸, University College London⁹, University of California, Los Angeles¹⁰, University of Chicago¹¹, Albert Einstein College of Medicine¹²

20 Apr 2007-Science

TL;DR: Findings establish de novo germline mutation as a more significant risk factor for ASD than previously recognized.

Abstract: We tested the hypothesis that de novo copy number variation (CNV) is associated with autism spectrum disorders (ASDs). We performed comparative genomic hybridization (CGH) on the genomic DNA of patients and unaffected subjects to detect copy number variants not present in their respective parents. Candidate genomic regions were validated by higher-resolution CGH, paternity testing, cytogenetics, fluorescence in situ hybridization, and microsatellite genotyping. Confirmed de novo CNVs were significantly associated with autism (P = 0.0005). Such CNVs were identified in 12 out of 118 (10%) of patients with sporadic autism, in 2 out of 77 (3%) of patients with an affected first-degree relative, and in 2 out of 196 (1%) of controls. Most de novo CNVs were smaller than microscopic resolution. Affected genomic regions were highly heterogeneous and included mutations of single genes. These findings establish de novo germline mutation as a more significant risk factor for ASD than previously recognized.
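The carrier percentages quoted in the abstract above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis: the counts are taken from the abstract, while the dictionary layout and the helper name `carrier_percent` are illustrative assumptions. It does not attempt to reproduce the reported P value.

```python
# De novo CNV carrier counts as reported in the abstract: (carriers, group size).
# The group labels and this data layout are illustrative, not from the paper.
groups = {
    "sporadic autism": (12, 118),
    "affected first-degree relative": (2, 77),
    "controls": (2, 196),
}


def carrier_percent(carriers: int, total: int) -> float:
    """Return the share of carriers in a group as a percentage."""
    return 100.0 * carriers / total


# Percentage of confirmed de novo CNV carriers in each group.
percentages = {name: carrier_percent(c, n) for name, (c, n) in groups.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```

Rounded to whole numbers, the three values agree with the 10%, 3%, and 1% stated in the abstract.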

2,770 citations

Journal Article

The contribution of de novo coding mutations to autism spectrum disorder

Ivan Iossifov¹, Brian J. O'Roak², Stephen Sanders³, Stephen Sanders⁴, Michael Ronemus¹, Niklas Krumm², Dan Levy¹, Holly A.F. Stessman², Kali Witherspoon², Laura Vives², Karynne E. Patterson², Joshua D. Smith², Bryan W. Paeper², Deborah A. Nickerson², Jeanselle Dea⁴, Shan Dong³, Shan Dong⁵, Luis E. Gonzalez³, Jeffrey D. Mandell⁴, Shrikant Mane³, Michael T. Murtha³, Catherine A.W. Sullivan³, Michael F. Walker⁴, Zainulabedin Waqar³, Liping Wei⁵, A. Jeremy Willsey³, A. Jeremy Willsey⁴, Boris Yamrom¹, Yoon-ha Lee¹, Ewa A. Grabowska¹, Ertugrul Dalkic¹, Ertugrul Dalkic⁶, Zihua Wang¹, Steven Marks¹, Peter Andrews¹, Anthony Leotta¹, Jude Kendall¹, Inessa Hakker¹, Julie Rosenbaum¹, Beicong Ma¹, Linda Rodgers¹, Jennifer Troge¹, Giuseppe Narzisi¹, Seungtai Yoon¹, Michael C. Schatz¹, Kenny Ye⁷, W. Richard McCombie¹, Jay Shendure², Evan E. Eichler², Evan E. Eichler⁸, Matthew W. State³, Matthew W. State⁴, Michael Wigler¹•Institutions (8)

Cold Spring Harbor Laboratory¹, University of Washington², Yale University³, University of California, San Francisco⁴, Peking University⁵, Zonguldak Karaelmas University⁶, Yeshiva University⁷, Howard Hughes Medical Institute⁸

13 Nov 2014-Nature

TL;DR: It is estimated that LGD mutation in about 400 genes can contribute to the joint class of affected females and males of lower IQ, with an overlapping and similar number of genes vulnerable to contributory missense mutation.

Abstract: Whole exome sequencing has proven to be a powerful tool for understanding the genetic architecture of human disease. Here we apply it to more than 2,500 simplex families, each having a child with an autistic spectrum disorder. By comparing affected to unaffected siblings, we show that 13% of de novo missense mutations and 43% of de novo likely gene-disrupting (LGD) mutations contribute to 12% and 9% of diagnoses, respectively. Including copy number variants, coding de novo mutations contribute to about 30% of all simplex and 45% of female diagnoses. Almost all LGD mutations occur opposite wild-type alleles. LGD targets in affected females significantly overlap the targets in males of lower intelligence quotient (IQ), but neither overlaps significantly with targets in males of higher IQ. We estimate that LGD mutation in about 400 genes can contribute to the joint class of affected females and males of lower IQ, with an overlapping and similar number of genes vulnerable to contributory missense mutation. LGD targets in the joint class overlap with published targets for intellectual disability and schizophrenia, and are enriched for chromatin modifiers, FMRP-associated genes and embryonically expressed genes. Most of the significance for the latter comes from affected females.

2,124 citations

Journal Article

Patterns and rates of exonic de novo mutations in autism spectrum disorders

Benjamin M. Neale¹, Yan Kou², Li Liu³, Avi Ma'ayan², Kaitlin E. Samocha¹, Kaitlin E. Samocha⁴, Aniko Sabo⁵, Chiao-Feng Lin⁶, Christine Stevens⁴, Li-San Wang⁶, Vladimir Makarov², Paz Polak⁴, Paz Polak⁷, Seungtai Yoon², Jared Maguire⁴, Emily L. Crawford⁸, Nicholas G. Campbell⁸, Evan T. Geller⁶, Otto Valladares⁶, Chad M. Schafer³, Han Liu⁹, Tuo Zhao⁹, Guiqing Cai², Jayon Lihm², Ruth Dannenfelser², Omar Jabado², Zuleyma Peralta², Uma Nagaswamy⁵, Donna M. Muzny⁵, Jeffrey G. Reid⁵, Irene Newsham⁵, Yuanqing Wu⁵, Lora Lewis⁵, Yi Han⁵, Benjamin F. Voight⁴, Benjamin F. Voight⁶, Elaine T. Lim¹, Elaine T. Lim⁴, Elizabeth J. Rossin¹, Elizabeth J. Rossin⁴, Andrew Kirby¹, Andrew Kirby⁴, Jason Flannick⁴, Menachem Fromer¹, Menachem Fromer⁴, Khalid Shakir⁴, Timothy Fennell⁴, Kiran V. Garimella⁴, Eric Banks⁴, Ryan Poplin⁴, Stacey Gabriel⁴, Mark A. DePristo⁴, Jack R. Wimbish, Braden E. Boone, Shawn Levy, Catalina Betancur¹⁰, Shamil R. Sunyaev⁷, Shamil R. Sunyaev⁴, Eric Boerwinkle⁵, Eric Boerwinkle¹¹, Joseph D. Buxbaum, Edwin H. Cook¹², Bernie Devlin¹³, Richard A. Gibbs⁵, Kathryn Roeder³, Gerard D. Schellenberg⁶, James S. Sutcliffe⁸, Mark J. Daly¹, Mark J. Daly⁴•Institutions (13)

Harvard University¹, Icahn School of Medicine at Mount Sinai², Carnegie Mellon University³, Broad Institute⁴, Baylor College of Medicine⁵, University of Pennsylvania⁶, Brigham and Women's Hospital⁷, Vanderbilt University⁸, Johns Hopkins University⁹, French Institute of Health and Medical Research¹⁰, University of Texas Health Science Center at Houston¹¹, University of Illinois at Chicago¹², University of Pittsburgh¹³

04 Apr 2012-Nature

TL;DR: Results from de novo events and a large parallel case–control study provide strong evidence in favour of CHD8 and KATNAL2 as genuine autism risk factors and support polygenic models in which spontaneous coding mutations in any of a large number of genes increases risk by 5- to 20-fold.

Abstract: Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, here we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant, and the overall rate of mutation is only modestly higher than the expected rate. In contrast, the proteins encoded by genes that harboured de novo missense or nonsense mutations showed a higher degree of connectivity among themselves and to previous ASD genes as indexed by protein-protein interaction screens. The small increase in the rate of de novo events, when taken together with the protein interaction results, are consistent with an important but limited role for de novo point mutations in ASD, similar to that documented for de novo copy number variants. Genetic models incorporating these data indicate that most of the observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (that is, not necessarily sufficient for disease). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increases risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favour of CHD8 and KATNAL2 as genuine autism risk factors.

1,700 citations

Cited by

Journal Article

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek, Konrad J. Karczewski¹, Konrad J. Karczewski², Eric Vallabh Minikel¹, Eric Vallabh Minikel², Kaitlin E. Samocha, Eric Banks¹, Timothy Fennell¹, Anne H. O’Donnell-Luria³, Anne H. O’Donnell-Luria², Anne H. O’Donnell-Luria¹, James S. Ware, Andrew J. Hill⁴, Andrew J. Hill¹, Andrew J. Hill², Beryl B. Cummings², Beryl B. Cummings¹, Taru Tukiainen¹, Taru Tukiainen², Daniel P. Birnbaum¹, Jack A. Kosmicki, Laramie E. Duncan¹, Laramie E. Duncan², Karol Estrada¹, Karol Estrada², Fengmei Zhao², Fengmei Zhao¹, James Zou¹, Emma Pierce-Hoffman², Emma Pierce-Hoffman¹, Joanne Berghout⁵, David Neil Cooper⁶, Nicole A. Deflaux⁷, Mark A. DePristo¹, Ron Do, Jason Flannick¹, Jason Flannick², Menachem Fromer, Laura D. Gauthier¹, Jackie Goldstein¹, Jackie Goldstein², Namrata Gupta¹, Daniel P. Howrigan², Daniel P. Howrigan¹, Adam Kiezun¹, Mitja I. Kurki¹, Mitja I. Kurki², Ami Levy Moonshine¹, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso¹, Gina M. Peloso², Ryan Poplin¹, Manuel A. Rivas¹, Valentin Ruano-Rubio¹, Samuel A. Rose¹, Douglas M. Ruderfer⁸, Khalid Shakir¹, Peter D. Stenson⁶, Christine Stevens¹, Brett Thomas², Brett Thomas¹, Grace Tiao¹, María Teresa Tusié-Luna, Ben Weisburd¹, Hong-Hee Won⁹, Dongmei Yu, David Altshuler¹, David Altshuler¹⁰, Diego Ardissino, Michael Boehnke¹¹, John Danesh¹², Stacey Donnelly¹, Roberto Elosua, Jose C. Florez¹, Jose C. Florez², Stacey Gabriel¹, Gad Getz², Gad Getz¹, Stephen J. Glatt¹³, Christina M. Hultman¹⁴, Sekar Kathiresan, Markku Laakso¹⁵, Steven A. McCarroll¹, Steven A. McCarroll², Mark I. McCarthy¹⁶, Mark I. McCarthy¹⁷, Dermot P.B. McGovern¹⁸, Ruth McPherson¹⁹, Benjamin M. Neale¹, Benjamin M. Neale², Aarno Palotie, Shaun Purcell⁸, Danish Saleheen²⁰, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan²¹, Patrick F. Sullivan¹⁴, Jaakko Tuomilehto²², Ming T. Tsuang²³, Hugh Watkins¹⁶, Hugh Watkins¹⁷, James G. Wilson²⁴, Mark J. Daly¹, Mark J. Daly², Daniel G. MacArthur¹, Daniel G. MacArthur²•Institutions (24)

Broad Institute¹, Harvard University², Boston Children's Hospital³, University of Washington⁴, University of Arizona⁵, Cardiff University⁶, Google⁷, Icahn School of Medicine at Mount Sinai⁸, Samsung Medical Center⁹, Vertex Pharmaceuticals¹⁰, University of Michigan¹¹, University of Cambridge¹², State University of New York Upstate Medical University¹³, Karolinska Institutet¹⁴, University of Eastern Finland¹⁵, University of Oxford¹⁶, Wellcome Trust Centre for Human Genetics¹⁷, Cedars-Sinai Medical Center¹⁸, University of Ottawa¹⁹, University of Pennsylvania²⁰, University of North Carolina at Chapel Hill²¹, University of Helsinki²², University of California, San Diego²³, University of Mississippi Medical Center²⁴

18 Aug 2016-Nature

TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.

Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations

Journal Article

Finding the missing heritability of complex diseases

Teri A. Manolio¹, Francis S. Collins¹, Nancy J. Cox², David Goldstein³, Lucia A. Hindorff¹, David J. Hunter⁴, Mark I. McCarthy⁵, Erin M. Ramos¹, Lon R. Cardon⁶, Aravinda Chakravarti⁷, Judy H. Cho⁸, Alan E. Guttmacher¹, Augustine Kong⁹, Leonid Kruglyak¹⁰, Leonid Kruglyak¹¹, Elaine R. Mardis¹², Charles N. Rotimi¹, Montgomery Slatkin¹³, David Valle⁷, Alice S. Whittemore¹⁴, Michael Boehnke¹⁵, Andrew G. Clark¹⁶, Evan E. Eichler¹⁷, Greg Gibson¹⁸, Jonathan L. Haines¹⁹, Trudy F. C. Mackay²⁰, Steven A. McCarroll⁴, Peter M. Visscher²¹•Institutions (21)

National Institutes of Health¹, University of Chicago², Duke University³, Harvard University⁴, University of Oxford⁵, GlaxoSmithKline⁶, Johns Hopkins University⁷, Yale University⁸, deCODE genetics⁹, Howard Hughes Medical Institute¹⁰, Princeton University¹¹, Washington University in St. Louis¹², University of California, Berkeley¹³, Stanford University¹⁴, University of Michigan¹⁵, Cornell University¹⁶, University of Washington¹⁷, University of Queensland¹⁸, Vanderbilt University¹⁹, North Carolina State University²⁰, QIMR Berghofer Medical Research Institute²¹

08 Oct 2009-Nature

TL;DR: This paper examined potential sources of missing heritability and proposed research strategies, including and extending beyond current genome-wide association approaches, to illuminate the genetics of complex diseases and enhance its potential to enable effective disease prevention or treatment.

Abstract: Genome-wide association studies have identified hundreds of genetic variants associated with complex human diseases and traits, and have provided valuable insights into their genetic architecture. Most variants identified so far confer relatively small increments in risk, and explain only a small proportion of familial clustering, leading many to question how the remaining, 'missing' heritability can be explained. Here we examine potential sources of missing heritability and propose research strategies, including and extending beyond current genome-wide association approaches, to illuminate the genetics of complex diseases and enhance its potential to enable effective disease prevention or treatment.

7,797 citations

Journal Article

An integrated map of genetic variation from 1,092 human genomes

Gonçalo R. Abecasis¹, Adam Auton², Lisa D. Brooks³, Mark A. DePristo⁴, Richard Durbin⁵, Robert E. Handsaker⁶, Robert E. Handsaker⁴, Hyun Min Kang¹, Gabor T. Marth⁷, Gil McVean⁸•Institutions (8)

University of Michigan¹, Yeshiva University², National Institutes of Health³, Broad Institute⁴, Wellcome Trust Sanger Institute⁵, Harvard University⁶, Boston College⁷, University of Oxford⁸

01 Nov 2012-Nature

TL;DR: It is shown that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites.

Abstract: By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations.

7,710 citations

Journal Article

A general framework for estimating the relative pathogenicity of human genetic variants

Martin Kircher¹, Daniela Witten¹, Preti Jain, Brian J. O'Roak², Brian J. O'Roak¹, Gregory M. Cooper, Jay Shendure¹•Institutions (2)

University of Washington¹, Oregon Health & Science University²

01 Mar 2014-Nature Genetics

TL;DR: The ability of CADD to prioritize functional, deleterious and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current single-annotation method.

Abstract: Our capacity to sequence human genomes has exceeded our ability to interpret genetic variation. Current genomic annotations tend to exploit a single information type (e.g. conservation) and/or are restricted in scope (e.g. to missense changes). Here, we describe Combined Annotation Dependent Depletion (CADD), a framework that objectively integrates many diverse annotations into a single, quantitative score. We implement CADD as a support vector machine trained to differentiate 14.7 million high-frequency human derived alleles from 14.7 million simulated variants. We pre-compute “C-scores” for all 8.6 billion possible human single nucleotide variants and enable scoring of short insertions/deletions. C-scores correlate with allelic diversity, annotations of functionality, pathogenicity, disease severity, experimentally measured regulatory effects, and complex trait associations, and highly rank known pathogenic variants within individual genomes. The ability of CADD to prioritize functional, deleterious, and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current annotation.

4,956 citations

Journal Article

The mutational constraint spectrum quantified from variation in 141,456 humans

Konrad J. Karczewski¹, Laurent C. Francioli¹, Grace Tiao¹, Beryl B. Cummings¹, Jessica Alföldi¹, Qingbo Wang¹, Ryan L. Collins¹, Kristen M. Laricchia¹, Andrea Ganna¹, Daniel P. Birnbaum¹, Laura D. Gauthier¹, Harrison Brand¹, Matthew Solomonson¹, Nicholas A. Watts¹, Daniel R. Rhodes², Moriel Singer-Berk¹, Eleina M. England¹, Eleanor G. Seaby¹, Jack A. Kosmicki¹, Raymond K. Walters¹, Katherine Tashman¹, Yossi Farjoun¹, Eric Banks¹, Timothy Poterba¹, Arcturus Wang¹, Cotton Seed¹, Nicola Whiffin¹, Jessica X. Chong³, Kaitlin E. Samocha⁴, Emma Pierce-Hoffman¹, Zachary Zappala¹, Anne H. O’Donnell-Luria¹, Eric Vallabh Minikel¹, Ben Weisburd¹, Monkol Lek⁵, James S. Ware¹, Christopher Vittal⁶, Irina M. Armean¹, Louis Bergelson¹, Kristian Cibulskis¹, Kristen M. Connolly¹, Miguel Covarrubias¹, Stacey Donnelly¹, Steven Ferriera¹, Stacey Gabriel¹, Jeff Gentry¹, Namrata Gupta¹, Thibault Jeandet¹, Diane Kaplan¹, Christopher Llanwarne¹, Ruchi Munshi¹, Sam Novod¹, Nikelle Petrillo¹, David Roazen¹, Valentin Ruano-Rubio¹, Andrea Saltzman¹, Molly Schleicher¹, Jose Soto¹, Kathleen Tibbetts¹, Charlotte Tolonen¹, Gordon Wade¹, Michael E. Talkowski¹, Benjamin M. Neale¹, Mark J. Daly¹, Daniel G. MacArthur¹•Institutions (6)

Broad Institute¹, Queen Mary University of London², University of Washington³, Wellcome Trust Sanger Institute⁴, Yale University⁵, Harvard University⁶

27 May 2020-Nature

TL;DR: A catalogue of predicted loss-of-function variants in 125,748 whole-exome and 15,708 whole-genome sequencing datasets from the Genome Aggregation Database (gnomAD) reveals the spectrum of mutational constraints that affect these human protein-coding genes.

Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.

4,913 citations
