Author

Laura D. Gauthier

Other affiliations: Harvard University, Massachusetts Institute of Technology

Bio: Laura D. Gauthier is an academic researcher at the Broad Institute whose research topics include genome and exome sequencing. The author has an h-index of 13 and has co-authored 26 publications that have received 13,937 citations. Previous affiliations include Harvard University and the Massachusetts Institute of Technology.

Topics: Genome, Exome sequencing, Gene, Exome, Biology

Papers

Journal Article

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek, Konrad J. Karczewski¹,², Eric Vallabh Minikel¹,², Kaitlin E. Samocha, Eric Banks², Timothy Fennell², Anne H. O’Donnell-Luria¹,²,³, James S. Ware, Andrew J. Hill¹,²,⁴, Beryl B. Cummings¹,², Taru Tukiainen¹,², Daniel P. Birnbaum², Jack A. Kosmicki, Laramie E. Duncan¹,², Karol Estrada¹,², Fengmei Zhao¹,², James Zou², Emma Pierce-Hoffman¹,², Joanne Berghout⁵, David Neil Cooper⁶, Nicole A. Deflaux⁷, Mark A. DePristo², Ron Do, Jason Flannick¹,², Menachem Fromer, Laura D. Gauthier², Jackie Goldstein¹,², Namrata Gupta², Daniel P. Howrigan¹,², Adam Kiezun², Mitja I. Kurki¹,², Ami Levy Moonshine², Pradeep Natarajan, Lorena Orozco, Gina M. Peloso¹,², Ryan Poplin², Manuel A. Rivas², Valentin Ruano-Rubio², Samuel A. Rose², Douglas M. Ruderfer⁸, Khalid Shakir², Peter D. Stenson⁶, Christine Stevens², Brett Thomas¹,², Grace Tiao², María Teresa Tusié-Luna, Ben Weisburd², Hong-Hee Won⁹, Dongmei Yu, David Altshuler²,¹⁰, Diego Ardissino, Michael Boehnke¹¹, John Danesh¹², Stacey Donnelly², Roberto Elosua, Jose C. Florez¹,², Stacey Gabriel², Gad Getz¹,², Stephen J. Glatt¹³, Christina M. Hultman¹⁴, Sekar Kathiresan, Markku Laakso¹⁵, Steven A. McCarroll¹,², Mark I. McCarthy¹⁶,¹⁷, Dermot P.B. McGovern¹⁸, Ruth McPherson¹⁹, Benjamin M. Neale¹,², Aarno Palotie, Shaun Purcell⁸, Danish Saleheen²⁰, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan¹⁴,²¹, Jaakko Tuomilehto²², Ming T. Tsuang²³, Hugh Watkins¹⁶,¹⁷, James G. Wilson²⁴, Mark J. Daly¹,², Daniel G. MacArthur¹,²

Harvard University¹, Broad Institute², Boston Children's Hospital³, University of Washington⁴, University of Arizona⁵, Cardiff University⁶, Google⁷, Icahn School of Medicine at Mount Sinai⁸, Samsung Medical Center⁹, Vertex Pharmaceuticals¹⁰, University of Michigan¹¹, University of Cambridge¹², State University of New York Upstate Medical University¹³, Karolinska Institutet¹⁴, University of Eastern Finland¹⁵, University of Oxford¹⁶, Wellcome Trust Centre for Human Genetics¹⁷, Cedars-Sinai Medical Center¹⁸, University of Ottawa¹⁹, University of Pennsylvania²⁰, University of North Carolina at Chapel Hill²¹, University of Helsinki²², University of California, San Diego²³, University of Mississippi Medical Center²⁴

18 Aug 2016, Nature

TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.

Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations

Journal Article

The mutational constraint spectrum quantified from variation in 141,456 humans

Konrad J. Karczewski¹, Laurent C. Francioli¹, Grace Tiao¹, Beryl B. Cummings¹, Jessica Alföldi¹, Qingbo Wang¹, Ryan L. Collins¹, Kristen M. Laricchia¹, Andrea Ganna¹, Daniel P. Birnbaum¹, Laura D. Gauthier¹, Harrison Brand¹, Matthew Solomonson¹, Nicholas A. Watts¹, Daniel R. Rhodes², Moriel Singer-Berk¹, Eleina M. England¹, Eleanor G. Seaby¹, Jack A. Kosmicki¹, Raymond K. Walters¹, Katherine Tashman¹, Yossi Farjoun¹, Eric Banks¹, Timothy Poterba¹, Arcturus Wang¹, Cotton Seed¹, Nicola Whiffin¹, Jessica X. Chong³, Kaitlin E. Samocha⁴, Emma Pierce-Hoffman¹, Zachary Zappala¹, Anne H. O’Donnell-Luria¹, Eric Vallabh Minikel¹, Ben Weisburd¹, Monkol Lek⁵, James S. Ware¹, Christopher Vittal⁶, Irina M. Armean¹, Louis Bergelson¹, Kristian Cibulskis¹, Kristen M. Connolly¹, Miguel Covarrubias¹, Stacey Donnelly¹, Steven Ferriera¹, Stacey Gabriel¹, Jeff Gentry¹, Namrata Gupta¹, Thibault Jeandet¹, Diane Kaplan¹, Christopher Llanwarne¹, Ruchi Munshi¹, Sam Novod¹, Nikelle Petrillo¹, David Roazen¹, Valentin Ruano-Rubio¹, Andrea Saltzman¹, Molly Schleicher¹, Jose Soto¹, Kathleen Tibbetts¹, Charlotte Tolonen¹, Gordon Wade¹, Michael E. Talkowski¹, Benjamin M. Neale¹, Mark J. Daly¹, Daniel G. MacArthur¹

Broad Institute¹, Queen Mary University of London², University of Washington³, Wellcome Trust Sanger Institute⁴, Yale University⁵, Harvard University⁶

27 May 2020, Nature

TL;DR: A catalogue of predicted loss-of-function variants in 125,748 whole-exome and 15,708 whole-genome sequencing datasets from the Genome Aggregation Database (gnomAD) reveals the spectrum of mutational constraints that affect these human protein-coding genes.

Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.

4,913 citations

Posted Content

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek¹, Konrad J. Karczewski¹, Eric Vallabh Minikel¹, Kaitlin E. Samocha¹, Eric Banks², Timothy Fennell², Anne H. O’Donnell-Luria¹, James S. Ware², Andrew J. Hill¹, Beryl B. Cummings¹, Taru Tukiainen¹, Daniel P. Birnbaum¹, Jack A. Kosmicki¹, Laramie E. Duncan¹, Karol Estrada¹, Fengmei Zhao¹, James Zou², Emma Pierce-Hoffman¹, David Neil Cooper³, Mark A. DePristo², Ron Do⁴, Jason Flannick², Menachem Fromer¹, Laura D. Gauthier², Jackie Goldstein¹, Namrata Gupta², Daniel P. Howrigan¹, Adam Kiezun², Mitja I. Kurki², Ami Levy Moonshine², Pradeep Natarajan², Lorena Orozco, Gina M. Peloso², Ryan Poplin², Manuel A. Rivas², Valentin Ruano-Rubio², Douglas M. Ruderfer⁴, Khalid Shakir², Peter D. Stenson³, Christine Stevens², Brett Thomas¹, Grace Tiao², María Teresa Tusié-Luna, Ben Weisburd², Hong-Hee Won², Dongmei Yu², David Altshuler², Diego Ardissino, Michael Boehnke⁵, John Danesh⁶, Roberto Elosua, Jose C. Florez², Stacey Gabriel², Gad Getz², Christina M. Hultman⁷, Sekar Kathiresan², Markku Laakso⁸, Steven A. McCarroll², Mark I. McCarthy⁹, Dermot P.B. McGovern¹⁰, Ruth McPherson¹¹, Benjamin M. Neale¹, Aarno Palotie¹², Shaun Purcell⁴, Danish Saleheen¹³, Jeremiah M. Scharf², Pamela Sklar⁴, Patrick F. Sullivan¹⁴, Jaakko Tuomilehto¹², Hugh Watkins⁹, James G. Wilson¹⁵, Mark J. Daly¹, Daniel G. MacArthur¹

Harvard University¹, Broad Institute², Cardiff University³, Icahn School of Medicine at Mount Sinai⁴, University of Michigan⁵, University of Cambridge⁶, Karolinska Institutet⁷, University of Eastern Finland⁸, University of Oxford⁹, Cedars-Sinai Medical Center¹⁰, University of Ottawa¹¹, University of Helsinki¹², University of Pennsylvania¹³, University of North Carolina at Chapel Hill¹⁴, University of Mississippi Medical Center¹⁵

30 Oct 2015, bioRxiv

TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.

Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities. The resulting catalogue of human genetic diversity has unprecedented resolution, with an average of one variant every eight bases of coding sequence and the presence of widespread mutational recurrence. The deep catalogue of variation provided by the Exome Aggregation Consortium (ExAC) can be used to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; we identify 3,230 genes with near-complete depletion of truncating variants, 79% of which have no currently established human disease phenotype. Finally, we show that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human knockout variants in protein-coding genes.

1,552 citations

Posted Content

Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes

Konrad J. Karczewski¹,², Laurent C. Francioli¹,², Grace Tiao¹,², Beryl B. Cummings¹,², Jessica Alföldi¹,², Qingbo Wang¹,², Ryan L. Collins¹,², Kristen M. Laricchia¹,², Andrea Ganna¹,²,³, Daniel P. Birnbaum¹, Laura D. Gauthier¹, Harrison Brand¹,², Matthew Solomonson¹,², Nicholas A. Watts¹,², Daniel R. Rhodes⁴, Moriel Singer-Berk¹, Eleanor G. Seaby¹,², Jack A. Kosmicki¹,², Raymond K. Walters¹,², Katherine Tashman¹,², Yossi Farjoun¹, Eric Banks¹, Timothy Poterba¹,², Arcturus Wang¹,², Cotton Seed¹,², Nicola Whiffin¹,⁵, Jessica X. Chong⁶, Kaitlin E. Samocha⁷, Emma Pierce-Hoffman¹, Zachary Zappala¹,⁸, Anne H. O’Donnell-Luria¹,²,⁹, Eric Vallabh Minikel¹, Ben Weisburd¹, Monkol Lek¹,¹⁰, James S. Ware¹,⁵, Christopher Vittal¹,², Irina M. Armean¹,²,¹¹, Louis Bergelson¹, Kristian Cibulskis¹, Kristen M. Connolly¹, Miguel Covarrubias¹, Stacey Donnelly¹, Steven Ferriera¹, Stacey Gabriel¹, Jeff Gentry¹, Namrata Gupta¹, Thibault Jeandet¹, Diane Kaplan¹, Christopher Llanwarne¹, Ruchi Munshi¹, Sam Novod¹, Nikelle Petrillo¹, David Roazen¹, Valentin Ruano-Rubio¹, Andrea Saltzman¹, Molly Schleicher¹, Jose Soto¹, Kathleen Tibbetts¹, Charlotte Tolonen¹, Gordon Wade¹, Michael E. Talkowski¹,², Benjamin M. Neale¹,², Mark J. Daly¹, Daniel G. MacArthur¹,²

Broad Institute¹, Harvard University², University of Helsinki³, Queen Mary University of London⁴, National Institutes of Health⁵, University of Washington⁶, Wellcome Trust Sanger Institute⁷, Vertex Pharmaceuticals⁸, Boston Children's Hospital⁹, Yale University¹⁰, European Bioinformatics Institute¹¹

30 Jan 2019, bioRxiv

TL;DR: Using an improved human mutation rate model, the authors classify human protein-coding genes along a spectrum representing tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.

Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism’s function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved model of human mutation, we classify human protein-coding genes along a spectrum representing intolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.

1,128 citations

Posted Content

Scaling accurate genetic variant discovery to tens of thousands of samples

Ryan Poplin¹, Valentin Ruano-Rubio¹, Mark A. DePristo¹, Timothy Fennell¹, Mauricio O. Carneiro¹, G. A. Van der Auwera¹, David E. Kling¹, Laura D. Gauthier¹, Ami Levy-Moonshine¹, David Roazen¹, Khalid Shakir¹, J. Thibault¹, S. Chandran¹, Christopher W. Whelan¹, Monkol Lek², Stacey Gabriel¹, Mark J. Daly², Benjamin M. Neale², Daniel G. MacArthur², Eric Banks¹

Broad Institute¹, Harvard University²

14 Nov 2017, bioRxiv

TL;DR: The paper describes a novel assembly-based approach to variant calling, the GATK HaplotypeCaller and Reference Confidence Model, that determines genotype likelihoods independently per-sample but performs joint calling across all samples within a project simultaneously, and shows that its indel calling accuracy is superior to that of other algorithms.

Abstract: Comprehensive disease gene discovery in both common and rare diseases will require the efficient and accurate detection of all classes of genetic variation across tens to hundreds of thousands of human samples. We describe here a novel assembly-based approach to variant calling, the GATK HaplotypeCaller (HC) and Reference Confidence Model (RCM), that determines genotype likelihoods independently per-sample but performs joint calling across all samples within a project simultaneously. We show by calling over 90,000 samples from the Exome Aggregation Consortium (ExAC) that, in contrast to other algorithms, the HC-RCM scales efficiently to very large sample sizes without loss in accuracy; and that the accuracy of indel variant calling is superior in comparison to other algorithms. More importantly, the HC-RCM produces a fully squared-off matrix of genotypes across all samples at every genomic position being investigated. The HC-RCM is a novel, scalable, assembly-based algorithm with abundant applications for population genetics and clinical studies.

1,033 citations

Cited by

Journal Article

Minimap2: pairwise alignment for nucleotide sequences

Heng Li¹

Broad Institute¹

15 Sep 2018, Bioinformatics

TL;DR: Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment.

Abstract: Motivation: Recent advances in sequencing technologies promise ultra-long reads of ∼100 kb in average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 Mb in length. Existing alignment programs are unable or inefficient to process such data at scale, which presses for the development of new alignment algorithms. Results: Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥100 bp in length, ≥1 kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap cost for long insertions and deletions and introduces new heuristics to reduce spurious alignments. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment. Availability and implementation: https://github.com/lh3/minimap2. Supplementary information: Supplementary data are available at Bioinformatics online.

6,264 citations

Journal Article

The mutational constraint spectrum quantified from variation in 141,456 humans

Broad Institute¹, Queen Mary University of London², University of Washington³, Wellcome Trust Sanger Institute⁴, Yale University⁵, Harvard University⁶

27 May 2020, Nature

4,913 citations

Journal Article

The UK Biobank resource with deep phenotyping and genomic data

Clare Bycroft¹, Colin Freeman¹, Desislava Petkova¹,², Gavin Band¹, Lloyd T. Elliott¹, Kevin Sharp¹, Allan Motyer³, Damjan Vukcevic³, Olivier Delaneau⁴,⁵, Jared O'Connell⁶, Adrian Cortes¹,⁷, Samantha Welsh, Alan Young¹, Mark Effingham, Gil McVean¹, Stephen Leslie³, Naomi E. Allen¹, Peter Donnelly¹, Jonathan Marchini¹

University of Oxford¹, Procter & Gamble², University of Melbourne³, Swiss Institute of Bioinformatics⁴, University of Geneva⁵, Illumina⁶, John Radcliffe Hospital⁷

11 Oct 2018, Nature

TL;DR: Deep phenotype and genome-wide genetic data from approximately 500,000 UK Biobank participants are described, including population structure and relatedness in the cohort and imputation that increases the number of testable variants to around 96 million.

Abstract: The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

4,489 citations

Integrative analysis of 111 reference human epigenomes

Anshul Kundaje, Wouter Meuleman, Jason Ernst, Angela Yen, Pouya Kheradpour, Zhizhuo Zhang, Jianrong Wang, Lucas D. Ward, Abhishek Sarkar, Gerald Quon, Matthew L. Eaton, Yi-Chieh Wu, Andreas R. Pfenning, Xinchen Wang, Melina Claussnitzer, Yaping Liu, Mukul S. Bansal, Soheil Feizi-Khankandi, Ah Ram Kim, Richard C Sallari, Nicholas A Sinnott-Armstrong, Laurie A. Boyer, Elizabeta Gjoneska, Li-Huei Tsai, Manolis Kellis

01 Feb 2015

TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.

Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal Article

Genetic effects on gene expression across human tissues.

Enhancing GTEx (eGTEx) groups¹, NIH Common Fund², NHGRI, Biospecimen Core Resource—VARI, ELSI study, Genome Browser Data Integration Visualization—EBI, Lead analysts, Alexis Battle³, Christopher D. Brown⁴, Barbara E. Engelhardt¹, Stephen B. Montgomery²

Princeton University¹, Stanford University², Johns Hopkins University³, University of Pennsylvania⁴

12 Oct 2017, Nature

TL;DR: It is found that local genetic variation affects gene expression levels for the majority of genes, and inter-chromosomal genetic effects for 93 genes and 112 loci are identified, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

Abstract: Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

3,289 citations
