Author

Thomas Willems

Other affiliations: Vertex Pharmaceuticals, University of California, Berkeley, Lawrence Berkeley National Laboratory

Bio: Thomas Willems is an academic researcher from Massachusetts Institute of Technology. The author has contributed to research on topics including the 1000 Genomes Project and genomics. The author has an h-index of 14 and has co-authored 19 publications receiving 13,404 citations. Previous affiliations of Thomas Willems include Vertex Pharmaceuticals and University of California, Berkeley.

Papers

Journal Article•DOI•

A global reference for human genetic variation.

Adam Auton¹, Gonçalo R. Abecasis², David Altshuler³, Richard Durbin⁴ +514 more•Institutions (90)

01 Oct 2015-Nature

TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

12,661 citations

A global reference for human genetic variation

Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin +476 more

01 Oct 2015

TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations; this paper reports the completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.

3,247 citations

Journal Article•DOI•

The Simons Genome Diversity Project: 300 genomes from 142 diverse populations

Swapan Mallick¹,²,³, Heng Li², Mark Lipson¹, Iain Mathieson¹, Melissa Gymrek, Fernando Racimo⁴, Mengyao Zhao¹,²,³, Niru Chennagiri¹,²,³, Susanne Nordenfelt¹,²,³, Arti Tandon¹,², Pontus Skoglund¹,², Iosif Lazaridis¹,², Sriram Sankararaman¹,²,⁵, Qiaomei Fu¹,²,⁶, Nadin Rohland¹,², Gabriel Renaud⁷, Yaniv Erlich⁸, Thomas Willems⁹, Carla Gallo¹⁰, Jeffrey P. Spence⁴, Yun S. Song⁴,¹¹, Giovanni Poletti¹⁰, Francois Balloux¹², George van Driem¹³, Peter de Knijff¹⁴, Irene Gallego Romero¹⁵, Aashish R. Jha¹⁶, Doron M. Behar¹⁷, Claudio M. Bravi¹⁸, Cristian Capelli¹⁹, Tor Hervig²⁰, Andrés Moreno-Estrada, Olga L. Posukh²¹, Elena Balanovska, Oleg Balanovsky²², Sena Karachanak-Yankova²³, Hovhannes Sahakyan¹⁷,²⁴, Draga Toncheva²³, Levon Yepiskoposyan²⁴, Chris Tyler-Smith²⁵, Yali Xue²⁵, M. Syafiq Abdullah²⁶, Andres Ruiz-Linares¹², Cynthia M. Beall²⁷, Anna Di Rienzo¹⁶, Choongwon Jeong¹⁶, Elena B. Starikovskaya, Ene Metspalu¹⁷,²⁸, Jüri Parik¹⁷, Richard Villems¹⁷,²⁸,²⁹, Brenna M. Henn³⁰, Ugur Hodoglugil³¹, Robert W. Mahley³², Antti Sajantila³³, George Stamatoyannopoulos³⁴, Joseph Wee, Rita Khusainova³⁵, Elza Khusnutdinova³⁵, Sergey Litvinov¹⁷,³⁵, George Ayodo³⁶, David Comas³⁷, Michael F. Hammer³⁸, Toomas Kivisild¹⁷,³⁹, William Klitz, Cheryl A. Winkler⁴⁰, Damian Labuda⁴¹, Michael J. Bamshad³⁴, Lynn B. Jorde⁴², Sarah A. Tishkoff¹¹, W. Scott Watkins⁴², Mait Metspalu¹⁷, Stanislav Dryomov, Rem I. Sukernik⁴³, Lalji Singh⁵,⁴⁴, Kumarasamy Thangaraj⁴⁴, Svante Pääbo⁷, Janet Kelso⁷, Nick Patterson², David Reich¹,²,³•Institutions (44)

Harvard University¹, Broad Institute², Howard Hughes Medical Institute³, University of California, Berkeley⁴, University of California, Los Angeles⁵, Chinese Academy of Sciences⁶, Max Planck Society⁷, Columbia University⁸, Massachusetts Institute of Technology⁹, Cayetano Heredia University¹⁰, University of Pennsylvania¹¹, University College London¹², University of Bern¹³, Leiden University¹⁴, Nanyang Technological University¹⁵, University of Chicago¹⁶, Estonian Biocentre¹⁷, National University of La Plata¹⁸, University of Oxford¹⁹, University of Bergen²⁰, Novosibirsk State University²¹, Moscow Institute of Physics and Technology²², Sofia Medical University²³, Armenian National Academy of Sciences²⁴, Wellcome Trust Sanger Institute²⁵, Raja Isteri Pengiran Anak Saleha Hospital²⁶, Case Western Reserve University²⁷, University of Tartu²⁸, Estonian Academy of Sciences²⁹, Stony Brook University³⁰, Illumina³¹, Gladstone Institutes³², University of Helsinki³³, University of Washington³⁴, Bashkir State University³⁵, Jaramogi Oginga Odinga University of Science and Technology³⁶, Pompeu Fabra University³⁷, University of Arizona³⁸, University of Cambridge³⁹, Leidos⁴⁰, Université de Montréal⁴¹, University of Utah⁴², Altai State University⁴³, Council of Scientific and Industrial Research⁴⁴

13 Oct 2016-Nature

TL;DR: It is demonstrated that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.

Abstract: Here we report the Simons Genome Diversity Project data set: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioural modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.

1,133 citations

Journal Article•DOI•

Algorithms and tools for high-throughput geometry-based analysis of crystalline porous materials

Thomas Willems¹, Chris H. Rycroft¹,², Michael Kazi¹,², Juan Meza¹, Maciej Haranczyk¹•Institutions (2)

Lawrence Berkeley National Laboratory¹, University of California, Berkeley²

01 Feb 2012-Microporous and Mesoporous Materials

TL;DR: Here, algorithms and tools to efficiently calculate some of the geometrical parameters describing pores are presented based on the Voronoi decomposition, which for a given arrangement of atoms in a periodic domain provides a graph representation of the void space.

1,060 citations

Journal Article•DOI•

Abundant contribution of short tandem repeats to gene expression variation in humans

Melissa Gymrek¹, Thomas Willems¹, Audrey Guilmatre², Haoyang Zeng¹, Barak Markus¹, Stoyan Georgiev³, Mark J. Daly⁴, Alkes L. Price⁴,⁵, Jonathan K. Pritchard³,⁶, Andrew J. Sharp², Yaniv Erlich•Institutions (6)

Massachusetts Institute of Technology¹, Icahn School of Medicine at Mount Sinai², Stanford University³, Broad Institute⁴, Harvard University⁵, Howard Hughes Medical Institute⁶

01 Jan 2016-Nature Genetics

TL;DR: A genome-wide survey of the contribution of short tandem repeats (STRs), which constitute one of the most polymorphic and abundant repeat classes, to gene expression in humans found that expression STRs (eSTRs) are enriched in various clinically relevant conditions and may modulate certain histone modifications.

Abstract: The contribution of repetitive elements to quantitative human traits is largely unknown. Here we report a genome-wide survey of the contribution of short tandem repeats (STRs), which constitute one of the most polymorphic and abundant repeat classes, to gene expression in humans. Our survey identified 2,060 significant expression STRs (eSTRs). These eSTRs were replicable in orthogonal populations and expression assays. We used variance partitioning to disentangle the contribution of eSTRs from that of linked SNPs and indels and found that eSTRs contribute 10-15% of the cis heritability mediated by all common variants. Further functional genomic analyses showed that eSTRs are enriched in conserved regions, colocalize with regulatory elements and may modulate certain histone modifications. By analyzing known genome-wide association study (GWAS) signals and searching for new associations in 1,685 whole genomes from deeply phenotyped individuals, we found that eSTRs are enriched in various clinically relevant conditions. These results highlight the contribution of STRs to the genetic architecture of quantitative human traits.

298 citations

Cited by

Journal Article•DOI•

A global reference for human genetic variation.

Adam Auton¹, Gonçalo R. Abecasis², David Altshuler³, Richard Durbin⁴ +514 more•Institutions (90)

01 Oct 2015-Nature

12,661 citations

Journal Article•DOI•

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek, Konrad J. Karczewski¹,², Eric Vallabh Minikel¹,², Kaitlin E. Samocha, Eric Banks¹, Timothy Fennell¹, Anne H. O’Donnell-Luria¹,²,³, James S. Ware, Andrew J. Hill¹,²,⁴, Beryl B. Cummings¹,², Taru Tukiainen¹,², Daniel P. Birnbaum¹, Jack A. Kosmicki, Laramie E. Duncan¹,², Karol Estrada¹,², Fengmei Zhao¹,², James Zou¹, Emma Pierce-Hoffman¹,², Joanne Berghout⁵, David Neil Cooper⁶, Nicole A. Deflaux⁷, Mark A. DePristo¹, Ron Do, Jason Flannick¹,², Menachem Fromer, Laura D. Gauthier¹, Jackie Goldstein¹,², Namrata Gupta¹, Daniel P. Howrigan¹,², Adam Kiezun¹, Mitja I. Kurki¹,², Ami Levy Moonshine¹, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso¹,², Ryan Poplin¹, Manuel A. Rivas¹, Valentin Ruano-Rubio¹, Samuel A. Rose¹, Douglas M. Ruderfer⁸, Khalid Shakir¹, Peter D. Stenson⁶, Christine Stevens¹, Brett Thomas¹,², Grace Tiao¹, María Teresa Tusié-Luna, Ben Weisburd¹, Hong-Hee Won⁹, Dongmei Yu, David Altshuler¹,¹⁰, Diego Ardissino, Michael Boehnke¹¹, John Danesh¹², Stacey Donnelly¹, Roberto Elosua, Jose C. Florez¹,², Stacey Gabriel¹, Gad Getz¹,², Stephen J. Glatt¹³, Christina M. Hultman¹⁴, Sekar Kathiresan, Markku Laakso¹⁵, Steven A. McCarroll¹,², Mark I. McCarthy¹⁶,¹⁷, Dermot P.B. McGovern¹⁸, Ruth McPherson¹⁹, Benjamin M. Neale¹,², Aarno Palotie, Shaun Purcell⁸, Danish Saleheen²⁰, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan¹⁴,²¹, Jaakko Tuomilehto²², Ming T. Tsuang²³, Hugh Watkins¹⁶,¹⁷, James G. Wilson²⁴, Mark J. Daly¹,², Daniel G. MacArthur¹,²•Institutions (24)

Broad Institute¹, Harvard University², Boston Children's Hospital³, University of Washington⁴, University of Arizona⁵, Cardiff University⁶, Google⁷, Icahn School of Medicine at Mount Sinai⁸, Samsung Medical Center⁹, Vertex Pharmaceuticals¹⁰, University of Michigan¹¹, University of Cambridge¹², State University of New York Upstate Medical University¹³, Karolinska Institutet¹⁴, University of Eastern Finland¹⁵, Wellcome Trust Centre for Human Genetics¹⁶, University of Oxford¹⁷, Cedars-Sinai Medical Center¹⁸, University of Ottawa¹⁹, University of Pennsylvania²⁰, University of North Carolina at Chapel Hill²¹, University of Helsinki²², University of California, San Diego²³, University of Mississippi Medical Center²⁴

18 Aug 2016-Nature

TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.

Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations

Journal Article•DOI•

The UK Biobank resource with deep phenotyping and genomic data

Clare Bycroft¹, Colin Freeman¹, Desislava Petkova¹,², Gavin Band¹, Lloyd T. Elliott¹, Kevin Sharp¹, Allan Motyer³, Damjan Vukcevic³, Olivier Delaneau⁴,⁵, Jared O'Connell⁶, Adrian Cortes¹,⁷, Samantha Welsh, Alan Young¹, Mark Effingham, Gil McVean¹, Stephen Leslie³, Naomi E. Allen¹, Peter Donnelly¹, Jonathan Marchini¹•Institutions (7)

University of Oxford¹, Procter & Gamble², University of Melbourne³, University of Geneva⁴, Swiss Institute of Bioinformatics⁵, Illumina⁶, John Radcliffe Hospital⁷

11 Oct 2018-Nature

TL;DR: Deep phenotype and genome-wide genetic data from approximately 500,000 UK Biobank participants are described, including population structure and relatedness in the cohort, and imputation that increases the number of testable variants to around 96 million.

Abstract: The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

4,489 citations

Journal Article•DOI•

Genetic effects on gene expression across human tissues.

Enhancing GTEx (eGTEx) groups¹, NIH Common Fund², NHGRI, Biospecimen Core Resource-VARI, ELSI study, Genome Browser Data Integration Visualization-EBI, Lead analysts, Alexis Battle³, Christopher D. Brown⁴, Barbara E. Engelhardt¹, Stephen B. Montgomery²•Institutions (4)

Princeton University¹, Stanford University², Johns Hopkins University³, University of Pennsylvania⁴

12 Oct 2017-Nature

TL;DR: It is found that local genetic variation affects gene expression levels for the majority of genes, and inter-chromosomal genetic effects for 93 genes and 112 loci are identified, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

Abstract: Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

3,289 citations

Journal Article•DOI•

Coming of age: ten years of next-generation sequencing technologies

Sara Goodwin¹, John Douglas McPherson², W. Richard McCombie¹•Institutions (2)

Cold Spring Harbor Laboratory¹, University of California, Davis²

01 Jun 2016-Nature Reviews Genetics

TL;DR: Sequencing approaches that maximize throughput and approaches that read longer contiguous pieces of DNA are providing researchers and clinicians a variety of tools to probe genomes in greater depth, leading to an enhanced understanding of how genome sequence variants underlie phenotype and disease.

Abstract: Since the completion of the Human Genome Project in 2003, extraordinary progress has been made in genome sequencing technologies, which has led to a decreased cost per megabase and an increase in the number and diversity of sequenced genomes. An astonishing complexity of genome architecture has been revealed, bringing these sequencing technologies to even greater advancements. Some approaches maximize the number of bases sequenced in the least amount of time, generating a wealth of data that can be used to understand increasingly complex phenotypes. Alternatively, other approaches now aim to sequence longer contiguous pieces of DNA, which are essential for resolving structurally complex regions. These and other strategies are providing researchers and clinicians a variety of tools to probe genomes in greater depth, leading to an enhanced understanding of how genome sequence variants underlie phenotype and disease.

3,096 citations
