David Altshuler

Other affiliations: Vertex Pharmaceuticals, Massachusetts Institute of Technology, Broad Institute

Bio: David Altshuler is an academic researcher from University of Michigan. The author has contributed to research on the topics of genome-wide association study and population. The author has an h-index of 162 and has co-authored 345 publications that have received 201,782 citations. Previous affiliations of David Altshuler include Vertex Pharmaceuticals and Massachusetts Institute of Technology.

Papers published on a yearly basis

Years with publications: 1992, 1993, and 1998 to 2023.

Papers

Journal Article

Large-Scale Gene-Centric Analysis Identifies Novel Variants for Coronary Artery Disease

Adam S. Butterworth, Peter S. Braund, Martin Farrall, Robert J. Hardwick, and 352 more authors (107 institutions)

01 Sep 2011

TL;DR: This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and clarified the literature with regard to many previously suggested genes.

Abstract: Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ~2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3:p<10; LPA:p<10; 1p13.3:p<10) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1:p<5×10). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06-1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ~4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). 
This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and clarified the literature with regard to many previously suggested genes. © 2011 Butterworth et al.

228 citations

Journal Article

Genetic analysis of human traits in vitro: drug response and gene expression in lymphoblastoid cell lines.

Edwin Choy, Roman Yelensky, Sasha Bonakdar¹, Robert M. Plenge, Richa Saxena¹,², Philip L. De Jager, Stanley Y. Shaw¹,², Cara S Wolfish¹,³, Jacqueline M. Slavik³, Chris Cotsapas¹,², Manuel A. Rivas¹,⁴, Emmanouil T. Dermitzakis⁵, Ellen Cahir-McFarland²,³, Elliott Kieff²,³, David A. Hafler¹,³, Mark J. Daly¹,², David Altshuler

Broad Institute¹, Harvard University², Brigham and Women's Hospital³, Massachusetts Institute of Technology⁴, Wellcome Trust Sanger Institute⁵

28 Nov 2008-PLOS Genetics

TL;DR: Lymphoblastoid cell lines are a promising model for pharmacogenetic experiments, but biological noise and in vitro artifacts may reduce power and have the potential to create spurious association due to confounding.

Abstract: Lymphoblastoid cell lines (LCLs), originally collected as renewable sources of DNA, are now being used as a model system to study genotype-phenotype relationships in human cells, including searches for QTLs influencing levels of individual mRNAs and responses to drugs and radiation. In the course of attempting to map genes for drug response using 269 LCLs from the International HapMap Project, we evaluated the extent to which biological noise and non-genetic confounders contribute to trait variability in LCLs. While drug responses could be technically well measured on a given day, we observed significant day-to-day variability and substantial correlation to non-genetic confounders, such as baseline growth rates and metabolic state in culture. After correcting for these confounders, we were unable to detect any QTLs with genome-wide significance for drug response. A much higher proportion of variance in mRNA levels may be attributed to non-genetic factors (intra-individual variance--i.e., biological noise, levels of the EBV virus used to transform the cells, ATP levels) than to detectable eQTLs. Finally, in an attempt to improve power, we focused analysis on those genes that had both detectable eQTLs and correlation to drug response; we were unable to detect evidence that eQTL SNPs are convincingly associated with drug response in the model. While LCLs are a promising model for pharmacogenetic experiments, biological noise and in vitro artifacts may reduce power and have the potential to create spurious association due to confounding.

225 citations

Journal Article

Transferability of tag SNPs in genetic association studies in multiple populations

Paul I.W. de Bakker¹, Noël P. Burtt¹, Robert R. Graham, Candace Guiducci¹, Roman Yelensky, Jared A. Drake¹,², Todd Bersaglieri¹,², Kathryn L. Penney³, Johannah L. Butler¹,², Stanton Young³, Robert C. Onofrio¹, Helen N. Lyon¹,², Daniel O. Stram⁴, Christopher A. Haiman⁴, Matthew L. Freedman¹,³, Xiaofeng Zhu⁵, Richard S. Cooper⁵, Leif Groop⁶,⁷, Laurence N. Kolonel⁸, Brian E. Henderson⁴, Mark J. Daly¹,³, Joel N. Hirschhorn¹,²,³, David Altshuler

Broad Institute¹, Boston Children's Hospital², Harvard University³, University of Southern California⁴, Loyola University Chicago⁵, Lund University⁶, University of Helsinki⁷, University of Hawaii⁸

22 Oct 2006-Nature Genetics

TL;DR: It is demonstrated that the HapMap DNA samples can be used to select tags for genome-wide association studies in many samples around the world.

Abstract: A general question for linkage disequilibrium-based association studies is how power to detect an association is compromised when tag SNPs are chosen from data in one population sample and then deployed in another sample. Specifically, it is important to know how well tags picked from the HapMap DNA samples capture the variation in other samples. To address this, we collected dense data uniformly across the four HapMap population samples and eleven other population samples. We picked tag SNPs using genotype data we collected in the HapMap samples and then evaluated the effective coverage of these tags in comparison to the entire set of common variants observed in the other samples. We simulated case-control association studies in the non-HapMap samples under a disease model of modest risk, and we observed little loss in power. These results demonstrate that the HapMap DNA samples can be used to select tags for genome-wide association studies in many samples around the world.

219 citations

Journal Article

Polymorphism at the TNF superfamily gene TNFSF4 confers susceptibility to systemic lupus erythematosus

Deborah S. Cunninghame Graham¹, Robert R. Graham², Harinder Manku¹, Andrew Wong¹, John C. Whittaker³, Patrick M. Gaffney⁴,⁵, Kathy L. Moser⁴,⁵, John D. Rioux⁶,⁷, David Altshuler², Timothy W. Behrens⁴,⁵, Timothy J. Vyse¹

Hammersmith Hospital¹, Massachusetts Institute of Technology², University of London³, Oklahoma Medical Research Foundation⁴, University of Minnesota⁵, Montreal Heart Institute⁶, Broad Institute⁷

01 Jan 2008-Nature Genetics

TL;DR: It is hypothesized that increased expression of TNFSF4 predisposes to SLE either by quantitatively augmenting T cell–APC interaction or by influencing the functional consequences of T cell activation via TNFRSF4.

Abstract: Systemic lupus erythematosus (SLE) is a multisystem complex autoimmune disease of uncertain etiology (OMIM 152700). Over recent years a genetic component to SLE susceptibility has been established. Recent successes with association studies in SLE have identified genes including IRF5 (refs. 4,5) and FCGR3B. Two tumor necrosis factor (TNF) superfamily members located within intervals showing genetic linkage with SLE are TNFSF4 (also known as OX40L; 1q25), which is expressed on activated antigen-presenting cells (APCs) and vascular endothelial cells, and also its unique receptor, TNFRSF4 (also known as OX40; 1p36), which is primarily expressed on activated CD4+ T cells. TNFSF4 produces a potent co-stimulatory signal for activated CD4+ T cells after engagement of TNFRSF4 (ref. 11). Using both a family-based and a case-control study design, we show that the upstream region of TNFSF4 contains a single risk haplotype for SLE, which is correlated with increased expression of both cell-surface TNFSF4 and the TNFSF4 transcript. We hypothesize that increased expression of TNFSF4 predisposes to SLE either by quantitatively augmenting T cell-APC interaction or by influencing the functional consequences of T cell activation via TNFRSF4.

217 citations

Journal Article

Association of a low-frequency variant in HNF1A with type 2 diabetes in a Latino population.

Karol Estrada¹,², Ingvild Aukrust¹,², Lise Bjørkhaug³, Noël P. Burtt², Josep M. Mercader², Humberto García-Ortiz, Alicia Huerta-Chagoya⁴, Hortensia Moreno-Macías⁵, Geoffrey A. Walford⁵, Jason Flannick¹, Amy L. Williams¹,², María José Gómez-Vázquez, Juan Carlos Fernández-López, Angélica Martínez-Hernández, Silvia Jiménez-Morales, Federico Centeno-Cruz, Elvia Mendoza-Caamal, Cristina Revilla-Monsalve⁶, Sergio Islas-Andrade⁶, Emilio J. Córdova, Xavier Soberón, María Elena González-Villalpando, E. Henderson⁷, Lynne R. Wilkens⁸, Loic Le Marchand⁸, Olimpia Arellano-Campos, María Luisa Ordóñez-Sánchez, Maribel Rodríguez-Torres, Rosario Rodríguez-Guillén, Laura Riba⁴, Laeya A. Najmi⁴, Suzanne B.R. Jacobs², Timothy Fennell², Stacey Gabriel², Pierre Fontanillas², Craig L. Hanis⁹, Donna M. Lehman¹⁰, Christopher P. Jenkinson¹⁰, Hanna E. Abboud¹⁰, Graeme I. Bell¹⁰, Maria L. Cortes², Michael Boehnke¹¹, Clicerio Gonzalez-Villalpando, Lorena Orozco, Christopher A. Haiman⁷, Teresa Tusie-Luna⁷, Carlos A. Aguilar-Salinas, David Altshuler, Pål R. Njølstad, Jose C. Florez³,¹², Daniel G. MacArthur¹,²

Harvard University¹, Broad Institute², University of Bergen³, National Autonomous University of Mexico⁴, Universidad Autónoma Metropolitana⁵, Mexican Social Security Institute⁶, University of Southern California⁷, University of Hawaii⁸, University of Texas Health Science Center at Houston⁹, University of Texas Health Science Center at San Antonio¹⁰, University of Michigan¹¹, Haukeland University Hospital¹²

11 Jun 2014-JAMA

TL;DR: Whole-exome sequencing identified a single low-frequency variant in the MODY3-causing gene HNF1A that is associated with type 2 diabetes in Latino populations and may affect protein function; the finding may have implications for screening and therapeutic modification in this population.

Abstract: Importance: Latino populations have one of the highest prevalences of type 2 diabetes worldwide. Objectives: To investigate the association between rare protein-coding genetic variants and prevalence of type 2 diabetes in a large Latino population and to explore potential molecular and physiological mechanisms for the observed relationships. Design, Setting, and Participants: Whole-exome sequencing was performed on DNA samples from 3756 Mexican and US Latino individuals (1794 with type 2 diabetes and 1962 without diabetes) recruited from 1993 to 2013. One variant was further tested for allele frequency and association with type 2 diabetes in large multiethnic data sets of 14 276 participants and characterized in experimental assays. Main Outcome and Measures: Prevalence of type 2 diabetes. Secondary outcomes included age of onset, body mass index, and effect on protein function. Results: A single rare missense variant (c.1522G>A [p.E508K]) was associated with type 2 diabetes prevalence (odds ratio [OR], 5.48; 95% CI, 2.83-10.61; P = 4.4 × 10⁻⁷) in hepatocyte nuclear factor 1-α (HNF1A), the gene responsible for maturity onset diabetes of the young type 3 (MODY3). This variant was observed in 0.36% of participants without type 2 diabetes and 2.1% of participants with it. In multiethnic replication data sets, the p.E508K variant was seen only in Latino patients (n = 1443 with type 2 diabetes and 1673 without it) and was associated with type 2 diabetes (OR, 4.16; 95% CI, 1.75-9.92; P = .0013). In experimental assays, HNF-1A protein encoding the p.E508K mutant demonstrated reduced transactivation activity of its target promoter compared with a wild-type protein. In our data, carriers and noncarriers of the p.E508K mutation with type 2 diabetes had no significant differences in compared clinical characteristics, including age at onset. The mean (SD) age for carriers was 45.3 years (11.2) vs 47.5 years (11.5) for noncarriers (P = .49) and the mean (SD) BMI for carriers was 28.2 (5.5) vs 29.3 (5.3) for noncarriers (P = .19). Conclusions and Relevance: Using whole-exome sequencing, we identified a single low-frequency variant in the MODY3-causing gene HNF1A that is associated with type 2 diabetes in Latino populations and may affect protein function. This finding may have implications for screening and therapeutic modification in this population, but additional studies are required.

217 citations

Cited by

Journal Article

Fast gapped-read alignment with Bowtie 2

Ben Langmead¹, Steven L. Salzberg¹,²,³

University of Maryland, College Park¹, Johns Hopkins University School of Medicine², Johns Hopkins University³

01 Apr 2012-Nature Methods

TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article

Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles

Aravind Subramanian¹, Pablo Tamayo¹, Vamsi K. Mootha², Sayan Mukherjee³, Benjamin L. Ebert², Michael A. Gillette², Amanda G. Paulovich⁴, Scott L. Pomeroy², Todd R. Golub², Eric S. Lander¹, Jill P. Mesirov¹

Massachusetts Institute of Technology¹, Harvard University², Duke University³, Fred Hutchinson Cancer Research Center⁴

25 Oct 2005-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The Gene Set Enrichment Analysis (GSEA) method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.

Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations

Journal Article

PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses

Shaun Purcell¹,², Benjamin M. Neale¹,³, Kathe Todd-Brown², Lori Thomas², Manuel A. R. Ferreira², David Bender¹,², Julian Maller¹,², Pamela Sklar¹,², Paul I.W. de Bakker¹,², Mark J. Daly¹,², Pak C. Sham⁴

Massachusetts Institute of Technology¹, Harvard University², University of London³, University of Hong Kong⁴

01 Sep 2007-American Journal of Human Genetics

TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation, with a focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.

Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.

26,280 citations

Journal Article

Initial sequencing and analysis of the human genome.

Eric S. Lander, Lauren Linton, Bruce W. Birren, Chad Nusbaum, and 245 more authors (29 institutions)

15 Feb 2001-Nature

TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.

Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article

limma powers differential expression analyses for RNA-sequencing and microarray studies

Matthew E. Ritchie¹, Belinda Phipson², Di Wu³, Yifang Hu¹, Charity W. Law⁴, Wei Shi¹, Gordon K. Smyth¹,⁵

Walter and Eliza Hall Institute of Medical Research¹, Royal Children's Hospital², Harvard University³, University of Zurich⁴, University of Melbourne⁵

20 Apr 2015-Nucleic Acids Research

TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations
