Author

David Altshuler

Bio: David Altshuler is an academic researcher from University of Michigan. The author has contributed to research in topics: Genome-wide association study & Population. The author has an h-index of 162 and has co-authored 345 publications receiving 201,782 citations. Previous affiliations of David Altshuler include Vertex Pharmaceuticals & Massachusetts Institute of Technology.


Papers
Journal ArticleDOI
Sophie R. Wang1,2,3, Vineeta Agarwala2,3,4, Jason Flannick2,3, Charleston W. K. Chiang5, David Altshuler, Alisa Manning1, Christopher Hartl1, Pierre Fontanillas, Todd Green, Eric Banks, Mark A. DePristo, Ryan Poplin, Khalid Shakir, Timothy Fennell, Jacquelyn Murphy, Noël P. Burtt, Stacey Gabriel, Christian Fuchsberger, Hyun Min Kang, Xueling Sim, Clement Ma, Adam E. Locke, Thomas W. Blackwell, Anne U. Jackson, Tanya M. Teslovich, Heather M. Stringham, Peter S. Chines, Phoenix Kwan, Jeroen R. Huyghe, Adrian Tan, Goo Jun, Michael L. Stitzel, Richard N. Bergman, Lori L. Bonnycastle, Jaakko Tuomilehto, Francis S. Collins, Laura J. Scott, Karen L. Mohlke, Gonçalo R. Abecasis, Michael Boehnke, Tim M. Strom, Christian Gieger, Martina Müller-Nurasyid, Harald Grallert, Jennifer Kriebel, Janina S. Ried, Martin Hrabé de Angelis, Cornelia Huth, Christa Meisinger, Annette Peters, Wolfgang Rathmann, Konstantin Strauch, Thomas Meitinger, Jasmina Kravic, Claes Ladenvall, Tiinamaija Toumi, Bo Isomaa, Leif Groop, Kyle J. Gaulton, Loukas Moutsianas, Manny Rivas, Richard D. Pearson, Anubha Mahajan, Inga Prokopenko, Ashok Kumar, John R. B. Perry, Jeff Chen, Bryan Howie, Martijn van de Bunt, Kerrin S. Small, Cecilia M. Lindgren, Gerton Lunter, Neil Robertson, Will Rayner, Andrew D. Morris, David Buck, Andrew T. Hattersley, Tim D. Spector, Gil McVean, Timothy M. Frayling, Peter Donnelly, Mark I. McCarthy, Joel N. Hirschhorn1,2,3
TL;DR: Forward simulation guided by empirical deep resequencing data shows that rare-variant association tests have higher power in the Finnish population, especially when variants' phenotypic effects are tightly coupled with fitness effects and therefore reflect a greater contribution of rarer variants.
Abstract: Finnish samples have been extensively utilized in studying single-gene disorders, where the founder effect has clearly aided in discovery, and more recently in genome-wide association studies of complex traits, where the founder effect has had less obvious impacts. As the field starts to explore rare variants' contribution to polygenic traits, it is of great importance to characterize and confirm the Finnish founder effect in sequencing data and to assess its implications for rare-variant association studies. Here, we employ forward simulation, guided by empirical deep resequencing data, to model the genetic architecture of quantitative polygenic traits in both the general European and the Finnish populations simultaneously. We demonstrate that power of rare-variant association tests is higher in the Finnish population, especially when variants' phenotypic effects are tightly coupled with fitness effects and therefore reflect a greater contribution of rarer variants. SKAT-O, variable-threshold tests, and single-variant tests are more powerful than other rare-variant methods in the Finnish population across a range of genetic models. We also compare the relative power and efficiency of exome array genotyping to those of high-coverage exome sequencing. At a fixed cost, less expensive genotyping strategies have far greater power than sequencing; in a fixed number of samples, however, genotyping arrays miss a substantial portion of genetic signals detected in sequencing, even in the Finnish founder population. As genetic studies probe sequence variation at greater depth in more diverse populations, our simulation approach provides a framework for evaluating various study designs for gene discovery.

30 citations

Journal ArticleDOI
TL;DR: On simulated data, RAFT (recessive-allele-frequency-based test) outperforms existing approaches when the variant influences disease risk in a recessive manner; the results suggest that RAFT can effectively reveal rare recessive contributions to complex diseases overlooked by conventional association tests.
Abstract: Rare-variant association studies in common, complex diseases are customarily conducted under an additive risk model in both single-variant and burden testing. Here, we describe a method to improve detection of rare recessive variants in complex diseases termed RAFT (recessive-allele-frequency-based test). We found that RAFT outperforms existing approaches when the variant influences disease risk in a recessive manner on simulated data. We then applied our method to 1,791 Finnish individuals with type 2 diabetes (T2D) and 2,657 matched control subjects. In BBS10, we discovered a rare variant (c.1189A>G [p.Ile397Val]; rs202042386) that confers risk of T2D in a recessive state (p = 1.38 × 10⁻⁶) and would be missed by conventional methods. Testing of this variant in an established in vivo zebrafish model confirmed the variant to be pathogenic. Taken together, these data suggest that RAFT can effectively reveal rare recessive contributions to complex diseases overlooked by conventional association tests.

30 citations

Journal Article
Nicole Soranzo1, Serena Sanna2, Eleanor Wheeler3, Christian Gieger  +176 moreInstitutions (49)
01 Mar 2011-Diabetes
TL;DR: This article calculated net reclassification of diabetes mellitus patients using a database of patients diagnosed with type 2 diabetes over a 12-month period and found that the number of patients with diabetes decreases with age and disease progression.

29 citations

Journal ArticleDOI
01 Sep 2008-Blood
TL;DR: A novel missense single nucleotide polymorphism in the cytoplasmic domain of CD40 at position 227 (P227A) was identified, which resides on a conserved ancestral haplotype highly enriched in persons of Mexican and South American descent, and is identified as a novel genetic variant of hCD40 with a gain-of-function immune phenotype.

28 citations

Journal ArticleDOI
TL;DR: Haplotype analysis in the isolated population of Kosrae maps a single 526-kb haplotype spanning ABCG5/ABCG8 that results in a striking 100% increase of PPS in carriers, exemplifying the power of the approach for mapping mutations in isolated populations and specifically for dissecting effects of multiple variants of the same locus.
Abstract: Pinpointing culprit causal variants along signal peaks of genome-wide association studies (GWAS) is challenging. To overcome confounding effects of multiple independent variants at such a locus and narrow the interval for causal allele capture, we developed an approach that maps local shared haplotypes harboring a putative causal variant. We demonstrate our method in an extreme isolate founder population, the Pacific island of Kosrae. We analyzed plasma plant sterol (PPS) levels, a surrogate measure of cholesterol absorption from the intestine, where previous studies have implicated 2p21 mutations in the ATP binding cassette subfamily G members 5 or 8 (ABCG5 or ABCG8) genes. We have previously reported that 11.1% of the islanders are carriers of a frameshift ABCG8 mutation increasing PPS levels in carriers by 50%. GWAS adjusted for this mutation revealed genome-wide significant signals along 11 Mb around it. To fine-map this signal, we detected pairwise identity-by-descent haplotypes using our tool GERMLINE and implemented a clustering algorithm to identify haplotypes shared across multiple samples with their unique shared boundaries. A single 526-kb haplotype mapped strongly to PPS levels, dramatically refining the mapped interval. This haplotype spans the ABCG5/ABCG8 genes, is carried by 1.8% of the islanders, and results in a striking 100% increase of PPS in carriers. Resequencing of ABCG5 in these carriers found a D450H missense mutation along the associated haplotype. These findings exemplify the power of haplotype analysis for mapping mutations in isolated populations and specifically for dissecting effects of multiple variants of the same locus.

28 citations


Cited by
Journal ArticleDOI
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations
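
The abstract above mentions that the full-text minute (FM) index makes alignment fast and memory-efficient. As a rough illustration of why, here is a minimal sketch of exact matching by backward search over a Burrows-Wheeler transform. It is not Bowtie 2's implementation: the function names (`bwt_from_text`, `fm_index`, `find_exact`) are hypothetical, the suffix array is built naively, and gapped alignment is not covered.

```python
def bwt_from_text(text):
    """Build the Burrows-Wheeler transform with a naive suffix sort (toy inputs only)."""
    text += "$"  # assumed terminator: absent from the input and sorts before A/C/G/T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)  # i == 0 wraps around to "$"
    return bwt, suffix_array


def fm_index(bwt):
    """Return (C, occ): C[c] counts characters smaller than c; occ[c][i] counts c in bwt[:i]."""
    alphabet = sorted(set(bwt))
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt.count(c)
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ


def find_exact(pattern, bwt, suffix_array, C, occ):
    """Backward search: narrow a suffix-array interval one pattern character at a time."""
    lo, hi = 0, len(bwt)
    for ch in reversed(pattern):
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[i] for i in range(lo, hi))


reference = "ACGTACGTTACG"
bwt, sa = bwt_from_text(reference)
C, occ = fm_index(bwt)
hits = find_exact("ACG", bwt, sa, C, occ)  # start offsets of "ACG" in the reference
```

Each pattern character costs a constant number of table lookups, so the search time depends on the read length rather than the reference length. The full occurrence table used here is the simplest form; practical indexes sample it to save memory.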

Journal ArticleDOI
TL;DR: Gene Set Enrichment Analysis (GSEA) is an analytical method for interpreting gene expression data that derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation.
Abstract: Although genomewide RNA expression analysis has become a routine tool in biomedical research, extracting biological insight from such information remains a major challenge. Here, we describe a powerful analytical method called Gene Set Enrichment Analysis (GSEA) for interpreting gene expression data. The method derives its power by focusing on gene sets, that is, groups of genes that share common biological function, chromosomal location, or regulation. We demonstrate how GSEA yields insights into several cancer-related data sets, including leukemia and lung cancer. Notably, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer, GSEA reveals many biological pathways in common. The GSEA method is embodied in a freely available software package, together with an initial database of 1,325 biologically defined gene sets.

34,830 citations
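
The abstract above says GSEA works on gene sets rather than single genes, but does not spell out the statistic. One common way to picture it is a running sum over a ranked gene list. The sketch below is a simplified, unweighted (Kolmogorov-Smirnov-like) version and is an assumption for illustration only: the function name `enrichment_score` is hypothetical, and the weighting and permutation-based significance testing of the published software are omitted.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for a gene set in a ranked list.

    Walk down the ranking, stepping up at genes in the set and down otherwise;
    the score is the running-sum value with the largest absolute deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranking without covering all of it")

    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]  # hypothetical genes, most to least correlated
top_score = enrichment_score(ranking, {"g1", "g2"})     # set concentrated at the top
bottom_score = enrichment_score(ranking, {"g5", "g6"})  # set concentrated at the bottom
```

A set clustered at the top of the ranking yields a large positive score and a set clustered at the bottom a large negative one, while a set scattered evenly through the list stays near zero.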

Journal ArticleDOI
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. It focuses on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.

26,280 citations
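
The abstract above refers to identity-by-state (IBS) information between individuals. As a minimal sketch of the idea, assuming genotypes are coded as allele counts 0, 1, or 2 with `None` for missing calls, the average proportion of alleles shared IBS between two samples can be computed as below. The function name `ibs_similarity` and the genotype coding are assumptions for illustration, not PLINK's code or file format.

```python
def ibs_similarity(genotypes_a, genotypes_b):
    """Average proportion of alleles shared identical by state across markers.

    Genotypes are allele counts (0, 1, 2); None marks a missing call.
    At each marker the pair shares 2 - |a - b| alleles out of 2.
    """
    shared, n_markers = 0, 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip markers missing in either sample
        shared += 2 - abs(a - b)
        n_markers += 1
    if n_markers == 0:
        raise ValueError("no markers genotyped in both samples")
    return shared / (2 * n_markers)


sample_1 = [0, 1, 2, None]  # hypothetical genotype vectors
sample_2 = [0, 2, 2, 1]
similarity = ibs_similarity(sample_1, sample_2)  # (2 + 1 + 2) / (2 * 3)
```

IBS is a purely observational measure; inferring identity by descent from it additionally requires allele frequencies and a model of relatedness, which this sketch does not attempt.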

Journal ArticleDOI
Eric S. Lander1, Lauren Linton1, Bruce W. Birren1, Chad Nusbaum1  +245 moreInstitutions (29)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal ArticleDOI
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations