Author

Gonçalo R. Abecasis

Bio: Gonçalo R. Abecasis is an academic researcher at the University of Michigan. The author has contributed to research on the topics of genome-wide association studies and population genetics. The author has an h-index of 179 and has co-authored 595 publications receiving 230,323 citations. Previous affiliations of Gonçalo R. Abecasis include Johns Hopkins University School of Medicine and the Wellcome Trust Centre for Human Genetics.


Papers
04 Dec 2014
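TL;DR: The results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants, and they implicate genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid.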
Abstract: Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate–related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

97 citations

Journal Article
Viktoria Gusarova, Colm O'Dushlaine, Tanya M. Teslovich, Peter N. Benotti, Tooraj Mirshahi, Omri Gottesman, Cristopher V. Van Hout, Michael F. Murray, Anubha Mahajan, Jonas B. Nielsen, Lars G. Fritsche, Anders Berg Wulff, Daniel F. Gudbjartsson, Marketa Sjögren, Connor A. Emdin, Robert A. Scott, Wen-Jane Lee, Aeron Small, Lydia Coulter Kwee, Om Prakash Dwivedi, Rashmi B. Prasad, Shannon Bruse, Alexander Lopez, John Penn, Anthony Marcketta, Joseph B. Leader, Christopher D. Still, H. Lester Kirchner, Uyenlinh L. Mirshahi, Amr H. Wardeh, Cassandra M. Hartle, Lukas Habegger, Samantha N. Fetterolf, Teresa Tusié-Luna, Andrew P. Morris, Hilma Holm, Valgerdur Steinthorsdottir, Patrick Sulem, Unnur Thorsteinsdottir, Jerome I. Rotter, Lee-Ming Chuang, Scott M. Damrauer, David Birtwell, Chad M. Brummett, Amit Khera, Pradeep Natarajan, Marju Orho-Melander, Jason Flannick, Luca A. Lotta, Cristen J. Willer, Oddgeir L. Holmen, Marylyn D. Ritchie, David H. Ledbetter, Andrew J. Murphy, Ingrid B. Borecki, Jeffrey G. Reid, John D. Overton, Ola Hansson, Leif Groop, Svati H. Shah, William E. Kraus, Daniel J. Rader, Yii-Der Ida Chen, Kristian Hveem, Nicholas J. Wareham, Sekar Kathiresan, Olle Melander, Kari Stefansson, Børge G. Nordestgaard, Anne Tybjærg-Hansen, Gonçalo R. Abecasis, David Altshuler, Jose C. Florez, Michael Boehnke, Mark I. McCarthy, George D. Yancopoulos, David J. Carey, Alan R. Shuldiner, Aris Baras, Frederick E. Dewey, Jesper Gromada
TL;DR: It is found that predicted loss-of-function variants in ANGPTL4 are associated with glucose homeostasis and reduced risk of type 2 diabetes and that Angptl4−/− mice on a high-fat diet show improved insulin sensitivity.
Abstract: Angiopoietin-like 4 (ANGPTL4) is an endogenous inhibitor of lipoprotein lipase that modulates lipid levels, coronary atherosclerosis risk, and nutrient partitioning. We hypothesize that loss of ANGPTL4 function might improve glucose homeostasis and decrease risk of type 2 diabetes (T2D). We investigate protein-altering variants in ANGPTL4 among 58,124 participants in the DiscovEHR human genetics study, with follow-up studies in 82,766 T2D cases and 498,761 controls. Carriers of p.E40K, a variant that abolishes ANGPTL4 ability to inhibit lipoprotein lipase, have lower odds of T2D (odds ratio 0.89, 95% confidence interval 0.85–0.92, p = 6.3 × 10⁻¹⁰), lower fasting glucose, and greater insulin sensitivity. Predicted loss-of-function variants are associated with lower odds of T2D among 32,015 cases and 84,006 controls (odds ratio 0.71, 95% confidence interval 0.49–0.99, p = 0.041). Functional studies in Angptl4-deficient mice confirm improved insulin sensitivity and glucose homeostasis. In conclusion, genetic inactivation of ANGPTL4 is associated with improved glucose homeostasis and reduced risk of T2D.

97 citations

Journal Article
TL;DR: In this article, the authors compared the effectiveness of study-specific reference panels to the commonly used 1000 Genomes Project (1000G) reference panels in the isolated Sardinian population and in cohorts of European ancestry including samples from Minnesota (USA).
Abstract: The utility of genotype imputation in genome-wide association studies is increasing as progressively larger reference panels are improved and expanded through whole-genome sequencing. Developing general guidelines for optimally cost-effective imputation, however, requires evaluation of performance issues that include the relative utility of study-specific compared with general/multipopulation reference panels; genotyping with various array scaffolds; effects of different ethnic backgrounds; and assessment of ranges of allele frequencies. Here we compared the effectiveness of study-specific reference panels to the commonly used 1000 Genomes Project (1000G) reference panels in the isolated Sardinian population and in cohorts of European ancestry including samples from Minnesota (USA). We also examined different combinations of genome-wide and custom arrays for baseline genotypes. In Sardinians, the study-specific reference panel provided better coverage and genotype imputation accuracy than the 1000G panels and other large European panels. In fact, even gene-centered custom arrays (interrogating ~200 000 variants) provided highly informative content across the entire genome. Gain in accuracy was also observed for Minnesotans using the study-specific reference panel, although the increase was smaller than in Sardinians, especially for rare variants. Notably, a combined panel including both study-specific and 1000G reference panels improved imputation accuracy only in the Minnesota sample, and only at rare sites. Finally, we found that when imputation is performed with a study-specific reference panel, cutoffs different from the standard thresholds of MACH-Rsq and IMPUTE-INFO metrics should be used to efficiently filter badly imputed rare variants. This study thus provides general guidelines for researchers planning large-scale genetic studies.

96 citations

Journal Article
TL;DR: A genome-wide association study identified a SNP in the COL4A1 gene that was significantly associated with PWV in 2 populations, suggesting that previously unrecognized cell-matrix interactions may exert an important role in regulating arterial stiffness.
Abstract: Background: Pulse wave velocity (PWV), a noninvasive index of central arterial stiffness, is a potent predictor of cardiovascular mortality and morbidity. Heritability and linkage studies have pointed toward a genetic component affecting PWV. We conducted a genome-wide association study to identify single-nucleotide polymorphisms (SNPs) associated with PWV. Methods and Results: The study cohort included participants from the SardiNIA study for whom PWV measures were available. Genotyping was performed in 4221 individuals, using either the Affymetrix 500K or the Affymetrix 10K mapping array sets (with imputation of the missing genotypes). Associations with PWV were evaluated using an additive genetic model that included age, age², and sex as covariates. The findings were tested for replication in an independent internal Sardinian cohort of 1828 individuals, using a custom chip designed to include the top 43 nonredundant SNPs associated with PWV. Of the loci that were tested for association with PWV, the nonsynonymous SNP rs3742207 in the COL4A1 gene on chromosome 13 and SNP rs1495448 in the MAGI1 gene on chromosome 3 were successfully replicated (P = 7.08×10⁻⁷ and P = 1.06×10⁻⁵, respectively, for the combined analyses). The association between rs3742207 and PWV was also successfully replicated (P = 0.02) in an independent population, the Old-Order Amish, leading to an overall P = 5.16×10⁻⁸. Conclusions: A genome-wide association study identified a SNP in the COL4A1 gene that was significantly associated with PWV in 2 populations. Collagen type 4 is the major structural component of basement membranes, suggesting that previously unrecognized cell-matrix interactions may exert an important role in regulating arterial stiffness. Received September 22, 2008; accepted January 26, 2009.

96 citations

Journal Article
TL;DR: A genome-wide association scan of 466 BAV cases and 4,660 age, sex and ethnicity-matched controls with replication in up to 1,326 cases and 8,103 controls identifies association with a noncoding variant 151 kb from the gene encoding the cardiac-specific transcription factor, GATA4.
Abstract: Bicuspid aortic valve (BAV) is a heritable congenital heart defect and an important risk factor for valvulopathy and aortopathy. Here we report a genome-wide association scan of 466 BAV cases and 4,660 age, sex and ethnicity-matched controls with replication in up to 1,326 cases and 8,103 controls. We identify association with a noncoding variant 151 kb from the gene encoding the cardiac-specific transcription factor, GATA4, and near-significance for p.Ser377Gly in GATA4. GATA4 was interrupted by CRISPR-Cas9 in induced pluripotent stem cells from healthy donors. The disruption of GATA4 significantly impaired the transition from endothelial cells into mesenchymal cells, a critical step in heart valve development. Bicuspid aortic valve (BAV) is the most common human congenital cardiovascular malformation. Here, the authors perform a genome-wide association study for BAV and identify risk variants in the gene region of cardiac-specific transcription factor GATA4 and implicate GATA4 in heart valve development.

95 citations


Cited by
Journal Article
TL;DR: The Burrows-Wheeler Alignment tool (BWA) is a new read alignment package, based on backward search with the Burrows–Wheeler Transform (BWT), that efficiently aligns short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous number of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented the Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with the Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net
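As an illustration of the backward search idea named in this abstract, the following is a minimal Python sketch of exact-match backward search over a Burrows–Wheeler Transform. It is not BWA's implementation: it handles exact matches only (no mismatches or gaps), builds the suffix array naively, and all function names (`build_fm_index`, `backward_search`) are hypothetical names chosen for this example.

```python
def build_fm_index(text):
    """Build a toy FM-index: suffix array, C table and Occ table.

    Assumption: the text does not already contain the sentinel '$'.
    """
    text += "$"
    # Naive suffix array construction: fine for a toy string,
    # far too slow and memory-hungry for a real genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT = character preceding each sorted suffix (wraps to '$' for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[c] = number of characters in text strictly smaller than c.
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, c_table, occ


def backward_search(pattern, sa, c_table, occ):
    """Return sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open interval of suffix array rows
    # Process the pattern from its last character to its first.
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


if __name__ == "__main__":
    sa, c_table, occ = build_fm_index("GATTACATTA")
    print(backward_search("TTA", sa, c_table, occ))
```

Each loop iteration narrows the suffix array interval using only the two lookup tables, so the search cost depends on the pattern length rather than on the reference length, which is what makes this family of indexes attractive for short-read alignment.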

43,862 citations

Journal Article
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation, with a focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.
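To make the identity-by-state notion mentioned in this abstract concrete, here is a minimal Python sketch of a pairwise IBS proportion. It is not PLINK's code or command-line interface; it assumes biallelic markers with genotypes coded as 0, 1 or 2 copies of one allele and `None` for a missing call, and the function name `ibs_proportion` is hypothetical.

```python
def ibs_proportion(genotypes_a, genotypes_b):
    """Fraction of alleles shared identical by state between two individuals.

    Assumed encoding: each genotype is 0, 1 or 2 (copies of one allele),
    or None when the call is missing. Markers with a missing call in
    either individual are skipped.
    """
    shared_alleles = 0
    markers_used = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        # 2 shared alleles for identical genotypes, 1 when they differ
        # by one copy, 0 for opposite homozygotes.
        shared_alleles += 2 - abs(a - b)
        markers_used += 1
    if markers_used == 0:
        raise ValueError("no markers with calls in both individuals")
    return shared_alleles / (2 * markers_used)


if __name__ == "__main__":
    person_1 = [0, 1, 2, 2, None]
    person_2 = [0, 1, 0, 1, 2]
    print(ibs_proportion(person_1, person_2))
```

Averaged over many markers, values of this kind give a simple genome-wide similarity measure between pairs of individuals; estimating identity by descent requires additional modelling (for example, of allele frequencies) that this sketch does not attempt.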

26,280 citations

Journal Article
Eric S. Lander, Lauren Linton, Bruce W. Birren, Chad Nusbaum and 245 more authors (29 institutions)
15 Feb 2001, Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations