Author

Gonçalo R. Abecasis

Bio: Gonçalo R. Abecasis is an academic researcher at the University of Michigan. The author has contributed to research on the topics Genome-wide association study and Population. The author has an h-index of 179 and has co-authored 595 publications receiving 230,323 citations. Previous affiliations of Gonçalo R. Abecasis include Johns Hopkins University School of Medicine and the Wellcome Trust Centre for Human Genetics.


Papers
Posted ContentDOI
Anubha Mahajan1, Daniel Taliun2, Matthias Thurner1, Neil R. Robertson1, Jason M. Torres1, N. William Rayner1, Valgerdur Steinthorsdottir3, Robert A. Scott4, Niels Grarup5, James P. Cook6, Ellen M. Schmidt2, Matthias Wuttke7, Chloé Sarnowski8, Reedik Mägi9, Jana Nano10, Christian Gieger, Stella Trompet11, Cécile Lecoeur12, Michael Preuss13, Bram P. Prins14, Xiuqing Guo, Lawrence F. Bielak2, Amanda J. Bennett, Jette Bork-Jensen5, Chad M. Brummett2, Mickaël Canouil12, Kai-Uwe Eckardt15, Krista Fischer9, Sharon L.R. Kardia2, Florian Kronenberg16, Kristi Läll9, Ching-Ti Liu8, Adam E. Locke17, Jian'an Luan4, Ioanna Ntalla, Vibe Nylander, Sebastian Schönherr16, Claudia Schurmann13, Loic Yengo12, Erwin P. Bottinger13, Ivan Brandslund18, Cramer Christensen, George Dedoussis19, Jose C. Florez20, Ian Ford21, Oscar H. Franco10, Timothy M. Frayling22, Vilmantas Giedraitis23, Sophie Hackinger14, Andrew T. Hattersley22, Christian Herder, M. Arfan Ikram10, Martin Ingelsson23, Marit E. Jørgensen24, Torben Jørgensen, Jennifer Kriebel, Johanna Kuusisto25, Symen Ligthart10, Cecilia M. Lindgren1, Allan Linneberg, Valeriya Lyssenko26, Vasiliki Mamakou19, Thomas Meitinger27, Karen L. Mohlke28, Andrew D. Morris29, Girish N. Nadkarni13, James S. Pankow30, Annette Peters, Naveed Sattar31, Alena Stančáková25, Konstantin Strauch, Kent D. Taylor, Barbara Thorand, Gudmar Thorleifsson3, Unnur Thorsteinsdottir32, Jaakko Tuomilehto33, Daniel R. Witte34, Josée Dupuis, Patricia A. Peyser2, Eleftheria Zeggini14, Ruth J. F. Loos13, Philippe Froguel12, Erik Ingelsson35, Lars Lind23, Leif Groop26, Markku Laakso25, Francis S. Collins33, J. Wouter Jukema36, Colin N. A. Palmer, Harald Grallert, Andres Metspalu9, Abbas Dehghan10, Anna Köttgen7, Gonçalo R. Abecasis2, James B. Meigs20, Jerome I. Rotter, Jonathan Marchini1, Oluf Pedersen5, Torben Hansen5, Claudia Langenberg4, Nicholas J. Wareham4, Kari Stefansson32, Anna L. Gloyn, Andrew P. Morris9, Michael Boehnke2, Mark I. McCarthy1 
09 Jan 2018-bioRxiv
TL;DR: Increases in sample size and variant diversity deliver enhanced discovery and single-variant resolution of causal T2D-risk alleles, and the consequent impact on mechanistic insights and clinical translation is highlighted.
Abstract: We aggregated genome-wide genotyping data from 32 European-descent GWAS (74,124 T2D cases, 824,006 controls) imputed to high-density reference panels of >30,000 sequenced haplotypes. Analysis of ~27M variants (~21M with minor allele frequency [MAF] p -8 ; MAF 0.02%-50%; odds ratio [OR] 1.04-8.05), 135 not previously-implicated in T2D-predisposition. Conditional analyses revealed 160 additional distinct association signals ( p -5 ) within the identified loci. The combined set of 403 T2D-risk signals includes 56 low-frequency (0.5%≤MAF 2. Forty-one of the signals displayed effect-size heterogeneity between BMI-unadjusted and adjusted analyses. Increased sample size and improved imputation led to substantially more precise localisation of causal variants than previously attained: at 51 signals, the lead variant after fine-mapping accounted for >80% posterior probability of association (PPA) and at 18 of these, PPA exceeded 99%. Integration with islet regulatory annotations enriched for T2D association further reduced median credible set size (from 42 variants to 32) and extended the number of index variants with PPA>80% to 73. Although most signals mapped to regulatory sequence, we identified 18 genes as human validated therapeutic targets through coding variants that are causal for disease. Genome wide chip heritability accounted for 18% of T2D-risk, and individuals in the 2.5% extremes of a polygenic risk score generated from the GWAS data differed >9-fold in risk. Our observations highlight how increases in sample size and variant diversity deliver enhanced discovery and single-variant resolution of causal T2D-risk alleles, and the consequent impact on mechanistic insights and clinical translation.

51 citations

Journal ArticleDOI
TL;DR: A genome-wide association study meta-analysis of up to 125,584 cases and over 2.5 million control individuals across 60 studies from 25 countries reveals compelling insights regarding disease susceptibility and severity.

50 citations

Journal ArticleDOI
12 Oct 2011-PLOS ONE
TL;DR: Examination of 34 AMD-enriched extended families and controls demonstrated that deletion CNP148 was protective against AMD, independent of SNPs at CFH, and identified a 32-kb region downstream of Y402H shared by all three risk haplotypes, suggesting that this region may be critical for AMD development.
Abstract: Complement factor H shows very strong association with Age-related Macular Degeneration (AMD), and recent data suggest that multiple causal variants are associated with disease. To refine the location of the disease associated variants, we characterized in detail the structural variation at CFH and its paralogs, including two copy number polymorphisms (CNP), CNP147 and CNP148, and several rare deletions and duplications. Examination of 34 AMD-enriched extended families (N = 293) and AMD cases (White N = 4210; Indian = 134; Malay = 140) and controls (White N = 3229; Indian = 117; Malay = 2390) demonstrated that deletion CNP148 was protective against AMD, independent of SNPs at CFH. Regression analysis of seven common haplotypes showed three haplotypes, H1, H6 and H7, as conferring risk for AMD development. Being the most common haplotype, H1 confers the greatest risk by increasing the odds of AMD by 2.75-fold (95% CI = [2.51, 3.01]; p = 8.31×10^-109); Caucasian (H6) and Indian-specific (H7) recombinant haplotypes increase the odds of AMD by 1.85-fold (p = 3.52×10^-9) and by 15.57-fold (p = 0.007), respectively. We identified a 32-kb region downstream of Y402H (rs1061170), shared by all three risk haplotypes, suggesting that this region may be critical for AMD development. Further analysis showed that two SNPs within the 32 kb block, rs1329428 and rs203687, optimally explain disease association. rs1329428 resides in a 20 kb unique sequence block, but rs203687 resides in a 12 kb block that is 89% similar to a noncoding region contained in ΔCNP148. We conclude that causal variation in this region potentially encompasses both regulatory effects at single markers and copy number.

50 citations

Journal ArticleDOI
TL;DR: Insomnia may be a prominent early symptom in cases of CJD linked to the E200K-129M haplotype in which the thalamus is severely affected.
Abstract: Background: Insomnia with predominant thalamic involvement and minor cortical and cerebellar pathologic changes is not characteristic of familial Creutzfeldt–Jakob disease (CJD) but is a hallmark of fatal familial insomnia. Objective: To report a 53-year-old woman with intractable insomnia as her initial symptom of disease. Methods: The authors characterized clinical, pathologic, and molecular features of the disease using EEG, polysomnography, neurohistology, Western blotting, protein sequencing, and prion protein (PrP) gene (PRNP) analysis. Results: The patient developed dysgraphia, dysarthria, bulimia, myoclonus, memory loss, visual hallucinations, and opisthotonos, as well as pyramidal, extrapyramidal, and cerebellar signs. Polysomnographic studies showed an absence of stages 3 and 4, and REM. She died 8 months after onset. On neuropathologic examination, there was major thalamic involvement characterized by neuronal loss, spongiform changes, and prominent gliosis. The inferior olivary nuclei exhibited chromatolysis, neuronal loss, and gliosis. Spongiform changes were mild in the neocortex and not evident in the cerebellum. PrP immunopositivity was present in these areas as well as in the thalamus. PRNP analysis showed the haplotype E200K-129M. Western blot analysis showed the presence of proteinase K (PK)–resistant PrP (PrPSc) with the nonglycosylated isoform of approximately 21 kd, corresponding in size to that of type 1 PrPSc. N-terminal protein sequencing demonstrated PK cleavage sites at glycine (G) 82 and G78, as previously reported in CJD with the E200K-129M haplotype. Conclusions: Insomnia may be a prominent early symptom in cases of CJD linked to the E200K-129M haplotype in which the thalamus is severely affected.

50 citations

Journal ArticleDOI
TL;DR: This article analyzed the contribution of rare variants to 57 diseases and 26 cardiometabolic traits, using data from 200,337 UK Biobank participants with whole-exome sequencing.
Abstract: Cardiometabolic diseases are the leading cause of death worldwide. Despite a known genetic component, our understanding of these diseases remains incomplete. Here, we analyzed the contribution of rare variants to 57 diseases and 26 cardiometabolic traits, using data from 200,337 UK Biobank participants with whole-exome sequencing. We identified 57 gene-based associations, with broad replication of novel signals in Geisinger MyCode. There was a striking risk associated with mutations in known Mendelian disease genes, including MYBPC3, LDLR, GCK, PKD1 and TTN. Many genes showed independent convergence of rare and common variant evidence, including an association between GIGYF1 and type 2 diabetes. We identified several large effect associations for height and 18 unique genes associated with blood lipid or glucose levels. Finally, we found that between 1.0% and 2.4% of participants carried rare potentially pathogenic variants for cardiometabolic disorders. These findings may facilitate studies aimed at therapeutics and screening of these common disorders.

48 citations


Cited by
Journal ArticleDOI
TL;DR: The Burrows-Wheeler Alignment tool (BWA), a new read alignment package based on backward search with the Burrows–Wheeler Transform (BWT), is implemented to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net

43,862 citations
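
The backward search with the Burrows–Wheeler Transform named in the BWA abstract above can be illustrated with a small sketch. This is a toy, exact-match-only version and not BWA's implementation: the function names `build_index` and `backward_search` are hypothetical, the suffix array is built by naive sorting, and mismatches and gaps are not handled.

```python
from collections import Counter


def build_index(text):
    """Build a toy BWT index for `text` (a '$' sentinel is appended)."""
    text += "$"
    # Naive suffix array: sort suffix start positions lexicographically.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$' at i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[ch] = number of characters in text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i] = number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, c_table, occ


def backward_search(pattern, sa, c_table, occ):
    """Return sorted start positions of exact matches of `pattern`."""
    lo, hi = 0, len(sa)  # half-open interval [lo, hi) of suffix-array rows
    for ch in reversed(pattern):  # consume the pattern right to left
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_index("ACGTACGT")
hits = backward_search("ACG", *index)
```

Each step narrows the suffix-array interval using only the count table and the occurrence table, so the cost of a query depends on the pattern length rather than on the reference length.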

Journal ArticleDOI
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal ArticleDOI
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. It focuses on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.

26,280 citations
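
To make the identity-by-state idea in the PLINK abstract above concrete, here is a minimal sketch of a pairwise identity-by-state proportion. It assumes genotypes are coded as minor-allele counts (0, 1 or 2) with `None` for missing calls. The function name `ibs_proportion` is hypothetical, and this is not PLINK's own code or file format.

```python
def ibs_proportion(genotypes_a, genotypes_b):
    """Average proportion of alleles shared identical by state.

    Genotypes are minor-allele counts (0, 1, 2); None marks a missing call.
    At each marker the pair shares 2 - |a - b| alleles out of 2.
    """
    shared = 0
    markers = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip markers where either call is missing
        shared += 2 - abs(a - b)
        markers += 1
    if markers == 0:
        raise ValueError("no markers with calls in both individuals")
    return shared / (2 * markers)


similarity = ibs_proportion([0, 1, 2, None], [1, 1, 0, 2])
```

Pairs with unusually high values across many markers are candidates for relatedness or duplicate samples, which is one way such a statistic feeds into stratification checks.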

Journal ArticleDOI
Eric S. Lander1, Lauren Linton1, Bruce W. Birren1, Chad Nusbaum1, and 245 more authors (29 institutions)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal ArticleDOI
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations
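
The MapReduce style that the GATK abstract above describes, applied to a coverage calculator, can be sketched in a few lines. This is only an illustration of the map and reduce split under simplifying assumptions: reads are hypothetical `(start, length)` tuples on a single contig, and none of the names below belong to the GATK API.

```python
from collections import Counter
from functools import reduce


def map_read(read):
    """Map step: one read -> per-position depth contribution."""
    start, length = read
    return Counter(range(start, start + length))


def reduce_depth(accumulated, partial):
    """Reduce step: merge one read's contribution into the running depth."""
    accumulated.update(partial)  # Counter.update adds counts
    return accumulated


def coverage(reads):
    """Depth of coverage per position for (start, length) reads."""
    return reduce(reduce_depth, map(map_read, reads), Counter())


depth = coverage([(0, 3), (2, 3)])
```

Because the map step touches one read at a time and the reduce step only merges counts, the data traversal can be split across workers without changing the analysis logic, which is the separation the abstract emphasizes.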