Author

Gonçalo R. Abecasis

Bio: Gonçalo R. Abecasis is an academic researcher from the University of Michigan. The author has contributed to research in topics: Genome-wide association study & Population. The author has an h-index of 179 and has co-authored 595 publications receiving 230,323 citations. Previous affiliations of Gonçalo R. Abecasis include Johns Hopkins University School of Medicine & Wellcome Trust Centre for Human Genetics.


Papers
Journal Article
V Lagou¹, Reedik Mägi¹, Hottenga J-J.², H Grallert +235 more
Institutions (81)
TL;DR: The original version of this article contained an error in Fig. 2, in which panels a and b were inadvertently swapped. This has now been corrected in the PDF and HTML versions of the Article.
Abstract: The original version of this Article contained an error in Fig. 2, in which panels a and b were inadvertently swapped. This has now been corrected in the PDF and HTML versions of the Article.

2 citations

Posted Content
19 Sep 2018-bioRxiv
TL;DR: Using genetic risk scores constructed from these eGFR meta-analysis results, it is shown that associated variants are generally predictive of CKD but improve detection only modestly compared with other known clinical risk factors.
Abstract: Chronic Kidney Disease (CKD) is a growing health burden currently affecting 10-15% of adults worldwide. Estimated glomerular filtration rate (eGFR) as a marker of kidney function is commonly used to diagnose CKD. Previous genome-wide association study (GWAS) meta-analyses of CKD and eGFR or related phenotypes have identified a number of variants associated with kidney function, but these only explain a fraction of the variability in kidney phenotypes attributed to genetic components. To extend these studies, we analyzed data from the Nord-Trondelag Health Study (HUNT), which is more densely imputed than previous studies, and performed a GWAS meta-analysis of eGFR with publicly available summary statistics, more than doubling the sample size of previous meta-analyses. We identified 147 loci (53 novel loci) associated with eGFR, including genes involved in transcriptional regulation, kidney development, cellular signaling, metabolism, and solute transport. Moreover, genes at these loci show enriched expression in urogenital tissues and highlight gene sets known to play a role in kidney function. In addition, sex-stratified analysis identified three regions (prioritized genes: PPM1J, MCL1, and SLC47A1) with more significant effects in women than men. Using genetic risk scores constructed from these eGFR meta-analysis results, we show that associated variants are generally predictive of CKD but improve detection only modestly compared with other known clinical risk factors. Collectively, these results yield additional insight into the genetic factors underlying kidney function and progression to CKD.

2 citations

Posted Content
30 Mar 2019-bioRxiv
TL;DR: A meta-analysis framework that uses summary statistics to test for association between multiple continuous phenotypes and variants in a region of interest and demonstrates the utility and improved power of Meta-MultiSKAT in the meta-analyses of four white blood cell subtype traits.
Abstract: The power of genetic association analyses can be increased by jointly meta-analyzing multiple correlated phenotypes. Here, we develop a meta-analysis framework, Meta-MultiSKAT, that uses summary statistics to test for association between multiple continuous phenotypes and variants in a region of interest. Our approach models the heterogeneity of effects between studies through a kernel matrix and performs a variance component test for association. Using a genotype kernel, our approach can test for rare variants and the combined effects of both common and rare variants. To achieve robust power, within Meta-MultiSKAT, we developed fast and accurate omnibus tests combining different models of genetic effects, functional genomic annotations, multiple correlated phenotypes and heterogeneity across studies. Additionally, Meta-MultiSKAT accommodates situations where studies do not share exactly the same set of phenotypes or have differing correlation patterns among the phenotypes. Simulation studies confirm that Meta-MultiSKAT can maintain type-I error rate at exome-wide level of 2.5×10⁻⁶. Further simulations under different models of association show that Meta-MultiSKAT can improve power of detection from 23% to 38% on average over single phenotype-based meta-analysis approaches. We demonstrate the utility and improved power of Meta-MultiSKAT in the meta-analyses of four white blood cell subtype traits from the Michigan Genomics Initiative (MGI) and SardiNIA studies.

2 citations

Journal Article
TL;DR: Repeated measures substantially improve power and the proportional increase in LOD score depends mostly on measurement error and total heritability but not much on marker map, the number of alleles per marker or family structure.
Abstract: Background: When subjects are measured multiple times, linkage analysis needs to appropriately model these repeated measures. A number of methods have been proposed to model repeate

2 citations


Cited by
Journal Article
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net Contact: [email protected]

43,862 citations
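
The backward search with the Burrows–Wheeler Transform named in the BWA abstract above can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not BWA's implementation: it handles exact matching only (no mismatches or gaps), builds the suffix array by naive sorting, and the function names `bwt_index` and `backward_search` are hypothetical.

```python
def bwt_index(text):
    """Build a suffix array plus the C and Occ tables used by backward search."""
    text += "$"  # sentinel that sorts before A, C, G, T
    # Naive suffix array: acceptable for a toy example, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around for suffix 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in the text that are strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def backward_search(pattern, sa, C, occ):
    """Return the sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    for ch in reversed(pattern):  # consume the pattern from its last character
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:  # interval is empty: the pattern does not occur
            return []
    return sorted(sa[lo:hi])


sa, C, occ = bwt_index("GATTACATTA")
print(backward_search("TTA", sa, C, occ))
```

Each pattern character narrows the suffix-array interval using only the two lookup tables, so the search cost grows with the pattern length rather than the reference length.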

Journal Article
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation, focusing on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.

26,280 citations
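
As a rough illustration of the identity-by-state information mentioned in the PLINK abstract above, the sketch below computes the proportion of alleles two individuals share identical by state. It is not PLINK's code: it assumes genotypes are coded as 0, 1, or 2 copies of one allele with `None` for missing calls, and the function name `ibs_proportion` is hypothetical.

```python
def ibs_proportion(g1, g2):
    """Mean proportion of alleles shared identical by state by two individuals.

    g1 and g2 are equal-length sequences of genotypes coded 0, 1, or 2
    (copies of one allele); None marks a missing genotype (assumed coding).
    """
    shared = 0  # running count of alleles shared IBS
    markers = 0  # markers where both genotypes are observed
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip markers with a missing call in either individual
        shared += 2 - abs(a - b)  # 2, 1, or 0 alleles shared at this marker
        markers += 1
    if markers == 0:
        raise ValueError("no markers with both genotypes observed")
    return shared / (2 * markers)


print(ibs_proportion([0, 1, 2, 2], [0, 2, 0, None]))
```

Pairwise values of this kind form the raw material for the stratification and relatedness analyses the abstract describes; estimating identity by descent requires additional modeling that this sketch does not attempt.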

Journal Article
Eric S. Lander¹, Lauren Linton¹, Bruce W. Birren¹, Chad Nusbaum¹ +245 more
Institutions (29)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations