Author

Gonçalo R. Abecasis

Bio: Gonçalo R. Abecasis is an academic researcher at the University of Michigan. The author has contributed to research in topics: Genome-wide association study & Population. The author has an h-index of 179 and has co-authored 595 publications receiving 230,323 citations. Previous affiliations of Gonçalo R. Abecasis include Johns Hopkins University School of Medicine & Wellcome Trust Centre for Human Genetics.


Papers
Journal Article
Mandy Y.M. Ng1, Douglas F. Levinson2, Stephen V. Faraone3, Brian K. Suarez4, Lynn E. DeLisi5, Lynn E. DeLisi6, Tadao Arinami7, Brien P. Riley8, Tiina Paunio9, Tiina Paunio10, A. E. Pulver11, Irmansyah12, Peter Holmans13, Michael Escamilla14, Dieter B. Wildenauer15, Nigel Williams13, Claudine Laurent16, Bryan J. Mowry17, Linda M. Brzustowicz18, Michel Maziade19, Pamela Sklar20, D L Garver21, Gonçalo R. Abecasis22, Bernard Lerer, M D Fallin11, Hugh Gurling23, Pablo V. Gejman24, Eva Lindholm25, Hans W. Moises26, William Byerley27, Ellen M. Wijsman28, Paola Forabosco1, Ming T. Tsuang20, Ming T. Tsuang29, H-G Hwu30, Yuji Okazaki31, Kenneth S. Kendler8, Brandon Wormley8, Ayman H. Fanous21, Ayman H. Fanous32, Dermot Walsh, Francis A. O'Neill33, Leena Peltonen, Gerald Nestadt11, Virginia K. Lasseter11, Kung-Yee Liang11, G M Papadimitriou34, Dimitris Dikeos34, Sibylle G. Schwab15, Michael John Owen13, Michael Conlon O'Donovan13, Nadine Norton13, Elizabeth Hare14, Henriette Raventós35, Humberto Nicolini36, Margot Albus, Wolfgang Maier37, Vishwajit L. Nimgaonkar38, Lars Terenius39, J. Mallet40, Melanie Jay16, Stephanie Godard41, Deborah A. Nertney17, M. Alexander2, Raymond R. Crowe42, Jeremy M. Silverman43, Anne S. Bassett44, M-A Roy19, Chantal Mérette19, Carlos N. Pato45, Michele T. Pato45, J. Louw Roos46, Yoav Kohn, Daniela Amann-Zalcenstein47, Gursharan Kalsi23, Andrew McQuillin23, David Curtis48, Jon Brynjolfson, Thordur Sigmundsson, Hannes Petursson, Alan R. Sanders24, Jubao Duan24, Elena Jazin25, Marina Myles-Worsley3, Maria Karayiorgou49, Cathryn M. Lewis1 
King's College London1, Stanford University2, State University of New York Upstate Medical University3, Washington University in St. Louis4, New York University5, Nathan Kline Institute for Psychiatric Research6, University of Tsukuba7, Virginia Commonwealth University8, National Institute for Health and Welfare9, University of Helsinki10, Johns Hopkins University11, University of Indonesia12, Cardiff University13, University of Texas Health Science Center at San Antonio14, University of Western Australia15, Pierre-and-Marie-Curie University16, University of Queensland17, Rutgers University18, Laval University19, Harvard University20, Veterans Health Administration21, University of Michigan22, University College London23, NorthShore University HealthSystem24, Uppsala University25, University of Kiel26, University of California, San Francisco27, University of Washington28, University of California, San Diego29, National Taiwan University30, Tokyo Metropolitan Matsuzawa Hospital31, Georgetown University32, Queen's University Belfast33, National and Kapodistrian University of Athens34, University of Costa Rica35, Universidad Autónoma de la Ciudad de México36, University of Bonn37, University of Pittsburgh38, Karolinska Institutet39, University of Paris40, French Institute of Health and Medical Research41, University of Iowa42, Icahn School of Medicine at Mount Sinai43, University of Toronto44, University of Southern California45, University of Pretoria46, Weizmann Institute of Science47, Queen Mary University of London48, Columbia University49
TL;DR: The primary analysis met empirical criteria for ‘aggregate’ genome-wide significance, indicating that some or all of 10 bins are likely to contain loci linked to SCZ, including regions of chromosomes 1, 2q, 3q, 4q, 5q, 8p and 10q.
Abstract: A genome scan meta-analysis (GSMA) was carried out on 32 independent genome-wide linkage scan analyses that included 3255 pedigrees with 7413 genotyped cases affected with schizophrenia (SCZ) or related disorders. The primary GSMA divided the autosomes into 120 bins, rank-ordered the bins within each study according to the most positive linkage result in each bin, summed these ranks (weighted for study size) for each bin across studies and determined the empirical probability of a given summed rank (P(SR)) by simulation. Suggestive evidence for linkage was observed in two single bins, on chromosomes 5q (142-168 Mb) and 2q (103-134 Mb). Genome-wide evidence for linkage was detected on chromosome 2q (119-152 Mb) when bin boundaries were shifted to the middle of the previous bins. The primary analysis met empirical criteria for 'aggregate' genome-wide significance, indicating that some or all of 10 bins are likely to contain loci linked to SCZ, including regions of chromosomes 1, 2q, 3q, 4q, 5q, 8p and 10q. In a secondary analysis of 22 studies of European-ancestry samples, suggestive evidence for linkage was observed on chromosome 8p (16-33 Mb). Although the newer genome-wide association methodology has greater power to detect weak associations to single common DNA sequence variants, linkage analysis can detect diverse genetic effects that segregate in families, including multiple rare variants within one locus or several weakly associated loci in the same region. Therefore, the regions supported by this meta-analysis deserve close attention in future studies.
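The rank-summing procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the convention that a larger rank means stronger linkage evidence, the tie-breaking by bin order, and the caller-supplied study weights are all assumptions made for illustration, and the null simulation simply draws an independent uniform rank per study for a single bin.

```python
import random

def gsma_summed_ranks(studies, weights):
    """studies[s][b] is the most positive linkage score in bin b of study s.

    Returns the weighted summed rank of each bin across studies.
    Assumption: rank 1 is the weakest bin, rank n_bins the strongest.
    """
    n_bins = len(studies[0])
    summed = [0.0] * n_bins
    for scores, weight in zip(studies, weights):
        # Rank bins within this study; ties are broken by bin order.
        order = sorted(range(n_bins), key=lambda b: scores[b])
        for rank, b in enumerate(order, start=1):
            summed[b] += weight * rank
    return summed

def empirical_p(observed, n_bins, weights, n_sim=2000, seed=0):
    """Estimate P(SR): the share of simulated summed ranks >= observed.

    Assumption: under the null, a bin's rank in each study is uniform
    on 1..n_bins and independent across studies.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        simulated = sum(w * rng.randint(1, n_bins) for w in weights)
        if simulated >= observed:
            hits += 1
    return hits / n_sim

# Hypothetical toy data: two studies, three bins, equal weights.
toy_studies = [[0.1, 2.0, 0.5], [0.3, 1.5, 0.2]]
toy_weights = [1.0, 1.0]
toy_summed = gsma_summed_ranks(toy_studies, toy_weights)
```

In this toy example the second bin is the top-ranked bin in both studies, so it receives the largest summed rank; `empirical_p(toy_summed[1], 3, toy_weights)` then estimates how often chance alone would do as well.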

274 citations

Journal Article
01 Nov 2008-Diabetes
TL;DR: These findings point to a molecular mechanism in humans by which higher triglycerides and CRP can be coupled with lower plasma glucose concentrations and position GCKR in central pathways regulating both hepatic triglyceride and glucose metabolism.
Abstract: OBJECTIVE: Using the genome-wide-association approach, we recently identified the glucokinase regulatory protein gene (GCKR, rs780094) region as a novel quantitative trait locus for plasma triglyceride concentration in Europeans. Here, we sought to study the association of GCKR variants with metabolic phenotypes including measures of glucose homeostasis, to evaluate the GCKR locus in samples of non-European ancestry, and to fine-map across the associated genomic interval. RESEARCH DESIGN AND METHODS: We performed association studies in 12 independent cohorts comprising a total of >45,000 individuals representing several ancestral groups (whites from Northern and Southern Europe, whites from the United States, African Americans from the United States, Hispanics of Caribbean origin, and Chinese, Malays and Asian Indians from Singapore). We conducted genetic fine-mapping across the ∼417 kilobase region of linkage disequilibrium spanning GCKR and 16 other genes on chromosome 2p23 by imputing untyped HapMap SNPs and genotyping 104 SNPs across the associated genomic interval. RESULTS: We provide comprehensive evidence that GCKR rs780094 is associated with opposite effects on fasting plasma triglyceride (p_meta = 3 × 10(-56)) and glucose (p_meta = 1 × 10(-13)) concentrations. In addition, we confirmed recent reports that the same SNP is associated with C-reactive protein level (p = 5 × 10(-5)). Both fine mapping approaches revealed a common missense GCKR variant (rs1260326, Pro446Leu, 34% frequency, r² = 0.93 with rs780094) as the strongest association signal in the region. CONCLUSIONS: These findings point to a molecular mechanism in humans by which higher triglycerides and C-reactive protein can be coupled with lower plasma glucose concentrations and position GCKR in central pathways regulating both hepatic triglyceride and glucose metabolism.

270 citations

Journal Article
TL;DR: A substantial proportion, but not all, of the associations with borderline genome-wide significance represent replicable, possibly genuine associations, which suggests a possible relaxation in the current GWS threshold.
Abstract: Background: Robust replication is a sine qua non for the rigorous documentation of proposed associations in the genome-wide association (GWA) setting. Currently, associations of common variants reaching P ≤ 5 × 10(-8) are considered replicated. However, there is some ambiguity about the most suitable threshold for claiming genome-wide significance. Methods: We defined as 'borderline' associations those with P > 5 × 10(-8) and P ≤ 1 × 10(-7). The eligible associations were retrieved using the 'Catalog of Published Genome-Wide Association Studies'. For each association we assessed whether it reached P ≤ 5 × 10(-8) with inclusion of additional data from subsequent GWA studies. Results: Thirty-four eligible genotype-phenotype associations were evaluated with data and clarifications contributed from diverse investigators. Replication data from subsequent GWA studies could be obtained for 26 of them. Of those, 19 associations (73%) reached P ≤ 5 × 10(-8) for the same or a related trait implicating either the exact same allele or one in very high linkage disequilibrium and 17 reached P 10(-6) [corresponding false-discovery rate 19% (95% CI 7-39%)]. Conclusion: A substantial proportion, but not all, of the associations with borderline genome-wide significance represent replicable, possibly genuine associations. Our empirical evaluation suggests a possible relaxation in the current GWS threshold.

269 citations

Journal Article
John C. Chambers1, Weihua Zhang1, Graham M. Lord2, Graham M. Lord3, Pim van der Harst, Debbie A Lawlor4, Joban Sehmi1, Daniel P. Gale5, Mark N. Wass1, Kourosh R. Ahmadi6, Stephan J. L. Bakker7, Jacqui Beckmann8, Henk J. G. Bilo7, Murielle Bochud8, Morris J. Brown9, Mark J. Caulfield10, John M. C. Connell11, H. Terence Cook1, Ioana Cotlarciuc6, George Davey Smith4, Ranil de Silva1, Guohong Deng1, Olivier Devuyst12, Lambert D Dikkeschei, Nada Dimkovic, Mark Dockrell, Anna F. Dominiczak11, Shah Ebrahim2, Thomas Eggermann, Martin Farrall13, Luigi Ferrucci, Jürgen Floege14, Nita G. Forouhi9, Ron T. Gansevoort15, Xijin Han16, Bo Hedblad17, Jaap J. Homan van der Heide7, Bouke G. Hepkema7, Maria P. Hernandez-Fuentes3, Maria P. Hernandez-Fuentes2, Elina Hyppönen5, Toby Johnson8, Paul E. de Jong, Nanne Kleefstra, Vasiliki Lagou15, Marta Lapsley, Yun Li16, Ruth J. F. Loos9, Jian'an Luan9, Karin Luttropp18, Céline Maréchal12, Olle Melander17, Patricia B. Munroe10, Louise Nordfors18, Afshin Parsa, Leena Peltonen19, Leena Peltonen20, Leena Peltonen21, Brenda W.J.H. Penninx22, Brenda W.J.H. Penninx23, Brenda W.J.H. Penninx7, Esperanza Perucha2, Esperanza Perucha3, Anneli Pouta24, Inga Prokopenko13, Paul Roderick25, Aimo Ruokonen24, Nilesh J. Samani26, Serena Sanna, Martin Schalling18, David Schlessinger, Georg Schlieper14, Marc A. Seelen15, Alan R. Shuldiner, Marketa Sjögren17, Johannes H. Smit22, Johannes H. Smit23, Johannes H. Smit7, Harold Snieder15, Nicole Soranzo6, Tim D. Spector2, Peter Stenvinkel27, Michael J.E. Sternberg1, R. Swaminathan3, Toshiko Tanaka, L.J. Ubink-Veltmaat, Manuela Uda, Peter Vollenweider8, Chris Wallace9, Dawn M. Waterworth28, Klaus Zerres, Gérard Waeber8, Nicholas J. Wareham9, Patrick H. Maxwell5, Mark I. McCarthy13, Marjo-Riitta Järvelin, Vincent Mooser28, Gonçalo R. Abecasis16, Liz Lightstone1, James Scott1, Gerjan Navis, Paul Elliott1, Jaspal S. Kooner1 
TL;DR: Using genome-wide association, common variants at 2p12–p13, 6q26, 17q23 and 19q13 associated with serum creatinine, a marker of kidney function, are identified; two of these variants are also associated with chronic kidney disease.
Abstract: Using genome-wide association, we identify common variants at 2p12-p13, 6q26, 17q23 and 19q13 associated with serum creatinine, a marker of kidney function (P = 10(-10) to 10(-15)). Of these, rs10206899 (near NAT8, 2p12-p13) and rs4805834 (near SLC7A9, 19q13) were also associated with chronic kidney disease (P = 5.0 x 10(-5) and P = 3.6 x 10(-4), respectively). Our findings provide insight into metabolic, solute and drug-transport pathways underlying susceptibility to chronic kidney disease.

262 citations

Journal Article
TL;DR: GotCloud is presented, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data that automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information.
Abstract: The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies.

257 citations


Cited by
Journal Article
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net
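Backward search over a Burrows–Wheeler Transform, the core idea named in this abstract, can be sketched in a few lines. This is a toy illustration only and is not BWA's code: it handles exact matches (no mismatches or gaps), builds the suffix array by naive sorting, and all function names are hypothetical.

```python
def bwt_index(text):
    """Build a suffix array, a C table and occurrence counts for text + '$'."""
    text += "$"  # sentinel that sorts before A, C, G, T
    # Naive suffix array: fine for a toy example, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character for each row is the one preceding the suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c] = number of characters in text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, C, occ

def backward_search(pattern, sa, C, occ):
    """Return the sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    for c in reversed(pattern):  # extend the match one character leftwards
        if c not in C:
            return []
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])

# Hypothetical toy reference sequence.
toy_sa, toy_C, toy_occ = bwt_index("ACGTACGT")
toy_hits = backward_search("CGT", toy_sa, toy_C, toy_occ)
```

Each loop iteration narrows the suffix-array interval using only the `C` table and occurrence counts, so the search time depends on the pattern length rather than on the reference length.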

43,862 citations

Journal Article
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. It focuses on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.
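As a rough illustration of the identity-by-state idea mentioned in this abstract, the sketch below computes the average proportion of alleles two individuals share across SNPs. It is not PLINK's implementation: the 0/1/2 allele-count coding, the use of `None` for missing genotypes, and the function name are assumptions made for this example.

```python
def ibs_proportion(genotypes_a, genotypes_b):
    """Average proportion of alleles shared identical by state.

    Genotypes are coded as copies of one allele (0, 1 or 2); None marks
    a missing call. At each SNP the pair shares 2 - |a - b| alleles.
    """
    shared = 0
    n_snps = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue  # skip SNPs where either genotype is missing
        shared += 2 - abs(a - b)
        n_snps += 1
    if n_snps == 0:
        raise ValueError("no SNPs with both genotypes observed")
    return shared / (2 * n_snps)

# Hypothetical genotype vectors for two individuals over four SNPs.
toy_ibs = ibs_proportion([0, 1, 2, None], [1, 1, 0, 2])
```

Pairs with unusually high values across the genome are candidates for relatedness or duplication, while systematic differences between groups can hint at population stratification.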

26,280 citations

Journal Article
Eric S. Lander, Lauren Linton, Bruce W. Birren, Chad Nusbaum, and 245 more authors (29 institutions)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
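The MapReduce style that this abstract refers to can be illustrated with a toy coverage calculator. This sketch does not use or reproduce the GATK API: the representation of a read as a `(start, length)` tuple and all function names are hypothetical, chosen only to show the map and reduce steps.

```python
from collections import Counter
from functools import reduce

def map_read(read):
    """Map step: turn one aligned read into per-position depth counts."""
    start, length = read
    return Counter(range(start, start + length))

def reduce_depth(accumulated, partial):
    """Reduce step: add one read's counts to the running depth."""
    accumulated.update(partial)  # Counter.update adds counts
    return accumulated

def coverage(reads):
    """Depth at every reference position covered by at least one read."""
    return reduce(reduce_depth, map(map_read, reads), Counter())

# Hypothetical reads: two 3-base reads overlapping at position 2.
toy_depth = coverage([(0, 3), (2, 3)])
```

Because the map step treats each read independently and the reduce step only merges counts, the same two functions could in principle be run over separate chunks of reads and combined afterwards, which is the property that makes this pattern easy to parallelize.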

20,557 citations