Author

Gonçalo R. Abecasis

Other affiliations: Johns Hopkins University School of Medicine, Wellcome Trust Centre for Human Genetics, University of California, Los Angeles ...

Bio: Gonçalo R. Abecasis is an academic researcher at the University of Michigan. The author has contributed to research on the topics Genome-wide association study and Population. The author has an h-index of 179 and has co-authored 595 publications, which have received 230,323 citations. Previous affiliations of Gonçalo R. Abecasis include Johns Hopkins University School of Medicine and Wellcome Trust Centre for Human Genetics.

Papers published on a yearly basis [chart of publications per year, 2000–2023; counts not recoverable]

Papers

Journal Article•DOI•

Heritability of Cardiovascular and Personality Traits in 6,148 Sardinians

Giuseppe Pilia, Wei-Min Chen¹, Angelo Scuteri², Marco Orru, Giuseppe Albai, Mariano Dei, Sandra Lai, Gianluca Usala, Monica Lai, Paola Loi, Cinzia Mameli, Loredana Vacca, Manila Deiana, Nazario Olla, Marco Masala, Antonio Cao, Samer S. Najjar², Antonio Terracciano², Timur Nedorezov², Alexei A. Sharov², Alan B. Zonderman², Gonçalo R. Abecasis¹, Paul T. Costa², Edward G. Lakatta², David Schlessinger²•Institutions (2)

University of Michigan¹, National Institutes of Health²

01 Jan 2005-PLOS Genetics

TL;DR: Significant evidence for heritability of many medically important traits, including cardiovascular function and personality is found, and evidence for heterogeneity by age and sex suggests that models allowing for these differences will be important in mapping quantitative traits.

Abstract: In family studies, phenotypic similarities between relatives yield information on the overall contribution of genes to trait variation. Large samples are important for these family studies, especially when comparing heritability between subgroups such as young and old, or males and females. We recruited a cohort of 6,148 participants, aged 14–102 y, from four clustered towns in Sardinia. The cohort includes 34,469 relative pairs. To extract genetic information, we implemented software for variance components heritability analysis, designed to handle large pedigrees, analyze multiple traits simultaneously, and model heterogeneity. Here, we report heritability analyses for 98 quantitative traits, focusing on facets of personality and cardiovascular function. We also summarize results of bivariate analyses for all pairs of traits and of heterogeneity analyses for each trait. We found a significant genetic component for every trait. On average, genetic effects explained 40% of the variance for 38 blood tests, 51% for five anthropometric measures, 25% for 20 measures of cardiovascular function, and 19% for 35 personality traits. Four traits showed significant evidence for an X-linked component. Bivariate analyses suggested overlapping genetic determinants for many traits, including multiple personality facets and several traits related to the metabolic syndrome; but we found no evidence for shared genetic determinants that might underlie the reported association of some personality traits and cardiovascular risk factors. Models allowing for heterogeneity suggested that, in this cohort, the genetic variance was typically larger in females and in younger individuals, but interesting exceptions were observed. For example, narrow heritability of blood pressure was approximately 26% in individuals more than 42 y old, but only approximately 8% in younger individuals. 
Despite the heterogeneity in effect sizes, the same loci appear to contribute to variance in young and old, and in males and females. In summary, we find significant evidence for heritability of many medically important traits, including cardiovascular function and personality. Evidence for heterogeneity by age and sex suggests that models allowing for these differences will be important in mapping quantitative traits.
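
As a rough illustration of the quantities discussed in the abstract above, the sketch below computes narrow-sense heritability from additive and residual variance components, and the phenotypic correlation expected between relatives under a purely additive model with no shared environment. It is not the variance-components software the authors describe; the function names and example numbers are assumptions chosen for illustration.

```python
def narrow_heritability(var_additive, var_residual):
    """h^2 = additive genetic variance / total phenotypic variance."""
    total = var_additive + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_additive / total


def expected_pair_correlation(kinship, h2):
    """Expected trait correlation for a relative pair under a purely
    additive model with no shared environment: r = 2 * kinship * h^2."""
    return 2.0 * kinship * h2


def h2_from_pair_correlation(correlation, kinship):
    """Simple moment estimate of h^2 from one class of relative pairs.
    This is a simplification, not a variance-components fit."""
    if kinship <= 0:
        raise ValueError("kinship must be positive for related pairs")
    return correlation / (2.0 * kinship)


if __name__ == "__main__":
    # Hypothetical variance components (arbitrary units).
    h2 = narrow_heritability(var_additive=26.0, var_residual=74.0)
    print("h2:", h2)
    # Full siblings have kinship coefficient 1/4.
    print("expected sib correlation:", expected_pair_correlation(0.25, h2))
```

A variance-components analysis fits these parameters jointly over whole pedigrees; the pairwise formula is only a way to see how relatedness and heritability combine.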

547 citations

Journal Article•DOI•

Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction

Ron Do¹, Ron Do², Nathan O. Stitziel³, Hong-Hee Won², Hong-Hee Won¹, Anders Berg Jørgensen⁴, Stefano Duga⁵, Pier Angelica Merlini, Adam Kiezun¹, Martin Farrall⁶, Anuj Goel⁶, Or Zuk¹, Illaria Guella⁵, Rosanna Asselta⁵, Leslie A. Lange⁷, Gina M. Peloso², Gina M. Peloso¹, Paul L. Auer⁸, Domenico Girelli⁹, Nicola Martinelli⁹, Deborah N. Farlow¹, Mark A. DePristo¹, Robert Roberts¹⁰, Alex Stewart¹⁰, Danish Saleheen¹¹, John Danesh¹¹, Stephen E. Epstein¹², Suthesh Sivapalaratnam¹³, G. Kees Hovingh¹³, John J.P. Kastelein¹³, Nilesh J. Samani¹⁴, Heribert Schunkert¹⁵, Jeanette Erdmann¹⁶, Svati H. Shah¹⁷, William E. Kraus¹⁷, Robert W. Davies¹⁰, Majid Nikpay¹⁰, Christopher T. Johansen¹⁸, Jian Wang¹⁸, Robert A. Hegele¹⁸, Eliana Hechter¹, Winfried März¹⁹, Winfried März²⁰, Winfried März²¹, Marcus E. Kleber²⁰, Jie Huang, Andrew D. Johnson²², Mingyao Li²³, Greg L. Burke²⁴, Myron D. Gross²⁵, Yongmei Liu²⁶, Themistocles L. Assimes²⁷, Gerardo Heiss⁷, Ethan M. Lange⁷, Aaron R. Folsom²⁵, Herman A. Taylor²⁸, Oliviero Olivieri⁹, Anders Hamsten²⁹, Robert Clarke⁶, Dermot F. Reilly³⁰, Wu Yin³⁰, Manuel A. Rivas⁶, Peter Donnelly⁶, Jacques E. Rossouw²², Bruce M. Psaty³¹, Bruce M. Psaty³², David M. Herrington²⁶, James G. Wilson²⁸, Stephen S. Rich³³, Michael J. Bamshad³², Russell P. Tracy³⁴, L. Adrienne Cupples³⁵, Daniel J. Rader²³, Muredach P. Reilly²³, John A. Spertus³⁶, Sharon Cresci³, Jaana Hartiala³⁷, W.H. Wilson Tang³⁸, Stanley L. Hazen³⁸, Hooman Allayee³⁷, Alexander P. Reiner³², Alexander P. Reiner⁸, Christopher S. Carlson⁸, Charles Kooperberg⁸, Rebecca D. Jackson³⁹, Eric Boerwinkle⁴⁰, Eric S. Lander¹, Stephen M. Schwartz³², Stephen M. Schwartz⁸, David S. Siscovick³², Ruth McPherson¹⁰, Anne Tybjærg-Hansen⁴, Gonçalo R. Abecasis⁴¹, Hugh Watkins⁶, Deborah A. Nickerson³², Diego Ardissino, Shamil R. Sunyaev¹, Shamil R. Sunyaev², Christopher J. O'Donnell, David Altshuler², David Altshuler¹, Stacey Gabriel¹, Sekar Kathiresan², Sekar Kathiresan¹•Institutions (41)

Broad Institute¹, Harvard University², Washington University in St. Louis³, University of Copenhagen⁴, University of Milan⁵, University of Oxford⁶, University of North Carolina at Chapel Hill⁷, Fred Hutchinson Cancer Research Center⁸, University of Verona⁹, University of Ottawa¹⁰, University of Cambridge¹¹, Memorial Hospital of South Bend¹², University of Amsterdam¹³, University of Leicester¹⁴, Technische Universität München¹⁵, University of Lübeck¹⁶, Duke University¹⁷, University of Western Ontario¹⁸, Synlab Group¹⁹, Heidelberg University²⁰, Medical University of Graz²¹, National Institutes of Health²², University of Pennsylvania²³, University of Alabama at Birmingham²⁴, University of Minnesota²⁵, Wake Forest University²⁶, Stanford University²⁷, University of Mississippi²⁸, Karolinska Institutet²⁹, Merck & Co.³⁰, Group Health Cooperative³¹, University of Washington³², University of Virginia³³, University of Vermont³⁴, Boston University³⁵, University of Missouri–Kansas City³⁶, University of Southern California³⁷, Cleveland Clinic³⁸, Ohio State University³⁹, University of Texas Health Science Center at Houston⁴⁰, University of Michigan⁴¹

05 Feb 2015-Nature

TL;DR: Kathiresan and colleagues used exome sequencing of nearly 10,000 people to identify alleles associated with early-onset myocardial infarction; mutations in low-density lipoprotein receptor (LDLR) or apolipoprotein A-V (APOA5) were associated with disease risk.

Abstract: Exome sequence analysis of nearly 10,000 people was carried out to identify alleles associated with early-onset myocardial infarction; mutations in low-density lipoprotein receptor (LDLR) or apolipoprotein A-V (APOA5) were associated with disease risk, identifying the key roles of low-density lipoprotein cholesterol and metabolism of triglyceride-rich lipoproteins. Sekar Kathiresan and colleagues use exome sequencing of nearly 10,000 people to probe the contribution of multiple rare mutations within a gene to risk for myocardial infarction at a population level. They find that mutations in low-density lipoprotein receptor (LDLR) or apolipoprotein A-V (APOA5) are associated with disease risk. When compared with non-carriers, LDLR mutation carriers had higher plasma levels of LDL cholesterol, whereas APOA5 mutation carriers had higher plasma levels of triglycerides. As well as confirming that APOA5 is a myocardial infarction gene, this work informs the design and conduct of rare-variant association studies for complex diseases. Myocardial infarction (MI), a leading cause of death around the world, displays a complex pattern of inheritance [1,2]. When MI occurs early in life, genetic inheritance is a major component to risk [1]. Previously, rare mutations in low-density lipoprotein (LDL) genes have been shown to contribute to MI risk in individual families [3–8], whereas common variants at more than 45 loci have been associated with MI risk in the population [9–15]. Here we evaluate how rare mutations contribute to early-onset MI risk in the population. We sequenced the protein-coding regions of 9,793 genomes from patients with MI at an early age (≤50 years in males and ≤60 years in females) along with MI-free controls. We identified two genes in which rare coding-sequence mutations were more frequent in MI cases versus controls at exome-wide significance.
At low-density lipoprotein receptor (LDLR), carriers of rare non-synonymous mutations were at 4.2-fold increased risk for MI; carriers of null alleles at LDLR were at even higher risk (13-fold difference). Approximately 2% of early MI cases harbour a rare, damaging mutation in LDLR; this estimate is similar to one made more than 40 years ago using an analysis of total cholesterol [16]. Among controls, about 1 in 217 carried an LDLR coding-sequence mutation and had plasma LDL cholesterol > 190 mg dl⁻¹. At apolipoprotein A-V (APOA5), carriers of rare non-synonymous mutations were at 2.2-fold increased risk for MI. When compared with non-carriers, LDLR mutation carriers had higher plasma LDL cholesterol, whereas APOA5 mutation carriers had higher plasma triglycerides. Recent evidence has connected MI risk with coding-sequence mutations at two genes functionally related to APOA5, namely lipoprotein lipase [15,17] and apolipoprotein C-III [18,19]. Combined, these observations suggest that, as well as LDL cholesterol, disordered metabolism of triglyceride-rich lipoproteins contributes to MI risk.

521 citations

Journal Article•DOI•

The Metabochip, a Custom Genotyping Array for Genetic Studies of Metabolic, Cardiovascular, and Anthropometric Traits

Benjamin F. Voight¹, Hyun Min Kang², Jun Ding³, Cameron D. Palmer¹, Cameron D. Palmer⁴, Carlo Sidore², Carlo Sidore⁵, Peter S. Chines³, N P Burtt¹, Christian Fuchsberger², Yun Li², Jeanette Erdmann⁶, Timothy M. Frayling⁷, Iris M. Heid⁸, Anne U. Jackson², Toby Johnson⁹, Tuomas O. Kilpeläinen, Cecilia M. Lindgren¹⁰, Andrew P. Morris¹⁰, Inga Prokopenko¹¹, Inga Prokopenko¹⁰, Joshua C. Randall¹⁰, Richa Saxena¹, Richa Saxena¹², Nicole Soranzo¹³, Elizabeth K. Speliotes², Elizabeth K. Speliotes¹, Tanya M. Teslovich², Eleanor Wheeler¹³, Jared Maguire¹, Melissa Parkin¹, Simon C. Potter¹³, Nigel W. Rayner¹¹, Nigel W. Rayner¹⁰, Nigel W. Rayner¹³, Neil Robertson¹⁰, Neil Robertson¹¹, Kathy Stirrups¹³, Wendy Winckler¹, Serena Sanna, Antonella Mulas, Ramaiah Nagaraja³, Francesco Cucca⁵, Inês Barroso¹³, Inês Barroso¹⁴, Panagiotis Deloukas¹³, Ruth J. F. Loos, Sekar Kathiresan, Patricia B. Munroe⁹, Christopher Newton-Cheh, Arne Pfeufer¹⁵, Nilesh J. Samani¹⁶, Nilesh J. Samani¹⁷, Heribert Schunkert⁶, Joel N. Hirschhorn¹, Joel N. Hirschhorn¹², Joel N. Hirschhorn⁴, David Altshuler, Mark I. McCarthy¹⁰, Mark I. McCarthy¹⁸, Mark I. McCarthy¹¹, Gonçalo R. Abecasis², Michael Boehnke²•Institutions (18)

Massachusetts Institute of Technology¹, University of Michigan², National Institutes of Health³, Boston Children's Hospital⁴, University of Sassari⁵, University of Lübeck⁶, Peninsula College of Medicine and Dentistry⁷, University Hospital Regensburg⁸, Queen Mary University of London⁹, Wellcome Trust Centre for Human Genetics¹⁰, University of Oxford¹¹, Harvard University¹², Wellcome Trust Sanger Institute¹³, University of Cambridge¹⁴, Technische Universität München¹⁵, University of Leicester¹⁶, Glenfield Hospital¹⁷, Churchill Hospital¹⁸

02 Aug 2012-PLOS Genetics

TL;DR: The Metabochip and its component SNP sets are described, its performance in capturing variation across the allele-frequency spectrum is evaluated, solutions to methodological challenges commonly encountered in its analysis are described, and its performance as a platform for genotype imputation is evaluated.

Abstract: Genome-wide association studies have identified hundreds of loci for type 2 diabetes, coronary artery disease and myocardial infarction, as well as for related traits such as body mass index, glucose and insulin levels, lipid levels, and blood pressure. These studies also have pointed to thousands of loci with promising but not yet compelling association evidence. To establish association at additional loci and to characterize the genome-wide significant loci by fine-mapping, we designed the “Metabochip,” a custom genotyping array that assays nearly 200,000 SNP markers. Here, we describe the Metabochip and its component SNP sets, evaluate its performance in capturing variation across the allele-frequency spectrum, describe solutions to methodological challenges commonly encountered in its analysis, and evaluate its performance as a platform for genotype imputation. The Metabochip achieves dramatic cost efficiencies compared to designing single-trait follow-up reagents, and provides the opportunity to compare results across a range of related traits. The Metabochip and similar custom genotyping arrays offer a powerful and cost-effective approach to follow up large-scale genotyping and sequencing studies and advance our understanding of the genetic basis of complex human diseases and traits.

516 citations

Journal Article•DOI•

Deletion of the late cornified envelope LCE3B and LCE3C genes as a susceptibility factor for psoriasis

Rafael de Cid, Eva Riveira-Muñoz, Patrick L.J.M. Zeeuwen¹, Jason Robarge², Wilson Liao³, Emma N. Dannhauser⁴, Emiliano Giardina⁵, Philip E. Stuart⁶, Rajan P. Nair⁶, Cynthia Helms², Geòrgia Escaramís, Ester Ballana, Gemma Martin-Ezquerra⁷, Martin den Heijer¹, Marijke Kamsteeg¹, Irma Joosten¹, Evan E. Eichler⁸, Conxi Lázaro, Ramon M. Pujol⁷, Lluís Armengol, Gonçalo R. Abecasis⁶, James T. Elder⁶, Giuseppe Novelli⁵, John A.L. Armour⁴, Pui-Yan Kwok³, Anne M. Bowcock², Joost Schalkwijk¹, Xavier Estivill⁷•Institutions (8)

Radboud University Nijmegen¹, Washington University in St. Louis², University of California, San Francisco³, University of Nottingham⁴, University of Rome Tor Vergata⁵, University of Michigan⁶, Pompeu Fabra University⁷, University of Washington⁸

25 Jan 2009-Nature Genetics

TL;DR: LCE expression can be induced in normal epidermis by skin barrier disruption and is strongly expressed in psoriatic lesions, suggesting that compromised skin barrier function has a role in psoriasis susceptibility.

Abstract: Psoriasis is a common inflammatory skin disease with a prevalence of 2-3% in individuals of European ancestry. In a genome-wide search for copy number variants (CNV) using a sample pooling approach, we have identified a deletion comprising LCE3B and LCE3C, members of the late cornified envelope (LCE) gene cluster. The absence of LCE3B and LCE3C (LCE3C_LCE3B-del) is significantly associated (P = 1.38E-08) with risk of psoriasis in 2,831 samples from Spain, The Netherlands, Italy and the United States, and in a family-based study (P = 5.4E-04). LCE3C_LCE3B-del is tagged by rs4112788 (r² = 0.93), which is also strongly associated with psoriasis (P < 6.6E-09). LCE3C_LCE3B-del shows epistatic effects with the HLA-Cw6 allele on the development of psoriasis in Dutch samples and multiplicative effects in the other samples. LCE expression can be induced in normal epidermis by skin barrier disruption and is strongly expressed in psoriatic lesions, suggesting that compromised skin barrier function has a role in psoriasis susceptibility.

514 citations

Journal Article•DOI•

Genome-wide association study in individuals of South Asian ancestry identifies six new type 2 diabetes susceptibility loci

Jaspal S. Kooner¹, Danish Saleheen², Xueling Sim³, Joban Sehmi¹, Joban Sehmi⁴, Weihua Zhang⁵, Philippe M. Frossard, Latonya F. Been⁶, Kee Seng Chia³, Antigone S. Dimas⁷, Antigone S. Dimas⁸, Neelam Hassanali⁸, Tazeen H. Jafar⁹, Jeremy B M Jowett¹⁰, Xinzhong Li⁵, Venkatesan Radha¹¹, Simon D. Rees¹², Simon D. Rees¹³, Fumihiko Takeuchi, Robin Young², Tin Aung¹⁴, Tin Aung³, Abdul Basit, Manickam Chidambaram¹¹, Debashish Das¹⁵, Elin Grundberg¹⁶, Åsa K. Hedman⁸, Zafar I. Hydrie, Muhammed Islam⁹, Chiea Chuen Khor¹⁷, Chiea Chuen Khor³, Sudhir Kowlessur, Malene M. Kristensen¹⁰, Samuel Liju¹¹, Wei-Yen Lim³, David R. Matthews⁸, Jianjun Liu¹⁷, Andrew P. Morris⁸, Alexandra C. Nica⁷, Janani Pinidiyapathirage¹⁸, Inga Prokopenko⁸, Asif Rasheed, Maria Samuel, Nabi Shah, A. Samad Shera, Kerrin S. Small¹⁶, Kerrin S. Small¹⁹, Chen Suo³, Ananda R. Wickremasinghe¹⁸, Tien Yin Wong²⁰, Tien Yin Wong¹⁴, Tien Yin Wong³, Mingyu Yang²¹, Fan Zhang²¹, MuTHER¹², MuTHER¹³, Gonçalo R. Abecasis²², Anthony H. Barnett¹², Anthony H. Barnett¹³, Mark J. Caulfield²³, Panos Deloukas¹⁹, Timothy M. Frayling²⁴, Philippe Froguel⁵, Norihiro Kato, Prasad Katulanda²⁵, Prasad Katulanda⁸, M. Ann Kelly¹², M. Ann Kelly¹³, Junbin Liang²¹, Viswanathan Mohan¹¹, Dharambir K. Sanghera²⁶, James Scott⁵, Mark Seielstad²⁷, Paul Zimmet²⁸, Paul Elliott⁵, Yik Ying Teo, Mark I. McCarthy⁸, Mark I. McCarthy²⁹, Mark I. McCarthy³⁰, John Danesh², E. Shyong Tai³, John C. Chambers⁴, John C. Chambers³¹, John C. Chambers⁵•Institutions (31)

National Institutes of Health¹, University of Cambridge², National University of Singapore³, Ealing Hospital⁴, Imperial College London⁵, University of Oklahoma⁶, University of Geneva⁷, University of Oxford⁸, Aga Khan University⁹, The Heart Research Institute¹⁰, Indian Council of Medical Research¹¹, Heart of England NHS Foundation Trust¹², University of Birmingham¹³, Singapore National Eye Center¹⁴, National Health Service¹⁵, King's College London¹⁶, Agency for Science, Technology and Research¹⁷, University of Kelaniya¹⁸, Wellcome Trust Sanger Institute¹⁹, University of Melbourne²⁰, Beijing Genomics Institute²¹, University of Michigan²², Queen Mary University of London²³, University of Exeter²⁴, University of Colombo²⁵, University of Oklahoma Health Sciences Center²⁶, University of California, San Francisco²⁷, Baker IDI Heart and Diabetes Institute²⁸, Wellcome Trust Centre for Human Genetics²⁹, National Institute for Health Research³⁰, Imperial College Healthcare³¹

01 Oct 2011-Nature Genetics

TL;DR: A genome-wide association study of type-2 diabetes in individuals of South Asian ancestry provides additional insight into mechanisms underlying T2D and shows the potential for new discovery from genetic association studies in South Asians.

Abstract: John Chambers and colleagues report a genome-wide association study for type 2 diabetes in individuals of south Asian ancestry. They identify six loci newly associated with type 2 diabetes.

513 citations

Cited by

Journal Article•DOI•

Fast and accurate short read alignment with Burrows–Wheeler transform

Heng Li¹, Richard Durbin¹•Institutions (1)

Wellcome Trust Sanger Institute¹

01 Jul 2009-Bioinformatics

TL;DR: The Burrows-Wheeler Alignment tool (BWA), a new read alignment package based on backward search with the Burrows–Wheeler Transform (BWT), is implemented to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.

Abstract: Motivation: The enormous number of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net
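
The backward search mentioned in the abstract above can be sketched on a toy scale as follows. This is a minimal exact-match illustration only, assuming a naive suffix-array construction and full occurrence tables; it does not handle mismatches or gaps and is not BWA's implementation. The function names `bwt_index` and `backward_search` are hypothetical.

```python
def bwt_index(text):
    """Build a toy BWT index for text (a '$' sentinel is appended)."""
    text += "$"
    # Naive suffix array: fine for a demo, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character = the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    # C[c]: number of characters in text that sort strictly before c.
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, C, occ


def backward_search(pattern, sa, C, occ):
    """Return sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open suffix-array interval [lo, hi)
    for ch in reversed(pattern):  # extend the match right to left
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    sa, C, occ = bwt_index("GATTACAGATTACA")
    print(backward_search("ATTA", sa, C, occ))
```

Each pattern character narrows the suffix-array interval in constant time, which is why the cost of an exact lookup depends on the read length and not on the reference length.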

43,862 citations

Journal Article•DOI•

Fast gapped-read alignment with Bowtie 2

Ben Langmead¹, Steven L. Salzberg², Steven L. Salzberg¹, Steven L. Salzberg³•Institutions (3)

University of Maryland, College Park¹, Johns Hopkins University², Johns Hopkins University School of Medicine³

01 Apr 2012-Nature Methods

TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article•DOI•

PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses

Shaun Purcell¹, Shaun Purcell², Benjamin M. Neale³, Benjamin M. Neale¹, Kathe Todd-Brown², Lori Thomas², Manuel A. R. Ferreira², David Bender¹, David Bender², Julian Maller¹, Julian Maller², Pamela Sklar¹, Pamela Sklar², Paul I.W. de Bakker², Paul I.W. de Bakker¹, Mark J. Daly¹, Mark J. Daly², Pak C. Sham⁴•Institutions (4)

Massachusetts Institute of Technology¹, Harvard University², University of London³, University of Hong Kong⁴

01 Sep 2007-American Journal of Human Genetics

TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. It focuses on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.

Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.
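
To make the identity-by-state idea in the abstract above concrete, here is a minimal sketch that computes the average proportion of alleles shared identical by state between two individuals. It assumes genotypes coded as minor-allele counts (0, 1 or 2) with `None` for missing calls; the function name and the coding are illustrative assumptions, and this is not PLINK's code or file format.

```python
def ibs_proportion(genotypes_a, genotypes_b):
    """Mean proportion of alleles shared identical by state.

    At each SNP two individuals share 2 - |a - b| alleles (0, 1 or 2),
    where a and b are allele counts. SNPs with a missing call in
    either individual are skipped.
    """
    shared = 0
    informative = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)
        informative += 1
    if informative == 0:
        raise ValueError("no SNPs with calls in both individuals")
    return shared / (2 * informative)


if __name__ == "__main__":
    person_1 = [0, 1, 2, 2, 1]
    person_2 = [0, 1, 0, None, 2]
    print(ibs_proportion(person_1, person_2))
```

Pairwise values like this one, computed over many markers, are the kind of input used to detect population stratification or unexpectedly close relatives.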

26,280 citations

Journal Article•DOI•

Initial sequencing and analysis of the human genome.

Eric S. Lander¹, Lauren Linton¹, Bruce W. Birren¹, Chad Nusbaum¹ +245 more•Institutions (29)

15 Feb 2001-Nature

TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.

Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article•DOI•

The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data

Aaron McKenna¹, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kristian Cibulskis, Andrew Kernytsky, Kiran V. Garimella, David Altshuler, Stacey Gabriel, Mark J. Daly, Mark A. DePristo•Institutions (1)

Broad Institute¹

01 Sep 2010-Genome Research

TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
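
The map/reduce separation described in the abstract above can be illustrated with a toy coverage calculator. This is only a sketch of the general pattern in Python, assuming reads are given as half-open `(start, end)` intervals; it does not use or reproduce the GATK API, and all names are hypothetical.

```python
from functools import reduce


def map_depth(locus, reads):
    """Map step: depth at one locus = number of reads covering it."""
    return sum(1 for start, end in reads if start <= locus < end)


def reduce_sum(accumulator, depth):
    """Reduce step: fold per-locus depths into a running total."""
    return accumulator + depth


def mean_coverage(region_start, region_end, reads):
    """Traverse a half-open region, map each locus, reduce to a mean."""
    if region_end <= region_start:
        raise ValueError("region must contain at least one locus")
    depths = (map_depth(pos, reads) for pos in range(region_start, region_end))
    total = reduce(reduce_sum, depths, 0)
    return total / (region_end - region_start)


if __name__ == "__main__":
    # Hypothetical aligned reads as (start, end) intervals.
    reads = [(0, 5), (3, 8), (4, 6)]
    print(mean_coverage(0, 10, reads))
```

Because the per-locus map calls are independent and the reduction is associative, a framework can split the region into shards and combine partial totals, which is the property that makes this style easy to parallelize.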

20,557 citations
