Author

Gonçalo R. Abecasis

Bio: Gonçalo R. Abecasis is an academic researcher at the University of Michigan. The author has contributed to research on topics including Genome-wide association study and Population. The author has an h-index of 179 and has co-authored 595 publications receiving 230,323 citations. Previous affiliations of Gonçalo R. Abecasis include Johns Hopkins University School of Medicine and the Wellcome Trust Centre for Human Genetics.


Papers
Journal Article
Xingyi Guo1, Weiqiang Lin2, Wanqing Wen1, Jeroen R. Huyghe3, Stephanie A. Bien3, Qiuyin Cai1, Tabitha A. Harrison3, Zhishan Chen1, Conghui Qu3, Jiandong Bao1, Jirong Long1, Yuan Yuan2, Fangqin Wang2, Mengqiu Bai2, Gonçalo R. Abecasis4, Demetrius Albanes5, Sonja I. Berndt5, Stéphane Bézieau, D. Timothy Bishop6, Hermann Brenner7, Stephan Buch8, Andrea N. Burnett-Hartman9, Peter T. Campbell10, Sergi Castellví-Bel11, Andrew T. Chan12,13, Jenny Chang-Claude7,14, Stephen J. Chanock5, Sang-Hee Cho15, David V. Conti16, Albert de la Chapelle17, Edith J. M. Feskens18, Steven Gallinger19, Graham G. Giles20,21, Phyllis J. Goodman3, Andrea Gsur, Mark A. Guinter10, Marc J. Gunter22, Jochen Hampe8, Heather Hampel17, Richard B. Hayes23, Michael Hoffmeister7, Ellen Kampman18, Hyun Min Kang4, Temitope O. Keku24, Hyeong Rok Kim15, Loic Le Marchand25, Soo-Chin Lee26, Christopher I. Li3, Li Li27, Annika Lindblom28,29, Noralane M. Lindor30, Roger L. Milne20,21, Victor Moreno, Neil Murphy10, Polly A. Newcomb3,31, Deborah A. Nickerson31, Kenneth Offit32,33, Rachel Pearlman17, Paul D.P. Pharoah34, Elizabeth A. Platz35, John D. Potter3, Gad Rennert36, Lori C. Sakoda3,9, Clemens Schafmayer, Stephanie L. Schmit, Robert E. Schoen37, Fredrick R. Schumacher38, Martha L. Slattery39, Yu Ru Su3, Catherine M. Tangen3, Cornelia M. Ulrich40, Fränzel J.B. Van Duijnhoven18, Bethany Van Guelpen41, Kala Visvanathan35, Pavel Vodicka42,43, Ludmila Vodickova42,43, Veronika Vymetalkova42,43, Xiaoliang Wang3, Emily White3,31, Alicja Wolk29, Michael O. Woods44, Graham Casey27, Li Hsu3, Mark A. Jenkins21, Stephen B. Gruber16, Ulrike Peters3,31, Wei Zheng1
TL;DR: A transcriptome-wide association study to identify putative susceptibility genes for colorectal cancer risk identified 25 genes and provides new insight into the biological mechanisms underlying CRC development.

28 citations

Journal Article
TL;DR: Rare germline mutations in CIDEB conferred substantial protection from liver disease, and small interfering RNA knockdown prevented the buildup of large lipid droplets in human hepatoma cell lines challenged with oleate.
Abstract: BACKGROUND Exome sequencing in hundreds of thousands of persons may enable the identification of rare protein-coding genetic variants associated with protection from human diseases like liver cirrhosis, providing a strategy for the discovery of new therapeutic targets. METHODS We performed a multistage exome sequencing and genetic association analysis to identify genes in which rare protein-coding variants were associated with liver phenotypes. We conducted in vitro experiments to further characterize associations. RESULTS The multistage analysis involved 542,904 persons with available data on liver aminotransferase levels, 24,944 patients with various types of liver disease, and 490,636 controls without liver disease. We found that rare coding variants in APOB, ABCB4, SLC30A10, and TM6SF2 were associated with increased aminotransferase levels and an increased risk of liver disease. We also found that variants in CIDEB, which encodes a structural protein found in hepatic lipid droplets, had a protective effect. The burden of rare predicted loss-of-function variants plus missense variants in CIDEB (combined carrier frequency, 0.7%) was associated with decreased alanine aminotransferase levels (beta per allele, -1.24 U per liter; 95% confidence interval [CI], -1.66 to -0.83; P = 4.8×10⁻⁹) and with 33% lower odds of liver disease of any cause (odds ratio per allele, 0.67; 95% CI, 0.57 to 0.79; P = 9.9×10⁻⁷). Rare coding variants in CIDEB were associated with a decreased risk of liver disease across different underlying causes and different degrees of severity, including cirrhosis of any cause (odds ratio per allele, 0.50; 95% CI, 0.36 to 0.70). Among 3599 patients who had undergone bariatric surgery, rare coding variants in CIDEB were associated with a decreased nonalcoholic fatty liver disease activity score (beta per allele in score units, -0.98; 95% CI, -1.54 to -0.41 [scores range from 0 to 8, with higher scores indicating more severe disease]). In human hepatoma cell lines challenged with oleate, CIDEB small interfering RNA knockdown prevented the buildup of large lipid droplets. CONCLUSIONS Rare germline mutations in CIDEB conferred substantial protection from liver disease. (Funded by Regeneron Pharmaceuticals.)

27 citations

Journal Article
25 Feb 2021-Gut
TL;DR: The authors identify new anatomical subsite-specific risk loci for colorectal cancer and characterise effect heterogeneity at CRC risk loci using multinomial modelling.
Abstract: OBJECTIVE: An understanding of the etiologic heterogeneity of colorectal cancer (CRC) is critical for improving precision prevention, including individualized screening recommendations and the discovery of novel drug targets and repurposable drug candidates for chemoprevention. Known differences in molecular characteristics and environmental risk factors among tumors arising in different locations of the colorectum suggest partly distinct mechanisms of carcinogenesis. The extent to which the contribution of inherited genetic risk factors for CRC differs by anatomical subsite of the primary tumor has not been examined. DESIGN: To identify new anatomical subsite-specific risk loci, we performed genome-wide association study (GWAS) meta-analyses including data of 48 214 CRC cases and 64 159 controls of European ancestry. We characterised effect heterogeneity at CRC risk loci using multinomial modelling. RESULTS: We identified 13 loci that reached genome-wide significance (p<5×10⁻⁸) and that were not reported by previous GWASs for overall CRC risk. Multiple lines of evidence support candidate genes at several of these loci. We detected substantial heterogeneity between anatomical subsites. Just over half (61) of 109 known and new risk variants showed no evidence for heterogeneity. In contrast, 22 variants showed association with distal CRC (including rectal cancer), but no evidence for association or an attenuated association with proximal CRC. For two loci, there was strong evidence for effects confined to proximal colon cancer. CONCLUSION: Genetic architectures of proximal and distal CRC are partly distinct. Studies of risk factors and mechanisms of carcinogenesis, and precision prevention strategies should take into consideration the anatomical subsite of the tumour.

27 citations

Journal Article
25 May 2020-Genes
TL;DR: This work showed that the joint modeling approach provided an unbiased estimate of genetic effects, greatly improved the power of single-variant association tests among methods that can properly estimate allele effects, and enhanced gene-level tests over existing approaches.
Abstract: There is great interest in understanding the impact of rare variants in human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ~10% of the variant sites are observed to be multi-allelic. Many of the multi-allelic variants have been shown to be functional and disease-relevant. Proper analysis of multi-allelic variants is critical to the success of a sequencing study, but existing methods do not properly handle multi-allelic variants and can produce highly misleading association results. We discuss practical issues and methods to encode multi-allelic sites, conduct single-variant and gene-level association analyses, and perform meta-analysis for multi-allelic variants. We evaluated these methods through extensive simulations and the study of a large meta-analysis of ~18,000 samples on the cigarettes-per-day phenotype. We showed that our joint modeling approach provided an unbiased estimate of genetic effects, greatly improved the power of single-variant association tests among methods that can properly estimate allele effects, and enhanced gene-level tests over existing approaches. Software packages implementing these methods are available online.

27 citations

Journal Article
TL;DR: The results provide empirical evidence for the predicted effects of genotype and relationship error and highlight the need for rigorous detection and elimination of data error in complex trait studies.
Abstract: The effects of genotype and relationship errors on linkage results are evaluated in three of the Genetic Analysis Workshop 12 asthma genome scans. A number of errors are detected in the samples. While the evidence for linkage is not striking in any data set with or without error, in some cases the difference in test statistic could support different conclusions. The results provide empirical evidence for the predicted effects of genotype and relationship error and highlight the need for rigorous detection and elimination of data error in complex trait studies.

26 citations


Cited by
Journal Article
TL;DR: Burrows-Wheeler Alignment tool (BWA) is a new read alignment package based on backward search with the Burrows–Wheeler Transform (BWT) that efficiently aligns short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net
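As a rough illustration of the "backward search with Burrows–Wheeler Transform" idea named in this abstract, the sketch below builds a toy index over a short reference and finds exact occurrences of a read. It is not BWA's implementation: it handles exact matches only (no mismatches or gaps), uses a naive suffix array construction that would not scale to a genome, and all function names (`build_index`, `backward_search`) are hypothetical.

```python
def build_index(reference):
    """Build a toy BWT index: suffix array, first-column offsets, occurrence table."""
    text = reference + "$"  # '$' sorts before A/C/G/T and marks the end of the text
    # Naive suffix array (fine for a toy example, far too slow for a real genome).
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (text[-1] is '$' when i == 0).
    bwt = "".join(text[i - 1] for i in sa)

    # first[c]: number of characters in the text that sort strictly before c.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in counts:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return sa, first, occ


def backward_search(pattern, index):
    """Return sorted start positions of exact matches of pattern in the reference."""
    sa, first, occ = index
    lo, hi = 0, len(sa)  # half-open interval of suffix array rows
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_index("GATTACATTAG")
print(backward_search("TTA", index))
```

Each loop iteration narrows an interval of suffix array rows using only the occurrence table, so the search cost depends on the read length rather than the reference length; that property is what makes this family of indexes attractive for large references.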

43,862 citations

Journal Article
TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.
Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article
TL;DR: This work introduces PLINK, an open-source C/C++ WGAS tool set, and describes its five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. It focuses on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies.
Abstract: Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis.
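To make the identity-by-state notion in this abstract concrete, here is a minimal sketch of pairwise allele sharing between two individuals. It assumes biallelic markers coded as 0, 1, or 2 copies of a reference allele, with `None` for a missing genotype; the function name `ibs_proportion` and this encoding are illustrative assumptions, not PLINK's code or file format, and the sketch does not attempt identity-by-descent estimation.

```python
def ibs_proportion(genotypes_a, genotypes_b):
    """Average proportion of alleles shared identical by state across markers.

    Genotypes are allele counts (0, 1, 2) at biallelic markers; None marks
    a missing call. Markers missing in either individual are skipped.
    """
    shared_alleles = 0
    markers_used = 0
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        # Two individuals share 2, 1, or 0 alleles IBS at a biallelic marker.
        shared_alleles += 2 - abs(a - b)
        markers_used += 1
    if markers_used == 0:
        raise ValueError("no markers with genotypes in both individuals")
    return shared_alleles / (2 * markers_used)


person_1 = [0, 1, 2, 2, None]
person_2 = [0, 1, 0, 1, 2]
print(ibs_proportion(person_1, person_2))
```

A value near 1 means the pair carries nearly identical genotypes at the markers compared. Identity by state alone does not establish shared ancestry, which is why the abstract treats identity-by-descent estimation as a separate step.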

26,280 citations

Journal Article
Eric S. Lander1, Lauren Linton1, Bruce W. Birren1, Chad Nusbaum1, +245 more; Institutions (29)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations