scispace - formally typeset
Search or ask a question
Author

Alexander Gusev

Bio: Alexander Gusev is an academic researcher from Harvard University. The author has contributed to research in topics: Genome-wide association study & Medicine. The author has an hindex of 40, co-authored 185 publications receiving 11407 citations. Previous affiliations of Alexander Gusev include Broad Institute & Brigham and Women's Hospital.


Papers
More filters
Journal ArticleDOI
TL;DR: This work introduces a technique—cross-trait LD Score regression—for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap, and uses this method to estimate 276 genetic correlations among 24 traits.
Abstract: Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual-level genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique-cross-trait LD Score regression-for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use this method to estimate 276 genetic correlations among 24 traits. The results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity, and educational attainment and several diseases. These results highlight the power of genome-wide analyses, as there currently are no significantly associated SNPs for anorexia nervosa and only three for educational attainment.

2,993 citations

Journal ArticleDOI
TL;DR: A new method is introduced, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers, which is computationally tractable at very large sample sizes and leverages genome-wide information.
Abstract: Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.

1,939 citations

Journal ArticleDOI
TL;DR: A powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated with complex traits is introduced.
Abstract: Many genetic variants influence complex traits by modulating gene expression, thus altering the abundance of one or multiple proteins. Here we introduce a powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated with complex traits. We leverage expression imputation from genetic data to perform a transcriptome-wide association study (TWAS) to identify significant expression-trait associations. We applied our approaches to expression data from blood and adipose tissue measured in ∼ 3,000 individuals overall. We imputed gene expression into GWAS data from over 900,000 phenotype measurements to identify 69 new genes significantly associated with obesity-related traits (BMI, lipids and height). Many of these genes are associated with relevant phenotypes in the Hybrid Mouse Diversity Panel. Our results showcase the power of integrating genotype, gene expression and phenotype to gain insights into the genetic basis of complex traits.

1,473 citations

Journal ArticleDOI
TL;DR: LDpred is introduced, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel, and outperforms the approach of pruning followed by thresholding, particularly at large sample sizes.
Abstract: Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R(2) increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.

1,088 citations

Journal ArticleDOI
TL;DR: An approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics and found significant tissue-specific enrichments for 34 traits.
Abstract: We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.

707 citations


Cited by
More filters
Journal Article
Fumio Tajima1
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal ArticleDOI
27 May 2020-Nature
TL;DR: A catalogue of predicted loss-of-function variants in 125,748 whole-exome and 15,708 whole-genome sequencing datasets from the Genome Aggregation Database (gnomAD) reveals the spectrum of mutational constraints that affect these human protein-coding genes.
Abstract: Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes1. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases. A catalogue of predicted loss-of-function variants in 125,748 whole-exome and 15,708 whole-genome sequencing datasets from the Genome Aggregation Database (gnomAD) reveals the spectrum of mutational constraints that affect these human protein-coding genes.

4,913 citations

01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal ArticleDOI
David E. Gordon, Gwendolyn M. Jang, Mehdi Bouhaddou, Jiewei Xu, Kirsten Obernier, Kris M. White1, Matthew J. O’Meara2, Veronica V. Rezelj3, Jeffrey Z. Guo, Danielle L. Swaney, Tia A. Tummino4, Ruth Hüttenhain, Robyn M. Kaake, Alicia L. Richards, Beril Tutuncuoglu, Helene Foussard, Jyoti Batra, Kelsey M. Haas, Maya Modak, Minkyu Kim, Paige Haas, Benjamin J. Polacco, Hannes Braberg, Jacqueline M. Fabius, Manon Eckhardt, Margaret Soucheray, Melanie J. Bennett, Merve Cakir, Michael McGregor, Qiongyu Li, Bjoern Meyer3, Ferdinand Roesch3, Thomas Vallet3, Alice Mac Kain3, Lisa Miorin1, Elena Moreno1, Zun Zar Chi Naing, Yuan Zhou, Shiming Peng4, Ying Shi, Ziyang Zhang, Wenqi Shen, Ilsa T Kirby, James E. Melnyk, John S. Chorba, Kevin Lou, Shizhong Dai, Inigo Barrio-Hernandez5, Danish Memon5, Claudia Hernandez-Armenta5, Jiankun Lyu4, Christopher J.P. Mathy, Tina Perica4, Kala Bharath Pilla4, Sai J. Ganesan4, Daniel J. Saltzberg4, Rakesh Ramachandran4, Xi Liu4, Sara Brin Rosenthal6, Lorenzo Calviello4, Srivats Venkataramanan4, Jose Liboy-Lugo4, Yizhu Lin4, Xi Ping Huang7, Yongfeng Liu7, Stephanie A. Wankowicz, Markus Bohn4, Maliheh Safari4, Fatima S. Ugur, Cassandra Koh3, Nastaran Sadat Savar3, Quang Dinh Tran3, Djoshkun Shengjuler3, Sabrina J. Fletcher3, Michael C. O’Neal, Yiming Cai, Jason C.J. Chang, David J. Broadhurst, Saker Klippsten, Phillip P. Sharp4, Nicole A. Wenzell4, Duygu Kuzuoğlu-Öztürk4, Hao-Yuan Wang4, Raphael Trenker4, Janet M. Young8, Devin A. Cavero9, Devin A. Cavero4, Joseph Hiatt4, Joseph Hiatt9, Theodore L. Roth, Ujjwal Rathore9, Ujjwal Rathore4, Advait Subramanian4, Julia Noack4, Mathieu Hubert3, Robert M. Stroud4, Alan D. Frankel4, Oren S. Rosenberg, Kliment A. Verba4, David A. Agard4, Melanie Ott, Michael Emerman8, Natalia Jura, Mark von Zastrow, Eric Verdin4, Eric Verdin10, Alan Ashworth4, Olivier Schwartz3, Christophe d'Enfert3, Shaeri Mukherjee4, Matthew P. Jacobson4, Harmit S. Malik8, Danica Galonić Fujimori, Trey Ideker6, Charles S. Craik, Stephen N. Floor4, James S. Fraser4, John D. Gross4, Andrej Sali, Bryan L. Roth7, Davide Ruggero, Jack Taunton4, Tanja Kortemme, Pedro Beltrao5, Marco Vignuzzi3, Adolfo García-Sastre, Kevan M. Shokat, Brian K. Shoichet4, Nevan J. Krogan 
30 Apr 2020-Nature
TL;DR: A human–SARS-CoV-2 protein interaction map highlights cellular processes that are hijacked by the virus and that can be targeted by existing drugs, including inhibitors of mRNA translation and predicted regulators of the sigma receptors.
Abstract: A newly described coronavirus named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is the causative agent of coronavirus disease 2019 (COVID-19), has infected over 2.3 million people, led to the death of more than 160,000 individuals and caused worldwide social and economic disruption1,2. There are no antiviral drugs with proven clinical efficacy for the treatment of COVID-19, nor are there any vaccines that prevent infection with SARS-CoV-2, and efforts to develop drugs and vaccines are hampered by the limited knowledge of the molecular details of how SARS-CoV-2 infects cells. Here we cloned, tagged and expressed 26 of the 29 SARS-CoV-2 proteins in human cells and identified the human proteins that physically associated with each of the SARS-CoV-2 proteins using affinity-purification mass spectrometry, identifying 332 high-confidence protein–protein interactions between SARS-CoV-2 and human proteins. Among these, we identify 66 druggable human proteins or host factors targeted by 69 compounds (of which, 29 drugs are approved by the US Food and Drug Administration, 12 are in clinical trials and 28 are preclinical compounds). We screened a subset of these in multiple viral assays and found two sets of pharmacological agents that displayed antiviral activity: inhibitors of mRNA translation and predicted regulators of the sigma-1 and sigma-2 receptors. Further studies of these host-factor-targeting agents, including their combination with drugs that directly target viral enzymes, could lead to a therapeutic regimen to treat COVID-19. A human–SARS-CoV-2 protein interaction map highlights cellular processes that are hijacked by the virus and that can be targeted by existing drugs, including inhibitors of mRNA translation and predicted regulators of the sigma receptors.

3,319 citations