Author

Xiang Zhou

Bio: Xiang Zhou is an academic researcher at the University of Michigan. The author has an h-index of 1 and has co-authored 1 publication receiving 9 citations.

Papers
Journal Article
17 Jun 2020
TL;DR: A technical review of thirteen TWAS methods, showing that they can all be viewed as two-sample Mendelian randomization (MR) analyses, an approach widely applied in GWASs to examine the causal effect of an exposure on an outcome.
Abstract: Genome-wide association studies (GWASs) have identified thousands of genetic variants that are associated with many complex traits. However, their biological mechanisms remain largely unknown. Transcriptome-wide association studies (TWAS) have been recently proposed as an invaluable tool for investigating the potential gene regulatory mechanisms underlying variant-trait associations. Specifically, TWAS integrate GWAS with expression mapping studies based on a common set of variants and aim to identify genes whose GReX is associated with the phenotype. Various methods have been developed for performing TWAS and/or similar integrative analysis. Each such method has a different modeling assumption and many were initially developed to answer different biological questions. Consequently, it is not straightforward to understand their modeling property from a theoretical perspective. We present a technical review on thirteen TWAS methods. Importantly, we show that these methods can all be viewed as two-sample Mendelian randomization (MR) analysis, which has been widely applied in GWASs for examining the causal effects of exposure on outcome. Viewing different TWAS methods from an MR perspective provides us a unique angle for understanding their benefits and pitfalls. We systematically introduce the MR analysis framework, explain how features of the GWAS and expression data influence the adaptation of MR for TWAS, and re-interpret the modeling assumptions made in different TWAS methods from an MR angle. We finally describe future directions for TWAS methodology development. We hope that this review would serve as a useful reference for both methodologists who develop TWAS methods and practitioners who perform TWAS analysis.
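
As a rough illustration of the two-sample MR idea mentioned in the abstract, the sketch below computes the standard inverse-variance weighted (IVW) estimate from summary statistics. It is not taken from the review and does not correspond to any specific one of the thirteen TWAS methods. The function name `ivw_estimate` and its inputs are hypothetical, and the sketch assumes independent instruments with no horizontal pleiotropy, which typically does not hold for cis-SNPs in linkage disequilibrium.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted two-sample MR estimate (illustrative only).

    beta_exposure: SNP effects on the exposure (e.g. gene expression),
                   estimated in one sample.
    beta_outcome:  SNP effects on the outcome trait, estimated in a
                   second sample.
    se_outcome:    standard errors of the outcome effects.
    Assumes independent instruments and no horizontal pleiotropy.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one instrument is required")

    # Weighted regression of outcome effects on exposure effects through
    # the origin, with weights 1 / se_outcome^2.
    numerator = sum(
        bx * by / (se * se)
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)
    )
    if denominator == 0:
        raise ValueError("exposure effects are all zero")

    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


if __name__ == "__main__":
    # Made-up summary statistics, for demonstration only.
    est, se = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.01])
    print(f"causal effect estimate = {est:.3f}, SE = {se:.4f}")
```

The numbers in the demonstration are invented; with real data the inputs would come from an expression mapping study and a GWAS that share a common set of variants.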

19 citations


Cited by
Journal Article
TL;DR: In this paper, the authors developed a method, moPMR-Egger (multiple outcome probabilistic Mendelian randomization with Egger assumption), for analyzing multiple outcome traits in TWAS applications.
Abstract: A transcriptome-wide association study (TWAS) integrates data from genome-wide association studies and gene expression mapping studies for investigating the gene regulatory mechanisms underlying diseases. Existing TWAS methods are primarily univariate in nature, focusing on analyzing one outcome trait at a time. However, many complex traits are correlated with each other and share a common genetic basis. Consequently, analyzing multiple traits jointly through multivariate analysis can potentially improve the power of TWASs. Here, we develop a method, moPMR-Egger (multiple outcome probabilistic Mendelian randomization with Egger assumption), for analyzing multiple outcome traits in TWAS applications. moPMR-Egger examines one gene at a time, relies on its cis-SNPs that are in potential linkage disequilibrium with each other to serve as instrumental variables, and tests its causal effects on multiple traits jointly. A key feature of moPMR-Egger is its ability to test and control for potential horizontal pleiotropic effects from instruments, thus maximizing power while minimizing false associations for TWASs. In simulations, moPMR-Egger provides calibrated type I error control for both causal effects testing and horizontal pleiotropic effects testing and is more powerful than existing univariate TWAS approaches in detecting causal associations. We apply moPMR-Egger to analyze 11 traits from 5 trait categories in the UK Biobank. In the analysis, moPMR-Egger identified 13.15% more gene associations than univariate approaches across trait categories and revealed distinct regulatory mechanisms underlying systolic and diastolic blood pressures.

30 citations

Journal Article
TL;DR: Huang et al. developed a new method, HMAT, which aggregates TWAS association evidence obtained across multiple gene expression prediction models by leveraging the harmonic mean P-value combination strategy.
Abstract: Transcriptome-wide association study (TWAS) is an important integrative method for identifying genes that are causally associated with phenotypes. A key step of TWAS involves the construction of expression prediction models for every gene in turn using its cis-SNPs as predictors. Different TWAS methods rely on different models for gene expression prediction, and each such model makes a distinct modeling assumption that is often suitable for a particular genetic architecture underlying expression. However, the genetic architectures underlying gene expression vary across genes throughout the transcriptome. Consequently, different TWAS methods may be beneficial in detecting genes with distinct genetic architectures. Here, we develop a new method, HMAT, which aggregates TWAS association evidence obtained across multiple gene expression prediction models by leveraging the harmonic mean P-value combination strategy. Because each expression prediction model is suited to capture a particular genetic architecture, aggregating TWAS associations across prediction models as in HMAT improves accurate expression prediction and enables subsequent powerful TWAS analysis across the transcriptome. A key feature of HMAT is its ability to accommodate the correlations among different TWAS test statistics and produce calibrated P-values after aggregation. Through numerical simulations, we illustrated the advantage of HMAT over commonly used TWAS methods as well as ad hoc P-value combination rules such as Fisher's method. We also applied HMAT to analyze summary statistics of nine common diseases. In the real data applications, HMAT was on average 30.6% more powerful compared to the next best method, detecting many new disease-associated genes that were otherwise not identified by existing TWAS approaches. In conclusion, HMAT represents a flexible and powerful TWAS method that enjoys robust performance across a range of genetic architectures underlying gene expression.
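
For readers unfamiliar with the harmonic mean P-value combination mentioned above, here is a minimal sketch of the basic weighted harmonic mean of a set of P-values. It is not the HMAT implementation: the function name `harmonic_mean_p` is hypothetical, and the step that accounts for correlated test statistics and calibrates the combined value is not reproduced here.

```python
def harmonic_mean_p(p_values, weights=None):
    """Weighted harmonic mean of P-values (illustrative only).

    p_values: P-values in (0, 1], e.g. one per expression prediction model.
    weights:  optional non-negative weights; equal weights by default.
    Returns the raw combination statistic; turning it into a calibrated
    P-value is a separate step that is not shown.
    """
    if not p_values:
        raise ValueError("at least one P-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")

    # sum(w) / sum(w / p): small P-values dominate the combination.
    return sum(weights) / sum(w / p for w, p in zip(weights, p_values))


if __name__ == "__main__":
    # Made-up P-values from three hypothetical prediction models.
    print(harmonic_mean_p([0.001, 0.2, 0.6]))
```

Because the statistic is dominated by the smallest P-values, one strongly supporting prediction model can carry the combined evidence even when the others are uninformative.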

17 citations

Journal Article
TL;DR: A review of emerging bulk RNA-Seq-based analyses that emphasizes less familiar and underused applications and highlights the power of bulk RNA-Seq in providing biological insights.
Abstract: Significant innovations in next-generation sequencing techniques and bioinformatics tools have impacted our appreciation and understanding of RNA. Practical RNA sequencing (RNA-Seq) applications have evolved in conjunction with sequence technology and bioinformatic tools advances. In most projects, bulk RNA-Seq data is used to measure gene expression patterns, isoform expression, alternative splicing and single-nucleotide polymorphisms. However, RNA-Seq holds far more hidden biological information including details of copy number alteration, microbial contamination, transposable elements, cell type (deconvolution) and the presence of neoantigens. Recent novel and advanced bioinformatic algorithms developed the capacity to retrieve this information from bulk RNA-Seq data, thus broadening its scope. The focus of this review is to comprehend the emerging bulk RNA-Seq-based analyses, emphasizing less familiar and underused applications. In doing so, we highlight the power of bulk RNA-Seq in providing biological insights.

15 citations

Journal Article
TL;DR: The METRO method incorporates expression prediction models constructed in different genetic ancestries through a likelihood-based inference framework, producing calibrated p values with substantially improved transcriptome-wide association study power.
Abstract: Integrative analysis of genome-wide association studies (GWASs) and gene expression studies in the form of a transcriptome-wide association study (TWAS) has the potential to better elucidate the molecular mechanisms underlying disease etiology. Here we present a method, METRO, that can leverage gene expression data collected from multiple genetic ancestries to enhance TWASs. METRO incorporates expression prediction models constructed in different genetic ancestries through a likelihood-based inference framework, producing calibrated p values with substantially improved TWAS power. We illustrate the benefits of METRO in both simulations and applications to seven complex traits and diseases obtained from four GWASs. These GWASs include two of primarily European ancestry (n = 188,577 and 339,226) and two of primarily African ancestry (n = 42,752 and 23,827). In the real data applications, we leverage gene expression data measured on 1,032 African Americans and 801 European Americans from the Genetic Epidemiology Network of Arteriopathy (GENOA) study to identify a substantially larger number of gene-trait associations as compared to existing TWAS approaches. The benefits of METRO are most prominent in applications to GWASs of African ancestry where the sample size is much smaller than GWASs of European ancestry and where a more powerful TWAS method is crucial. Among the identified associations are high-density lipoprotein-associated genes including PLTP and PPARG that are critical for maintaining lipid homeostasis and the type II diabetes-associated gene MAPT that supports microtubule-associated protein tau as a key component underlying impaired insulin secretion.

7 citations

Journal Article
TL;DR: In this article, the authors present a comprehensive technical review of ten existing methods for trait-tissue relevance inference, which draw on functional annotation information, expression quantitative trait loci information, genetically regulated gene expression information, and gene co-expression network information.
Abstract: Genome-wide association studies (GWASs) have identified and replicated many genetic variants that are associated with diseases and disease-related complex traits. However, the biological mechanisms underlying these identified associations remain largely elusive. Exploring the biological mechanisms underlying these associations requires identifying trait-relevant tissues and cell types, as genetic variants likely influence complex traits in a tissue- and cell type-specific manner. Recently, several statistical methods have been developed to integrate genomic data with GWASs for identifying trait-relevant tissues and cell types. These methods often rely on different genomic information and use different statistical models for trait-tissue relevance inference. Here, we present a comprehensive technical review to summarize ten existing methods for trait-tissue relevance inference. These methods make use of different genomic information that include functional annotation information, expression quantitative trait loci information, genetically regulated gene expression information, as well as gene co-expression network information. These methods also use different statistical models that range from linear mixed models to covariance network models. We hope that this review can serve as a useful reference both for methodologists who develop methods and for applied analysts who apply these methods for identifying trait relevant tissues and cell types.

7 citations