Author

Zhaoxia Yu

Other affiliations: Mayo Clinic, University of California, Berkeley, Rice University
Bio: Zhaoxia Yu is an academic researcher from University of California, Irvine. The author has contributed to research in topics: Population & Type I and type II errors. The author has an h-index of 19 and has co-authored 55 publications receiving 1,592 citations. Previous affiliations of Zhaoxia Yu include Mayo Clinic & University of California, Berkeley.


Papers
Journal Article
TL;DR: Results from single-marker and haplotypic analysis of the BEAGLE method's genotype calls for the bipolar disorder study indicate that the method is highly effective at eliminating genotyping artifacts that cause false-positive associations in genome-wide association studies.
Abstract: We present a novel method for simultaneous genotype calling and haplotype-phase inference. Our method employs the computationally efficient BEAGLE haplotype-frequency model, which can be applied to large-scale studies with millions of markers and thousands of samples. We compare genotype calls made with our method to genotype calls made with the BIRDSEED, CHIAMO, GenCall, and ILLUMINUS genotype-calling methods, using genotype data from the Illumina 550K and Affymetrix 500K arrays. We show that our method has higher genotype-call accuracy and yields fewer uncalled genotypes than competing methods. We perform single-marker analysis of data from the Wellcome Trust Case Control Consortium bipolar disorder and type 2 diabetes studies. For bipolar disorder, the genotype calls in the original study yield 25 markers with apparent false-positive association with bipolar disorder at a p < 10⁻⁷ significance level, whereas genotype calls made with our method yield no associated markers at this significance threshold. Conversely, for markers with replicated association with type 2 diabetes, there is good concordance between genotype calls used in the original study and calls made by our method. Results from single-marker and haplotypic analysis of our method's genotype calls for the bipolar disorder study indicate that our method is highly effective at eliminating genotyping artifacts that cause false-positive associations in genome-wide association studies. Our new genotype-calling methods are implemented in the BEAGLE and BEAGLECALL software packages.

221 citations

Journal Article
TL;DR: Associated with changes in CO, SctO2 decreased after phenylephrine treatment but remained unchanged after ephedrine treatment; the significant correlation between CO and SctO2 implies a cause-effect relationship between global and regional haemodynamics.
Abstract: Phenylephrine, but not ephedrine, decreased cardiac output (CO) and brain oxygenation. This study highlights the importance of CO in preserving brain oxygenation during management of intraoperative hypotension. Background. How phenylephrine and ephedrine treatments affect global and regional haemodynamics is of major clinical relevance. Cerebral tissue oxygen saturation (SctO2)-guided management may improve postoperative outcome. The physiological variables responsible for SctO2 changes induced by phenylephrine and ephedrine bolus treatment in anaesthetized patients need to be defined. Methods. A randomized two-treatment cross-over trial was conducted: one bolus dose of phenylephrine (100-200 µg) and one bolus dose of ephedrine (5-20 mg) were given to 29 ASA I-III patients anaesthetized with propofol and remifentanil. SctO2, mean arterial pressure (MAP), cardiac output (CO), and other physiological variables were recorded before and after treatments. The associations of changes were analysed using linear-mixed models. Results. The CO decreased significantly after phenylephrine treatment [ΔCO = −2.1 (1.4) litre min⁻¹, P<0.001], but was preserved after ephedrine treatment [ΔCO = 0.5 (1.4) litre min⁻¹, P>0.05]. The SctO2 was significantly decreased after phenylephrine treatment [ΔSctO2 = −3.2 (3.0)%, P<0.01] but preserved after ephedrine treatment [ΔSctO2 = 0.04 (1.9)%, P>0.05]. CO was identified to have the most significant association with SctO2 (P<0.001). After taking CO into consideration, the other physiological variables, including MAP, were not significantly associated with SctO2 (P>0.05). Conclusions. Associated with changes in CO, SctO2 decreased after phenylephrine treatment, but remained unchanged after ephedrine treatment. The significant correlation between CO and SctO2 implies a cause-effect relationship between global and regional haemodynamics.

171 citations

Journal Article
TL;DR: The results from this meta-analysis found that inaccuracy and imprecision of continuous noninvasive arterial pressure monitoring devices are larger than what was defined as acceptable, which may have implications for clinical situations where continuous noninvasive arterial pressure is being used for patient care decisions.
Abstract: Background: Continuous noninvasive arterial pressure monitoring devices are available for bedside use, but the accuracy and precision of these devices have not been evaluated in a systematic review and meta-analysis. Methods: The authors performed a systematic review and meta-analysis of studies compar

158 citations

Journal Article
TL;DR: It is shown that MS risk modulators converge to alter N-glycosylation and/or CTLA-4 surface retention conditional on metabolism and vitamin D3, including genetic variants in interleukin-7 receptor-α (IL7RA*C), interleukin-2 receptor-α (IL2RA*T), MGAT1 (IV(A)V(T-T)) and CTLA
Abstract: How environmental factors combine with genetic risk at the molecular level to promote complex trait diseases such as multiple sclerosis (MS) is largely unknown. In mice, N-glycan branching by the Golgi enzymes Mgat1 and/or Mgat5 prevents T cell hyperactivity, cytotoxic T-lymphocyte antigen 4 (CTLA-4) endocytosis, spontaneous inflammatory demyelination and neurodegeneration, the latter pathologies characteristic of MS. Here we show that MS risk modulators converge to alter N-glycosylation and/or CTLA-4 surface retention conditional on metabolism and vitamin D(3), including genetic variants in interleukin-7 receptor-α (IL7RA*C), interleukin-2 receptor-α (IL2RA*T), MGAT1 (IV(A)V(T-T)) and CTLA-4 (Thr17Ala). Downregulation of Mgat1 by IL7RA*C and IL2RA*T is opposed by MGAT1 (IV(A)V(T-T)) and vitamin D(3), optimizing branching and mitigating MS risk when combined with enhanced CTLA-4 N-glycosylation by CTLA-4 Thr17. Our data suggest a molecular mechanism in MS whereby multiple environmental and genetic inputs lead to dysregulation of a final common pathway, namely N-glycosylation.

154 citations

Journal Article
TL;DR: The SNP-based pathway enrichment method described here offers a new alternative approach for analysing GWAS data, and is able to identify statistically significant pathways, and importantly, pathways that can be replicated in large genetically distinct samples.
Abstract: Background: Recently we have witnessed a surge of interest in using genome-wide association studies (GWAS) to discover the genetic basis of complex diseases. Many genetic variations, mostly in the form of single nucleotide polymorphisms (SNPs), have been identified in a wide spectrum of diseases, including diabetes, cancer, and psychiatric diseases. A common theme arising from these studies is that the genetic variations discovered by GWAS can only explain a small fraction of the genetic risks associated with the complex diseases. New strategies and statistical approaches are needed to address this lack of explanation. One such approach is pathway analysis, which considers the genetic variations underlying a biological pathway jointly, rather than separately as in traditional GWAS. A critical challenge in pathway analysis is how to combine evidence of association over multiple SNPs within a gene and multiple genes within a pathway. Most current methods choose the most significant SNP from each gene as a representative, ignoring the joint action of multiple SNPs within a gene. This approach leads to preferential identification of genes with a greater number of SNPs. Results: We describe a SNP-based pathway enrichment method for GWAS studies. The method consists of the following two main steps: 1) for a given pathway, using an adaptive truncated product statistic to identify all representative (potentially more than one) SNPs of each gene, calculating the average number of representative SNPs for the genes, then re-selecting the representative SNPs of genes in the pathway based on this number; and 2) ranking all selected SNPs by the significance of their statistical association with a trait of interest, and testing if the set of SNPs from a particular pathway is significantly enriched with high ranks using a weighted Kolmogorov-Smirnov test.
We applied our method to two large genetically distinct GWAS data sets of schizophrenia, one from European-American (EA) and the other from African-American (AA). In the EA data set, we found 22 pathways with nominal P-value less than or equal to 0.001 and corresponding false discovery rate (FDR) less than 5%. In the AA data set, we found 11 pathways by controlling the same nominal P-value and FDR threshold. Interestingly, 8 of these pathways overlap with those found in the EA sample. We have implemented our method in a JAVA software package, called SNP Set Enrichment Analysis (SSEA), which contains a user-friendly interface and is freely available at http://cbcl.ics.uci.edu/SSEA. Conclusions: The SNP-based pathway enrichment method described here offers a new alternative approach for analysing GWAS data. By applying it to schizophrenia GWAS studies, we show that our method is able to identify statistically significant pathways, and importantly, pathways that can be replicated in large genetically distinct samples.
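The ranking-and-enrichment step in the abstract above can be illustrated with a running-sum statistic of the kind used in gene set enrichment analysis. The following is a minimal sketch, not the SSEA implementation: the function name, the use of -log10 P-values as scores, and the weighting exponent are all assumptions made for illustration, and a significance level would still have to come from permutations, which are omitted here.

```python
def weighted_ks_enrichment(scores, in_set, weight=1.0):
    """Return a weighted Kolmogorov-Smirnov style enrichment score.

    scores: one association score per SNP (assumed here: -log10 p-value).
    in_set: one boolean per SNP, True if the SNP belongs to the pathway.
    weight: exponent applied to scores of pathway SNPs (hypothetical default).
    """
    # Rank SNPs from strongest to weakest association.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hit_total = sum(abs(scores[i]) ** weight for i in order if in_set[i])
    n_miss = sum(1 for flag in in_set if not flag)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running = 0.0
    best = 0.0
    for i in order:
        if in_set[i]:
            # Step up by the SNP's share of the total pathway weight.
            running += abs(scores[i]) ** weight / hit_total
        else:
            # Step down by an equal amount for every non-pathway SNP.
            running -= 1.0 / n_miss
        # Track the largest deviation from zero, keeping its sign.
        if abs(running) > abs(best):
            best = running
    return best


# Pathway SNPs at the top of the ranking give a score near +1.
top = weighted_ks_enrichment([5, 4, 3, 2, 1, 0.5],
                             [True, True, False, False, False, False])
```

A score near +1 means the pathway SNPs cluster at the top of the ranking, and a score near -1 means they cluster at the bottom.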

124 citations


Cited by
Journal Article
TL;DR: A unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs is presented.
Abstract: Recent advances in sequencing technology make it possible to comprehensively catalogue genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (1) initial read mapping; (2) local realignment around indels; (3) base quality score recalibration; (4) SNP discovery and genotyping to find all potential variants; and (5) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We discuss the application of these tools, instantiated in the Genome Analysis Toolkit (GATK), to deep whole-genome, whole-exome capture, and multi-sample low-pass (~4×) 1000 Genomes Project datasets.

10,056 citations

Journal Article
Heng Li
TL;DR: This work presents a statistical framework for calling SNPs, discovering somatic mutations, inferring population genetical parameters and performing association tests directly based on sequencing data without explicit genotyping or linkage-based imputation and demonstrates that this method achieves comparable accuracy to alternative methods for estimating site allele count, for inferring allele frequency spectrum and for association mapping.
Abstract: Motivation: Most existing methods for DNA sequence analysis rely on accurate sequences or genotypes. However, in applications of the next-generation sequencing (NGS), accurate genotypes may not be easily obtained (e.g. multi-sample low-coverage sequencing or somatic mutation discovery). These applications press for the development of new methods for analyzing sequence data with uncertainty. Results: We present a statistical framework for calling SNPs, discovering somatic mutations, inferring population genetical parameters and performing association tests directly based on sequencing data without explicit genotyping or linkage-based imputation. On real data, we demonstrate that our method achieves comparable accuracy to alternative methods for estimating site allele count, for inferring allele frequency spectrum and for association mapping. We also highlight the necessity of using symmetric datasets for finding somatic mutations and confirm that for discovering rare events, mismapping is frequently the leading source of errors. Availability: http://samtools.sourceforge.net Contact: hengli@broadinstitute.org
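One way to work with sequencing data "without explicit genotyping" is to estimate a site's allele frequency directly from per-sample genotype likelihoods. The sketch below uses a standard expectation-maximization update under two assumptions that are not taken from the abstract: diploid samples and Hardy-Weinberg equilibrium. The function name, starting value, and tolerances are hypothetical, and this is an illustration of the general idea rather than the paper's exact procedure.

```python
def em_allele_frequency(genotype_likelihoods, start=0.2, max_iter=200, tol=1e-8):
    """Estimate the alternate-allele frequency at one site by EM.

    genotype_likelihoods: list of (L0, L1, L2) tuples, where Lg is
    P(reads | sample carries g alternate alleles).
    Assumes diploid samples in Hardy-Weinberg equilibrium.
    """
    freq = start
    n_samples = len(genotype_likelihoods)
    for _ in range(max_iter):
        expected_alt = 0.0
        for l0, l1, l2 in genotype_likelihoods:
            # E-step: unnormalized genotype posteriors under the current frequency.
            p0 = l0 * (1 - freq) ** 2
            p1 = l1 * 2 * freq * (1 - freq)
            p2 = l2 * freq ** 2
            total = p0 + p1 + p2
            if total == 0:
                raise ValueError("sample has zero likelihood under current frequency")
            # Expected number of alternate alleles carried by this sample.
            expected_alt += (p1 + 2 * p2) / total
        # M-step: expected alternate alleles divided by total chromosomes.
        new_freq = expected_alt / (2 * n_samples)
        if abs(new_freq - freq) < tol:
            return new_freq
        freq = new_freq
    return freq


# Four samples whose likelihoods point to genotypes 0, 1, 2 and 0.
estimate = em_allele_frequency([(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 0, 0)])
```

With completely uninformative likelihoods the update leaves the starting value unchanged, which is a useful sanity check when adapting the sketch.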

4,949 citations

Journal Article
TL;DR: Substantial agreement was found among a large, interdisciplinary cohort of international experts regarding evidence supporting recommendations, and the remaining literature gaps in the assessment, prevention, and treatment of Pain, Agitation/sedation, Delirium, Immobility (mobilization/rehabilitation), and Sleep (disruption) in critically ill adults.
Abstract: Objective: To update and expand the 2013 Clinical Practice Guidelines for the Management of Pain, Agitation, and Delirium in Adult Patients in the ICU. Design: Thirty-two international experts, four methodologists, and four critical illness survivors met virtually at least monthly. All section groups g

1,935 citations

01 Mar 2001
TL;DR: Using singular value decomposition in transforming genome-wide expression data from genes x arrays space to reduced diagonalized "eigengenes" x "eigenarrays" space gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype.
Abstract: We describe the use of singular value decomposition in transforming genome-wide expression data from genes × arrays space to reduced diagonalized “eigengenes” × “eigenarrays” space, where the eigengenes (or eigenarrays) are unique orthonormal superpositions of the genes (or arrays). Normalizing the data by filtering out the eigengenes (and eigenarrays) that are inferred to represent noise or experimental artifacts enables meaningful comparison of the expression of different genes across different arrays in different experiments. Sorting the data according to the eigengenes and eigenarrays gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype, respectively. After normalization and sorting, the significant eigengenes and eigenarrays can be associated with observed genome-wide effects of regulators, or with measured samples, in which these regulators are overactive or underactive, respectively.
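The decomposition and filtering described in this abstract can be sketched with NumPy's `numpy.linalg.svd`. The example assumes a genes × arrays matrix with no missing values. The function names and the choice of which components to drop are hypothetical, and deciding which eigengenes represent noise or artifacts is left to the analyst.

```python
import numpy as np


def eigengene_decomposition(expression):
    """Decompose a genes x arrays matrix into eigenarrays and eigengenes.

    Returns (u, s, vt, fraction): columns of u are eigenarrays, rows of vt
    are eigengenes, s holds the singular values, and fraction is each
    component's share of the total squared singular values.
    """
    u, s, vt = np.linalg.svd(expression, full_matrices=False)
    fraction = s ** 2 / np.sum(s ** 2)
    return u, s, vt, fraction


def filter_components(expression, drop):
    """Rebuild the matrix after zeroing the singular values listed in drop."""
    u, s, vt, _ = eigengene_decomposition(expression)
    s_filtered = s.copy()
    for k in drop:
        s_filtered[k] = 0.0
    # Scaling the columns of u by s_filtered is equivalent to u @ diag(s) @ vt.
    return (u * s_filtered) @ vt


# Toy data: 4 genes x 3 arrays with a single shared expression pattern.
expression = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0])
u, s, vt, fraction = eigengene_decomposition(expression)
residual = filter_components(expression, drop=[0])
```

In the toy data the first component carries essentially all of the signal, so removing it leaves a matrix of near-zero values.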

1,815 citations

Journal Article
TL;DR: A multithreaded program suite called ANGSD that can calculate various summary statistics, and perform association mapping and population genetic analyses utilizing the full information in next generation sequencing data by working directly on the raw sequencing data or by using genotype likelihoods.
Abstract: High-throughput DNA sequencing technologies are generating vast amounts of data. Fast, flexible and memory efficient implementations are needed in order to facilitate analyses of thousands of samples simultaneously. We present a multithreaded program suite called ANGSD. This program can calculate various summary statistics, and perform association mapping and population genetic analyses utilizing the full information in next generation sequencing data by working directly on the raw sequencing data or by using genotype likelihoods. The open source C/C++ program ANGSD is available at http://www.popgen.dk/angsd . The program is tested and validated on GNU/Linux systems. The program facilitates multiple input formats including BAM and imputed beagle genotype probability files. The program allows the user to choose between combinations of existing methods and can perform analyses that are not implemented elsewhere.

1,795 citations