Author

Xihao Li

Bio: Xihao Li is an academic researcher at Harvard University. The author has contributed to research on the topics of Medicine and Biology, has an h-index of 7, and has co-authored 20 publications that have received 172 citations.

Papers
Journal Article
TL;DR: STAAR is a powerful rare variant association test that incorporates variant functional categories and complementary functional annotations using a dynamic weighting scheme based on annotation principal components and is scalable for analyzing large whole-genome sequencing studies.
Abstract: Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce 'annotation principal components', multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.

116 citations

Journal Article
TL;DR: The results imply that rare variants, in particular those in regions of low linkage disequilibrium, are a major source of the still missing heritability of complex traits and disease.

72 citations

Journal Article
TL;DR: A large-scale cross-trait genome-wide association study to investigate genetic overlap between chronic obstructive pulmonary disease (COPD) and four primary cardiac traits found evidence of shared genetics between COPD and cardiac traits.
Abstract: A growing number of studies clearly demonstrate a substantial association between chronic obstructive pulmonary disease (COPD) and cardiovascular diseases (CVD), although little is known about the shared genetics that contribute to this association. We conducted a large-scale cross-trait genome-wide association study to investigate genetic overlap between COPD (Ncase = 12,550, Ncontrol = 46,368) from the International COPD Genetics Consortium and four primary cardiac traits: resting heart rate (RHR) (N = 458,969), high blood pressure (HBP) (Ncase = 144,793, Ncontrol = 313,761), coronary artery disease (CAD) (Ncase = 60,801, Ncontrol = 123,504), and stroke (Ncase = 40,585, Ncontrol = 406,111) from UK Biobank, CARDIoGRAMplusC4D Consortium, and International Stroke Genetics Consortium data. RHR and HBP showed modest genetic correlation with COPD at the genome-wide level, and CAD showed borderline evidence of correlation. We found evidence of local genetic correlation with particular regions of the genome. Cross-trait meta-analysis of COPD identified 21 loci jointly associated with RHR, 22 loci with HBP, and 3 loci with CAD. Functional analysis revealed that shared genes were enriched in smoking-related pathways and in cardiovascular, nervous, and immune system tissues. An examination of smoking-related genetic variants identified SNPs located in the 15q25.1 region associated with cigarettes per day, with effects on RHR and CAD. A Mendelian randomization analysis showed a significant positive causal effect of COPD on RHR (causal estimate = 0.1374, P = 0.008). In a set of large-scale GWAS, we identify evidence of shared genetics between COPD and cardiac traits.

61 citations

Journal Article
TL;DR: In this paper, the authors investigated the relationship between smoking history and clinically relevant mutations in non-small cell lung cancer, revealing the potential of smoking history as a surrogate for tumor mutation burden.
Abstract: Lung carcinogenesis is a complex and stepwise process involving accumulation of genetic mutations in signaling and oncogenic pathways via interactions with environmental factors and host susceptibility. Tobacco exposure is the leading cause of lung cancer, but its relationship to clinically relevant mutations and the composite tumor mutation burden (TMB) has not been fully elucidated. In this study, we investigated the dose-response relationship in a retrospective observational study of 931 patients treated for advanced-stage non-small cell lung cancer (NSCLC) between April 2013 and February 2020 at the Dana Farber Cancer Institute and Brigham and Women's Hospital. Doubling smoking pack-years was associated with increased KRAS G12C and less frequent EGFR del19 and EGFR L858R mutations, whereas doubling smoking-free months was associated with more frequent EGFR L858R. In advanced lung adenocarcinoma, doubling smoking pack-years was associated with an increase in TMB, whereas doubling smoking-free months was associated with a decrease in TMB, after controlling for age, gender, and stage. There is a significant dose-response association of smoking history with genetic alterations in cancer-related pathways and TMB in advanced lung adenocarcinoma. SIGNIFICANCE: This study clarifies the relationship between smoking history and clinically relevant mutations in non-small cell lung cancer, revealing the potential of smoking history as a surrogate for tumor mutation burden.

50 citations

Journal Article
TL;DR: Functional analysis revealed that the shared genes are enriched in amyloid metabolic process, lipoprotein remodeling and other related biological pathways; also in pancreas, liver, blood and other tissues.
Abstract: A growing number of studies clearly demonstrate a substantial link between metabolic dysfunction and the risk of Alzheimer’s disease (AD), especially glucose-related dysfunction; one hypothesis for this comorbidity is the presence of a common genetic etiology. We conducted a large-scale cross-trait GWAS to investigate the genetic overlap between AD and ten metabolic traits. Among all the metabolic traits, fasting glucose, fasting insulin and HDL were found to be genetically associated with AD. Local genetic covariance analysis found that the 19q13 region had strong local genetic correlation between AD and T2D (P = 6.78 × 10^-22), LDL (P = 1.74 × 10^-253) and HDL (P = 7.94 × 10^-18). Cross-trait meta-analysis identified 4 loci that were associated with AD and fasting glucose, 3 loci that were associated with AD and fasting insulin, and 20 loci that were associated with AD and HDL (Pmeta < 1.6 × 10^-8, single trait P < 0.05). Functional analysis revealed that the shared genes are enriched in amyloid metabolic process, lipoprotein remodeling and other related biological pathways; also in pancreas, liver, blood and other tissues. Our work identifies common genetic architectures shared between AD and fasting glucose, fasting insulin and HDL, and sheds light on molecular mechanisms underlying the association between metabolic dysregulation and AD.

42 citations


Cited by
01 Feb 2015
TL;DR: In this article, the authors describe the integrative analysis of 111 reference human epigenomes generated as part of the NIH Roadmap Epigenomics Consortium, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression.
Abstract: The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.

4,409 citations

Journal Article
Daniel Taliun, Daniel N. Harris, Michael D. Kessler, Jedidiah Carlson, and 202 more authors (61 institutions)
10 Feb 2021, Nature
TL;DR: The Trans-Omics for Precision Medicine (TOPMed) programme aims to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases.
Abstract: The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%. The goals, resources and design of the NHLBI Trans-Omics for Precision Medicine (TOPMed) programme are described, and analyses of rare variants detected in the first 53,831 samples provide insights into mutational processes and recent human evolutionary history.

801 citations

Posted Content
Daniel Taliun, Daniel N. Harris, Michael D. Kessler, Jedidiah Carlson, and 191 more authors (61 institutions)
06 Mar 2019, bioRxiv
TL;DR: The nearly complete catalog of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and non-coding sequence variants to phenotypic variation as well as resources and early insights from the sequence data.
Abstract: The Trans-Omics for Precision Medicine (TOPMed) program seeks to elucidate the genetic architecture and disease biology of heart, lung, blood, and sleep disorders, with the ultimate goal of improving diagnosis, treatment, and prevention. The initial phases of the program focus on whole genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here, we describe TOPMed goals and design as well as resources and early insights from the sequence data. The resources include a variant browser, a genotype imputation panel, and sharing of genomic and phenotypic data via dbGaP. In 53,581 TOPMed samples, >400 million single-nucleotide and insertion/deletion variants were detected by alignment with the reference genome. Additional novel variants are detectable through assembly of unmapped reads and customized analysis in highly variable loci. Among the >400 million variants detected, 97% have frequency

662 citations

01 Jan 2011
TL;DR: In this paper, the authors show that in the absence of population structure and other technical artefacts, but in the presence of polygenic inheritance, substantial genomic inflation is expected; its magnitude depends on sample size, heritability, linkage disequilibrium structure and the number of causal variants.
Abstract: Population structure, including population stratification and cryptic relatedness, can cause spurious associations in genome-wide association studies (GWAS). Usually, the scaled median or mean test statistic for association calculated from multiple single-nucleotide-polymorphisms across the genome is used to assess such effects, and 'genomic control' can be applied subsequently to adjust test statistics at individual loci by a genomic inflation factor. Published GWAS have clearly shown that there are many loci underlying genetic variation for a wide range of complex diseases and traits, implying that a substantial proportion of the genome should show inflation of the test statistic. Here, we show by theory, simulation and analysis of data that in the absence of population structure and other technical artefacts, but in the presence of polygenic inheritance, substantial genomic inflation is expected. Its magnitude depends on sample size, heritability, linkage disequilibrium structure and the number of causal variants. Our predictions are consistent with empirical observations on height in independent samples of ~4000 and ~133,000 individuals.
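
The scaled-median statistic and the 'genomic control' adjustment mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes 1-degree-of-freedom chi-square association statistics, and the function names and the convention of not rescaling when the factor is below 1 are assumptions made for the example.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom (approx.).
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(chi2_stats):
    """Return the scaled median of 1-df chi-square association statistics.

    A value near 1 means the observed median matches the null expectation;
    larger values indicate inflation of the test statistics.
    """
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def apply_genomic_control(chi2_stats):
    """Divide each statistic by the inflation factor.

    Assumption for this sketch: statistics are left unchanged when the
    factor is below 1, so the adjustment never makes results less
    conservative.
    """
    scale = max(genomic_inflation_factor(chi2_stats), 1.0)
    return [x / scale for x in chi2_stats]


if __name__ == "__main__":
    # Hypothetical per-SNP statistics, for illustration only.
    observed = [0.2, 0.5, 0.9, 1.4, 3.1]
    print(genomic_inflation_factor(observed))
    print(apply_genomic_control(observed))
```

Because the abstract argues that inflation is expected under polygenic inheritance, a factor above 1 computed this way does not by itself indicate population structure.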

413 citations

01 Jan 2018
TL;DR: The first large randomized controlled trials of multidomain lifestyle interventions to prevent cognitive impairment have been completed, and the results suggest that targeting interventions to individuals at risk of dementia is an effective strategy.
Abstract: Research into dementia prevention is of paramount importance if the dementia epidemic is to be halted. Observational studies have identified several potentially modifiable risk factors for dementia, including hypertension, dyslipidaemia and obesity at midlife, diabetes mellitus, smoking, physical inactivity, depression and low levels of education. Randomized clinical trials are needed that investigate whether interventions targeting these risk factors can reduce the risk of cognitive decline and dementia in elderly adults, but such trials are methodologically challenging. To date, most preventive interventions have been tested in small groups, have focused on a single lifestyle factor and have yielded negative or modest results. Given the multifactorial aetiology of dementia and late-onset Alzheimer disease, multidomain interventions that target several risk factors and mechanisms simultaneously might be necessary for an optimal preventive effect. In the past few years, three large multidomain trials (FINGER, MAPT and PreDIVA) have been completed. The FINGER trial showed that a multidomain lifestyle intervention can benefit cognition in elderly people with an elevated risk of dementia. The primary results from the other trials did not show a statistically significant benefit of preventive interventions, but additional analyses among participants at risk of dementia showed beneficial effects of intervention. Overall, results from these three trials suggest that targeting of preventive interventions to at-risk individuals is an effective strategy. This Review discusses the current knowledge of lifestyle-related risk factors and results from novel trials aiming to prevent cognitive decline and dementia. Global initiatives are presented, including the World Wide FINGERS network, which aims to harmonize studies on dementia prevention, generate high-quality scientific evidence and promote its implementation.

228 citations