An integrated approach to identify environmental modulators of genetic risk factors for complex traits

Abstract
Complex traits and diseases can be influenced by both genetics and environment. However, given the large number of environmental stimuli and power challenges for gene-by-environment testing, it remains a critical challenge to identify and prioritize specific disease-relevant environmental exposures. We propose a novel framework for leveraging signals from transcriptional responses to environmental perturbations to identify disease-relevant perturbations that can modulate genetic risk for complex traits and inform the functions of genetic variants associated with complex traits. We perturbed human skeletal muscle, fat, and liver relevant cell lines with 21 perturbations affecting insulin resistance, glucose homeostasis, and metabolic regulation in humans and identified thousands of environmentally responsive genes. By combining these data with GWAS from 31 distinct polygenic traits, we show that heritability of multiple traits is enriched in regions surrounding genes responsive to specific perturbations and, further, that environmentally responsive genes are enriched for associations with specific diseases and phenotypes from the GWAS catalogue. Overall, we demonstrate the advantages of large-scale characterization of transcriptional changes in diversely stimulated and pathologically relevant cells to identify disease-relevant perturbations.

Brunilda Balliu*
Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
Ivan Carcamo-Orive*
Department of Medicine, Division of Cardiovascular Medicine, Cardiovascular Institute and Stanford Diabetes
Research Center, Stanford University School of Medicine, Stanford, CA, USA
Michael J. Gloudemans
Biomedical Informatics Training Program and Department of Pathology, Stanford University School of Medicine,
Stanford, CA, USA
Daniel C. Nachun
Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
Matthew G. Durrant
Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
Steven Gazal
Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, CA, USA
Chong Y. Park
Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA,
USA
David A. Knowles
New York Genome Center, New York, NY, USA
Martin Wabitsch
Department of Pediatrics and Adolescent Medicine, Division of Pediatric Endocrinology, Ulm University, Ulm,
Germany
Thomas Quertermous
Department of Medicine, Division of Cardiology and Cardiovascular Institute, Stanford Diabetes Research Center,
Stanford University School of Medicine, Stanford, CA, USA
Joshua W. Knowles*
Department of Medicine, Division of Cardiology and Cardiovascular Institute, Stanford Diabetes Research Center and
Stanford Prevention Research Center, Stanford University School of Medicine, Stanford, CA, USA
Stephen B. Montgomery*
Department of Pathology and Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
* correspondence to bballiu@ucla.edu, ivantxo@stanford.edu, knowlej@stanford.edu, and smontgom@stanford.edu
¶ These authors contributed equally to this work.
bioRxiv preprint doi: https://doi.org/10.1101/2021.02.23.432608; this version posted February 25, 2021.
The copyright holder for this preprint (which was not certified by peer review) is the author/funder.
All rights reserved. No reuse allowed without permission.

Keywords: Co-localization, Gene expression, Gene-by-environment interactions, GWAS
Introduction
Genome-wide association studies (GWAS) have identified thousands of genetic variants
associated with complex diseases and traits [1]. The majority of these variants fall into non-coding
regions of the genome and, as a result, their mechanisms of action remain largely unknown [2]. In
recent years, researchers have gained an increasingly clear picture of which parts of the genome are
active in a range of tissues and cell types [3–6]. Integrating such information with results from GWAS
has identified cell types, tissues, and regulatory elements relevant to specific diseases and
phenotypes and moved the field towards a mechanistic understanding of GWAS hits [7–9]. In addition,
genomic colocalization and transcriptome-wide association studies combining results from GWAS
and expression quantitative trait loci (eQTL) studies have identified candidate causal genes and their
mechanisms of action [10–12].

Despite these advances, only a modest fraction of GWAS-associated variants and eQTLs
colocalize for any trait [13,14], which suggests that many disease-relevant effects are
modulated by yet-to-be-discovered environmental factors. To address this challenge, multiple
studies have mapped eQTLs in vitro that are responsive to the environment [15–26]. For example, the
Immune Variation project identified eQTLs in human CD4+ T lymphocytes with different effects
across distinct immune states [17]. These previously unknown, immune state-specific eQTLs were
enriched for autoimmune disease-associated variants, underscoring the importance of exploring
contexts beyond tissues and cell types to reveal the specificity of genetic associations. Although
there is mounting evidence that environment modulates genetic effects, GWAS and eQTL studies
rarely measure and test for genetic interactions with environmental exposures. This is, in part, due to
the difficulty of identifying and collecting information on the most relevant environmental
exposures in GWAS cohorts and of performing eQTL studies in contexts that are relevant for the
specific trait or disease.

In this study, we extend the current understanding of inherited variation in complex traits by
implementing a novel framework to model signals from transcriptional responses to environmental
perturbations in order to identify and prioritize disease-relevant environments that can modulate
genetic risk for complex traits and inform the functions of genetic variants and genes associated
with complex traits. Specifically, we first assessed environmental effects on gene expression levels
in three metabolic human cell lines by performing RNA-seq in muscle-, fat-, and liver-relevant cell
lines treated with 21 different environmental perturbations related to aspects of glucose and insulin
metabolism, kinase inhibition, inflammation, and fatty acid metabolism, among others (N=234 samples).
We identified thousands of environmentally responsive genes underlying disease-associated response
pathways and characterized the specificity and sharing of these effects across perturbations and cell
lines. Next, to identify disease-relevant perturbations, we coupled our gene expression data with
GWAS summary statistics of 31 complex traits and diseases as well as associations from the GWAS
catalogue. We confirmed several well-established environment-phenotype associations, e.g., the
role of TGF-β1 in asthma [27], and provided additional evidence for recent and less well-understood
associations, e.g., the role of leptin in major depressive disorder [28]. Last, to further illustrate how
perturbation experiments inform the functions of complex trait-associated variants, we integrated our
perturbation data with genomic colocalization studies and show that the effects of these
perturbations in the relevant tissues identify context-specific molecular mechanisms of GWAS hits
for diverse cardiometabolic traits.

This resource characterizes the dynamic transcriptional landscape in metabolic tissues and
provides a framework to identify and prioritize disease-relevant perturbations and disentangle the
complex gene-environment interactions that determine disease susceptibility, which is particularly
relevant for complex traits such as insulin resistance, diabetes and obesity.
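
One generic way to ask whether environmentally responsive genes are over-represented among genes
with reported trait associations is a one-sided hypergeometric test. The sketch below is an
illustration under that assumption and is not necessarily the test used in this study; the function
name `hypergeom_enrichment_p` and all counts in the toy example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_trait, n_responsive, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    n_universe   : number of genes tested (background)
    n_trait      : background genes annotated to the trait
    n_responsive : background genes responsive to the perturbation
    n_overlap    : genes that are both trait-annotated and responsive
    """
    upper = min(n_trait, n_responsive)
    total = comb(n_universe, n_responsive)
    # Sum the hypergeometric probability mass from the observed overlap upwards.
    tail = sum(
        comb(n_trait, k) * comb(n_universe - n_trait, n_responsive - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to keep the arithmetic easy to check by hand.
p_toy = hypergeom_enrichment_p(n_universe=20, n_trait=5, n_responsive=4, n_overlap=2)
```

In practice the choice of background gene set matters: restricting the universe to genes expressed
in the relevant cell line avoids inflating the apparent enrichment.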
Results
Transcriptome map of 21 perturbations across human skeletal muscle, fat and liver cell lines
We generated a transcriptome map of multiple chemical and environmental perturbations in
well-established human skeletal muscle, fat and liver cell lines (N=234 samples). Specifically, we
studied 21 environmental perturbations covering multiple aspects of glucose and insulin
metabolism, inflammation, and fatty acid metabolism, and including both LDL-lowering and
anti-diabetic drugs (Figure 1 and Table S1). For each perturbation and cell line, along with matched
controls, we conducted assays in triplicate and applied differential expression analysis. We observed
that the majority of perturbations induced broad gene expression changes in at least one cell line at
FDR < 5% (Figure 1A, Table S2). Several perturbations induced broad changes across all cell lines; for
example, insulin and IGF1 altered the expression of 1,500-2,000 genes in each cell line. Other
perturbations induced broad changes only in specific cell lines. For example, IL-6, lauroyl-l-carnitine,
and glucose had more pronounced effects in fat, muscle, and liver, where they altered the
expression of 3,161, 2,051 and 2,724 genes, respectively.
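
To make the FDR < 5% criterion concrete, the following is a minimal sketch of Benjamini-Hochberg
adjustment applied to per-gene p-values from a single perturbation-versus-control comparison. It is
an illustration only, not the authors' pipeline: the function names and the toy p-values are
hypothetical, and a real analysis would take its p-values from a dedicated RNA-seq differential
expression tool.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_de_genes(pvalues, fdr=0.05):
    """Count genes whose adjusted p-value falls below the FDR threshold."""
    return sum(q < fdr for q in benjamini_hochberg(pvalues))


# Hypothetical p-values for ten genes in one comparison (not from the study).
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
toy_adjusted = benjamini_hochberg(toy_pvalues)
n_de = count_de_genes(toy_pvalues)
```

Repeating this count for every perturbation and cell line pair would give a table of DE gene
numbers of the kind summarized above.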
Despite the broad effects of each perturbation, multiple differentially expressed (DE) genes
showed perturbation-specific effects within each cell line, highlighting a unique molecular response
to each perturbation. We observed 1,883 genes in muscle, 1,813 genes in fat, and 2,231 genes in
liver altered by only a single perturbation in their respective cell lines (Figure 1B and Table S3).
The largest proportions of perturbation-specific DE genes were found in glucose-stimulated liver
cell lines and TGF-β1-stimulated fat cell lines. For these perturbations, 32.6% and 26.4% of DE
genes, respectively, were not altered by any of the other 20 perturbations in the same cell line
(Figure 1C). By further stratifying across cell lines, we identified 627, 742, and 808 genes that were
both perturbation- and cell line-specific DE genes in muscle, fat, and liver, respectively (FDR < 5%;
Figure S4A and Table S3). Glucose-stimulated liver cells also yielded the most perturbation- and cell
line-specific DE genes; 9.8% of DE genes were not altered by any of the other 20 perturbations in
any cell line or by glucose stimulation in fat or muscle.
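
The definition of a perturbation-specific DE gene used above (a gene altered by exactly one
perturbation within a cell line) can be sketched with simple set operations. This is a hedged
illustration rather than the authors' code: the perturbation names are taken from the text, while
the gene identifiers, set memberships, and function names are made-up placeholders.

```python
from collections import Counter


def perturbation_specific(de_by_perturbation):
    """Map each perturbation to its DE genes that no other perturbation alters."""
    counts = Counter(
        gene for genes in de_by_perturbation.values() for gene in set(genes)
    )
    return {
        pert: {gene for gene in genes if counts[gene] == 1}
        for pert, genes in de_by_perturbation.items()
    }


def specific_fraction(de_by_perturbation):
    """Fraction of each perturbation's DE genes that are perturbation-specific."""
    specific = perturbation_specific(de_by_perturbation)
    return {
        pert: (len(specific[pert]) / len(set(genes)) if genes else 0.0)
        for pert, genes in de_by_perturbation.items()
    }


# Placeholder DE gene sets for one cell line (identifiers are not real results).
toy_de = {
    "glucose": {"gene_a", "gene_b", "gene_c", "gene_d"},
    "insulin": {"gene_b", "gene_d", "gene_e"},
    "IGF1": {"gene_e", "gene_f"},
}
toy_specific = perturbation_specific(toy_de)
toy_fraction = specific_fraction(toy_de)
```

Extending the same idea to perturbation- and cell line-specific genes would amount to keying the
sets by (perturbation, cell line) pairs before counting.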