Cross-species identification of cancer-resistance associated genes uncovers their relevance to human cancer risk

Nishanth Ulhas Nair (1,*,#), Kuoyuan Cheng (1,2,*,#), Lamis Naddaf (3,*), Elad Sharon (3),
Lipika R. Pal (1), Padma S. Rajagopal (4), Irene Unterman (3), Kenneth Aldape (5),
Sridhar Hannenhalli (1), Chi-Ping Day (6), Yuval Tabach (3,#), Eytan Ruppin (1,#)
1. Cancer Data Science Laboratory (CDSL), National Cancer Institute (NCI), National Institutes of
Health (NIH), Bethesda, MD, USA.
2. Center for Bioinformatics and Computational Biology, University of Maryland, College Park,
MD, USA.
3. Department of Developmental Biology and Cancer Research, Institute of Medical Research -
Israel-Canada, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel.
4. Section of Hematology/Oncology, Department of Medicine, The University of Chicago,
Chicago, IL, USA.
5. Laboratory of Pathology, National Cancer Institute (NCI), National Institutes of Health (NIH),
Bethesda, MD, USA.
6. Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), National
Institutes of Health (NIH), Bethesda, MD, USA.
* These authors contributed equally to this work as co-first authors.
# co-corresponding authors (nishanth.nair@nih.gov, kycheng@terpmail.umd.edu,
tabachy@gmail.com, eytan.ruppin@nih.gov)
bioRxiv preprint doi: https://doi.org/10.1101/2021.05.19.444895; this version posted May 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

ABSTRACT
Cancer is an evolutionarily conserved disease that occurs in a wide variety of species. We
applied a comparative genomics approach to systematically characterize the genes whose
conservation levels significantly correlate positively (PC) or negatively (NC) with a broad
spectrum of cancer-resistance estimates, computed across almost 200 vertebrate species. PC
genes are enriched in pathways relevant to tumor suppression including cell cycle, DNA repair,
and immune response, while NC genes are enriched with a host of metabolic pathways. The
conservation levels of the PC and NC genes in a species serve to build the first genomics-based
predictor of its cancer resistance score. We find that PC genes are less tolerant to loss of
function (LoF) mutations, are enriched in cancer driver genes and are associated with germline
mutations that increase human cancer risk. Furthermore, their expression levels are associated
with lifetime cancer risk across human tissues. Finally, their knockout in mice results in
increased cancer incidence. In sum, we find that many genes associated with cancer resistance
across species are implicated in human cancers, pointing to several additional candidate genes
that may have a functional role in human cancer.
INTRODUCTION
Animal species are known to have dramatic differences in their cancer rates and lifespans, and
several animals are considered cancer resistant while others are considered to be cancer prone
(Gorbunova et al. 2014; Albuquerque et al. 2018). Studying the genomic underpinnings of these
differences across various branches of life may provide insights into cancer development and
cancer prevention/treatment options in humans (Seluanov et al. 2018).
The multistage carcinogenesis model states that “individual cells become cancerous
after accumulating a specific number of mutational hits” (Seluanov et al. 2018; Nordling, 1953).
Based on this model, larger (and longer-living) animals are expected to have higher cancer
incidence as they have more stem cell divisions overall, resulting in a higher likelihood of
producing and propagating carcinogenic mutations. For humans, it has been shown that the
risks of cancer development across different tissue types are correlated with their
corresponding estimated number of lifetime stem cell divisions (Tomasetti et al. 2015 and
2017); consistent with that, human cancer risk is indeed correlated with body height (Khankari
et al. 2016). However, cancer risk does not correlate with body size across species, a
contradiction known as Peto’s paradox (Peto, 1977; Tollis et al., 2017; Seluanov et al. 2018). For
example, humans do not have higher cancer risk than mice despite having thousands of times
more cells (Lipman et al. 2004; Szymanska et al. 2014; Ikeno et al. 2009). More drastically, the
cancer-resistant bowhead whale (Keane et al., 2016) can weigh 100 tons, live for over 200 years
(George et al., 1999) and have millions of times more cells than mice. It follows that different
species must have evolved different cancer resistance mechanisms to fit their lifestyles,
modifying the “baseline” probability of malignant transformation determined by body size,
lifespan, and tissue stem cell division (see Supp. Note for a short review of such mechanisms).
Numerous studies have adopted comparative genomics approaches to understand the
evolution of cancer resistance mechanisms across mammals. Some have focused on known
human cancer genes and their homologs. For example, Vicens and Posada (2018) found that
genes related to DNA repair and T cell proliferation have evolved under positive selection in
mammals. Tollis et al. (2020) found that the number of paralogs of human cancer genes across
mammals is positively correlated with the species’ lifespan, but not body size. Vazquez and
Lynch (2021) reported widespread tumor suppressor gene (TSG) duplications across both large
and small Afrotherian species. Other studies focused on body size and longevity, yielding some
insights into Peto’s paradox. Kowalczyk et al. (2020) analyzed genes whose evolutionary rates
across mammals correlate with body size and lifespan and discovered cancer resistance-related
genes that are under increased evolutionary constraints in larger and longer-living mammals.
Ferris et al. (2018) identified regions with accelerated evolution in specific mammals, including
several cancer resistant species, which provided some insights into the cancer resistance
mechanisms they have developed.
Unlike previous studies that focused exclusively on mammals, here we perform a
comprehensive genome-wide comparative study aimed at identifying genes related to cancer
resistance across a wide range of vertebrate species. To this end, we estimated the protein
conservation scores across species including mammals, birds and fish, identifying genes whose
conservation levels are associated with cancer resistance estimates. We then use these cancer-
resistance associated genes to build the first genomics-based predictor of cancer resistance for
any species. We show that the biological processes associated with cancer resistance vary
across taxonomic groups (classes and orders of species), pointing to the diversity in the
evolutionary paths and mechanisms for resisting cancer. Finally, the genes identified from this
phylogenetic analysis are enriched for cancer driver genes and for genes associated with cancer
risk in humans. These results show that a comparative genomic approach can help identify
genes involved in human cancers.
RESULTS
Computing gene conservation and species cancer-resistance estimates
We computed a matrix (Tabach et al. Nature 2013; Tabach et al. MSB 2013) of gene
conservation scores (phylogenetic profiles) across 240 species for which we had phenotypic
information in the AnAge database (Tacutu et al. 2018) and sequence information from UniProt
(UniProt Consortium, 2021), RefSeq (O’Leary et al. 2016), Keane et al. (2015), and NCBI (Sayers
et al. 2021) databases. To do this, the protein sequence similarity between each gene in the
genome of a reference species and its orthologs in each of the rest of the species (termed
phylogenetic profiling; Pellegrini et al. 1999) was measured using the bit score computed with
BLASTP (Altschul et al. 1990). The BLASTP bit scores were normalized by their gene length
(Tabach et al. Nature 2013; Sherill-Rofe et al. 2019) and then rank-normalized across all genes
within each species to control for the evolutionary distance between the reference and each
species (Methods). These rank-normalized values range from 0 to 1, with higher values
corresponding to higher conservation levels. This method is termed rank-based phylogenetic
profiling. We primarily focused on the human as the reference species (Braun et al., 2020) as
we are interested in making our findings relevant to human cancers. However, we
demonstrated that our conclusions are robust to the choice of reference (Methods, Supp.
Note), largely because the normalization effectively removes dependency on phylogenetic
distance.
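The normalization described above can be illustrated with a minimal sketch. It assumes the bit scores have already been computed with BLASTP and are available as plain dictionaries; the gene names, species names, and numbers below are hypothetical toy values, and the exact tie handling and scaling (average rank divided by the number of genes) are assumptions of this sketch rather than details stated in the text.

```python
def rank_normalize(values):
    """Map values to (0, 1] as average rank / n, so tied values share one score."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the group of values tied with values[order[i]].
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / n
        i = j + 1
    return ranks


def phylogenetic_profile(bit_scores, gene_lengths):
    """bit_scores: {species: {gene: bit score}}; gene_lengths: {gene: length}.

    Returns {species: {gene: conservation score}}.
    """
    profile = {}
    for species, scores in bit_scores.items():
        genes = sorted(scores)
        # Step 1: normalize each bit score by the length of the reference gene.
        per_length = [scores[g] / gene_lengths[g] for g in genes]
        # Step 2: rank-normalize across all genes within this species.
        profile[species] = dict(zip(genes, rank_normalize(per_length)))
    return profile


# Hypothetical toy input (not data from the study).
bit_scores = {
    "species_1": {"geneA": 600.0, "geneB": 2400.0, "geneC": 680.0},
    "species_2": {"geneA": 300.0, "geneB": 700.0, "geneC": 640.0},
}
gene_lengths = {"geneA": 393, "geneB": 1863, "geneC": 335}
profile = phylogenetic_profile(bit_scores, gene_lengths)
```

Because the ranking is done within each species, a distant species with uniformly low bit scores still yields scores spread over the same range as a close relative, which is the property the text relies on when controlling for evolutionary distance.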
Since the cancer incidence rates of most species are largely unknown, we used two
proxy cancer-resistance estimates that have been proposed in the literature: MLTAW and
MLCAW. MLTAW assumes that the level of cancer resistance in a given species needs to roughly
counteract its risk of cancer development due to cell division, which is proportional to
$\mathrm{ML}^{6} \times \mathrm{AW}$, where ML denotes the species maximum longevity and AW denotes its adult weight (Peto
et al. 1977, 2015; Vazquez et al. 2021; Methods). MLCAW considers the well-established
correlation between lifespan and body weight (AW) across many species (Speakman, 2005) and
thus regresses out the species AW from its ML (Methods). We computed MLTAW and MLCAW
for 193 out of the 240 species for which both ML and AW data were publicly available (Table S1,
Methods). These 193 species are from multiple Vertebrata classes, including Mammalia
(mammals, n=108), Aves (birds, n=55), Teleostei (teleost fishes, n=18), and Reptilia (reptiles,
n=7).
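As a rough illustration of the two estimates, the sketch below computes an MLTAW-like and an MLCAW-like value for a few species. The exact definitions are given in the Methods, which are not reproduced here, so two choices are assumptions of this sketch: MLTAW is taken on a log10 scale, and MLCAW is taken as the residual of an ordinary least-squares fit of log10(ML) on log10(AW). The species values are hypothetical.

```python
import math


def mltaw(ml_years, aw_grams):
    """log10 of ML^6 x AW (the log scale is an assumption of this sketch)."""
    return 6 * math.log10(ml_years) + math.log10(aw_grams)


def mlcaw(ml_list, aw_list):
    """Residuals of a least-squares fit of log10(ML) on log10(AW).

    A positive residual means the species lives longer than its body
    weight alone would predict under the fitted line.
    """
    x = [math.log10(a) for a in aw_list]
    y = [math.log10(m) for m in ml_list]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical maximum longevity (years) and adult weight (grams).
ml = [4.0, 12.0, 70.0, 200.0]
aw = [20.0, 3000.0, 4_000_000.0, 100_000_000.0]

mltaw_scores = [mltaw(m, a) for m, a in zip(ml, aw)]
mlcaw_scores = mlcaw(ml, aw)
```

The two estimates behave differently by construction: the MLTAW-like value grows with both lifespan and body weight, while the MLCAW-like residuals are centred on zero and highlight species that are long-lived for their size.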
Genes associated with cancer resistance are enriched in cell cycle, DNA repair, immune
response, and different metabolic pathways
For each gene, we computed the Pearson correlation coefficient between its conservation
scores and the cancer-resistance estimates (MLTAW and MLCAW) across all species (Tables
S2A,B; Methods). We then computed the pathway enrichment of the positively and of the
negatively correlated genes (termed PC or NC genes, respectively) (Tables S3A,B; Methods). PC
genes correlated with either the MLCAW (Fig. 1) or the MLTAW measure (Fig. S1) are enriched
for cell cycle, immune response, DNA repair, and transcription regulation pathways (FDR<0.1),
indicating that many genes in these pathways are more conserved in the relatively long-lived
cancer-resistant species. NC genes are enriched for a diverse range of metabolic pathways
(FDR<0.1, Figs. 1, S1).
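The per-gene correlation step can be sketched as follows. The Pearson coefficient matches the text; the significance test is a swapped-in approximation (a Fisher z-transform with a normal p-value, chosen so that the example needs only the standard library), and the Benjamini-Hochberg adjustment and the 0.1 cutoff applied to genes are assumptions of this sketch. All input values are hypothetical, and with only eight toy species the approximation is much cruder than it would be for the 193 species analyzed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p(r, n):
    """Two-sided p-value from the Fisher z-transform (normal approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite for |r| ~ 1
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_genes(profiles, resistance, fdr=0.1):
    """Label each gene PC, NC, or 'none' from its correlation with resistance."""
    genes = sorted(profiles)
    n = len(resistance)
    rs = [pearson_r(profiles[g], resistance) for g in genes]
    qs = benjamini_hochberg([fisher_p(r, n) for r in rs])
    return {
        g: ("PC" if r > 0 else "NC") if q < fdr else "none"
        for g, r, q in zip(genes, rs, qs)
    }


# Hypothetical conservation scores for three genes across eight species.
resistance = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
profiles = {
    "gene_up": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    "gene_down": [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1],
    "gene_flat": [0.5, 0.4, 0.6, 0.5, 0.4, 0.6, 0.5, 0.5],
}
labels = classify_genes(profiles, resistance)
```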
Frequently Asked Questions (9)
Q1. What contributions have the authors mentioned in the paper "Cross-species identification of cancer-resistance associated genes uncovers their relevance to human cancer risk" ?

The authors applied a comparative genomics approach to systematically characterize the genes whose conservation levels significantly correlate positively (PC) or negatively (NC) with a broad spectrum of cancer-resistance estimates, computed across almost 200 vertebrate species.

Lastly, although the knockout mouse data validates the cancer-resistance function of many PC genes ( Fig. 4E ), further studies are obviously required for testing the roles of PC and NC genes ( and their curated gene list in Table S8 ) in human carcinogenesis. Many of the genes identified are implicated in human cancers, and their further study may increase their understanding of human cancer development, prevention and treatment. 

The expression of PC genes in normal human tissues is associated with their lifetime cancer risk. As PC genes are enriched for human TSGs and oncogenes, they may also have roles in modulating human cancer risk.

The top PC-enriched pathways using the MLTAW measure, where both body size and lifespan are multiplication factors, are dominated by cell cycle regulation and transcription/RNA regulation (Fig. S1), suggesting a stronger role of tissue stem cell division. 

The evidence considered is whether a gene: (a) is a PC or NC gene (at FDR<0.1) for the all-species, mammals-only, birds-only analysis using both estimates; (b) is a human oncogene or tumor suppressor; (c) has a knockout that causes early cancer incidence or early cancer onset in mice; (d) is a loss-of-function gene in CTVT; (e) is a GWAS gene associated with human cancers; (f) is among the expressed mutated genes in the single-cell phylogeny of a mouse melanoma model.

PC genes are associated with cancer incidence in mice and canine transmissible venereal tumors. The authors investigated the relevance of PC and NC genes to cancer risk in other mammalian species.

Some of the manually prioritized genes the authors identified are currently under investigation for association to cancer risk, and their results may support greater consideration of their contribution to human cancer development. 

These results echo those of a recent study showing that cell cycle, DNA repair, NF-κB-related, and immunity pathways have higher evolutionary constraints in larger and longer-living mammals (Kowalczyk et al. 2020). 

The authors used either TSGs alone, or oncogenes alone, or TSGs combined with oncogenes to compute the CR score: (number of TSGs, or oncogenes, or combined, with conservation score > MCS) / (total number of genes), where MCS is the median conservation score of all genes in a species.
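
The CR score formula quoted above can be sketched in a few lines. The excerpt does not say whether "total number of genes" refers to the chosen gene set or to all genes in the species; this sketch assumes the former, so that the score is the fraction of the chosen set lying above the species-wide median. Gene names and scores are hypothetical.

```python
from statistics import median


def cr_score(conservation, gene_set):
    """Fraction of gene_set whose conservation score exceeds the species median.

    conservation: {gene: conservation score} for one species.
    gene_set: TSGs, oncogenes, or both combined.
    Assumption: the denominator is the size of gene_set (restricted to
    genes that have a score), not the number of genes in the genome.
    """
    mcs = median(conservation.values())  # median conservation score (MCS)
    members = [g for g in gene_set if g in conservation]
    above = sum(1 for g in members if conservation[g] > mcs)
    return above / len(members)


# Hypothetical conservation scores for one species.
conservation = {"g1": 0.9, "g2": 0.8, "g3": 0.5, "g4": 0.3, "g5": 0.1}
tsgs = {"g1", "g2", "g4"}
oncogenes = {"g3", "g5"}

score_tsg = cr_score(conservation, tsgs)
score_onc = cr_score(conservation, oncogenes)
score_combined = cr_score(conservation, tsgs | oncogenes)
```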