Cross-species identification of cancer-resistance associated genes uncovers their relevance to human cancer risk

TL;DR: The authors applied a comparative genomics approach to systematically characterize the genes whose conservation levels significantly correlate positively (PC) or negatively (NC) with a broad spectrum of cancer-resistance estimates, computed across almost 200 vertebrate species.
Abstract: Cancer is an evolutionarily conserved disease that occurs in a wide variety of species. We applied a comparative genomics approach to systematically characterize the genes whose conservation levels significantly correlate positively (PC) or negatively (NC) with a broad spectrum of cancer-resistance estimates, computed across almost 200 vertebrate species. PC genes are enriched in pathways relevant to tumor suppression including cell cycle, DNA repair, and immune response, while NC genes are enriched with a host of metabolic pathways. The conservation levels of the PC and NC genes in a species serve to build the first genomics-based predictor of its cancer resistance score. We find that PC genes are less tolerant to loss of function (LoF) mutations, are enriched in cancer driver genes and are associated with germline mutations that increase human cancer risk. Furthermore, their expression levels are associated with lifetime cancer risk across human tissues. Finally, their knockout in mice results in increased cancer incidence. In sum, we find that many genes associated with cancer resistance across species are implicated in human cancers, pointing to several additional candidate genes that may have a functional role in human cancer.

INTRODUCTION

  • Animal species are known to have dramatic differences in their cancer rates and lifespans, and several animals are considered cancer resistant while others are considered to be cancer prone (Gorbunova et al. 2014; Albuquerque et al. 2018).
  • Cancer risk does not correlate with body size across species, a contradiction known as Peto’s paradox (Peto, 1947; Tollis et al., 2017; Seluanov et al. 2018).
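  • Different species must have evolved different cancer resistance mechanisms to fit their lifestyles, modifying the “baseline” probability of malignant transformation determined by body size, lifespan, and tissue stem cell division (see Supp. Note for a short review of such mechanisms).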
  • Unlike previous studies that focused exclusively on mammals, here the authors perform a comprehensive genome-wide comparative study aimed at identifying genes related to cancer resistance across a wide range of vertebrate species.
  • Finally, the genes identified from this phylogenetic analysis are enriched for cancer driver genes and in genes associated with cancer risk in humans.

RESULTS

  • Computing gene conservation and species cancer-resistance estimates.
  • For each gene, the authors computed the Pearson correlation coefficient between its conservation scores and the cancer-resistance estimates (MLTAW and MLCAW) across all species (Tables S2A,B; Methods).
  • The authors then computed the pathway enrichment of the positive and of the negatively correlated genes (termed PC or NC genes, respectively) (Tables S3A,B; Methods).
  • Species with the top and bottom 5% MLCAW values in (A), the top and bottom 10% MLTAW or MLCAW values in (B,C), all data-points in (D), are labeled by their common names.
  • The authors manually curated the lists of PC genes, identifying a subset showing relevance to cancers based on multiple criteria according to the various analyses performed above (e.g., being human cancer drivers, genes whose knockout results in cancer-related phenotypes in mice, etc. Methods; Table S8), and investigated their functions closely.

DISCUSSION

  • The authors systematically analyzed the genomes of almost 200 species to identify genes whose conservation levels are correlated with cancer resistance estimates across different taxonomic groups and characterized their functional enrichment.
  • These results echo those of a recent study showing that cell cycle, DNA repair, NF-κB-related, and immunity pathways have higher evolutionary constraints in larger and longer-living mammals (Kowalczyk et al. 2020).
  • First, the gene conservation computation is based on comparison to a reference species and rank normalization, which does not consider paralogous genes or the phylogenetic tree structure.
  • Additionally, most of their downstream analyses were on the pathway-level, which mitigates the potential variation due to paralogs.
  • In summary, this study presents a systematic species comparison identifying key genes and pathways associated with cancer resistance across species.

METHODS

  • The authors created a matrix of conservation scores for 20076 genes across 240 species with the human genome as the reference.
  • Finally, the conservation scores were obtained by rank-normalizing the protein length-normalized bit scores across genes within each species, to control for the evolutionary distance between human and each species.
  • To identify cancer resistance-associated genes (PC or NC genes), the authors computed the Pearson correlation coefficient between the conservation scores of each gene and each of the two cancer-resistance estimates (MLTAW and MLCAW) after proper transformation (described above).
  • All the genes were ranked by the Spearman’s correlation coefficient, and the enrichment of the PC or NC genes for genes associated with lifetime cancer risk across human tissues was tested with GSEA.
  • For each of the PC/NC genes from the various analyses (at FDR<0.1), the authors looked for supporting evidence from the different analyses described in the manuscript.

Cross-species identification of cancer-resistance associated genes
uncovers their relevance to human cancer risk
Nishanth Ulhas Nair (1,*,#), Kuoyuan Cheng (1,2,*,#), Lamis Naddaf (3,*), Elad Sharon (3),
Lipika R. Pal (1), Padma S. Rajagopal (4), Irene Unterman (3), Kenneth Aldape (5),
Sridhar Hannenhalli (1), Chi-Ping Day (6), Yuval Tabach (3,#), Eytan Ruppin (1,#)
1. Cancer Data Science Laboratory (CDSL), National Cancer Institute (NCI), National Institutes of
Health (NIH), Bethesda, MD, USA.
2. Center for Bioinformatics and Computational Biology, University of Maryland, College Park,
MD, USA.
3. Department of Developmental Biology and Cancer Research, Institute of Medical Research -
Israel-Canada, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel.
4. Section of Hematology/Oncology, Department of Medicine, The University of Chicago,
Chicago, IL, USA.
5. Laboratory of Pathology, National Cancer Institute (NCI), National Institutes of Health (NIH),
Bethesda, MD, USA.
6. Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), National
Institutes of Health (NIH), Bethesda, MD, USA.
* These authors contributed equally to this work as co-first authors.
# co-corresponding authors (nishanth.nair@nih.gov, kycheng@terpmail.umd.edu,
tabachy@gmail.com, eytan.ruppin@nih.gov)
bioRxiv preprint doi: https://doi.org/10.1101/2021.05.19.444895; this version posted May 21, 2021.
The copyright holder for this preprint (which was not certified by peer review) is the
author/funder. All rights reserved. No reuse allowed without permission.

ABSTRACT
Cancer is an evolutionarily conserved disease that occurs in a wide variety of species. We
applied a comparative genomics approach to systematically characterize the genes whose
conservation levels significantly correlate positively (PC) or negatively (NC) with a broad
spectrum of cancer-resistance estimates, computed across almost 200 vertebrate species. PC
genes are enriched in pathways relevant to tumor suppression including cell cycle, DNA repair,
and immune response, while NC genes are enriched with a host of metabolic pathways. The
conservation levels of the PC and NC genes in a species serve to build the first genomics-based
predictor of its cancer resistance score. We find that PC genes are less tolerant to loss of
function (LoF) mutations, are enriched in cancer driver genes and are associated with germline
mutations that increase human cancer risk. Furthermore, their expression levels are associated
with lifetime cancer risk across human tissues. Finally, their knockout in mice results in
increased cancer incidence. In sum, we find that many genes associated with cancer resistance
across species are implicated in human cancers, pointing to several additional candidate genes
that may have a functional role in human cancer.
INTRODUCTION
Animal species are known to have dramatic differences in their cancer rates and lifespans, and
several animals are considered cancer resistant while others are considered to be cancer prone
(Gorbunova et al. 2014; Albuquerque et al. 2018). Studying the genomic underpinnings of these
differences across various branches of life may provide insights into cancer development and
cancer prevention/treatment options in humans (Seluanov et al. 2018).
The multistage carcinogenesis model states that “individual cells become cancerous
after accumulating a specific number of mutational hits” (Seluanov et al. 2018; Nordling, 1953).
Based on this model, larger (and longer-living) animals are expected to have higher cancer
incidence as they have more stem cell divisions overall, resulting in a higher likelihood of
producing and propagating carcinogenic mutations. For humans, it has been shown that the
risks of cancer development across different tissue types are correlated with their
corresponding estimated number of lifetime stem cell divisions (Tomasetti et al. 2015 and
2017); consistent with that, human cancer risk is indeed correlated with body height (Khankari
et al. 2016). However, cancer risk does not correlate with body size across species, a
contradiction known as Peto’s paradox (Peto, 1947; Tollis et al., 2017; Seluanov et al. 2018).
For example, humans do not have higher cancer risk than mice despite having thousands of times
more cells (Lipman et al. 2004; Szymanska et al. 2014; Ikeno et al. 2009). More drastically, the
cancer-resistant bowhead whale (Keane et al., 2016) can weigh 100 tons, live for over 200 years
(George et al., 1999) and have millions of times more cells than mice. It follows that different
species must have evolved different cancer resistance mechanisms to fit their lifestyles,
modifying the “baseline” probability of malignant transformation determined by body size,
lifespan, and tissue stem cell division (see Supp. Note for a short review of such mechanisms).
Numerous studies have adopted comparative genomics approaches to understand the
evolution of cancer resistance mechanisms across mammals. Some have focused on known
human cancer genes and their homologs. For example, Vicens and Posada (2018) found that
genes related to DNA repair and T cell proliferation have evolved under positive selection in
mammals. Tollis et al. (2020) found that the number of paralogs of human cancer genes across
mammals is positively correlated with the species’ lifespan, but not body size. Vazquez and
Lynch (2021) reported wide-spread tumor suppressor gene (TSG) duplications across both large
and small Afrotherian species. Other studies focused on body size and longevity, yielding some
insights into Peto’s paradox. Kowalczyk et al. (2020) analyzed genes whose evolutionary rates
across mammals correlate with body size and lifespan and discovered cancer resistance-related
genes that are under increased evolutionary constraints in larger and longer-living mammals.
Ferris et al. (2018) identified regions with accelerated evolution in specific mammals, including
several cancer resistant species, which provided some insights on the cancer resistance
mechanisms they have developed.
Unlike previous studies that focused exclusively on mammals, here we perform a
comprehensive genome-wide comparative study aimed at identifying genes related to cancer
resistance across a wide range of vertebrate species. To this end, we estimated the protein
conservation scores across species including mammals, birds and fish, identifying genes whose
conservation levels are associated with cancer resistance estimates. We then use these cancer-
resistance associated genes to build the first genomics-based predictor of cancer resistance for
any species. We show that the biological processes associated with cancer resistance vary
across taxonomic groups (classes and orders of species), pointing to the diversity in the
evolutionary paths and mechanisms for resisting cancer. Finally, the genes identified from this
phylogenetic analysis are enriched for cancer driver genes and in genes associated with cancer
risk in humans. These results show that a comparative genomic approach can help identify
genes involved in human cancers.
RESULTS
Computing gene conservation and species cancer-resistance estimates
We computed a matrix (Tabach et al. Nature 2013; Tabach et al. MSB 2013) of gene
conservation scores (phylogenetic profiles) across 240 species for which we had phenotypic
information in the AnAge database (Tacutu et al. 2018) and sequence information from UniProt
(UniProt Consortium, 2021), RefSeq (O’Leary et al. 2016), Keane et al. (2015), and NCBI (Sayers
et al. 2021) databases. To do this, the protein sequence similarity between each gene in the
genome of a reference species and its orthologs in each of the rest of the species (termed
phylogenetic profiling; Pellegrini et al. 1999) was measured using the bit score computed with
BLASTP (Altschul et al. 1990). The BLASTP bit scores were normalized by their gene length
(Tabach et al. Nature 2013; Sherill-Rofe et al. 2019) and then rank-normalized across all genes
within each species to control for the evolutionary distance between the reference and each
species (Methods). These rank-normalized values range from 0 to 1, with higher values
corresponding to higher conservation levels. This method is termed rank-based phylogenetic
profiling. We primarily focused on the human as the reference species (Braun et al., 2020) as
we are interested in making our findings relevant to human cancers. However, we
demonstrated that our conclusions are robust to the choice of reference (Methods, Supp.
Note), largely because the normalization effectively removes dependency on phylogenetic
distance.
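
The rank-based phylogenetic profiling step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the nested-dictionary input layout, the average-rank handling of ties, and the choice to divide ranks by the number of genes (so scores fall in (0, 1]) are all assumptions made for illustration. Only the two steps themselves (length normalization, then within-species rank normalization) come from the text.

```python
def rank_normalize(values):
    """Map values to (0, 1] by rank; tied values share their average rank.

    Dividing by n is an illustrative choice, not taken from the paper.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return [r / n for r in ranks]


def conservation_scores(bit_scores, protein_lengths):
    """Hypothetical helper.

    bit_scores: {species: {gene: BLASTP bit score vs. the reference protein}}
    protein_lengths: {gene: length of the reference protein}
    Returns {species: {gene: rank-normalized conservation score}}.
    """
    scores = {}
    for species, per_gene in bit_scores.items():
        genes = sorted(per_gene)
        # Step 1: normalize each bit score by protein length.
        length_norm = [per_gene[g] / protein_lengths[g] for g in genes]
        # Step 2: rank-normalize across genes within this species.
        scores[species] = dict(zip(genes, rank_normalize(length_norm)))
    return scores
```

Because the ranks are computed separately within each species, a distant species with uniformly low bit scores still yields scores spread over the same range as a close one, which is the sense in which the normalization controls for evolutionary distance.
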
Since the cancer incidence rates of most species are largely unknown, we used two
proxy cancer-resistance estimates that have been proposed in the literature: MLTAW and
MLCAW. MLTAW assumes that the level of cancer resistance in a given species needs to roughly
counteract its risk of cancer development due to cell division, which is proportional to
ML^6 × AW, where ML denotes the species maximum longevity and AW denotes its adult weight
(Peto et al. 1977, 2015; Vazquez et al. 2021; Methods). MLCAW considers the well-established
correlation between lifespan and body weight (AW) across many species (Speakman, 2005) and
thus regresses out the species AW from its ML (Methods). We computed MLTAW and MLCAW
for 193 out of the 240 species for which both ML and AW data was publicly available (Table S1,
Methods). These 193 species are from multiple Vertebrata classes, including Mammalia
(mammals, n=108), Aves (birds, n=55), Teleostei (teleost fishes, n=18), and Reptilia (reptiles,
n=7).
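
The two estimates can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the log10 transformation, the use of ordinary least squares for "regressing out" AW, the units, and the function names are all assumptions; the text above specifies only that MLTAW is based on ML^6 × AW and that MLCAW is ML with AW regressed out.

```python
import math


def mltaw(ml_years, aw_grams):
    """log10(ML^6 * AW) per species; the log scale is an assumption."""
    return [6 * math.log10(ml) + math.log10(aw)
            for ml, aw in zip(ml_years, aw_grams)]


def mlcaw(ml_years, aw_grams):
    """Residuals of an ordinary least-squares fit of log10(ML) on log10(AW).

    A positive residual means the species lives longer than its body
    weight alone would predict under this (assumed) linear model.
    """
    x = [math.log10(aw) for aw in aw_grams]
    y = [math.log10(ml) for ml in ml_years]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
```

Under these assumptions, two species of equal adult weight always differ in MLCAW by exactly the difference in their log lifespans, whereas MLTAW rewards both long life and large size.
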
Genes associated with cancer resistance are enriched in cell cycle, DNA repair, immune
response, and different metabolic pathways
For each gene, we computed the Pearson correlation coefficient between its conservation
scores and the cancer-resistance estimates (MLTAW and MLCAW) across all species (Tables
S2A,B; Methods). We then computed the pathway enrichment of the positive and of the
negatively correlated genes (termed PC or NC genes, respectively) (Tables S3A,B; Methods). PC
genes correlated with either the MLCAW (Fig. 1) or MLTAW measures (Fig. S1) are enriched
for cell cycle, immune response, DNA repair, and transcription regulation pathways (FDR<0.1),
indicating that many genes in these pathways are more conserved in the relatively long-lived
cancer-resistant species. NC genes are enriched for a diverse range of metabolic pathways
(FDR<0.1, Figs. 1, S1).
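
The per-gene correlation and PC/NC labeling can be sketched as follows. The sketch makes several assumptions that are not taken from the paper: p-values come from the Fisher z-transform normal approximation (the paper does not state how its p-values were computed), multiple testing is handled with the Benjamini-Hochberg procedure, and the function names and input layout are hypothetical. Only the Pearson correlation and the FDR<0.1 threshold come from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value via the Fisher z-transform (an approximation)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_genes(conservation, resistance, fdr=0.1):
    """conservation: {gene: [score per species]}; resistance: [estimate per species].

    Returns {gene: "PC" | "NC" | "ns"} (positively correlated, negatively
    correlated, or not significant at the given FDR).
    """
    genes = sorted(conservation)
    n = len(resistance)
    rs = [pearson(conservation[g], resistance) for g in genes]
    qs = benjamini_hochberg([approx_p_value(r, n) for r in rs])
    labels = {}
    for g, r, q in zip(genes, rs, qs):
        labels[g] = ("PC" if r > 0 else "NC") if q < fdr else "ns"
    return labels
```

The species order in each conservation vector must match the order of the resistance estimates; a real analysis would also need to handle species with missing orthologs, which this sketch ignores.
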

Frequently Asked Questions (9)
Q1. What contributions have the authors mentioned in the paper "Cross-species identification of cancer-resistance associated genes uncovers their relevance to human cancer risk" ?

This paper applied a comparative genomics approach to systematically characterize the genes whose conservation levels significantly correlate positively (PC) or negatively (NC) with a broad spectrum of cancer-resistance estimates, computed across almost 200 vertebrate species.

Lastly, although the knockout mouse data validates the cancer-resistance function of many PC genes (Fig. 4E), further studies are required to test the roles of PC and NC genes (and the curated gene list in Table S8) in human carcinogenesis. Many of the genes identified are implicated in human cancers, and further study of these genes may improve the understanding of human cancer development, prevention and treatment.

The expression of PC genes in normal human tissues is associated with their lifetime cancer risk. As PC genes are enriched for human TSGs and oncogenes, they may also have roles in modulating human cancer risk.

The top PC-enriched pathways using the MLTAW measure, where both body size and lifespan are multiplication factors, are dominated by cell cycle regulation and transcription/RNA regulation (Fig. S1), suggesting a stronger role of tissue stem cell division. 

The evidence considered is whether a gene is: (a) a PC or NC gene (at FDR<0.1) in the all-species, mammals-only, or birds-only analysis using both estimates; (b) a human oncogene or tumor suppressor; (c) a gene whose knockout causes early cancer incidence or early cancer onset in mice; (d) a loss-of-function gene in CTVT; (e) a GWAS gene associated with human cancers; (f) an expressed mutated gene in the single-cell phylogeny of a mouse melanoma model.

PC genes are associated with cancer incidence in mice and canine transmissible venereal tumors. The authors investigated the relevance of PC and NC genes to cancer risk in other mammalian species.

Some of the manually prioritized genes the authors identified are currently under investigation for association to cancer risk, and their results may support greater consideration of their contribution to human cancer development. 

These results echo those of a recent study showing that cell cycle, DNA repair, NF-κB-related, and immunity pathways have higher evolutionary constraints in larger and longer-living mammals (Kowalczyk et al. 2020). 

The authors used either TSGs alone, or oncogenes alone, or TSGs combined with oncogenes to compute the CR score: (number of TSGs, or oncogenes, or both combined, with conservation score > MCS) / (total number of genes), where MCS is the median conservation score of all genes in a species.
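
A minimal Python sketch of this CR score follows. The excerpt does not say which genes the "total number of genes" covers, so the sketch reads it as the number of genes from the chosen set (TSGs, oncogenes, or both) that have a conservation score in the species; that reading, the function name, and the input layout are assumptions.

```python
from statistics import median


def cr_score(species_scores, gene_set):
    """Fraction of gene_set whose conservation score exceeds the species median.

    species_scores: {gene: conservation score} for ONE species, all genes.
    gene_set: iterable of gene names (TSGs, oncogenes, or both combined).
    The denominator (set members with a score) is an assumed reading of
    "total number of genes" in the source text.
    """
    mcs = median(species_scores.values())  # median conservation score (MCS)
    members = [g for g in gene_set if g in species_scores]
    if not members:
        raise ValueError("no gene in gene_set has a conservation score")
    above = sum(1 for g in members if species_scores[g] > mcs)
    return above / len(members)
```

For example, with scores `{"TP53": 0.9, "RB1": 0.8, "G1": 0.2, "G2": 0.4, "G3": 0.6}` the median is 0.6, so a gene set of `TP53`, `RB1` and `G2` has two of its three members above the median.
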