Journal ArticleDOI

Reconfiguring phosphorylation signaling by genetic polymorphisms affects cancer susceptibility

01 Jun 2015-Journal of Molecular Cell Biology (Oxford University Press)-Vol. 7, Iss: 3, pp 187-202
TL;DR: By systematically characterizing potential phosphorylation-related cancer mutations in 12 types of cancers, it was observed that both types of GVs preferentially occur in the known cancer genes, while a considerable number of phosphorylated proteins contain both phosSNPs and phosCMs.
Abstract: Large-scale sequencing has characterized an enormous number of genetic variations (GVs), and the functional analysis of GVs is fundamental to understanding differences in disease susceptibility and therapeutic response among and within populations. Using a combination of a sequence-based predictor with known phosphorylation and protein-protein interaction information, we computationally detected 9606 potential phosSNPs (phosphorylation-related single nucleotide polymorphisms), including 720 known, disease-associated SNPs that dramatically modify the human phosSNP-associated kinase-substrate network. Further analyses demonstrated that the proteins in the network are heavily associated in various signaling and cancer pathways, while cancer genes and drug targets are significantly enriched. We re-constructed four population-specific kinase-substrate networks and found that several inherited disease or cancer genes, such as IRS1, RAF1, and EGFR, were differentially regulated by phosSNPs. Thus, phosSNPs may influence disease susceptibility and be involved in cancer development by reconfiguring phosphorylation networks in different populations. Moreover, by systematically characterizing potential phosphorylation-related cancer mutations (phosCMs) in 12 types of cancers, we observed that both types of GVs preferentially occur in the known cancer genes, while a considerable number of phosphorylated proteins, especially those over-representing cancer genes, contain both phosSNPs and phosCMs. Furthermore, it was observed that phosSNPs were significantly enriched in amplification genes identified from breast cancers and tyrosine kinase circuits of lung cancers. Taken together, these results should prove helpful for further elucidation of the functional impacts of disease-associated SNPs.
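The idea of a phosphorylation-related SNP can be illustrated with a small sketch. This is not the authors' predictor: the labels, the `FLANK` window of 7 residues, and the `MissenseSNP` and `classify_snp` names are all hypothetical, chosen only to show how a missense substitution might be related to a known phosphorylation site (removing it, changing its residue, creating a candidate site, or altering nearby sequence).

```python
from dataclasses import dataclass

PHOSPHO_RESIDUES = {"S", "T", "Y"}  # serine, threonine, tyrosine
FLANK = 7  # assumed window (in residues) around a known site; illustrative only


@dataclass(frozen=True)
class MissenseSNP:
    position: int  # 1-based residue position in the protein
    ref: str       # reference amino acid (one-letter code)
    alt: str       # variant amino acid (one-letter code)


def classify_snp(sequence: str, known_sites: set[int], snp: MissenseSNP) -> str:
    """Return a coarse, hypothetical label for a missense SNP."""
    if sequence[snp.position - 1] != snp.ref:
        raise ValueError("reference residue does not match the sequence")
    if snp.position in known_sites:
        # The substituted residue is itself a known phosphorylation site.
        if snp.alt in PHOSPHO_RESIDUES:
            return "changes_site_residue"
        return "removes_site"
    if snp.ref not in PHOSPHO_RESIDUES and snp.alt in PHOSPHO_RESIDUES:
        # A new S/T/Y appears; whether it is phosphorylated is unknown.
        return "creates_candidate_site"
    if any(abs(snp.position - site) <= FLANK for site in known_sites):
        # The residue flanks a known site and may alter kinase recognition.
        return "alters_flanking_region"
    return "no_direct_effect"
```

A real analysis would replace these simple rules with a trained kinase-specific predictor and curated site annotations; the sketch only shows the kind of bookkeeping involved.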
Citations
Journal Article
TL;DR: In this paper, the coding exons of the family of 518 protein kinases were sequenced in 210 cancers of diverse histological types to explore the nature of the information that will be derived from cancer genome sequencing.
Abstract: AACR Centennial Conference: Translational Cancer Medicine, Nov 4-8, 2007; Singapore. PL02-05. All cancers are due to abnormalities in DNA. The availability of the human genome sequence has led to the proposal that resequencing of cancer genomes will reveal the full complement of somatic mutations and hence all the cancer genes. To explore the nature of the information that will be derived from cancer genome sequencing we have sequenced the coding exons of the family of 518 protein kinases, ~1.3Mb DNA per cancer sample, in 210 cancers of diverse histological types. Despite the screen being directed toward the coding regions of a gene family that has previously been strongly implicated in oncogenesis, the results indicate that the majority of somatic mutations detected are “passengers”. There is considerable variation in the number and pattern of these mutations between individual cancers, indicating substantial diversity of processes of molecular evolution between cancers. The imprints of exogenous mutagenic exposures, mutagenic treatment regimes and DNA repair defects can all be seen in the distinctive mutational signatures of individual cancers. This systematic mutation screen and others have previously yielded a number of cancer genes that are frequently mutated in one or more cancer types and which are now anticancer drug targets (for example BRAF, PIK3CA, and EGFR). However, detailed analyses of the data from our screen additionally suggest that there exist a large number of additional “driver” mutations which are distributed across a substantial number of genes. It therefore appears that cells may be able to utilise mutations in a large repertoire of potential cancer genes to acquire the neoplastic phenotype. However, many of these genes are employed only infrequently. These findings may have implications for future anticancer drug development.

2,737 citations

Journal ArticleDOI
TL;DR: A review of methods and computational tools developed during the past several years for detecting potential driver or actionable mutations in rapidly emerging precision cancer medicine may help investigators find an appropriate tool.
Abstract: Cancer is often driven by the accumulation of genetic alterations, including single nucleotide variants, small insertions or deletions, gene fusions, copy-number variations, and large chromosomal rearrangements. Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data and catalog somatic mutations in both common and rare cancer types. So far, the somatic mutation landscapes and signatures of >10 major cancer types have been reported; however, pinpointing driver mutations and cancer genes from millions of available cancer somatic mutations remains a monumental challenge. To tackle this important task, many methods and computational tools have been developed during the past several years and, thus, a review of its advances is urgently needed. Here, we first summarize the main features of these methods and tools for whole-exome, whole-genome and whole-transcriptome sequencing data. Then, we discuss major challenges like tumor intra-heterogeneity, tumor sample saturation and functionality of synonymous mutations in cancer, all of which may result in false-positive discoveries. Finally, we highlight new directions in studying regulatory roles of noncoding somatic mutations and quantitatively measuring circulating tumor DNA in cancer. This review may help investigators find an appropriate tool for detecting potential driver or actionable mutations in rapidly emerging precision cancer medicine.

123 citations

Journal ArticleDOI
TL;DR: Interpretation of genetic variation is needed for deciphering genotype-phenotype associations, mechanisms of inherited disease, and cancer driver mutations, and ActiveDriverDB is a comprehensive human proteo-genomics database that annotates disease mutations and population variants through the lens of PTMs.
Abstract: Interpretation of genetic variation is needed for deciphering genotype-phenotype associations, mechanisms of inherited disease, and cancer driver mutations. Millions of single nucleotide variants (SNVs) in human genomes are known and thousands are associated with disease. An estimated 21% of disease-associated amino acid substitutions corresponding to missense SNVs are located in protein sites of post-translational modifications (PTMs), chemical modifications of amino acids that extend protein function. ActiveDriverDB is a comprehensive human proteo-genomics database that annotates disease mutations and population variants through the lens of PTMs. We integrated >385,000 published PTM sites with ∼3.6 million substitutions from The Cancer Genome Atlas (TCGA), the ClinVar database of disease genes, and human genome sequencing projects. The database includes site-specific interaction networks of proteins, upstream enzymes such as kinases, and drugs targeting these enzymes. We also predicted network-rewiring impact of mutations by analyzing gains and losses of kinase-bound sequence motifs. ActiveDriverDB provides detailed visualization, filtering, browsing and searching options for studying PTM-associated mutations. Users can upload mutation datasets interactively and use our application programming interface in pipelines. Integrative analysis of mutations and PTMs may help decipher molecular mechanisms of phenotypes and disease, as exemplified by case studies of TP53, BRCA2 and VHL. The open-source database is available at https://www.ActiveDriverDB.org.

72 citations

Journal ArticleDOI
TL;DR: A curated database, dbPAF, containing known phosphorylation sites in H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. pombe and S. cerevisiae, is developed; the results are largely consistent with previous reports and suggest new features of phospho-regulation.
Abstract: Protein phosphorylation is one of the most important post-translational modifications (PTMs) and regulates a broad spectrum of biological processes. Recent progress in phosphoproteomic identification has generated a flood of phosphorylation sites, and integrating these sites is an urgent need. In this work, we developed a curated database, dbPAF, containing known phosphorylation sites in H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. pombe and S. cerevisiae. From the scientific literature and public databases, we collected and integrated a total of 54,148 phosphoproteins with 483,001 phosphorylation sites. Multiple options were provided for accessing the data, while original references and other annotations were also present for each phosphoprotein. Based on the new data set, we computationally detected significantly over-represented sequence motifs around phosphorylation sites, predicted potential kinases that are responsible for the modification of collected phospho-sites, and evolutionarily analyzed phosphorylation conservation states across different species. Besides being largely consistent with previous reports, our results also suggested new features of phospho-regulation. Taken together, our database can be useful for further analyses of protein phosphorylation in human and other model organisms. The dbPAF database was implemented in PHP + MySQL and is freely available at http://dbpaf.biocuckoo.org.

69 citations

Journal ArticleDOI
TL;DR: A hypothetical model on the tumorigenesis from premalignant stem cells to glioma is developed and it is shown that genomic alterations of PIK3CA, CDKN2A, CDK4, FIP1L1, or FUBP1 collaborate with IDH mutations to negatively affect patients' survival in LGG.
Abstract: Glioma is a complex disease with limited treatment options. Recent advances have identified isocitrate dehydrogenase (IDH) mutations in up to 80% of lower grade gliomas (LGG) and in 76% of secondary glioblastomas (GBM). IDH mutations are also seen in 10%-20% of acute myeloid leukemia (AML). In AML, it was determined that mutations of IDH and other genes involved in epigenetic regulation are early events, emerging in the pre-leukemic stem cell (pre-LSC) stage, whereas mutations in genes propagating oncogenic signals are late events in leukemia. IDH mutations are also early events in glioma, occurring before TP53 mutation, 1p/19q deletion, etc. Despite these advances in glioma research, studies into other molecular alterations have lagged considerably. In this study, we analyzed currently available databases. We identified EZH2, KMT2C, and CHD4 as important genes in glioma in addition to the known gene IDH1/2. We also showed that genomic alterations of PIK3CA, CDKN2A, CDK4, FIP1L1, or FUBP1 collaborate with IDH mutations to negatively affect patients' survival in LGG. In LGG patients with TP53 mutations or IDH1/2 mutations, additional genomic alterations of EZH2, KMT2C, and CHD4, individually or in combination, were associated with markedly shorter disease-free survival than in patients without such alterations. Alterations of EZH2, KMT2C, and CHD4 at the genetic or protein level could perturb the epigenetic program, leading to malignant transformation in glioma. By reviewing current literature on both AML and glioma and performing bioinformatics analysis on available datasets, we developed a hypothetical model of tumorigenesis from premalignant stem cells to glioma.

50 citations


Cites methods from "Reconfiguring phosphorylation signa..."

  • ...In the future, more advanced data mining techniques (Jiang et al., 2011, 2015; Zhang et al., 2014, 2015a, b, 2016; Jiang, 2015; Kim et al., 2015; Melamed et al., 2015; Tanaka and Ogishima, 2015; Wang et al., 2015; Xia et al., 2017b) will be integrated to continue this research....

    [...]

References
Journal ArticleDOI
TL;DR: Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
Abstract: Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

32,980 citations

Journal ArticleDOI
28 Oct 2010-Nature
TL;DR: The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. This paper presents the results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms.
Abstract: The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

7,538 citations

Journal ArticleDOI
TL;DR: The dbSNP database is a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, and is integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data.
Abstract: In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database [S.T.Sherry, M.Ward and K.Sirotkin (1999) Genome Res., 9, 677–679]. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP. The complete contents of dbSNP can also be downloaded in multiple formats via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/.

6,449 citations


"Reconfiguring phosphorylation signa..." refers methods in this paper

  • ...The human SNPs with map summaries were downloaded from the NCBI ftp server (dbSNP Build 141, RefSNP docsums in ASN.1 flat format) on September 17, 2014 (Sherry et al., 2001)....

    [...]

Journal ArticleDOI
29 Jun 2007-Cell
TL;DR: Those Akt substrates that are most likely to contribute to the diverse cellular roles of Akt, which include cell survival, growth, proliferation, angiogenesis, metabolism, and migration are discussed.

5,505 citations


"Reconfiguring phosphorylation signa..." refers background in this paper

  • ...The aberrant signaling of AKT was involved in a variety of complex diseases (Manning and Cantley, 2007)....

    [...]
