Journal ArticleDOI

Improve Glioblastoma Multiforme Prognosis Prediction by Using Feature Selection and Multiple Kernel Learning

TL;DR: The goal is to build an integrated model that predicts GBM prognosis with high accuracy by combining minimal-redundancy-maximal-relevance (mRMR) feature selection with multiple kernel learning (MKL).
Abstract: Glioblastoma multiforme (GBM) is a highly aggressive type of brain cancer with very low median survival. To predict patient prognosis, researchers have proposed rules for classifying glioma cell subtypes. However, survival time within GBM subtypes often varies because of differences between individual patients. Recent developments in gene testing have refined the classic subtype rules into more specific classification rules based on single biomolecular features. These classification methods have been shown to perform better than traditional simple rules in GBM prognosis prediction. However, the real power of the massive data available remains largely untapped. We believe that a combined prediction model based on more than one data type could perform better and contribute further to the clinical treatment of GBM. The Cancer Genome Atlas (TCGA) database provides a large dataset covering various data types for many cancers, which enables us to examine this aggressive cancer in a new way. In this research, we further improve GBM prognosis prediction accuracy by combining minimal-redundancy-maximal-relevance (mRMR) feature selection with multiple kernel learning (MKL). Our goal is to establish an integrated model that predicts GBM prognosis with high accuracy.
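To make the idea of combining several data types through kernels more concrete, here is a minimal sketch; it is not the authors' implementation. It assumes two hypothetical data views (`expression` and `methylation`) for four toy patients, builds one RBF kernel per view, and fuses them with nonnegative weights that sum to one. The weights come from kernel-target alignment, a simple heuristic swapped in for illustration; a real MKL solver would learn the weights jointly with the classifier. All names, values, and the `gamma` setting are assumptions.

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) for one data view."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def alignment(K, y):
    """Kernel-target alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F), labels in {-1, +1}."""
    n = len(y)
    num = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    return num / (norm_k * n)  # ||yy^T||_F equals n for +/-1 labels

def fuse_kernels(kernels, y):
    """Weight each view by its (clipped) alignment and return the weighted kernel sum."""
    scores = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0.0:
        weights = [1.0 / len(kernels)] * len(kernels)  # fall back to uniform weights
    else:
        weights = [s / total for s in scores]
    n = len(y)
    fused = [[sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
             for i in range(n)]
    return weights, fused

# Hypothetical toy data: the first view separates the classes, the second does not.
y = [1, 1, -1, -1]
expression = [[0.0], [0.1], [1.0], [1.1]]
methylation = [[0.0], [1.0], [0.1], [1.1]]

kernels = [rbf_kernel(expression, gamma=1.0), rbf_kernel(methylation, gamma=1.0)]
weights, fused = fuse_kernels(kernels, y)
print("view weights:", weights)
```

The fused matrix could then be handed to any kernel classifier that accepts a precomputed Gram matrix.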
Citations
01 Jan 2013
TL;DR: In this article, the landscape of somatic genomic alterations based on multidimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs) was described, including several novel mutated genes as well as complex rearrangements of signature receptors, including EGFR and PDGFRA.
Abstract: We describe the landscape of somatic genomic alterations based on multidimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs). We identify several novel mutated genes as well as complex rearrangements of signature receptors, including EGFR and PDGFRA. TERT promoter mutations are shown to correlate with elevated mRNA expression, supporting a role in telomerase reactivation. Correlative analyses confirm that the survival advantage of the proneural subtype is conferred by the G-CIMP phenotype, and MGMT DNA methylation may be a predictive biomarker for treatment response only in classical subtype GBM. Integrative analysis of genomic and proteomic profiles challenges the notion of therapeutic inhibition of a pathway as an alternative to inhibition of the target itself. These data will facilitate the discovery of therapeutic and diagnostic target candidates, the validation of research and clinical observations and the generation of unanticipated hypotheses that can advance our molecular understanding of this lethal cancer.

2,616 citations

Journal ArticleDOI
TL;DR: It is shown that Bayesian models are able to use prior information and model measurements with various distributions, and a range of deep neural networks can be integrated in multi-modal learning for capturing the complex mechanism of biological systems.
Abstract: Driven by high-throughput sequencing techniques, modern genomic and clinical studies are in a strong need of integrative machine learning models for better use of vast volumes of heterogeneous information in the deep understanding of biological systems and the development of predictive models. How data from multiple sources (called multi-view data) are incorporated in a learning system is a key step for successful analysis. In this article, we provide a comprehensive review on omics and clinical data integration techniques, from a machine learning perspective, for various analyses such as prediction, clustering, dimension reduction and association. We shall show that Bayesian models are able to use prior information and model measurements with various distributions; tree-based methods can either build a tree with all features or collectively make a final decision based on trees learned from each view; kernel methods fuse the similarity matrices learned from individual views together for a final similarity matrix or learning model; network-based fusion methods are capable of inferring direct and indirect associations in a heterogeneous network; matrix factorization models have potential to learn interactions among features from different views; and a range of deep neural networks can be integrated in multi-modal learning for capturing the complex mechanism of biological systems.

333 citations


Cites methods from "Improve Glioblastoma Multiforme Pro..."

  • ...SimpleMKL-based [74] multiple kernel learning has been applied in [75] to predict cancer prognosis by using gene expres-...

    [...]

Proceedings ArticleDOI
27 Feb 2018
TL;DR: In this paper, the authors adopt and incorporate CapsNets for the problem of brain tumor classification to design an improved architecture which maximizes the accuracy of the classification problem at hand.
Abstract: Brain tumors are considered among the deadliest and most common forms of cancer in both children and adults. Consequently, determining the correct type of brain tumor at an early stage is of significant importance for devising a precise treatment plan and predicting the patient's response to the adopted treatment. In this regard, there has been a recent surge of interest in designing Convolutional Neural Networks (CNNs) for the problem of brain tumor type classification. However, CNNs typically require large amounts of training data and cannot properly handle input transformations. Capsule networks (referred to as CapsNets) are new machine learning architectures proposed very recently to overcome these shortcomings of CNNs, and are poised to revolutionize deep learning solutions. Of particular interest to this work is that Capsule networks are robust to rotation and affine transformation and require far less training data, which suits medical image datasets such as brain Magnetic Resonance Imaging (MRI) images, where training data are limited. In this paper, we pursue the following four objectives: (i) Adopt and incorporate CapsNets for the problem of brain tumor classification to design an improved architecture which maximizes the accuracy of the classification problem at hand; (ii) Investigate the over-fitting problem of CapsNets based on a real set of MRI images; (iii) Explore whether CapsNets provide a better fit for whole-brain images or for just the segmented tumor; and (iv) Develop a visualization paradigm for the output of the CapsNet to better explain the learned features. Our results show that the proposed approach can successfully outperform CNNs for the brain tumor classification problem.

304 citations

Journal ArticleDOI
28 Jan 2019-Genes
TL;DR: This review discusses state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Abstract: Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

185 citations


Cites background or methods from "Improve Glioblastoma Multiforme Pro..."

  • ...For improving drug sensitivity in breast cancer, genomics, epigenomics, and proteomics, data were integrated using a multiview multiple kernel learning (MKL) approach [25]....

    [...]

  • ...For example, Glioblastoma Multiforme is a highly aggressive type of brain cancer whose prognostic prediction can be improved by considering multiple data types together [77], i....

    [...]

  • ...Moreover, scalable MKL methods like dual-layer kernel extreme learning machine (DKELM) [198] and easyMKL [199] can be employed in multi-omics integrative analysis since MKL, a popular approach for integrating multiple omics datasets, can be computationally very expensive for large datasets....

    [...]

  • ...Multiple kernel learning (MKL) [82] has become a popular approach to integrate data by calculating individual kernel matrices for each data type and fusing them into a global model....

    [...]

  • ...For example, a feed-forward neural network with multiple hidden layers can now be trained to accurately differentiate non-coding RNA types, i.e., circular RNAs (cirRNAs) from long non-coding RNAs (lncRNAs) in just a few hours on a single computer while the MKL method would take four days [182]....

    [...]

Journal ArticleDOI
TL;DR: This study proposes a Multimodal Deep Neural Network by integrating Multi-dimensional Data (MDNNMD) for the prognosis prediction of breast cancer and shows that the proposed method achieves a better performance than the prediction methods with single-dimensional data and other existing approaches.
Abstract: Breast cancer is a highly aggressive type of cancer with very low median survival. Accurate prognosis prediction of breast cancer can spare a significant number of patients from receiving unnecessary adjuvant systemic treatment and its related expensive medical costs. Previous work relies mostly on selected gene expression data to create a predictive model. The emergence of deep learning methods and multi-dimensional data offers opportunities for more comprehensive analysis of the molecular characteristics of breast cancer and therefore can improve diagnosis, treatment, and prevention. In this study, we propose a Multimodal Deep Neural Network by integrating Multi-dimensional Data (MDNNMD) for the prognosis prediction of breast cancer. The novelty of the method lies in the design of our method's architecture and the fusion of multi-dimensional data. The comprehensive performance evaluation results show that the proposed method achieves a better performance than the prediction methods with single-dimensional data and other existing approaches. The source code implemented by TensorFlow 1.0 deep learning library can be downloaded from the Github: https://github.com/USTC-HIlab/MDNNMD.

174 citations


Cites background or methods from "Improve Glioblastoma Multiforme Pro..."

  • ...propose a multiple kernel machine learning method by fusing different types of data for the GBM prognosis prediction [16]....

    [...]

  • ...To further demonstrate the predictive results of the multi-dimensional data in assessing the risk of developing distant metastases in breast cancer patients, survival data analyses of the proposed method is also performed according to previous studies [16], [51], [52], the Kaplan-Meier curve is plotted and shown in Fig....

    [...]

  • ...The optimal parameters are chosen by the parameter combination leading to the best performance (AUC value) [16], [44]....

    [...]

  • ...mRMR [36] is one of the most common dimensionality reduction algorithm in a wide range of applications [16], [37], [38]....

    [...]

  • ...To comprehensively evaluate our proposed method, we use ten-fold cross validation experiment in consistent with previous existing studies of cancer prognosis prediction [16], [48]....

    [...]
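The excerpts above mention choosing parameters by the best AUC and evaluating with ten-fold cross-validation. The sketch below shows one way these two steps can fit together; it does not reproduce any cited paper's pipeline. The synthetic one-dimensional data, the `gamma` grid, and the simple Parzen-window scorer standing in for the real model are all assumptions made for illustration. Out-of-fold scores are pooled and a single AUC is computed per parameter value.

```python
import math
import random

def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 once and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def parzen_score(x, train_X, train_y, gamma):
    """Stand-in model: RBF-weighted vote of training samples (positive minus negative)."""
    score = 0.0
    for xi, yi in zip(train_X, train_y):
        w = math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))
        score += w if yi == 1 else -w
    return score

def cv_auc(X, y, gamma, k=10):
    """Score every sample using only the other folds, then compute one pooled AUC."""
    scores = [0.0] * len(X)
    for fold in kfold_indices(len(X), k):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        for i in fold:
            scores[i] = parzen_score(X[i], train_X, train_y, gamma)
    return auc(scores, y)

# Synthetic data (assumption): 20 positives around 1.0, 20 negatives around 0.0.
rng = random.Random(1)
X = [[rng.gauss(1.0, 0.5)] for _ in range(20)] + [[rng.gauss(0.0, 0.5)] for _ in range(20)]
y = [1] * 20 + [0] * 20

grid = [0.01, 0.1, 1.0, 10.0]  # hypothetical candidate values
results = {gamma: cv_auc(X, y, gamma) for gamma in grid}
best_gamma = max(results, key=results.get)
print("AUC per gamma:", results, "selected:", best_gamma)
```

Selecting a parameter and reporting performance on the same folds tends to give an optimistic estimate; a nested cross-validation loop is a common remedy.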

References
Journal ArticleDOI
TL;DR: The addition of temozolomide to radiotherapy for newly diagnosed glioblastoma resulted in a clinically meaningful and statistically significant survival benefit with minimal additional toxicity.
Abstract: Methods: Patients with newly diagnosed, histologically confirmed glioblastoma were randomly assigned to receive radiotherapy alone (fractionated focal irradiation in daily fractions of 2 Gy given 5 days per week for 6 weeks, for a total of 60 Gy) or radiotherapy plus continuous daily temozolomide (75 mg per square meter of body-surface area per day, 7 days per week from the first to the last day of radiotherapy), followed by six cycles of adjuvant temozolomide (150 to 200 mg per square meter for 5 days during each 28-day cycle). The primary end point was overall survival. Results: A total of 573 patients from 85 centers underwent randomization. The median age was 56 years, and 84 percent of patients had undergone debulking surgery. At a median follow-up of 28 months, the median survival was 14.6 months with radiotherapy plus temozolomide and 12.1 months with radiotherapy alone. The unadjusted hazard ratio for death in the radiotherapy-plus-temozolomide group was 0.63 (95 percent confidence interval, 0.52 to 0.75; P<0.001 by the log-rank test). The two-year survival rate was 26.5 percent with radiotherapy plus temozolomide and 10.4 percent with radiotherapy alone. Concomitant treatment with radiotherapy plus temozolomide resulted in grade 3 or 4 hematologic toxic effects in 7 percent of patients.
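Quantities such as median survival and a survival rate at a fixed time point are commonly read off a Kaplan-Meier curve. The abstract does not say how its figures were estimated, so the following is only a sketch of that estimator under this assumption. The follow-up times and event flags are invented toy values, not trial data.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time; events: 1 = death, 0 = censored."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, e in data if tt == t)  # deaths plus censored at time t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve

def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Invented follow-up times in months; 0 marks a censored patient.
times = [3, 5, 5, 8, 12, 14, 20, 24]
events = [1, 1, 0, 1, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2} months, S(t) = {s:.3f}")
print("median survival:", median_survival(curve))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a plain fraction of patients still alive.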

16,653 citations


"Improve Glioblastoma Multiforme Pro..." refers background in this paper

  • ...Recent development in gene testing has evolved classic subtype rules to more specific classification rules based on single biomolecular features....

    [...]

Journal ArticleDOI
TL;DR: The fourth edition of the World Health Organization (WHO) classification of tumours of the central nervous system, published in 2007, lists several new entities, including angiocentric glioma, papillary glioneuronal tumour, rosette-forming glioneuronal tumour of the fourth ventricle, papillary tumour of the pineal region, pituicytoma and spindle cell oncocytoma of the adenohypophysis.
Abstract: The fourth edition of the World Health Organization (WHO) classification of tumours of the central nervous system, published in 2007, lists several new entities, including angiocentric glioma, papillary glioneuronal tumour, rosette-forming glioneuronal tumour of the fourth ventricle, papillary tumour of the pineal region, pituicytoma and spindle cell oncocytoma of the adenohypophysis. Histological variants were added if there was evidence of a different age distribution, location, genetic profile or clinical behaviour; these included pilomyxoid astrocytoma, anaplastic medulloblastoma and medulloblastoma with extensive nodularity. The WHO grading scheme and the sections on genetic profiles were updated and the rhabdoid tumour predisposition syndrome was added to the list of familial tumour syndromes typically involving the nervous system. As in the previous, 2000 edition of the WHO ‘Blue Book’, the classification is accompanied by a concise commentary on clinico-pathological characteristics of each tumour type. The 2007 WHO classification is based on the consensus of an international Working Group of 25 pathologists and geneticists, as well as contributions from more than 70 international experts overall, and is presented as the standard for the definition of brain tumours to the clinical oncology and cancer research communities world-wide.

13,134 citations


"Improve Glioblastoma Multiforme Pro..." refers background in this paper

  • ...However, the real power behind the massive data is still under covered....

    [...]

Journal ArticleDOI
TL;DR: This annual American Cancer Society report presents the most recent data on cancer incidence, mortality, and survival in the United States, and notes that overall incidence rates decreased largely because of declines in the three major cancer sites in men (lung, prostate, and colon and rectum [colorectum]) and in two major cancer sites in women (breast and colorectum).
Abstract: Each year, the American Cancer Society estimates the number of new cancer cases and deaths expected in the United States in the current year and compiles the most recent data on cancer incidence, mortality, and survival based on incidence data from the National Cancer Institute, Centers for Disease Control and Prevention, and the North American Association of Central Cancer Registries and mortality data from the National Center for Health Statistics. Incidence and death rates are standardized by age to the 2000 United States standard million population. A total of 1,479,350 new cancer cases and 562,340 deaths from cancer are projected to occur in the United States in 2009. Overall cancer incidence rates decreased in the most recent time period in both men (1.8% per year from 2001 to 2005) and women (0.6% per year from 1998 to 2005), largely because of decreases in the three major cancer sites in men (lung, prostate, and colon and rectum [colorectum]) and in two major cancer sites in women (breast and colorectum). Overall cancer death rates decreased in men by 19.2% between 1990 and 2005, with decreases in lung (37%), prostate (24%), and colorectal (17%) cancer rates accounting for nearly 80% of the total decrease. Among women, overall cancer death rates between 1991 and 2005 decreased by 11.4%, with decreases in breast (37%) and colorectal (24%) cancer rates accounting for 60% of the total decrease. The reduction in the overall cancer death rates has resulted in the avoidance of about 650,000 deaths from cancer over the 15-year period. This report also examines cancer incidence, mortality, and survival by site, sex, race/ethnicity, education, geographic area, and calendar year. Although progress has been made in reducing incidence and mortality rates and improving survival, cancer still accounts for more deaths than heart disease in persons younger than 85 years of age. Further progress can be accelerated by applying existing cancer control knowledge across all segments of the population and by supporting new discoveries in cancer prevention, early detection, and treatment.

9,129 citations

Journal ArticleDOI
TL;DR: This article studies how to select good features according to the maximal statistical dependency criterion based on mutual information. Because the maximal dependency condition is difficult to implement directly, it derives an equivalent form, the minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection.
Abstract: Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion based on mutual information. Because of the difficulty in directly implementing the maximal dependency condition, we first derive an equivalent form, called minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection. Then, we present a two-stage feature selection algorithm by combining mRMR and other more sophisticated feature selectors (e.g., wrappers). This allows us to select a compact set of superior features at very low cost. We perform extensive experimental comparison of our algorithm and other methods using three different classifiers (naive Bayes, support vector machine, and linear discriminant analysis) and four different data sets (handwritten digits, arrhythmia, NCI cancer cell lines, and lymphoma tissues). The results confirm that mRMR leads to promising improvement on feature selection and classification accuracy.
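The first-order incremental selection described in this abstract can be sketched for discrete features as follows. This is an illustrative sketch only: it uses the difference form of the criterion (relevance minus mean redundancy with the features already chosen), and the feature names and toy values are invented. The two-stage wrapper step from the abstract is not included.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """I(A; B) in bits for two equal-length sequences of discrete values."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log2(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def mrmr(features, target, k):
    """Greedily pick k feature names maximizing relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    selected = []
    remaining = list(features)

    def score(name):
        if not selected:
            return relevance[name]  # first pick: most relevant feature
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Invented toy data: g1_copy duplicates g1, g2 is weaker but adds new information.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "g1":      [0, 0, 0, 1, 1, 1, 1, 1],
    "g1_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "g2":      [0, 0, 1, 0, 1, 1, 0, 1],
    "noise":   [0, 1, 0, 1, 0, 1, 0, 1],
}

selected = mrmr(features, target, k=2)
by_relevance = sorted(features, key=lambda f: mutual_information(features[f], target),
                      reverse=True)[:2]
print("mRMR picks:", selected)
print("relevance-only picks:", by_relevance)
```

On this toy data, ranking by relevance alone keeps both copies of the same feature, while the redundancy penalty replaces the duplicate with the weaker but complementary `g2`.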

8,078 citations

05 Aug 2003
TL;DR: This work derives an equivalent form, called minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection, and presents a two-stage feature selection algorithm by combining mRMR and other more sophisticated feature selectors (e.g., wrappers).

7,075 citations