Author

Ya Zhang

Bio: Ya Zhang is an academic researcher from the University of Science and Technology of China. The author has contributed to research in topics: Cancer & Gene signature. The author has an h-index of 3 and has co-authored 3 publications receiving 103 citations.

Papers
Proceedings ArticleDOI
01 Oct 2012
TL;DR: An efficient feature selection method, the support vector machine-based recursive feature elimination (SVM-RFE) approach for gene selection and prognosis prediction, finds a 50-gene signature that is effective in predicting the prognoses of metastases and distinguishing patients who should receive adjuvant therapy.
Abstract: Breast cancer is a common disease in elderly women. With the development of microarray techniques, discovering gene signatures has become a powerful approach to predicting breast cancer survival. Previously, a 70-gene signature was discovered for breast cancer prognosis prediction and achieved good performance. In this study we adopted an efficient feature selection method, the support vector machine-based recursive feature elimination (SVM-RFE) approach, for gene selection and prognosis prediction. Using the leave-one-out evaluation procedure on a gene expression dataset of 295 breast cancer patients, we discovered a 50-gene signature that, combined with SVM, achieved superior prediction performance, with 34%, 48% and 3% improvements in Accuracy, Sensitivity and Specificity, respectively, compared with the widely used 70-gene signature. Further analysis shows that the 50-gene signature is effective in predicting the prognoses of metastases and distinguishing patients who should receive adjuvant therapy.

72 citations
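
The leave-one-out procedure and the three reported metrics can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy data, the function names, and the nearest-centroid classifier (a simple stand-in used here in place of an SVM so the example needs no external library) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Sequence[float]


def nearest_centroid(train_X: List[Sample], train_y: List[int], x: Sample) -> int:
    """Toy stand-in classifier (NOT an SVM): pick the class whose mean is closest to x."""
    best_label, best_dist = -1, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_predictions(
    X: List[Sample],
    y: List[int],
    fit_predict: Callable[[List[Sample], List[int], Sample], int],
) -> List[int]:
    """Hold out each sample once, train on the rest, and predict the held-out sample."""
    preds = []
    for i in range(len(X)):
        train_X = list(X[:i]) + list(X[i + 1:])
        train_y = list(y[:i]) + list(y[i + 1:])
        preds.append(fit_predict(train_X, train_y, X[i]))
    return preds


def accuracy_sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Binary metrics with label 1 as the positive class.

    Assumes both classes occur in y_true (otherwise a denominator is zero).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)  # fraction of true positives that were detected
    specificity = tn / (tn + fp)  # fraction of true negatives that were detected
    return accuracy, sensitivity, specificity


# Hypothetical one-feature dataset: 1 = poor prognosis, 0 = good prognosis.
X = [[0.0], [0.1], [0.2], [0.9], [1.0], [0.3]]
y = [0, 0, 0, 1, 1, 1]

preds = leave_one_out_predictions(X, y, nearest_centroid)
acc, sens, spec = accuracy_sensitivity_specificity(y, preds)
```

Because every sample is predicted by a model that never saw it, the three metrics are computed on held-out predictions only. Swapping `nearest_centroid` for a real SVM wrapper would leave the evaluation loop unchanged.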

Journal ArticleDOI
TL;DR: The goal is to establish an integrated model which could predict GBM prognosis with high accuracy by taking advantage of the minimum redundancy feature selection method (mRMR) and Multiple Kernel Machine (MKL) learning method.
Abstract: Glioblastoma multiforme (GBM) is a highly aggressive type of brain cancer with very low median survival. In order to predict the patient's prognosis, researchers have proposed rules to classify different glioma cancer cell subtypes. However, survival time within each GBM subtype often varies from one individual to another. Recent developments in gene testing have evolved the classic subtype rules into more specific classification rules based on single biomolecular features. These classification methods have been shown to perform better than traditional simple rules in GBM prognosis prediction. However, the real power behind the massive data has yet to be uncovered. We believe a combined prediction model based on more than one data type could perform better, which would contribute further to clinical treatment of GBM. The Cancer Genome Atlas (TCGA) database provides a huge dataset with various data types for many cancers, which enables us to inspect this aggressive cancer in a new way. In this research, we have improved GBM prognosis prediction accuracy further by taking advantage of the minimum redundancy feature selection method (mRMR) and Multiple Kernel Machine (MKL) learning method. Our goal is to establish an integrated model which could predict GBM prognosis with high accuracy.

62 citations
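
The idea of combining several data types through kernels can be sketched as follows. This is a simplified illustration, not the method from the paper: the two "views", the kernel choices, and the weights are hypothetical, and the weights are fixed by hand here, whereas a multiple kernel learning method would learn them from the data.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def linear_kernel(X: Sequence[Sequence[float]]) -> Matrix:
    """Gram matrix of dot products between all pairs of samples."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def rbf_kernel(X: Sequence[Sequence[float]], gamma: float) -> Matrix:
    """Gram matrix of exp(-gamma * squared Euclidean distance)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj))) for xj in X]
        for xi in X
    ]


def combine_kernels(kernels: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Convex combination of base kernels: K = sum_m w_m * K_m, w_m >= 0, sum w_m = 1."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Two hypothetical views of the same three patients (e.g. two molecular data types).
view_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
view_b = [[0.0], [1.0], [2.0]]

K_a = linear_kernel(view_a)
K_b = rbf_kernel(view_b, gamma=0.5)

# Hand-picked weights for illustration; an MKL method would optimize these.
combined = combine_kernels([K_a, K_b], [0.5, 0.5])
```

A non-negative weighted sum of valid kernels is itself a valid kernel, so `combined` could be passed to any kernel classifier that accepts a precomputed Gram matrix.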

Journal ArticleDOI
TL;DR: The research shows that HI-MKL is an accurate, robust, and generalized MKL method, which performs well in a GBM prognosis task, and a system is built that could predict GBM prognosis with high accuracy.
Abstract: Glioblastoma multiforme (GBM) is one of the most malignant brain tumors, with a very poor prognosis. To improve patients' clinical treatment and their quality of life after surgery, researchers have developed a large number of in silico models and tools for predicting GBM prognosis based on molecular datasets, with great success. However, pathology still plays the most critical role in cancer diagnosis and prognosis in the clinic at present. Recent advances in storing and processing histopathological images have drawn the attention of researchers. Models based on histopathological images have been developed, which show great potential for computer-aided pathological diagnoses. However, models based on both molecular data and histopathological images that could predict GBM prognosis with high accuracy do not yet exist. In our previous research, we used the simple MKL method to integrate multi-omics data and successfully improved GBM prognosis prediction. In this paper, we have developed a novel multiple kernel learning (MKL) method, named histopathological integrating multiple kernel learning (HI-MKL), that could integrate both histopathological images and multi-omics data efficiently. By using datasets from The Cancer Genome Atlas project, we have built a system that could predict the GBM prognosis with high accuracy. Our research shows that HI-MKL is an accurate, robust, and generalized MKL method, which performs well in a GBM prognosis task.

27 citations


Cited by
01 Jan 2013
TL;DR: In this article, the landscape of somatic genomic alterations based on multidimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs) was described, including several novel mutated genes as well as complex rearrangements of signature receptors, including EGFR and PDGFRA.
Abstract: We describe the landscape of somatic genomic alterations based on multidimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs). We identify several novel mutated genes as well as complex rearrangements of signature receptors, including EGFR and PDGFRA. TERT promoter mutations are shown to correlate with elevated mRNA expression, supporting a role in telomerase reactivation. Correlative analyses confirm that the survival advantage of the proneural subtype is conferred by the G-CIMP phenotype, and MGMT DNA methylation may be a predictive biomarker for treatment response only in classical subtype GBM. Integrative analysis of genomic and proteomic profiles challenges the notion of therapeutic inhibition of a pathway as an alternative to inhibition of the target itself. These data will facilitate the discovery of therapeutic and diagnostic target candidates, the validation of research and clinical observations and the generation of unanticipated hypotheses that can advance our molecular understanding of this lethal cancer.

2,616 citations

Journal ArticleDOI
TL;DR: Given the growing trend in the application of ML methods in cancer research, this work presents the most recent publications that employ these techniques to model cancer risk or patient outcomes.
Abstract: Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as it can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high or low risk groups has led many research teams, from the biomedical and the bioinformatics field, to study the application of machine learning (ML) methods. Therefore, these techniques have been utilized with the aim of modeling the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features from complex datasets reveals their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs), have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed in order for these methods to be considered in everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend in the application of ML methods in cancer research, we present here the most recent publications that employ these techniques to model cancer risk or patient outcomes.

1,991 citations

Journal ArticleDOI
TL;DR: It is shown that Bayesian models are able to use prior information and model measurements with various distributions, and a range of deep neural networks can be integrated in multi-modal learning for capturing the complex mechanism of biological systems.
Abstract: Driven by high-throughput sequencing techniques, modern genomic and clinical studies are in strong need of integrative machine learning models that make better use of vast volumes of heterogeneous information for the deep understanding of biological systems and the development of predictive models. How data from multiple sources (called multi-view data) are incorporated in a learning system is a key step for successful analysis. In this article, we provide a comprehensive review of omics and clinical data integration techniques, from a machine learning perspective, for various analyses such as prediction, clustering, dimension reduction and association. We show that Bayesian models are able to use prior information and model measurements with various distributions; tree-based methods can either build a tree with all features or collectively make a final decision based on trees learned from each view; kernel methods fuse the similarity matrices learned from individual views together for a final similarity matrix or learning model; network-based fusion methods are capable of inferring direct and indirect associations in a heterogeneous network; matrix factorization models have the potential to learn interactions among features from different views; and a range of deep neural networks can be integrated in multi-modal learning for capturing the complex mechanism of biological systems.

333 citations

Proceedings ArticleDOI
27 Feb 2018
TL;DR: In this paper, the authors adopt and incorporate CapsNets for the problem of brain tumor classification to design an improved architecture which maximizes the accuracy of the classification problem at hand.
Abstract: Brain tumor is considered one of the deadliest and most common forms of cancer both in children and in adults. Consequently, determining the correct type of brain tumor in early stages is of significant importance to devise a precise treatment plan and predict the patient's response to the adopted treatment. In this regard, there has been a recent surge of interest in designing Convolutional Neural Networks (CNNs) for the problem of brain tumor type classification. However, CNNs typically require large amounts of training data and cannot properly handle input transformations. Capsule networks (referred to as CapsNets) are brand new machine learning architectures proposed very recently to overcome these shortcomings of CNNs, and are poised to revolutionize deep learning solutions. Of particular interest to this work is that Capsule networks are robust to rotation and affine transformation, and require far less training data, which is the case for processing medical image datasets including brain Magnetic Resonance Imaging (MRI) images. In this paper, we aim to achieve the following four objectives: (i) Adopt and incorporate CapsNets for the problem of brain tumor classification to design an improved architecture which maximizes the accuracy of the classification problem at hand; (ii) Investigate the over-fitting problem of CapsNets based on a real set of MRI images; (iii) Explore whether or not CapsNets are capable of providing a better fit for the whole brain images or just the segmented tumor; and (iv) Develop a visualization paradigm for the output of the CapsNet to better explain the learned features. Our results show that the proposed approach can successfully outperform CNNs for the brain tumor classification problem.

304 citations