Author

Yue Joseph Wang

Bio: Yue Joseph Wang is an academic researcher from Virginia Tech. The author has contributed to research in topics: Image registration & Visualization. The author has an h-index of 16 and has co-authored 47 publications receiving 599 citations. Previous affiliations of Yue Joseph Wang include Children's National Medical Center & The Catholic University of America.

Papers
Journal ArticleDOI
TL;DR: A network-based approach for cancer biomarker identification, netSVM, is developed, resulting in improved prediction performance with network biomarkers and several novel hub genes, which may provide new insight into the underlying mechanism of breast cancer metastasis.
Abstract: Background One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and individual genes' discriminatory power, often fail to identify biologically meaningful biomarkers thus resulting in poor prediction performance across data sets. Nonetheless, the variables in multivariable classifiers should synergistically interact to produce more effective classifiers than individual biomarkers.

87 citations

Journal ArticleDOI
TL;DR: A novel unsupervised deconvolution method, within a well-grounded mathematical framework, to dissect mixed gene expressions in heterogeneous tumor samples, and results obtained suggest not only the existence of cell-specific MGs but also UNDO's ability to detect them blindly and correctly.
Abstract: Summary: We develop a novel unsupervised deconvolution method, within a well-grounded mathematical framework, to dissect mixed gene expressions in heterogeneous tumor samples. We implement an R package, UNsupervised DecOnvolution (UNDO), that can be used to automatically detect cell-specific marker genes (MGs) located on the scatter radii of mixed gene expressions, estimate cellular proportions in each sample and deconvolute mixed expressions into cell-specific expression profiles. We demonstrate the performance of UNDO over a wide range of tumor–stroma mixing proportions, validate UNDO on various biologically mixed benchmark gene expression datasets and further estimate tumor purity in TCGA/CPTAC datasets. The highly accurate deconvolution results obtained suggest not only the existence of cell-specific MGs but also UNDO’s ability to detect them blindly and correctly. Although the principal application here involves microarray gene expressions, our methodology can be readily applied to other types of quantitative molecular profiling data. Availability and implementation: UNDO is available at http://bioconductor.org/packages. Contact: yuewang@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online.
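
The deconvolution step described in this abstract can be illustrated with a minimal Python sketch. It assumes a two-cell-type linear mixing model in which each mixed sample is a weighted sum of a tumor profile and a stroma profile, and it assumes the mixing proportions are already known. UNDO itself estimates those proportions blindly from marker genes, which this sketch does not attempt. The function name `unmix_two_sources` and all numbers are hypothetical and are not part of the UNDO package.

```python
def unmix_two_sources(mixed, proportions):
    """Recover two cell-specific profiles from two mixed samples.

    Assumed model (not taken from the UNDO source code):
        mixed[0][g] = a * s1[g] + b * s2[g]
        mixed[1][g] = c * s1[g] + d * s2[g]
    where proportions = [[a, b], [c, d]] are known mixing fractions.
    """
    (a, b), (c, d) = proportions
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("both samples have the same proportions; cannot unmix")
    x1, x2 = mixed
    # Apply the inverse of the 2x2 mixing matrix gene by gene.
    source1 = [(d * u - b * v) / det for u, v in zip(x1, x2)]
    source2 = [(-c * u + a * v) / det for u, v in zip(x1, x2)]
    return source1, source2


# Hypothetical data: three genes, two cell types, two mixed samples.
tumor = [10.0, 0.0, 5.0]
stroma = [0.0, 8.0, 5.0]
proportions = [[0.7, 0.3], [0.2, 0.8]]
mixed = [
    [p_t * t + p_s * s for t, s in zip(tumor, stroma)]
    for p_t, p_s in proportions
]
est_tumor, est_stroma = unmix_two_sources(mixed, proportions)
```

With exact, noise-free inputs the estimated profiles match the originals up to floating-point error; real expression data would need a noise-tolerant estimator.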

57 citations

Journal ArticleDOI
TL;DR: This work has developed a Cytoscape plug-in, CytoDDN, to integrate network analysis and visualization seamlessly and serves as a useful systems biology tool for users across biomedical research communities to infer how genetic, epigenetic or environment variables may affect biological networks and clinical phenotypes.
Abstract: Summary: Differential dependency network (DDN) is a caBIG® (cancer Biomedical Informatics Grid) analytical tool for detecting and visualizing statistically significant topological changes in transcriptional networks representing two biological conditions. Developed under caBIG®'s In Silico Research Centers of Excellence (ISRCE) Program, DDN enables differential network analysis and provides an alternative way for defining network biomarkers predictive of phenotypes. DDN also serves as a useful systems biology tool for users across biomedical research communities to infer how genetic, epigenetic or environment variables may affect biological networks and clinical phenotypes. Besides the standalone Java application, we have also developed a Cytoscape plug-in, CytoDDN, to integrate network analysis and visualization seamlessly. Availability: The Java and MATLAB source code can be downloaded at the authors' web site http://www.cbil.ece.vt.edu/software.htm Contact: yuewang@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online.

44 citations

Journal ArticleDOI
Guoqiang Yu, Bai Zhang, G. Steven Bova, Jianfeng Xu, Ie-Ming Shih, Yue Joseph Wang
TL;DR: A statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells is reported.
Abstract: Motivation: Identification of somatic DNA copy number alterations (CNAs) and significant consensus events (SCEs) in cancer genomes is a main task in discovering potential cancer-driving genes such as oncogenes and tumor suppressors. The recent development of SNP array technology has facilitated studies on copy number changes at a genome-wide scale with high resolution. However, existing copy number analysis methods are oblivious to normal cell contamination and cannot distinguish between contributions of cancerous and normal cells to the measured copy number signals. This contamination could significantly confound downstream analysis of CNAs and affect the power to detect SCEs in clinical samples. Results: We report here a statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells. We tested the proposed method on two simulated datasets, two prostate cancer datasets and The Cancer Genome Atlas high-grade ovarian dataset, and obtained very promising results supported by the ground truth and biological plausibility. Moreover, based on a large number of comparative simulation studies, the proposed method gives significantly improved power to detect SCEs after in silico correction of normal tissue contamination. We develop a cross-platform open-source Java application that implements the whole pipeline of copy number analysis of heterogeneous cancer tissues including relevant processing steps. We also provide an R interface, bacomR, for running BACOM within the R environment, making it straightforward to include in existing data pipelines. Availability: The cross-platform, stand-alone Java application, BACOM, the R interface, bacomR, all source code and the simulation data used in this article are freely available at authors' web site: http://www.cbil.ece.vt.edu/software.htm. 
Contact: yuewang@vt.edu Supplementary Information: Supplementary data are available at Bioinformatics online.
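
The idea of correcting for normal tissue contamination can be illustrated with a short Python sketch. It assumes a simple linear mixture in which the measured copy number is a weighted average of normal cells (two copies) and cancer cells, and it assumes the normal-cell fraction is already known. BACOM estimates that fraction with a Bayesian model, which is not reproduced here. The function name `correct_copy_number` and the example values are hypothetical.

```python
def correct_copy_number(observed, normal_fraction, normal_copies=2.0):
    """Recover the cancer-cell copy number from a contaminated measurement.

    Assumed model (a simplification, not the BACOM implementation):
        observed = normal_fraction * normal_copies
                   + (1 - normal_fraction) * tumor_copies
    """
    if not 0.0 <= normal_fraction < 1.0:
        raise ValueError("normal_fraction must be in [0, 1)")
    return (observed - normal_fraction * normal_copies) / (1.0 - normal_fraction)


# Hypothetical sample: 40% normal cells, hemizygous deletion in the tumor.
alpha = 0.4
true_tumor_copies = 1.0
observed = alpha * 2.0 + (1.0 - alpha) * true_tumor_copies
recovered = correct_copy_number(observed, alpha)
```

In this example the contaminated signal sits between one and two copies, and the correction returns a value close to the one-copy state of the cancer cells.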

40 citations

Journal ArticleDOI
TL;DR: In this article, a motif-directed NCA (mNCA) is proposed that integrates motif information and gene expression data to infer regulatory networks, making it applicable to the many biological studies in which ChIP-on-chip data are lacking.
Abstract: Background Network Component Analysis (NCA) has shown its effectiveness in discovering regulators and inferring transcription factor activities (TFAs) when both microarray data and ChIP-on-chip data are available. However, an NCA scheme is not applicable to many biological studies because of the limited topology information available, such as a lack of ChIP-on-chip data. We propose a new approach, motif-directed NCA (mNCA), to integrate motif information and gene expression data to infer regulatory networks.

32 citations


Cited by
Journal ArticleDOI
28 Jul 2016-Cell
TL;DR: A view of how the somatic genome drives the cancer proteome and associations between protein and post-translational modification levels and clinical outcomes in HGSC is provided.

728 citations

Journal ArticleDOI
TL;DR: The recent progress of SVMs in cancer genomic studies is reviewed, and the strengths of SVM learning and its future prospects in cancer genomic applications are examined.
Abstract: Machine learning with maximization (support) of separating margin (vector), called support vector machine (SVM) learning, is a powerful classification tool that has been used for cancer genomic classification or subtyping. Today, as advancements in high-throughput technologies lead to production of large amounts of genomic and epigenomic data, the classification feature of SVMs is expanding its use in cancer genomics, leading to the discovery of new biomarkers, new drug targets, and a better understanding of cancer driver genes. Herein we reviewed the recent progress of SVMs in cancer genomic studies. We intend to comprehend the strength of the SVM learning and its future perspective in cancer genomic applications.

635 citations

Journal ArticleDOI
TL;DR: It is shown through a high-resolution genome-wide single nucleotide polymorphism and copy number survey that most, if not all, metastatic prostate cancers have monoclonal origins and maintain a unique signature copy number pattern of the parent cancer cell while also accumulating a variable number of separate subclonally sustained changes.
Abstract: Many studies have shown that primary prostate cancers are multifocal and are composed of multiple genetically distinct cancer cell clones. Whether or not multiclonal primary prostate cancers typically give rise to multiclonal or monoclonal prostate cancer metastases is largely unknown, although studies at single chromosomal loci are consistent with the latter case. Here we show through a high-resolution genome-wide single nucleotide polymorphism and copy number survey that most, if not all, metastatic prostate cancers have monoclonal origins and maintain a unique signature copy number pattern of the parent cancer cell while also accumulating a variable number of separate subclonally sustained changes. We find no relationship between anatomic site of metastasis and genomic copy number change pattern. Taken together with past animal and cytogenetic studies of metastasis and recent single-locus genetic data in prostate and other metastatic cancers, these data indicate that despite common genomic heterogeneity in primary cancers, most metastatic cancers arise from a single precursor cancer cell. This study establishes that genomic archeology of multiple anatomically separate metastatic cancers in individuals can be used to define the salient genomic features of a parent cancer clone of proven lethal metastatic phenotype.

631 citations

Journal ArticleDOI
TL;DR: This work classifies such integrative approaches into four broad categories, describes their bioinformatic principles and review their applications.
Abstract: A central goal of systems biology is to elucidate the structural and functional architecture of the cell. To this end, large and complex networks of molecular interactions are being rapidly generated for humans and model organisms. A recent focus of bioinformatics research has been to integrate these networks with each other and with diverse molecular profiles to identify sets of molecules and interactions that participate in a common biological function - that is, 'modules'. Here, we classify such integrative approaches into four broad categories, describe their bioinformatic principles and review their applications.

532 citations

01 Jan 2003
TL;DR: This work has provided a keyword index to help find articles of interest, and additionally a modern automatically constructed variant of a thematic index: a WEBSOM interface to the whole article collection of years 1981-2000.
Abstract: The Self-Organizing Map (SOM) algorithm has attracted a great deal of interest among researchers and practitioners in a wide variety of fields. The SOM has been analyzed extensively, a number of variants have been developed and, perhaps most notably, it has been applied extensively within fields ranging from engineering sciences to medicine, biology, and economics. We have collected a comprehensive list of 5384 scientific papers that use the algorithms, have benefited from them, or contain analyses of them. The list is intended to serve as a source for literature surveys. The present addendum contains 2092 new articles, mainly from the years 1998-2002. We have provided a keyword index to help find articles of interest, and additionally a modern automatically constructed variant of a thematic index: a WEBSOM interface to the whole article collection of years 1981-2000. The SOM of SOMs is available at http://websom.hut.fi/websom/somref/search.cgi for browsing and searching the collection.

402 citations