Author

Ie-Ming Shih

Bio: Ie-Ming Shih is an academic researcher from Virginia Tech. The author has contributed to research on the topics Copy number analysis and Dependency network. The author has an h-index of 4 and has co-authored 4 publications that have received 120 citations. Previous affiliations of Ie-Ming Shih include Children's National Medical Center.

Papers
Journal Article
TL;DR: This work has developed a Cytoscape plug-in, CytoDDN, to integrate network analysis and visualization seamlessly and serves as a useful systems biology tool for users across biomedical research communities to infer how genetic, epigenetic or environment variables may affect biological networks and clinical phenotypes.
Abstract: Summary: Differential dependency network (DDN) is a caBIG® (cancer Biomedical Informatics Grid) analytical tool for detecting and visualizing statistically significant topological changes in transcriptional networks representing two biological conditions. Developed under caBIG®'s In Silico Research Centers of Excellence (ISRCE) Program, DDN enables differential network analysis and provides an alternative way for defining network biomarkers predictive of phenotypes. DDN also serves as a useful systems biology tool for users across biomedical research communities to infer how genetic, epigenetic or environment variables may affect biological networks and clinical phenotypes. Besides the standalone Java application, we have also developed a Cytoscape plug-in, CytoDDN, to integrate network analysis and visualization seamlessly. Availability: The Java and MATLAB source code can be downloaded at the authors' web site http://www.cbil.ece.vt.edu/software.htm Contact: yuewang@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online.

44 citations
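
The idea of a differential network, edges that appear under one biological condition but not the other, can be illustrated with a small sketch. This is not the DDN algorithm: the abstract does not describe its dependency model or significance test, so the sketch below substitutes a plain thresholded Pearson correlation as a stand-in. The function names, the threshold value and the data layout (a dict mapping gene name to expression values) are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def edges(expr, threshold):
    """Gene pairs whose absolute correlation meets the threshold.

    A stand-in for a real dependency model; DDN itself is not
    implemented here.
    """
    return {
        (g, h)
        for g, h in combinations(sorted(expr), 2)
        if abs(pearson(expr[g], expr[h])) >= threshold
    }


def differential_edges(expr_a, expr_b, threshold=0.8):
    """Split edges into condition-specific and common sets."""
    ea, eb = edges(expr_a, threshold), edges(expr_b, threshold)
    return {"only_a": ea - eb, "only_b": eb - ea, "common": ea & eb}


# Hypothetical toy data: three genes measured in four samples per condition.
cond_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 1, 3, 2]}
cond_b = {"g1": [1, 2, 3, 4], "g2": [3, 1, 4, 2], "g3": [2, 4, 6, 8]}
rewiring = differential_edges(cond_a, cond_b)
```

A real analysis would also need a statistical test for whether each condition-specific edge is significant, which is the part the abstract attributes to DDN and which this sketch omits.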

Journal Article
Guoqiang Yu, Bai Zhang, G. Steven Bova, Jianfeng Xu, Ie-Ming Shih, Yue Joseph Wang
TL;DR: A statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells is reported.
Abstract: Motivation: Identification of somatic DNA copy number alterations (CNAs) and significant consensus events (SCEs) in cancer genomes is a main task in discovering potential cancer-driving genes such as oncogenes and tumor suppressors. The recent development of SNP array technology has facilitated studies on copy number changes at a genome-wide scale with high resolution. However, existing copy number analysis methods are oblivious to normal cell contamination and cannot distinguish between contributions of cancerous and normal cells to the measured copy number signals. This contamination could significantly confound downstream analysis of CNAs and affect the power to detect SCEs in clinical samples. Results: We report here a statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells. We tested the proposed method on two simulated datasets, two prostate cancer datasets and The Cancer Genome Atlas high-grade ovarian dataset, and obtained very promising results supported by the ground truth and biological plausibility. Moreover, based on a large number of comparative simulation studies, the proposed method gives significantly improved power to detect SCEs after in silico correction of normal tissue contamination. We develop a cross-platform open-source Java application that implements the whole pipeline of copy number analysis of heterogeneous cancer tissues including relevant processing steps. We also provide an R interface, bacomR, for running BACOM within the R environment, making it straightforward to include in existing data pipelines. Availability: The cross-platform, stand-alone Java application, BACOM, the R interface, bacomR, all source code and the simulation data used in this article are freely available at authors' web site: http://www.cbil.ece.vt.edu/software.htm. 
Contact: yuewang@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online.

40 citations
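
To make the effect of normal cell contamination concrete, here is a minimal sketch under a simple linear mixing assumption: the measured copy number of a segment is a weighted average of the diploid normal cells (2 copies) and the tumor cells, with `alpha` the fraction of normal cells. This is an illustration only, not BACOM's Bayesian estimator; the function names and the assumption that the deletion type of a segment is already known are hypothetical.

```python
NORMAL_COPIES = 2.0  # diploid normal cells (assumption of the sketch)


def mixed_signal(tumor_copies, alpha):
    """Copy number measured from a mix of normal and tumor cells."""
    return alpha * NORMAL_COPIES + (1.0 - alpha) * tumor_copies


def contamination_from_deletion(observed, deletion_type):
    """Normal-cell fraction implied by a deleted segment of known type.

    Hemizygous deletion: tumor has 1 copy, so observed = 1 + alpha.
    Homozygous deletion: tumor has 0 copies, so observed = 2 * alpha.
    """
    if deletion_type == "hemizygous":
        return observed - 1.0
    if deletion_type == "homozygous":
        return observed / 2.0
    raise ValueError(f"unknown deletion type: {deletion_type!r}")


def recover_tumor_copies(observed, alpha):
    """Invert the mixing model to estimate the tumor-only copy number."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    return (observed - alpha * NORMAL_COPIES) / (1.0 - alpha)


# Hypothetical sample with 40% normal cells.
true_alpha = 0.4
deleted_segment = mixed_signal(1.0, true_alpha)    # hemizygous deletion
amplified_segment = mixed_signal(4.0, true_alpha)  # amplification to 4 copies

alpha_hat = contamination_from_deletion(deleted_segment, "hemizygous")
tumor_copies = recover_tumor_copies(amplified_segment, alpha_hat)
```

The sketch shows why the deletion type matters: the same observed value implies a different contamination level depending on whether the segment lost one copy or both, so the two must be distinguished before the profile can be corrected.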

Journal Article
TL;DR: Phenotypic Up-regulated Gene Support Vector Machine (PUGSVM) is a cancer Biomedical Informatics Grid (caBIG™) analytical tool for multiclass gene selection and classification.
Abstract: Summary: Phenotypic Up-regulated Gene Support Vector Machine (PUGSVM) is a cancer Biomedical Informatics Grid (caBIG™) analytical tool for multiclass gene selection and classification. PUGSVM addresses the problem of imbalanced class separability, small sample size and high gene space dimensionality, where multiclass gene markers are defined by the union of one-versus-everyone phenotypic upregulated genes, and used by a well-matched one-versus-rest support vector machine. PUGSVM provides a simple yet more accurate strategy to identify statistically reproducible mechanistic marker genes for characterization of heterogeneous diseases. Availability: http://www.cbil.ece.vt.edu/caBIG-PUGSVM.htm. Contact: yuewang@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online.

22 citations
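
The phrase "union of one-versus-everyone phenotypic upregulated genes" can be sketched as follows. The scoring rule used here, the ratio of a gene's mean expression in one class to its highest mean in any other class, is an assumption made for illustration; the abstract does not give PUGSVM's exact criterion. The function names and the `per_class` parameter are hypothetical, and the one-versus-rest SVM stage is not shown.

```python
def class_means(samples, labels):
    """Mean expression of every gene within each class.

    samples: list of per-sample expression lists (one value per gene).
    labels:  class label of each sample, in the same order.
    """
    sums, counts = {}, {}
    for row, lab in zip(samples, labels):
        acc = sums.setdefault(lab, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def select_upregulated_genes(samples, labels, per_class=1):
    """Union over classes of the top one-versus-everyone upregulated genes.

    Assumes strictly positive expression values and at least two classes.
    """
    means = class_means(samples, labels)
    selected = set()
    for lab, mu in means.items():
        scores = []
        for j, own_mean in enumerate(mu):
            # Compare against the strongest rival class, not the pooled rest.
            rival = max(means[other][j] for other in means if other != lab)
            scores.append((own_mean / rival, j))
        scores.sort(reverse=True)
        selected.update(j for _, j in scores[:per_class])
    return sorted(selected)


# Hypothetical toy data: 3 classes, 2 samples each, 4 genes.
samples = [
    [10, 1, 1, 5], [12, 1, 1, 5],  # class A: gene 0 is high
    [1, 8, 1, 5], [1, 10, 1, 5],   # class B: gene 1 is high
    [1, 1, 6, 5], [1, 1, 8, 5],    # class C: gene 2 is high
]
labels = ["A", "A", "B", "B", "C", "C"]
markers = select_upregulated_genes(samples, labels)
```

Gene 3 is expressed equally in all classes, so it scores a ratio of 1 everywhere and is never selected. Comparing each class against its strongest rival, rather than against the average of the others, is what keeps a gene that is high in two classes from being picked as a marker for either.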

Journal Article
TL;DR: The acquired biologically plausible results provide new insights into network rewiring as a mechanistic principle and illustrate the performance of KDDN on various simulations and real gene expression datasets, and further compare the results with those obtained by the most relevant peer methods.
Abstract: Summary: We have developed an integrated molecular network learning method, within a well-grounded mathematical framework, to construct differential dependency networks with significant rewiring. This knowledge-fused differential dependency networks (KDDN) method, implemented as a Java Cytoscape app, can be used to optimally integrate prior biological knowledge with measured data to simultaneously construct both common and differential networks, to quantitatively assign model parameters and significant rewiring p-values and to provide user-friendly graphical results. The KDDN algorithm is computationally efficient and provides users with parallel computing capability using ubiquitous multi-core machines. We demonstrate the performance of KDDN on various simulations and real gene expression datasets, and further compare the results with those obtained by the most relevant peer methods. The acquired biologically plausible results provide new insights into network rewiring as a mechanistic principle and illustrate KDDN’s ability to detect them efficiently and correctly. Although the principal application here involves microarray gene expressions, our methodology can be readily applied to other types of quantitative molecular profiling data. Availability: Source code and compiled package are freely available for download at http://apps.cytoscape.org/apps/kddn Contact: yuewang@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online.

20 citations


Cited by
Journal Article
28 Jul 2016-Cell
TL;DR: A view of how the somatic genome drives the cancer proteome and associations between protein and post-translational modification levels and clinical outcomes in HGSC is provided.

728 citations

Journal Article
TL;DR: This work classifies such integrative approaches into four broad categories, describes their bioinformatic principles and reviews their applications.
Abstract: A central goal of systems biology is to elucidate the structural and functional architecture of the cell. To this end, large and complex networks of molecular interactions are being rapidly generated for humans and model organisms. A recent focus of bioinformatics research has been to integrate these networks with each other and with diverse molecular profiles to identify sets of molecules and interactions that participate in a common biological function - that is, 'modules'. Here, we classify such integrative approaches into four broad categories, describe their bioinformatic principles and review their applications.

532 citations

Journal Article
TL;DR: A predictor for survival in estrogen receptor–negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures is devised.
Abstract: Solid tumors are heterogeneous tissues composed of a mixture of cancer and normal cells, which complicates the interpretation of their molecular profiles. Furthermore, tissue architecture is generally not reflected in molecular assays, rendering this rich information underused. To address these challenges, we developed a computational approach based on standard hematoxylin and eosin-stained tissue sections and demonstrated its power in a discovery and validation cohort of 323 and 241 breast tumors, respectively. To deconvolute cellular heterogeneity and detect subtle genomic aberrations, we introduced an algorithm based on tumor cellularity to increase the comparability of copy number profiles between samples. We next devised a predictor for survival in estrogen receptor-negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures. Image processing also allowed us to describe and validate an independent prognostic factor based on quantitative analysis of spatial patterns between stromal cells, which are not detectable by molecular assays. Our quantitative, image-based method could benefit any large-scale cancer study by refining and complementing molecular assays of tumor samples.

368 citations

01 Jan 2012
TL;DR: Yuan et al. as discussed by the authors developed a computational approach based on standard hematoxylin and eosin-stained tissue sections and demonstrated its power in a discovery and validation cohort of 323 and 241 breast tumors, respectively.
Abstract: Image analysis of breast cancer tissue improves and complements genomic data to predict patient survival. Digitizing Pathology for Genomics: The tumor microenvironment is a complex milieu that includes not only the cancer cells but also the stromal cells, immune cells, and even normal, healthy cells. Molecular analysis of tumor tissue is therefore a challenging task because all this “extra” genomic information can muddle the results. Conversely, biopsy tissue staining can provide a spatial and cellular readout (architecture and content), but it is mostly qualitative information. In response, Yuan and colleagues have developed a quantitative, computational approach to pathology. When combined with molecular analyses, this approach allowed the authors to uncover new knowledge about breast tumor biology and, in turn, predict patient survival. Yuan et al. first collected histopathology images, gene expression data, and DNA copy number variation data for 564 breast cancer patients. Using a portion of the images (the “discovery set”), they developed an image processing approach that automatically classified cells as cancer, lymphocyte, or stroma on the basis of their size and shape. This approach was validated on the remaining samples, and any errors in this analysis were digitally corrected before obtaining a plot of tumor cellular heterogeneity. With exact knowledge of the tumor’s cellular composition, the authors were able to correct copy number data to more accurately reflect HER2 status compared with uncorrected data. Yuan and colleagues combined their digital pathology with genomic information to devise an integrated predictor of survival for estrogen receptor (ER)–negative patients. A higher number of infiltrating lymphocytes (immune cells), as quantified by their image analysis platform, was found in a subset of patients with better clinical outcome than the rest of ER-negative patients, and this outcome difference was significantly enhanced with the addition of gene expression. The quantitative and objective nature of this integrated predictor could benefit diagnosis and prognosis in many areas of cancer by using the rich combination of tumor cellular content and genomic data.

286 citations

Journal Article
TL;DR: Linking network dynamics to the real-life, non-ideal patient in whom diseases co-occur and interact provides a valuable basis for generating hypotheses on molecular disease mechanisms, and provides knowledge that can facilitate drug repurposing and the development of targeted therapeutic strategies.
Abstract: The co-occurrence of diseases can inform the underlying network biology of shared and multifunctional genes and pathways. In addition, comorbidities help to elucidate the effects of external exposures, such as diet, lifestyle and patient care. With worldwide health transaction data now often being collected electronically, disease co-occurrences are starting to be quantitatively characterized. Linking network dynamics to the real-life, non-ideal patient in whom diseases co-occur and interact provides a valuable basis for generating hypotheses on molecular disease mechanisms, and provides knowledge that can facilitate drug repurposing and the development of targeted therapeutic strategies.

275 citations