Author

Jianfeng Xu

Bio: Jianfeng Xu is an academic researcher at Virginia Tech who has contributed to research on copy number analysis. The author has an h-index of 1 and has co-authored 1 publication, which has received 37 citations.

Topics: Copy number analysis

Papers

Journal Article•DOI•

BACOM: in silico detection of genomic deletion types and correction of normal cell contamination in copy number data.

Guoqiang Yu¹, Bai Zhang¹, G. Steven Bova¹, Jianfeng Xu¹, Ie-Ming Shih¹, Yue Joseph Wang¹

Virginia Tech¹

01 Jun 2011-Bioinformatics

TL;DR: A statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells is reported.

Abstract: Motivation: Identification of somatic DNA copy number alterations (CNAs) and significant consensus events (SCEs) in cancer genomes is a main task in discovering potential cancer-driving genes such as oncogenes and tumor suppressors. The recent development of SNP array technology has facilitated studies on copy number changes at a genome-wide scale with high resolution. However, existing copy number analysis methods are oblivious to normal cell contamination and cannot distinguish between contributions of cancerous and normal cells to the measured copy number signals. This contamination could significantly confound downstream analysis of CNAs and affect the power to detect SCEs in clinical samples. Results: We report here a statistically principled in silico approach, Bayesian Analysis of COpy number Mixtures (BACOM), to accurately estimate genomic deletion type and normal tissue contamination, and accordingly recover the true copy number profile in cancer cells. We tested the proposed method on two simulated datasets, two prostate cancer datasets and The Cancer Genome Atlas high-grade ovarian dataset, and obtained very promising results supported by the ground truth and biological plausibility. Moreover, based on a large number of comparative simulation studies, the proposed method gives significantly improved power to detect SCEs after in silico correction of normal tissue contamination. We develop a cross-platform open-source Java application that implements the whole pipeline of copy number analysis of heterogeneous cancer tissues including relevant processing steps. We also provide an R interface, bacomR, for running BACOM within the R environment, making it straightforward to include in existing data pipelines. Availability: The cross-platform, stand-alone Java application, BACOM, the R interface, bacomR, all source code and the simulation data used in this article are freely available at authors' web site: http://www.cbil.ece.vt.edu/software.htm. 
Contact: yuewang@vt.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.
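
The contamination problem described in this abstract can be illustrated with a small sketch. It assumes a simple linear mixture, in which the measured copy number is a weighted average of normal cells (two copies) and cancer cells. This is not BACOM's Bayesian estimator; the function names `correct_copy_number` and `normal_fraction_from_deletion` are hypothetical and chosen only for illustration.

```python
def correct_copy_number(observed, normal_fraction, normal_copies=2.0):
    """Recover the cancer-cell copy number under an assumed linear mixture:

        observed = normal_fraction * normal_copies
                   + (1 - normal_fraction) * tumor
    """
    if not 0.0 <= normal_fraction < 1.0:
        raise ValueError("normal_fraction must be in [0, 1)")
    tumor = (observed - normal_fraction * normal_copies) / (1.0 - normal_fraction)
    # Measurement noise can push the estimate below zero; clamp it.
    return max(tumor, 0.0)


def normal_fraction_from_deletion(observed, deletion_type):
    """Invert the same mixture for a segment whose deletion type is known.

    Hemizygous deletion (1 tumor copy):  observed = 1 + normal_fraction
    Homozygous deletion (0 tumor copies): observed = 2 * normal_fraction
    """
    if deletion_type == "hemizygous":
        return observed - 1.0
    if deletion_type == "homozygous":
        return observed / 2.0
    raise ValueError("deletion_type must be 'hemizygous' or 'homozygous'")


if __name__ == "__main__":
    # A deleted segment measured at 1.4 copies, assumed hemizygous.
    alpha = normal_fraction_from_deletion(1.4, "hemizygous")
    print("estimated normal fraction:", alpha)
    print("corrected copy number:", correct_copy_number(1.4, alpha))
```

The sketch shows why the deletion type matters: the same measured signal implies a different normal cell fraction depending on whether the segment is assumed hemizygous or homozygous, so both have to be estimated together.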

40 citations

Cited by

Journal Article•DOI•

Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer

Hui Zhang¹, Tao Liu², Zhen Zhang¹, Samuel H. Payne², Bai Zhang¹, Jason E. McDermott², Jian-Ying Zhou¹, Vladislav A. Petyuk², Li Chen¹, Debjit Ray², Shisheng Sun¹, Feng Yang², Lijun Chen¹, Jing Wang³, Punit Shah¹, Seong Won Cha⁴, Paul Aiyetan¹, Sunghee Woo⁴, Yuan Tian¹, Marina A. Gritsenko², Therese R. W. Clauss², Caitlin H. Choi¹, Matthew E. Monroe², Stefani N. Thomas¹, Song Nie², Chaochao Wu², Ronald J. Moore², Kun-Hsing Yu⁵, David L. Tabb³, David Fenyö⁶, Vineet Bafna⁴, Yue Wang⁷, Henry Rodriguez, Emily S. Boja, Tara Hiltke, Robert Rivers, Lori J. Sokoll¹, Heng Zhu¹, Ie Ming Shih¹, Leslie Cope¹, Akhilesh Pandey¹, Bing Zhang³, Michael Snyder⁵, Douglas A. Levine⁶, Richard D. Smith², Daniel W. Chan¹, Karin D. Rodland², Steven A. Carr, Michael A. Gillette, Karl R. Klauser, Eric Kuhn, D. R. Mani, Philipp Mertins, Karen A. Ketchum, Ratna R. Thangudu, Shuang Cai, Mauricio Oberti, Amanda G. Paulovich, Jeffrey R. Whiteaker, Nathan Edwards, Peter B. McGarvey, Subha Madhavan, Pei Wang, Gordon Whiteley, Steven J. Skates, Forest M. White, Christopher R. Kinsinger, Mehdi Mesri, Kenna M. Shaw, Stephen E. Stein, Paul A. Rudnick, Michael Snyder⁵, Yingming Zhao, Xian Chen, David F. Ransohoff, Andrew N. Hoofnagle, Daniel C. Liebler, Melinda E. Sanders, Zhiao Shi, Robbert J.C. Slebos, Lisa J. Zimmerman, Sherri R. Davies, Li Ding, Matthew J. Ellis, R. Reid Townsend

Johns Hopkins University¹, Pacific Northwest National Laboratory², Vanderbilt University³, University of California, San Diego⁴, Stanford University⁵, New York University⁶, Virginia Tech⁷

28 Jul 2016-Cell

TL;DR: A view of how the somatic genome drives the cancer proteome and associations between protein and post-translational modification levels and clinical outcomes in HGSC is provided.

728 citations

Journal Article•DOI•

Quantitative Image Analysis of Cellular Heterogeneity in Breast Tumors Complements Genomic Profiling

Yinyin Yuan¹, Henrik Failmezger², Henrik Failmezger³, Oscar M. Rueda¹, H. Raza Ali¹, Stefan Gräf¹, Suet-Feung Chin¹, Roland F. Schwarz¹, Christina Curtis⁴, Mark J Dunning, Helen Bardwell, Nicola Johnson¹, Sarah Doyle¹, Gulisa Turashvili⁵, Elena Provenzano¹, Samuel Aparicio⁵, Carlos Caldas, Florian Markowetz¹

University of Cambridge¹, Center for Integrated Protein Science Munich², Max Planck Society³, University of Southern California⁴, University of British Columbia⁵

24 Oct 2012-Science Translational Medicine

TL;DR: A predictor for survival in estrogen receptor–negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures is devised.

Abstract: Solid tumors are heterogeneous tissues composed of a mixture of cancer and normal cells, which complicates the interpretation of their molecular profiles. Furthermore, tissue architecture is generally not reflected in molecular assays, rendering this rich information underused. To address these challenges, we developed a computational approach based on standard hematoxylin and eosin-stained tissue sections and demonstrated its power in a discovery and validation cohort of 323 and 241 breast tumors, respectively. To deconvolute cellular heterogeneity and detect subtle genomic aberrations, we introduced an algorithm based on tumor cellularity to increase the comparability of copy number profiles between samples. We next devised a predictor for survival in estrogen receptor-negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures. Image processing also allowed us to describe and validate an independent prognostic factor based on quantitative analysis of spatial patterns between stromal cells, which are not detectable by molecular assays. Our quantitative, image-based method could benefit any large-scale cancer study by refining and complementing molecular assays of tumor samples.

368 citations

Supplementary Materials for Quantitative Image Analysis of Cellular Heterogeneity in Breast Tumors Complements Genomic Profiling

Yinyin Yuan, Henrik Failmezger, Oscar M. Rueda, H. Raza Ali, Stefan Gräf, Roland F. Schwarz, Christina Curtis, Mark J Dunning, Helen Bardwell, Sarah Doyle, Samuel Aparicio, Carlos Caldas, Florian Markowetz

01 Jan 2012

TL;DR: Yuan et al. developed a computational approach based on standard hematoxylin and eosin-stained tissue sections and demonstrated its power in discovery and validation cohorts of 323 and 241 breast tumors, respectively.

Abstract: Image analysis of breast cancer tissue improves and complements genomic data to predict patient survival.

Digitizing Pathology for Genomics

The tumor microenvironment is a complex milieu that includes not only the cancer cells but also the stromal cells, immune cells, and even normal, healthy cells. Molecular analysis of tumor tissue is therefore a challenging task because all this “extra” genomic information can muddle the results. Conversely, biopsy tissue staining can provide a spatial and cellular readout (architecture and content), but it is mostly qualitative information. In response, Yuan and colleagues have developed a quantitative, computational approach to pathology. When combined with molecular analyses, the authors were able to uncover new knowledge about breast tumor biology and, in turn, predict patient survival.

Yuan et al. first collected histopathology images, gene expression data, and DNA copy number variation data for 564 breast cancer patients. Using a portion of the images (the “discovery set”), they developed an image processing approach that automatically classified cells as cancer, lymphocyte, or stroma on the basis of their size and shape. This approach was validated on the remaining samples, and any errors in this analysis were digitally corrected before obtaining a plot of tumor cellular heterogeneity. With exact knowledge of the tumor’s cellular composition, the authors were able to correct copy number data to more accurately reflect HER2 status compared with uncorrected data.

Yuan and colleagues combined their digital pathology with genomic information to devise an integrated predictor of survival for estrogen receptor (ER)–negative patients. A higher number of infiltrating lymphocytes (immune cells), as quantified by their image analysis platform, was found in a subset of patients with better clinical outcome than the rest of ER-negative patients, and this outcome difference was significantly enhanced with the addition of gene expression. The quantitative and objective nature of this integrated predictor could benefit diagnosis and prognosis in many areas of cancer by using the rich combination of tumor cellular content and genomic data.

Solid tumors are heterogeneous tissues composed of a mixture of cancer and normal cells, which complicates the interpretation of their molecular profiles. Furthermore, tissue architecture is generally not reflected in molecular assays, rendering this rich information underused. To address these challenges, we developed a computational approach based on standard hematoxylin and eosin–stained tissue sections and demonstrated its power in a discovery and validation cohort of 323 and 241 breast tumors, respectively. To deconvolute cellular heterogeneity and detect subtle genomic aberrations, we introduced an algorithm based on tumor cellularity to increase the comparability of copy number profiles between samples. We next devised a predictor for survival in estrogen receptor–negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures. Image processing also allowed us to describe and validate an independent prognostic factor based on quantitative analysis of spatial patterns between stromal cells, which are not detectable by molecular assays. Our quantitative, image-based method could benefit any large-scale cancer study by refining and complementing molecular assays of tumor samples.

286 citations

Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer

Johns Hopkins University¹, Pacific Northwest National Laboratory², Vanderbilt University³, University of California, San Diego⁴, Stanford University⁵, New York University⁶, Virginia Tech⁷

01 Jun 2016

TL;DR: This article provides a detailed analysis of the molecular components and underlying mechanisms associated with ovarian cancer, including how different copy-number alterations influence the proteome, the proteins associated with chromosomal instability, the sets of signaling pathways that diverse genome rearrangements converge on, and the ones most associated with short overall survival.

Abstract: To provide a detailed analysis of the molecular components and underlying mechanisms associated with ovarian cancer, we performed a comprehensive mass-spectrometry-based proteomic characterization of 174 ovarian tumors previously analyzed by The Cancer Genome Atlas (TCGA), of which 169 were high-grade serous carcinomas (HGSCs). Integrating our proteomic measurements with the genomic data yielded a number of insights into disease, such as how different copy-number alterations influence the proteome, the proteins associated with chromosomal instability, the sets of signaling pathways that diverse genome rearrangements converge on, and the ones most associated with short overall survival. Specific protein acetylations associated with homologous recombination deficiency suggest a potential means for stratifying patients for therapy. In addition to providing a valuable resource, these findings provide a view of how the somatic genome drives the cancer proteome and associations between protein and post-translational modification levels and clinical outcomes in HGSC.

160 citations

Journal Article•DOI•

DBS: a fast and informative segmentation algorithm for DNA copy number analysis.

Jun Ruan¹, Zhen Liu¹, Ming Sun¹, Yue Wang², Junqiu Yue, Guoqiang Yu²

Wuhan University of Technology¹, Virginia Tech²

03 Jan 2019-BMC Bioinformatics

TL;DR: DBS (Deviation Binary Segmentation) is implemented in a platform-independent and open-source Java application (ToolSeg), including a graphical user interface and simulation data generation, as well as various segmentation methods in the native Java language.

Abstract: Genome-wide DNA copy number changes are the hallmark events in the initiation and progression of cancers. Quantitative analysis of somatic copy number alterations (CNAs) has broad applications in cancer research. With the increasing capacity of high-throughput sequencing technologies, fast and efficient segmentation algorithms are required when characterizing high-density CNA data. A fast and informative segmentation algorithm, DBS (Deviation Binary Segmentation), is developed and discussed. The DBS method is based on the least absolute error principle and is inspired by the segmentation method rooted in the circular binary segmentation procedure. DBS uses point-by-point model calculation to ensure the accuracy of segmentation and combines a binary search algorithm with heuristics derived from the Central Limit Theorem. The DBS algorithm is very efficient, with a computational complexity of O(n log n), and is faster than its predecessors. Moreover, DBS measures the change-point amplitude of the mean values of two adjacent segments at a breakpoint, where the significance of the change-point amplitude is determined by the weighted average deviation at breakpoints. Accordingly, using the constructed binary tree of significance values, DBS indicates whether the segmentation results are over- or under-segmented. DBS is implemented in a platform-independent and open-source Java application (ToolSeg), including a graphical user interface and simulation data generation, as well as various segmentation methods in the native Java language.
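
The general idea of binary segmentation mentioned in this abstract can be sketched as follows. This is a generic, textbook-style recursive mean-shift segmentation, not the DBS algorithm itself: the score (difference in segment means scaled by segment sizes), the `threshold` parameter, and the function name `binary_segmentation` are assumptions made for illustration, and this simple version does not achieve the O(n log n) complexity reported for DBS.

```python
from itertools import accumulate
from math import sqrt


def binary_segmentation(values, threshold, min_size=2):
    """Return sorted breakpoints; breakpoint b means a new segment starts at values[b]."""
    # Prefix sums let us compute the mean of any slice in constant time.
    prefix = [0.0] + list(accumulate(values))

    def mean(lo, hi):
        return (prefix[hi] - prefix[lo]) / (hi - lo)

    breakpoints = []

    def split(lo, hi):
        n = hi - lo
        best_score, best_k = 0.0, None
        # Try every split that leaves at least min_size points on each side.
        for k in range(lo + min_size, hi - min_size + 1):
            n_left, n_right = k - lo, hi - k
            score = abs(mean(lo, k) - mean(k, hi)) * sqrt(n_left * n_right / n)
            if score > best_score:
                best_score, best_k = score, k
        # Accept the best split only if the mean shift is large enough,
        # then recurse into both halves.
        if best_k is not None and best_score > threshold:
            breakpoints.append(best_k)
            split(lo, best_k)
            split(best_k, hi)

    split(0, len(values))
    return sorted(breakpoints)


if __name__ == "__main__":
    # Noise-free toy profile: normal, single-copy loss, normal.
    profile = [2.0] * 10 + [1.0] * 10 + [2.0] * 10
    print(binary_segmentation(profile, threshold=1.0))
```

Each accepted split becomes a node in a binary tree of segments. The abstract describes DBS as attaching a significance value to each such node to judge over- or under-segmentation; that step is not reproduced here.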

113 citations
