A fused lasso latent feature model for analyzing multi-sample aCGH data
TL;DR: A procedure called the Fused Lasso Latent Feature Model (FLLat) is proposed that provides a statistical framework for modeling multi-sample aCGH data and identifying regions of copy number variation (CNV), together with a method for estimating the false discovery rate.
Abstract:
Array-based comparative genomic hybridization (aCGH) enables the measurement of DNA copy number across thousands of locations in a genome. The main goals of analyzing aCGH data are to identify the regions of copy number variation (CNV) and to quantify the amount of CNV. Although there are many methods for analyzing single-sample aCGH data, the analysis of multi-sample aCGH data is a relatively new area of research. Further, many of the current approaches for analyzing multi-sample aCGH data do not appropriately utilize the additional information present in the multiple samples. We propose a procedure called the Fused Lasso Latent Feature Model (FLLat) that provides a statistical framework for modeling multi-sample aCGH data and identifying regions of CNV. The procedure involves modeling each sample of aCGH data as a weighted sum of a fixed number of features. Regions of CNV are then identified through an application of the fused lasso penalty to each feature. Simulation analyses show that FLLat outperforms single-sample methods when the simulated samples share common information. We also propose a method for estimating the false discovery rate. An analysis of an aCGH data set obtained from human breast tumors, focusing on chromosomes 8 and 17, shows that FLLat and Significance Testing of Aberrant Copy number (an alternative, existing approach) identify similar regions of CNV that are consistent with previous findings. However, through the estimated features and their corresponding weights, FLLat is further able to discern specific relationships between the samples, for example, identifying three distinct groups of samples based on their patterns of CNV for chromosome 17.
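The model described in the abstract (each sample as a weighted sum of a fixed number of features, with a fused lasso penalty applied to each feature) can be sketched as a penalized least-squares objective. The sketch below is illustrative only. The names `Y` (probes by samples), `B` (probes by features), `Theta` (features by samples), the tuning parameters `lam1` and `lam2`, and the exact form of the objective are assumptions not given in the abstract, and the code only evaluates the objective; it does not fit the model.

```python
import numpy as np


def fused_lasso_penalty(beta, lam1, lam2):
    """Fused lasso penalty for one feature (assumed form).

    lam1 weights the L1 norm (sparsity: most probes show no CNV);
    lam2 weights the absolute differences between adjacent probes
    (smoothness: CNV occurs in contiguous regions).
    """
    beta = np.asarray(beta, dtype=float)
    sparsity = np.abs(beta).sum()
    smoothness = np.abs(np.diff(beta)).sum()
    return lam1 * sparsity + lam2 * smoothness


def fllat_objective(Y, B, Theta, lam1, lam2):
    """Squared reconstruction error plus a fused lasso penalty per feature.

    Y:     (L, S) array, L probe locations by S samples (hypothetical layout)
    B:     (L, J) array, one column per latent feature
    Theta: (J, S) array, weight of each feature in each sample
    """
    residual = Y - B @ Theta  # each sample is modeled as a weighted sum of features
    fit = np.sum(residual ** 2)
    penalty = sum(
        fused_lasso_penalty(B[:, j], lam1, lam2) for j in range(B.shape[1])
    )
    return fit + penalty


if __name__ == "__main__":
    # Two piecewise-constant features over four probe locations.
    B = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [0.0, 2.0]])
    # Weights of the two features in three samples.
    Theta = np.array([[0.5, -0.5, 0.0], [0.0, 0.5, 0.5]])
    Y = B @ Theta  # noise-free data generated from the model itself
    print(fllat_objective(Y, B, Theta, lam1=1.0, lam2=1.0))
```

Under this assumed form, a feature that is zero almost everywhere and flat within a few contiguous segments incurs a small penalty, which is why regions of CNV would appear as the nonzero, piecewise-constant stretches of each estimated feature.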
Citations
Principles and methods of integrative genomic analyses in cancer
Vessela N. Kristensen, Ole Christian Lingjærde, Hege G. Russnes, Hans Kristian Moen Vollan, Arnoldo Frigessi, Anne Lise Børresen-Dale
TL;DR: The objectives, methods and computational tools of integrative genomics that are available to date are reviewed here, as is their implementation in cancer research.
Breakpoint analysis of transcriptional and genomic profiles uncovers novel gene fusions spanning multiple human cancer types
Craig P. Giacomini, Steven Sun, Sushama Varma, A. Hunter Shain, Marilyn M. Giacomini, Jay Michael S. Balagtas, Robert T. Sweeney, Everett Lai, Catherine A. Del Vecchio, Andrew D. Forster, Nicole Clarke, Kelli Montgomery, Shirley Zhu, Albert J. Wong, Matt van de Rijn, Robert B. West, Jonathan R. Pollack
TL;DR: A “breakpoint analysis” pipeline is developed to discover candidate gene fusions by tell-tale transcript level or genomic DNA copy number transitions occurring within genes, and the results highlight a more widespread role of fusion genes in cancer pathogenesis.
Accounting for non-genetic factors by low-rank representation and sparse regression for eQTL mapping
TL;DR: A LOw-Rank representation is introduced to account for confounding factors and make use of Sparse regression for eQTL mapping (LORS), and the results indicate that LORS is an effective tool to account for non-genetic effects.
Comparative Analysis of CNV Calling Algorithms: Literature Survey and a Case Study Using Bovine High-Density SNP Data
TL;DR: This review examines a number of CNV calling tools and their impacts using bovine high-density SNP data to highlight the need for standardizing array data collection, quality assessment and experimental validation.
Piecewise-constant and low-rank approximation for identification of recurrent copy number variations
TL;DR: The experimental results show that PLA can successfully reconstruct the recurrent CNV patterns from raw data and achieve better performance compared with alternative methods under a wide range of scenarios.
References
Controlling the false discovery rate: a practical and powerful approach to multiple testing
Yoav Benjamini, Yosef Hochberg
TL;DR: In this paper, a different approach to problems of multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses (the false discovery rate), which is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
Sparsity and smoothness via the fused lasso
TL;DR: The fused lasso is proposed, a generalization that is designed for problems with features that can be ordered in some meaningful way, and is especially useful when the number of features p is much greater than N, the sample size.
High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays
Daniel Pinkel, Richard Segraves, Damir Sudar, Steven M. Clark, Ian Poole, David Kowbel, Colin Collins, Wen Lin Kuo, Chira Chen, Ye Zhai, Shanaz H. Dairkee, Britt-Marie Ljung, Joe W. Gray, Donna G. Albertson
TL;DR: The implementation of array CGH is shown to measure copy number with high precision in the human genome and to analyse clinical specimens, yielding new information on chromosome 20 aberrations in breast cancer.
Circular binary segmentation for the analysis of array-based DNA copy number data.
TL;DR: A modification of binary segmentation, called circular binary segmentation, is developed to translate noisy intensity measurements into regions of equal copy number.
Convergence of a block coordinate descent method for nondifferentiable minimization
TL;DR: In this article, the convergence properties of a block coordinate descent method applied to minimize a non-convex function f(x1, ..., xN) with certain separability and regularity properties are studied.
Related Papers (5)
Circular binary segmentation for the analysis of array-based DNA copy number data.
Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data
Global variation in copy number in the human genome
Richard Redon, Shumpei Ishikawa, Karen R. Fitch, Lars Feuk, George H. Perry, T. Daniel Andrews, Heike Fiegler, Michael H. Shapero, Andrew R. Carson, Wenwei Chen, Eun Kyung Cho, Stephanie Dallaire, Jennifer L. Freeman, Juan R. González, Mònica Gratacòs, Jing Huang, Dimitrios Kalaitzopoulos, Daisuke Komura, Jeffrey R. MacDonald, Christian R. Marshall, Rui Mei, Lyndal Montgomery, Keunihiro Nishimura, Kohji Okamura, Fan Shen, Martin J. Somerville, Joelle Tchinda, Armand Valsesia, Cara Woodwark, Fengtang Yang, Junjun Zhang, Tatiana Zerjal, Jane Zhang, Lluís Armengol, Donald F. Conrad, Xavier Estivill, Chris Tyler-Smith, Nigel P. Carter, Hiroyuki Aburatani, Charles Lee, Keith W. Jones, Stephen W. Scherer, Matthew E. Hurles