Open Access Journal Article

A fused lasso latent feature model for analyzing multi-sample aCGH data

TLDR
A procedure called the Fused Lasso Latent Feature Model (FLLat) is proposed that provides a statistical framework for modeling multi-sample aCGH data and identifying regions of copy number variation (CNV), along with a method for estimating the false discovery rate.
Abstract
Array-based comparative genomic hybridization (aCGH) enables the measurement of DNA copy number across thousands of locations in a genome. The main goals of analyzing aCGH data are to identify the regions of copy number variation (CNV) and to quantify the amount of CNV. Although there are many methods for analyzing single-sample aCGH data, the analysis of multi-sample aCGH data is a relatively new area of research. Further, many of the current approaches for analyzing multi-sample aCGH data do not appropriately utilize the additional information present in the multiple samples. We propose a procedure called the Fused Lasso Latent Feature Model (FLLat) that provides a statistical framework for modeling multi-sample aCGH data and identifying regions of CNV. The procedure involves modeling each sample of aCGH data as a weighted sum of a fixed number of features. Regions of CNV are then identified through an application of the fused lasso penalty to each feature. Some simulation analyses show that FLLat outperforms single-sample methods when the simulated samples share common information. We also propose a method for estimating the false discovery rate. An analysis of an aCGH data set obtained from human breast tumors, focusing on chromosomes 8 and 17, shows that FLLat and Significance Testing of Aberrant Copy number (an alternative, existing approach) identify similar regions of CNV that are consistent with previous findings. However, through the estimated features and their corresponding weights, FLLat is further able to discern specific relationships between the samples, for example, identifying 3 distinct groups of samples based on their patterns of CNV for chromosome 17.
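The modeling idea in the abstract (each sample as a weighted sum of a fixed number of features, with a fused lasso penalty on each feature) can be illustrated with a minimal sketch. The abstract does not state the objective function, so the squared-error loss, the names `B`, `Theta`, `lam1`, `lam2`, and the function `fllat_style_objective` are all assumptions made for illustration, not the authors' implementation.

```python
def fllat_style_objective(Y, B, Theta, lam1, lam2):
    """Illustrative penalized objective (assumed form, not taken from the paper).

    Y     : L x S nested list, one column per sample, one row per probe location
    B     : L x J nested list, one column per latent feature
    Theta : J x S nested list, weight of feature j in sample s
    lam1  : assumed tuning parameter for the sparsity (lasso) term
    lam2  : assumed tuning parameter for the smoothness (fusion) term
    """
    n_loc, n_samp, n_feat = len(Y), len(Y[0]), len(Theta)

    # Fit term: each sample is approximated by a weighted sum of the features.
    fit = 0.0
    for l in range(n_loc):
        for s in range(n_samp):
            pred = sum(B[l][j] * Theta[j][s] for j in range(n_feat))
            fit += (Y[l][s] - pred) ** 2

    # Fused lasso penalty applied to each feature (column of B) separately.
    penalty = 0.0
    for j in range(n_feat):
        col = [B[l][j] for l in range(n_loc)]
        sparsity = sum(abs(v) for v in col)
        smoothness = sum(abs(col[l] - col[l - 1]) for l in range(1, n_loc))
        penalty += lam1 * sparsity + lam2 * smoothness

    return fit + penalty


# Toy data: one feature with a gain at locations 2-3, shared by two samples.
B = [[0.0], [0.0], [1.0], [1.0], [0.0]]
Theta = [[1.0, 0.5]]
Y = [[B[l][0] * Theta[0][s] for s in range(2)] for l in range(5)]
objective = fllat_style_objective(Y, B, Theta, lam1=0.1, lam2=0.5)
```

In this toy case the fit term is zero, so the value reflects only the penalty: the absolute-value term favors features that are mostly zero, and the difference term favors piecewise-constant features, which is what lets contiguous regions of gain or loss stand out.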


Citations
Journal ArticleDOI

Principles and methods of integrative genomic analyses in cancer

TL;DR: The objectives, methods and computational tools of integrative genomics that are available to date are reviewed here, as is their implementation in cancer research.
Journal ArticleDOI

Breakpoint analysis of transcriptional and genomic profiles uncovers novel gene fusions spanning multiple human cancer types

TL;DR: A “breakpoint analysis” pipeline is developed to discover candidate gene fusions by tell-tale transcript level or genomic DNA copy number transitions occurring within genes, and the results highlight a more widespread role of fusion genes in cancer pathogenesis.
Journal ArticleDOI

Accounting for non-genetic factors by low-rank representation and sparse regression for eQTL mapping

TL;DR: A LOw-Rank representation is introduced to account for confounding factors and make use of Sparse regression for eQTL mapping (LORS), and the results indicate that LORS is an effective tool to account for non-genetic effects.
Journal ArticleDOI

Comparative Analysis of CNV Calling Algorithms: Literature Survey and a Case Study Using Bovine High-Density SNP Data

TL;DR: This review examines a number of CNV calling tools and their impacts using bovine high-density SNP data to highlight the need for standardizing array data collection, quality assessment and experimental validation.
Journal ArticleDOI

Piecewise-constant and low-rank approximation for identification of recurrent copy number variations

TL;DR: The experimental results show that PLA can successfully reconstruct the recurrent CNV patterns from raw data and achieve better performance compared with alternative methods under a wide range of scenarios.
References
Journal ArticleDOI

Controlling the false discovery rate: a practical and powerful approach to multiple testing

TL;DR: In this paper, a different approach to problems of multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses (the false discovery rate); this rate is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
Journal ArticleDOI

Sparsity and smoothness via the fused lasso

TL;DR: The fused lasso is proposed, a generalization that is designed for problems with features that can be ordered in some meaningful way, and is especially useful when the number of features p is much greater than N, the sample size.
Journal ArticleDOI

Circular binary segmentation for the analysis of array-based DNA copy number data.

TL;DR: A modification of binary segmentation, called circular binary segmentation, is developed to translate noisy intensity measurements into regions of equal DNA copy number.
Journal ArticleDOI

Convergence of a block coordinate descent method for nondifferentiable minimization

TL;DR: In this article, the convergence properties of a block coordinate descent method applied to minimize a non-convex function f(x_1, ..., x_N) with certain separability and regularity properties were studied.