Open Access · Journal Article · DOI

Transposable regularized covariance models with an application to missing data imputation

TL;DR: Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.
Abstract
Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data matrix is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we present a modification of the matrix-variate normal, the mean-restricted matrix-variate normal, in which the rows and columns each have a separate mean vector and covariance matrix. By placing additive penalties on the inverse covariance matrices of the rows and columns, these so-called transposable regularized covariance models allow for maximum likelihood estimation of the mean and nonsingular covariance matrices. Using these models, we formulate EM-type algorithms for missing data imputation in both the multivariate and transposable frameworks. We present theoretical results exploiting the structure of our transposable models that allow these models and imputation methods to be applied to high-dimensional data. Simulations and results on microarray data and the Netflix data show that these imputation techniques often outperform existing methods and offer a greater degree of flexibility.
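The EM-type imputation idea in the abstract can be illustrated with a minimal sketch in the multivariate (rows-as-observations) setting: fill missing entries with their conditional mean given the observed entries in the same row, then re-estimate the mean and a regularized covariance, and iterate. This is an assumption-laden simplification, not the paper's transposable method; it uses a simple additive ridge on the covariance itself rather than a penalty on the inverse covariance, and the function and parameter names are hypothetical.

```python
import numpy as np

def em_impute(X, ridge=0.1, n_iter=50):
    """Iterative (EM-style) imputation under a multivariate normal model.

    NaN entries are replaced by their conditional mean given the observed
    entries in the same row; the mean vector and a ridge-regularized
    covariance are re-estimated from the completed data each iteration.
    Sketch only: the ridge term stands in for the paper's inverse-covariance
    penalty and guarantees a nonsingular covariance in high dimensions.
    """
    X = np.asarray(X, dtype=float)
    mask = np.isnan(X)
    Xf = X.copy()
    # Initialize missing entries with column means.
    col_means = np.nanmean(X, axis=0)
    Xf[mask] = np.take(col_means, np.where(mask)[1])
    for _ in range(n_iter):
        mu = Xf.mean(axis=0)
        S = np.cov(Xf, rowvar=False) + ridge * np.eye(X.shape[1])
        for i in range(X.shape[0]):
            m = mask[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Fully missing row: fall back to the current mean.
                Xf[i, m] = mu[m]
                continue
            # Conditional mean: mu_m + S_mo S_oo^{-1} (x_o - mu_o)
            S_oo = S[np.ix_(o, o)]
            S_mo = S[np.ix_(m, o)]
            Xf[i, m] = mu[m] + S_mo @ np.linalg.solve(S_oo, Xf[i, o] - mu[o])
    return Xf
```

On data where one column is strongly correlated with another, the imputed values track that correlation rather than the unconditional column mean, which is the basic advantage of covariance-based imputation over row- or column-mean filling.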


Citations
Journal ArticleDOI

Regression shrinkage and selection via the lasso: a retrospective

TL;DR: In this article, the authors give a brief review of the basic idea and some history and then discuss some developments since the original paper on regression shrinkage and selection via the lasso.
Journal ArticleDOI

Geodesic Convexity and Covariance Estimation

TL;DR: This work considers g-convex functions with positive definite matrix variables, proves that Kronecker products and logarithms of determinants are g-convex, and applies these results to two modern covariance estimation problems: robust estimation in scaled Gaussian distributions, and Kronecker structured models.
Journal ArticleDOI

A Generalized Least-Square Matrix Decomposition

TL;DR: By finding the best low-rank approximation of the data with respect to a transposable quadratic norm, the generalized least-square matrix decomposition (GMD) directly accounts for structural relationships; it is demonstrated for dimension reduction, signal recovery, and feature selection with high-dimensional structured data.
Journal ArticleDOI

Covariance Estimation in High Dimensions Via Kronecker Product Expansions

TL;DR: The results establish that PRLS converges significantly faster than the standard sample covariance matrix (SCM) estimator, show that a class of block Toeplitz covariance matrices can be approximated with low separation rank, and give bounds on the minimal separation rank r that ensures a given level of bias.
Journal ArticleDOI

Sparse Matrix Graphical Models

TL;DR: This article proposes a novel sparse matrix graphical model that characterizes the underlying conditional independence structure by penalizing, respectively, the two precision matrices corresponding to the rows and columns.
References
Book

Statistical Analysis with Missing Data

TL;DR: This book develops maximum likelihood methods for general patterns of missing data, covering theory for ignorable nonresponse and large-sample inference based on maximum likelihood estimates.
Journal ArticleDOI

Exact Matrix Completion via Convex Optimization

TL;DR: It is proved that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries, and that objects other than signals and images can be perfectly reconstructed from very limited information.
Journal ArticleDOI

Missing value estimation methods for DNA microarrays.

TL;DR: It is shown that KNNimpute provides a more robust and sensitive method for missing value estimation than SVDimpute, and that both SVDimpute and KNNimpute surpass the commonly used row average method (as well as filling missing values with zeros).
Journal ArticleDOI

Multiple Imputation After 18+ Years

TL;DR: This paper describes the assumed context and objectives of multiple imputation and reviews the multiple-imputation framework and its standard results.