
Showing papers by "Emmanuel J. Candès published in 2008"


Journal ArticleDOI
TL;DR: This article surveys the theory of compressive sampling, also known as compressed sensing or CS, a novel sensing/sampling paradigm that goes against the common wisdom in data acquisition.
Abstract: Conventional approaches to sampling signals or images follow Shannon's theorem: the sampling rate must be at least twice the maximum frequency present in the signal (Nyquist rate). In the field of data conversion, standard analog-to-digital converter (ADC) technology implements the usual quantized Shannon representation - the signal is uniformly sampled at or above the Nyquist rate. This article surveys the theory of compressive sampling, also known as compressed sensing or CS, a novel sensing/sampling paradigm that goes against the common wisdom in data acquisition. CS theory asserts that one can recover certain signals and images from far fewer samples or measurements than traditional methods use.

9,686 citations
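To make the surveyed idea concrete, here is a minimal sketch of the core CS recovery step: a sparse signal is reconstructed from far fewer random measurements than its length by l1 minimization. The sizes, the Gaussian sensing matrix, and the use of numpy/cvxpy are illustrative choices, not taken from the article.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m, k = 200, 60, 5               # signal length, measurements, nonzeros (all illustrative)

x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

A = rng.standard_normal((m, n))    # random sensing matrix, m << n
y = A @ x_true                     # the undersampled measurements

x = cp.Variable(n)
cp.Problem(cp.Minimize(cp.norm1(x)), [A @ x == y]).solve()
print(np.linalg.norm(x.value - x_true) / np.linalg.norm(x_true))  # near 0
```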


Journal ArticleDOI
TL;DR: A novel method for sparse signal recovery that in many situations outperforms ℓ1 minimization in the sense that substantially fewer measurements are needed for exact recovery.
Abstract: It is now well understood that (1) it is possible to reconstruct sparse signals exactly from what appear to be highly incomplete sets of linear measurements and (2) that this can be done by constrained l1 minimization. In this paper, we study a novel method for sparse signal recovery that in many situations outperforms l1 minimization in the sense that substantially fewer measurements are needed for exact recovery. The algorithm consists of solving a sequence of weighted l1-minimization problems where the weights used for the next iteration are computed from the value of the current solution. We present a series of experiments demonstrating the remarkable performance and broad applicability of this algorithm in the areas of sparse signal recovery, statistical estimation, error correction and image processing. Interestingly, superior gains are also achieved when our method is applied to recover signals with assumed near-sparsity in overcomplete representations—not by reweighting the l1 norm of the coefficient sequence as is common, but by reweighting the l1 norm of the transformed object. An immediate consequence is the possibility of highly efficient data acquisition protocols by improving on a technique known as Compressive Sensing.

4,869 citations
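A minimal sketch of the iteration the abstract describes, assuming a noiseless setup: each pass solves a weighted l1 problem and recomputes the weights from the current solution, so that large coefficients are penalized less on the next pass. The stabilizing constant eps and the iteration count are illustrative choices, not the paper's.

```python
import numpy as np
import cvxpy as cp

def reweighted_l1(A, y, n_iter=5, eps=0.1):
    n = A.shape[1]
    w = np.ones(n)                           # first pass is plain l1 minimization
    for _ in range(n_iter):
        x = cp.Variable(n)
        cp.Problem(cp.Minimize(cp.norm1(cp.multiply(w, x))),
                   [A @ x == y]).solve()
        w = 1.0 / (np.abs(x.value) + eps)    # large coefficients get small weights
    return x.value
```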


Journal ArticleDOI
TL;DR: New results are established about the accuracy of sparse signal reconstruction from undersampled measurements under a restricted isometry hypothesis on the sensing matrix; these results improve on earlier estimates and have the advantage of being more elegant.

3,421 citations


Posted Content
TL;DR: In this article, a convex relaxation of a rank minimization problem is proposed to approximate the matrix with minimum nuclear norm among all matrices obeying a set of convex constraints.
Abstract: This paper introduces a novel algorithm to approximate the matrix with minimum nuclear norm among all matrices obeying a set of convex constraints. This problem may be understood as the convex relaxation of a rank minimization problem, and arises in many important applications as in the task of recovering a large matrix from a small subset of its entries (the famous Netflix problem). Off-the-shelf algorithms such as interior point methods are not directly amenable to large problems of this kind with over a million unknown entries. This paper develops a simple first-order and easy-to-implement algorithm that is extremely efficient at addressing problems in which the optimal solution has low rank. The algorithm is iterative and produces a sequence of matrices (X^k, Y^k) and at each step, mainly performs a soft-thresholding operation on the singular values of the matrix Y^k. There are two remarkable features making this attractive for low-rank matrix completion problems. The first is that the soft-thresholding operation is applied to a sparse matrix; the second is that the rank of the iterates X^k is empirically nondecreasing. Both these facts allow the algorithm to make use of very minimal storage space and keep the computational cost of each iteration low. We provide numerical examples in which 1,000 by 1,000 matrices are recovered in less than a minute on a modest desktop computer. We also demonstrate that our approach is amenable to very large scale problems by recovering matrices of rank about 10 with nearly a billion unknowns from just about 0.4% of their sampled entries. Our methods are connected with linearized Bregman iterations for l1 minimization, and we develop a framework in which one can understand these algorithms in terms of well-known Lagrange multiplier algorithms.

572 citations
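A toy rendition of the iteration described above: soft-threshold the singular values of Y^k to get X^k, then take a step on the observed entries. This dense-SVD version ignores the sparsity and low-rank tricks that make the paper's algorithm scale; tau, delta, and the iteration count are illustrative.

```python
import numpy as np

def svt(M, mask, tau, delta, n_iter=200):
    """M: data matrix (only entries where mask == 1 are trusted);
    mask: 0/1 array marking the observed entries."""
    Y = np.zeros_like(M)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt   # soft-threshold the singular values
        Y += delta * mask * (M - X)               # step on the observed entries only
    return X
```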


Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of estimating the mean of a vector with a sparse subset of covariates providing a close approximation to the unknown mean vector, and they show that the lasso happens to nearly select the best subset of variables.
Abstract: We consider the fundamental problem of estimating the mean of a vector $y=X\beta+z$, where $X$ is an $n\times p$ design matrix in which one can have far more variables than observations, and $z$ is a stochastic error term--the so-called "$p>n$" setup. When $\beta$ is sparse, or, more generally, when there is a sparse subset of covariates providing a close approximation to the unknown mean vector, we ask whether or not it is possible to accurately estimate $X\beta$ using a computationally tractable algorithm. We show that, in a surprisingly wide range of situations, the lasso happens to nearly select the best subset of variables. Quantitatively speaking, we prove that solving a simple quadratic program achieves a squared error within a logarithmic factor of the ideal mean squared error that one would achieve with an oracle supplying perfect information about which variables should and should not be included in the model. Interestingly, our results describe the average performance of the lasso; that is, the performance one can expect in a vast majority of cases where $X\beta$ is a sparse or nearly sparse superposition of variables, but not in all cases. Our results are nonasymptotic and widely applicable, since they simply require that pairs of predictor variables are not too collinear.

520 citations
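A minimal sketch of the setting: the lasso fit in a p > n problem with a sparse beta, using scikit-learn purely for illustration. The dimensions, noise level, and regularization weight are arbitrary choices, not the paper's.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p, s = 100, 400, 5                      # observations, variables, sparsity (p > n)

X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = rng.standard_normal(s)
y = X @ beta + 0.5 * rng.standard_normal(n)

fit = Lasso(alpha=0.1).fit(X, y)           # the "simple quadratic program"
print("selected variables:", np.flatnonzero(fit.coef_))
print("squared error of the fitted mean:", np.sum((X @ fit.coef_ - X @ beta) ** 2))
```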


01 Mar 2008
TL;DR: The basic CS theory is overviewed, the key mathematical ideas underlying this theory are presented, significant implications are discussed, and the fact that randomness can — perhaps surprisingly — lead to very effective sensing mechanisms is highlighted.
Abstract: This article surveys the theory of compressive sampling, also known as compressed sensing or CS, a novel sensing/sampling paradigm that goes against the common wisdom in data acquisition. CS theory asserts that one can recover certain signals and images from far fewer samples or measurements than traditional methods use. To make this possible, CS relies on two principles: sparsity, which pertains to the signals of interest, and incoherence, which pertains to the sensing modality. Our intent in this article is to overview the basic CS theory that emerged in the works [1]–[3], present the key mathematical ideas underlying this theory, and survey a couple of important results in the field. Our goal is to explain CS as plainly as possible, and so our article is mainly of a tutorial nature. One of the charms of this theory is that it draws from various subdisciplines within the applied mathematical sciences, most notably probability theory. In this review, we have decided to highlight this aspect and especially the fact that randomness can — perhaps surprisingly — lead to very effective sensing mechanisms. We will also discuss significant implications, explain why CS is a concrete protocol for sensing and compressing data simultaneously (thus the name), and conclude our tour by reviewing important applications.

446 citations
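A small numerical illustration of the incoherence principle mentioned above, using the standard definition mu(Phi, Psi) = sqrt(n) * max |<phi_j, psi_k>|: the spike (identity) and Fourier bases achieve the minimal possible coherence of 1.

```python
import numpy as np

n = 64
Phi = np.eye(n)                               # spike (sampling) basis
Psi = np.fft.fft(np.eye(n)) / np.sqrt(n)      # orthonormal Fourier basis

mu = np.sqrt(n) * np.max(np.abs(Phi.conj().T @ Psi))
print(mu)   # 1.0: maximal incoherence, the best case for CS
```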


Journal ArticleDOI
TL;DR: In this article, it was shown that if one encodes the information as $Ax$, where $A$ is a suitable coding matrix, there are two decoding schemes that allow the recovery of the block of pieces of information with nearly the same accuracy as if no gross errors occurred upon transmission (or equivalently as if one had an oracle supplying perfect information about the sites and amplitudes of the gross errors).
Abstract: This paper discusses a stylized communications problem where one wishes to transmit a real-valued signal $x \in \mathbb{R}^n$ (a block of $n$ pieces of information) to a remote receiver. We ask whether it is possible to transmit this information reliably when a fraction of the transmitted codeword is corrupted by arbitrary gross errors, and when in addition, all the entries of the codeword are contaminated by smaller errors (e.g., quantization errors). We show that if one encodes the information as $Ax$, where $A$ is a suitable coding matrix, there are two decoding schemes that allow the recovery of the block of $n$ pieces of information with nearly the same accuracy as if no gross errors occurred upon transmission (or equivalently as if one had an oracle supplying perfect information about the sites and amplitudes of the gross errors). Moreover, both decoding strategies are very concrete and only involve solving simple convex optimization programs, either a linear program or a second-order cone program. We complement our study with numerical simulations showing that the encoder/decoder pair performs remarkably well.

165 citations
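A minimal sketch of the linear-programming flavor of decoding described above, in the special case of gross errors only (no small quantization noise): decode by minimizing the l1 norm of the residual y - Ax. The sizes and corruption rate are illustrative.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
n, m = 64, 256                           # message length, codeword length
A = rng.standard_normal((m, n))          # illustrative coding matrix

x_true = rng.standard_normal(n)
y = A @ x_true
bad = rng.choice(m, size=m // 10, replace=False)
y[bad] += 10 * rng.standard_normal(len(bad))   # arbitrary gross corruptions

x = cp.Variable(n)
cp.Problem(cp.Minimize(cp.norm1(y - A @ x))).solve()
print(np.linalg.norm(x.value - x_true))   # near 0 despite the corruptions
```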


Proceedings ArticleDOI
01 Oct 2008
TL;DR: This paper considers three schemes, one based on a certain Restricted Isometry Property and two based on directly sensing the row and column space of the matrix, and studies their properties in terms of exact recovery in the ideal case, and robustness issues for approximately low-rank matrices and for noisy measurements.
Abstract: In this paper, we focus on compressed sensing and recovery schemes for low-rank matrices, asking under what conditions a low-rank matrix can be sensed and recovered from incomplete, inaccurate, and noisy observations. We consider three schemes, one based on a certain Restricted Isometry Property and two based on directly sensing the row and column space of the matrix. We study their properties in terms of exact recovery in the ideal case, and robustness issues for approximately low-rank matrices and for noisy measurements.

148 citations
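A toy instance of the RIP-style scheme mentioned above, assuming random Gaussian measurements of the vectorized matrix and recovery by nuclear norm minimization; cvxpy and all sizes are illustrative stand-ins for the paper's analysis.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)
n, r, m = 20, 2, 300                     # matrix size, rank, number of measurements

M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-r target
A = rng.standard_normal((m, n * n))
y = A @ M.flatten(order="F")             # column-major, to match cvxpy's vec below

X = cp.Variable((n, n))
cp.Problem(cp.Minimize(cp.normNuc(X)), [A @ cp.vec(X) == y]).solve()
print(np.linalg.norm(X.value - M) / np.linalg.norm(M))   # near 0
```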


Proceedings ArticleDOI
01 Sep 2008
TL;DR: It is shown that in very general settings, one can perfectly recover all of the missing entries from a sufficiently large random subset by solving a convex programming problem.
Abstract: Suppose that one observes an incomplete subset of entries selected uniformly at random from a low-rank matrix. When is it possible to complete the matrix and recover the entries that have not been seen? We show that in very general settings, one can perfectly recover all of the missing entries from a sufficiently large random subset by solving a convex programming problem. This program finds the matrix with the minimum nuclear norm agreeing with the observed entries. The techniques used in this analysis draw upon parallels in the field of compressed sensing, demonstrating that objects other than signals and images can be perfectly reconstructed from very limited information.

141 citations
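A minimal sketch of the convex program described above: among all matrices agreeing with the observed entries, pick the one of minimum nuclear norm. Sizes, rank, and sampling fraction are illustrative, and cvxpy stands in for whatever solver one prefers.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-r matrix

mask = (rng.random((n, n)) < 0.5).astype(float)   # observe about half the entries

X = cp.Variable((n, n))
cp.Problem(cp.Minimize(cp.normNuc(X)),
           [cp.multiply(mask, X) == mask * M]).solve()
print(np.linalg.norm(X.value - M) / np.linalg.norm(M))   # near 0
```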


Journal ArticleDOI
TL;DR: The main point of the paper is accurate statistical estimation in high dimensions, including theoretical, practical, and computational issues and how the Dantzig selector compares with the lasso.
Abstract: Rejoinder to ``The Dantzig selector: Statistical estimation when $p$ is much larger than $n$'' [math/0506081]

136 citations


Journal ArticleDOI
TL;DR: In this article, the authors develop detection thresholds for two types of common graphs which exhibit a different behavior: the usual regular lattice, with vertices of the form {(i, j) : 0 ≤ i, −i ≤ j ≤ i and j has the parity of i} and oriented edges (i, j) → (i+1, j+s) where s = ±1, and trees.
Abstract: Consider a graph with a set of vertices and oriented edges connecting pairs of vertices. Each vertex is associated with a random variable and these are assumed to be independent. In this setting, suppose we wish to solve the following hypothesis testing problem: under the null, the random variables have common distribution N(0,1), while under the alternative, there is an unknown path along which random variables have distribution N(μ, 1), μ > 0, and distribution N(0,1) away from it. For which values of the mean shift μ can one reliably detect, and for which values is this impossible? This paper develops detection thresholds for two types of common graphs which exhibit a different behavior. The first is the usual regular lattice with vertices of the form {(i, j) : 0 ≤ i, −i ≤ j ≤ i and j has the parity of i} and oriented edges (i, j) → (i+1, j+s) where s = ±1. We show that for paths of length m starting at the origin, the hypotheses become distinguishable (in a minimax sense) if μ_m ≫ 1/√log m, while they are not if μ_m ≪ 1/log m. We derive equivalent results in a Bayesian setting where one assumes that all paths are equally likely; there the asymptotic threshold is μ_m ≈ m^(−1/4). We obtain corresponding results for trees (where the threshold is of order 1 and independent of the size of the tree), for distributions other than the Gaussian, and for other graphs. The concept of predictability profile, first introduced by Benjamini, Pemantle and Peres, plays a crucial role in our analysis.

Journal ArticleDOI
TL;DR: This paper considers the problem of detecting nonstationary phenomena, and chirps in particular, from very noisy data, and introduces detection strategies which are very sensitive and more flexible than existing feature detectors.

Posted Content
TL;DR: It is demonstrated that in very general settings, one can perfectly recover all of the missing entries from most sufficiently large subsets by solving a convex programming problem that finds the matrix with the minimum nuclear norm agreeing with the observed entries.
Abstract: We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys m >= C n^{1.2} r log n for some positive numerical constant C, then with very high probability, most n by n matrices of rank r can be perfectly recovered by solving a simple convex optimization program. This program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large. However, if one replaces the 1.2 exponent with 1.25, then the result holds for all values of the rank. Similar results hold for arbitrary rectangular matrices as well. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information.
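A worked instance of the sampling condition m >= C n^1.2 r log n, taking C = 1 purely for illustration since the paper leaves the numerical constant unspecified:

```python
import math

n, r, C = 1000, 5, 1.0             # illustrative sizes; the constant C is unknown
m = C * n**1.2 * r * math.log(n)   # m >= C * n^1.2 * r * log n
print(f"{m:.0f} of {n*n} entries, i.e. about {100 * m / n**2:.0f}%")
# for these sizes m comes out to roughly 1.4e5, i.e. ~14% of all entries
```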

Journal ArticleDOI
TL;DR: In this paper, the problem of finding the best approximation to a given signal using chirplets can be reduced to finding the path of minimum cost in a weighted, directed graph, and can be solved in polynomial time via dynamic programming.
Abstract: A generic 'chirp' of the form h(t) = A(t)cos phi(t) can be closely approximated by a connected set of multiscale chirplets with quadratically-evolving phase. The problem of finding the best approximation to a given signal using chirplets can be reduced to that of finding the path of minimum cost in a weighted, directed graph, and can be solved in polynomial time via dynamic programming. For a signal embedded in noise we apply constraints on the path length to obtain a statistic for detection of chirping signals in coloured noise. In this paper we present some results from using this test to detect binary black hole coalescences in simulated LIGO noise.
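A minimal sketch of the dynamic program described above, with the chirplet-fit costs abstracted into a generic cost array; the path-length constraints used for the detection statistic are omitted here.

```python
import numpy as np

def best_chirplet_path(cost):
    """cost[t, f0, f1]: cost of the chirplet on time slice [t, t+1] moving
    from frequency index f0 to f1. Returns the minimum-cost frequency path."""
    T, F, _ = cost.shape               # T time slices, F frequency indices
    best = np.zeros(F)                 # cheapest cost of any path ending at (t, f)
    parent = np.zeros((T, F), dtype=int)
    for t in range(T):
        totals = best[:, None] + cost[t]     # totals[f0, f1]: extend via (f0 -> f1)
        parent[t] = totals.argmin(axis=0)
        best = totals.min(axis=0)
    f = int(best.argmin())             # cheapest endpoint
    path = [f]
    for t in range(T - 1, -1, -1):     # trace the winning path backwards
        f = int(parent[t, f])
        path.append(f)
    return path[::-1]
```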

Posted Content
TL;DR: In this paper, the authors introduce an algorithm for computing Fourier integral operators with near-optimal computational complexity, built on low-rank approximations of the oscillatory kernel and structured after the butterfly algorithm of Michielssen and Boag.
Abstract: This paper is concerned with the fast computation of Fourier integral operators of the general form $\int_{\mathbb{R}^d} e^{2\pi i \Phi(x,k)} f(k)\,dk$, where $k$ is a frequency variable, $\Phi(x,k)$ is a phase function obeying a standard homogeneity condition, and $f$ is a given input. This is of interest because such fundamental computations are connected with the problem of finding numerical solutions to wave equations, and also frequently arise in many applications including reflection seismology, curvilinear tomography and others. In two dimensions, when the input and output are sampled on $N \times N$ Cartesian grids, a direct evaluation requires $O(N^4)$ operations, which is often prohibitively expensive. This paper introduces a novel algorithm running in $O(N^2 \log N)$ time, i.e., with near-optimal computational complexity, whose overall structure follows that of the butterfly algorithm [Michielssen and Boag, IEEE Trans Antennas Propagat 44 (1996), 1086-1093]. Underlying this algorithm is a mathematical insight concerning the restriction of the kernel $e^{2\pi i \Phi(x,k)}$ to subsets of the time and frequency domains. Whenever these subsets obey a simple geometric condition, the restricted kernel is approximately low-rank; we propose constructing such low-rank approximations using a special interpolation scheme, which prefactors the oscillatory component, interpolates the remaining nonoscillatory part and, lastly, remodulates the outcome. A byproduct of this scheme is that the whole algorithm is highly efficient in terms of memory requirements. Numerical results demonstrate the performance and illustrate the empirical properties of this algorithm.
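For orientation, a one-dimensional sketch of the object being computed, evaluated directly; this is the quadratic-cost baseline that the paper's butterfly-style algorithm accelerates, with a toy phase standing in for a real one.

```python
import numpy as np

N = 256
x = np.arange(N) / N                        # output grid in [0, 1)
k = np.arange(N) - N / 2                    # frequency grid
f = np.random.default_rng(5).standard_normal(N)

# toy phase, homogeneous of degree 1 in k (not a phase from the paper)
Phi = x[:, None] * k[None, :] * (1.0 + 0.1 * x[:, None])

u = np.exp(2j * np.pi * Phi) @ f            # direct summation: O(N^2) work in 1-D
```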
