scispace - formally typeset
Search or ask a question
Author

Terence Tao

Bio: Terence Tao is an academic researcher from University of California, Los Angeles. The author has contributed to research in topics: Conjecture & Random matrix. The author has an hindex of 111, co-authored 606 publications receiving 94316 citations. Previous affiliations of Terence Tao include Hokkaido University & Princeton University.


Papers
More filters
Journal ArticleDOI
TL;DR: In this paper, the authors considered the model problem of reconstructing an object from incomplete frequency samples and showed that with probability at least 1-O(N/sup -M/), f can be reconstructed exactly as the solution to the lscr/sub 1/ minimization problem.
Abstract: This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discrete-time signal f/spl isin/C/sup N/ and a randomly chosen set of frequencies /spl Omega/. Is it possible to reconstruct f from the partial knowledge of its Fourier coefficients on the set /spl Omega/? A typical result of this paper is as follows. Suppose that f is a superposition of |T| spikes f(t)=/spl sigma//sub /spl tau//spl isin/T/f(/spl tau/)/spl delta/(t-/spl tau/) obeying |T|/spl les/C/sub M//spl middot/(log N)/sup -1/ /spl middot/ |/spl Omega/| for some constant C/sub M/>0. We do not know the locations of the spikes nor their amplitudes. Then with probability at least 1-O(N/sup -M/), f can be reconstructed exactly as the solution to the /spl lscr//sub 1/ minimization problem. In short, exact recovery may be obtained by solving a convex optimization problem. We give numerical values for C/sub M/ which depend on the desired probability of success. Our result may be interpreted as a novel kind of nonlinear sampling theorem. In effect, it says that any signal made out of |T| spikes may be recovered by convex programming from almost every set of frequencies of size O(|T|/spl middot/logN). Moreover, this is nearly optimal in the sense that any method succeeding with probability 1-O(N/sup -M/) would in general require a number of frequency samples at least proportional to |T|/spl middot/logN. The methodology extends to a variety of other situations and higher dimensions. For example, we show how one can reconstruct a piecewise constant (one- or two-dimensional) object from incomplete frequency samples - provided that the number of jumps (discontinuities) obeys the condition above - by minimizing other convex functionals such as the total variation of f.

14,587 citations

Journal ArticleDOI
TL;DR: F can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program) and numerical experiments suggest that this recovery procedure works unreasonably well; f is recovered exactly even in situations where a significant fraction of the output is corrupted.
Abstract: This paper considers a natural error correcting problem with real valued input/output. We wish to recover an input vector f/spl isin/R/sup n/ from corrupted measurements y=Af+e. Here, A is an m by n (coding) matrix and e is an arbitrary and unknown vector of errors. Is it possible to recover f exactly from the data y? We prove that under suitable conditions on the coding matrix A, the input f is the unique solution to the /spl lscr//sub 1/-minimization problem (/spl par/x/spl par//sub /spl lscr/1/:=/spl Sigma//sub i/|x/sub i/|) min(g/spl isin/R/sup n/) /spl par/y - Ag/spl par//sub /spl lscr/1/ provided that the support of the vector of errors is not too large, /spl par/e/spl par//sub /spl lscr/0/:=|{i:e/sub i/ /spl ne/ 0}|/spl les//spl rho//spl middot/m for some /spl rho/>0. In short, f can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program). In addition, numerical experiments suggest that this recovery procedure works unreasonably well; f is recovered exactly even in situations where a significant fraction of the output is corrupted. This work is related to the problem of finding sparse solutions to vastly underdetermined systems of linear equations. There are also significant connections with the problem of recovering signals from highly incomplete measurements. In fact, the results introduced in this paper improve on our earlier work. Finally, underlying the success of /spl lscr//sub 1/ is a crucial property we call the uniform uncertainty principle that we shall describe in detail.

6,853 citations

Journal ArticleDOI
TL;DR: In this paper, the authors considered the problem of recovering a vector x ∈ R^m from incomplete and contaminated observations y = Ax ∈ e + e, where e is an error term.
Abstract: Suppose we wish to recover a vector x_0 Є R^m (e.g., a digital signal or image) from incomplete and contaminated observations y = Ax_0 + e; A is an n by m matrix with far fewer rows than columns (n « m) and e is an error term. Is it possible to recover x_0 accurately based on the data y? To recover x_0, we consider the solution x^# to the l_(1-)regularization problem min ‖x‖l_1 subject to ‖Ax - y‖l(2) ≤ Є, where Є is the size of the error term e. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the vector x_0 is sufficiently sparse, then the solution is within the noise level ‖x^# - x_0‖l_2 ≤ C Є. As a first example, suppose that A is a Gaussian random matrix; then stable recovery occurs for almost all such A's provided that the number of nonzeros of x_0 is of about the same order as the number of observations. As a second instance, suppose one observes few Fourier samples of x_0; then stable recovery occurs for almost any set of n coefficients provided that the number of nonzeros is of the order of n/[log m]^6. In the case where the error term vanishes, the recovery is of course exact, and this work actually provides novel insights into the exact recovery phenomenon discussed in earlier papers. The methodology also explains why one can also very nearly recover approximately sparse signals.

6,727 citations

Journal ArticleDOI
TL;DR: If the objects of interest are sparse in a fixed basis or compressible, then it is possible to reconstruct f to within very high accuracy from a small number of random measurements by solving a simple linear program.
Abstract: Suppose we are given a vector f in a class FsubeRopfN , e.g., a class of digital signals or digital images. How many linear measurements do we need to make about f to be able to recover f to within precision epsi in the Euclidean (lscr2) metric? This paper shows that if the objects of interest are sparse in a fixed basis or compressible, then it is possible to reconstruct f to within very high accuracy from a small number of random measurements by solving a simple linear program. More precisely, suppose that the nth largest entry of the vector |f| (or of its coefficients in a fixed basis) obeys |f|(n)lesRmiddotn-1p/, where R>0 and p>0. Suppose that we take measurements yk=langf# ,Xkrang,k=1,...,K, where the Xk are N-dimensional Gaussian vectors with independent standard normal entries. Then for each f obeying the decay estimate above for some 0

6,342 citations

Posted Content
TL;DR: It is shown that it is possible to recover x0 accurately based on the data y from incomplete and contaminated observations.
Abstract: Suppose we wish to recover an n-dimensional real-valued vector x_0 (e.g. a digital signal or image) from incomplete and contaminated observations y = A x_0 + e; A is a n by m matrix with far fewer rows than columns (n << m) and e is an error term. Is it possible to recover x_0 accurately based on the data y? To recover x_0, we consider the solution x* to the l1-regularization problem min \|x\|_1 subject to \|Ax-y\|_2 <= epsilon, where epsilon is the size of the error term e. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the vector x_0 is sufficiently sparse, then the solution is within the noise level \|x* - x_0\|_2 \le C epsilon. As a first example, suppose that A is a Gaussian random matrix, then stable recovery occurs for almost all such A's provided that the number of nonzeros of x_0 is of about the same order as the number of observations. Second, suppose one observes few Fourier samples of x_0, then stable recovery occurs for almost any set of p coefficients provided that the number of nonzeros is of the order of n/[\log m]^6. In the case where the error term vanishes, the recovery is of course exact, and this work actually provides novel insights on the exact recovery phenomenon discussed in earlier papers. The methodology also explains why one can also very nearly recover approximately sparse signals.

6,226 citations


Cited by
More filters
Book
D.L. Donoho1
01 Jan 2004
TL;DR: It is possible to design n=O(Nlog(m)) nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients, and a good approximation to those N important coefficients is extracted from the n measurements by solving a linear program-Basis Pursuit in signal processing.
Abstract: Suppose x is an unknown vector in Ropfm (a digital image or signal); we plan to measure n general linear functionals of x and then reconstruct. If x is known to be compressible by transform coding with a known transform, and we reconstruct via the nonlinear procedure defined here, the number of measurements n can be dramatically smaller than the size m. Thus, certain natural classes of images with m pixels need only n=O(m1/4log5/2(m)) nonadaptive nonpixel samples for faithful recovery, as opposed to the usual m pixel samples. More specifically, suppose x has a sparse representation in some orthonormal basis (e.g., wavelet, Fourier) or tight frame (e.g., curvelet, Gabor)-so the coefficients belong to an lscrp ball for 0

18,609 citations

Book
23 May 2011
TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.
Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for l1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.

17,433 citations

Journal ArticleDOI
TL;DR: In this paper, the authors considered the model problem of reconstructing an object from incomplete frequency samples and showed that with probability at least 1-O(N/sup -M/), f can be reconstructed exactly as the solution to the lscr/sub 1/ minimization problem.
Abstract: This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discrete-time signal f/spl isin/C/sup N/ and a randomly chosen set of frequencies /spl Omega/. Is it possible to reconstruct f from the partial knowledge of its Fourier coefficients on the set /spl Omega/? A typical result of this paper is as follows. Suppose that f is a superposition of |T| spikes f(t)=/spl sigma//sub /spl tau//spl isin/T/f(/spl tau/)/spl delta/(t-/spl tau/) obeying |T|/spl les/C/sub M//spl middot/(log N)/sup -1/ /spl middot/ |/spl Omega/| for some constant C/sub M/>0. We do not know the locations of the spikes nor their amplitudes. Then with probability at least 1-O(N/sup -M/), f can be reconstructed exactly as the solution to the /spl lscr//sub 1/ minimization problem. In short, exact recovery may be obtained by solving a convex optimization problem. We give numerical values for C/sub M/ which depend on the desired probability of success. Our result may be interpreted as a novel kind of nonlinear sampling theorem. In effect, it says that any signal made out of |T| spikes may be recovered by convex programming from almost every set of frequencies of size O(|T|/spl middot/logN). Moreover, this is nearly optimal in the sense that any method succeeding with probability 1-O(N/sup -M/) would in general require a number of frequency samples at least proportional to |T|/spl middot/logN. The methodology extends to a variety of other situations and higher dimensions. For example, we show how one can reconstruct a piecewise constant (one- or two-dimensional) object from incomplete frequency samples - provided that the number of jumps (discontinuities) obeys the condition above - by minimizing other convex functionals such as the total variation of f.

14,587 citations

Journal ArticleDOI
TL;DR: In comparative timings, the new algorithms are considerably faster than competing methods and can handle large problems and can also deal efficiently with sparse features.
Abstract: We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems while the penalties include l(1) (the lasso), l(2) (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.

13,656 citations

Journal ArticleDOI
TL;DR: The theory of compressive sampling, also known as compressed sensing or CS, is surveyed, a novel sensing/sampling paradigm that goes against the common wisdom in data acquisition.
Abstract: Conventional approaches to sampling signals or images follow Shannon's theorem: the sampling rate must be at least twice the maximum frequency present in the signal (Nyquist rate). In the field of data conversion, standard analog-to-digital converter (ADC) technology implements the usual quantized Shannon representation - the signal is uniformly sampled at or above the Nyquist rate. This article surveys the theory of compressive sampling, also known as compressed sensing or CS, a novel sensing/sampling paradigm that goes against the common wisdom in data acquisition. CS theory asserts that one can recover certain signals and images from far fewer samples or measurements than traditional methods use.

9,686 citations