scispace - formally typeset
Search or ask a question
Journal ArticleDOI

A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging

01 May 2011-Journal of Mathematical Imaging and Vision (Springer US)-Vol. 40, Iss: 1, pp 120-145
TL;DR: A first-order primal-dual algorithm for non-smooth convex optimization problems with known saddle-point structure can achieve O(1/N2) convergence on problems, where the primal or the dual objective is uniformly convex, and it can show linear convergence, i.e. O(ωN) for some ω∈(0,1), on smooth problems.
Abstract: In this paper we study a first-order primal-dual algorithm for non-smooth convex optimization problems with known saddle-point structure. We prove convergence to a saddle-point with rate O(1/N) in finite dimensions for the complete class of problems. We further show accelerations of the proposed algorithm to yield improved rates on problems with some degree of smoothness. In particular we show that we can achieve O(1/N 2) convergence on problems, where the primal or the dual objective is uniformly convex, and we can show linear convergence, i.e. O(? N ) for some ??(0,1), on smooth problems. The wide applicability of the proposed algorithm is demonstrated on several imaging problems such as image denoising, image deconvolution, image inpainting, motion estimation and multi-label image segmentation.
Citations
More filters
Book
27 Nov 2013
TL;DR: The many different interpretations of proximal operators and algorithms are discussed, their connections to many other topics in optimization and applied mathematics are described, some popular algorithms are surveyed, and a large number of examples of proxiesimal operators that commonly arise in practice are provided.
Abstract: This monograph is about a class of optimization algorithms called proximal algorithms. Much like Newton's method is a standard tool for solving unconstrained smooth optimization problems of modest size, proximal algorithms can be viewed as an analogous tool for nonsmooth, constrained, large-scale, or distributed versions of these problems. They are very generally applicable, but are especially well-suited to problems of substantial recent interest involving large or high-dimensional datasets. Proximal methods sit at a higher level of abstraction than classical algorithms like Newton's method: the base operation is evaluating the proximal operator of a function, which itself involves solving a small convex optimization problem. These subproblems, which generalize the problem of projecting a point onto a convex set, often admit closed-form solutions or can be solved very quickly with standard or simple specialized methods. Here, we discuss the many different interpretations of proximal operators and algorithms, describe their connections to many other topics in optimization and applied mathematics, survey some popular algorithms, and provide a large number of examples of proximal operators that commonly arise in practice.

3,627 citations

Proceedings ArticleDOI
06 Nov 2011
TL;DR: It is demonstrated that a dense model permits superior tracking performance under rapid motion compared to a state of the art method using features; and the additional usefulness of the dense model for real-time scene interaction in a physics-enhanced augmented reality application is shown.
Abstract: DTAM is a system for real-time camera tracking and reconstruction which relies not on feature extraction but dense, every pixel methods. As a single hand-held RGB camera flies over a static scene, we estimate detailed textured depth maps at selected keyframes to produce a surface patchwork with millions of vertices. We use the hundreds of images available in a video stream to improve the quality of a simple photometric data term, and minimise a global spatially regularised energy functional in a novel non-convex optimisation framework. Interleaved, we track the camera's 6DOF motion precisely by frame-rate whole image alignment against the entire dense model. Our algorithms are highly parallelisable throughout and DTAM achieves real-time performance using current commodity GPU hardware. We demonstrate that a dense model permits superior tracking performance under rapid motion compared to a state of the art method using features; and also show the additional usefulness of the dense model for real-time scene interaction in a physics-enhanced augmented reality application.

2,001 citations


Cites methods from "A First-Order Primal-Dual Algorithm..."

  • ...Gradient ascent/descent time-steps σq, σd are set optimally for the update scheme provided as detailed in [3]....

    [...]

  • ...Following [1][16][3], we use duality principles to arrive at the primal-dual form of g(u)‖∇ξ(u)‖ +Q(u)....

    [...]

  • ...As a function of ξ, the convex sum g(u)‖∇ξ(u)‖ + Q(u) is a small modification of the TV-L(2)2 ROF image denoising model term [11], and can be efficiently optimised using a primal-dual approach [1][16][3]....

    [...]

Proceedings ArticleDOI
21 Jul 2017
TL;DR: In this paper, a set of fast and effective CNN (convolutional neural network) denoisers and integrate them into model-based optimization method to solve other inverse problems (e.g., deblurring).
Abstract: Model-based optimization methods and discriminative learning methods have been the two dominant strategies for solving various inverse problems in low-level vision. Typically, those two kinds of methods have their respective merits and drawbacks, e.g., model-based optimization methods are flexible for handling different inverse problems but are usually time-consuming with sophisticated priors for the purpose of good performance, in the meanwhile, discriminative learning methods have fast testing speed but their application range is greatly restricted by the specialized task. Recent works have revealed that, with the aid of variable splitting techniques, denoiser prior can be plugged in as a modular part of model-based optimization methods to solve other inverse problems (e.g., deblurring). Such an integration induces considerable advantage when the denoiser is obtained via discriminative learning. However, the study of integration with fast discriminative denoiser prior is still lacking. To this end, this paper aims to train a set of fast and effective CNN (convolutional neural network) denoisers and integrate them into model-based optimization method to solve other inverse problems. Experimental results demonstrate that the learned set of denoisers can not only achieve promising Gaussian denoising results but also can be used as prior to deliver good performance for various low-level vision applications.

1,216 citations

Book
Sébastien Bubeck1
28 Oct 2015
TL;DR: This monograph presents the main complexity theorems in convex optimization and their corresponding algorithms and provides a gentle introduction to structural optimization with FISTA, saddle-point mirror prox, Nemirovski's alternative to Nesterov's smoothing, and a concise description of interior point methods.
Abstract: This monograph presents the main complexity theorems in convex optimization and their corresponding algorithms. Starting from the fundamental theory of black-box optimization, the material progresses towards recent advances in structural optimization and stochastic optimization. Our presentation of black-box optimization, strongly influenced by the seminal book of Nesterov, includes the analysis of cutting plane methods, as well as accelerated gradient descent schemes. We also pay special attention to non-Euclidean settings relevant algorithms include Frank-Wolfe, mirror descent, and dual averaging and discuss their relevance in machine learning. We provide a gentle introduction to structural optimization with FISTA to optimize a sum of a smooth and a simple non-smooth term, saddle-point mirror prox Nemirovski's alternative to Nesterov's smoothing, and a concise description of interior point methods. In stochastic optimization we discuss stochastic gradient descent, mini-batches, random coordinate descent, and sublinear algorithms. We also briefly touch upon convex relaxation of combinatorial problems and the use of randomness to round solutions, as well as random walks based methods.

1,213 citations

References
More filters
Journal ArticleDOI
TL;DR: In this article, a constrained optimization type of numerical algorithm for removing noise from images is presented, where the total variation of the image is minimized subject to constraints involving the statistics of the noise.

15,225 citations


"A First-Order Primal-Dual Algorithm..." refers background or methods in this paper

  • ...5Interestingly, total variation regularization appears in [34] in the context of motion estimation several years before it was popularized by Rudin, Osher and Fatemi in [33] for image denoising....

    [...]

  • ...In [37], Zhu and Chan used this classical Arrow-Hurwicz method to solve the Rudin Osher and Fatemi (ROF) image denoising problem [33]....

    [...]

  • ...As a prototype for total variation methods in imaging we recall the total variation based image denoising model proposed by Rudin, Osher and Fatemi in [33]....

    [...]

Journal ArticleDOI
TL;DR: A new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically.
Abstract: We consider the class of iterative shrinkage-thresholding algorithms (ISTA) for solving linear inverse problems arising in signal/image processing. This class of methods, which can be viewed as an extension of the classical gradient algorithm, is attractive due to its simplicity and thus is adequate for solving large-scale problems even with dense matrix data. However, such methods are also known to converge quite slowly. In this paper we present a new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically. Initial promising numerical results for wavelet-based image deblurring demonstrate the capabilities of FISTA which is shown to be faster than ISTA by several orders of magnitude.

11,413 citations


"A First-Order Primal-Dual Algorithm..." refers background or methods in this paper

  • ...Is shown in [2, 25, 27, 28] that if G or F ∗ is uniformly convex (such that G∗, or respectively F , has a Lipschitz continuous gradient), O(1/N2) convergence can be guaranteed....

    [...]

  • ...Remark 4 In [2, 25, 27], the O(1/N2) estimate is theoretically better than ours since it is on the dual energy G∗(−K∗yN)+F ∗(yN)−(G∗(−K∗ŷ)+F ∗(ŷ)) (which can easily be shown to bound ‖xN − x̂‖2, see for instance [13])....

    [...]

  • ...• FISTA: O(1/N2) fast iterative shrinkage thresholding algorithm on the dual ROF problem (66) [2, 25]....

    [...]

  • ...(35) In that case one can show that ∇G∗ is 1/γ -Lipschitz so that the dual problem (4) can be solved in O(1/N2) using any of the accelerated first order methods of [2, 25, 27], in the sense that the objective (in this case, the dual energy) approaches its optimal value at the rate O(1/N2), where N is the number of first order iterations....

    [...]

  • ...• NEST: Restarted version of Nesterov’s algorithm [2, 25, 28], on the dual Huber-ROF problem....

    [...]

Journal ArticleDOI
TL;DR: In this article, the authors introduce and study the most basic properties of three new variational problems which are suggested by applications to computer vision, and study their application in computer vision.
Abstract: : This reprint will introduce and study the most basic properties of three new variational problems which are suggested by applications to computer vision. In computer vision, a fundamental problem is to appropriately decompose the domain R of a function g (x,y) of two variables. This problem starts by describing the physical situation which produces images: assume that a three-dimensional world is observed by an eye or camera from some point P and that g1(rho) represents the intensity of the light in this world approaching the point sub 1 from a direction rho. If one has a lens at P focusing this light on a retina or a film-in both cases a plane domain R in which we may introduce coordinates x, y then let g(x,y) be the strength of the light signal striking R at a point with coordinates (x,y); g(x,y) is essentially the same as sub 1 (rho) -possibly after a simple transformation given by the geometry of the imaging syste. The function g(x,y) defined on the plane domain R will be called an image. What sort of function is g? The light reflected off the surfaces Si of various solid objects O sub i visible from P will strike the domain R in various open subsets R sub i. When one object O1 is partially in front of another object O2 as seen from P, but some of object O2 appears as the background to the sides of O1, then the open sets R1 and R2 will have a common boundary (the 'edge' of object O1 in the image defined on R) and one usually expects the image g(x,y) to be discontinuous along this boundary. (JHD)

5,516 citations


Additional excerpts

  • ...Finally, we consider the problem of finding a partition of an image into k pairwise disjoint regions, which minimizes the total interface between the sets, as for example in the piecewise constant Mumford-Shah problem [ 21 ]...

    [...]