Open Access Journal Article

Stochastic Conditional Gradient Methods: From Convex Minimization to Submodular Maximization

Aryan Mokhtari, +2 more
01 Jan 2020 - Vol. 21, Iss. 105, pp. 1-49
TL;DR
Stochastic conditional gradient methods are proposed as an alternative that approximates gradients via a simple averaging technique requiring a single stochastic gradient evaluation per iteration; replacing the projection step of proximal methods with a linear program lowers the computational complexity of each iteration.
Abstract
This paper considers stochastic optimization problems for a large class of objective functions, including convex and continuous submodular functions. Stochastic proximal gradient methods have been widely used to solve such problems; however, their applicability remains limited when the problem dimension is large and the projection onto a convex set is costly. Instead, stochastic conditional gradient methods are proposed as an alternative solution relying on (i) approximating gradients via a simple averaging technique that requires a single stochastic gradient evaluation per iteration, and (ii) solving a linear program to compute the descent/ascent direction. The averaging technique reduces the noise of the gradient approximations as time progresses, and replacing the projection step of proximal methods with a linear program lowers the computational complexity of each iteration. We show that under convexity and smoothness assumptions, our proposed method converges to the optimal objective function value at a sublinear rate of $O(1/t^{1/3})$. Further, for a monotone and continuous DR-submodular function subject to a general convex body constraint, we prove that our proposed method achieves a $((1-1/e)\mathrm{OPT}-\epsilon)$ guarantee with $O(1/\epsilon^3)$ stochastic gradient computations. This guarantee matches the known hardness results and closes the gap between deterministic and stochastic continuous submodular maximization. Additionally, we obtain a $((1/e)\mathrm{OPT}-\epsilon)$ guarantee after using $O(1/\epsilon^3)$ stochastic gradients for the case where the objective function is continuous DR-submodular but non-monotone and the constraint set is down-closed. By using stochastic continuous optimization as an interface, we provide the first tight $(1-1/e)$ approximation guarantee for maximizing a monotone but stochastic submodular set function subject to a matroid constraint, and a $(1/e)$ approximation guarantee for the non-monotone case.
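As a rough illustration of the two ingredients above (a decaying-weight running average of stochastic gradients and a linear-program, i.e. Frank-Wolfe style, step), here is a minimal Python sketch of the minimization variant. The names `grad_oracle` and `lmo` are hypothetical placeholders for user-supplied oracles, and the averaging and step-size schedules are indicative of the kind used in the $O(1/t^{1/3})$ analysis rather than the paper's exact choices.

```python
import numpy as np

def stochastic_conditional_gradient(grad_oracle, lmo, x0, T):
    """Sketch of a projection-free stochastic conditional gradient loop.

    grad_oracle(x) -> one unbiased stochastic gradient at x
    lmo(g)         -> argmin over the convex set C of <g, v> (a linear program)
    """
    x = np.asarray(x0, dtype=float)
    d = np.zeros_like(x)                 # running average of stochastic gradients
    for t in range(1, T + 1):
        rho = 1.0 / t ** (2.0 / 3.0)     # averaging weight; decays so gradient noise shrinks
        gamma = 1.0 / t                  # step size toward the linear-program solution
        g = grad_oracle(x)               # a single stochastic gradient per iteration
        d = (1.0 - rho) * d + rho * g    # noise-reduced gradient estimate
        v = lmo(d)                       # descent direction from a linear program, no projection
        x = x + gamma * (v - x)          # convex combination keeps x feasible in C
    return x
```

For maximizing a monotone DR-submodular function, the analogous stochastic continuous greedy variant starts from the origin and replaces the last update with `x = x + v / T`, where `v` maximizes the inner product with the averaged gradient over the constraint set. Over the probability simplex, for instance, the linear program amounts to picking a single coordinate of `d`, which is typically much cheaper than a Euclidean projection.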



Citations
Book

Submodular functions and optimization

Satoru Fujishige
TL;DR: The Lovász extensions of submodular functions are extended to include nonlinear weight functions and linear weight functions with continuous variables, and a decomposition algorithm is proposed.
Posted Content

Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates

TL;DR: In this article, the authors propose a stochastic gradient algorithm with zeroth-order information, whose rate of convergence depends only poly-logarithmically on the dimensionality of the Hessian matrix.
Journal Article

Zeroth-Order Nonconvex Stochastic Optimization: Handling Constraints, High Dimensionality, and Saddle Points

TL;DR: This paper proposes and analyzes zeroth-order stochastic approximation algorithms for nonconvex and convex optimization, with a focus on constrained optimization, the high-dimensional setting, and avoiding saddle points.
Proceedings Article

One Sample Stochastic Frank-Wolfe

TL;DR: The one-sample stochastic Frank-Wolfe algorithm (1-SFW) was proposed in this article; it requires only a single stochastic gradient per iteration while matching the best known convergence guarantees for stochastic conditional gradient methods.
Proceedings Article

Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator

TL;DR: This work adopts the recent stochastic path-integrated differential estimator technique (SPIDER) of Fang et al. (2018) for the classical Frank-Wolfe method, and introduces SPIDER-FW for finite-sum minimization as well as more general expectation minimization problems.
References
Book

The Nature of Statistical Learning Theory

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Journal Article

A Stochastic Approximation Method

TL;DR: In this article, a method is presented for making successive experiments at levels $x_1, x_2, \ldots$ in such a way that $x_n$ tends to $\theta$ in probability.
Book Chapter

Large-Scale Machine Learning with Stochastic Gradient Descent

Léon Bottou
TL;DR: A more precise analysis uncovers qualitatively different tradeoffs for the case of small-scale and large-scale learning problems.
Posted Content

An analysis of approximations for maximizing submodular set functions II

TL;DR: In this article, the authors considered the problem of finding a maximum weight independent set in a matroid, where the elements of the matroid are colored and the items of the independent set can have no more than K colors.