On the Non-ergodic Convergence Rate of the Directed Nonsmooth Composite Optimization.

doi:10.1007/978-3-030-69244-5_11

Book ChapterDOI

On the Non-ergodic Convergence Rate of the Directed Nonsmooth Composite Optimization.

- pp 129-140

TLDR

In this paper, a decentralized stochastic proximal gradient method, termed DSPG, is proposed to ensure the structural properties of the output to be preserved, where the nonergodic (last) iteration acts as the output.

Abstract:

This paper considers the distributed “nonsmooth+nonsmooth” composite optimization problems for which n agents collaboratively minimize the sum of their local objective functions over the directed networks. In particular, we focus on the scenarios where the sought solutions are desired to possess some structural properties, e.g., sparsity. However, to ensure the convergence, most existing methods produce an ergodic solution via the averaging schemes as the output, which causes the desired structural properties of the output to be destroyed. To address this issue, we develop a new decentralized stochastic proximal gradient method, termed DSPG, in which the nonergodic (last) iteration acts as the output. We also show that the DSPG method achieves the nonergodic convergence rate \(O(\log (T)/\sqrt{T})\) for generally convex objective functions and \(O(\log (T)/T)\) for strongly convex objective functions. When the structure-enhancing regularization is absent and the simple and suffix averaging schemes are used, the convergence rates of DSPG reach \(O(1/\sqrt{T})\) for generally convex objective functions and O(1/T) for strongly convex objective functions, showing improvement relative to the rates \(O(\log (T)/\sqrt{T})\) and \(O(\log (T)/T)\) provided by the existing methods. Simulation examples further illustrate the effectiveness of the proposed method.

References

PDF

Open Access

More filters

Journal ArticleDOI

Deep learning

Yann LeCun, +4 more

- 28 May 2015 -

Nature

TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.

...read moreread less

Journal ArticleDOI

Regression Shrinkage and Selection via the Lasso

Robert Tibshirani

- 01 Jan 1996 -

Journal of the royal statistical society...

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.

...read moreread less

Journal ArticleDOI

Support vector machines

Marti A. Hearst, +4 more

- 01 Jul 1998 -

IEEE Intelligent Systems & Their Applica...

TL;DR: This issue's collection of essays should help familiarize readers with this interesting new racehorse in the Machine Learning stable, and give a practical guide and a new technique for implementing the algorithm efficiently.

...read moreread less

Book

Proximal Algorithms

Neal Parikh, +1 more

TL;DR: The many different interpretations of proximal operators and algorithms are discussed, their connections to many other topics in optimization and applied mathematics are described, some popular algorithms are surveyed, and a large number of examples of proxiesimal operators that commonly arise in practice are provided.

...read moreread less

Book

Stochastic approximation and recursive algorithms and applications

Harold J. Kushner, +1 more

TL;DR: A review of continuous time models can be found in this paper, where the authors present an algorithm for the Ergodic Cost Problem: Formulation and Algorithms 7.1 Formulation of the control problem 7.2 A Jacobi Type Iteration 7.3 Approximation in Policy Space 7.4 Numerical Methods 7.5 The Control Problem 7.6 The Interpolated Process 7.7 Computations 7.8 Linear Programming 7.

...read moreread less