Distributed delayed stochastic optimization
Alekh Agarwal, John C. Duchi
pp. 5451-5452
TLDR
In this paper, the authors analyze the convergence of gradient-based distributed optimization algorithms that base their updates on delayed stochastic gradient information, and show that the delay is asymptotically negligible.
Abstract
We analyze the convergence of gradient-based optimization algorithms that base their updates on delayed stochastic gradient information. The main application of our results is to gradient-based distributed optimization algorithms where a master node performs parameter updates while worker nodes compute stochastic gradients based on local information in parallel, which may give rise to delays due to asynchrony. We take motivation from statistical problems where the size of the data is so large that it cannot fit on one computer; with the advent of huge datasets in biology, astronomy, and the internet, such problems are now common. Our main contribution is to show that for smooth stochastic problems, the delays are asymptotically negligible and we can achieve order-optimal convergence results. We exhibit n-node architectures whose optimization error in stochastic problems—in spite of asynchronous delays—scales asymptotically as O(1/√nT) after T iterations. This rate is known to be optimal for a distributed system with n nodes even in the absence of delays. We additionally complement our theoretical results with numerical experiments on a logistic regression task.
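The master-worker delayed-update scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm or experimental setup: the quadratic toy objective, step-size schedule, and fixed delay of 10 steps are all assumptions made for the example.

```python
import numpy as np

def delayed_sgd(grad, x0, T, tau, step):
    """Gradient descent where the update at time t uses a gradient
    evaluated at the iterate from time t - tau, mimicking a master
    node applying delayed gradients received from worker nodes."""
    history = [np.asarray(x0, dtype=float)]
    x = history[0].copy()
    for t in range(T):
        stale = history[max(0, t - tau)]  # parameters the worker actually saw
        g = grad(stale)                   # delayed stochastic gradient
        x = x - step(t) * g
        history.append(x.copy())
    return x

# Toy smooth problem: f(x) = 0.5 * ||x||^2, so grad f(x) = x, minimized at 0.
rng = np.random.default_rng(0)
noisy_grad = lambda x: x + 0.01 * rng.standard_normal(x.shape)

x_final = delayed_sgd(noisy_grad, x0=[5.0, -3.0], T=2000, tau=10,
                      step=lambda t: 1.0 / (t + 50))
```

With a diminishing step size the iterates approach the minimizer even though every gradient is ten steps stale, illustrating the paper's point that the delay penalty vanishes asymptotically.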
Citations
Book
Adaptation, Learning, and Optimization Over Networks
TL;DR: The limits of performance of distributed solutions are examined, and procedures that help realize their potential more fully are discussed; a statistical framework is adopted to derive performance results that elucidate the mean-square stability, convergence, and steady-state behavior of the learning networks.
Proceedings Article
More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server
Qirong Ho, James Cipar, Henggang Cui, Seunghak Lee, Jin-Kyu Kim, Phillip B. Gibbons, Garth A. Gibson, Greg Ganger, Eric P. Xing
TL;DR: A parameter server system for distributed ML, which follows a Stale Synchronous Parallel (SSP) model of computation that maximizes the time computational workers spend doing useful work on ML algorithms, while still providing correctness guarantees.
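The bounded-staleness rule at the heart of the SSP model can be sketched with a simple vector clock. The class and method names below are hypothetical, invented for illustration; they are not the actual parameter-server API.

```python
class SSPClock:
    """Stale Synchronous Parallel sketch: a worker may begin a new clock
    only while it is at most `staleness` clocks ahead of the slowest
    worker, so fast workers keep computing instead of waiting at a
    barrier, yet no read is ever unboundedly stale."""
    def __init__(self, n_workers, staleness):
        self.clocks = [0] * n_workers
        self.staleness = staleness

    def tick(self, worker):
        # Advance this worker's clock, unless it would exceed the bound.
        if self.clocks[worker] - min(self.clocks) > self.staleness:
            return False  # must block and wait for stragglers
        self.clocks[worker] += 1
        return True

clock = SSPClock(n_workers=3, staleness=2)
for _ in range(3):
    clock.tick(0)            # a fast worker races ahead to clock 3
blocked = not clock.tick(0)  # a fourth tick would violate the bound
```

The design point SSP makes is that this bounded asynchrony maximizes useful work while still permitting convergence guarantees, in contrast to fully synchronous (slow) or fully asynchronous (no guarantee) execution.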
Journal ArticleDOI
Federated Learning With Differential Privacy: Algorithms and Performance Analysis
Kang Wei, Jun Li, Ming Ding, Chuan Ma, Howard H. Yang, Farhad Farokhi, Shi Jin, Tony Q. S. Quek, H. Vincent Poor
TL;DR: A novel framework based on the concept of differential privacy is proposed, in which artificial noise is added to parameters at the clients' side before aggregation, namely, noising before model aggregation FL (NbAFL).
Proceedings Article
Communication Efficient Distributed Machine Learning with the Parameter Server
TL;DR: An in-depth analysis of two large-scale machine learning problems, ranging from l1-regularized logistic regression on CPUs to reconstruction ICA on GPUs, using 636 TB of real data with hundreds of billions of samples and dimensions, is presented.
Journal ArticleDOI
Petuum: A New Platform for Distributed Machine Learning on Big Data
Eric P. Xing, Qirong Ho, Wei Dai, Jin-Kyu Kim, Jinliang Wei, Seunghak Lee, Xun Zheng, Pengtao Xie, Abhimanu Kumar, Yaoliang Yu
TL;DR: This work proposes a general-purpose framework, Petuum, that systematically addresses data- and model-parallel challenges in large-scale ML, by observing that many ML programs are fundamentally optimization-centric and admit error-tolerant, iterative-convergent algorithmic solutions.
References
Journal ArticleDOI
A Stochastic Approximation Method
Herbert Robbins, Sutton Monro
TL;DR: In this article, a method for making successive experiments at levels x1, x2, ··· in such a way that xn will tend to θ in probability is presented.
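The Robbins-Monro scheme sets x_{n+1} = x_n - a_n * Y_n, where Y_n is a noisy observation of a function M at x_n and the step sizes satisfy sum(a_n) = infinity and sum(a_n^2) < infinity. A minimal sketch, assuming the standard steps a_n = 1/n and an illustrative target M(x) = x - 2 that is not from the paper:

```python
import random

def robbins_monro(noisy_obs, x0, n_iter):
    """Robbins-Monro iteration x_{n+1} = x_n - a_n * Y_n with a_n = 1/n,
    which satisfies sum a_n = inf and sum a_n^2 < inf, so x_n tends in
    probability to the root theta of the underlying function M."""
    x = x0
    for n in range(1, n_iter + 1):
        x -= (1.0 / n) * noisy_obs(x)
    return x

random.seed(0)
# M(x) = x - 2 observed through additive Gaussian noise; the root is theta = 2.
obs = lambda x: (x - 2.0) + random.gauss(0.0, 0.1)
theta_hat = robbins_monro(obs, x0=10.0, n_iter=5000)
```

This iteration is the ancestor of stochastic gradient descent: taking Y_n to be a stochastic gradient recovers the update analyzed (with delays) in the main paper above.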
Book
Parallel and Distributed Computation: Numerical Methods
TL;DR: This work discusses parallel and distributed architectures, complexity measures, and communication and synchronization issues, and it presents both Jacobi and Gauss-Seidel iterations, which serve as algorithms of reference for many of the computational approaches addressed later.
Journal ArticleDOI
Distributed Subgradient Methods for Multi-Agent Optimization
Angelia Nedic, Asuman Ozdaglar
TL;DR: The authors' convergence rate results explicitly characterize the tradeoff between a desired accuracy of the generated approximate optimal solutions and the number of iterations needed to achieve the accuracy.
Journal ArticleDOI
RCV1: A New Benchmark Collection for Text Categorization Research
TL;DR: This work describes the coding policy and quality control procedures used in producing the RCV1 data, the intended semantics of the hierarchical category taxonomies, and the corrections necessary to remove errorful data.
Book
Problem complexity and method efficiency in optimization
TL;DR: A theory of the intrinsic complexity of optimization problems is developed, establishing lower bounds on the number of oracle queries any method must make and exhibiting methods whose efficiency matches these bounds.