
Showing papers by "Anit Kumar Sahu published in 2020"


Journal ArticleDOI
TL;DR: In this paper, the authors discuss the unique characteristics and challenges of federated learning, provide a broad overview of current approaches, and outline several directions of future work that are relevant to a wide range of research communities.
Abstract: Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized. Training in heterogeneous and potentially massive networks introduces novel challenges that require a fundamental departure from standard approaches for large-scale machine learning, distributed optimization, and privacy-preserving data analysis. In this article, we discuss the unique characteristics and challenges of federated learning, provide a broad overview of current approaches, and outline several directions of future work that are relevant to a wide range of research communities.
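
To make the training pattern described in this abstract concrete (local computation on each device, aggregation of model updates, raw data never leaving the device), here is a minimal federated-averaging-style sketch. The linear model, squared loss, client data, and weighting scheme are illustrative placeholders, not details taken from the article.

```python
# Minimal sketch of one federated-averaging-style round (illustrative only).
# The linear model, squared loss, and synthetic client data are placeholders.
import numpy as np

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one client's local data."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

def federated_round(w_global, clients):
    """One round: broadcast, local training, size-weighted averaging."""
    updates, sizes = [], []
    for X, y in clients:                     # raw data stays on the client
        updates.append(local_sgd(w_global, X, y))
        sizes.append(len(y))
    weights = np.array(sizes) / sum(sizes)
    return sum(wk * uk for wk, uk in zip(weights, updates))

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
w = np.zeros(3)
for _ in range(10):
    w = federated_round(w, clients)
```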

2,163 citations


15 Mar 2020
TL;DR: This work introduces a framework, FedProx, to tackle heterogeneity in federated networks, and provides convergence guarantees for this framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work.
Abstract: Federated Learning is a distributed learning paradigm with two key challenges that differentiate it from traditional distributed optimization: (1) significant variability in terms of the systems characteristics on each device in the network (systems heterogeneity), and (2) non-identically distributed data across the network (statistical heterogeneity). In this work, we introduce a framework, FedProx, to tackle heterogeneity in federated networks. FedProx can be viewed as a generalization and re-parameterization of FedAvg, the current state-of-the-art method for federated learning. While this re-parameterization makes only minor modifications to the method itself, these modifications have important ramifications both in theory and in practice. Theoretically, we provide convergence guarantees for our framework when learning over data from non-identical distributions (statistical heterogeneity), and while adhering to device-level systems constraints by allowing each participating device to perform a variable amount of work (systems heterogeneity). Practically, we demonstrate that FedProx allows for more robust convergence than FedAvg across a suite of realistic federated datasets. In particular, in highly heterogeneous settings, FedProx demonstrates significantly more stable and accurate convergence behavior relative to FedAvg, improving absolute test accuracy by 22% on average.
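
The sketch below illustrates the kind of proximal local step the abstract refers to: each device minimizes its local loss plus a term that keeps its iterate close to the current global model. The value of mu, the loss, and the step counts are illustrative assumptions, not the paper's settings.

```python
# Sketch of a FedProx-style local update: local loss plus a proximal term
# (mu/2) * ||w - w_global||^2 that anchors the local iterate to the global
# model. Loss and data are placeholders.
import numpy as np

def fedprox_local_update(w_global, X, y, mu=0.1, lr=0.05, steps=20):
    w = w_global.copy()
    for _ in range(steps):
        grad_loss = X.T @ (X @ w - y) / len(y)   # local loss gradient
        grad_prox = mu * (w - w_global)          # proximal term gradient
        w -= lr * (grad_loss + grad_prox)
    return w
```

Because the proximal term tolerates inexact local solutions, each device can run a different number of local steps per round, which is how the framework accommodates the systems heterogeneity described above.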

1,490 citations


Posted Content
TL;DR: This work proposes FedDANE, an optimization method that is adapted from DANE, a method for classical distributed optimization, to handle the practical constraints of federated learning, and provides convergence guarantees for this method when learning over both convex and non-convex functions.
Abstract: Federated learning aims to jointly learn statistical models over massively distributed remote devices. In this work, we propose FedDANE, an optimization method that we adapt from DANE, a method for classical distributed optimization, to handle the practical constraints of federated learning. We provide convergence guarantees for this method when learning over both convex and non-convex functions. Despite encouraging theoretical results, we find that the method has underwhelming performance empirically. In particular, through empirical simulations on both synthetic and real-world datasets, FedDANE consistently underperforms baselines of FedAvg and FedProx in realistic federated settings. We identify low device participation and statistical device heterogeneity as two underlying causes of this underwhelming performance, and conclude by suggesting several directions of future work.
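
The abstract does not spell out the update rule, so the following is only a rough, assumption-laden illustration of a DANE-style local subproblem of the kind FedDANE adapts: a local loss with a gradient-correction linear term and a proximal penalty, solved approximately by gradient descent. The loss, constants, and solver are placeholders.

```python
# Rough sketch of a DANE-style local subproblem (illustrative assumptions only):
# minimize the local loss, corrected by the gap between the local and global
# gradients at the current model, plus a proximal penalty.
import numpy as np

def dane_style_local_update(w_global, grad_global, X, y, mu=1.0, lr=0.05, steps=50):
    w = w_global.copy()
    grad_local_at_global = X.T @ (X @ w_global - y) / len(y)
    correction = grad_local_at_global - grad_global    # aligns local and global descent
    for _ in range(steps):
        grad_local = X.T @ (X @ w - y) / len(y)
        grad = grad_local - correction + mu * (w - w_global)
        w -= lr * grad
    return w
```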

63 citations


Posted Content
13 Jul 2020
TL;DR: This work proposes a simple and efficient Bayesian Optimization (BO) based approach for developing black-box adversarial attacks, which consistently achieves 2x to 10x higher attack success rate while requiring 10x to 20x fewer queries compared to the current state-of-the-art black-box adversarial attacks.
Abstract: We focus on the problem of black-box adversarial attacks, where the aim is to generate adversarial examples for deep learning models solely based on information limited to the output label (hard label) for a queried data input. We use Bayesian optimization (BO) to develop efficient adversarial attacks, specifically catering to scenarios involving low query budgets. Issues with BO's performance in high dimensions are avoided by searching for adversarial examples in a structured low-dimensional subspace. Our proposed approach achieves better performance than state-of-the-art black-box adversarial attacks that require orders of magnitude more queries than ours.
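
As a minimal sketch of the idea described above, the code below runs Bayesian optimization over a low-dimensional perturbation that is upsampled to the input size before querying the model. The nearest-neighbour upsampling, the expected-improvement acquisition, the 0/1 success objective, and the stand-in `query_hard_label` function are all illustrative assumptions and are not claimed to match the paper's implementation.

```python
# Illustrative sketch of a hard-label black-box attack driven by Bayesian
# optimization in a structured low-dimensional subspace. All details are
# placeholders, not the paper's method.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

D_LOW, SIDE, EPS = 16, 28, 0.3           # latent dim, image side, l_inf budget

def upsample(z, side=SIDE):
    """Nearest-neighbour upsample a flat low-dim perturbation to side x side."""
    grid = int(np.sqrt(len(z)))
    small = z.reshape(grid, grid)
    reps = int(np.ceil(side / grid))
    return small.repeat(reps, 0).repeat(reps, 1)[:side, :side]

def attack_objective(z, x, true_label, query_hard_label):
    """0/1 stand-in objective: 1 if the perturbed input is misclassified."""
    x_adv = np.clip(x + EPS * np.sign(upsample(z)), 0.0, 1.0)
    return 1.0 if query_hard_label(x_adv) != true_label else 0.0

def expected_improvement(gp, Z_cand, best):
    mu, sigma = gp.predict(Z_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    gamma = (mu - best) / sigma
    return sigma * (gamma * norm.cdf(gamma) + norm.pdf(gamma))

def bo_attack(x, true_label, query_hard_label, budget=50, seed=0):
    rng = np.random.default_rng(seed)
    Z = rng.uniform(-1, 1, size=(5, D_LOW))                       # initial design
    f = np.array([attack_objective(z, x, true_label, query_hard_label) for z in Z])
    gp = GaussianProcessRegressor()
    for _ in range(budget - len(Z)):
        gp.fit(Z, f)
        cand = rng.uniform(-1, 1, size=(256, D_LOW))              # candidate pool
        z_next = cand[np.argmax(expected_improvement(gp, cand, f.max()))]
        f_next = attack_objective(z_next, x, true_label, query_hard_label)
        Z, f = np.vstack([Z, z_next]), np.append(f, f_next)
        if f_next > 0:                                            # success: stop early
            return np.clip(x + EPS * np.sign(upsample(z_next)), 0.0, 1.0)
    return None
```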

14 citations


Journal ArticleDOI
18 Aug 2020
TL;DR: An overview of recent work in the area of distributed zeroth-order optimization, focusing on constrained optimization settings and algorithms built around the Frank–Wolfe framework is presented.
Abstract: Zeroth-order optimization algorithms are an attractive alternative for stochastic optimization problems, when gradient computations are expensive or when closed-form loss functions are not available. Recently, there has been a surge of activity in utilizing zeroth-order optimization algorithms in myriad applications including black-box adversarial attacks on machine learning frameworks, reinforcement learning, and simulation-based optimization, to name a few. In addition to exploiting the simplicity of a typical zeroth-order optimization scheme, distributed implementations of zeroth-order methods that exploit data parallelizability have recently been receiving significant attention. This article presents an overview of recent work in the area of distributed zeroth-order optimization, focusing on constrained optimization settings and algorithms built around the Frank–Wolfe framework. In particular, we review different types of architectures, from master–worker-based decentralized to fully distributed, and describe appropriate zeroth-order projection-free schemes for solving constrained stochastic optimization problems catered to these architectures. We discuss performance issues including convergence rates and dimension dependence. In addition, we focus on more refined extensions, such as those employing variance reduction, and describe and quantify convergence rates for a variance-reduced decentralized zeroth-order optimization method inspired by martingale difference sequences. We discuss limitations of zeroth-order optimization frameworks in terms of dimension dependence. Finally, we illustrate the use of distributed zeroth-order algorithms in the context of adversarial attacks on deep learning models.
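
The abstract does not give the update rules, so the sketch below is a generic two-point zeroth-order gradient estimator combined with a projection-free Frank–Wolfe step over an ℓ1-ball; the constraint set, smoothing radius, number of directions, and step-size schedule are illustrative assumptions rather than the article's specific schemes.

```python
# Sketch of one zeroth-order Frank-Wolfe step over an l1-ball constraint:
# the gradient is estimated from function-value queries via random smoothing,
# then a linear-minimization step is taken instead of a projection.
import numpy as np

def zo_gradient(f, x, num_dirs=20, smooth=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate with Gaussian directions."""
    rng = rng or np.random.default_rng()
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.normal(size=len(x))
        g += (f(x + smooth * u) - f(x - smooth * u)) / (2 * smooth) * u
    return g / num_dirs

def frank_wolfe_step(x, grad, radius=1.0, step=0.1):
    """Linear minimization over the l1-ball of the given radius (projection-free)."""
    s = np.zeros_like(x)
    i = np.argmax(np.abs(grad))
    s[i] = -radius * np.sign(grad[i])   # vertex of the l1-ball minimizing <grad, s>
    return x + step * (s - x)

# Toy usage: minimize a quadratic using only function-value queries.
rng = np.random.default_rng(0)
f = lambda x: np.sum((x - 0.3) ** 2)
x = np.zeros(5)
for t in range(100):
    x = frank_wolfe_step(x, zo_gradient(f, x, rng=rng), step=2.0 / (t + 2))
```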

11 citations


Proceedings ArticleDOI
01 Nov 2020
TL;DR: In this article, the authors propose several variants of the MATCHA algorithm and show that MATCHA can work with many other activation schemes and decentralized computation tasks, and can reduce the communication delay for free in decentralized environments.
Abstract: Decentralized stochastic gradient descent (SGD) has recently become one of the most promising methods to use data parallelism in order to train a machine learning model on a network of arbitrarily connected nodes/edge devices. Although the error convergence of decentralized SGD has been well studied in the last decade, most of the previous works do not explicitly consider how the network topology influences the overall convergence time. Communicating over all available links in the network may give faster error convergence; however, it also incurs higher communication overhead. The MATCHA algorithm proposed in [1] achieves a win-win in this error-runtime trade-off by judiciously sampling the communication graph. In this paper, we propose several variants of the MATCHA algorithm and show that MATCHA can work with many other activation schemes and decentralized computation tasks. It is a flexible framework to reduce the communication delay for free in decentralized environments.
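
The sketch below illustrates the graph-sampling idea behind a MATCHA-style scheme: the communication graph is decomposed into matchings, each matching is activated independently in each iteration, and only the activated links gossip-average their models after a local SGD step. The greedy decomposition, the single fixed activation probability, and the pairwise-averaging mixing rule are simplifying assumptions; the actual algorithm chooses the sampling probabilities more carefully to balance error convergence against communication time.

```python
# Illustrative sketch of matching-based link sampling for decentralized SGD.
# Decomposition, activation probability, and mixing rule are placeholders.
import numpy as np

def greedy_matching_decomposition(edges, num_nodes):
    """Greedily split the edge set into matchings (sets of disjoint edges)."""
    matchings, remaining = [], list(edges)
    while remaining:
        used, matching, rest = set(), [], []
        for u, v in remaining:
            if u in used or v in used:
                rest.append((u, v))
            else:
                matching.append((u, v))
                used.update((u, v))
        matchings.append(matching)
        remaining = rest
    return matchings

def decentralized_step(models, grads, matchings, activation_prob, lr, rng):
    """Local SGD step followed by gossip over the sampled matchings."""
    models = [w - lr * g for w, g in zip(models, grads)]
    for matching in matchings:
        if rng.random() < activation_prob:      # sample this matching
            for u, v in matching:                # pairwise averaging on its links
                avg = 0.5 * (models[u] + models[v])
                models[u], models[v] = avg, avg.copy()
    return models
```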

3 citations


Posted Content
TL;DR: In this paper, a simple and efficient Bayesian Optimization (BO) based approach for developing black-box adversarial attacks is proposed, using only the output label (hard label) returned for a queried data input.
Abstract: We focus on the problem of black-box adversarial attacks, where the aim is to generate adversarial examples for deep learning models solely based on information limited to the output label (hard label) for a queried data input. We propose a simple and efficient Bayesian Optimization (BO) based approach for developing black-box adversarial attacks. Issues with BO's performance in high dimensions are avoided by searching for adversarial examples in a structured low-dimensional subspace. We demonstrate the efficacy of our proposed attack method by evaluating both ℓ∞ and ℓ2 norm constrained untargeted and targeted hard label black-box attacks on three standard datasets - MNIST, CIFAR-10 and ImageNet. Our proposed approach consistently achieves 2x to 10x higher attack success rate while requiring 10x to 20x fewer queries compared to the current state-of-the-art black-box adversarial attacks.
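
As a small companion to the BO attack sketch given for the conference version of this work earlier in the list, the snippet below illustrates the two norm constraints evaluated here by projecting a candidate perturbation onto an ℓ∞ or ℓ2 ball of radius eps before it is added to the input; the radius, helper name, and usage are illustrative assumptions.

```python
# Projection of a candidate perturbation onto an l_inf or l_2 ball of radius
# eps (illustrative helper; radius and shapes are placeholders).
import numpy as np

def project_perturbation(delta, eps, norm="linf"):
    if norm == "linf":
        return np.clip(delta, -eps, eps)               # clamp each coordinate
    scale = min(1.0, eps / (np.linalg.norm(delta) + 1e-12))
    return delta * scale                               # shrink if outside the l2 ball
```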

1 citation