scispace - formally typeset
Author

Nived Rajaraman

Bio: Nived Rajaraman is an academic researcher at the University of California, Berkeley. The author has contributed to research in topics including computer science and estimation, has an h-index of 5, and has co-authored 14 publications receiving 60 citations. Previous affiliations of Nived Rajaraman include the Indian Institute of Technology Madras and the Indian Institutes of Technology.

Papers
Posted Content
TL;DR: This paper proposes FastSecAgg, a secure aggregation protocol that is efficient in computation and communication, robust to client dropouts, and secure against adaptive adversaries that can corrupt clients dynamically during the execution of the protocol.
Abstract: Recent attacks on federated learning demonstrate that keeping the training data on clients' devices does not provide sufficient privacy, as the model parameters shared by clients can leak information about their training data. A 'secure aggregation' protocol enables the server to aggregate clients' models in a privacy-preserving manner. However, existing secure aggregation protocols incur high computation/communication costs, especially when the number of model parameters is larger than the number of clients participating in an iteration -- a typical scenario in federated learning. In this paper, we propose a secure aggregation protocol, FastSecAgg, that is efficient in terms of computation and communication, and robust to client dropouts. The main building block of FastSecAgg is a novel multi-secret sharing scheme, FastShare, based on the Fast Fourier Transform (FFT), which may be of independent interest. FastShare is information-theoretically secure, and achieves a trade-off between the number of secrets, privacy threshold, and dropout tolerance. Riding on the capabilities of FastShare, we prove that FastSecAgg is (i) secure against the server colluding with 'any' subset of some constant fraction (e.g. $\sim10\%$) of the clients in the honest-but-curious setting; and (ii) tolerates dropouts of a 'random' subset of some constant fraction (e.g. $\sim10\%$) of the clients. FastSecAgg achieves significantly smaller computation cost than existing schemes while achieving the same (orderwise) communication cost. In addition, it guarantees security against adaptive adversaries, which can perform client corruptions dynamically during the execution of the protocol.
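FastShare itself is an FFT-based multi-secret sharing scheme and is not reproduced here; as a self-contained illustration of the threshold secret-sharing primitive it builds on, here is a minimal Shamir-style sketch (the field choice and all names are illustrative, not from the paper):

```python
import random

P = 2**31 - 1  # prime modulus for the field (illustrative choice)

def share(secret, n, t):
    """Split `secret` into n shares so that any t+1 of them reconstruct it:
    evaluate a random degree-t polynomial with constant term `secret`."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc
    return [(i, f(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = (num * (-xj)) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

Any 4 of the 10 shares produced by `share(secret, 10, 3)` suffice to reconstruct, while 3 or fewer reveal nothing; FastShare trades off the number of secrets, the privacy threshold, and dropout tolerance within one FFT-structured scheme.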

81 citations

Posted Content
TL;DR: A versatile scheduling problem to model a three-way tradeoff between age of information (AoI), quality/distortion, and energy is considered and a greedy algorithm is proposed that is shown to be 2-competitive, independent of all parameters of the problem.
Abstract: A versatile scheduling problem to model a three-way tradeoff between delay/age, distortion, and energy is considered. The considered problem called the age and quality of information (AQI) is to select which packets to transmit at each time slot to minimize a linear combination of the distortion cost, the age/delay cost and the energy transmission cost in an online fashion. AQI generalizes multiple important problems such as age of information (AoI), the remote estimation problem with sampling constraint, the classical speed scaling problem among others. The worst case input model is considered, where the performance metric is the competitive ratio. A greedy algorithm is proposed that is shown to be 2-competitive, independent of all parameters of the problem. For the special case of AQI problem, a greedy online maximum weight matching based algorithm is also shown to be 2-competitive.
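The abstract does not spell out the greedy rule, so the following is a purely illustrative toy (hypothetical cost model and names, not the paper's 2-competitive algorithm) of slot-by-slot greedy scheduling against a combined age/distortion/energy cost:

```python
def greedy_schedule(packets, energy_cost):
    """Toy greedy scheduler: at each slot, transmit the queued packet whose
    immediate holding cost (age cost + distortion cost it would otherwise
    keep accruing) is largest, provided that saving exceeds the energy cost.
    `packets[t]` lists (age_cost, distortion_cost) pairs arriving at slot t."""
    queue, total_cost, decisions = [], 0, []
    for t, arrivals in enumerate(packets):
        queue.extend(arrivals)
        if queue:
            best = max(range(len(queue)), key=lambda i: sum(queue[i]))
            if sum(queue[best]) > energy_cost:
                decisions.append((t, queue.pop(best)))
                total_cost += energy_cost
        # untransmitted packets keep accruing their holding cost
        total_cost += sum(a + d for a, d in queue)
    return decisions, total_cost
```

In the competitive-analysis setting of the paper, such an online rule is compared against the optimal offline schedule that sees the whole arrival sequence in advance.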

32 citations

Posted Content
TL;DR: This paper focuses on understanding the minimax statistical limits of IL in episodic Markov Decision Processes (MDPs), and proposes a novel algorithm based on minimum-distance functionals in the setting where the transition model is given and the expert is deterministic.
Abstract: Imitation learning (IL) aims to mimic the behavior of an expert policy in a sequential decision-making problem given only demonstrations. In this paper, we focus on understanding the minimax statistical limits of IL in episodic Markov Decision Processes (MDPs). We first consider the setting where the learner is provided a dataset of $N$ expert trajectories ahead of time, and cannot interact with the MDP. Here, we show that the policy which mimics the expert whenever possible is in expectation $\lesssim \frac{|\mathcal{S}| H^2 \log (N)}{N}$ suboptimal compared to the value of the expert, even when the expert follows an arbitrary stochastic policy. Here $\mathcal{S}$ is the state space, and $H$ is the length of the episode. To our knowledge, this is the first suboptimality guarantee with no dependence on the number of actions and requiring no additional assumptions. Furthermore, we establish a suboptimality lower bound of $\gtrsim |\mathcal{S}| H^2 / N$ which applies even if the expert is constrained to be deterministic, or if the learner is allowed to actively query the expert at visited states while interacting with the MDP for $N$ episodes. We then propose a novel algorithm based on minimum-distance functionals in the setting where the transition model is given and the expert is deterministic. The algorithm is suboptimal by $\lesssim \min \{ H \sqrt{|\mathcal{S}| / N} ,\ |\mathcal{S}| H^{3/2} / N \}$, showing that knowledge of the transition model improves the minimax rate by at least a $\sqrt{H}$ factor.
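The "mimic the expert whenever possible" policy analyzed in the upper bound can be sketched as tabular behavior cloning (a hedged illustration; function names and the random fallback at unseen states are ours):

```python
import random
from collections import Counter, defaultdict

def mimic_policy(trajectories, num_actions, seed=0):
    """'Mimic the expert' behavior cloning: at states seen in the expert
    demonstrations, replay the most frequent expert action; at unseen
    states fall back to an arbitrary (here, random) action."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for s, a in traj:  # each trajectory is a list of (state, action) pairs
            counts[s][a] += 1
    rng = random.Random(seed)

    def policy(s):
        if s in counts:
            return counts[s].most_common(1)[0][0]
        return rng.randrange(num_actions)

    return policy
```

The paper's point is that the suboptimality of this simple policy scales with $|\mathcal{S}|$ and $H$ but not with the number of actions, since unseen states (where the arbitrary fallback is used) become rare as $N$ grows.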

30 citations

Journal ArticleDOI
TL;DR: In this paper, the authors considered the problem of selecting which packets to transmit at each time slot to minimize a linear combination of the utility driven by quality, the AoI, and the energy transmission cost in an online fashion.
Abstract: A versatile scheduling problem to model a three-way tradeoff between age of information (AoI), quality/distortion, and energy is considered. The considered problem called the age and quality of information (AQI) is to select which packets to transmit at each time slot to minimize a linear combination of the utility driven by quality, the AoI, and the energy transmission cost in an online fashion. AQI problem combines tradeoffs from some important distinct problems, such as AoI with multiple sources, the remote sampling problem with sampling constraint, the classical speed scaling problem among others. The arbitrary/adversarial case input model is considered in the online setting, where the performance metric is the competitive ratio. A greedy algorithm is proposed that is shown to be 2-competitive, independent of all parameters of the problem. For the special case of AQI problem, a maximum weight matching based algorithm is also shown to be 2-competitive.

10 citations

Posted Content
TL;DR: A more refined approximation guarantee, which is also tight, is derived for a special case called the submodular welfare maximization/partition problem, in both the offline and the online settings.
Abstract: The classical problem of maximizing a submodular function under a matroid constraint is considered. Defining a new measure for the increments made by the greedy algorithm at each step, called the discriminant, improved approximation ratio guarantees are derived for the greedy algorithm. At each step, the discriminant measures the multiplicative gap in incremental valuation between the item chosen by the greedy algorithm and the largest potential incremental valuation among eligible items not selected by it. The new guarantee subsumes all previously known results for the greedy algorithm, including the curvature-based ones, and the derived guarantees are shown to be tight by constructing specific instances. A more refined approximation guarantee, which is also tight, is derived for a special case called the submodular welfare maximization/partition problem, for both the offline and the online case.
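The greedy algorithm the guarantees apply to can be sketched for the simplest matroid, a cardinality constraint, with a coverage function as the monotone submodular objective (a minimal illustration; the example sets are ours):

```python
def greedy_submodular(ground, f, k):
    """Greedy for a monotone submodular f under a cardinality-k constraint
    (the simplest matroid): repeatedly add the element with the largest
    marginal gain f(S + e) - f(S)."""
    S = set()
    for _ in range(k):
        candidates = ground - S
        if not candidates:
            break
        gains = {e: f(S | {e}) - f(S) for e in candidates}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        S.add(best)
    return S

# coverage valuation: number of distinct elements covered (monotone submodular)
sets = {"a": {1, 2}, "b": {2, 3}, "c": {4}}
cover = lambda S: len(set().union(*(sets[e] for e in S)))
```

The discriminant of the paper quantifies, at each such step, how far the chosen element's gain can fall below the best eligible gain, which is what sharpens the classical 1/2 (matroid) guarantee.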

6 citations


Cited by
Book ChapterDOI
01 Jan 2008
TL;DR: Efron and Thisted study the frequency distribution of words in the Shakespearean canon, applying Fisher's species-sampling idea: an author has a personal vocabulary of word species represented by a distribution, and text is generated by sampling from it.
Abstract: This paper is the first of two written by Brad Efron and Ron Thisted studying the frequency distribution of words in the Shakespearean canon. The key idea due to Fisher in the context of sampling of species is simple and elegant. When applied to Shakespeare the idea appears to be preposterous: an author has a personal vocabulary of word species represented by a distribution G, and text is generated by sampling from this distribution. Most results do not require successive words to be sampled independently, which leaves room for individual style and context, but stationarity is needed for prediction and inference. The expected number of words that occur x ≥ 1 times in a large sample of n words is

199 citations

Proceedings Article
09 Feb 2022
TL;DR: A simple algorithm based on the primal-dual formulation of MDPs, where the dual variables are modeled using a density-ratio function against offline data, enjoys polynomial sample complexity under only realizability and single-policy concentrability.
Abstract: Sample-efficiency guarantees for offline reinforcement learning (RL) often rely on strong assumptions on both the function classes (e.g., Bellman-completeness) and the data coverage (e.g., all-policy concentrability). Despite the recent efforts on relaxing these assumptions, existing works are only able to relax one of the two factors, leaving the strong assumption on the other factor intact. As an important open problem, can we achieve sample-efficient offline RL with weak assumptions on both factors? In this paper we answer the question in the positive. We analyze a simple algorithm based on the primal-dual formulation of MDPs, where the dual variables (discounted occupancy) are modeled using a density-ratio function against offline data. With proper regularization, we show that the algorithm enjoys polynomial sample complexity, under only realizability and single-policy concentrability. We also provide alternative analyses based on different assumptions to shed light on the nature of primal-dual algorithms for offline RL.
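The primal-dual formulation the abstract refers to can be sketched via the standard linear-programming view of a discounted MDP (the symbols below are the usual ones, not notation taken from the paper):

```latex
% Primal LP over discounted occupancy measures d(s,a):
%   \max_{d \ge 0} \sum_{s,a} d(s,a)\, r(s,a)
%   \text{s.t. } \sum_a d(s,a) = (1-\gamma)\mu_0(s)
%     + \gamma \sum_{s',a'} P(s \mid s', a')\, d(s',a').
% Lagrangian with dual (value) variables V(s):
\mathcal{L}(d, V) = (1-\gamma)\sum_s \mu_0(s)\, V(s)
  + \sum_{s,a} d(s,a)\Big[ r(s,a)
  + \gamma \sum_{s'} P(s' \mid s,a)\, V(s') - V(s) \Big].
```

Writing $d(s,a) = w(s,a)\, d^D(s,a)$ against the offline data distribution $d^D$ turns the second term into an expectation over offline samples, which is the density-ratio modeling of the dual variables that the TL;DR mentions.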

48 citations

Posted Content
TL;DR: The quality of an update is modeled as an increasing function of the processing time spent generating the update at the transmitter; distortion is used as a proxy for quality and is modeled as a decreasing function of processing time.
Abstract: We consider an information update system where an information receiver requests updates from an information provider in order to minimize its age of information. The updates are generated at the information provider (transmitter) as a result of completing a set of tasks such as collecting data and performing computations. We refer to this as the update generation process. We model the $quality$ of an update as an increasing function of the processing time spent while generating the update at the transmitter. In particular, we use $distortion$ as a proxy for $quality$, and model distortion as a decreasing function of processing time. Processing longer at the transmitter results in a better quality (lower distortion) update, but it causes the update to age. We determine the age-optimal policies for the update request times at the receiver and the update processing times at the transmitter subject to a minimum required quality (maximum allowed distortion) constraint on the updates. For the required quality constraint, we consider the cases of constant maximum allowed distortion constraints, as well as age-dependent maximum allowed distortion constraints.
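Under a constant maximum-allowed-distortion constraint, the trade-off above admits a one-line illustration (the exponential distortion model and function name are hypothetical, not taken from the paper):

```python
import math

def min_processing_time(D0, D_max):
    """With distortion modeled as D(p) = D0 * exp(-p), a hypothetical
    decreasing function of processing time p, the constraint D(p) <= D_max
    is met by any p >= ln(D0 / D_max). Since extra processing only ages the
    update, an age-minimizing transmitter picks the smallest feasible p."""
    return max(0.0, math.log(D0 / D_max))
```

The paper's age-dependent distortion constraints generalize this: the allowed `D_max` itself varies with the current age, coupling the request times and processing times.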

43 citations

Posted Content
TL;DR: In this paper, the authors considered the online version of the SWM problem, where items arrive one at a time in an online manner; when an item arrives, the algorithm must make an irrevocable decision about which agent to assign it to before seeing any subsequent items.
Abstract: In the Submodular Welfare Maximization (SWM) problem, the input consists of a set of $n$ items, each of which must be allocated to one of $m$ agents. Each agent $\ell$ has a valuation function $v_\ell$, where $v_\ell(S)$ denotes the welfare obtained by this agent if she receives the set of items $S$. The functions $v_\ell$ are all submodular; as is standard, we assume that they are monotone and $v_\ell(\emptyset) = 0$. The goal is to partition the items into $m$ disjoint subsets $S_1, S_2, \ldots S_m$ in order to maximize the social welfare, defined as $\sum_{\ell = 1}^m v_\ell(S_\ell)$. In this paper, we consider the online version of SWM. Here, items arrive one at a time in an online manner; when an item arrives, the algorithm must make an irrevocable decision about which agent to assign it to before seeing any subsequent items. This problem is motivated by applications to Internet advertising, where user ad impressions must be allocated to advertisers whose value is a submodular function of the set of users / impressions they receive. In the random order model, the adversary can construct a worst-case set of items and valuations, but does not control the order in which the items arrive; instead, they are assumed to arrive in a random order. Obtaining a competitive ratio of $1/2 + \Omega(1)$ for the random order model has been an important open problem for several years. We solve this open problem by demonstrating that the greedy algorithm has a competitive ratio of at least $0.505$ for the Online Submodular Welfare Maximization problem in the random order model. For special cases of submodular functions including weighted matching, weighted coverage functions and a broader class of "second-order supermodular" functions, we provide a different analysis that gives a competitive ratio of $0.51$.
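The greedy algorithm analyzed in the random order model is the natural one: assign each arriving item, irrevocably, to the agent with the largest marginal gain. A minimal sketch (the modular example valuations are ours, used only because modular functions are the simplest submodular ones):

```python
def online_greedy_swm(items, valuations):
    """Online greedy for Submodular Welfare Maximization: assign each
    arriving item, irrevocably, to the agent whose valuation has the
    largest marginal gain for it."""
    alloc = {agent: set() for agent in valuations}
    for item in items:
        best = max(valuations,
                   key=lambda ag: valuations[ag](alloc[ag] | {item})
                                  - valuations[ag](alloc[ag]))
        alloc[best].add(item)
    return alloc

# modular (hence submodular) example valuations: per-item weights per agent
valuations = {"a": lambda S: 3 * (1 in S) + 1 * (2 in S),
              "b": lambda S: 1 * (1 in S) + 2 * (2 in S)}
```

Against a worst-case arrival order this greedy is only 1/2-competitive; the paper's contribution is that when the same items arrive in random order, its competitive ratio improves to at least 0.505.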

40 citations

Proceedings Article
Laixi Shi, Gen Li, Yuhang Wei, Yuxin Chen, Yuejie Chi 
28 Feb 2022
TL;DR: This work studies a pessimistic variant of Q-learning in the context of finite-horizon Markov decision processes, and characterizes its sample complexity under the single-policy concentrability assumption, which does not require full coverage of the state-action space.
Abstract: Offline or batch reinforcement learning seeks to learn a near-optimal policy using history data without active exploration of the environment. To counter the insufficient coverage and sample scarcity of many offline datasets, the principle of pessimism has been recently introduced to mitigate high bias of the estimated values. While pessimistic variants of model-based algorithms (e.g., value iteration with lower confidence bounds) have been theoretically investigated, their model-free counterparts, which do not require explicit model estimation, have not been adequately studied, especially in terms of sample efficiency. To address this inadequacy, we study a pessimistic variant of Q-learning in the context of finite-horizon Markov decision processes, and characterize its sample complexity under the single-policy concentrability assumption which does not require the full coverage of the state-action space. In addition, a variance-reduced pessimistic Q-learning algorithm is proposed to achieve near-optimal sample complexity. Altogether, this work highlights the efficiency of model-free algorithms in offline RL when used in conjunction with pessimism and variance reduction.
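A hedged sketch of the pessimism principle described above: the usual Q-learning target is penalized by a bonus that shrinks with the visit count, so poorly covered state-action pairs are valued cautiously. The learning-rate and bonus forms below echo common analyses but are illustrative, not the paper's algorithm:

```python
import math

def pessimistic_q_update(Q, N, s, a, r, s_next, num_actions, H, c=0.1):
    """One pessimistic (lower-confidence-bound) Q-learning step: subtract a
    bonus ~ c*H/sqrt(n) from the target, where n is the visit count of (s, a),
    then clip at 0 since values are nonnegative."""
    N[(s, a)] = N.get((s, a), 0) + 1
    n = N[(s, a)]
    lr = (H + 1) / (H + n)  # horizon-rescaled learning rate (common choice)
    v_next = max(Q.get((s_next, b), 0.0) for b in range(num_actions))
    target = r + v_next - c * H * math.sqrt(1.0 / n)
    Q[(s, a)] = max(0.0, (1 - lr) * Q.get((s, a), 0.0) + lr * target)
    return Q[(s, a)]
```

Because the penalty depends only on counts in the offline dataset, such updates need no explicit model estimate, which is the model-free efficiency the abstract highlights.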

39 citations