Alexander Rakhlin

Researcher at Massachusetts Institute of Technology

Publications - 196
Citations - 10056

Alexander Rakhlin is an academic researcher at the Massachusetts Institute of Technology. He has contributed to research on topics including Regret and Minimax, has an h-index of 51, and has co-authored 181 publications receiving 7872 citations. His previous affiliations include the University of California, Berkeley and the National Research University of Electronic Technology.

Papers
Posted Content

Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization

TL;DR: This paper investigates the optimality of SGD in a stochastic setting and shows that, for smooth problems, the algorithm attains the optimal O(1/T) rate; for non-smooth problems, however, the convergence rate with averaging can indeed be Ω(log(T)/T), and this is not just an artifact of the analysis.
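
The following is a minimal sketch of the kind of procedure the paper analyzes: SGD with step size 1/(λt) on a synthetic λ-strongly convex objective, together with the α-suffix averaging discussed in the paper as a simple fix for the non-smooth case. The toy data, constants, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Toy strongly convex problem: f(w) = (1/2n)||Xw - y||^2 + (lam/2)||w||^2
# (data, dimensions, and constants are illustrative, not from the paper)
rng = np.random.default_rng(0)
n, d, lam = 1000, 20, 0.1
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def stochastic_grad(w, i):
    """Unbiased gradient estimate from a single sample i."""
    xi, yi = X[i], y[i]
    return (xi @ w - yi) * xi + lam * w

T = 20000
w = np.zeros(d)
iterates = []
for t in range(1, T + 1):
    i = rng.integers(n)
    eta = 1.0 / (lam * t)               # classic step size for lam-strongly convex SGD
    w = w - eta * stochastic_grad(w, i)
    iterates.append(w.copy())

# alpha-suffix averaging: average only the last alpha fraction of iterates,
# which (per the paper's analysis) recovers the optimal O(1/T) rate
alpha = 0.5
w_suffix_avg = np.mean(iterates[int((1 - alpha) * T):], axis=0)
w_full_avg = np.mean(iterates, axis=0)  # full averaging can be slower on non-smooth problems

print("suffix-avg error:", np.linalg.norm(w_suffix_avg - w_true))
print("full-avg error:  ", np.linalg.norm(w_full_avg - w_true))
```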
Proceedings Article

Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization

TL;DR: In this article, the optimality of SGD in a stochastic setting was investigated, and it was shown that SGD attains the optimal O(1/T) rate for smooth problems.
Proceedings Article

Competing in the dark: An efficient algorithm for bandit linear optimization

TL;DR: This work introduces an efficient algorithm for online linear optimization in the bandit setting that achieves the optimal O*(√T) regret and presents a novel connection between online learning and interior-point methods.
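
Below is a rough structural sketch of a self-concordant-barrier approach of the kind this paper connects to interior-point methods, specialized to the Euclidean unit ball with the barrier R(x) = -log(1 - ‖x‖²): sample on the Dikin ellipsoid, build a one-point loss estimate, and take a follow-the-regularized-leader step. The toy adversary, constants, and the crude inner solver are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(1)
d, T, eta = 3, 2000, 0.05

def barrier_grad(x):
    return 2 * x / (1 - x @ x)

def barrier_hess(x):
    s = 1 - x @ x
    return 2 * np.eye(d) / s + 4 * np.outer(x, x) / s**2

def ftrl_argmin(g_sum, x0, steps=200, lr=0.05):
    """Crudely minimize eta*<g_sum, x> + R(x) over the open unit ball."""
    x = x0.copy()
    for _ in range(steps):
        grad = eta * g_sum + barrier_grad(x)
        x_new = x - lr * grad
        while x_new @ x_new >= 1.0:          # backtrack to stay inside the domain
            x_new = x + 0.5 * (x_new - x)
        x = x_new
    return x

x = np.zeros(d)
g_sum = np.zeros(d)
theta = rng.normal(size=d)                   # toy fixed loss vector chosen by the adversary
total_loss = 0.0
for t in range(T):
    A = barrier_hess(x)
    evals, evecs = np.linalg.eigh(A)
    i = rng.integers(d)
    eps = rng.choice([-1.0, 1.0])
    y = x + eps * evecs[:, i] / np.sqrt(evals[i])   # play a point on the Dikin ellipsoid
    loss = theta @ y                                 # bandit feedback: only this scalar is observed
    total_loss += loss
    g_hat = d * loss * eps * np.sqrt(evals[i]) * evecs[:, i]  # one-point unbiased loss estimate
    g_sum += g_hat
    x = ftrl_argmin(g_sum, x)

best = -theta / np.linalg.norm(theta)        # best fixed point in hindsight for a linear loss
print("regret:", total_loss - T * (theta @ best))
```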
Journal ArticleDOI

Size-independent sample complexity of neural networks

TL;DR: In this article, the sample complexity of learning neural networks is studied by providing new bounds on their Rademacher complexity, assuming norm constraints on the parameter matrix of each layer. These bounds have improved dependence on the network depth and, under some additional assumptions, are fully independent of the network size.
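
As an illustration of what such a norm-based bound looks like in use, the snippet below computes a simplified proxy of the form B·√L·∏_j ‖W_j‖_F / √m, in the spirit of the paper's bounds; the exact constants and logarithmic factors from the paper are omitted, and the toy network and data bound B are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def rademacher_proxy(weights, B, m):
    """Simplified norm-based generalization proxy for an L-layer network."""
    L = len(weights)
    frob_product = np.prod([np.linalg.norm(W) for W in weights])  # product of Frobenius norms
    return B * np.sqrt(L) * frob_product / np.sqrt(m)

# A toy 3-layer network: the proxy depends on the weight norms and the depth,
# not on the layer widths, which is the sense in which the bound is size-independent.
widths = [50, 200, 200, 1]
weights = [0.1 * rng.normal(size=(widths[i + 1], widths[i])) for i in range(3)]
print(rademacher_proxy(weights, B=1.0, m=10000))
```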
Proceedings Article

Online Learning With Predictable Sequences

TL;DR: In this article, the authors present methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences, where the sequence encountered by the learner is well described by a known predictable process.
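
A minimal sketch of an optimistic (predictable-sequence) online gradient method in the spirit of this line of work, using the previous loss vector as the predictor M_t over the Euclidean unit ball; the slowly drifting loss sequence is an illustrative stand-in for a "benign" process and is not from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
d, T, eta = 5, 1000, 0.1

def project_ball(x):
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

y = np.zeros(d)                 # secondary ("lazy") iterate
M = np.zeros(d)                 # prediction of the next loss vector
theta = rng.normal(size=d)
total = 0.0
for t in range(T):
    x = project_ball(y - eta * M)        # optimistic step using the prediction
    g = theta                            # linear loss: the gradient is the loss vector itself
    total += g @ x
    y = project_ball(y - eta * g)        # standard step using the observed gradient
    M = g                                # predict that the next loss resembles this one
    theta = theta + 0.01 * rng.normal(size=d)   # slowly drifting, hence predictable, sequence

print("average loss:", total / T)
```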