Open Access · Posted Content

Learning From An Optimization Viewpoint

Karthik Sridharan
18 Apr 2012
TLDR
This dissertation establishes a strong connection between offline convex optimization problems and statistical learning problems, and shows that for a large class of high-dimensional optimization problems, mirror descent (MD) is near optimal even for convex optimization.
Abstract
In this dissertation we study statistical and online learning problems from an optimization viewpoint. The dissertation is divided into two parts. I. We first consider the question of learnability for statistical learning problems in the general learning setting. The question of learnability is well studied and fully characterized for binary classification and for real-valued supervised learning problems using the theory of uniform convergence. However, we show that for the general learning setting, uniform convergence theory fails to characterize learnability. To fill this void we use stability of learning algorithms to fully characterize statistical learnability in the general setting. Next we consider the problem of online learning. Unlike the statistical learning framework, there is a dearth of generic tools that can be used to establish learnability and rates for online learning problems in general. We provide online analogs of classical tools from statistical learning theory, such as Rademacher complexity and covering numbers, and use these tools to fully characterize learnability for online supervised learning problems. II. In the second part, for general classes of convex learning problems, we provide appropriate mirror descent (MD) updates for online and statistical learning of these problems. Further, we show that MD is near optimal for online convex learning and, in most cases, is also near optimal for statistical convex learning. We next consider the problem of convex optimization and show that oracle complexity can be lower bounded by the so-called fat-shattering dimension of the associated linear class. Thus we establish a strong connection between offline convex optimization problems and statistical learning problems. We also show that for a large class of high-dimensional optimization problems, MD is in fact near optimal even for convex optimization.
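The mirror descent (MD) updates referred to above take the standard proximal form w_{t+1} = argmin_{w ∈ W} [ η⟨∇ℓ_t(w_t), w⟩ + D_R(w, w_t) ], where D_R is the Bregman divergence of a strongly convex regularizer R chosen to match the geometry of W. As an illustration only (not code from the dissertation), the Python sketch below instantiates this update with the negative-entropy regularizer over the probability simplex, for which MD reduces to exponentiated gradient; the losses, dimension, and step size are assumptions made for the example.

# Minimal online mirror descent sketch with the entropic regularizer over the
# probability simplex (exponentiated gradient). Illustrative only; the losses,
# dimension, and step size below are assumptions, not taken from the dissertation.
import numpy as np

def online_mirror_descent(grad_fns, dim, eta):
    """Run entropic mirror descent; grad_fns[t](w) returns the (sub)gradient of
    the t-th convex loss at w, and eta is the step size (~1/sqrt(T) in theory)."""
    w = np.full(dim, 1.0 / dim)        # start at the uniform distribution
    iterates = [w.copy()]
    for grad in grad_fns:
        g = grad(w)
        w = w * np.exp(-eta * g)       # multiplicative-weights step ...
        w = w / w.sum()                # ... followed by normalization (Bregman projection)
        iterates.append(w.copy())
    return iterates

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d = 100, 5
    zs = rng.standard_normal((T, d))   # linear losses <z_t, w> with random z_t
    grads = [(lambda w, z=z: z) for z in zs]
    final = online_mirror_descent(grads, d, eta=1.0 / np.sqrt(T))[-1]
    print("final iterate:", np.round(final, 3))

Swapping in the Euclidean regularizer R(w) = ‖w‖²/2 recovers online projected gradient descent from the same template, which is what makes MD a convenient unifying family of updates across geometries.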



Citations
Journal Article

Breaking the curse of dimensionality with convex neural networks

TL;DR: In this paper, the authors consider neural networks with a single hidden layer and non-decreasing positively homogeneous activation functions like the rectified linear units and provide a detailed theoretical analysis of their generalization performance, with a study of both the approximation and the estimation errors.
Proceedings Article

Distributed stochastic optimization and learning

TL;DR: It is shown how the best known guarantees are obtained by an accelerated mini-batched SGD approach, and the runtime and sample costs of this approach are compared with those of other distributed optimization algorithms.
Journal Article

Stochastic online optimization. Single-point and multi-point non-linear multi-armed bandits. Convex and strongly-convex case

TL;DR: The aim of this paper is to derive the convergence rate of the proposed methods and to determine a noise level that does not significantly affect that rate.
Posted Content

Breaking the Curse of Dimensionality with Convex Neural Networks

TL;DR: This work considers neural networks with a single hidden layer and non-decreasing homogeneous activation functions like the rectified linear units and shows that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace.
Proceedings Article

Stochastic Convex Optimization with Multiple Objectives

TL;DR: This paper examines a two-stage exploration-exploitation algorithm and an efficient primal-dual stochastic algorithm that leverages the theory of the Lagrangian method in constrained optimization and attains the optimal convergence rate of O(1/√T) with high probability for general Lipschitz continuous objectives.
References
Book

The Nature of Statistical Learning Theory

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

Statistical learning theory

TL;DR: Presenting a method for determining the necessary and sufficient conditions for the consistency of learning processes, the author covers function estimation from small data pools, applying these estimates to real-life problems, and much more.
Journal Article

Regularization and variable selection via the elastic net

TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Book

The Probabilistic Method

Joel Spencer
TL;DR: A particular set of problems - all dealing with “good” colorings of an underlying set of points relative to a given family of sets - is explored.
Book Chapter

On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities

TL;DR: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady.