Open Access | Posted Content

Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression

TLDR
Algorithms for learning GLMs and SIMs that are both computationally and statistically efficient: the isotonic regression step in Isotron is modified to fit a Lipschitz monotonic function, and an efficient O(n log(n)) algorithm is provided for this step, improving upon the previous O(n^2) algorithm.
Abstract
Generalized Linear Models (GLMs) and Single Index Models (SIMs) provide powerful generalizations of linear regression, where the target variable is assumed to be a (possibly unknown) 1-dimensional function of a linear predictor. In general, these problems entail non-convex estimation procedures, and, in practice, iterative local search heuristics are often used. Kalai and Sastry (2009) recently provided the first provably efficient method for learning SIMs and GLMs, under the assumptions that the data are in fact generated under a GLM and under certain monotonicity and Lipschitz constraints. However, to obtain provable performance, the method requires a fresh sample every iteration. In this paper, we provide algorithms for learning GLMs and SIMs, which are both computationally and statistically efficient. We also provide an empirical study, demonstrating their feasibility in practice.
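For context, the Isotron-style approach alternates between fitting the one-dimensional link function by isotonic regression along the sorted margins and updating the linear predictor with a perceptron-like step on the residuals. The sketch below is a minimal illustration of that idea, assuming a plain pool-adjacent-violators (PAV) isotonic fit; it is not the paper's exact algorithm (in particular it neither enforces the Lipschitz constraint nor implements the O(n log(n)) fitting step mentioned above).

```python
import numpy as np

def pav(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to a 1-D array y."""
    means, counts = [], []
    for v in y:
        means.append(float(v))
        counts.append(1)
        # Merge adjacent blocks while the non-decreasing constraint is violated.
        while len(means) > 1 and means[-2] > means[-1]:
            total = counts[-2] + counts[-1]
            merged = (means[-2] * counts[-2] + means[-1] * counts[-1]) / total
            means[-2:], counts[-2:] = [merged], [total]
    # Expand each block's mean back to one fitted value per input point.
    return np.repeat(means, counts)

def isotron(X, y, iters=50):
    """Illustrative Isotron-style alternation:
    (1) fit a monotone link to y along the sorted margins by isotonic regression,
    (2) take a perceptron-like step on w using the residuals y - u(w . x)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape
    w = np.zeros(d)
    u = np.zeros(n)
    for _ in range(iters):
        z = X @ w                      # current margins w . x_i
        order = np.argsort(z)
        u_sorted = pav(y[order])       # monotone fit along increasing margins
        u = np.empty(n)
        u[order] = u_sorted            # u[i] approximates u(w . x_i)
        w = w + X.T @ (y - u) / n      # average-residual (perceptron-like) update
    return w, u
```

Predictions on new points would apply the learned monotone function to w·x, e.g. by interpolating u over the sorted training margins; part of the paper's contribution is to replace the plain isotonic fit above with a Lipschitz-constrained fit computable in O(n log(n)) time.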


Citations
Journal Article

Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks

TL;DR: In this paper, the problem of learning a shallow neural network that best fits a training data set was studied in the over-parameterized regime, where the number of observations is smaller than the number of parameters in the model.
Proceedings Article

Globally optimal gradient descent for a ConvNet with Gaussian inputs

TL;DR: This work provides the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations, and shows that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time.
Posted Content

Failures of Gradient-Based Deep Learning

TL;DR: This work describes four types of simple problems, for which the gradient-based algorithms commonly used in deep learning either fail or suffer from significant difficulties.
Proceedings Article

Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles

TL;DR: This work characterizes the minimax rates for contextual bandits with general, potentially nonparametric function classes, and provides the first universal and optimal reduction from contextual bandits to online regression, requiring no distributional assumptions beyond realizability.
Proceedings Article

Learning One-hidden-layer ReLU Networks via Gradient Descent

TL;DR: It is proved that tensor initialization followed by gradient descent can converge to the ground-truth parameters at a linear rate up to some statistical error.
References
Journal Article

Generalized Linear Models

Eric R. Ziegel - 01 Aug 2002
TL;DR: This is the first book on generalized linear models written by authors not mostly associated with the biological sciences, and it is thoroughly enjoyable to read.
Book

Kernel Methods for Pattern Analysis

TL;DR: This book provides an easy introduction for students and researchers to the growing field of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new specific application, and covering all the necessary conceptual and mathematical tools to do so.
Journal Article

Generalized linear models. 2nd ed.

TL;DR: A class of statistical models that generalizes classical linear models, extending them to include many other models useful in statistical analysis, of particular interest to statisticians in medicine, biology, agriculture, social science, and engineering.
Journal Article

Generalized Linear Models

TL;DR: Generalized Linear Models, 2nd edn, by P. McCullagh and J. A. Nelder. Chapman and Hall, 1989. xx + 512 pp. £30.
Book Chapter

Rademacher and Gaussian complexities: risk bounds and structural results

TL;DR: In this paper, the authors investigate the use of data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities, in a decision theoretic setting and prove general risk bounds in terms of these complexities.
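For reference, a standard bound of this kind (a generic form, not necessarily the exact statement proved in that paper) reads as follows: for a class F of [0,1]-valued functions and an i.i.d. sample Z_1, ..., Z_n, with probability at least 1 − δ, every f in F satisfies

```latex
\mathbb{E}[f(Z)] \;\le\; \frac{1}{n}\sum_{i=1}^{n} f(Z_i) \;+\; 2\,\mathcal{R}_n(\mathcal{F}) \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}},
\qquad
\mathcal{R}_n(\mathcal{F}) \;=\; \mathbb{E}\Big[\sup_{f\in\mathcal{F}} \frac{1}{n}\sum_{i=1}^{n}\sigma_i\, f(Z_i)\Big],
```

where the σ_i are independent uniform ±1 (Rademacher) signs.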