Proceedings ArticleDOI

How to use expert advice

TLDR
This work analyzes algorithms that predict a binary value by combining the predictions of several prediction strategies, called 'experts', and shows how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context.
Abstract
We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called 'experts'. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the algorithm by the difference between the expected number of mistakes it makes on the bit sequence and the expected number of mistakes made by the best expert on this sequence, where the expectation is taken with respect to the randomization in the predictions. We show that the minimum achievable difference is on the order of the square root of the number of mistakes of the best expert, and we give efficient algorithms that achieve this. Our upper and lower bounds have matching leading constants in most cases. We then show how this leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context. We also extend our analysis to the case in which log loss is used instead of the expected number of mistakes.
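As a concrete illustration of the style of algorithm analyzed here, below is a minimal sketch of a randomized exponentially weighted forecaster over a pool of experts. The function name, the learning rate eta, and its default value are illustrative assumptions, not the paper's exact algorithm or tuning.

    import math
    import random

    def predict_with_expert_advice(expert_preds, outcomes, eta=0.5):
        """Randomized exponential-weights forecaster for a binary sequence.

        expert_preds: list of T rounds, each a list of 0/1 predictions, one per expert.
        outcomes:     list of T observed bits.
        Returns the algorithm's (random) number of mistakes.
        """
        weights = [1.0] * len(expert_preds[0])
        mistakes = 0
        for preds, y in zip(expert_preds, outcomes):
            total = sum(weights)
            # Predict 1 with probability equal to the weighted fraction of experts voting 1.
            p_one = sum(w for w, p in zip(weights, preds) if p == 1) / total
            guess = 1 if random.random() < p_one else 0
            mistakes += int(guess != y)
            # Multiplicatively shrink the weight of every mistaken expert.
            weights = [w * math.exp(-eta) if p != y else w
                       for w, p in zip(weights, preds)]
        return mistakes

With eta tuned on the order of sqrt(ln(n) / L*) for n experts and a best-expert mistake count L*, the gap between the forecaster's expected mistakes and the best expert's mistakes grows as the square root of L*, matching the order stated in the abstract.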


Citations
Journal ArticleDOI

A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting

TL;DR: The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and it is shown that the multiplicative weight-update rule of Littlestone and Warmuth can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.
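For reference, the multiplicative weight-update rule mentioned in this TL;DR can be sketched in a few lines; the name hedge_update and the parameter beta follow common expositions of the Hedge algorithm and are assumptions here, not the paper's notation.

    def hedge_update(weights, losses, beta=0.9):
        """One round of a multiplicative weight-update (Hedge-style) rule.

        weights: current expert weights (positive floats).
        losses:  this round's per-expert losses, each in [0, 1].
        beta:    update factor in (0, 1); smaller values react more aggressively.
        Returns the updated weights.
        """
        return [w * (beta ** loss) for w, loss in zip(weights, losses)]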
Book

Pattern recognition and neural networks

TL;DR: Professor Ripley brings together two crucial ideas in pattern recognition; statistical methods and machine learning via neural networks in this self-contained account.

Journal ArticleDOI

Selection of relevant features and examples in machine learning

TL;DR: This survey reviews work in machine learning on methods for handling data sets containing large amounts of irrelevant information and describes the advances that have been made in both empirical and theoretical work in this area.
Journal ArticleDOI

The Nonstochastic Multiarmed Bandit Problem

TL;DR: A solution is given to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs.
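The flavor of that adversarial bandit setting can be conveyed by an Exp3-style round, sketched below under assumed names and conventions (uniform exploration mixed into an exponential-weights distribution, with an importance-weighted reward estimate); this is an illustration, not the paper's exact algorithm.

    import math
    import random

    def exp3_step(weights, gamma, draw_reward):
        """One round of an Exp3-style strategy for the adversarial bandit problem.

        weights:     current arm weights (positive floats).
        gamma:       exploration rate in (0, 1].
        draw_reward: callable mapping the chosen arm index to a reward in [0, 1].
        Returns (chosen arm, updated weights).
        """
        k = len(weights)
        total = sum(weights)
        # Mix the exponential-weights distribution with uniform exploration.
        probs = [(1 - gamma) * w / total + gamma / k for w in weights]
        arm = random.choices(range(k), weights=probs)[0]
        reward = draw_reward(arm)
        # Importance-weighted estimate: only the pulled arm's weight is updated.
        new_weights = list(weights)
        new_weights[arm] *= math.exp(gamma * (reward / probs[arm]) / k)
        return arm, new_weights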
References
Book

Neural Networks: A Comprehensive Foundation

Simon Haykin
TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.
Journal ArticleDOI

Modeling by shortest data description

Jorma Rissanen
01 Sep 1978
TL;DR: The number of digits it takes to write down an observed sequence x1, ..., xN of a time series depends on the model with its parameters that one assumes to have generated the observed data.
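The description-length idea can be made concrete with a toy two-part code for a binary sequence under a Bernoulli model; the function name and the fixed parameter-precision cost param_bits are assumptions for illustration only.

    import math

    def two_part_code_length(bits, param_bits=8):
        """Two-part description length (in bits) of a 0/1 sequence.

        First part:  encode the estimated Bernoulli parameter p to a fixed precision.
        Second part: encode the data with the code matched to that p.
        A model that fits the data well yields a shorter second part.
        """
        n = len(bits)
        ones = sum(bits)
        p = ones / n
        if p in (0.0, 1.0):  # a constant sequence costs only the parameter
            return param_bits
        data_bits = -(ones * math.log2(p) + (n - ones) * math.log2(1 - p))
        return param_bits + data_bits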
Proceedings ArticleDOI

A theory of the learnable

TL;DR: This paper regards learning as the phenomenon of knowledge acquisition in the absence of explicit programming, and gives a precise methodology for studying this phenomenon from a computational viewpoint.
Book

Estimation of Dependences Based on Empirical Data

TL;DR: A foundational monograph on estimating functional dependences from empirical data, advocating direct inference of the quantity of interest rather than generalization through intermediate modeling steps.
Journal ArticleDOI

The weighted majority algorithm

TL;DR: A simple and effective method, based on weighted voting, is introduced for constructing a compound algorithm from a pool of prediction algorithms; the method, called the Weighted Majority Algorithm, is robust in the presence of errors in the data.
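The weighted-voting rule summarized above admits a very short sketch; the function name below is an illustrative choice, with the penalty factor 1/2 being the classic halving penalty in the Weighted Majority Algorithm.

    def weighted_majority_round(weights, preds, outcome, penalty=0.5):
        """One round of a deterministic Weighted Majority-style rule.

        weights: current expert weights (positive floats).
        preds:   each expert's 0/1 prediction this round.
        outcome: the observed bit.
        Returns (algorithm's prediction, updated weights).
        """
        vote_one = sum(w for w, p in zip(weights, preds) if p == 1)
        vote_zero = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote_one >= vote_zero else 0
        # Multiply the weight of every mistaken expert by the penalty factor.
        new_weights = [w * penalty if p != outcome else w
                       for w, p in zip(weights, preds)]
        return guess, new_weights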