Open Access, Journal Article

On evaluating stream learning algorithms

João Gama, +2 more
01 Mar 2013, Vol. 90, Iss. 3, pp. 317-346
TLDR
The paper proposes a general framework for assessing predictive stream learning algorithms, defends the use of prequential error with forgetting mechanisms to provide reliable error estimators, and proves that, for stationary data and consistent learning algorithms, the holdout estimator, the prequential error, and the prequential error estimated over a sliding window or using fading factors all converge to the Bayes error.
Abstract
Most streaming decision models evolve continuously over time, run in resource-aware environments, and detect and react to changes in the environment generating data. One important issue, not yet convincingly addressed, is the design of experimental work to evaluate and compare decision models that evolve over time. This paper proposes a general framework for assessing predictive stream learning algorithms. We defend the use of prequential error with forgetting mechanisms to provide reliable error estimators. We prove that, in stationary data and for consistent learning algorithms, the holdout estimator, the prequential error, and the prequential error estimated over a sliding window or using fading factors all converge to the Bayes error. The use of prequential error with forgetting mechanisms proves advantageous in assessing performance and in comparing stream learning algorithms. It is also worthwhile to use the proposed methods for hypothesis testing and for change detection. In a set of experiments in drift scenarios, we evaluate the ability of a standard change detection algorithm to detect change using three prequential error estimators. These experiments indicate that forgetting mechanisms (sliding windows or fading factors) are required for fast and efficient change detection. In comparison to sliding windows, fading factors are faster and memoryless, both important requirements for streaming applications. Overall, this paper is a contribution to a discussion on best practice for performance assessment when learning is a continuous process and the decision models are dynamic and evolve over time.
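
The three prequential estimators named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the excerpt gives no formulas, so the recursive fading-factor update used here (a discounted loss sum divided by a discounted count) is one common formulation and should be read as an assumption. The class names, the window size of 100, and the fading factor of 0.99 are all hypothetical choices made for the example.

```python
from collections import deque


class PrequentialError:
    """Running mean of all losses seen so far (no forgetting)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, loss):
        self.total += loss
        self.count += 1
        return self.total / self.count


class SlidingWindowError:
    """Mean loss over the most recent `window` examples."""

    def __init__(self, window):
        self.losses = deque(maxlen=window)
        self.total = 0.0

    def update(self, loss):
        if len(self.losses) == self.losses.maxlen:
            # The oldest loss is evicted by the append below.
            self.total -= self.losses[0]
        self.losses.append(loss)
        self.total += loss
        return self.total / len(self.losses)


class FadingFactorError:
    """Exponentially discounted mean loss; stores only two numbers."""

    def __init__(self, alpha):
        self.alpha = alpha  # assumed 0 < alpha <= 1; alpha = 1 means no forgetting
        self.loss_sum = 0.0
        self.weight_sum = 0.0

    def update(self, loss):
        self.loss_sum = loss + self.alpha * self.loss_sum
        self.weight_sum = 1.0 + self.alpha * self.weight_sum
        return self.loss_sum / self.weight_sum


def demo():
    # Synthetic 0-1 losses: wrong on the first 500 examples, right on the next 500.
    losses = [1.0] * 500 + [0.0] * 500
    estimators = {
        "prequential": PrequentialError(),
        "window": SlidingWindowError(window=100),
        "fading": FadingFactorError(alpha=0.99),
    }
    final = {}
    for name, estimator in estimators.items():
        value = None
        for loss in losses:
            value = estimator.update(loss)
        final[name] = value
    return final


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.4f}")
```

On this synthetic stream the plain prequential estimate still averages over the early errors, while both forgetting variants track the recent loss. The fading-factor version does so without storing past losses, which matches the abstract's description of fading factors as memoryless.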


Citations
Journal Article

Machine learning

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Journal Article

A survey on concept drift adaptation

TL;DR: The survey covers the different facets of concept drift in an integrated way to reflect on the existing scattered state of the art and aims at providing a comprehensive introduction to the concept drift adaptation for researchers, industry analysts, and practitioners.
Journal Article

Ensemble learning for data stream analysis

TL;DR: This paper surveys research on ensembles for data stream classification as well as regression tasks and discusses advanced learning concepts such as imbalanced data streams, novelty detection, active and semi-supervised learning, complex data representations and structured outputs.
Journal Article

Learning under Concept Drift: A Review

TL;DR: A high quality, instructive review of current research developments and trends in the concept drift field is conducted, and a framework of learning under concept drift is established including three main components: concept drift detection, concept drift understanding, and concept drift adaptation.
Journal Article

A survey on data preprocessing for data stream mining

TL;DR: This survey summarizes, categorizes, and analyzes contributions on data preprocessing that cope with streaming data, and takes into account the existing relationships between the different families of methods (feature and instance selection, and discretization).