Journal ArticleDOI

Incremental learning imbalanced data streams with concept drift: The dynamic updated ensemble algorithm

TLDR
A chunk-based incremental ensemble algorithm called Dynamic Updated Ensemble (DUE) is proposed for learning imbalanced data streams with concept drift; it reacts promptly to multiple kinds of concept drift and keeps a limited number of classifiers to ensure high efficiency.
Abstract
Learning from nonstationary data streams has been well studied in recent years. However, most studies assume that the class distribution of the data stream is relatively balanced. Because of its complexity, only a few approaches tackle the joint problem of concept drift and class imbalance. Meanwhile, the existing chunk-based ensembles for classifying imbalanced nonstationary data streams always need to store previous data, which consumes a large amount of memory. To overcome these issues, we propose a chunk-based incremental ensemble algorithm called Dynamic Updated Ensemble (DUE) for learning imbalanced data streams with concept drift. Compared with existing techniques, its merits are five-fold: (1) it learns one chunk at a time without requiring access to previous data; (2) it emphasizes misclassified examples in the model update procedure; (3) it reacts promptly to multiple kinds of concept drift; (4) it adapts to the new condition when the majority class switches to become the minority class; (5) it keeps a limited number of classifiers to ensure high efficiency. Experiments on synthetic and real datasets demonstrate the effectiveness of DUE in learning from nonstationary imbalanced data streams.
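
To make the chunk-based setting concrete, the following is a minimal sketch of a generic bounded chunk ensemble. It is not the DUE algorithm, whose update rules are not given in the abstract. It only illustrates two of the listed properties: learning one chunk at a time without storing earlier chunks, and keeping a limited number of classifiers. The names `CentroidClassifier`, `ChunkEnsemble`, and `max_members`, the nearest-centroid base learner, and the use of balanced accuracy as the member weight are all assumptions made for illustration.

```python
from statistics import mean


class CentroidClassifier:
    """Toy base learner (assumption): predicts the class with the nearest centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


def balanced_accuracy(model, X, y):
    """Mean per-class recall, so the minority class counts as much as the majority."""
    recalls = []
    for label in set(y):
        idx = [i for i, t in enumerate(y) if t == label]
        hits = sum(model.predict_one(X[i]) == label for i in idx)
        recalls.append(hits / len(idx))
    return mean(recalls)


class ChunkEnsemble:
    """Hypothetical bounded ensemble: one new member per chunk, worst member evicted."""

    def __init__(self, max_members=3):
        self.max_members = max_members
        self.members = []  # each entry is [model, weight]

    def learn_chunk(self, X, y):
        # Re-weight existing members using only the newest chunk (no stored history).
        for member in self.members:
            member[1] = balanced_accuracy(member[0], X, y)
        # Train a new member on the newest chunk; its initial weight of 1.0 is an assumption.
        self.members.append([CentroidClassifier().fit(X, y), 1.0])
        # Keep the ensemble size bounded by dropping the lowest-weighted member.
        if len(self.members) > self.max_members:
            worst = min(range(len(self.members)), key=lambda i: self.members[i][1])
            self.members.pop(worst)

    def predict_one(self, x):
        # Weighted majority vote over the current members.
        votes = {}
        for model, weight in self.members:
            label = model.predict_one(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)
```

Weighting members by balanced accuracy on the newest chunk means a member trained before a drift loses influence as soon as it misclassifies the new concept, while the cap on `max_members` bounds memory and prediction cost. A real method would also need the misclassification emphasis and class-switch handling that the abstract mentions, which this sketch leaves out.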


Citations
Journal ArticleDOI

UFFDFR: Undersampling framework with denoising, fuzzy c-means clustering, and representative sample selection for imbalanced data classification

TL;DR: A novel three-stage undersampling framework with denoising, fuzzy c-means clustering, and representative sample selection (UFFDFR) is proposed that improves the classification performance on imbalanced data by removing noise and unrepresentative samples from the majority class.
Journal ArticleDOI

Evidential reasoning based ensemble classifier for uncertain imbalanced data

TL;DR: The proposed evidential reasoning based ensemble classifier (EREC), based on an affinity propagation based oversampling method, is applied to the diagnosis of thyroid nodules using the datasets of five radiologists, obtained from a tertiary hospital located in Hefei, Anhui, China.
Journal ArticleDOI

ROSE: robust online self-adjusting ensemble for continual learning on imbalanced drifting data streams

Alberto Cano, +1 more
- 20 Apr 2022 - 
TL;DR: In this paper, a robust online self-adjusting ensemble (ROSE) classifier is proposed to detect concept drift and create a background ensemble for faster adaptation to changes in data streams.
Journal ArticleDOI

A survey of active and passive concept drift handling methods

TL;DR: Many concept drift handling methods in this survey are analyzed and summarized in terms of the comparing algorithms, learning model, applicable drift type, advantages, and disadvantages of the algorithms.
Journal ArticleDOI

Recurrent Adaptive Classifier Ensemble for Handling Recurring Concept Drifts

TL;DR: In this paper, a recurrent adaptive classifier ensemble (RACE) is proposed to handle recurring concepts with minimum computational overheads, which preserves an archive of previously learned models that are diverse and always trains both new and existing classifiers.
References
Journal Article

Statistical Comparisons of Classifiers over Multiple Data Sets

TL;DR: A set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparisons of more classifiers over multiple data sets.
Journal ArticleDOI

A survey on concept drift adaptation

TL;DR: The survey covers the different facets of concept drift in an integrated way to reflect on the existing scattered state of the art and aims at providing a comprehensive introduction to the concept drift adaptation for researchers, industry analysts, and practitioners.
Proceedings ArticleDOI

Mining high-speed data streams

TL;DR: This paper describes and evaluates VFDT, an anytime system that builds decision trees using constant memory and constant time per example, and applies it to mining the continuous stream of Web access data from the whole University of Washington main campus.
Proceedings ArticleDOI

Dissecting Android Malware: Characterization and Evolution

TL;DR: A systematic characterization of existing Android malware from various aspects, including installation methods, activation mechanisms, and the nature of the carried malicious payloads, reveals that it is evolving rapidly to circumvent detection by existing mobile anti-virus software.
Proceedings ArticleDOI

Mining time-changing data streams

TL;DR: An efficient algorithm for mining decision trees from continuously-changing data streams, based on the ultra-fast VFDT decision tree learner is proposed, called CVFDT, which stays current while making the most of old data by growing an alternative subtree whenever an old one becomes questionable, and replacing the old with the new when the new becomes more accurate.