Journal Article
Incremental learning imbalanced data streams with concept drift: The dynamic updated ensemble algorithm
TL;DR: A chunk-based incremental ensemble algorithm called Dynamic Updated Ensemble (DUE) is proposed for learning imbalanced data streams with concept drift; it can react promptly to multiple kinds of concept drift and keeps a limited number of classifiers to ensure high efficiency.
Abstract:
Learning from nonstationary data streams has been well studied in recent years. However, most studies assume that the class distribution of the data stream is relatively balanced. Only a few approaches tackle the joint issue of concept drift and class imbalance, owing to its complexity. Meanwhile, the existing chunk-based ensembles for classifying imbalanced nonstationary data streams always need to store previous data, which consumes a large amount of memory. To overcome these issues, we propose a chunk-based incremental ensemble algorithm called Dynamic Updated Ensemble (DUE) for learning imbalanced data streams with concept drift. Compared to existing techniques, its merits are five-fold: (1) it learns one chunk at a time without requiring access to previous data; (2) it emphasizes misclassified examples in the model update procedure; (3) it can react promptly to multiple kinds of concept drift; (4) it can adapt to the new condition when the majority class switches to the minority class; (5) it keeps a limited number of classifiers to ensure high efficiency. Experiments on synthetic and real datasets demonstrate the effectiveness of DUE in learning nonstationary imbalanced data streams.
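As a rough illustration of the general idea in the abstract (learn one chunk at a time, keep no past data, cap the number of classifiers), here is a minimal sketch. It is not the DUE algorithm: the abstract does not give DUE's update, weighting, or resampling rules. The class names `CentroidClassifier` and `BoundedChunkEnsemble`, the nearest-centroid base learner, and the G-mean weighting are all assumptions chosen for brevity.

```python
from collections import defaultdict


class CentroidClassifier:
    """Hypothetical stand-in base learner: predicts the class with the nearest mean."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
            counts[yi] += 1
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=dist2)


def g_mean(model, X, y):
    """Geometric mean of per-class recall (assumed here as an imbalance-aware score)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for xi, yi in zip(X, y):
        totals[yi] += 1
        hits[yi] += int(model.predict_one(xi) == yi)
    prod = 1.0
    for c in totals:
        prod *= hits[c] / totals[c]
    return prod ** (1.0 / len(totals))


class BoundedChunkEnsemble:
    """Illustrative chunk-based ensemble with a fixed member budget (not DUE itself)."""

    def __init__(self, max_members=5):
        self.max_members = max_members
        self.members = []  # each entry is [model, weight]

    def partial_fit(self, X, y):
        # Re-score existing members on the newest chunk only; no old data is kept.
        for member in self.members:
            member[1] = g_mean(member[0], X, y)
        # Train a fresh member on the newest chunk and give it full weight.
        self.members.append([CentroidClassifier().fit(X, y), 1.0])
        # Enforce the budget by dropping the lowest-weighted member.
        if len(self.members) > self.max_members:
            weakest = min(range(len(self.members)), key=lambda i: self.members[i][1])
            self.members.pop(weakest)

    def predict_one(self, x):
        # Weighted majority vote across the current members.
        votes = defaultdict(float)
        for model, weight in self.members:
            votes[model.predict_one(x)] += weight
        return max(votes, key=votes.get)
```

In this sketch, a member trained before a drift scores poorly on the next chunk, so its vote shrinks and it is the first candidate for removal once the budget is exceeded. Emphasizing misclassified examples and rebalancing classes, both named in the abstract, are deliberately left out.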
Citations
Journal Article
UFFDFR: Undersampling framework with denoising, fuzzy c-means clustering, and representative sample selection for imbalanced data classification
Ming Zheng, Tong Li, Xiaoyao Zheng, Qingying Yu, Chuanming Chen, Ding Zhou, Changlong Lv, Weiyi Yang
TL;DR: A novel three-stage undersampling framework with denoising, fuzzy c-means clustering, and representative sample selection (UFFDFR) is proposed that improves the classification performance on imbalanced data by removing noise and unrepresentative samples from the majority class.
Journal Article
Evidential reasoning based ensemble classifier for uncertain imbalanced data
TL;DR: The proposed evidential reasoning based ensemble classifier (EREC), based on an affinity propagation based oversampling method, is applied to the diagnosis of thyroid nodules using the datasets of five radiologists, obtained from a tertiary hospital located in Hefei, Anhui, China.
Journal Article
ROSE: robust online self-adjusting ensemble for continual learning on imbalanced drifting data streams
Alberto Cano, Bartosz Krawczyk
TL;DR: In this paper, a robust online self-adjusting ensemble (ROSE) classifier is proposed to detect concept drift and create a background ensemble for faster adaptation to changes in data streams.
Journal Article
A survey of active and passive concept drift handling methods
TL;DR: Many concept drift handling methods in this survey are analyzed and summarized in terms of the comparing algorithms, learning model, applicable drift type, advantages, and disadvantages of the algorithms.
Journal Article
Recurrent Adaptive Classifier Ensemble for Handling Recurring Concept Drifts
TL;DR: In this paper, a recurrent adaptive classifier ensemble (RACE) is proposed to handle recurring concepts with minimum computational overheads, which preserves an archive of previously learned models that are diverse and always trains both new and existing classifiers.
References
Journal Article
Statistical Comparisons of Classifiers over Multiple Data Sets
TL;DR: A set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparisons of more classifiers over multiple data sets.
Journal Article
A survey on concept drift adaptation
TL;DR: The survey covers the different facets of concept drift in an integrated way to reflect on the existing scattered state of the art and aims at providing a comprehensive introduction to the concept drift adaptation for researchers, industry analysts, and practitioners.
Proceedings Article
Mining high-speed data streams
Pedro Domingos, Geoff Hulten
TL;DR: This paper describes and evaluates VFDT, an anytime system that builds decision trees using constant memory and constant time per example, and applies it to mining the continuous stream of Web access data from the whole University of Washington main campus.
Proceedings Article
Dissecting Android Malware: Characterization and Evolution
Yajin Zhou, Xuxian Jiang
TL;DR: A systematic characterization of existing Android malware from various aspects, including installation methods, activation mechanisms, and the nature of the carried malicious payloads, reveals that these malware families are evolving rapidly to circumvent detection by existing mobile anti-virus software.
Proceedings Article
Mining time-changing data streams
TL;DR: An efficient algorithm for mining decision trees from continuously-changing data streams, based on the ultra-fast VFDT decision tree learner is proposed, called CVFDT, which stays current while making the most of old data by growing an alternative subtree whenever an old one becomes questionable, and replacing the old with the new when the new becomes more accurate.
Related Papers (5)
Adaptive Chunk-Based Dynamic Weighted Majority for Imbalanced Data Streams With Concept Drift
SERA: Selectively recursive approach towards nonstationary imbalanced stream data mining
Sheng Chen, Haibo He