Learning optimal classifier chains for real-time big data mining

doi:10.1109/ALLERTON.2013.6736568

Proceedings Article

Jie Xu, +2 more

- pp 512-519

TLDR

This paper proposes online distributed algorithms that learn how to construct the optimal classifier chain in order to maximize the stream mining performance (i.e., mining accuracy minus cost) based on dynamically changing data characteristics.

Abstract:

A plethora of emerging Big Data applications require processing and analyzing streams of data to extract valuable information in real time. For this, chains of classifiers that can detect various concepts need to be constructed in real time. In this paper, we propose online distributed algorithms that learn how to construct the optimal classifier chain in order to maximize the stream mining performance (i.e., mining accuracy minus cost) based on dynamically changing data characteristics. The proposed solution does not require the distributed local classifiers to exchange any information when learning at run time. Moreover, our algorithm requires only limited feedback on the mining performance to enable the learning of the optimal classifier chain. We model the learning problem of the optimal classifier chain at run time as a multi-player multi-armed bandit problem with limited feedback. To the best of our knowledge, this paper is the first to apply bandit techniques to stream mining problems. However, existing bandit algorithms are inefficient in the considered scenario because each component classifier must learn its optimal classification function using only the aggregate overall reward, without knowing its own individual reward and without exchanging information with other classifiers. We prove that the proposed algorithms achieve logarithmic learning regret uniformly over time and hence are order optimal. Therefore, the long-term time-average performance loss tends to zero. We also design learning algorithms whose regret is linear in the number of classification functions. This is much smaller than the regret results that can be obtained using existing bandit algorithms, which scale linearly in the number of classifier chains and exponentially in the number of classification functions.
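To make the comparison in the abstract concrete, the sketch below shows the kind of baseline it argues against: a standard UCB1 index policy that treats every complete classifier chain as one arm. It is not the paper's algorithm. The reward model is an assumption made purely for illustration (a chain is correct with probability equal to the product of hypothetical per-classifier accuracies, and the reward is correctness minus the summed cost), and all names such as `make_reward_fn` and `ucb1_over_chains` are hypothetical.

```python
import itertools
import math
import random


def make_reward_fn(accuracy, cost):
    """Build a simulated aggregate reward: accuracy minus cost for a whole chain.

    accuracy[k][f] and cost[k][f] are hypothetical values for classifier k
    using classification function f (assumed model, not from the paper).
    """
    def reward_fn(chain, rng):
        p_correct = math.prod(accuracy[k][f] for k, f in enumerate(chain))
        total_cost = sum(cost[k][f] for k, f in enumerate(chain))
        correct = 1.0 if rng.random() < p_correct else 0.0
        # Only this aggregate value is observed; no per-classifier reward.
        return correct - total_cost
    return reward_fn


def ucb1_over_chains(num_classifiers, num_functions, reward_fn, horizon, seed=0):
    """Baseline UCB1 in which each arm is one full chain.

    The number of arms is num_functions ** num_classifiers, which is why
    this baseline becomes impractical as the chain grows.
    """
    rng = random.Random(seed)
    arms = list(itertools.product(range(num_functions), repeat=num_classifiers))
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(1, horizon + 1):
        if t <= len(arms):
            i = t - 1  # initialization: play every chain once
        else:
            # UCB1 index: empirical mean plus exploration bonus
            i = max(
                range(len(arms)),
                key=lambda k: means[k] + math.sqrt(2.0 * math.log(t) / counts[k]),
            )
        r = reward_fn(arms[i], rng)
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]  # incremental mean update
    return arms, counts, means


if __name__ == "__main__":
    # Hypothetical two-classifier system with two functions per classifier.
    accuracy = [[0.9, 0.6], [0.8, 0.5]]
    cost = [[0.05, 0.01], [0.05, 0.01]]
    arms, counts, means = ucb1_over_chains(
        2, 2, make_reward_fn(accuracy, cost), horizon=5000
    )
    best = max(range(len(arms)), key=lambda k: counts[k])
    print("most-played chain:", arms[best], "plays:", counts[best])
```

With two classifiers and two functions each there are only four arms, so this baseline is cheap. With, say, ten classifiers and five functions each, the same code would have to explore 5^10 chains, which illustrates why a per-classifier learning scheme of the kind the abstract describes is attractive.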

Citations

Machine learning on big data

A Big Data-as-a-Service Framework: State-of-the-Art and Perspectives

Mining the Situation: Spatiotemporal Traffic Prediction With Big Data

A scalable machine learning online service for big data real-time analysis

Distributed Multi-Agent Online Learning Based on Global Feedback

References

Finite-time Analysis of the Multiarmed Bandit Problem

Asymptotically efficient adaptive allocation rules

Using confidence bounds for exploitation-exploration trade-offs

Stochastic Linear Optimization Under Bandit Feedback

Load shedding in a data stream manager
