Proceedings Article

Learning optimal classifier chains for real-time big data mining

TLDR
This paper proposes online distributed algorithms that learn how to construct the optimal classifier chain in order to maximize the stream mining performance (i.e., mining accuracy minus cost) based on dynamically changing data characteristics.
Abstract
A plethora of emerging Big Data applications require processing and analyzing streams of data to extract valuable information in real time. To do so, chains of classifiers that detect various concepts need to be constructed in real time. In this paper, we propose online distributed algorithms that learn how to construct the optimal classifier chain in order to maximize the stream mining performance (i.e., mining accuracy minus cost) based on dynamically changing data characteristics. The proposed solution does not require the distributed local classifiers to exchange any information when learning at run-time. Moreover, the proposed algorithms require only limited feedback on the mining performance to learn the optimal classifier chain. We model the run-time learning of the optimal classifier chain as a multi-player multi-armed bandit problem with limited feedback. To the best of our knowledge, this paper is the first to apply bandit techniques to stream mining problems. Existing bandit algorithms, however, are inefficient in the considered scenario because each component classifier must learn its optimal classification functions using only the aggregate overall reward, without knowing its own individual reward and without exchanging information with other classifiers. We prove that the proposed algorithms achieve logarithmic learning regret uniformly over time and hence are order optimal; consequently, the long-term time-average performance loss tends to zero. We also design learning algorithms whose regret is linear in the number of classification functions. This is much smaller than the regret obtainable with existing bandit algorithms, which scales linearly in the number of classifier chains and exponentially in the number of classification functions.
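To make the comparison with existing bandit algorithms concrete, the following is a minimal sketch of the kind of baseline the abstract argues against: a standard UCB1-style policy that treats every complete classifier chain as a single arm and observes only the global reward. It is not the algorithm proposed in the paper. The function name `ucb1_over_chains`, the toy reward model, and the three-classifier setup are assumptions made purely for illustration.

```python
import itertools
import math
import random


def ucb1_over_chains(num_functions, reward_fn, horizon, seed=0):
    """UCB1-style baseline: one arm per complete classifier chain.

    num_functions: number of candidate classification functions at each
        classifier in the chain (hypothetical setup).
    reward_fn(chain, rng): returns a global reward in [0, 1], standing in
        for "mining accuracy minus cost" after rescaling (assumption).
    """
    rng = random.Random(seed)
    # Every joint choice of functions is a separate arm, so the number of
    # arms is the product of the per-classifier counts.
    arms = list(itertools.product(*(range(k) for k in num_functions)))
    counts = [0] * len(arms)
    means = [0.0] * len(arms)

    for t in range(1, horizon + 1):
        if t <= len(arms):
            i = t - 1  # initialization: play every chain once
        else:
            # Pick the chain with the largest upper confidence index.
            i = max(
                range(len(arms)),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = reward_fn(arms[i], rng)
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]  # running average
    return arms, counts, means


def toy_reward(chain, rng):
    """Hypothetical noisy global reward; chain (1, 0, 1) is best on average."""
    best = (1, 0, 1)
    matches = sum(a == b for a, b in zip(chain, best))
    success_prob = 0.2 + 0.2 * matches
    return 1.0 if rng.random() < success_prob else 0.0


if __name__ == "__main__":
    arms, counts, means = ucb1_over_chains([2, 2, 2], toy_reward, horizon=5000)
    for arm, n, m in zip(arms, counts, means):
        print(arm, n, round(m, 3))
```

With three classifiers and two candidate functions each, this baseline already has to explore 8 arms, and the arm count multiplies with every classifier added to the chain. That growth is the scaling problem the abstract refers to when it contrasts existing bandit algorithms with regret that is linear in the number of classification functions.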


Citations
Journal Article

Machine learning on big data

TL;DR: A framework of ML on big data (MLBiD) is introduced to guide the discussion of its opportunities and challenges, provide directions for identifying associated opportunities and challenges, and open up future work in many unexplored or underexplored research areas.
Journal Article

A Big Data-as-a-Service Framework: State-of-the-Art and Perspectives

TL;DR: A tensor-based multiple clustering on bicycle renting and returning data is illustrated, which can provide several suggestions for rebalancing the bicycle-sharing system, and some challenges of the proposed framework are discussed.
Journal Article

Mining the Situation: Spatiotemporal Traffic Prediction With Big Data

TL;DR: A novel online framework is proposed that learns from the current traffic situation (or context) in real time and predicts future traffic by matching the current situation to the most effective prediction model trained on historical data.
Proceedings Article

A scalable machine learning online service for big data real-time analysis

TL;DR: Systems implementing this architecture could provide companies with on-demand tools that facilitate storing, analyzing, understanding, and reacting to their data, in either batch or stream fashion, and could become a valuable asset for improving business performance and a key market differentiator in this fast-paced environment.
Journal Article

Distributed Multi-Agent Online Learning Based on Global Feedback

TL;DR: This paper develops online learning algorithms that enable the agents, without exchanging any information among themselves, to cooperatively learn how to maximize the overall reward in scenarios where only noisy global feedback is available, and demonstrates how the regret depends on the size of the action space.
References
Journal Article

Finite-time Analysis of the Multiarmed Bandit Problem

TL;DR: This work shows that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support.
Journal Article

Using confidence bounds for exploitation-exploration trade-offs

TL;DR: It is shown how a standard tool from statistics, namely confidence bounds, can be used to elegantly deal with situations which exhibit an exploitation-exploration trade-off, and improves the regret from O(T^{3/4}) to O(T^{1/2}).
Proceedings Article

Stochastic Linear Optimization Under Bandit Feedback

TL;DR: A nearly complete characterization of the classical stochastic k-armed bandit problem in terms of both upper and lower bounds for the regret is given, and two variants of an algorithm based on the idea of “upper confidence bounds” are presented.
Book Chapter

Load shedding in a data stream manager

TL;DR: This paper examines a technique for dynamically inserting and removing drop operators into query plans as required by the current load, and addresses the problems of determining when load shedding is needed, where in the query plan to insert drops, and how much of the load should be shed at that point in the plan.