
Showing papers on "Classifier chains published in 2012"


Journal ArticleDOI
TL;DR: The results of the analysis show that for multi-label classification the best performing methods overall are random forests of predictive clustering trees (RF-PCT) and hierarchy of multi-label classifiers (HOMER), followed by binary relevance (BR) and classifier chains (CC).

711 citations


Journal ArticleDOI
TL;DR: It is claimed that two types of label dependence should be distinguished, namely conditional and marginal dependence, and three scenarios in which the exploitation of one of these types of dependence may boost the predictive performance of a classifier are presented.
Abstract: Most of the multi-label classification (MLC) methods proposed in recent years have intended to exploit, in one way or another, dependencies between the class labels. Compared to simple binary relevance learning as a baseline, any gain in performance is normally explained by the fact that the baseline ignores such dependencies. Without questioning the correctness of such studies, one has to admit that a blanket explanation of that kind hides many subtle details; indeed, the underlying mechanisms and true reasons for the improvements reported in experimental studies are rarely laid bare. Rather than proposing yet another MLC algorithm, the aim of this paper is to elaborate more closely on the idea of exploiting label dependence, thereby contributing to a better understanding of MLC. Adopting a statistical perspective, we claim that two types of label dependence should be distinguished, namely conditional and marginal dependence. Subsequently, we present three scenarios in which the exploitation of one of these types of dependence may boost the predictive performance of a classifier. In this regard, a close connection with loss minimization is established, showing that the benefit of exploiting label dependence also depends on the type of loss to be minimized. Concrete theoretical results are presented for two representative loss functions, namely the Hamming loss and the subset 0/1 loss. In addition, we give an overview of state-of-the-art decomposition algorithms for MLC and try to reveal the reasons for their effectiveness. Our conclusions are supported by carefully designed experiments on synthetic and benchmark data.
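The two loss functions named in the abstract differ in what they count: Hamming loss penalizes each wrongly predicted label position individually, while subset 0/1 loss penalizes any deviation of the full label vector. A minimal sketch with toy arrays (the data is illustrative, not from the paper):

```python
import numpy as np

def hamming_loss(Y_true, Y_pred):
    # fraction of individual label positions predicted incorrectly
    return np.mean(Y_true != Y_pred)

def subset_zero_one_loss(Y_true, Y_pred):
    # fraction of instances whose full label vector is not exactly right
    return np.mean(np.any(Y_true != Y_pred, axis=1))

Y_true = np.array([[1, 0, 1], [0, 1, 1]])
Y_pred = np.array([[1, 0, 0], [0, 1, 1]])
print(hamming_loss(Y_true, Y_pred))          # one wrong bit out of six
print(subset_zero_one_loss(Y_true, Y_pred))  # one of two vectors wrong
```

A single wrong bit thus costs 1/6 under Hamming loss but a full 1/2 under subset 0/1 loss, which is why the two losses reward different prediction strategies.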

317 citations


Proceedings ArticleDOI
27 Aug 2012
TL;DR: A detailed probabilistic analysis of classifier chains from a risk minimization perspective is provided, thereby helping to gain a better understanding of this approach.
Abstract: The idea of classifier chains has recently been introduced as a promising technique for multi-label classification. However, despite being intuitively appealing and showing strong performance in empirical studies, still very little is known about the main principles underlying this type of method. In this paper, we provide a detailed probabilistic analysis of classifier chains from a risk minimization perspective, thereby helping to gain a better understanding of this approach. As a main result, we clarify that the original chaining method seeks to approximate the joint mode of the conditional distribution of label vectors in a greedy manner. As a result of a theoretical regret analysis, we conclude that this approach can perform quite poorly in terms of subset 0/1 loss. Therefore, we present an enhanced inference procedure for which the worst-case regret can be upper-bounded far more tightly. In addition, we show that a probabilistic variant of chaining, which can be utilized for any loss function, becomes tractable by using Monte Carlo sampling. Finally, we present experimental results confirming the validity of our theoretical findings.
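The paper's central observation, that greedy chaining approximates the joint mode and can miss it, can be reproduced on a two-label toy distribution (the probabilities below are illustrative, not from the paper):

```python
from itertools import product

# Toy conditional distribution over two binary labels (illustrative numbers).
p_y1 = {1: 0.6, 0: 0.4}
p_y2_given_y1 = {1: {1: 0.4, 0: 0.6}, 0: {1: 0.95, 0: 0.05}}

def joint(y1, y2):
    return p_y1[y1] * p_y2_given_y1[y1][y2]

# Greedy chaining: commit to the marginally best y1, then the best y2 given it.
y1_greedy = max(p_y1, key=p_y1.get)
y2_greedy = max(p_y2_given_y1[y1_greedy], key=p_y2_given_y1[y1_greedy].get)

# Exact joint mode: enumerate all label vectors.
mode = max(product([0, 1], repeat=2), key=lambda y: joint(*y))

print((y1_greedy, y2_greedy), joint(y1_greedy, y2_greedy))  # greedy: (1, 0)
print(mode, joint(*mode))                                   # mode:   (0, 1)
```

Greedy chaining commits to (1, 0) with joint probability 0.36, while the true mode is (0, 1) with probability 0.38 — exactly the kind of gap the paper's regret analysis quantifies for subset 0/1 loss.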

59 citations


Posted Content
09 Nov 2012
TL;DR: A double Monte Carlo scheme (M2CC) is proposed both to find a good chain sequence and to perform efficient inference; it remains tractable for high-dimensional data sets and obtains the best overall accuracy on several real data sets with input dimension as high as 1449 and up to 103 labels.
Abstract: Multi-label classification (MLC) is the supervised learning problem where an instance may be associated with multiple labels. Modeling dependencies between labels allows MLC methods to improve their performance at the expense of an increased computational cost. In this paper we focus on the classifier chains (CC) approach for modeling dependencies. On the one hand, the original CC algorithm makes a greedy approximation, and is fast but tends to propagate errors down the chain. On the other hand, a recent Bayes-optimal method improves the performance, but is computationally intractable in practice. Here we present a novel double-Monte Carlo scheme (M2CC), both for finding a good chain sequence and performing efficient inference. The M2CC algorithm remains tractable for high-dimensional data sets and obtains the best overall accuracy, as shown on several real data sets with input dimension as high as 1449 and up to 103 labels.
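The inference half of such a scheme can be sketched as sampling label vectors through the chain and keeping the most probable one seen. Here `link_prob` is a hypothetical stand-in for the chain's trained per-label classifiers, not part of the paper:

```python
import math
import random

# Hypothetical stand-in for trained chain classifiers: the probability that
# label k is 1 given the labels already sampled earlier in the chain.
def link_prob(k, prev):
    return 0.3 + 0.4 * (prev[-1] if prev else 0)

def sample_chain(n_labels, rng):
    # Draw one label vector by sampling each link in chain order,
    # accumulating its log-probability under the chain.
    y, logp = [], 0.0
    for k in range(n_labels):
        p1 = link_prob(k, y)
        bit = 1 if rng.random() < p1 else 0
        y.append(bit)
        logp += math.log(p1 if bit else 1.0 - p1)
    return tuple(y), logp

def mc_mode_estimate(n_labels, n_samples=1000, seed=0):
    # Monte Carlo search for a high-probability label vector.
    rng = random.Random(seed)
    best, best_logp = None, float("-inf")
    for _ in range(n_samples):
        y, logp = sample_chain(n_labels, rng)
        if logp > best_logp:
            best, best_logp = y, logp
    return best

print(mc_mode_estimate(3))  # under link_prob the most probable vector is (0, 0, 0)
```

Sampling avoids enumerating all 2^L label vectors, which is what makes the approach tractable at the 103-label scale the abstract mentions.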

33 citations


Journal ArticleDOI
TL;DR: A hybrid framework is proposed which employs continuous probabilistic latent semantic analysis (PLSA) to model continuous quantities and uses ensembles of classifier chains to classify multi-label data in the discriminative learning stage; it can predict semantic annotations precisely for unseen images.
Abstract: To bridge the semantic gap that exists in image retrieval, this paper proposes an approach combining generative and discriminative learning to accomplish the task of automatic image annotation and retrieval. We first present continuous probabilistic latent semantic analysis (PLSA) to model continuous quantities. Furthermore, we propose a hybrid framework which employs continuous PLSA to model visual features of images in the generative learning stage and uses ensembles of classifier chains to classify the multi-label data in the discriminative learning stage. Since the framework combines the advantages of generative and discriminative learning, it can predict semantic annotations precisely for unseen images. Finally, we conduct a series of experiments on a standard Corel dataset. The experimental results show that our approach outperforms many state-of-the-art approaches.

5 citations


Posted Content
09 Nov 2012
TL;DR: This paper focuses on the classifier chains (CC) approach for modeling dependencies, and presents novel Monte Carlo schemes, both for finding a good chain sequence and performing efficient inference.
Abstract: Multi-dimensional classification (MDC) is the supervised learning problem where an instance may be associated with multiple classes, rather than with a single class as in traditional binary or multi-class single-dimensional classification (SDC) problems. MDC is closely related to multi-task learning and multi-target learning (generally, in the literature, multi-target refers to the regression case). Modeling dependencies between labels allows MDC methods to improve their performance at the expense of an increased computational cost. In this paper we focus on the classifier chains (CC) approach for modeling dependencies. On the one hand, the original CC algorithm makes a greedy approximation, and is fast but tends to propagate errors down the chain. On the other hand, a recent Bayes-optimal method improves the performance, but is computationally intractable in practice. Here we present novel Monte Carlo schemes, both for finding a good chain sequence and performing efficient inference. Our algorithms remain tractable for high-dimensional data sets and obtain the best overall accuracy, as shown on several real data sets.
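The chain-sequence part of the method — searching over label orderings — can be sketched as sampling random permutations and keeping the best-scoring one. The scorer below is a hypothetical stand-in (e.g. for cross-validated accuracy of a chain trained with that ordering), not the paper's actual criterion:

```python
import random

def search_chain_order(n_labels, score_fn, n_samples=50, seed=0):
    # Sample random label orderings and keep the best-scoring one.
    rng = random.Random(seed)
    best_order, best_score = None, float("-inf")
    for _ in range(n_samples):
        order = list(range(n_labels))
        rng.shuffle(order)
        score = score_fn(tuple(order))
        if score > best_score:
            best_order, best_score = list(order), score
    return best_order

# Toy scorer rewarding orders that place label 0 first -- a stand-in for
# evaluating a chain actually trained with each candidate ordering.
demo_score = lambda order: 1.0 if order[0] == 0 else 0.0
print(search_chain_order(4, demo_score))
```

With L labels there are L! orderings, so random sampling with a cheap score is the pragmatic alternative to exhaustive search.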

2 citations


Journal ArticleDOI
TL;DR: A novel model of risk-neutral reinforcement learning in a traditional Bucket Brigade credit-allocation market under the pressure of a Genetic Algorithm is developed and suggests a path toward a new type of LCS built on stable, heterogeneous, and risk-averse preferences under efficient auctions and access to more complete markets exploitable by competing risk management strategies.
Abstract: Both economics and biology have come to agree that successful behavior in a stochastic environment responds to the variance of potential outcomes. Unfortunately, when biological and economic paradigms are mated together in a learning classifier system (LCS), decision-making agents called classifiers typically simply ignore risk. Since a fundamental problem of learning is risk management, LCS have not always performed as well as theoretically predicted. This paper develops a novel model of risk-neutral reinforcement learning in a traditional Bucket Brigade credit-allocation market under the pressure of a Genetic Algorithm. I demonstrate the applicability of the basic model to the classical LCS design and reexamine two basic issues where traditional LCS performance fails to meet expectations: default hierarchies and long chains of coupled classifiers. Risk-neutrality and noisy probabilistic auctions create dynamic instability in both areas, while identical preferences result in market failure in default hierarchies and exponential attenuation of price signals down classifier chains. Despite the limitations of simple risk-neutral classifiers, I show they’re capable of cheap short-run emulation of more rational behaviors. Still, risk-neutral information markets are a dead end. The model suggests a path toward a new type of LCS built on stable, heterogeneous, and risk-averse preferences under efficient auctions and access to more complete markets exploitable by competing risk management strategies. This will require a radical rethinking of the evolutionary and economic algorithms, but ultimately heralds a return to a market-based approach to LCS.

1 citation


01 Jan 2012
TL;DR: An approach to estimate a discrete joint density online, that is, the algorithm is only provided the current example, its current estimate, and a limited amount of memory, is proposed.
Abstract: We propose an approach to estimate a discrete joint density online; that is, the algorithm is only provided the current example, its current estimate, and a limited amount of memory. To design an online estimator for discrete densities, we use classifier chains to model dependencies among features. Each classifier in the chain estimates the probability of one particular feature. Because a single chain may not provide a reliable estimate, we also consider ensembles of classifier chains. Our experiments on synthetic data show that the approach is feasible and the estimated densities approach the true, known distribution with increasing amounts of data.
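The link between classifier chains and density estimation is the chain rule: a joint density factorizes into per-feature conditionals, one per chain link. A minimal batch counting sketch for two binary features (not the paper's online algorithm, which updates incrementally):

```python
from collections import Counter

# Chain-rule factorization for two binary features:
# p(x1, x2) = p(x1) * p(x2 | x1), each factor estimated by counting.
data = [(0, 0), (0, 1), (1, 1), (1, 1)]

def estimate(samples):
    n = len(samples)
    first = Counter(x[0] for x in samples)   # counts for x1 alone
    joint = Counter(samples)                 # counts for (x1, x2) pairs
    def density(x1, x2):
        if first[x1] == 0:
            return 0.0
        return (first[x1] / n) * (joint[(x1, x2)] / first[x1])
    return density

d = estimate(data)
print(d(1, 1))  # two of the four samples are (1, 1)
total = sum(d(a, b) for a in (0, 1) for b in (0, 1))
print(total)    # the factorized estimate sums to 1 over all outcomes
```

In the paper's setting each conditional would be produced by a classifier in the chain rather than by raw counts, which is what lets the estimate generalize across many features.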

Book ChapterDOI
12 Oct 2012
TL;DR: A hybrid framework is presented which employs continuous probabilistic latent semantic analysis to model continuous quantities and uses ensembles of classifier chains to classify multi-label data in the discriminative learning stage; it can predict semantic annotations precisely for unseen images.
Abstract: We first propose continuous probabilistic latent semantic analysis (PLSA) to model continuous quantities. In addition, the corresponding Expectation-Maximization (EM) algorithm is derived to determine the model parameters. Furthermore, we present a hybrid framework which employs continuous PLSA to model visual features of images in the generative learning stage and uses ensembles of classifier chains to classify the multi-label data in the discriminative learning stage. Since the framework combines the advantages of generative and discriminative learning, it can predict semantic annotations precisely for unseen images. Finally, we conduct a series of experiments on a standard Corel dataset. The experimental results show that our approach outperforms many state-of-the-art approaches.