
Showing papers on "Classifier chains published in 2015"


Journal ArticleDOI
TL;DR: This paper compares CT with several recently proposed classifier chain methods to show that it occupies an important niche: it is highly competitive on standard multi-label problems, yet it can also scale up to thousands or even tens of thousands of labels.

81 citations


Journal ArticleDOI
TL;DR: A performance comparison of state-of-the-art multi-label learning algorithms for the analysis of multivariate sequential clinical data from medical records of patients affected by chronic diseases using the publicly available MIMIC-II dataset.

57 citations


Proceedings ArticleDOI
11 Jul 2015
TL;DR: This paper proposes a novel genetic algorithm capable of searching for a single optimized label ordering, while at the same time taking into consideration the utilization of partial chains, and demonstrates that the approach is able to produce models that are both simpler and more accurate.
Abstract: Multi-label classification (MLC) is the task of assigning multiple class labels to an object based on the features that describe the object. One of the most effective MLC methods is known as Classifier Chains (CC). This approach consists in training q binary classifiers linked in a chain, y1 → y2 → ... → yq, with each responsible for classifying a specific label in {l1, l2, ..., lq}. The chaining mechanism allows each individual classifier to incorporate the predictions of the previous ones as additional information at classification time. Thus, possible correlations among labels can be automatically exploited. Nevertheless, CC suffers from two important drawbacks: (i) the label ordering is decided at random, although it usually has a strong effect on predictive accuracy; (ii) all labels are inserted into the chain, although some of them might carry irrelevant information to discriminate the others. In this paper we tackle both problems at once, by proposing a novel genetic algorithm capable of searching for a single optimized label ordering, while at the same time taking into consideration the utilization of partial chains. Experiments on benchmark datasets demonstrate that our approach is able to produce models that are both simpler and more accurate.
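The chaining mechanism described in this abstract can be sketched compactly. The snippet below is an illustrative reconstruction, not the authors' implementation: `TinyLogit` is a hypothetical stand-in base learner (any binary classifier would do), and the paper's genetic search over orderings and partial chains is out of scope here — `order` is just a plain parameter.

```python
import numpy as np

class TinyLogit:
    """Minimal logistic regression trained by gradient descent
    (an illustrative stand-in for any binary base classifier)."""
    def fit(self, X, y, lr=0.1, steps=300):
        Xb = np.hstack([X, np.ones((len(X), 1))])      # add bias column
        self.w = np.zeros(Xb.shape[1])
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ self.w, -30, 30)))
            self.w -= lr * (Xb.T @ (p - y)) / max(len(y), 1)
        return self
    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return (Xb @ self.w > 0).astype(int)

class ClassifierChain:
    """Train q binary classifiers in the order y1 -> y2 -> ... -> yq,
    feeding each classifier the labels that precede it in the chain."""
    def __init__(self, order=None):
        self.order = order
    def fit(self, X, Y):
        q = Y.shape[1]
        self.order = list(range(q)) if self.order is None else self.order
        self.models, Xa = [], X
        for j in self.order:
            self.models.append(TinyLogit().fit(Xa, Y[:, j]))
            Xa = np.hstack([Xa, Y[:, [j]]])   # true labels at training time
        return self
    def predict(self, X):
        P = np.zeros((len(X), len(self.order)), dtype=int)
        Xa = X
        for m, j in zip(self.models, self.order):
            P[:, j] = m.predict(Xa)
            Xa = np.hstack([Xa, P[:, [j]]])   # predicted labels at test time
        return P
```

A fitness function over `order` (plus a mechanism for dropping labels from the chain) is what the paper's genetic algorithm would search over.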

29 citations


Proceedings ArticleDOI
01 Jun 2015
TL;DR: A simple and efficient framework for multi-label classification, called Group sensitive Classifier Chains, which assumes that similar examples not only share the same label correlations, but also tend to have similar labels.
Abstract: In multi-label classification, labels often have correlations with each other, and exploiting label correlations can improve the performance of classifiers. Current multi-label classification methods mainly consider global label correlations; however, the label correlations may differ across different data groups. In this paper, we propose a simple and efficient framework for multi-label classification, called Group sensitive Classifier Chains. We assume that similar examples not only share the same label correlations, but also tend to have similar labels. We augment the original feature space with the label space and cluster the examples into groups, then learn the label dependency graph in each group and build classifier chains on each group-specific label dependency graph. The group-specific classifier chains built on the nearest group of the test example are used for prediction. Comparisons with state-of-the-art approaches demonstrate the competitive performance of our method.
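A rough sketch of the pipeline this abstract describes (cluster the augmented [X | Y] space, then build one chain per group) follows. It is an assumption-laden reconstruction, not the paper's method: `TinyLogit` and `kmeans` are minimal stand-ins, each group's chain uses a fixed label order rather than a learned group-specific dependency graph, and test examples are assigned to the nearest group by the feature part of each cluster center, since labels are unknown at test time.

```python
import numpy as np

class TinyLogit:
    """Minimal logistic regression (illustrative stand-in base learner)."""
    def fit(self, X, y, lr=0.1, steps=300):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        self.w = np.zeros(Xb.shape[1])
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ self.w, -30, 30)))
            self.w -= lr * (Xb.T @ (p - y)) / max(len(y), 1)
        return self
    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return (Xb @ self.w > 0).astype(int)

def kmeans(Z, k, iters=25, seed=0):
    """Bare-bones k-means returning centers and assignments."""
    rng = np.random.default_rng(seed)
    C = Z[rng.choice(len(Z), size=k, replace=False)].astype(float)
    for _ in range(iters):
        a = np.argmin(((Z[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (a == c).any():
                C[c] = Z[a == c].mean(axis=0)
    return C, a

class GroupSensitiveChains:
    def __init__(self, k=2):
        self.k = k
    def fit(self, X, Y):
        Z = np.hstack([X, Y])                 # augment features with labels
        self.centers, a = kmeans(Z, self.k)
        self.d, self.q = X.shape[1], Y.shape[1]
        self.chains = []
        for c in range(self.k):               # one classifier chain per group
            Xa, Yc = X[a == c], Y[a == c]
            chain = []
            for j in range(self.q):
                chain.append(TinyLogit().fit(Xa, Yc[:, j]))
                Xa = np.hstack([Xa, Yc[:, [j]]])
            self.chains.append(chain)
        return self
    def predict(self, X):
        # nearest group by the feature part of the cluster centers
        g = np.argmin(((X[:, None, :] - self.centers[None, :, :self.d]) ** 2)
                      .sum(-1), axis=1)
        P = np.zeros((len(X), self.q), dtype=int)
        for c in range(self.k):
            idx = np.where(g == c)[0]
            if idx.size == 0:
                continue
            Xa = X[idx]
            for j, m in enumerate(self.chains[c]):
                P[idx, j] = m.predict(Xa)
                Xa = np.hstack([Xa, P[idx, j].reshape(-1, 1)])
        return P
```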

23 citations


Proceedings ArticleDOI
01 Jan 2015
TL;DR: A novel probabilistic ensemble framework for multi-label classification that is based on the mixtures-of-experts architecture is developed that can recover a rich set of dependency relations among inputs and outputs that a single multi- label classification model cannot capture due to its modeling simplifications.
Abstract: We develop a novel probabilistic ensemble framework for multi-label classification that is based on the mixtures-of-experts architecture. In this framework, we combine multi-label classification models in the classifier chains family that decompose the class posterior distribution P(Y1, …, Yd |X) using a product of posterior distributions over components of the output space. Our approach captures different input-output and output-output relations that tend to change across data. As a result, we can recover a rich set of dependency relations among inputs and outputs that a single multi-label classification model cannot capture due to its modeling simplifications. We develop and present algorithms for learning the mixtures-of-experts models from data and for performing multi-label predictions on unseen data instances. Experiments on multiple benchmark datasets demonstrate that our approach achieves highly competitive results and outperforms the existing state-of-the-art multi-label classification methods.

12 citations


Proceedings ArticleDOI
24 Aug 2015
TL;DR: The widely known classifier chains method for multi-label classification overcomes the disadvantages of BR and achieves higher predictive performance, but still retains important advantages of BR, most importantly low time complexity.
Abstract: The widely known classifier chains method for multi-label classification, which is based on the binary relevance (BR) method, overcomes the disadvantages of BR and achieves higher predictive performance, while retaining important advantages of BR, most notably low time complexity. Nevertheless, a randomly arranged chain can be poorly ordered. We address this issue with a different strategy: the K-means algorithm is applied several times to capture the correlations between labels and to determine the order of the binary classifiers in the chain. By improving the accuracy of earlier predictions, the algorithm ensures that the correct label correlations are propagated along the chain as far as possible. Experimental results on the Reuters-21578 text data set and an image data set show that the approach is efficient and appealing in most cases.

11 citations


Proceedings Article
25 Jul 2015
TL;DR: A novel polytree-augmented classifier chains method based on the max-sum algorithm that is competitive with the state-of-the-art multi-label classification methods.
Abstract: Multi-label classification is a challenging and appealing supervised learning problem in which a subset of labels, rather than the single label seen in traditional classification problems, is assigned to a single test instance. Classifier-chains-based methods are a promising strategy for tackling multi-label classification problems, as they model label correlations at acceptable complexity. However, these methods have difficulty approximating the underlying dependency structure in the label space, and suffer from poorly ordered chains and error propagation. In this paper, we propose a novel polytree-augmented classifier chains method to remedy these problems. A polytree is used to model reasonable conditional dependencies between labels over the attributes, under which the directional relationships between labels within causal basins can be appropriately determined. In addition, based on the max-sum algorithm, exact inference can be performed on polytrees at reasonable cost, preventing error propagation. Experiments performed on both artificial and benchmark multi-label data sets demonstrate that the proposed method is competitive with state-of-the-art multi-label classification methods.

11 citations


Journal ArticleDOI
TL;DR: This work proposes MIML-ECC (ensemble of classifier chains), which exploits bag-level context through label correlations to improve instance-level prediction accuracy and achieves higher or comparable accuracy in comparison with several recent methods and baselines.
Abstract: In multi-instance multi-label (MIML) instance annotation, the goal is to learn an instance classifier while training on a MIML dataset, which consists of bags of instances paired with label sets; instance labels are not provided in the training data. The MIML formulation can be applied in many domains. For example, in an image domain, bags are images, instances are feature vectors representing segments in the images, and the label sets are lists of objects or categories present in each image. Although many MIML algorithms have been developed for predicting the label set of a new bag, only a few have been specifically designed to predict instance labels. We propose MIML-ECC (ensemble of classifier chains), which exploits bag-level context through label correlations to improve instance-level prediction accuracy. The proposed method is scalable in all dimensions of a problem (bags, instances, classes, and feature dimension) and has no parameters that require tuning (which is a problem for prior methods). In experiments on two image datasets, a bioacoustics dataset, and two artificial datasets, MIML-ECC achieves higher or comparable accuracy in comparison with several recent methods and baselines.

11 citations


Proceedings ArticleDOI
09 Nov 2015
TL;DR: A new algorithm, ETCC, is proposed that optimizes the chain order from a global perspective and achieves better results than Classifier Chains (CC), a popular multi-label classification algorithm.
Abstract: Parkinson disease is a chronic, degenerative disease of the central nervous system that commonly occurs in the elderly; to date, no treatment has shown efficacy. Traditional Chinese Medicine offers a new approach to Parkinson disease, and data on Chinese Medicine for Parkinson form a multi-label dataset. Classifier Chains (CC) is a popular multi-label classification algorithm that considers the relativity between labels while retaining the high efficiency of binary classification. However, CC does not actually specify how to obtain the order of the chain, relying instead on a random or manually specified ordering. In this paper, we apply multi-label classification techniques to build a model of Chinese Medicine for Parkinson, which we hope will advance this field. We propose a new algorithm, ETCC, based on the CC model, which optimizes the chain order from a global perspective and achieves better results than CC.

10 citations



Journal Article
Wang Shao
TL;DR: A classifier circle method for multi-label learning initializes the label ordering randomly, and then iteratively updates the classifier for each label by connecting the labels as a circle.
Abstract: Exploiting label relationships to improve learning performance is important for multi-label learning. The classifier chain method and its variants have been shown to be a powerful solution to this problem. However, the learning process requires an ordering of labels, which is hard to obtain in real-world situations, and an incorrect label ordering may cause suboptimal performance. To overcome this drawback, this paper presents a classifier circle method for multi-label learning. It initializes the label ordering randomly, and then iteratively updates the classifier for each label by connecting the labels as a circle. Experimental results on a number of data sets show that the proposal outperforms the classifier chains method as well as many state-of-the-art multi-label methods.
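Read literally, the procedure could look like the sketch below. This is one plausible reading, not the paper's code: after a random initial ordering, each label's classifier is repeatedly refit using the current predictions of all the other labels as extra features, cycling around the circle for a fixed number of passes. `TinyLogit` is a hypothetical stand-in base learner.

```python
import numpy as np

class TinyLogit:
    """Minimal logistic regression (illustrative stand-in base learner)."""
    def fit(self, X, y, lr=0.1, steps=300):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        self.w = np.zeros(Xb.shape[1])
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ self.w, -30, 30)))
            self.w -= lr * (Xb.T @ (p - y)) / max(len(y), 1)
        return self
    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return (Xb @ self.w > 0).astype(int)

class ClassifierCircle:
    def __init__(self, passes=3, seed=0):
        self.passes, self.seed = passes, seed
    def fit(self, X, Y):
        q = Y.shape[1]
        rng = np.random.default_rng(self.seed)
        self.order = rng.permutation(q)       # random initial label ordering
        P = np.zeros_like(Y)                  # current label estimates
        self.models = [None] * q
        for _ in range(self.passes):          # go around the circle
            for j in self.order:
                others = [k for k in range(q) if k != j]
                Xa = np.hstack([X, P[:, others]])
                self.models[j] = TinyLogit().fit(Xa, Y[:, j])
                P[:, j] = self.models[j].predict(Xa)
        return self
    def predict(self, X):
        q = len(self.models)
        P = np.zeros((len(X), q), dtype=int)
        for _ in range(self.passes):
            for j in self.order:
                others = [k for k in range(q) if k != j]
                P[:, j] = self.models[j].predict(np.hstack([X, P[:, others]]))
        return P
```

Unlike a chain, every classifier here eventually sees estimates for every other label, which is what removes the dependence on a single fixed ordering.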

Proceedings ArticleDOI
13 Apr 2015
TL;DR: Experiments show the effect of the block size, in particular relative to the performance of the two extremes, BR and CC: some regions of the block-size parameter space lead to degraded performance, whereas others improve performance to a noticeable but modest extent.
Abstract: Two fundamental and prominent methods for multi-label classification, Binary Relevance (BR) and Classifier Chains (CC), are usually considered to be distinct methods without direct relationship. However, BR and CC can be generalized to one single method: blockwise classifier chains (BCC), where labels within a block (i.e. a group of labels of fixed size) are predicted independently as in BR but then combined to predict the next block's labels as in CC. In other words, only the blocks are connected in a chain. BR is then a special case of BCC with a block size equal to the number of labels, and CC a special case with a block size equal to one. The rationale behind BCC is to limit the propagation of errors made by inaccurate classifiers early in the chain, which should be alleviated by the expected block effect. Another, yet different generalization is based on the divide-and-conquer principle, not error propagation, but fails to exhibit the desired block effect. Ensembles of BCC are also discussed and experiments confirm that their performance is on par with ensembles of CC. Further experiments show the effect of the block size, in particular with respect to the performance of the two extremes, BR and CC. As it turns out, some regions of the block size parameter space lead to degraded performance, whereas others improve performance to a noticeable but modest extent.
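The block construction is easy to state in code. The sketch below is illustrative, not the authors' implementation, with `TinyLogit` a hypothetical stand-in base learner; as in the abstract, `block_size = q` recovers BR and `block_size = 1` recovers CC.

```python
import numpy as np

class TinyLogit:
    """Minimal logistic regression (illustrative stand-in base learner)."""
    def fit(self, X, y, lr=0.1, steps=300):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        self.w = np.zeros(Xb.shape[1])
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ self.w, -30, 30)))
            self.w -= lr * (Xb.T @ (p - y)) / max(len(y), 1)
        return self
    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return (Xb @ self.w > 0).astype(int)

def blocks(q, b):
    """Partition label indices 0..q-1 into consecutive blocks of size b."""
    return [list(range(i, min(i + b, q))) for i in range(0, q, b)]

class BlockwiseChain:
    """BCC: labels inside a block are predicted independently (as in BR);
    each block's outputs are appended to the features for the next block
    (as in CC), so only the blocks are connected in a chain."""
    def __init__(self, block_size):
        self.b = block_size
    def fit(self, X, Y):
        self.parts = blocks(Y.shape[1], self.b)
        self.models, Xa = [], X
        for blk in self.parts:
            self.models.append([TinyLogit().fit(Xa, Y[:, j]) for j in blk])
            Xa = np.hstack([Xa, Y[:, blk]])   # chain only between blocks
        return self
    def predict(self, X):
        Xa, out = X, []
        for ms in self.models:
            P = np.column_stack([m.predict(Xa) for m in ms])
            out.append(P)
            Xa = np.hstack([Xa, P])
        return np.hstack(out)
```

The intended block effect is that errors made by one classifier cannot contaminate the other labels in its own block, only later blocks.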


Proceedings ArticleDOI
12 Jul 2015
TL;DR: A novel method called Metro Map Classifier (MMC) is presented in which binary classifiers are connected by a metro map and can be applied in any order; results show that MMC performs between 10% and 50% better depending on the type of content.
Abstract: Several existing multidimensional classification methods attempt to build meaningful multidimensional structures from simple Binary Relevance (BR) classifiers. One of the recent methods is the Classifier Chains (CC) method, which applies several binary classifiers in sequence. While the method offers a major reduction in complexity, it is not clear how to define the order of the binary classifiers in the chain. This paper presents a novel method called Metro Map Classifier (MMC), in which binary classifiers are connected by a metro map and can be applied in any order. Results show that MMC performs between 10% and 50% better depending on the type of content. The ultimate target of this research is automation when selecting a small subset of content from Big Data.

Book ChapterDOI
01 Jan 2015
TL;DR: The Classifier Chain (CC) method was applied to transform the Generalized Maximum Entropy choice model from a single-label model to a multi-label model, indicating that the incorporation of the information on dependence patterns among alternatives can improve prediction performance.
Abstract: Multi-label classification can be applied to study empirically discrete choice problems, in which each individual chooses more than one alternative. We applied the Classifier Chain (CC) method to transform the Generalized Maximum Entropy (GME) choice model from a single-label model to a multi-label model. The contribution of our CC-GME model lies in the advantages of both the GME and CC models. Specifically, the GME model can not only predict each individual’s choice, but also robustly estimate model parameters that describe factors determining his or her choices. The CC model is a problem transformation method that allows the decision on each alternative to be correlated. We used Monte-Carlo simulations and occupational hazard data to compare the CC-GME model with other selected methodologies for multi-label problems using the Hamming Loss, Accuracy, Precision and Recall measures. The results confirm the robustness of GME estimates with respect to relevant parameters regardless of the true error distributions. Moreover, the CC method outperforms other methods, indicating that the incorporation of the information on dependence patterns among alternatives can improve prediction performance.

Book ChapterDOI
01 Jan 2015
TL;DR: In this article, the authors compare the behaviors of 12 multi-label classification methods in an interactive framework where "good" predictions must be produced in a very short time from a very small set of multilabel training examples.
Abstract: Interactive classification-based systems engage users to coach learning algorithms to take into account their own individual preferences. However most of the recent interactive systems limit the users to a single-label classification, which may be not expressive enough in some organization tasks such as film classification, where a multi-label scheme is required. The objective of this paper is to compare the behaviors of 12 multi-label classification methods in an interactive framework where “good” predictions must be produced in a very short time from a very small set of multi-label training examples. Experimentations highlight important performance differences for four complementary evaluation measures (Log-Loss, Ranking-Loss, Learning and Prediction Times). The best results are obtained for Multi-label k Nearest Neighbors (ML-kNN), ensemble of classifier chains (ECC), and ensemble of binary relevance (EBR).

Journal ArticleDOI
TL;DR: In this work, the classifier chains method is applied in E-MIMLSVM+ to incorporate label correlations instead of multi-task learning techniques to reduce time complexity and improve the predictive performance.
Abstract: In the multi-instance multi-label learning framework, an example is described by multiple instances and associated with multiple class labels at the same time. One way of tackling multi-instance multi-label problems is to identify an equivalent problem in the traditional supervised learning framework. However, useful information, such as the correlations between labels, may be lost in the process of degeneration, which influences classification performance. The E-MIMLSVM+ algorithm utilizes multi-task learning techniques to incorporate label correlations, but it is both time-consuming and memory-consuming. We therefore propose an improved algorithm in which the classifier chains method is applied in E-MIMLSVM+ to incorporate label correlations instead of multi-task learning techniques. The experimental results show that the proposed algorithm reduces time complexity and improves predictive performance.

Posted Content
01 Jun 2015 - viXra
TL;DR: This work presents an alternative model structure among the labels, such that the Bayesian optimal inference is then computationally feasible, and shows that the Viterbi CC can perform best on a range of real-world datasets.
Abstract: Multi-dimensional classification (MDC, also known variously as multi-target, multi-objective, and multi-output classification) is the supervised learning problem where an instance is associated with multiple qualitative discrete variables (a.k.a. labels), rather than with a single class, as in traditional classification problems. Since these classes are often strongly correlated, modeling the dependencies between them allows MDC methods to improve their performance -- at the expense of an increased computational cost. A popular method for multi-label classification is classifier chains (CC), in which the predictions of individual classifiers are cascaded along a chain, thus taking into account inter-label dependencies. Different variants of CC methods have been introduced, and many of them perform very competitively across a wide range of benchmark datasets. However, scalability limitations become apparent on larger datasets when modeling a fully cascaded chain. In this work, we present an alternative model structure among the labels, such that Bayesian optimal inference is computationally feasible. The inference is efficiently performed using a Viterbi-type algorithm. As an additional contribution to the literature, we analyze the relative advantages and interaction of three aspects of classifier chain design with regard to predictive performance versus efficiency: finding a good chain structure vs. a random structure, carrying out complete inference vs. approximate or greedy inference, and a linear vs. non-linear base classifier. We show that our Viterbi CC can perform best on a range of real-world datasets.
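For a chain-structured label model, Bayes-optimal (MAP) decoding is the classic Viterbi recursion. The function below is a generic dynamic-programming sketch over binary labels with illustrative unary and pairwise log-scores — it is not the paper's exact model or parameterization.

```python
import numpy as np

def viterbi_binary_chain(unary, pairwise):
    """MAP assignment of binary labels y_1..y_q on a chain, maximizing
    sum_j unary[j][y_j] + sum_j pairwise[j-1][y_{j-1}, y_j].
    unary: (q, 2) log-scores; pairwise: (q-1, 2, 2) log-scores."""
    q = unary.shape[0]
    delta = unary[0].copy()             # best score ending in each state
    back = np.zeros((q, 2), dtype=int)  # backpointers
    for j in range(1, q):
        # s[a, b]: best score of a path ending with y_{j-1}=a, y_j=b
        s = delta[:, None] + pairwise[j - 1] + unary[j][None, :]
        back[j] = np.argmax(s, axis=0)
        delta = np.max(s, axis=0)
    y = np.zeros(q, dtype=int)
    y[-1] = int(np.argmax(delta))
    for j in range(q - 1, 0, -1):       # follow backpointers
        y[j - 1] = back[j, y[j]]
    return y
```

Runtime is O(q) with a constant factor of four table entries per step, versus the 2^q cost of brute-force enumeration of label combinations.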

Journal ArticleDOI
TL;DR: This paper presents the double-layer based classifier chains method (DCC), which overcomes the disadvantages of BR and inherits the benefits of the classifier chains method (CC), and extends the approach further in an ensemble framework.
Abstract: In multi-label learning, each training example is associated with a set of labels and the task is to predict the proper label set for each unseen instance. The widely known binary relevance method (BR) for multi-label classification considers each label as an independent binary problem; it is criticized in the literature for not considering label correlations. In this paper, we present our double-layer based classifier chains method (DCC), which overcomes the disadvantages of BR and inherits the benefits of the classifier chains method (CC). This algorithm decomposes the multi-label classification problem into two classification processes to generate the classifier chain. Each classifier in the chain is responsible for learning and predicting the binary association of its label given the attribute space expanded by all prior binary relevance predictions in the chain. This chaining allows DCC to take into account correlations in the label space. We also extend this approach further in an ensemble framework. An extensive evaluation covers a broad range of multi-label datasets with a variety of evaluation measures specifically designed for multi-label classification. Experiments on benchmark datasets validate the effectiveness of the proposed approach compared with state-of-the-art methods in terms of average ranking.