
Showing papers on "Classifier chains published in 2013"


Proceedings ArticleDOI
04 Nov 2013
TL;DR: Experiments on diverse benchmark datasets, followed by the Wilcoxon test for assessing statistical significance, indicate that the proposed genetic algorithm for optimizing the label ordering in classifier chains produces more accurate classifiers.
Abstract: First proposed in 2009, the classifier chains model (CC) has become one of the most influential algorithms for multi-label classification. It is distinguished by its simple and effective approach to exploiting label dependencies. The CC method involves training q single-label binary classifiers, where each one is solely responsible for classifying a specific label in l1, ..., lq. These q classifiers are linked in a chain, such that each binary classifier can consider the labels predicted by the previous ones as additional information at classification time. The label ordering has a strong effect on predictive accuracy; however, it is usually decided at random and/or by combining random orders via an ensemble. A disadvantage of the ensemble approach is that it is not suitable when the goal is to generate interpretable classifiers. To tackle this problem, in this work we propose a genetic algorithm for optimizing the label ordering in classifier chains. Experiments on diverse benchmark datasets, followed by the Wilcoxon test for assessing statistical significance, indicate that the proposed strategy produces more accurate classifiers.
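The chaining mechanism described in the abstract can be sketched in a few lines. This is an illustrative toy implementation, not the authors' code: the 1-nearest-neighbour base learner and all function names are hypothetical stand-ins for an arbitrary binary classifier.

```python
# Minimal sketch of the classifier chains (CC) idea: q binary classifiers,
# each seeing the original features plus the labels predicted so far.
# The base learner is a toy 1-nearest-neighbour classifier (a hypothetical
# stand-in for any binary learner).

def nn_fit(X, y):
    return list(zip(X, y))                     # memorize the training set

def nn_predict(model, x):
    # label of the closest stored example (squared Euclidean distance)
    best = min(model, key=lambda xy: sum((a - b) ** 2 for a, b in zip(xy[0], x)))
    return best[1]

def cc_train(X, Y, order):
    """Train one binary classifier per label, chained in `order`.
    During training, the TRUE previous labels are used as extra features."""
    chain = []
    for i, j in enumerate(order):
        prev = order[:i]
        X_aug = [x + tuple(y[k] for k in prev) for x, y in zip(X, Y)]
        chain.append(nn_fit(X_aug, [y[j] for y in Y]))
    return chain

def cc_predict(chain, x, order):
    """At prediction time the chain must rely on its own ESTIMATED labels."""
    preds = {}
    for i, j in enumerate(order):
        x_aug = x + tuple(preds[k] for k in order[:i])
        preds[j] = nn_predict(chain[i], x_aug)
    return [preds[j] for j in sorted(preds)]

# Tiny demo: two features, two labels (label 1 equals label 0 here).
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
Y = [(0, 0), (1, 1), (0, 0), (1, 1)]
chain = cc_train(X, Y, order=[0, 1])
print(cc_predict(chain, (0.9, 0.9), [0, 1]))   # nearest example is (1, 1)
```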

98 citations


01 Jan 2013
TL;DR: In this article, the authors analyze the influence of the discrepancy between the feature spaces used in training and testing on the performance of classifier chains and propose two modifications to overcome this problem.
Abstract: Classifier chains have recently been proposed as an appealing method for tackling the multi-label classification task. In addition to several empirical studies showing its state-of-the-art performance, especially when being used in its ensemble variant, there are also some first results on theoretical properties of classifier chains. Continuing along this line, we analyze the influence of a potential pitfall of the learning process, namely the discrepancy between the feature spaces used in training and testing: While true class labels are used as supplementary attributes for training the binary models along the chain, the same models need to rely on estimations of these labels at prediction time. We elucidate under which circumstances the attribute noise thus created can affect the overall prediction performance. As a result of our findings, we propose two modifications of classifier chains that are meant to overcome this problem. Experimentally, we show that our variants are indeed able to produce better results in cases where the original chaining process is likely to fail.
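A toy simulation of the discrepancy discussed above, using two hand-written classifiers (both hypothetical, not from the paper): the second model behaves as if trained to copy the true previous label, so at prediction time it faithfully copies a wrong estimate.

```python
# Toy illustration of the train/test feature-space discrepancy in classifier
# chains: the second model was fit on TRUE first labels but is fed ESTIMATED
# ones at prediction time, so an early mistake propagates down the chain.

def first_clf(x):
    # deliberately imperfect: errs in the region 0.4 <= x < 0.5
    return 1 if x >= 0.5 else 0

def second_clf(x, prev_label):
    # learned (on true labels) to simply copy the previous label
    return prev_label

# true labelling: label 0 = label 1 = 1 iff x >= 0.4
xs = [0.1, 0.45, 0.9]
truth = [(1 if x >= 0.4 else 0,) * 2 for x in xs]
preds = []
for x in xs:
    l0 = first_clf(x)              # estimated, not true, first label
    preds.append((l0, second_clf(x, l0)))
print(preds)   # the error on label 0 at x = 0.45 is copied into label 1
```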

29 citations


Proceedings ArticleDOI
26 May 2013
TL;DR: This paper presents a novel double-Monte Carlo scheme (M2CC), both for finding a good chain sequence and performing efficient inference, which remains tractable for high-dimensional data sets and obtains the best overall accuracy.
Abstract: Multi-label classification (MLC) is the supervised learning problem where an instance may be associated with multiple labels. Modeling dependencies between labels allows MLC methods to improve their performance at the expense of an increased computational cost. In this paper we focus on the classifier chains (CC) approach for modeling dependencies. On the one hand, the original CC algorithm makes a greedy approximation, and is fast but tends to propagate errors down the chain. On the other hand, a recent Bayes-optimal method improves the performance, but is computationally intractable in practice. Here we present a novel double-Monte Carlo scheme (M2CC), both for finding a good chain sequence and performing efficient inference. The M2CC algorithm remains tractable for high-dimensional data sets and obtains the best overall accuracy, as shown on several real data sets with input dimension as high as 1449 and up to 103 labels.
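The order-search half of a Monte Carlo scheme of this kind can be sketched as follows; the `score` function here is a hypothetical stand-in for training a chain with a given label order and measuring its validation accuracy.

```python
import random

# Sketch of "search over chain orders" via Monte Carlo sampling: draw random
# label orders and keep the best-scoring one.

def score(order):
    # hypothetical proxy for validation accuracy: pretend the chain works
    # best when the dependent label 2 comes last in the order
    return 1.0 if order.index(2) == len(order) - 1 else 0.5

def mc_order_search(n_labels, n_samples, seed=0):
    rng = random.Random(seed)
    best_order, best_score = None, float("-inf")
    for _ in range(n_samples):
        order = list(range(n_labels))
        rng.shuffle(order)             # one Monte Carlo sample of an order
        s = score(order)
        if s > best_score:
            best_order, best_score = order, s
    return best_order, best_score

order, s = mc_order_search(n_labels=3, n_samples=50)
print(order, s)   # an order ending in label 2 is found with high probability
```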

27 citations


Journal ArticleDOI
TL;DR: A hybrid approach is proposed to learn the semantic concepts of images automatically using continuous probabilistic latent semantic analysis (PLSA) and its corresponding Expectation-Maximization (EM) algorithm.

26 citations


Book ChapterDOI
15 May 2013
TL;DR: This paper proposes selective ensemble of classifier chains (SECC), which tries to select a subset of classifier chains to compose the ensemble while keeping or improving performance, and formulates this problem as a convex optimization problem that can be efficiently solved by the stochastic gradient descent method.
Abstract: In multi-label learning, the relationship among labels is well accepted to be important, and various methods have been proposed to exploit label relationships. Amongst them, ensemble of classifier chains (ECC), which builds multiple chaining classifiers using random label orders, has drawn much attention. However, the ensembles generated by ECC are often unnecessarily large, leading to extra high computational and storage costs. To tackle this issue, in this paper we propose selective ensemble of classifier chains (SECC), which tries to select a subset of classifier chains to compose the ensemble while keeping or improving the performance. More precisely, we focus on the performance measure F1-score and formulate this problem as a convex optimization problem which can be efficiently solved by the stochastic gradient descent method. Experiments show that, compared with ECC, SECC is able to obtain much smaller ensembles while achieving better or at least comparable performance.
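The paper formulates chain selection as a convex optimization problem solved by stochastic gradient descent; as a much simpler illustrative stand-in, the following sketch greedily adds chains to the ensemble only while the validation F1-score of the majority vote improves.

```python
# Greedy forward selection of classifier chains (a simplified stand-in for
# the convex-optimization formulation in the paper). Each chain is
# represented only by its binary validation predictions for one label.

def f1(pred, true):
    tp = sum(p and t for p, t in zip(pred, true))
    fp = sum(p and not t for p, t in zip(pred, true))
    fn = sum(t and not p for p, t in zip(pred, true))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def vote(chains_preds):
    # majority vote over the selected chains' binary predictions
    return [int(sum(col) * 2 >= len(chains_preds)) for col in zip(*chains_preds)]

def select_chains(all_preds, true):
    selected, best = [], 0.0
    for p in all_preds:                 # add a chain only if F1 improves
        s = f1(vote(selected + [p]), true)
        if s > best:
            selected, best = selected + [p], s
    return selected, best

# three hypothetical chains' validation predictions for one label
true = [1, 0, 1, 1, 0]
preds = [[1, 0, 1, 1, 0],   # perfect chain
         [0, 1, 0, 0, 1],   # bad chain
         [1, 0, 1, 0, 0]]   # decent but redundant chain
subset, score = select_chains(preds, true)
print(len(subset), score)   # a single chain already reaches F1 = 1.0
```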

25 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: This work proposes a novel multi-label learning algorithm called Ensemble of Sampled Classifier Chains (ESCC) to improve clinical text classification; the algorithm automatically learns to select the relevant disease information that helps classification performance when exploiting possible disease label relations.
Abstract: Clinical data describing a patient's health status can be multi-labelled. For example, a clinical record describing a patient suffering from cough and fever should be tagged with both disease labels. These co-occurring labels are often interrelated, which can be exploited to improve disease classification. In this work, we treat the categorization of free clinical text as a multi-label learning problem. However, we find that some commonly used multi-label learning methods can suffer severe side effects when exploiting complicated disease label relations, such as over-exploitation of label relations and error propagation in label prediction. Based on these findings, we propose a novel multi-label learning algorithm called Ensemble of Sampled Classifier Chains (ESCC) to improve clinical text classification. ESCC automatically learns to select the relevant disease information that helps classification performance when exploiting possible disease label relations. In our experiments, ESCC shows strong advantages over other state-of-the-art multi-label algorithms on medical text data, with significant improvements in performance. The proposed algorithm is promising for mining knowledge from a wide range of multi-label medical text data.

17 citations


Proceedings ArticleDOI
01 Jan 2013
TL;DR: This paper proposes online distributed algorithms which can learn how to construct the optimal classifier chain in order to maximize the stream mining performance (i.e. mining accuracy minus cost) based on the dynamically-changing data characteristics.
Abstract: A plethora of emerging Big Data applications require processing and analyzing streams of data to extract valuable information in real time. For this, chains of classifiers which can detect various concepts need to be constructed in real time. In this paper, we propose online distributed algorithms which can learn how to construct the optimal classifier chain in order to maximize the stream mining performance (i.e., mining accuracy minus cost) based on the dynamically changing data characteristics. The proposed solution does not require the distributed local classifiers to exchange any information when learning at runtime. Moreover, our algorithm requires only limited feedback on the mining performance to enable the learning of the optimal classifier chain. We model the problem of learning the optimal classifier chain at run-time as a multi-player multi-armed bandit problem with limited feedback. To the best of our knowledge, this paper is the first to apply bandit techniques to stream mining problems. However, existing bandit algorithms are inefficient in the considered scenario because each component classifier learns its optimal classification functions using only the aggregate overall reward, without knowing its own individual reward and without exchanging information with other classifiers. We prove that the proposed algorithms achieve logarithmic learning regret uniformly over time and hence are order optimal; therefore, the long-term time-average performance loss tends to zero. We also design learning algorithms whose regret is linear in the number of classification functions. This is much smaller than the regret obtained with existing bandit algorithms, which scales linearly in the number of classifier chains and exponentially in the number of classification functions.
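The bandit framing above can be illustrated with plain UCB1 (a standard single-player bandit algorithm, far simpler than the distributed scheme proposed in the paper): each arm stands for one candidate classifier configuration, and the only feedback is a noisy aggregate reward.

```python
import math, random

# UCB1 over candidate configurations: pull the arm with the highest upper
# confidence bound on its mean reward; pulls concentrate on the best arm,
# giving logarithmic regret over time.

def ucb1(reward_means, horizon, seed=0):
    rng = random.Random(seed)
    n_arms = len(reward_means)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1                       # pull every arm once first
        else:
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        # limited feedback: a single noisy aggregate reward for the pull
        r = reward_means[arm] + rng.gauss(0, 0.05)
        counts[arm] += 1
        sums[arm] += r
    return counts

counts = ucb1([0.2, 0.5, 0.9], horizon=500)
print(counts)   # most pulls go to the 0.9-mean arm
```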

12 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: These experiments demonstrate that, even though designed to work online, EDDO delivers estimators of competitive accuracy compared to batch Bayesian structure learners and batch variants of EDDO.
Abstract: We address the problem of estimating a discrete joint density online, that is, the algorithm is only provided the current example and its current estimate. The proposed online estimator of discrete densities, EDDO (Estimation of Discrete Densities Online), uses classifier chains to model dependencies among features. Each classifier in the chain estimates the probability of one particular feature. Because a single chain may not provide a reliable estimate, we also consider ensembles of classifier chains and ensembles of weighted classifier chains. For all density estimators, we provide consistency proofs and propose algorithms to perform certain inference tasks. The empirical evaluation of the estimators is conducted in several experiments and on data sets of up to several million instances: We compare them to density estimates computed from Bayesian structure learners, evaluate them under the influence of noise, measure their ability to deal with concept drift, and measure the run-time performance. Our experiments demonstrate that, even though designed to work online, EDDO delivers estimators of competitive accuracy compared to batch Bayesian structure learners and batch variants of EDDO.
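The chain-rule factorization underlying an estimator of this kind can be sketched with two features, p(x1, x2) = p(x1) * p(x2 | x1); here simple online counters play the role of the per-feature classifiers (an illustrative simplification, not EDDO itself).

```python
from collections import Counter

# Online estimate of a discrete joint density via the chain rule:
# p(x1, x2) = p(x1) * p(x2 | x1). Each update sees exactly one example,
# mirroring the online setting described in the abstract.

class ChainDensity:
    def __init__(self):
        self.n = 0
        self.c1 = Counter()          # counts for feature 1
        self.c12 = Counter()         # joint counts for (feature 1, feature 2)

    def update(self, x):             # online: one example at a time
        self.n += 1
        self.c1[x[0]] += 1
        self.c12[x] += 1

    def prob(self, x):
        if self.n == 0 or self.c1[x[0]] == 0:
            return 0.0
        p1 = self.c1[x[0]] / self.n                # estimate of p(x1)
        p2_given_1 = self.c12[x] / self.c1[x[0]]   # estimate of p(x2 | x1)
        return p1 * p2_given_1

est = ChainDensity()
for x in [(0, 0), (0, 0), (0, 1), (1, 1)]:
    est.update(x)
print(est.prob((0, 0)))   # 3/4 * 2/3 = 0.5
```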

9 citations


Posted Content
TL;DR: This work proposes to use an ensemble of classifier chains combined with a histogram-of-segments representation for multi-label classification of birdsong, and shows that the proposed method usually outperforms binary relevance and is better in some cases and worse in others compared to the MIML algorithms.
Abstract: Bird sound data collected with unattended microphones for automatic surveys, or mobile devices for citizen science, typically contain multiple simultaneously vocalizing birds of different species. However, few works have considered the multi-label structure in birdsong. We propose to use an ensemble of classifier chains combined with a histogram-of-segments representation for multi-label classification of birdsong. The proposed method is compared with binary relevance and three multi-instance multi-label learning (MIML) algorithms from prior work (which focus more on structure in the sound, and less on structure in the label sets). Experiments are conducted on two real-world birdsong datasets, and show that the proposed method usually outperforms binary relevance (using the same features and base-classifier), and is better in some cases and worse in others compared to the MIML algorithms.

7 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: This work proposes MIML-ECC (ensemble of classifier chains), which exploits bag-level context through label correlations to improve instance-level prediction accuracy and achieves higher or comparable accuracy in comparison to several recent methods and baselines.
Abstract: In multi-instance multi-label (MIML) instance annotation, the goal is to learn an instance classifier while training on a MIML dataset, which consists of bags of instances paired with label sets; instance labels are not provided in the training data. The MIML formulation can be applied in many domains. For example, in an image domain, bags are images, instances are feature vectors representing segments of the images, and the label sets are lists of the objects or categories present in each image. Although many MIML algorithms have been developed for predicting the label set of a new bag, only a few have been specifically designed to predict instance labels. We propose MIML-ECC (ensemble of classifier chains), which exploits bag-level context through label correlations to improve instance-level prediction accuracy. The proposed method is scalable in all dimensions of a problem (bags, instances, classes, and feature dimension) and has no parameters that require tuning (a problem for prior methods). In experiments on two image datasets, a bioacoustics dataset, and two artificial datasets, MIML-ECC achieves higher or comparable accuracy in comparison to several recent methods and baselines.

6 citations


Book ChapterDOI
18 Sep 2013
TL;DR: A novel heuristic approach for finding appropriate label order in chain is presented and it is demonstrated that the method obtains competitive overall accuracy and is also tractable to higher-dimensional data.
Abstract: Multi-label classification, in opposition to conventional classification, assumes that each data instance may be associated with more than one label simultaneously. Multi-label learning methods take advantage of dependencies between labels, but this implies greater computational complexity of learning. The paper considers the Classifier Chain multi-label classification method, which in its original form is fast but assumes a fixed order of labels in the chain. This leads to propagation of inference errors down the chain. On the other hand, the recent Bayes-optimal method, Probabilistic Classifier Chain, overcomes this drawback but is computationally intractable. To find a trade-off solution, a novel heuristic approach for finding an appropriate label order in the chain is presented. It is demonstrated that the method obtains competitive overall accuracy and also remains tractable for higher-dimensional data.

12 Jun 2013
TL;DR: This work proposes a new method that considers the multi-label learning problem in which a portion of the label assignments is missing, extending the work on the ensemble classifier chain to learn models from training examples with incomplete label assignment.
Abstract: Many methods have been explored in the multi-label learning literature, ranging from simple problem transformation to more complex methods that capture correlations among labels. However, almost all existing works fail to address the challenge of incomplete label data. The goal of this project is to extend the work on the ensemble classifier chain to learn models from training examples with incomplete label assignment. This scenario is common in many real-world applications. For example, in image annotation, a user may provide only partial tags, or label assignments, for an image. We propose a new method that considers the multi-label learning problem in which a portion of the label assignments is missing. The project also includes an evaluation of the effect of the different parameters that accompany this approach.
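The incomplete-label setting can be made concrete with a small sketch. This masking step, where the classifier for a label is trained only on examples where that label is observed, is an illustrative simplification and not the method proposed in the abstract.

```python
# Sketch of training under incomplete label assignment: a missing label is
# encoded as None, and the per-label training set simply drops examples
# whose label for that position is unobserved.

def training_set_for_label(X, Y, j):
    """Keep only (features, label) pairs where label j is observed."""
    return [(x, y[j]) for x, y in zip(X, Y) if y[j] is not None]

X = [(0,), (1,), (2,), (3,)]
Y = [(0, 1), (1, None), (None, 0), (1, 1)]   # None marks a missing label

print(len(training_set_for_label(X, Y, 0)))   # examples with label 0 observed
print(len(training_set_for_label(X, Y, 1)))   # examples with label 1 observed
```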