Showing papers on "Classifier chains" published in 2017


Proceedings Article
01 Jan 2017
TL;DR: This paper replaces classifier chains with recurrent neural networks, a sequence-to-sequence prediction algorithm which has recently been successfully applied to sequential prediction tasks in many domains, and compares different ways of ordering the label set, and gives some recommendations on suitable ordering strategies.
Abstract: Multi-label classification is the task of predicting a set of labels for a given input instance. Classifier chains are a state-of-the-art method for tackling such problems. They essentially convert the task into a sequential prediction problem: the labels are first ordered in an arbitrary fashion, and the task is then to predict a sequence of binary values for these labels. In this paper, we replace classifier chains with recurrent neural networks, a sequence-to-sequence prediction algorithm which has recently been successfully applied to sequential prediction tasks in many domains. The key advantage of this approach is that it allows the model to focus on predicting the positive labels only, a much smaller set than the full set of possible labels. Moreover, parameter sharing across all classifiers makes it possible to better exploit information from previous decisions. As both classifier chains and recurrent neural networks depend on a fixed ordering of the labels, which is typically not part of a multi-label problem specification, we also compare different ways of ordering the label set and give some recommendations on suitable ordering strategies.
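The chaining idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 1-nearest-neighbour base learner, the class and function names, and the "<stop>" token are all assumptions made for illustration. Each classifier in the chain is trained on the input features plus the labels that precede it in the chosen order, and prediction feeds earlier predictions forward. The helper at the end shows how a label vector could be turned into a sequence of positive labels only, which is the kind of target a sequence model would predict.

```python
from typing import List, Sequence


class OneNearestNeighbour:
    """Toy base learner (an assumption): returns the target of the closest training row."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "OneNearestNeighbour":
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        # Squared Euclidean distance to every stored training row.
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in self.X]
        return self.y[dists.index(min(dists))]


class ClassifierChain:
    """One binary classifier per label, chained in the given label order."""

    def __init__(self, order: Sequence[int], make_base=OneNearestNeighbour):
        self.order = list(order)  # a permutation of the label indices 0..m-1
        self.make_base = make_base

    def fit(self, X, Y) -> "ClassifierChain":
        self.models = []
        for pos, label in enumerate(self.order):
            previous = self.order[:pos]
            # Augment the features with the true values of the preceding labels.
            X_aug = [list(x) + [y[j] for j in previous] for x, y in zip(X, Y)]
            target = [y[label] for y in Y]
            self.models.append(self.make_base().fit(X_aug, target))
        return self

    def predict_one(self, x) -> List[int]:
        augmented = list(x)
        predicted = {}
        for label, model in zip(self.order, self.models):
            predicted[label] = model.predict_one(augmented)
            # Feed the prediction forward to the next classifier in the chain.
            augmented.append(predicted[label])
        return [predicted[j] for j in range(len(self.order))]


def positive_label_sequence(y: Sequence[int], order: Sequence[int], stop: str = "<stop>") -> list:
    """Sequence target containing only the positive labels, in chain order."""
    return [label for label in order if y[label] == 1] + [stop]


X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
chain = ClassifierChain(order=[1, 0]).fit(X, Y)
print(chain.predict_one([0, 1]))
print(positive_label_sequence([0, 1, 0, 1], order=[3, 2, 1, 0]))
```

With four labels of which two are positive, the sequence target has three elements (two labels and the stop token) instead of four binary decisions; the saving grows as the label set gets larger and sparser.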

141 citations


Journal ArticleDOI
TL;DR: This work investigates the application of multi-label classification methods to classify Tamil phonemes, using Binary Relevance (BR), Label Powerset and Classifier Chains (an improvement on BR) with different base classifiers.

29 citations


Journal ArticleDOI
TL;DR: CCnet, an algorithm that combines classifier chains with elastic-net regularization, is proposed, and its feature selection is shown to be stable with respect to the order in which the models in the chain are fitted.

20 citations


Posted Content
TL;DR: The implemented methods are binary relevance, classifier chains, nested stacking, dependent binary relevance and stacking, which can be used with any base learner that is accessible in mlr, and there is access to the multilabel classification versions of randomForestSRC and rFerns.
Abstract: We implemented several multilabel classification algorithms in the machine learning package mlr. The implemented methods are binary relevance, classifier chains, nested stacking, dependent binary relevance and stacking, which can be used with any base learner that is accessible in mlr. Moreover, there is access to the multilabel classification versions of randomForestSRC and rFerns. All these methods can be easily compared by different implemented multilabel performance measures and resampling methods in the standardized mlr framework. In a benchmark experiment with several multilabel datasets, the performance of the different methods is evaluated.

15 citations


Journal ArticleDOI
TL;DR: In this paper, the authors implemented several multilabel classification algorithms in the machine learning package mlr. The implemented methods are binary relevance, classifier chains, nested stacking, dependent binary relevance and stacking, which can be used with any base learner.
Abstract: We implemented several multilabel classification algorithms in the machine learning package mlr. The implemented methods are binary relevance, classifier chains, nested stacking, dependent binary relevance and stacking, which can be used with any base learner that is accessible in mlr. Moreover, there is access to the multilabel classification versions of randomForestSRC and rFerns. All these methods can be easily compared by different implemented multilabel performance measures and resampling methods in the standardized mlr framework. In a benchmark experiment with several multilabel datasets, the performance of the different methods is evaluated.

10 citations


Proceedings ArticleDOI
TL;DR: A hierarchical framework that builds chains of local binary neural networks after one global neural network over all the class labels, Local Classifier Chains based Convolutional Neural Networks (LCC-CNN).
Abstract: This paper focuses on improving the performance of current convolutional neural networks in face recognition without changing the network architecture. We propose a hierarchical framework that builds chains of local binary neural networks after one global neural network over all the class labels, Local Classifier Chains based Convolutional Neural Networks (LCC-CNN). Two different criteria based on a similarity matrix and confusion matrix are introduced to select binary label pairs to create local deep networks. To avoid error propagation, each testing sample travels through one global model and a local classifier chain to obtain its final prediction. The proposed framework has been evaluated with UHDB31 and CASIA-WebFace datasets. The experimental results indicate that our framework achieves better performance when compared with using only baseline methods as the global deep network. The accuracy is improved by 2.7% and 0.7% on the two datasets, respectively.

9 citations


Book ChapterDOI
21 Jun 2017
TL;DR: This short note introduces multi-objective optimisation for feature subset selection in multi-label classification, using label powerset, binary relevance, classifier chains and calibrated label ranking as the multi- label learning methods, and decision trees and SVMs as base learners.
Abstract: In this short note we introduce multi-objective optimisation for feature subset selection in multi-label classification. We aim to optimise multiple multi-label loss functions simultaneously, using label powerset, binary relevance, classifier chains and calibrated label ranking as the multi-label learning methods, and decision trees and SVMs as base learners. Experiments on multi-label benchmark datasets show that the feature subset obtained through MOO performs reasonably better than the systems that make use of exhaustive feature sets.
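Optimising several loss functions simultaneously generally yields a set of trade-off solutions rather than a single winner. The sketch below shows the standard Pareto-dominance filter over candidate feature subsets. It is a generic illustration, not the method from the note: the subset names and loss values are made up, and the two losses per candidate are only assumed to be multi-label losses where lower is better.

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if loss vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Names of the candidates that no other candidate dominates."""
    return [
        name
        for name, losses in candidates.items()
        if not any(
            dominates(other, losses)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical feature subsets with two made-up loss values each (lower is better).
candidates = {
    "subset_A": (0.10, 0.30),
    "subset_B": (0.20, 0.20),
    "subset_C": (0.25, 0.35),
    "subset_D": (0.10, 0.25),
}
print(pareto_front(candidates))  # ['subset_B', 'subset_D']
```

Here subset_D dominates subset_A and subset_B dominates subset_C, while subset_B and subset_D each win on one loss, so both remain on the front.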

9 citations


Book ChapterDOI
22 Aug 2017
TL;DR: Posters usually highlight a movie scene or characters, and at the same time should inform about the genre or the plot of the movie to attract the potential audience, so the assumption was that the relevant information can be captured in visual features.
Abstract: Classification of movies into genres from the accompanying promotional materials such as posters is a typical multi-label classification problem. Posters usually highlight a movie scene or characters, and at the same time should inform about the genre or the plot of the movie to attract the potential audience, so our assumption was that the relevant information can be captured in visual features.

8 citations


Journal ArticleDOI
TL;DR: An alternative admissible heuristic for the A* algorithm with two promising advantages in comparison to the above-mentioned heuristic, namely, it is more dominant for the same depth and, hence, it explores fewer nodes and it is suitable for nonlinear classifiers.
Abstract: Probabilistic Classifier Chains (PCC) is a very interesting method to cope with multi-label classification, since it is able to obtain the entire joint probability distribution of the labels. However, this probability distribution is obtained at the expense of a high computational cost. Several efforts have been made to overcome this pitfall, proposing different inference methods for estimating the probability distribution. Beam search and the ε-approximate algorithm are two methods of this kind. A more recent approach is based on the A* algorithm with an admissible heuristic, but it can only be used with linear classifiers as base methods for PCC. This paper goes in that direction, presenting an alternative admissible heuristic for the A* algorithm with two promising advantages in comparison to the above-mentioned heuristic, namely, i) it is more dominant for the same depth and, hence, it explores fewer nodes and ii) it is suitable for nonlinear classifiers. Additionally, the paper proposes an efficient implementation for the computation of the heuristic that reduces the number of models that must be evaluated by half. The experiments show, as theoretically expected, that this new algorithm reaches Bayes-optimal predictions in terms of subset 0/1 loss and explores fewer nodes than other state-of-the-art methods that also provide optimal predictions. In spite of exploring fewer nodes, this new algorithm is not as fast as the ε-approximate algorithm with ε = 0 when the search for an optimal solution is highly directed. However, it shows its strengths when the datasets present more uncertainty, making faster predictions than other state-of-the-art approaches.
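To make the cost of PCC inference concrete, the sketch below contrasts exhaustive search over all label combinations with a greedy pass through the chain. It does not implement the A* heuristic from the paper, beam search or the ε-approximate algorithm. The function names and the toy conditional probabilities are assumptions; in practice the conditional probabilities would come from the trained chain of probabilistic classifiers.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

# cond_prob(j, prefix) is assumed to return P(y_j = 1 | x, y_0, ..., y_{j-1}).
CondProb = Callable[[int, Tuple[int, ...]], float]


def joint_probability(y: Sequence[int], cond_prob: CondProb) -> float:
    """Chain rule: the joint probability is the product of the conditionals."""
    p = 1.0
    for j, value in enumerate(y):
        p_one = cond_prob(j, tuple(y[:j]))
        p *= p_one if value == 1 else 1.0 - p_one
    return p


def exhaustive_mode(n_labels: int, cond_prob: CondProb) -> Tuple[int, ...]:
    """Scores all 2**n_labels combinations; optimal for subset 0/1 loss but exponential."""
    return max(
        product((0, 1), repeat=n_labels),
        key=lambda y: joint_probability(y, cond_prob),
    )


def greedy_mode(n_labels: int, cond_prob: CondProb) -> Tuple[int, ...]:
    """Follows the single most likely branch at each step; cheap but not always optimal."""
    prefix: Tuple[int, ...] = ()
    for j in range(n_labels):
        prefix += (1 if cond_prob(j, prefix) > 0.5 else 0,)
    return prefix


def toy_cond_prob(j: int, prefix: Tuple[int, ...]) -> float:
    """Made-up two-label model in which the greedy path misses the joint mode."""
    if j == 0:
        return 0.4
    return 1.0 if prefix[0] == 1 else 0.5


best = exhaustive_mode(2, toy_cond_prob)
greedy = greedy_mode(2, toy_cond_prob)
print(best, joint_probability(best, toy_cond_prob))
print(greedy, joint_probability(greedy, toy_cond_prob))
```

In this toy model the greedy pass commits to y_0 = 0 because that branch looks more likely on its own, yet the combination with the highest joint probability starts with y_0 = 1. Search methods such as those discussed in the abstract aim to recover that optimum without enumerating every combination.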

5 citations


Book ChapterDOI
TL;DR: In this article, the authors proposed two concepts of classifier chains algorithms that are able to change label order of the chain without rebuilding the entire model, which allows anticipating the instance-specific chain order without a significant increase in computational burden.
Abstract: In this paper, we deal with the task of building a dynamic ensemble of chain classifiers for multi-label classification. To do so, we propose two concepts of classifier chains algorithms that are able to change the label order of the chain without rebuilding the entire model. Such models allow anticipating the instance-specific chain order without a significant increase in computational burden. The proposed chain models are built using the Naive Bayes classifier and the nearest neighbour approach as base single-label classifiers. To take advantage of the proposed algorithms, we developed a simple heuristic that allows the system to find a relatively good label order. The heuristic sorts labels according to the label-specific classification quality gained during the validation phase, and tries to minimise the phenomenon of error propagation in the chain. The experimental results showed that the proposed model based on the Naive Bayes classifier and the above-mentioned heuristic is an efficient tool for building dynamic chain classifiers.
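A label-ordering heuristic of this kind could look roughly like the sketch below. The abstract does not specify the quality measure or the sort direction, so two assumptions are made here: quality is per-label accuracy on a validation set, and the most reliable labels are placed first so that fewer early mistakes are passed down the chain. The function name and the data are illustrative.

```python
from typing import List, Sequence, Tuple


def order_labels_by_quality(
    Y_true: Sequence[Sequence[int]],
    Y_pred: Sequence[Sequence[int]],
) -> Tuple[List[int], List[float]]:
    """Return label indices sorted by validation accuracy (best first) and the accuracies."""
    n_samples = len(Y_true)
    n_labels = len(Y_true[0])
    accuracy = [
        sum(t[j] == p[j] for t, p in zip(Y_true, Y_pred)) / n_samples
        for j in range(n_labels)
    ]
    # sorted() is stable, so labels with equal accuracy keep their original order.
    order = sorted(range(n_labels), key=lambda j: -accuracy[j])
    return order, accuracy


# Made-up validation labels and predictions for three labels.
Y_val = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 0]]
Y_hat = [[1, 1, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
order, accuracy = order_labels_by_quality(Y_val, Y_hat)
print(order)     # [0, 2, 1]
print(accuracy)  # [1.0, 0.25, 0.75]
```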

3 citations


01 Jan 2017
TL;DR: The Classifier Chain breaks the multi-label problem into multiple binary classification problems, chaining them into one another to solve multi-dimensional classification problems.
Abstract: Context: Multi-label classification concerns classification with multi-dimensional output. The Classifier Chain breaks the multi-label problem into multiple binary classification problems, chaining ...

Journal ArticleDOI
TL;DR: eccCL is a handy implementation of Classifier Chains on GPUs, which is able to process up to over 25,000 instances per second, and thus can be used efficiently in high-throughput experiments and is available at http://www.heiderlab.de.
Abstract: Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical applications such as protein function prediction or drug resistance testing in HIV. In this context, the concept of Classifier Chains has been shown to improve prediction accuracy, especially when applied as Ensemble Classifier Chains. However, these techniques lack computational efficiency when applied to large amounts of data, e.g., derived from next-generation sequencing experiments. By adapting algorithms for the use of graphics processing units, computational efficiency can be greatly improved due to parallelization of computations. Here, we provide a parallelized and optimized graphics processing unit implementation (eccCL) of Classifier Chains and Ensemble Classifier Chains. In addition to the OpenCL implementation, we provide an R package with an easy-to-use R interface for parallelized graphics processing unit usage. eccCL is a handy implementation of Classifier Chains on GPUs, which is able to process up to over 25,000 instances per second, and thus can be used efficiently in high-throughput experiments. The software is available at http://www.heiderlab.de.

Proceedings ArticleDOI
13 Jan 2017
TL;DR: A genetic algorithm (GA) based on attribute correlation for multi-label classification is proposed and the results of the experiments prove that the performance of classification can be improved.
Abstract: The classifier chains (CC) model has been widely used for multi-label classification. Its distinguishing characteristic is that it takes the association between labels into account: each classifier in the chain uses the outputs of the classifiers before it when predicting the current instance, so the association between the labels enters every classification of the instance. However, because the CC model requires all the labels to join the chain, labels with wrong or redundant information will affect the performance of the classifier. To address this issue, this paper proposes a genetic algorithm (GA) based on attribute correlation for multi-label classification. The experimental results show that classification performance can be improved.

Book ChapterDOI
TL;DR: This paper considers label correlations in multi-label classification tasks, comparing the authors' proposed Labels Chain technique with well-known methods that also take label correlations into account, such as Label Power-set, Classifier Chains and Ensembles of Classifier Chains, in experiments on image, musical, audio and text datasets.
Abstract: In multi-label classification tasks, labels are very often correlated, and methods should take the existing dependencies into account so as not to lose important information. This situation arises especially in the case of multimedia datasets. In the paper, universal problem transformation methods that account for label correlations are considered. The Labels Chain technique proposed by the authors [4] is compared with well-known methods which also take label correlations into account, such as Label Power-set, Classifier Chains and Ensembles of Classifier Chains. The performance of the methods is examined by experiments on image, musical, audio and text datasets.

Journal Article
TL;DR: A classifier that allows us to change label order of the chain without rebuilding the entire model is built using the Naive Bayes classifier as a base single-label classifier and a simple heuristic allows the system to find relatively good label order.
Abstract: In this paper, we address the issue of building dynamic classifier chain ensembles for multi-label classification. We built a classifier that allows us to change the label order of the chain without rebuilding the entire model. Such a model allows anticipating the instance-specific chain order without a significant increase in computational burden. The proposed chain model is built using the Naive Bayes classifier as a base single-label classifier. Additionally, we propose a simple heuristic that allows the system to find a relatively good label order. That is, the heuristic tries to minimise the phenomenon of error propagation in the chain. The experimental results showed that the proposed model based on the Naive Bayes classifier and the above-mentioned heuristic is an efficient tool for building dynamic chain classifiers.

Proceedings ArticleDOI
22 Sep 2017
TL;DR: This paper builds the deep belief networks (DBN) as a single-label classifier for each class, and extends the feature space for one class with the hidden layer information in the DBN built for other classes.
Abstract: In multi-label learning, each instance in the dataset is associated with a set of labels, and the correlations between different labels are important. The existing Classifier Chains method transforms multi-label learning into a chain of binary classification problems and exploits label correlations by extending the feature space with the 0/1 label associations of all previous binary classifiers. In this paper, we exploit label correlations using the hidden layer information in deep networks. We build a deep belief network (DBN) as a single-label classifier for each class, and extend the feature space for one class with the hidden layer information in the DBNs built for other classes. Experiments on real-world multi-label learning problems show that the DBN Chain structure is highly comparable to the existing method.

Posted Content
TL;DR: The system developed for the Youtube-8M Video Understanding Challenge, in which a large-scale benchmark dataset was used for multi-label video classification, contains hierarchical deep architecture, including the frame-level sequence modeling part and the video-level classification part.
Abstract: This paper introduces the system we developed for the Youtube-8M Video Understanding Challenge, in which a large-scale benchmark dataset was used for multi-label video classification. The proposed framework contains a hierarchical deep architecture, including the frame-level sequence modeling part and the video-level classification part. In the frame-level sequence modeling part, we explore a set of methods including Pooling-LSTM (PLSTM), Hierarchical-LSTM (HLSTM) and Random-LSTM (RLSTM) in order to address the problem of the large number of frames in a video. We also introduce two attention pooling methods, single attention pooling (ATT) and multiply attention pooling (Multi-ATT), so that we can pay more attention to the informative frames in a video and ignore the useless frames. In the video-level classification part, two methods are proposed to increase the classification performance, i.e. Hierarchical-Mixture-of-Experts (HMoE) and Classifier Chains (CC). Our final submission is an ensemble consisting of 18 sub-models. In terms of the official evaluation metric, Global Average Precision (GAP) at 20, our best submission achieves 0.84346 on the public 50% of the test data and 0.84333 on the private 50% of the test data.