scispace - formally typeset

Showing papers on "Classifier chains published in 2020"


Journal ArticleDOI
TL;DR: Two extensions of ECC's basic approach are presented, where a varying number of binary models per label are built and chains of different sizes are constructed in order to improve the exploitation of majority examples with approximately the same computational budget.
Abstract: Class imbalance is an intrinsic characteristic of multi-label data. Most of the labels in multi-label data sets are associated with a small number of training examples, much smaller compared to the size of the data set. Class imbalance poses a key challenge that plagues most multi-label learning methods. Ensemble of Classifier Chains (ECC), one of the most prominent multi-label learning methods, is no exception to this rule, as each of the binary models it builds is trained from all positive and negative examples of a label. To make ECC resilient to class imbalance, we first couple it with random undersampling. We then present two extensions of this basic approach, where we build a varying number of binary models per label and construct chains of different sizes, in order to improve the exploitation of majority examples with approximately the same computational budget. Experimental results on 16 multi-label datasets demonstrate the effectiveness of the proposed approaches in a variety of evaluation metrics.

56 citations
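The ECC-with-undersampling recipe can be sketched with scikit-learn's `ClassifierChain`. A hedged simplification: the API does not expose per-label random undersampling, so the base learner's `class_weight="balanced"` option stands in for the paper's imbalance handling, and majority voting over random chain orders plays the role of the ensemble:

```python
# Sketch of an Ensemble of Classifier Chains (ECC) for imbalanced
# multi-label data. class_weight="balanced" is a stand-in for the paper's
# random undersampling, which sklearn's ClassifierChain does not expose.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain

def ecc_predict(X_train, Y_train, X_test, n_chains=10, seed=0):
    """Average the 0/1 votes of several chains with random label orders."""
    votes = np.zeros((X_test.shape[0], Y_train.shape[1]))
    for i in range(n_chains):
        chain = ClassifierChain(
            LogisticRegression(class_weight="balanced", max_iter=1000),
            order="random", random_state=seed + i)
        chain.fit(X_train, Y_train)
        votes += chain.predict(X_test)
    return (votes / n_chains >= 0.5).astype(int)  # majority vote
```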


Journal ArticleDOI
TL;DR: The sensitivity of five problem transformation methods and two ensemble methods to four types of base classifiers is studied, and the statistical performance of a classifier is found to be generally consistent across the metrics for any given method.

25 citations


Proceedings ArticleDOI
01 Aug 2020
TL;DR: It is shown that translation, stemming, and stopword removal are not effective, and that dependencies between labels greatly affect the classification results.
Abstract: Hate speech and abusive words spread widely on social media. The impact of hate speech on social media is hazardous: it can lead to discrimination, social conflict, and even genocide. Hate speech also has target types, categories, and levels. This research discusses the classification of hate speech and abusive words in social media text on Twitter, in Indonesian, English, and a mixture of both, down to the types, categories, and levels. Multilabel classification of hate speech text is investigated using RFDT, BiLSTM, and BiLSTM with a pre-trained BERT model. The Classifier Chains, Label Powerset, and Binary Relevance methods are used for data transformation, and TF-IDF is used for feature extraction combined with the RFDT classification method. Several preprocessing scenarios are also carried out to find the best results, namely full preprocessing, preprocessing without stopword removal, and preprocessing without stemming and stopword removal. The problem of having Indonesian, English, and a mixture of both is handled in two ways: without translation and with translation into Indonesian. The best result, an accuracy of 76.12%, was obtained using the RFDT classification method with Classifier Chains, without translation, without stemming, and without stopword removal. This research also shows that translation, stemming, and stopword removal are not effective, and that dependencies between labels greatly affect the classification results.

15 citations
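The best-performing pipeline reported above (TF-IDF features into a Classifier Chain over a random-forest base learner) can be sketched as follows; the tiny corpus and the three-label scheme are invented for illustration:

```python
# Sketch of the reported best pipeline: TF-IDF + Classifier Chains + RFDT.
# No stemming or stopword removal, matching the paper's best scenario.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import ClassifierChain

texts = ["you are awful", "have a nice day",
         "awful hateful words", "nice words"]
# columns: [abusive, hate_speech, positive]  (hypothetical label set)
Y = np.array([[1, 0, 0], [0, 0, 1], [1, 1, 0], [0, 0, 1]])

vec = TfidfVectorizer()
X = vec.fit_transform(texts).toarray()   # densify for simplicity

chain = ClassifierChain(RandomForestClassifier(random_state=0),
                        random_state=0)
chain.fit(X, Y)
pred = chain.predict(vec.transform(["awful hateful day"]).toarray())
```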


Journal ArticleDOI
TL;DR: A novel and effective algorithm named LSF-CC (Label Specific Features based Classifier Chain) is proposed for multi-label classification, which outperforms well-established approaches in terms of classification performance.
Abstract: Multi-label classification tackles problems in which each instance is associated with multiple labels. Due to the interdependence among labels, exploiting label correlations is the main means of enhancing classifier performance, and a variety of corresponding multi-label algorithms have been proposed. Among those algorithms, Classifier Chains (CC) is one of the most effective methods. It induces a binary classifier for each label, and these classifiers are linked in a chain. In the chain, the labels predicted by previous classifiers are used as additional features for the current classifier. The original CC has two shortcomings which potentially decrease classification performance: random label ordering, and noise in the original and additional features. To deal with these problems, we propose a novel and effective algorithm named LSF-CC, i.e. Label Specific Features based Classifier Chain for multi-label classification. First, a feature estimating technique is employed to produce a list of the most relevant features and labels for each label. According to these lists, we define a chain that guarantees that the most frequent labels appearing in these lists are top-ranked. Then, label specific features can be selected from the original feature space and the label space. Based on these label specific features, corresponding binary classifiers are learned for each label. Experiments on several multi-label data sets from various domains show that the proposed method outperforms well-established approaches.

12 citations
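A minimal sketch of the label-specific-features idea, assuming mutual information as the relevance score; the paper's relevance-based chain ordering is omitted, so this is a simplified reconstruction rather than the authors' exact LSF-CC:

```python
# Toy sketch of label-specific feature selection inside a classifier chain:
# for each label, keep only its top-k most relevant features (original
# features plus the labels already in the chain) before fitting the model.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression

def fit_lsf_chain(X, Y, k=3):
    models = []
    for j in range(Y.shape[1]):
        Xj = np.hstack([X, Y[:, :j]])      # augment with previous labels
        scores = mutual_info_classif(Xj, Y[:, j], random_state=0)
        keep = np.argsort(scores)[-k:]     # label-specific feature subset
        clf = LogisticRegression(max_iter=1000).fit(Xj[:, keep], Y[:, j])
        models.append((keep, clf))
    return models

def predict_lsf_chain(models, X):
    preds = np.zeros((X.shape[0], 0))
    for keep, clf in models:
        Xj = np.hstack([X, preds])         # same layout as in training
        p = clf.predict(Xj[:, keep]).reshape(-1, 1)
        preds = np.hstack([preds, p])
    return preds.astype(int)
```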


Journal ArticleDOI
10 Oct 2020-Entropy
TL;DR: This work proposes a partial classifier chain method with feature selection (PCC-FS) that exploits the label correlation between label and feature spaces and thus solves the two previously mentioned problems simultaneously.
Abstract: Multi-label classification (MLC) is a supervised learning problem where an object is naturally associated with multiple concepts because it can be described from various dimensions. How to exploit the resulting label correlations is the key issue in MLC problems. The classifier chain (CC) is a well-known MLC approach that can learn complex coupling relationships between labels. CC suffers from two obvious drawbacks: (1) the label ordering is decided at random, although it usually has a strong effect on predictive performance; (2) all the labels are inserted into the chain, although some of them may carry irrelevant information that interferes with the others. In this work, we propose a partial classifier chain method with feature selection (PCC-FS) that exploits the label correlation between label and feature spaces and thus solves the two previously mentioned problems simultaneously. In the PCC-FS algorithm, feature selection is performed by learning the covariance between the feature set and the label set, thus eliminating the irrelevant features that can diminish classification performance. Couplings in the label set are extracted, and the coupled labels of each label are inserted simultaneously into the chain structure to execute the training and prediction activities. The experimental results on five metrics demonstrate that, in comparison to eight state-of-the-art MLC algorithms, the proposed method is a significant improvement over existing multi-label classification methods.

11 citations


Proceedings ArticleDOI
01 Dec 2020
TL;DR: This paper proposes a well-performing instance-specific algorithm configuration model which selects an (almost) optimal configuration of modules for a given problem instance and the structure of this configuration model is able to capture inter-dependencies between modules.
Abstract: In this paper, we rely on previous work proposing a modularized version of CMA-ES, which captures several alterations to the conventional CMA-ES developed in recent years. Each alteration provides significant advantages under certain problem properties, e.g., multi-modality or high conditioning. These distinct advancements are implemented as modules, which results in 4608 unique versions of CMA-ES. Previous findings illustrate the competitive advantage of enabling and disabling the aforementioned modules for different optimization problems. Yet, this modular CMA-ES lacks a method to automatically determine when the activation of specific modules is beneficial and when it is not. We propose a well-performing instance-specific algorithm configuration model which selects an (almost) optimal configuration of modules for a given problem instance. In addition, the structure of this configuration model is able to capture inter-dependencies between modules, e.g., two (or more) modules might only be advantageous in unison for some problem types, making the orchestration of modules a crucial task. This is accomplished by chaining multiple random forest classifiers together into a so-called Classifier Chain, based on a set of numerical features extracted by means of Exploratory Landscape Analysis (ELA) to describe the given problem instances.

10 citations
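The configuration model can be sketched as a Classifier Chain of random forests mapping ELA-style features to binary module activations; the feature and module semantics below are invented placeholders:

```python
# Sketch of the instance-specific configuration model: a Classifier Chain
# of random forests predicts a vector of binary CMA-ES module activations
# from numerical problem features, so inter-dependencies between modules
# can be captured. Feature/module meanings here are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import ClassifierChain

rng = np.random.RandomState(0)
ela_features = rng.rand(100, 8)  # stand-ins for ELA features (dispersion, ...)
modules = (ela_features[:, :4] > 0.5).astype(int)  # 4 binary module switches

model = ClassifierChain(
    RandomForestClassifier(n_estimators=50, random_state=0),
    random_state=0)
model.fit(ela_features[:80], modules[:80])
config = model.predict(ela_features[80:])  # one module vector per instance
```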


Journal ArticleDOI
TL;DR: An alternative estimation strategy, the minimum error chain policy, is introduced that gradually expands the input space using estimations that approximate the true characteristics of the outputs, namely out-of-bag estimations in a tree-based ensemble framework.

10 citations


Journal ArticleDOI
TL;DR: This paper studies and develops methods for regressor chains, and presents a sequential Monte Carlo scheme in the framework of a probabilistic regressor chain that is effective, flexible and useful on several types of data.

8 citations


Book ChapterDOI
03 Jun 2020
TL;DR: This work investigates the applicability of several machine learning models and classifier chains (CC) to medical unstructured text classification and shows that the CC strategy improves classification performance.
Abstract: Structuring medical text using international standards improves interoperability and the quality of predictive modelling. The medical text classification task facilitates information extraction. In this work we investigate the applicability of several machine learning models and classifier chains (CC) to medical unstructured text classification. The experimental study was performed on a corpus of 11671 manually labeled Russian medical notes. The results showed that the CC strategy improves classification performance. An ensemble of classifier chains based on a linear SVC showed the best result: 0.924 micro F-measure, 0.872 micro precision and 0.927 micro recall.

5 citations
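A sketch of the reported setup, a chain of linear SVMs evaluated with micro-averaged F-measure; the synthetic vectors below stand in for features extracted from the medical notes:

```python
# Sketch: Classifier Chain over a linear SVC, scored with micro F-measure
# as in the paper. The data is synthetic; the study used 11671 labeled
# Russian medical notes.
import numpy as np
from sklearn.metrics import f1_score
from sklearn.multioutput import ClassifierChain
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X = rng.randn(100, 10)          # stand-in for text feature vectors
Y = (X[:, :4] > 0).astype(int)  # 4 synthetic category labels

chain = ClassifierChain(LinearSVC(), random_state=0)
chain.fit(X[:70], Y[:70])
pred = chain.predict(X[70:])
micro_f1 = f1_score(Y[70:], pred, average="micro")
```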


Proceedings ArticleDOI
30 Nov 2020
TL;DR: This research proposes a system that classifies hate speech written in the Indonesian language on Twitter and handles the noisiness of Twitter data, such as mixed languages and non-standard text, using Support Vector Machines as a classifier.
Abstract: Hate speech has become a hot issue as it spreads massively on today’s social media with specific targets, categories, and levels. In addition, hate speech can cause social conflict and even genocide. This research proposes a system that classifies hate speech written in the Indonesian language on Twitter. It also handles the noisiness of Twitter data, such as mixed languages and non-standard text. We use not only Support Vector Machines (SVM) as a classifier, but also compare it with other methods, such as the deep learning models CNN and DistilBERT. Beyond standard text preprocessing, we examine the effect of translation in handling the multilingual content. The data transformation methods used in the SVM model are Label Powerset (LP) and Classifier Chains (CC). The experiment results show that classification using SVM and CC without stemming, stopword removal, or translation provides the best accuracy of 74.88%. The best SVM hyperparameters for multilabel classification are the sigmoid kernel, a regularization parameter value of 10, and a gamma value of 0.1. Stemming, stopword removal, and translation preprocessing are less effective in this research. Moreover, CNN has a flaw in predicting labels that occur rarely in the training data.

4 citations
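The reported best hyperparameters can be plugged into a Classifier Chain directly; the data below is synthetic, standing in for TF-IDF features of tweets:

```python
# Sketch using the paper's reported best SVM hyperparameters inside a
# Classifier Chain: sigmoid kernel, C=10, gamma=0.1. The features and
# three-label scheme are invented for illustration.
import numpy as np
from sklearn.multioutput import ClassifierChain
from sklearn.svm import SVC

svm = SVC(kernel="sigmoid", C=10, gamma=0.1)
chain = ClassifierChain(svm, random_state=0)

rng = np.random.RandomState(1)
X = rng.rand(60, 12)                  # stand-in for TF-IDF vectors
Y = (X[:, :3] > 0.5).astype(int)      # 3 hypothetical hate-speech labels
chain.fit(X, Y)
labels = chain.predict(X[:5])
```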


Proceedings Article
01 Jan 2020
TL;DR: A novel method, parCC (parsimonious classifier chains), is proposed that controls the total number of features without significant deterioration in prediction quality, and is applied to predict multimorbidity using various medical diagnostic tests.
Abstract: We study the problem of learning classifier chains in multi-label classification with a special focus on feature selection. It turns out that standard classifier chains tend to select too many features when a feature selection method is embedded in the base learner, because selection is performed separately for each of the models in the chain. This can be a serious limitation in domains where the acquisition of feature values is costly or where including too many features (e.g. diagnostic tests) is associated with negative effects. We propose a novel method, parCC (parsimonious classifier chains), that controls the total number of features without significant deterioration in prediction quality. In the proposed method we jointly learn all models in the chain by combining ℓ2,1 regularization, to select features shared across the models, with ℓ1 regularization, to select relevant labels in each model. In a theoretical analysis we provide a bound on the generalization error of the algorithm using Rademacher complexity. We apply our method to predict multimorbidity (the co-occurrence of multiple diseases in one patient) using various medical diagnostic tests. Experiments carried out on a large clinical database (MIMIC III) show that parCC achieves higher accuracy than related methods when the number of features is limited. We also demonstrate the efficacy of the proposed method on a set of standard benchmark datasets.

Journal ArticleDOI
TL;DR: A data-driven approach to group users in a Non-Orthogonal Multiple Access (NOMA) MIMO setting by coupling a Classifier Chain with a Gradient Boosting Decision Tree (GBDT), namely, the LightGBM algorithm.
Abstract: In this article, we propose a data-driven approach to group users in a Non-Orthogonal Multiple Access (NOMA) MIMO setting. Specifically, we formulate user clustering as a multi-label classification problem and solve it by coupling a Classifier Chain (CC) with a Gradient Boosting Decision Tree (GBDT), namely, the LightGBM algorithm. The performance of the proposed CC-LightGBM scheme is assessed via numerical simulations. For benchmarking, we consider two classical adaptation learning schemes, Multi-Label k-Nearest Neighbours (ML-KNN) and Multi-Label Twin Support Vector Machines (ML-TSVM), as well as other naive approaches. In addition, we compare the computational complexity of the proposed scheme with those of the aforementioned benchmarks.

Journal ArticleDOI
TL;DR: The quantitative and qualitative analysis of the obtained results shows that the multi-label classification approach provides meaningful and descriptive mineral maps and outperforms the single-label RF classification for the mineral mapping task.
Abstract: A multi-label classification concept is introduced for the mineral mapping task in drill-core hyperspectral data analysis. As opposed to traditional classification methods, this approach has the advantage of considering the different mineral mixtures present in each pixel. For the multi-label classification, the well-known Classifier Chain method (CC) is implemented using the Random Forest (RF) algorithm as the base classifier. High-resolution mineralogical data obtained from a Scanning Electron Microscopy (SEM) instrument equipped with the Mineral Liberation Analysis (MLA) software are used for generating the training data set. The drill-core hyperspectral data used in this paper cover the visible-near infrared (VNIR) and short-wave infrared (SWIR) ranges of the electromagnetic spectrum. The quantitative and qualitative analysis of the obtained results shows that the multi-label classification approach provides meaningful and descriptive mineral maps and outperforms single-label RF classification for the mineral mapping task.

Proceedings ArticleDOI
06 Nov 2020
TL;DR: This article explored methods of using neural network classifiers in the classifier chain model and tried to address some problems with such an architecture, while comparing their performance on different types of data, using different metrics, with each other and with other well-performing multi-label classification methods.
Abstract: Multi-label classification is a generalization of the multi-class classification problem, where one entity can belong to more than one class from the class set. Recent works have proposed multiple methods of solving this problem involving both statistical and deep learning approaches. While methods exist for applying deep learning models to this problem, most of them require the model to have a high-dimensional output vector, and the property of inter-dependency between classes has not been explored. An ensemble of statistical models called chain classifiers can be used to address these issues. This study explores methods of using neural network classifiers in the classifier chain model and tries to address some problems with such an architecture, while comparing their performance on different types of data, using different metrics, with each other and with other well-performing multi-label classification methods.

Book ChapterDOI
19 Oct 2020
TL;DR: This work combines the concept of dynamic classifier chains (DCC) with extreme gradient boosted trees (XGBoost), an effective and scalable state-of-the-art technique, and incorporates DCC in a fast multi-label extension of XGBoost which is made publicly available.
Abstract: Classifier chains is a key technique in multi-label classification, since it allows label dependencies to be considered effectively. However, the classifiers are aligned according to a static order of the labels. In the concept of dynamic classifier chains (DCC), the label ordering is chosen for each prediction dynamically, depending on the respective instance at hand. We combine this concept with extreme gradient boosted trees (XGBoost), an effective and scalable state-of-the-art technique, and incorporate DCC in a fast multi-label extension of XGBoost which we make publicly available. As only positive labels have to be predicted and these are usually only few, the training costs can be further substantially reduced. Moreover, as experiments on eleven datasets show, the length of the chain allows for more control over the usage of previous predictions and hence over the measure one wants to optimize.
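The dynamic-ordering idea can be illustrated with a toy sketch (not the authors' XGBoost implementation): each binary model sees the instance plus the labels fixed so far, the most confident remaining label is fixed first, and the loop stops once no model predicts a confident positive:

```python
# Toy illustration of a dynamic classifier chain. Simplification: each
# model is trained on X plus the *true* values of the other labels; a full
# DCC would instead train against dynamically chosen predictions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_dcc(X, Y):
    models = []
    for j in range(Y.shape[1]):
        others = np.delete(Y, j, axis=1)   # all other labels as features
        models.append(LogisticRegression(max_iter=1000)
                      .fit(np.hstack([X, others]), Y[:, j]))
    return models

def predict_dcc(models, x, threshold=0.5):
    n = len(models)
    state, remaining = np.zeros(n), set(range(n))
    while remaining:
        # score every still-unfixed label given the current partial prediction
        probs = {j: models[j].predict_proba(
                     np.hstack([x, np.delete(state, j)]).reshape(1, -1))[0, 1]
                 for j in remaining}
        j_best = max(probs, key=probs.get)
        if probs[j_best] < threshold:      # no confident positive left: stop
            break
        state[j_best] = 1.0                # fix the most confident label first
        remaining.discard(j_best)
    return state.astype(int)
```

Stopping early once no confident positive remains mirrors the observation above that only the (usually few) positive labels need to be predicted.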

Proceedings ArticleDOI
21 Oct 2020
TL;DR: In this article, the authors adopt Binary Relevance (BR), Classifier Chains (CC), Random k-labelsets (RAkEL), and Multi-Label k-Nearest Neighbor (ML-KNN) for multi-label classification of non-communicable diseases.
Abstract: Non-communicable diseases (NCDs) are one of the leading causes of death in the world. Multi-NCD patients tend to undergo and suffer from multiple coexistent diseases. This research aims at classifying NCD patients who are diagnosed with other NCDs, using multi-label classification. Four disease types are used in this study, i.e. diabetes, hypertension, cardiovascular disease and stroke. Binary Relevance (BR), Classifier Chains (CC), Random k-labelsets (RAkEL) and Multi-Label k-Nearest Neighbor (ML-KNN) are adopted to transform multi-NCD records into disease labels. The experiments are conducted on physical examination datasets collected from electronic health records, and the comparative results of the techniques are demonstrated. The results showed that the RAkEL method outperformed the other methods and achieved the best accuracy of 91.07%.
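Two of the transformation methods above can be compared on a toy version of the task with scikit-learn (Binary Relevance via `MultiOutputClassifier` versus Classifier Chains); RAkEL and ML-KNN are omitted since they live outside scikit-learn, and the examination features and disease labels are synthetic:

```python
# Sketch comparing Binary Relevance (independent per-disease models) with
# Classifier Chains, which lets e.g. a hypertension prediction inform the
# cardiovascular one. Data here is synthetic, not real health records.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

rng = np.random.RandomState(0)
exams = rng.randn(120, 8)  # stand-in for physical-examination features
# columns: diabetes, hypertension, cardiovascular, stroke (hypothetical)
diseases = (exams[:, :4] + 0.5 * rng.randn(120, 4) > 0).astype(int)

br = MultiOutputClassifier(LogisticRegression(max_iter=1000))
cc = ClassifierChain(LogisticRegression(max_iter=1000), random_state=0)
br.fit(exams[:90], diseases[:90])
cc.fit(exams[:90], diseases[:90])
br_acc = (br.predict(exams[90:]) == diseases[90:]).mean()
cc_acc = (cc.predict(exams[90:]) == diseases[90:]).mean()
```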

Journal Article
TL;DR: This work analyzes past available data to predict tags automatically based on the question a user enters, which enhances the user experience.
Abstract: Nowadays, data plays a major role in every aspect of our life. The past data that is available can be used for analysis and to predict the future. For websites based on learning, the old data which users post and tag can be analyzed to predict what new implementations can be done to improve the user experience. Stack Overflow, for example, is the largest learning forum, used by most developers to learn and share their programming knowledge. To post a question, users need to enter the tags related to the question manually. Here we analyze the past available data to predict the tags automatically based on the question a user enters, which enhances the user experience.

Posted Content
Simon Bohlender, Eneldo Loza Mencía, Moritz Kulessa
TL;DR: In this paper, the authors combine the concept of dynamic classifier chains (DCC) with extreme gradient boosted trees (XGBoost), an effective and scalable state-of-the-art technique.
Abstract: Classifier chains is a key technique in multi-label classification, since it allows label dependencies to be considered effectively. However, the classifiers are aligned according to a static order of the labels. In the concept of dynamic classifier chains (DCC), the label ordering is chosen for each prediction dynamically, depending on the respective instance at hand. We combine this concept with extreme gradient boosted trees (XGBoost), an effective and scalable state-of-the-art technique, and incorporate DCC in a fast multi-label extension of XGBoost which we make publicly available. As only positive labels have to be predicted and these are usually only few, the training costs can be further substantially reduced. Moreover, as experiments on eleven datasets show, the length of the chain allows for more control over the usage of previous predictions and hence over the measure one wants to optimize.