Book Chapter

Domain of Competency of Classifiers on Overlapping Complexity of Datasets Using Multi-label Classification with Meta-Learning

TL;DR: The results and implications of a study investigating the connection between four overlapping measures and the performance of three classifiers, namely KNN, C4.5 and SVM, suggest the existence of a strong negative correlation between the classifiers' performance and the class overlapping present in the data.
Abstract
A classifier's performance can be greatly influenced by the characteristics of the underlying dataset. We aim at investigating the connection between the overlapping complexity of a dataset and the performance of a classifier in order to understand the domain of competence of machine learning classifiers. In this paper, we report the results and implications of a study investigating the connection between four overlapping measures and the performance of three classifiers, namely KNN, C4.5 and SVM. In this study, we first evaluated the performance of the three classifiers over 1060 binary classification datasets. Next, we constructed a multi-label classification dataset by computing the four overlapping measures as features and multi-labeling each instance with the classifiers competent on the corresponding binary classification dataset. The generated multi-label classification dataset was then used to estimate the domain of competence of the three classifiers with respect to overlapping complexity. This allowed us to express the domain of competence of these classifiers as a set of rules obtained through multi-label rule learning. We found that the classifiers' performance invariably degraded on datasets with high values of the complexity measures N1 and N3. This suggests the existence of a strong negative correlation between the classifiers' performance and the class overlapping present in the data.
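The abstract's key measure can be illustrated concretely. As a minimal sketch (assuming the standard Ho–Basu definition; this is not the paper's exact pipeline), N3 is the leave-one-out error rate of a 1-NN classifier: low when classes are well separated, high when they overlap.

```python
# Illustrative sketch of the N3 class-overlap measure: the leave-one-out
# error rate of a 1-nearest-neighbour classifier on the dataset.
import math

def n3_measure(points, labels):
    """Leave-one-out 1-NN error rate; higher values mean more class overlap."""
    errors = 0
    for i, p in enumerate(points):
        # nearest training point other than the point itself
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        if labels[nearest] != labels[i]:
            errors += 1
    return errors / len(points)

# Two well-separated clusters: every point's nearest neighbour shares its class.
separated = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels_sep = [0, 0, 0, 1, 1, 1]

# Fully interleaved classes along a line: every nearest neighbour disagrees.
interleaved = [(0, 0), (0.5, 0), (1, 0), (1.5, 0), (2, 0), (2.5, 0)]
labels_int = [0, 1, 0, 1, 0, 1]

print(n3_measure(separated, labels_sep))    # 0.0
print(n3_measure(interleaved, labels_int))  # 1.0
```

The study's finding that performance degrades as N1 and N3 grow matches this intuition: both measures rise exactly when points of different classes sit close together.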


References
Journal Article

ML-KNN: A lazy learning approach to multi-label learning

TL;DR: Experiments on three different real-world multi-label learning problems, i.e. Yeast gene functional analysis, natural scene classification and automatic web page categorization, show that ML-KNN achieves superior performance to some well-established multi-label learning algorithms.
Proceedings Article

Integrating classification and association rule mining

TL;DR: The integration is done by focusing on mining a special subset of association rules, called class association rules (CARs), and shows that the classifier built this way is more accurate than that produced by the state-of-the-art classification system C4.5.
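Class association rules of the kind this reference mines can be sketched in a few lines: enumerate candidate itemsets and keep itemset → class rules meeting minimum support and confidence. This is an illustrative toy (not the CBA system itself), with the items chosen to echo the chapter's theme of rules over discretized complexity measures.

```python
# Toy class-association-rule (CAR) miner: keep itemset -> class rules whose
# support and confidence exceed the given thresholds.
from itertools import combinations

def mine_cars(transactions, min_support=0.4, min_confidence=0.8):
    """transactions: list of (item_set, class_label).
    Returns {(itemset, label): (support, confidence)}."""
    n = len(transactions)
    rules = {}
    all_items = sorted({i for items, _ in transactions for i in items})
    for size in (1, 2):  # itemsets of size 1 and 2 only, for brevity
        for itemset in combinations(all_items, size):
            covered = [(items, lab) for items, lab in transactions
                       if set(itemset) <= set(items)]
            if not covered:
                continue
            for label in {lab for _, lab in covered}:
                hits = sum(lab == label for _, lab in covered)
                sup, conf = hits / n, hits / len(covered)
                if sup >= min_support and conf >= min_confidence:
                    rules[(itemset, label)] = (round(sup, 2), round(conf, 2))
    return rules

# Hypothetical meta-data: discretized overlap measures -> classifier verdict.
data = [({"high_N1", "high_N3"}, "weak"),
        ({"high_N1", "high_N3"}, "weak"),
        ({"low_N1"}, "strong"),
        ({"low_N1", "high_N3"}, "weak"),
        ({"low_N1"}, "strong")]
rules = mine_cars(data)
print(rules[(("high_N3",), "weak")])  # (0.6, 1.0)
```

A rule such as high_N3 → weak is exactly the shape of output the chapter obtains through multi-label rule learning over its 1060 datasets.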
Book Chapter

Mining Multi-label Data

TL;DR: A large body of research in supervised learning deals with the analysis of single-label data, where training examples are associated with a single label λ from a set of disjoint labels L; however, training examples in several application domains are often associated with a set of labels Y ⊆ L.
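The Y ⊆ L representation described above is commonly encoded as a binary indicator matrix, one column per label. A minimal illustration (the label names are hypothetical, chosen to mirror the chapter's use of competent classifiers as labels):

```python
# Encode multi-label examples (each a label set Y ⊆ L) as a binary
# indicator matrix: entry [i][j] is 1 iff example i carries label j.
def to_indicator(label_sets, label_space):
    return [[int(lab in y) for lab in label_space] for y in label_sets]

L = ["KNN", "C4.5", "SVM"]            # label space: candidate classifiers
Y = [{"KNN", "SVM"}, {"C4.5"}, set()]  # label sets for three datasets
print(to_indicator(Y, L))  # [[1, 0, 1], [0, 1, 0], [0, 0, 0]]
```

This encoding is what lets single-label machinery (one binary problem per column) be reused for multi-label data.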
Journal Article

The lack of a priori distinctions between learning algorithms

TL;DR: It is shown that one cannot say: if empirical misclassification rate is low, the Vapnik-Chervonenkis dimension of your generalizer is small, and the training set is large, then with high probability your OTS error is small.