Class decomposition via clustering: a new framework for low-variance classifiers
Citations
Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network.
DeTrac: Transfer Learning of Class Decomposed Medical Images in Convolutional Neural Networks
A genetic algorithm approach to optimising random forests applied to class engineered data
Frequently Asked Questions (20)
Q2. What are the contributions mentioned in the paper "Class decomposition via clustering: a new framework for low-variance classifiers"?
In this paper the authors propose a pre-processing step to classification that applies a clustering algorithm to the training set to discover local patterns in the attribute or input space. The authors demonstrate how this knowledge can be exploited to enhance the predictive accuracy of simple classifiers. Decomposing classes into clusters makes the new class distribution easier to approximate and provides a viable way to reduce bias while limiting the growth in variance.
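A minimal sketch of this pre-processing step, using scikit-learn in place of the paper's WEKA implementation (an assumption), and k-means rather than EM for brevity: each class is split into clusters, examples are relabelled as (class, cluster), a simple linear classifier is trained on the decomposed labels, and predictions are mapped back to the original classes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_moons
from sklearn.svm import SVC

def decompose_classes(X, y, n_clusters=2, seed=0):
    """Relabel each example as 'class_cluster' so a simple classifier
    fits one decision region per cluster instead of per class."""
    new_y = np.empty(len(y), dtype=object)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        clusters = km.fit_predict(X[idx])
        for i, cl in zip(idx, clusters):
            new_y[i] = f"{c}_{cl}"
    return new_y

# A two-moons problem: each class spreads over a curved region that a
# single linear boundary approximates poorly.
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
y_dec = decompose_classes(X, y)

# Train a linear SVM on the decomposed labels, then map predictions
# back to the original classes before scoring.
clf = SVC(kernel="linear").fit(X, y_dec)
recovered = np.array([int(p.split("_")[0]) for p in clf.predict(X)])
acc = (recovered == y).mean()
```

The decomposed problem lets the linear SVM place one boundary per cluster, giving a piecewise-linear boundary per original class without switching to a more complex model family.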
Q3. What future works have the authors mentioned in the paper "Class decomposition via clustering: a new framework for low-variance classifiers"?
Future work will look for ways to improve the computational efficiency of their approach (as suggested in Section 3.3). Future work will also address the feasibility of dynamically varying the growth rate of the complexity of the class of functions during model selection. Such a model can then be refined using smaller complexity steps by augmenting the number of classifiers per class, as suggested in their approach.
Q4. What is the purpose of the class decomposition process?
Their class decomposition process aims at eliminating distributions unfavorable to simple classifiers where a class spreads out into multiple regions.
Q5. What is the purpose of the model?
The model is intended to capture correlations between the feature variables and the target variable to predict the class label of new data objects.
Q6. What is the way to group examples into clusters?
The clustering algorithm follows the Expectation Maximization (EM) technique [14]; it groups examples into clusters by modelling each cluster through a probability density function.
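A small illustration of EM-style clustering of one class's examples, here via scikit-learn's GaussianMixture (an assumption; the paper uses WEKA's EM implementation [14]). Each cluster is modelled by a Gaussian density, and each point is assigned to the component under which it is most probable.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two well-separated blobs standing in for one class that spreads
# over two regions of the input space.
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
               rng.normal(5.0, 0.3, (50, 2))])

# Fit a two-component Gaussian mixture by EM and read off cluster labels.
gm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gm.predict(X)

# Each blob should fall entirely inside one mixture component.
first_blob_pure = len(set(labels[:50])) == 1
second_blob_pure = len(set(labels[50:])) == 1
```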
Q7. What is the trick to reducing the bias of simple linear classifiers?
The trick lies in identifying regions of high class density within subsets of examples of the same class, which the authors accomplish through clustering.
Q8. What is the last step to prune cluster sets?
If no more cluster sets are produced, then the last step simply prunes lower-cardinality cluster sets that have a cluster in common.
Q9. What is the advantage of using all examples belonging to the same class for analysis?
This has the advantage of using all examples belonging to the same class for analysis, whereas in decision tree induction, the continuous partitioning of the data progressively lessens the statistical support of every decision rule, an effect known as the fragmentation problem.
Q10. What is the default value for the clustering algorithm?
The implementations of the SVM, Naive Bayes, and EM clustering are taken from the WEKA machine-learning class library [18] and run with default parameter values.
Q11. What is the limitation of their approach?
One limitation of their approach is the amount of CPU time necessary to find the best class-assignment configuration (Section 3.2).
Q12. What is the way to increase the complexity of a classifier?
One way to increase the complexity of the classifier is to enlarge the original space of linear combinations to allow for more flexibility on the decision boundaries, for example by adding higher order polynomials (Figure 1a, dashed line).
Q13. How do the authors map the set of examples into a new set?
The authors map the set of examples in T_j into a new set T'_j by renaming every class label to indicate not only the class but also the cluster to which each example belongs.
Q14. What is the method used to improve classifier performance?
The authors test their methodology on twenty datasets from the University of California at Irvine repository, using two simple classifiers: Naive Bayes and a Support Vector Machine with a polynomial kernel of degree one.
Q15. What is the VC dimension of a simple classifier?
The results above indicate that the complexity of a simple classifier, as measured by the VC dimension, grows at a slower rate when additional linear decision boundaries are introduced (as class decomposition does) than when the boundaries themselves are made more flexible.
Q16. What is the way to improve classifier accuracy?
The authors propose an approach to improve the accuracy of simple classifiers through a pre-processing step that applies a clustering algorithm over examples belonging to the same class.
Q17. What does the algorithm do when it does not improve performance?
When clustering does not seem to improve performance, their approach simply reverts the effects of clustering, leaving the original dataset intact.
Q18. What is the algorithm's approach to calculating clusters?
Next, the authors start looking for pairs of clusters (e.g., {c_1^j, c_2^j}) and compute predictive accuracy assuming the two clusters in each pair are mapped to the same index.
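A hedged sketch of this pairwise merging search: try mapping two cluster labels to one index, keep the merge whenever the evaluation score improves, and stop when no pair helps. The names (merge_labels, greedy_pair_merge) and the toy evaluator are illustrative assumptions, not the paper's procedure; in practice the evaluator would be cross-validated accuracy of the classifier on the remapped labels.

```python
from itertools import combinations

def merge_labels(labels, a, b):
    """Map cluster label b onto a, so both clusters share one index."""
    return [a if l == b else l for l in labels]

def greedy_pair_merge(labels, evaluate):
    """Greedily merge pairs of cluster labels while the score improves."""
    best = evaluate(labels)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(set(labels)), 2):
            candidate = merge_labels(labels, a, b)
            score = evaluate(candidate)
            if score > best:
                labels, best, improved = candidate, score, True
                break  # restart the pair scan from the merged labelling
    return labels, best

# Toy evaluator that simply prefers fewer distinct labels, standing in
# for held-out predictive accuracy of the classifier.
labels, score = greedy_pair_merge(["a", "b", "c"], lambda ls: -len(set(ls)))
```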
Q19. What is the case where a classifier defines a discriminant function for each class?
The authors consider the case where a classifier defines a discriminant function for each class, g_j(x), j = 1, 2, ..., k, and chooses the class corresponding to the discriminant function with the highest value (ties are broken arbitrarily):

h(x) = y_m iff g_m(x) >= g_j(x) for all j    (1)

Possibly the simplest case is that of a linear discriminant function, where the approximation is based on a linear model:

g_j(x) = w_0 + sum_{i=1}^{n} w_i x_i    (2)

where each w_i, 0 <= i <= n, is a coefficient that must be learned by the classification algorithm.
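A minimal instance of equations (1) and (2): one linear discriminant per class and prediction by the largest g_j(x). The weights here are hand-picked for illustration, not learned.

```python
import numpy as np

# One row of weights per class; column 0 is w_0, the rest are w_1..w_n.
W = np.array([[1.0, 2.0, 0.0],   # class 0: g_0(x) = 1 + 2*x_1
              [0.0, 0.0, 3.0]])  # class 1: g_1(x) = 3*x_2

def h(x):
    """h(x) = argmax_j g_j(x), with g_j(x) = w_0 + sum_i w_i * x_i."""
    g = W[:, 0] + W[:, 1:] @ x
    return int(np.argmax(g))

pred0 = h(np.array([2.0, 0.0]))  # g_0 = 5, g_1 = 0 -> class 0
pred1 = h(np.array([0.0, 2.0]))  # g_0 = 1, g_1 = 6 -> class 1
```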
Q20. How do the authors increase the representational power of in small steps?
Here the authors provide evidence showing that their approach increases the representational power of φ in small steps to avoid a large increase in variance.