Simultaneous feature selection and clustering using mixture models
Frequently Asked Questions (14)
Q2. What future work is mentioned in the paper "Simultaneous feature selection and clustering using mixture models"?
There are several avenues for future work. The model treats the features as conditionally independent given the mixture component; how to extend the algorithm to cope with dependent features is a challenging problem. The authors can also replace the mixture of Gaussians by a mixture of multinomial distributions, thereby making the proposed algorithm applicable to categorical data as well. Finally, principles other than MML, such as variational Bayes [12], can be adopted to perform model selection.
Q3. What is the way to initialize a model?
Since the model selection algorithm determines the number of components, it can be initialized with a large value of K, thus alleviating the need for a good initialization, as shown in [18].
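As a rough illustration of this initialize-with-many-components strategy, the sketch below uses scikit-learn's BayesianGaussianMixture, whose variational pruning of mixing weights is only an analogue of the paper's MML-based component annihilation; the toy data, the choice of K = 20, and the 0.01 weight threshold are invented for the example.

```python
# Hypothetical sketch: start fitting with many components and let
# model selection suppress the redundant ones (analogue of MML pruning).
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Toy data: two well-separated Gaussian clusters in 2-D.
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(6.0, 1.0, size=(200, 2))])

# Deliberately over-specify the number of components (K = 20).
gmm = BayesianGaussianMixture(n_components=20,
                              weight_concentration_prior=1e-3,
                              max_iter=500,
                              random_state=0).fit(X)

# Components whose mixing weights survive pruning are the effective clusters.
print("effective components:", int(np.sum(gmm.weights_ > 0.01)))
```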
Q4. How can the authors reduce the complexity of the algorithm?
The authors can further reduce the complexity by adopting optimization techniques applicable to the standard EM algorithm for Gaussian mixtures, such as sampling the data, compressing the data [8], or using efficient data structures [45], [54].
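One of the speed-up ideas mentioned above, sampling the data, can be illustrated with a minimal sketch: fit the mixture on a random subsample and then assign every point in one cheap pass. The data set, subsample size, and number of components below are placeholders, and a plain Gaussian mixture stands in for the paper's feature-saliency model.

```python
# Minimal sketch of "sampling the data" to cut the cost of EM.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 19))       # stand-in for a large data set

# Fit on a 5% random subsample instead of all points.
subset = rng.choice(len(X), size=5_000, replace=False)
gmm = GaussianMixture(n_components=10, random_state=0).fit(X[subset])

# A single pass assigns the full data set to the learned components.
labels = gmm.predict(X)
```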
Q5. What is the strength of the proposed algorithm?
Another strength of the proposed algorithm is that, by initializing with a large number of Gaussian components, it is less sensitive to the local-minimum problem than the standard EM algorithm.
Q6. What is the name of the task of selecting the “best” feature subset?
The task of selecting the "best" feature subset is known as feature selection, and is sometimes also called variable selection or subset selection.
Q7. What is the popular algorithm for clustering?
The CLIQUE algorithm [1] is popular in the data mining community; it finds hyper-rectangle-shaped clusters in a large database using a subset of the attributes.
Q8. What is the composition of the texture data set?
The texture data set (texture) consists of 4,000 19-dimensional Gabor filter features from a collage of four Brodatz textures [27].
Q9. What is the image segmentation data set?
The image segmentation data set (image) contains 2,320 data points with 19 features from seven classes; each pattern consists of features extracted from a 3 × 3 region taken from one of seven types of outdoor images: brickface, sky, foliage, cement, window, path, and grass.
Q10. Why is zernike so difficult to cluster?
The high error rate for zernike is due to the fact that digit images are inherently more difficult to cluster: for example, "4" can be written in a manner very similar to "9," and it is difficult for any unsupervised learning algorithm to distinguish between them.
Q11. What is the general trend of the feature number?
The authors can see the general trend that, as the feature number increases, the saliency decreases, in accordance with the true characteristics of the data.
Q12. What is the way to avoid running EM many times?
The proposed algorithm can avoid running EM many times with different numbers of components and different feature subsets, and can achieve better performance than using all the available features for clustering.
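To see what is being avoided, consider a naive wrapper that reruns EM for every candidate number of components and every candidate feature subset; the back-of-the-envelope count below (with assumed candidate ranges) shows how quickly that explodes, whereas the proposed algorithm needs a single run.

```python
# Count of EM runs a naive wrapper would need: one per (K, feature subset) pair.
from math import comb

n_features = 19                  # e.g., the texture and image data sets above
k_candidates = range(2, 21)      # assumed candidate numbers of components

n_subsets = sum(comb(n_features, r) for r in range(1, n_features + 1))  # 2**19 - 1
print("EM runs required:", len(k_candidates) * n_subsets)               # ~10 million
```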
Q13. Why are the class labels not involved in their experiment?
Since these data sets were collected for supervised classification, the class labels are not involved in their experiment, except for evaluation of the clustering results.
Q14. What is the EM algorithm for determining the feature saliency?
By treating Z (the hidden class labels) and the binary feature-relevance indicators Φ as hidden variables, one can derive (see details in Appendix B) the following EM algorithm for parameter estimation.
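The update equations themselves are not reproduced in the answer above. As a rough sketch, with notation assumed here rather than copied from the paper (ρ_l is the saliency of feature l, p(·|θ_jl) the component-specific Gaussian, q(·|λ_l) the common density for an irrelevant feature), the mixture being fitted and the main E-step quantities take the following form; the exact M-step updates and the MML penalty terms are given in the paper's Appendix B.

```latex
% Feature-saliency mixture: each feature l is relevant with probability \rho_l.
\[
  p(\mathbf{y} \mid \theta)
  = \sum_{j=1}^{K} \alpha_j \prod_{l=1}^{D}
    \Bigl[ \rho_l\, p(y_l \mid \theta_{jl}) + (1-\rho_l)\, q(y_l \mid \lambda_l) \Bigr].
\]
% E-step: responsibility of component j for point i, and the joint posterior
% that component j generated point i with feature l being relevant.
\[
  w_{ij} \propto \alpha_j \prod_{l}
    \bigl[ \rho_l\, p(y_{il} \mid \theta_{jl}) + (1-\rho_l)\, q(y_{il} \mid \lambda_l) \bigr],
  \qquad
  u_{ijl} = \frac{\rho_l\, p(y_{il} \mid \theta_{jl})}
                 {\rho_l\, p(y_{il} \mid \theta_{jl}) + (1-\rho_l)\, q(y_{il} \mid \lambda_l)}\; w_{ij}.
\]
```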