Journal Article

An effective attribute clustering approach for feature selection and replacement

TLDR
An attribute clustering method based on genetic algorithms is proposed for feature selection and feature replacement; it combines the average accuracy of attribute substitution in clusters and the cluster balance as the fitness function.
Abstract
Feature selection is an important preprocessing step in mining and learning. A good set of features can not only improve the accuracy of classification, but can also reduce the time needed to derive rules. It is especially useful when the number of attributes in a given training data set is very large. In this article, an attribute clustering method based on genetic algorithms is proposed for feature selection and feature replacement. It combines both the average accuracy of attribute substitution in clusters and the cluster balance as the fitness function. Experimental comparison with the k-means clustering approach and with all combinations of attributes shows that the proposed approach achieves a good trade-off between accuracy and time complexity. Besides, after feature selection, the rules derived from only the selected features may be hard to use if some values of the selected features cannot be obtained in the current environment. This problem can be easily solved in the proposed approach. The attributes with...
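To make the fitness design concrete, here is a minimal Python sketch under stated assumptions: a chromosome assigns each attribute a cluster label, cluster balance is approximated by a normalized spread of cluster sizes, and the substitution-accuracy term is a caller-supplied callback (`substitution_accuracy`) rather than the paper's exact measure; the weighting `alpha` and the toy GA loop are likewise illustrative, not the authors' implementation.

```python
import random
from statistics import pstdev
from typing import Callable, List

def cluster_balance(labels: List[int], k: int) -> float:
    """Hypothetical balance term: 1.0 when all k clusters have equal size,
    dropping toward 0 as the size distribution becomes more skewed."""
    sizes = [labels.count(c) for c in range(k)]
    mean = len(labels) / k
    return max(0.0, 1.0 - pstdev(sizes) / mean)

def fitness(labels: List[int], k: int,
            substitution_accuracy: Callable[[List[int]], float],
            alpha: float = 0.5) -> float:
    """Combine average attribute-substitution accuracy with cluster balance.
    Both the weighting `alpha` and the accuracy callback are placeholders."""
    return alpha * substitution_accuracy(labels) + (1 - alpha) * cluster_balance(labels, k)

def evolve(num_attributes: int, k: int, acc_fn, generations: int = 50, pop_size: int = 20):
    """Toy GA: each chromosome holds one cluster label per attribute."""
    pop = [[random.randrange(k) for _ in range(num_attributes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, k, acc_fn), reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, num_attributes)          # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < 0.1:                          # mutation
                child[random.randrange(num_attributes)] = random.randrange(k)
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda c: fitness(c, k, acc_fn))
```

In this sketch, `acc_fn` would be whatever estimate of classification accuracy after attribute substitution one chooses to plug in; the paper defines its own measure.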


Citations
Journal Article

Feature Selection with Attributes Clustering by Maximal Information Coefficient

TL;DR: This paper proposes an unsupervised feature selection method that can reuse a specific data exploration result. It follows the idea of clustering attributes and combines two state-of-the-art data analysis methods, the maximal information coefficient and affinity propagation.
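A rough sketch of that pipeline, purely illustrative rather than the paper's implementation: scikit-learn's AffinityPropagation is run on a precomputed attribute-similarity matrix, and absolute Pearson correlation stands in for the maximal information coefficient to keep the example dependency-free.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

def cluster_attributes(X: np.ndarray) -> np.ndarray:
    """Cluster the columns (attributes) of X by pairwise similarity.
    The cited paper uses MIC as the similarity; absolute Pearson
    correlation is used here only as a stand-in."""
    sim = np.abs(np.corrcoef(X, rowvar=False))           # attribute-by-attribute similarity
    ap = AffinityPropagation(affinity="precomputed", random_state=0)
    return ap.fit_predict(sim)                            # one cluster label per attribute

# Example: 200 samples, 6 attributes; columns 0-2 and 3-5 form two correlated groups.
rng = np.random.default_rng(0)
base1, base2 = rng.normal(size=200), rng.normal(size=200)
X = np.column_stack([base1 + 0.1 * rng.normal(size=200) for _ in range(3)] +
                    [base2 + 0.1 * rng.normal(size=200) for _ in range(3)])
print(cluster_attributes(X))   # e.g. [0 0 0 1 1 1]
```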
Journal Article

Using group genetic algorithm to improve performance of attribute clustering

TL;DR: This study improves the performance of the GA-based attribute clustering process based on the grouping genetic algorithm (GGA).
Proceedings Article

An evolutionary attribute clustering and selection method based on feature similarity

TL;DR: The previous GA-based clustering method for attribute clustering and feature selection is modified for better execution performance, based on feature similarity and feature dependence.
Journal Article

Subspace Selective Ensemble Algorithm Based on Feature Clustering

TL;DR: A feature-clustering-based subspace selective ensemble learning algorithm is proposed to improve ensemble classifier performance on high-dimensional data sets; the results show that classification accuracy increases significantly.
Journal Article

Finding minimal reducts from incomplete information systems

TL;DR: A theorem is proved on the basis of a defined binary matrix and an asymmetry relation, and a genetic algorithm based on this theorem is proposed for finding minimal reducts from incomplete information systems.
References
Book

Genetic algorithms in search, optimization, and machine learning

TL;DR: This book presents the computer techniques, mathematical tools, and research results that enable both students and practitioners to apply genetic algorithms to problems in many fields, including computer programming and mathematics.

Book

Adaptation in natural and artificial systems

TL;DR: A foundational work on adaptation and modification, which aims to mimic biological optimization, with pointers to some non-GA branches of AI.

Some methods for classification and analysis of multivariate observations

TL;DR: The k-means algorithm described in this paper partitions an N-dimensional population into k sets on the basis of a sample; the procedure, a generalization of the ordinary sample mean, is shown to give partitions that are reasonably efficient in the sense of within-class variance.
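For reference, a minimal NumPy sketch of the textbook k-means iteration (nearest-centroid assignment followed by centroid recomputation), not MacQueen's exact sequential formulation:

```python
import numpy as np

def kmeans(X: np.ndarray, k: int, max_iter: int = 100, seed: int = 0):
    """Minimal k-means: partition the rows of X into k sets, aiming to
    reduce within-cluster variance."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # random initial centroids
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)                       # nearest-centroid assignment
        if np.array_equal(new_labels, labels):                  # converged: labels unchanged
            break
        labels = new_labels
        for c in range(k):                                      # recompute each centroid
            if np.any(labels == c):
                centroids[c] = X[labels == c].mean(axis=0)
    return labels, centroids
```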