Journal ArticleDOI

Floating search methods in feature selection

01 Nov 1994-Pattern Recognition Letters (Elsevier Science Inc.)-Vol. 15, Iss: 11, pp 1119-1125
TL;DR: Sequential search methods characterized by a dynamically changing number of features included or eliminated at each step, henceforth "floating" methods, are presented and are shown to give very good results and to be computationally more effective than the branch and bound method.
About: This article was published in Pattern Recognition Letters on 1994-11-01 and has received 3,104 citations to date. The article focuses on the topics: Beam search & Jump search.
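The abstract above only names the idea, so a rough sketch may help: floating search pairs an unconditional inclusion step with a conditional exclusion step that backtracks whenever dropping a feature beats the best subset previously found at the smaller size. The code below is a minimal sequential forward floating selection (SFFS) loop under that reading; the criterion `score` (e.g., cross-validated accuracy or a separability measure) and the target size `d` are illustrative assumptions, not the paper's notation.

```python
# Minimal sketch of Sequential Forward Floating Selection (SFFS).
# `score(subset)` is any subset criterion; names are illustrative assumptions.

def sffs(features, score, d):
    """Return a subset of size d found by floating forward search."""
    selected = []
    best = {}  # best criterion value seen at each subset size
    while len(selected) < d:
        # Inclusion: unconditionally add the most significant remaining feature.
        add = max((f for f in features if f not in selected),
                  key=lambda f: score(selected + [f]))
        selected = selected + [add]
        k = len(selected)
        best[k] = max(best.get(k, float("-inf")), score(selected))
        # Conditional exclusion ("floating" step): drop the least significant
        # feature while that beats the best subset already seen at that size.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if score(reduced) > best.get(len(reduced), float("-inf")):
                selected = reduced
                best[len(selected)] = score(selected)
            else:
                break
    return selected
```

Because the backtracking step only fires on a strict improvement at the smaller size, the loop terminates, and the number of features genuinely "floats" up and down rather than growing monotonically as in plain sequential forward selection.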
Citations
Journal ArticleDOI
TL;DR: In this article, the authors study how to select good features according to the maximal statistical dependency criterion based on mutual information; because that criterion is hard to implement directly, they derive an equivalent form, the minimal-redundancy-maximal-relevance (mRMR) criterion, for first-order incremental feature selection.
Abstract: Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion based on mutual information. Because of the difficulty in directly implementing the maximal dependency condition, we first derive an equivalent form, called minimal-redundancy-maximal-relevance criterion (mRMR), for first-order incremental feature selection. Then, we present a two-stage feature selection algorithm by combining mRMR and other more sophisticated feature selectors (e.g., wrappers). This allows us to select a compact set of superior features at very low cost. We perform extensive experimental comparison of our algorithm and other methods using three different classifiers (naive Bayes, support vector machine, and linear discriminant analysis) and four different data sets (handwritten digits, arrhythmia, NCI cancer cell lines, and lymphoma tissues). The results confirm that mRMR leads to promising improvement on feature selection and classification accuracy.
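As a concrete reading of the first-order incremental scheme described above, here is a hedged sketch of mRMR selection in its difference form (relevance minus mean pairwise redundancy). It assumes discrete-valued features so that scikit-learn's mutual_info_score applies directly; the helper names are mine, not the authors'.

```python
# Sketch of incremental mRMR (difference form: relevance - redundancy).
# Assumes discrete-valued features; continuous ones would need binning first.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import mutual_info_score

def mrmr(X, y, k):
    relevance = mutual_info_classif(X, y)  # I(x_i; y) for each feature
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k:
        def gain(i):
            if not selected:
                return relevance[i]
            # Mean mutual information with already-selected features.
            redundancy = np.mean([mutual_info_score(X[:, i], X[:, j])
                                  for j in selected])
            return relevance[i] - redundancy
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected
```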

8,078 citations



Cites background from "Floating search methods in feature ..."

  • ...Index Terms—Feature selection, mutual information, minimal redundancy, maximal relevance, maximal dependency, classification....


Journal ArticleDOI
TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.
Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

6,527 citations

Journal ArticleDOI
TL;DR: This paper addresses the classification of hyperspectral remote sensing images by support vector machines (SVMs), assessing the potential of SVM classifiers in hyperdimensional feature spaces, and concludes that SVMs are a valid and effective alternative to conventional pattern recognition approaches.
Abstract: This paper addresses the problem of the classification of hyperspectral remote sensing images by support vector machines (SVMs). First, we propose a theoretical discussion and experimental analysis aimed at understanding and assessing the potentialities of SVM classifiers in hyperdimensional feature spaces. Then, we assess the effectiveness of SVMs with respect to conventional feature-reduction-based approaches and their performances in hypersubspaces of various dimensionalities. To sustain such an analysis, the performances of SVMs are compared with those of two other nonparametric classifiers (i.e., radial basis function neural networks and the K-nearest neighbor classifier). Finally, we study the potentially critical issue of applying binary SVMs to multiclass problems in hyperspectral data. In particular, four different multiclass strategies are analyzed and compared: the one-against-all, the one-against-one, and two hierarchical tree-based strategies. Different performance indicators have been used to support our experimental studies in a detailed and accurate way, i.e., the classification accuracy, the computational time, the stability to parameter setting, and the complexity of the multiclass architecture. The results obtained on a real Airborne Visible/Infrared Imaging Spectroradiometer hyperspectral dataset allow us to conclude that, whatever the multiclass strategy adopted, SVMs are a valid and effective alternative to conventional pattern recognition approaches (feature-reduction procedures combined with a classification method) for the classification of hyperspectral remote sensing data.
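The one-against-all and one-against-one strategies compared above are straightforward to exercise with scikit-learn; the snippet below is a generic sketch on a stand-in dataset, not the authors' AVIRIS setup, and the RBF hyperparameters are placeholder assumptions.

```python
# Sketch: comparing one-against-all and one-against-one SVM strategies
# on a generic multiclass dataset (not the hyperspectral data of the paper).
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
base = SVC(kernel="rbf", C=10.0, gamma="scale")  # placeholder hyperparameters

for name, clf in [("one-against-all", OneVsRestClassifier(base)),
                  ("one-against-one", OneVsOneClassifier(base))]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```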

3,607 citations


Cites background from "Floating search methods in feature ..."

  • ...Since the identification of the optimal solution is computationally unfeasible, techniques that lead to suboptimal solutions are normally used....


Journal ArticleDOI
TL;DR: The objective is to provide a generic introduction to variable elimination that can be applied to a wide array of machine learning problems, with a focus on filter, wrapper, and embedded methods.

3,517 citations


Cites background or methods from "Floating search methods in feature ..."

  • ...The SFS and SFFS methods suffer from producing nested subsets since the forward inclusion was always unconditional which means that two highly correlated variables might be included if it gave the highest performance in the SFS evaluation....


  • ...The Sequential Floating Forward Selection (SFFS) [33,34] algorithm is more flexible than the naive SFS because it introduces an additional backtracking step....


  • ...It can be noted that a statistical distance measure can also be used as the objective function for the search algorithms as done in [9,10,33,35]....


  • ...To avoid the nesting effect, adaptive version of the SFFS was developed in [35,36]....


  • ...8 gives the result of applying SFFS with RBF as the wrapper....


References
Journal ArticleDOI
Narendra, Fukunaga
TL;DR: In this paper, a feature subset selection algorithm based on branch-and-bound techniques is proposed to select the best subset of m features from an n-feature set without the exhaustive search that would otherwise be computationally unfeasible.
Abstract: A feature subset selection algorithm based on branch and bound techniques is developed to select the best subset of m features from an n-feature set. Existing procedures for feature subset selection, such as sequential selection and dynamic programming, do not guarantee optimality of the selected feature subset. Exhaustive search, on the other hand, is generally computationally unfeasible. The present algorithm is very efficient and it selects the best subset without exhaustive search. Computational aspects of the algorithm are discussed. Results of several experiments demonstrate the very substantial computational savings realized. For example, the best 12-feature set from a 24-feature set was selected with the computational effort of evaluating only 6000 subsets. Exhaustive search would require the evaluation of 2 704 156 subsets.
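The optimality guarantee above hinges on a monotone criterion: removing a feature can never increase it, so the value of a partial node bounds every subset reachable by further removals. A minimal sketch of that pruning rule, with an illustrative criterion interface `J`, might look like this.

```python
# Sketch of branch-and-bound feature elimination with a monotone criterion J:
# J never increases when a feature is removed, so a node whose value already
# falls below the current bound can be pruned together with its whole subtree.

def branch_and_bound(features, J, m):
    """Find the best m-subset of `features` under a monotone criterion J."""
    best_value, best_subset = float("-inf"), None

    def search(current, removable):
        nonlocal best_value, best_subset
        if len(current) == m:
            if J(current) > best_value:
                best_value, best_subset = J(current), list(current)
            return
        # Prune: J(current) bounds all subsets obtained by further removals.
        if J(current) <= best_value:
            return
        need = len(current) - 1 - m  # removals still required after the next one
        for i, f in enumerate(removable):
            rest = removable[i + 1:]
            if len(rest) >= need:    # enough candidates left to reach size m
                search([g for g in current if g != f], rest)

    search(list(features), list(features))
    return best_subset, best_value
```

In the worst case the search is still exponential; the substantial savings the abstract reports come from how aggressively the bound prunes in practice.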

1,301 citations

Journal ArticleDOI
TL;DR: The preliminary results suggest that the genetic algorithm (GA) is a powerful means of reducing the time required to find near-optimal subsets of features from large sets.

848 citations


"Floating search methods in feature ..." refers methods in this paper

  • ...Further work includes the use of genetic algorithms for feature selection (Siedlecki and Sklansky, 1989) or the possibility of applying simulated annealing technique (Siedlecki and Sklansky, 1988)....

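The genetic-algorithm route mentioned in the excerpt above encodes each candidate subset as a bit string over the features. The sketch below is a bare-bones GA under that encoding; the population size, rates, selection scheme, and `fitness` interface (typically subset accuracy minus a size penalty) are all illustrative assumptions, not Siedlecki and Sklansky's exact procedure.

```python
# Bare-bones GA for feature selection: subsets encoded as bit strings.
# Population size, rates, and `fitness` are illustrative assumptions.
import random

def ga_select(n_features, fitness, pop_size=30, generations=50,
              crossover_rate=0.9, mutation_rate=0.02):
    pop = [[random.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        pop = scored[:2]                      # elitism: keep the two best
        while len(pop) < pop_size:
            # Truncation selection from the top half of the population.
            p1, p2 = random.sample(scored[:pop_size // 2], 2)
            if random.random() < crossover_rate:
                cut = random.randrange(1, n_features)
                child = p1[:cut] + p2[cut:]   # one-point crossover
            else:
                child = p1[:]
            # Bit-flip mutation.
            child = [b ^ (random.random() < mutation_rate) for b in child]
            pop.append(child)
    return max(pop, key=fitness)
```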

Journal ArticleDOI
TL;DR: A direct method of measurement selection is proposed to determine the best subset of d measurements out of a set of D total measurements, using a nonparametric estimate of the probability of error given a finite design sample set.
Abstract: A direct method of measurement selection is proposed to determine the best subset of d measurements out of a set of D total measurements. The measurement subset evaluation procedure directly employs a nonparametric estimate of the probability of error given a finite design sample set. A suboptimum measurement subset search procedure is employed to reduce the number of subsets to be evaluated. The primary advantage of the approach is the direct but nonparametric evaluation of measurement subsets, for the M class problem.
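A common nonparametric error estimate in this spirit is the leave-one-out error of the 1-nearest-neighbor rule. The sketch below pairs it with plain sequential forward selection as the suboptimum search procedure; it illustrates the idea rather than reproducing the paper's exact method.

```python
# Sketch: evaluating measurement subsets by leave-one-out 1-NN error,
# driven by sequential forward selection (illustrative, numpy only).
import numpy as np

def loo_nn_error(X, y):
    """Leave-one-out error of the 1-nearest-neighbor rule."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)        # a sample cannot be its own neighbor
    return np.mean(y[d.argmin(axis=1)] != y)

def forward_select(X, y, d_target):
    selected = []
    while len(selected) < d_target:
        rest = [j for j in range(X.shape[1]) if j not in selected]
        best = min(rest, key=lambda j: loo_nn_error(X[:, selected + [j]], y))
        selected.append(best)
    return selected
```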

790 citations


"Floating search methods in feature ..." refers methods in this paper

  • ...A feature selection technique using the divergence distance as the criterion function and the sequential backward selection (SBS) method as the search algorithm was introduced already by Marill and Green (1963) and its "bottom up" counterpart known as sequential forward selection (SFS) by Whitney (1971). Both these methods are generally suboptimal and suffer from the so-called "nesting effect"....


Journal ArticleDOI
TL;DR: This paper discusses some of the theoretical problems encountered in trying to determine a more formal measure of the effectiveness of a set of tests, a measure which might be a practical substitute for the empirical evaluation.
Abstract: In the type of recognition system under discussion, the physical sample to be recognized is first subjected to a battery of tests; on the basis of the test results, the sample is then assigned to one of a number of prespecified categories. The theory of how test results should be combined to yield an optimal assignment has been discussed in an earlier paper. Here, attention is focused on the tests themselves. At present, we usually measure the effectiveness of a set of tests empirically, i.e., by determining the percentage of correct recognitions made by some recognition device which uses these tests. In this paper, we discuss some of the theoretical problems encountered in trying to determine a more formal measure of the effectiveness of a set of tests; a measure which might be a practical substitute for the empirical evaluation. Specifically, the following question is considered: What constitutes an effective set of tests, and how is this effectiveness dependent on the correlations among, and the properties of, the individual tests in the set? Specific suggestions are considered for the case in which the test results are normally distributed, but arbitrarily correlated. The discussion is supported by the results of experiments dealing with automatic recognition of hand-printed characters.
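For the normally distributed, arbitrarily correlated case the paper considers, one standard closed-form separability measure is the divergence between the two class densities (the criterion the citing paper attributes to Marill and Green). The helper below computes the classic two-Gaussian form and is offered as a reference sketch, not a quotation from the paper.

```python
# Sketch: divergence J between two Gaussian class densities
# N(mu1, S1) and N(mu2, S2) -- the classic closed form.
import numpy as np

def gaussian_divergence(mu1, S1, mu2, S2):
    S1i, S2i = np.linalg.inv(S1), np.linalg.inv(S2)
    dmu = (mu1 - mu2).reshape(-1, 1)
    cov_term = 0.5 * np.trace((S1 - S2) @ (S2i - S1i))
    mean_term = 0.5 * (dmu.T @ (S1i + S2i) @ dmu).item()
    return cov_term + mean_term
```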

626 citations


"Floating search methods in feature ..." refers methods in this paper

  • ...selection technique using the divergence distance as the criterion function and the sequential backward selection (SBS) method as the search algorithm was introduced already by Marill and Green (1963) and its "bottom up" counterpart known as sequential forward selection (SFS) by Whitney (1971)....


Journal ArticleDOI
TL;DR: In this paper, a review of feature selection methods for multidimensional pattern classification is presented; the potential benefits of Monte Carlo approaches such as simulated annealing and genetic algorithms are described, and the methods are compared.
Abstract: We review recent research on methods for selecting features for multidimensional pattern classification. These methods include nonmonotonicity-tolerant branch-and-bound search and beam search. We describe the potential benefits of Monte Carlo approaches such as simulated annealing and genetic algorithms. We compare these methods to facilitate the planning of future research on feature selection.

366 citations