
Showing papers on "Feature selection published in 1980"


Journal ArticleDOI
TL;DR: An efficient procedure that integrates feature selection and binary decision tree construction is presented; it yields an optimal classification decision at each node based on the Kolmogorov-Smirnov criterion.
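
The paper's exact procedure is not reproduced in this summary; as a minimal sketch of the underlying idea, one can score each candidate feature at a tree node by the two-class Kolmogorov-Smirnov distance between its class-conditional empirical CDFs and branch on the best-scoring feature (the two-class setting and all function names below are my assumptions):

```python
import numpy as np

def ks_statistic(x0, x1):
    # Kolmogorov-Smirnov distance between the empirical CDFs of two
    # one-dimensional samples (one per class).
    grid = np.sort(np.concatenate([x0, x1]))
    cdf0 = np.searchsorted(np.sort(x0), grid, side="right") / len(x0)
    cdf1 = np.searchsorted(np.sort(x1), grid, side="right") / len(x1)
    return np.max(np.abs(cdf0 - cdf1))

def best_feature_at_node(X, y):
    # Pick the feature whose class-conditional distributions differ most,
    # i.e. the feature with the largest KS statistic (classes coded 0/1).
    scores = [ks_statistic(X[y == 0, j], X[y == 1, j]) for j in range(X.shape[1])]
    return int(np.argmax(scores))
```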

109 citations



Journal ArticleDOI
TL;DR: Monte Carlo results presented here further confirm the relatively good performance of non-parametric Bayes-theorem-type algorithms compared to parametric (linear and quadratic) algorithms, and point out certain procedures that should be used in selecting the density estimation windows for non-parametric algorithms to improve their performance.
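
For context, here is a minimal sketch of the kind of non-parametric Bayes classifier such a study compares, with a Gaussian Parzen kernel whose width h plays the role of the density estimation window; the specific kernel and all names are my assumptions, not the paper's:

```python
import numpy as np

def parzen_density(x, samples, h):
    # Gaussian-kernel (Parzen window) density estimate at point x;
    # h is the window width whose selection the paper investigates.
    d = samples.shape[1]
    diff = (samples - x) / h
    kernel = np.exp(-0.5 * np.sum(diff ** 2, axis=1)) / ((2 * np.pi) ** (d / 2) * h ** d)
    return kernel.mean()

def bayes_classify(x, class_samples, priors, h):
    # Assign x to the class maximizing prior * estimated class density.
    scores = [p * parzen_density(x, s, h) for s, p in zip(class_samples, priors)]
    return int(np.argmax(scores))
```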

55 citations


Journal ArticleDOI
TL;DR: To gain an understanding of the learning process, the author derives expressions for success probability as a function of training time for a one-dimensional increment error correction classifier with imperfect labels.
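
The paper's expressions are analytic; the sketch below merely simulates the setting as I read it (a one-dimensional threshold classifier trained by fixed-increment error correction, with labels flipped with probability `flip`), so every modeling detail here is an assumption:

```python
import numpy as np
from math import erf, sqrt

def phi(z):
    # Standard normal CDF.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def success_curve(steps=300, runs=2000, delta=0.05, flip=0.1, seed=0):
    # Monte Carlo estimate of success probability vs. training time.
    # True boundary at 0, samples ~ N(0,1); with estimated boundary theta,
    # P(error) = |phi(theta) - phi(0)|, so P(success) = 1 - |phi(theta) - 0.5|.
    rng = np.random.default_rng(seed)
    success = np.zeros(steps)
    for _ in range(runs):
        theta = rng.normal(loc=1.0)            # arbitrary initial boundary
        for t in range(steps):
            x = rng.normal()
            label = x > 0.0                    # true class
            if rng.random() < flip:
                label = not label              # imperfect label
            if (x > theta) != label:           # increment correction on error
                theta += delta if label else -delta
            success[t] += 1.0 - abs(phi(theta) - 0.5)
    return success / runs
```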

25 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of selecting variables for the sample linear discriminant function is considered in the context of two multivariate normal populations with the same covariance matrix, and it is concluded that, provided the significance level of the F test is not too conservative, there should be a fairly high degree of confidence that the overall error rate is not increased by selection decisions based on the well-known F test.
Abstract: The problem of selecting variables for the sample linear discriminant function is considered in the context of two multivariate normal populations with the same covariance matrix. Selection decisions based on the well-known F test of 'no additional information' are contrasted with those based on a criterion which considers the asymptotic probability that there is no increase in the overall conditional error rate on deletion of a subset of variables. On the basis of the various data sets analysed, it is concluded that, provided the significance level of the F test is not too conservative, there should be a fairly high degree of confidence that the overall error rate is not increased by selection decisions based on the F test. 1. Introduction. Let x = (x₁, ..., xₚ)′ be an observation vector consisting of the available characteristics associated with an object which is to be allocated to one of two possible populations, say Π₁ and Π₂, assumed to be multivariate normal with the same covariance matrix, so that ...
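
For the two-group case the 'no additional information' F test can be written in terms of the sample Mahalanobis distances on the full and reduced variable sets; the sketch below uses the standard two-population form of that statistic (my choice of formulation, not necessarily the paper's notation):

```python
from scipy import stats

def additional_info_F(D2_full, D2_sub, n1, n2, p, q):
    # F test of 'no additional information': do the p - q deleted variables
    # add discriminatory power beyond the q retained ones?
    # D2_full, D2_sub: sample Mahalanobis distances on p and q variables.
    N = n1 + n2
    c = n1 * n2 / N
    F = ((N - p - 1) / (p - q)) * c * (D2_full - D2_sub) / (N - 2 + c * D2_sub)
    p_value = stats.f.sf(F, p - q, N - p - 1)  # reference: F(p-q, N-p-1)
    return F, p_value
```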

19 citations


Journal ArticleDOI
Haruo Yanai
TL;DR: In this paper, a generalized method of variable selection is proposed to select criterion variables as well as explanatory variables simultaneously in canonical correlation analysis, using the GCD (Generalized Coefficient of Determination) as a maximization criterion.
Abstract: We propose a generalized method of variable selection that applies when the number of criterion variables exceeds two. Using the method, we can select criterion variables as well as explanatory variables simultaneously in canonical correlation analysis, with the GCD (Generalized Coefficient of Determination) as a maximization criterion. Furthermore, the generalized method of variable selection can be applied to factor analysis, in which case the forward selection method is applied to the real variables with the number of latent factor variables held fixed. Finally, we show two numerical examples demonstrating the validity of our procedure.
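
As a rough sketch of the selection scheme, the sum of squared canonical correlations, normalized by the subset sizes, is used below as a stand-in for the GCD, and a greedy forward search adds the explanatory variable that most increases it; both the exact GCD formula and the search details are my assumptions:

```python
import numpy as np

def squared_canonical_corrs(X, Y):
    # Squared canonical correlations between column sets X and Y,
    # i.e. eigenvalues of Rxx^-1 Rxy Ryy^-1 Ryx.
    p = X.shape[1]
    R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
    Rxx, Ryy, Rxy = R[:p, :p], R[p:, p:], R[:p, p:]
    M = np.linalg.solve(Rxx, Rxy) @ np.linalg.solve(Ryy, Rxy.T)
    return np.linalg.eigvals(M).real

def gcd_score(X, Y):
    # Stand-in for the Generalized Coefficient of Determination:
    # sum of squared canonical correlations over sqrt(p * q).
    return squared_canonical_corrs(X, Y).sum() / np.sqrt(X.shape[1] * Y.shape[1])

def forward_select(X, Y, k):
    # Greedy forward selection of k explanatory columns of X
    # maximizing the determination criterion against Y.
    chosen = []
    for _ in range(k):
        rest = [j for j in range(X.shape[1]) if j not in chosen]
        best = max(rest, key=lambda j: gcd_score(X[:, chosen + [j]], Y))
        chosen.append(best)
    return chosen
```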

17 citations


Journal ArticleDOI
TL;DR: It is shown that under certain conditions features with significantly different discrimination power are treated as equivalent by the Pe rule, and a rule for breaking ties is suggested to refine the feature ordering induced by the Pe rule.
Abstract: The low sensitivity of the probability of error rule (Pe rule) for feature selection is demonstrated and discussed. It is shown that under certain conditions features with significantly different discrimination power are treated as equivalent by the Pe rule. The main reason for this phenomenon is that the Pe rule depends directly only on the most probable class and that, under the stated conditions, the prior most probable class remains the posterior most probable class regardless of the observed value of the feature. A rule for breaking ties is suggested to refine the feature ordering induced by the Pe rule: when two features have the same expected probability of error, the feature with the higher variance of the probability of error is preferred.
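
A minimal sketch of the Pe ranking and the suggested tie-break for discrete-valued features follows; the representation of each feature by its value probabilities and per-value class posteriors is my assumption:

```python
import numpy as np

def pe_and_var(p_x, post):
    # p_x[k]: probability of the k-th value of the feature.
    # post[k]: posterior class probabilities given that value.
    err = 1.0 - post.max(axis=1)         # error probability per feature value
    pe = np.dot(p_x, err)                # expected probability of error (Pe)
    var = np.dot(p_x, (err - pe) ** 2)   # variance of the error probability
    return pe, var

def rank_features(features):
    # Order features by Pe (ascending); among equal-Pe features, prefer the
    # one with the larger variance of the probability of error (the tie-break).
    keyed = []
    for i, (p_x, post) in enumerate(features):
        pe, var = pe_and_var(np.asarray(p_x), np.asarray(post))
        keyed.append((pe, -var, i))
    return [i for _, _, i in sorted(keyed)]
```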

7 citations


Journal ArticleDOI
TL;DR: This paper considers the problem of finding the best feature subset by exhaustive search, using probabilistic distance measures as criteria, and a combinatorial algorithm is presented for generating all possible r-feature combinations from a given set of S features in (S choose r) steps.
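
A brute-force version of such a search, using the Bhattacharyya distance between two Gaussian classes as one possible probabilistic distance criterion, might look like the sketch below; the paper's contribution is an efficient enumeration scheme, for which `itertools.combinations` merely stands in:

```python
import numpy as np
from itertools import combinations

def bhattacharyya(m1, m2, C1, C2):
    # Bhattacharyya distance between two Gaussian classes
    # with means m1, m2 and covariances C1, C2.
    C = 0.5 * (C1 + C2)
    d = m1 - m2
    return (0.125 * d @ np.linalg.solve(C, d)
            + 0.5 * np.log(np.linalg.det(C)
                           / np.sqrt(np.linalg.det(C1) * np.linalg.det(C2))))

def best_subset(m1, m2, C1, C2, r):
    # Exhaustive search over all (S choose r) feature subsets.
    def score(idx):
        idx = list(idx)
        return bhattacharyya(m1[idx], m2[idx],
                             C1[np.ix_(idx, idx)], C2[np.ix_(idx, idx)])
    return max(combinations(range(len(m1)), r), key=score)
```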

7 citations


ReportDOI
01 Feb 1980
TL;DR: A feature reduction algorithm based on a linear decision-tree classifier is proposed, and an example is presented to illustrate the use and validity of this algorithm.
Abstract: Workload models are extremely important for computer performance evaluation. The problem of feature reduction for the purpose of formulating workload models has received widespread attention. This paper briefly reviews existing schemes for feature selection and reduction, and proposes a feature reduction algorithm based on a linear decision-tree classifier. An example is presented to illustrate the use and validity of this algorithm. (Author)

1 citation


Journal ArticleDOI
TL;DR: A method is developed for choosing the dimensionality of the patterns, using as the axes of the feature space the eigenvectors of matrices of the form R₂⁻¹R₁, where R₁ and R₂ are real symmetric matrices.
Abstract: This paper considers the problem of selection of dimensionality and sample size for feature extraction in pattern recognition. In general, the axes of the feature space are selected as the eigenvectors of matrices of the form R₂⁻¹R₁, where R₁ and R₂ are real symmetric matrices. Expressions are derived for obtaining the changes in the eigenvalues and eigenvectors when there are changes of first order of smallness in the matrices R₁ and R₂. Based on this theory, a method is developed for choosing the dimensionality of the patterns. Expressions are also derived for the selection of sample size for estimating the eigenvectors, for two Gaussian-distributed pattern classes with equal means and unequal covariance matrices, and with unequal means and equal covariance matrices.
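
The eigenproblem and the first-order eigenvalue perturbation can be sketched as follows, treating R₂ as positive definite so the symmetric generalized eigenproblem applies; the perturbation formula is the standard first-order result, which I assume matches the kind of expression the paper derives:

```python
import numpy as np
from scipy.linalg import eigh

def feature_axes(R1, R2):
    # Axes of the feature space: eigenvectors of R2^-1 R1, computed via the
    # symmetric generalized eigenproblem R1 v = lambda R2 v (R2 pos. definite).
    lam, V = eigh(R1, R2)
    return lam[::-1], V[:, ::-1]          # eigenvalues in descending order

def eigenvalue_shifts(lam, V, dR1, dR2, R2):
    # First-order change in each eigenvalue under small perturbations dR1, dR2:
    # d(lambda_i) = v_i'(dR1 - lambda_i dR2) v_i / (v_i' R2 v_i).
    return np.array([V[:, i] @ (dR1 - lam[i] * dR2) @ V[:, i]
                     / (V[:, i] @ R2 @ V[:, i])
                     for i in range(len(lam))])
```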