
Showing papers on "Feature selection published in 1990"


01 Jan 1990
TL;DR: A technique has been developed which analyzes the weights in a multilayer perceptron to determine which features the network finds important and which are unimportant, and the saliency measure is used to compare the results of two different training rules on a target recognition problem.
Abstract: The problem of selecting the best set of features for target recognition using a multilayer perceptron is addressed in this paper. A technique has been developed which analyzes the weights in a multilayer perceptron to determine which features the network finds important and which are unimportant. A brief introduction to the use of multilayer perceptrons for classification and the training rules available is followed by the mathematical development of the saliency measure for multilayer perceptrons. The technique is applied to two different image databases and is found to be consistent with statistical techniques and independent of the network initial conditions. The saliency measure is then used to compare the results of two different training rules on a target recognition problem.
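The paper derives its saliency measure from the trained weights; as a rough illustration of that idea (not the authors' exact formula), the sketch below scores each input of a single-hidden-layer perceptron by propagating absolute weight magnitudes through to the outputs. The sklearn model and this particular scoring rule are assumptions for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def weight_saliency(mlp: MLPClassifier) -> np.ndarray:
    """Crude per-input saliency for a single-hidden-layer MLP: propagate
    absolute weight magnitudes from each input through to the outputs."""
    w_in, w_out = mlp.coefs_                     # (n_features, n_hidden), (n_hidden, n_outputs)
    hidden_strength = np.abs(w_out).sum(axis=1)  # how strongly each hidden unit drives the outputs
    return np.abs(w_in) @ hidden_strength        # one saliency score per input feature

# usage: rank features and keep the most salient ones
# mlp = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000).fit(X_train, y_train)
# ranking = np.argsort(weight_saliency(mlp))[::-1]
```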

262 citations


Journal ArticleDOI
TL;DR: Taking advantage of the orthogonality property, a systematic feature selection method for choosing an appropriate number of the Zernike features is developed, based on computing a measure of the information content differences of features of different classes.

166 citations


Proceedings Article
01 Oct 1990
TL;DR: The results suggest that genetic algorithms are becoming practical for pattern classification problems as faster serial and parallel computers are developed.
Abstract: Genetic algorithms were used to select and create features and to select reference exemplar patterns for machine vision and speech pattern classification tasks. For a complex speech recognition task, genetic algorithms required no more computation time than traditional approaches to feature selection but reduced the number of input features required by a factor of five (from 153 to 33 features). On a difficult artificial machine-vision task, genetic algorithms were able to create new features (polynomial functions of the original features) which reduced classification error rates from 19% to almost 0%. Neural net and k nearest neighbor (KNN) classifiers were unable to provide such low error rates using only the original features. Genetic algorithms were also used to reduce the number of reference exemplar patterns for a KNN classifier. On a 338 training pattern vowel-recognition problem with 10 classes, genetic algorithms reduced the number of stored exemplars from 338 to 43 without significantly increasing classification error rate. In all applications, genetic algorithms were easy to apply and found good solutions in many fewer trials than would be required by exhaustive search. Run times were long, but not unreasonable. These results suggest that genetic algorithms are becoming practical for pattern classification problems as faster serial and parallel computers are developed.
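A minimal sketch of GA-based feature-subset selection in the spirit described: bit-string individuals encode which features are kept, and fitness is cross-validated KNN accuracy minus a small per-feature cost. The population size, mutation rate, and penalty term are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(mask, X, y, penalty=0.01):
    """Cross-validated KNN accuracy minus a small cost per selected feature."""
    if mask.sum() == 0:
        return -np.inf
    acc = cross_val_score(KNeighborsClassifier(3), X[:, mask.astype(bool)], y, cv=3).mean()
    return acc - penalty * mask.sum()

def ga_select(X, y, pop=20, gens=30, p_mut=0.05):
    """Evolve bit-strings that mark which features are kept."""
    n = X.shape[1]
    population = rng.integers(0, 2, size=(pop, n))
    for _ in range(gens):
        scores = np.array([fitness(ind, X, y) for ind in population])
        parents = population[np.argsort(scores)[::-1][:pop // 2]]    # truncation selection
        children = []
        for _ in range(pop - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                                  # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= (rng.random(n) < p_mut).astype(child.dtype)      # bit-flip mutation
            children.append(child)
        population = np.vstack([parents] + children)
    best = max(population, key=lambda ind: fitness(ind, X, y))
    return best.astype(bool)                                          # boolean feature mask
```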

82 citations


Proceedings ArticleDOI
J. Sheinvald1, Byron Dom1, W. Niblack1
16 Jun 1990
TL;DR: In this paper, an information-theoretic approach is used to derive a new feature selection criterion capable of detecting features that are totally useless by fitting a probability model to a given training set of classified feature-vectors using the minimum description length criterion (MDLC) for model selection.
Abstract: An information-theoretic approach is used to derive a new feature selection criterion capable of detecting features that are totally useless. Since the number of useless features is initially unknown, traditional class-separability and distance measures are not capable of coping with this problem. The useless feature-subset is detected by fitting a probability model to a given training set of classified feature-vectors using the minimum-description-length criterion (MDLC) for model selection. The resulting criterion for the Gaussian case is a simple closed-form expression, having a plausible geometric interpretation, and is proved to be consistent, i.e., it yields the true useless subset with probability 1 as the size of the training set grows to infinity. Simulations show excellent results compared to the cross-validation method and other information-theoretic criteria, even for small-sized training sets.
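A rough per-feature illustration of the MDL idea (not the paper's closed-form joint criterion): a feature is flagged as useless when a single pooled Gaussian describes it more compactly, in two-part description length, than class-conditional Gaussians do.

```python
import numpy as np

def gaussian_dl(x, n_total):
    """Two-part description length of 1-D samples under a fitted Gaussian
    (negative log-likelihood plus (k/2) log n for its k = 2 parameters)."""
    var = max(x.var(), 1e-12)
    nll = 0.5 * np.sum(np.log(2 * np.pi * var) + (x - x.mean()) ** 2 / var)
    return nll + 0.5 * 2 * np.log(n_total)

def mdl_useless_features(X, y):
    """Indices of features better described by one pooled Gaussian than by
    class-conditional Gaussians, i.e. features carrying no class information."""
    n = len(y)
    useless = []
    for j in range(X.shape[1]):
        dl_pooled = gaussian_dl(X[:, j], n)
        dl_class = sum(gaussian_dl(X[y == c, j], n) for c in np.unique(y))
        if dl_pooled <= dl_class:
            useless.append(j)
    return useless
```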

42 citations


Proceedings ArticleDOI
17 Jun 1990
TL;DR: On a difficult artificial machine-vision task, genetic algorithms were able to create new features (polynomial functions of the original features) which dramatically reduced classification error rates.
Abstract: Genetic algorithms were used for feature selection and creation in two pattern-classification problems. On a machine-vision inspection task, it was found that genetic algorithms performed no better than conventional approaches to feature selection but required much more computation. On a difficult artificial machine-vision task, genetic algorithms were able to create new features (polynomial functions of the original features) which dramatically reduced classification error rates. Neural network and nearest-neighbor classifiers were unable to provide such low error rates using only the original features.
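As a stand-in for the genetic search over created polynomial features, the hedged sketch below greedily appends pairwise products of the original features whenever they improve a nearest-neighbor cross-validation score; the greedy loop and KNN settings are illustrative assumptions, not the paper's method.

```python
import numpy as np
from itertools import combinations_with_replacement
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def create_polynomial_features(X, y, max_new=5):
    """Greedily append pairwise products (and squares) of the original features
    whenever they raise a nearest-neighbour cross-validation score."""
    knn = KNeighborsClassifier(3)
    best = cross_val_score(knn, X, y, cv=3).mean()
    for i, j in combinations_with_replacement(range(X.shape[1]), 2):
        if max_new == 0:
            break
        X_try = np.column_stack([X, X[:, i] * X[:, j]])   # candidate polynomial feature
        score = cross_val_score(knn, X_try, y, cv=3).mean()
        if score > best:
            X, best, max_new = X_try, score, max_new - 1  # keep it
    return X, best
```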

34 citations


Journal ArticleDOI
TL;DR: It is shown that the branch and bound algorithm guarantees the optimal feature subset without evaluating all possible feature subsets, if the criterion function used satisfies the ‘monotonicity’ property.
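A compact sketch of the branch-and-bound idea: because a monotone criterion can only decrease when features are removed, any partial subset scoring no better than the best size-k subset found so far can be pruned together with all of its subsets. The criterion is left as a user-supplied function; the recursion scheme is an illustrative implementation, not the paper's.

```python
import numpy as np

def branch_and_bound(n_features, criterion, k):
    """Best size-k feature subset under a monotone criterion J
    (J(S) <= J(T) whenever S is a subset of T)."""
    best = {"score": -np.inf, "subset": None}

    def search(current, start):
        score = criterion(current)
        if score <= best["score"]:
            return                                   # monotonicity: no subset of `current` can beat the bound
        if len(current) == k:
            best["score"], best["subset"] = score, current
            return
        for i in range(start, len(current)):         # branch: discard one more feature
            if len(current) - 1 >= k:
                search(current[:i] + current[i + 1:], i)

    search(tuple(range(n_features)), 0)
    return best["subset"], best["score"]

# a typical monotone criterion is a class-separability measure such as the
# Mahalanobis distance between class means restricted to the candidate subset
```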

24 citations




Journal ArticleDOI
TL;DR: When both the multinormal-distribution procedure and Hawkins' procedure were applied, the two subsets produced did not differ greatly; Hawkins' procedure seems the better method for detecting outliers.

8 citations



Journal ArticleDOI
Manabu
TL;DR: An optimum feature selection method which is applicable to arbitrary (nonlinear) decision functions is presented and numerical examples of feature selection for a linear and a quadratic decision function are presented.
Abstract: Feature selection is one of the most important processes in the design of pattern classifiers. This paper presents an optimum feature selection method which is applicable to arbitrary (nonlinear) decision functions. It is assumed that a finite number of training samples (a training set) is given for each pattern class, and the decision function is designed based on the training sets. The training sets are edited by removing the samples which are classified incorrectly by the decision function. Then the feature selection problem is transformed into a modified zero-one integer program. In this method, under a chosen permissible error, a minimum feature subset can be found which is combinatorially optimal. Numerical examples of feature selection for a linear and a quadratic decision function are presented.
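The paper solves a modified zero-one integer program; as a small illustrative stand-in, the sketch below simply enumerates subsets by increasing size and returns the first one whose re-fitted linear decision function stays within a permissible number of training errors. The LDA classifier and brute-force search are assumptions for illustration, not the paper's formulation.

```python
import numpy as np
from itertools import combinations
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def minimal_subset(X, y, max_errors=0):
    """Smallest feature subset whose re-fitted linear decision function
    misclassifies at most `max_errors` training samples."""
    d = X.shape[1]
    for size in range(1, d + 1):                      # smallest subsets first
        for subset in combinations(range(d), size):
            cols = list(subset)
            clf = LinearDiscriminantAnalysis().fit(X[:, cols], y)
            if np.sum(clf.predict(X[:, cols]) != y) <= max_errors:
                return subset                          # first hit is a minimum-size subset
    return tuple(range(d))
```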


Proceedings ArticleDOI
01 Mar 1990
TL;DR: An information fusion approach is presented for mapping a multiple dimensional feature space into a lower dimensional decision space with simplified decision boundaries by measuring differences in probability density functions of features.
Abstract: An information fusion approach is presented for mapping a multiple dimensional feature space into a lower dimensional decision space with simplified decision boundaries. A new statistic, called the tie statistic, is used to perform the mapping by measuring differences in probability density functions of features. These features are then evaluated based on the separation of the decision classes using a parametric beta representation for the tie statistic. The feature evaluation and fusion methods are applied to perform texture recognition.

Proceedings ArticleDOI
16 Jun 1990
TL;DR: A study on applying the discrete pseudo-Wigner distribution to classification of the plasma cortisol short-time series of two disease categories is presented and shows a great potential for using Wigner spectra for feature extraction.
Abstract: A study on applying the discrete pseudo-Wigner distribution to classification of plasma cortisol short-time series from two disease categories is presented. Autocomponent selection in the Wigner distribution has been studied with regard to feature selection for pattern recognition. Three energy density features are selected along with the mean level of each time series. These features allow a simple linear classification of the available data. The results show great potential for using Wigner spectra for feature extraction.
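A hedged sketch of computing a discrete pseudo-Wigner distribution from which energy-density features could be taken; the analytic-signal step, the window choice, and the use of the magnitude of the lag-domain DFT are simplifying assumptions rather than the study's exact procedure.

```python
import numpy as np
from scipy.signal import hilbert

def pseudo_wigner(x, win_len=15):
    """Discrete pseudo-Wigner distribution of a real 1-D signal (sketch).
    Returns a (len(x), win_len) array of time-frequency energy estimates."""
    z = hilbert(x)                        # analytic signal reduces interference terms
    half = win_len // 2
    w = np.hanning(win_len)               # lag window -- the "pseudo" (smoothed) part
    zp = np.concatenate([np.zeros(half), z, np.zeros(half)])
    m = np.arange(-half, half + 1)
    pwd = np.zeros((len(x), win_len))
    for n in range(len(x)):
        c = n + half                      # position of time n in the padded signal
        kernel = w * zp[c + m] * np.conj(zp[c - m])
        pwd[n] = np.abs(np.fft.fft(kernel))   # magnitude of the lag-domain DFT
    return pwd

# energy-density features could then be, e.g., sums of pwd over selected frequency bands
```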

01 Apr 1990
TL;DR: Variable selection techniques in stepwise regression analysis are discussed, and general and specific criticisms of the stepwise method in the literature are outlined and misuses of the method are considered.
Abstract: Variable selection techniques in stepwise regression analysis are discussed. In stepwise regression, variables are added or deleted from a model in sequence to produce a final "good" or "best" predictive model. Stepwise computer programs are discussed and four different variable selection strategies are described. These strategies include the forward method, backward method, forward stepwise method, and backward stepwise method. General and specific criticisms of the stepwise method in the literature are outlined, and misuses of the method are considered. Although most statisticians would agree that stepwise methods should not be used when an explanatory model is desired, it is common to see research articles where explanatory interpretations are given to a model that is called a "prediction" model. Stepwise methods should not be used to determine the number of variables in the final model. When model selection is being performed, the stepwise method can be helpful if: the initial choice of variables is conducted based on theory, the defaults are not used automatically, more than one run is done using different variable selection methods, and the final model is chosen through an intelligent process rather than automatically using the final model generated by the computer program. Three tables and outlines of the models described are included.
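A minimal sketch of the forward strategy using AIC as the entry criterion; the criterion choice is an assumption, since stepwise programs also use F-to-enter thresholds and other defaults of the kind the paper warns about relying on automatically.

```python
import numpy as np

def forward_stepwise(X, y, max_vars=None):
    """Forward selection for linear regression: repeatedly add the variable
    that most lowers AIC, stopping when no candidate improves it."""
    n, d = X.shape
    max_vars = max_vars or d
    selected, remaining = [], list(range(d))

    def aic(cols):
        Z = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        rss = np.sum((y - Z @ beta) ** 2)
        return n * np.log(rss / n) + 2 * (len(cols) + 1)

    best = aic(selected)
    while remaining and len(selected) < max_vars:
        score, var = min((aic(selected + [c]), c) for c in remaining)
        if score >= best:
            break                                   # no remaining variable helps
        selected.append(var)
        remaining.remove(var)
        best = score
    return selected
```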

Journal ArticleDOI
TL;DR: In this article, a nonparametric algorithm is proposed for the selection of variables in allocatory discriminant analysis, which is based on the ability to reuse calculations for the inverse of a nonsingular matrix.
Abstract: A computationally efficient nonparametric algorithm is proposed for the selection of variables in allocatory discriminant analysis. The efficiency of the algorithm derives from an ability to reuse calculations for the inverse of a nonsingular matrix. A subset of the original variables is found for which the leave-one-out estimate of the conditional probability of misclassification is never significantly greater than the estimated conditional probability of misclassification for the full set of predictor variables.
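The paper's efficiency comes from reusing matrix-inverse calculations; the simpler sketch below only illustrates the selection logic: backward elimination of variables for as long as a leave-one-out error estimate (here from a 1-nearest-neighbour rule, an assumed stand-in for the paper's allocation rule) does not rise above the full-set estimate by more than a slack term.

```python
import numpy as np

def loo_error_1nn(X, y):
    """Leave-one-out misclassification rate of a 1-nearest-neighbour rule."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)           # a sample may not be its own neighbour
    return np.mean(y[np.argmin(D, axis=1)] != y)

def backward_select(X, y, slack=0.0):
    """Drop variables while the LOO error stays within `slack` of the full-set estimate."""
    full_err = loo_error_1nn(X, y)
    keep = list(range(X.shape[1]))
    improved = True
    while improved and len(keep) > 1:
        improved = False
        for j in list(keep):
            trial = [c for c in keep if c != j]
            if loo_error_1nn(X[:, trial], y) <= full_err + slack:
                keep, improved = trial, True
                break
    return keep
```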

Book ChapterDOI
01 Jan 1990
TL;DR: The problem of variable selection in principal component analysis (PCA) has been studied by several authors, but as yet no selection procedures are found in classical statistical software packages.

Abstract: The problem of variable selection in Principal Component Analysis (PCA) has been studied by several authors [1], but as yet no selection procedures are found in the classical statistical software packages. Such selection procedures are found, on the other hand, for linear regression or discriminant analysis, because the selection criteria are based on well-known quantities such as the multiple correlation coefficient or the average prediction error.
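One simple selection criterion of the kind such procedures might use, sketched under the assumption that variables are ranked by their squared loadings on the leading principal components; this is illustrative and not a specific procedure from the chapter.

```python
import numpy as np

def rank_variables_by_pca(X, n_components=2):
    """Rank variables by their squared loadings on the leading principal components."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    lead = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]   # top components
    contribution = (lead ** 2).sum(axis=1)                        # per-variable contribution
    return np.argsort(contribution)[::-1]                         # most important first
```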

Proceedings ArticleDOI
23 May 1990
TL;DR: In this paper, a variable selection procedure maximizes the dominance of the subsystem of the selected variables, hence justifying the proposed selection, and prior model reduction can significantly improve variable selection since it eliminates the negative effects of inappropriate modelling.
Abstract: The procedure for selecting input and output variables, proposed by Keller and Bonvin [4] is analysed using the principle of internal dominance proposed by Moore [6]. Using that principle, a dominance condition for the subsystem of the selected variables over the subsystem of the neglected variables is derived. It can be shown that the proposed variable selection procedure maximizes the dominance of the subsystem of the selected variables, hence justifying the proposed selection. The investigation further shows that prior model reduction can significantly improve variable selection since it eliminates the negative effects of inappropriate modelling.

Proceedings ArticleDOI
11 Mar 1990
TL;DR: A new feature selection approach is presented for using parallel distributed processing to identify a three-dimensional object from a two-dimensional image recorded at an arbitrary viewing angle and range and compares favorably with established feature selection approaches.
Abstract: A new feature selection approach is presented for using parallel distributed processing to identify a three-dimensional object from a two-dimensional image recorded at an arbitrary viewing angle and range. One vector of 32 feature variables is used to describe a two-dimensional binary image. The feature variables are based on counts of nearest neighbor conjuncts, which reflect shape and area differences among airplanes. Thirteen standardized airplanes are used in the experiment in order to compare the results with established feature selection approaches. Results based on the new approach compare favorably with results from traditional approaches. In addition, a relatively fast compact parallel hardware design and data structure are presented and compared with traditional algorithms.
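A guess at the flavour of nearest-neighbour conjunct counts: counts of adjacent foreground-pixel pairs in four directions of a binary silhouette. The paper's exact 32-variable definition is not reproduced here.

```python
import numpy as np

def conjunct_counts(img):
    """Counts of nearest-neighbour conjuncts (adjacent pairs of foreground
    pixels) in a binary image, taken in four directions."""
    b = np.asarray(img, dtype=bool)
    return np.array([
        np.sum(b[:, :-1] & b[:, 1:]),     # horizontal pairs
        np.sum(b[:-1, :] & b[1:, :]),     # vertical pairs
        np.sum(b[:-1, :-1] & b[1:, 1:]),  # diagonal pairs
        np.sum(b[:-1, 1:] & b[1:, :-1]),  # anti-diagonal pairs
    ])
```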

Journal ArticleDOI
TL;DR: A ranking scheme for features based on a feature’s calculated “performance potential” is outlined, made up of a number of performance measures: extraction time, memory requirements, variance, covariance and classification success.
Abstract: Feature selection is an important phase in most pattern recognition problems, especially when the space of the extracted features is very large. Feature selection methods attempt to reduce the feature space to satisfy certain objectives. We propose the concept of defining a performance potential as a measure of the effectiveness of the set of selected features. This paper begins by outlining a ranking scheme for features based on a feature’s calculated “performance potential”. The performance potential is made up of a number of performance measures: extraction time, memory requirements, variance, covariance and classification success. An adaptive scheme is proposed to process a number of initial features and arrive at the “best” subset based on their performance potential. The approach is applied to a texture analysis problem. The results of the testing of the approach point to conclusions concerning its effectiveness.
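A hedged sketch of combining several performance measures into a single "performance potential" per feature; the min-max normalisation and the example weights are assumptions, not the paper's weighting scheme.

```python
import numpy as np

def performance_potential(measures, weights):
    """Combine per-feature performance measures into a single ranking score.
    `measures` maps measure names to arrays of per-feature values; `weights`
    gives their importance (negative for costs such as extraction time)."""
    names = sorted(measures)
    M = np.column_stack([measures[k] for k in names])
    M = (M - M.min(axis=0)) / np.ptp(M, axis=0).clip(min=1e-12)   # normalise to [0, 1]
    w = np.array([weights[k] for k in names])
    return M @ w                                                  # one potential per feature

# illustrative call (hypothetical weights):
# score = performance_potential(
#     {"extraction_time": t, "memory": mem, "classification_success": acc},
#     {"extraction_time": -0.2, "memory": -0.1, "classification_success": 0.7})
```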

01 Mar 1990
TL;DR: The purpose of this thesis was to revise and extend the capabilities of existing software for selecting the significant control variables of a simulation model, based on a newly developed selection criterion, and validate the effectiveness of the new selection criterion by comparison to results derived using other common selection criteria.
Abstract: : The purpose of this thesis was three-fold. The first purpose was to revise and extend the capabilities of existing software for selecting the significant control variables of a simulation model, based on a newly developed selection criterion. The second purpose was to compare the results obtained using the revised software employing two different selection procedures. And the third purpose was then to validate the effectiveness of the new selection criterion by comparison to results derived using other common selection criteria. Extensive revision of the existing software was necessary to prepare it for use. Initially, the software was revised to extend its adaptability to evaluating new data and to increase user friendliness. Next, a new procedure was added to the software to permit it to evaluate data using a Stepwise (Forward Selection) procedure. Previously, the software only performed analysis of the data through an Enumerated Subsets approach. After revision of the software was complete, it was renamed the Variable Subset Selection Program (VSSP). Once the VSSP was ready, it was used to evaluate two types of data. The first type of data was created using a known stochastic structure. Three sets of this data was used, each set using a different covariance structure between the responses and control variables. The second type of data was created from an untested simulation model.

Proceedings ArticleDOI
03 Apr 1990
TL;DR: A procedure for feature selection in isolated word recognition is discussed and the speech recognition results show a significant improvement in the recognition performance with a digit database and the confusable E-set.
Abstract: A procedure for feature selection in isolated word recognition is discussed. The feature selection is performed in two steps. The first step takes into account the temporal correlation among feature vectors in order to obtain a transformation matrix which projects the initial template of N feature vectors to a new space where they are uncorrelated. This step gives a new template of M feature vectors, where M >
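A sketch of the first step under the assumption that the temporal decorrelation is a Karhunen-Loeve projection of the N template vectors onto the leading eigenvectors of their temporal covariance; the transformation matrix in the paper may be derived differently.

```python
import numpy as np

def temporal_decorrelation(template, M):
    """Project a template of N feature vectors (rows) onto the M leading
    eigenvectors of their temporal covariance, giving M uncorrelated vectors."""
    centered = template - template.mean(axis=0)
    C = centered @ centered.T / template.shape[0]        # N x N temporal covariance
    eigvals, eigvecs = np.linalg.eigh(C)
    top = eigvecs[:, np.argsort(eigvals)[::-1][:M]]      # N x M projection matrix
    return top.T @ template                              # M new feature vectors
```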

Proceedings ArticleDOI
01 Mar 1990
TL;DR: This paper describes a system which has been developed based on the conjecture that humans learn to recognize objects by incrementally modifying and extending an internal representation based on the characteristics which distinguish objects from the rest of the environment.
Abstract: It is conjectured that humans learn to recognize objects by incrementally modifying and extending an internal representation based on the characteristics which distinguish objects from the rest of the environment. As new objects are encountered, it is often required to recall similar yet distinct objects and determine what differentiates the new objects from the old. Sometimes all that is required is to refine the allowable range for a particular feature, i.e., use a higher level of precision. Other times a previously useful feature must be discarded for a more powerful one in order to perform efficient recognition. This paper describes a system which has been developed based on this conjecture. Initially, a decision tree is created for recognizing a set of training objects using an automatically selected subset of extractable features. Factors involved in the creation include the cost of extraction and comparison of features, their discriminating strength within the domain, and the stability or separability of classes of objects using the features. The system then allows incremental, local modification of the tree to accommodate new objects or instances of old objects which were incorrectly identified. Various strategies for tree modification have been implemented, some of which guarantee the correct recognition of objects previously recognized and others which require some degree of retraining to maintain perfect recollection. Strategy selection is based on the technique which minimizes a metric based on the increased cost and complexity of the tree and the potential decrease in the stability of recognition.

01 Oct 1990
TL;DR: The PNN method has been successfully used to distinguish amongst the resonant sounds of five thin metal gongs of different regular shapes having the same areas and thicknesses and can be usefully generalised to other similar classification problems.
Abstract: This paper describes a general multiclass classification algorithm called the Probabilistic Neural Network (PNN) (Specht, 1988). Its decision surfaces approach the Bayes-optimal boundaries by non-parametric probability density function (PDF) estimation as the number of training samples grows. Theoretical and practical aspects of the PNN classification method are discussed, as well as its advantages and disadvantages in comparison to the Backpropagation Network (BN). The algorithm has been implemented primarily as a research tool for feature selection and classification for general Pattern Recognition (PR) problems. It will also be used for classifier-directed signal processing tasks. The method has been successfully used to distinguish amongst the resonant sounds of five thin metal gongs of different regular shapes having the same areas and thicknesses. This example application includes a description of data sampling, primary analysis, feature selection and classification which can be usefully generalised to other similar classification problems. Some other PNN applications investigated by the author and others mentioned in the literature are described to show some of the PNN's features and uses. These include a Bayes-optimal maximum-likelihood signal detector, ship hull classification from sonar signals and electrocardiogram classification from QRS complexes.
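A minimal PNN decision rule in the sense of Specht (1988): a Gaussian Parzen-window estimate of each class-conditional PDF, with classification to the largest estimate. The single shared smoothing parameter sigma is an assumption.

```python
import numpy as np

def pnn_predict(X_train, y_train, X_test, sigma=0.5):
    """PNN decision rule: Parzen (Gaussian-kernel) estimate of each class PDF
    at every test point, classify to the class with the largest estimate."""
    classes = np.unique(y_train)
    scores = np.zeros((len(X_test), len(classes)))
    for ci, c in enumerate(classes):
        Xc = X_train[y_train == c]
        d2 = ((X_test[:, None, :] - Xc[None, :, :]) ** 2).sum(-1)   # squared distances to exemplars
        scores[:, ci] = np.exp(-d2 / (2 * sigma ** 2)).mean(axis=1)
    return classes[np.argmax(scores, axis=1)]
```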

Proceedings ArticleDOI
05 Feb 1990
TL;DR: The Kittler-Young (K-Y) transform is a nonparametric method for feature extraction that ensures the information of class variances and mean squares is utilized optimally in feature selection.
Abstract: The Kittler-Young (K-Y) transform is a nonparametric method for feature extraction. The important property of the K-Y transform is that the information of class variances and mean squares is utilized optimally in feature selection. A joint transform correlator (JTC) is used to extract the features of the K-Y transform from input images optically. Making use of these features, classifications are performed on a microcomputer.

Journal ArticleDOI
TL;DR: A new method is proposed for determining feature selection ordering and a stopping rule which focuses on the remaining patterns at each stage, and which maximizes the value of mutual information between the user's responses and the required pattern.
Abstract: In designing systems with a human-computer interface with a minimum number of interactions, there are two issues to consider: the determination of a proper sequence of questions by the user, and proper termination by the computer system, based on previous instructions. In short, these issues are those of feature selection ordering and a stopping rule for pattern recognition processes. Conventional treatments of these problems have been investigated from the viewpoint of an average error rate. When the number of patterns is large, however, and many interactions are required before termination, the average error rate is not an effective indicator at intermediate stages from the point of view of user-interface efficiency. In this paper, a new method is proposed for determining feature selection ordering and a stopping rule which focuses on the remaining patterns at each stage, and which maximizes the value of mutual information between the user's responses and the required pattern. Important properties associated with this scheme have been demonstrated while evaluating its performance via computer simulation. One is that the average information gain at each stage decreases monotonically, and another is that this scheme produces the minimum error rate.
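A toy sketch of the question-ordering idea for binary attributes: with a uniform prior over the remaining candidate patterns, the mutual information between an answer and the target reduces to the entropy of the answer, so the next question is the attribute whose yes/no split of the remaining candidates is most informative. The attribute encoding and the stopping convention are assumptions.

```python
import numpy as np

def next_question(candidate_rows, attributes):
    """Pick the binary attribute whose yes/no answer is most informative about
    which remaining candidate pattern is the target (uniform prior, so the
    mutual information equals the entropy of the answer)."""
    best_attr, best_bits = None, 0.0
    for a in range(attributes.shape[1]):
        p = attributes[candidate_rows, a].mean()   # P(answer = yes | remaining candidates)
        if p in (0.0, 1.0):
            continue                               # answer predetermined: zero information
        bits = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
        if bits > best_bits:
            best_attr, best_bits = a, bits
    return best_attr                               # None signals that no question helps

# ask the chosen attribute, keep only candidates consistent with the answer,
# and repeat until a single pattern remains (the stopping rule)
```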

Journal ArticleDOI
TL;DR: The goal in this work has been to use variables’ contribution to clustering tendency to distinguish those that contribute to clusters from those variables that do not, and to choose the smallest subsets of variables that will support clustering.
Abstract: Methods for feature selection in cluster analysis are not yet well established, although research has demonstrated clearly that extraneous descriptors can mask natural clusters in data. The goal in...