
Showing papers on "Feature selection published in 1995"


Journal ArticleDOI
TL;DR: In this article, a Bayesian approach to hypothesis testing, model selection, and accounting for model uncertainty is presented; implementing it is straightforward through the use of the simple and accurate BIC approximation, and it can be done using the output from standard software.
Abstract: It is argued that P-values and the tests based upon them give unsatisfactory results, especially in large samples. It is shown that, in regression, when there are many candidate independent variables, standard variable selection procedures can give very misleading results. Also, by selecting a single model, they ignore model uncertainty and so underestimate the uncertainty about quantities of interest. The Bayesian approach to hypothesis testing, model selection, and accounting for model uncertainty is presented. Implementing this is straightforward through the use of the simple and accurate BIC approximation, and it can be done using the output from standard software. Specific results are presented for most of the types of model commonly used in sociology. It is shown that this approach overcomes the difficulties with P-values and standard model selection procedures based on them. It also allows easy comparison of nonnested models, and permits the quantification of the evidence for a null hypothesis of interest, such as a convergence theory or a hypothesis about societal norms.
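As an illustration of the BIC-based workflow the abstract describes, here is a minimal sketch (ours, not the authors') that scores two candidate linear regressions using only ordinary least-squares output; all variable names are illustrative.

```python
import numpy as np

def bic_linear(y, X):
    """BIC for a linear model fit by ordinary least squares:
    n*log(RSS/n) + k*log(n), up to an additive constant."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + k * np.log(n)

# Compare two candidate models: lower BIC = stronger evidence.
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=(2, 200))
y = 1.5 * x1 + rng.normal(size=200)
X_small = np.column_stack([np.ones(200), x1])       # true model
X_big = np.column_stack([np.ones(200), x1, x2])     # adds an irrelevant term
print(bic_linear(y, X_small), bic_linear(y, X_big))  # the smaller model should win
```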

6,100 citations


01 Mar 1995
TL;DR: This thesis applies neural network feature selection techniques to multivariate time series data to improve prediction of a target time series; results indicate that the Stochastics and RSI indicators yield better predictions than the moving averages.
Abstract: This thesis applies neural network feature selection techniques to multivariate time series data to improve prediction of a target time series. Two approaches to feature selection are used. First, a subset enumeration method is used to determine which financial indicators are most useful for aiding in prediction of the S&P 500 futures daily price. The candidate indicators evaluated include RSI, Stochastics and several moving averages. Results indicate that the Stochastics and RSI indicators yield better predictions than the moving averages. The second approach to feature selection is calculation of individual saliency metrics. A new decision boundary-based individual saliency metric and a classifier-independent saliency metric are developed and tested. Ruck's saliency metric, the decision boundary-based saliency metric, and the classifier-independent saliency metric are compared for a data set consisting of the RSI and Stochastics indicators as well as delayed closing price values. The decision boundary-based metric and the Ruck metric results are similar, but the classifier-independent metric agrees with neither of the other metrics. The nine most salient features, determined by the decision boundary-based metric, are used to train a neural network and the results are presented and compared to other published results.
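For readers unfamiliar with saliency metrics of this kind, below is a hedged sketch of a Ruck-style derivative saliency measure, estimated here by finite differences; the stand-in "network" `f` and all names are illustrative, and the thesis's decision boundary-based and classifier-independent metrics are not reproduced.

```python
import numpy as np

def ruck_style_saliency(predict, X, eps=1e-4):
    """Approximate saliency of each input as the average absolute partial
    derivative of the model output, estimated by central finite differences
    over the training set (one common reading of Ruck's metric)."""
    n, d = X.shape
    saliency = np.zeros(d)
    for j in range(d):
        Xp, Xm = X.copy(), X.copy()
        Xp[:, j] += eps
        Xm[:, j] -= eps
        saliency[j] = np.mean(np.abs(predict(Xp) - predict(Xm)) / (2 * eps))
    return saliency

# Toy usage with a stand-in "network": only the first feature matters.
f = lambda X: np.tanh(3.0 * X[:, 0])
X = np.random.default_rng(1).normal(size=(500, 4))
print(ruck_style_saliency(f, X))  # first entry should dominate
```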

1,545 citations


Proceedings ArticleDOI
05 Nov 1995
TL;DR: Chi2 is a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data, and achieves feature selection via discretization.
Abstract: Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant attributes. This paper describes Chi2, a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data, and achieves feature selection via discretization. The empirical results demonstrate that Chi2 is effective in feature selection and discretization of numeric and ordinal attributes.
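Chi2 builds on bottom-up χ²-based interval merging (with an automatically adjusted significance level and an inconsistency check that are omitted here). The simplified ChiMerge-style pass below, with illustrative names, shows only the core merging step: a numeric attribute whose values all collapse into a single interval carries no class information and can be discarded.

```python
import numpy as np
from scipy.stats import chi2 as chi2_dist

def chi2_adjacent(counts_a, counts_b):
    """Pearson chi-square statistic for two adjacent intervals (rows)
    over the per-class counts (columns)."""
    table = np.array([counts_a, counts_b], dtype=float)
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row * col / table.sum()
    expected[expected == 0] = 1e-9          # guard against division by zero
    return np.sum((table - expected) ** 2 / expected)

def merge_pass(intervals, threshold):
    """One bottom-up pass: repeatedly merge the adjacent pair with the
    smallest chi-square value while it stays below the threshold."""
    while len(intervals) > 1:
        stats = [chi2_adjacent(intervals[i], intervals[i + 1])
                 for i in range(len(intervals) - 1)]
        i = int(np.argmin(stats))
        if stats[i] >= threshold:
            break
        intervals[i] = [a + b for a, b in zip(intervals[i], intervals[i + 1])]
        del intervals[i + 1]
    return intervals

# Each interval holds per-class counts for a 2-class problem (df = classes - 1).
intervals = [[5, 0], [4, 1], [0, 6], [1, 5]]
print(merge_pass(intervals, threshold=chi2_dist.ppf(0.90, df=1)))
```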

960 citations


Book
01 Aug 1995
TL;DR: Descriptive statistics; the acquisition and enhancement of data; feature selection and extraction; pattern recognition - unsupervised analysis; pattern recognition - supervised learning; calibration and regression analysis; matrix tools and operations.
Abstract: Descriptive statistics; the acquisition and enhancement of data; feature selection and extraction; pattern recognition - unsupervised analysis; pattern recognition - supervised learning; calibration and regression analysis; matrix tools and operations.

429 citations


Proceedings Article
20 Aug 1995
TL;DR: This work introduces compound operators that dynamically change the topology of the search space to better utilize the information available from the evaluation of feature subsets and shows that compound operators unify previous approaches that deal with relevant and irrelevant features.
Abstract: In the wrapper approach to feature subset selection, a search for an optimal set of features is made using the induction algorithm as a black box. The estimated future performance of the algorithm is the heuristic guiding the search. Statistical methods for feature subset selection including forward selection, backward elimination, and their stepwise variants can be viewed as simple hill-climbing techniques in the space of feature subsets. We utilize best-first search to find a good feature subset and discuss overfitting problems that may be associated with searching too many feature subsets. We introduce compound operators that dynamically change the topology of the search space to better utilize the information available from the evaluation of feature subsets. We show that compound operators unify previous approaches that deal with relevant and irrelevant features. The improved feature subset selection yields significant improvements for real-world datasets when using the ID3 and the Naive-Bayes induction algorithms.
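Below is a minimal sketch of the simplest wrapper variant mentioned above: greedy forward selection guided by cross-validated accuracy with a Naive-Bayes inducer. The paper's best-first search and compound operators are not reproduced; names and parameters are illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def forward_selection(X, y, estimator=None, cv=5):
    """Greedy forward selection: at each step add the feature that most
    improves cross-validated accuracy; stop when no feature helps."""
    estimator = estimator or GaussianNB()
    remaining = list(range(X.shape[1]))
    selected, best_score = [], -np.inf
    while remaining:
        scores = [(cross_val_score(estimator, X[:, selected + [j]], y, cv=cv).mean(), j)
                  for j in remaining]
        score, j = max(scores)
        if score <= best_score:
            break
        selected.append(j)
        remaining.remove(j)
        best_score = score
    return selected, best_score
```

Backward elimination is the mirror image (start with all features, drop the least useful), and both are hill-climbers in the space of feature subsets, as the abstract notes.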

358 citations


Journal ArticleDOI
TL;DR: In this article, a predictive Bayesian viewpoint is advocated to avoid the specification of prior probabilities for the candidate models and the detailed interpretation of the parameters in each model, and using criteria derived from a certain predictive density and a prior specification that emphasizes the observables, they implement the proposed methodology for three common problems arising in normal linear models: variable subset selection, selection of a transformation of predictor variables and estimation of a parametric variance function.
Abstract: We consider the problem of selecting one model from a large class of plausible models. A predictive Bayesian viewpoint is advocated to avoid the specification of prior probabilities for the candidate models and the detailed interpretation of the parameters in each model. Using criteria derived from a certain predictive density and a prior specification that emphasizes the observables, we implement the proposed methodology for three common problems arising in normal linear models: variable subset selection, selection of a transformation of predictor variables and estimation of a parametric variance function. Interpretation of the relative magnitudes of the criterion values for various models is facilitated by a calibration of the criteria. Relationships between the proposed criteria and other well-known criteria are examined.

337 citations


Journal ArticleDOI
TL;DR: This method of determining input features is shown to result in more accurate, faster-training multilayer perceptron classifiers.

161 citations


Proceedings Article
J. Bala, J. Huang, H. Vafaie, K. De Jong, Harry Wechsler
20 Aug 1995
TL;DR: A hybrid learning methodology that integrates genetic algorithms (GAs) and decision tree learning (ID3) in order to evolve optimal subsets of discriminatory features for robust pattern classification is introduced.
Abstract: This paper introduces a hybrid learning methodology that integrates genetic algorithms (GAs) and decision tree learning (ID3) in order to evolve optimal subsets of discriminatory features for robust pattern classification. A GA is used to search the space of all possible subsets of a large set of candidate discrimination features. For a given feature subset, ID3 is invoked to produce a decision tree. The classification performance of the decision tree on unseen data is used as a measure of fitness for the given feature set, which, in turn, is used by the GA to evolve better feature sets. This GA-ID3 process iterates until a feature subset is found with satisfactory classification performance. Experimental results are presented which illustrate the feasibility of our approach on difficult problems involving recognizing visual concepts in satellite and facial image data. The results also show improved classification performance and reduced description complexity when compared against standard methods for feature selection.
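A toy sketch of the GA-wrapper idea follows, using scikit-learn's DecisionTreeClassifier as a stand-in for ID3 and cross-validated accuracy as the fitness of a feature-subset bit string. The genetic operators are deliberately simplified and all parameters are illustrative, not taken from the paper.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def ga_feature_search(X, y, pop=20, gens=30, p_mut=0.05, rng=None):
    """Toy GA over feature-subset bit strings; fitness is the
    cross-validated accuracy of a decision tree (stand-in for ID3)."""
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    genomes = rng.integers(0, 2, size=(pop, d))

    def fitness(g):
        if g.sum() == 0:
            return 0.0
        return cross_val_score(DecisionTreeClassifier(), X[:, g.astype(bool)], y, cv=3).mean()

    for _ in range(gens):
        scores = np.array([fitness(g) for g in genomes])
        order = np.argsort(scores)[::-1]
        parents = genomes[order[: pop // 2]]                   # truncation selection
        cut = rng.integers(1, d, size=pop // 2)
        children = np.array([np.concatenate([parents[i][:c],
                                             parents[(i + 1) % len(parents)][c:]])
                             for i, c in enumerate(cut)])       # one-point crossover
        children ^= (rng.random(children.shape) < p_mut)        # bit-flip mutation
        genomes = np.vstack([parents, children])
    scores = np.array([fitness(g) for g in genomes])
    return genomes[int(np.argmax(scores))]                      # best feature mask found
```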

139 citations


Journal ArticleDOI
TL;DR: This work shows that, in the authors' pattern classification problem, using a feature selection step reduced the number of features used, reduced the processing time requirements, and gave results comparable to the full set of features.
Abstract: In pattern classification problems, the choice of variables to include in the feature vector is a difficult one. The authors have investigated the use of stepwise discriminant analysis as a feature selection step in the problem of segmenting digital chest radiographs. In this problem, locally calculated features are used to classify pixels into one of several anatomic classes. The feature selection step was used to choose a subset of features which gave performance equivalent to the entire set of candidate features, while utilizing less computational resources. The impact of using the reduced/selected feature set on classifier performance is evaluated for two classifiers: a linear discriminator and a neural network. The results from the reduced/selected feature set were compared to that of the full feature set as well as a randomly selected reduced feature set. The results of the different feature sets were also compared after applying an additional postprocessing step which used a rule-based spatial information heuristic to improve the classification results. This work shows that, in the authors' pattern classification problem, using a feature selection step reduced the number of features used, reduced the processing time requirements, and gave results comparable to the full set of features.
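As a much simpler stand-in for the stepwise discriminant analysis used in the paper, the sketch below ranks features by a univariate Fisher ratio; it illustrates the general idea of discriminant-based feature screening only, not the authors' procedure.

```python
import numpy as np

def fisher_ratios(X, y):
    """Univariate Fisher ratio per feature: between-class variance of the
    class means over pooled within-class variance."""
    classes = np.unique(y)
    overall = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / np.maximum(within, 1e-12)

# Keep, say, the top-k features by ratio before training the pixel classifier.
```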

134 citations


Journal ArticleDOI
TL;DR: In this paper, a comparison of several calibration methods (principal component regression (PCR), partial least squares, multiple linear regression), with and without feature selection, applied to near-infrared spectroscopic data is presented for a pharmaceutical application.

108 citations


Journal ArticleDOI
TL;DR: In this paper, the authors used near-infrared spectroscopy to discriminate between different dosage strengths of tablets in blister packs. Three data transformation methods were studied; the second derivative appears to be the most effective transformation.

Book ChapterDOI
William W. Cohen
09 Jul 1995
TL;DR: It is shown that FOIL usually forms classifiers with lower error rates and higher rates of precision and recall with a relational encoding than with a propositional encoding, and its performance can be improved by relation selection, a first order analog of feature selection.
Abstract: We evaluate the first order learning system FOIL on a series of text categorization problems. It is shown that FOIL usually forms classifiers with lower error rates and higher rates of precision and recall with a relational encoding than with a propositional encoding. We show that FOIL's performance can be improved by relation selection, a first order analog of feature selection. Relation selection improves FOIL's performance as measured by any of recall, precision, F-measure, or error rate. With an appropriate level of relation selection, FOIL appears to be competitive with or superior to existing propositional techniques.

Journal ArticleDOI
TL;DR: A new method of feature selection based on the approximation of class conditional densities by a mixture of parameterized densities of a special type, suitable especially for multimodal data, is presented.

Journal ArticleDOI
TL;DR: In this paper, a method of estimating linear model dimension and variable selection is proposed based on a new class of penalty functions and a procedure of sorting covariates based on t-statistics.
Abstract: A method of estimating linear model dimension and variable selection is proposed. This new criterion, which generalizes the Cp criterion, the Akaike information criterion (AIC), the Bayes information criterion, and the φ criterion and is consistent under certain conditions, is based on a new class of penalty functions and a procedure of sorting covariates based on t-statistics. In the course of introducing this method, we discuss the important role of the penalty function in the consistency of model dimension estimation and in variable selection. The proposed method requires less computation than resampling-based methods that search over all subsets of covariates for the true model. Simulation results show that the new method is superior to the Cp criterion and AIC in finite-sample situations as well.
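For orientation (notation ours, not the paper's), the classical criteria being generalized all trade goodness of fit against a penalty on the model dimension k, for n observations and residual sum of squares RSS_k; the paper's contribution is to replace the fixed per-parameter penalty with a broader class of penalty functions.

```latex
C_p = \frac{\mathrm{RSS}_k}{\hat{\sigma}^2} + 2k - n,
\qquad
\mathrm{AIC} = n \log\!\left(\frac{\mathrm{RSS}_k}{n}\right) + 2k,
\qquad
\mathrm{BIC} = n \log\!\left(\frac{\mathrm{RSS}_k}{n}\right) + k \log n .
```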


Proceedings ArticleDOI
05 Nov 1995
TL;DR: The approach involves the use of genetic algorithms as a "front end" to a traditional tree induction system (ID3) in order to find the best feature set to be used by the induction system.
Abstract: This paper describes an approach being explored to improve the usefulness of machine learning techniques to classify complex, real world data. The approach involves the use of genetic algorithms as a "front end" to a traditional tree induction system (ID3) in order to find the best feature set to be used by the induction system. This approach has been implemented and tested on difficult texture classification problems. The results are encouraging and indicate significant advantages of the presented approach.

Book ChapterDOI
09 Jul 1995
TL;DR: An inductive learning method based on linear programming that predicts recurrence times using censored training examples, that is, examples in which the available training output may be only a lower bound on the “right answer.”
Abstract: This paper introduces the Recurrence Surface Approximation, an inductive learning method based on linear programming that predicts recurrence times using censored training examples, that is, examples in which the available training output may be only a lower bound on the “right answer.” This approach is augmented with a feature selection method that chooses an appropriate feature set within the context of the linear programming generalizer. Computational results in the field of breast cancer prognosis are shown. A straightforward translation of the prediction method to an artificial neural network model is also proposed.

Proceedings ArticleDOI
09 Apr 1995
TL;DR: This work demonstrates a technique called "sensitivity-based pruning" (SBP) that removes irrelevant input variables from a nonlinear forecasting or regression model; it makes use of a saliency measure computed for each input variable and uses estimates of prediction risk to determine the number of input variables to prune.
Abstract: Selecting a "best subset" of input variables is a critical issue in forecasting. This is especially true when the number of available input series is large, and an exhaustive search through all combinations of variables is computationally infeasible. Inclusion of irrelevant variables not only doesn't help prediction, but can reduce forecasting accuracy through added noise or systematic bias. We demonstrate a technique called "sensitivity-based pruning" (SBP) that removes irrelevant input variables from a nonlinear forecasting or regression model. The technique makes use of a saliency measure computed for each input variable and uses estimates of prediction risk for determining the number of input variables to prune. We present preliminary results of the SBP technique applied to neural network predictors of a key business cycle measure, the US Index of Industrial Production.

Dissertation
01 Jan 1995
TL;DR: Wavelet neural networks are introduced as a new class of elliptic basis function neural networks and wavelet networks, and are applied to the numerical modeling and classification of EEGs, finding them to be ideally suited for problems of EEG analysis.
Abstract: Wavelet neural networks (WNNs) are introduced as a new class of elliptic basis function neural networks and wavelet networks, and are applied to the numerical modeling and classification of EEGs. The implementation of the networks is achieved in two possibly cyclical stages of structure and parameter identification. For structure identification, two methods are developed: one generic, based on data clusterings, and one specific, using wavelet analysis. For parameter identification, two methods are also implemented: the Levenberg-Marquardt algorithm and a genetic algorithm of ranking type. The problem of model generalization is considered from both a cross-validation and a regularization point of view. For the latter, a corrected average squared error (CASE) is derived as a new model selection criterion that does not rely on assumptions about error distributions or modeling paradigms. For EEG modeling, the nonlinear dynamics framework is employed in the reconstruction of state-spaces via the embedding scheme. Preprocessing for the resulting state-vector is introduced in terms of decorrelation and compression. The naive application of chaos theory to EEGs is shown to be useful in feature extraction, but not in corroborating theories about the nature of EEGs. For the latter, the concept of modeling resolution is introduced. It is shown that the chaos-in-the-brain question becomes meaningful only as a function of modeling resolution. For EEG classification, a general WNN classification system is implemented as a cascade of synergistic feature selection, WNN nonlinear discrimination, and decision logic. A feature library is described including raw and model-based features, ranging from traditional measures to chaotic indicators. Training for maximum-likelihood classification is shown to be inductively feasible via a decoder-type WNN classifier adjusted with nonanalytic methods. WNNs were found to be ideally suited for problems of EEG analysis due to the long-duration/low-frequency and short-duration/high-frequency structure of EEG signals.

Journal ArticleDOI
TL;DR: This paper presents some of the computer vision techniques employed to automatically select features, measure features' displacements, and evaluate measurements during robotic visual servoing tasks; the most robust proved to be the Sum-of-Squared Differences (SSD) optical flow technique.
Abstract: This paper presents some of the computer vision techniques that were employed in order to automatically select features, measure features' displacements, and evaluate measurements during robotic visual servoing tasks. We experimented with many different techniques, but the most robust proved to be the Sum-of-Squared Differences (SSD) optical flow technique. In addition, several techniques for the evaluation of the measurements are presented. One important characteristic of these techniques is that they can also be used for the selection of features for tracking in conjunction with several numerical criteria that guarantee the robustness of the servoing. These techniques are important aspects of our work since they can be used either on-line or off-line. An extension of the SSD measure to color images is presented and the results from the application of these techniques to real images are discussed. Finally, the derivation of depth maps through the controlled motion of the hand-eye system is outlined and the important role of the automatic feature selection algorithm in the accurate computation of the depth-related parameters is highlighted.
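A brute-force sketch of SSD displacement measurement for a single tracked window follows (illustrative only; the paper's servoing pipeline, confidence measures and color extension are not reproduced).

```python
import numpy as np

def ssd_displacement(prev, curr, y, x, win=7, search=10):
    """Track the window centered at (y, x) in `prev` by finding the
    integer displacement in `curr` that minimizes the sum of squared
    differences. Assumes (y, x) is far enough from the image border
    for the reference window to fit."""
    h = win // 2
    template = prev[y - h:y + h + 1, x - h:x + h + 1].astype(float)
    best, best_d = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            patch = curr[y + dy - h:y + dy + h + 1,
                         x + dx - h:x + dx + h + 1].astype(float)
            if patch.shape != template.shape:
                continue                      # candidate window fell outside the image
            ssd = np.sum((patch - template) ** 2)
            if ssd < best:
                best, best_d = ssd, (dy, dx)
    return best_d, best                       # displacement and residual SSD
```

The same residual can be reused as a rough confidence score when choosing which windows are worth tracking, which is the spirit of the automatic feature selection the abstract mentions.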

01 Apr 1995
TL;DR: This paper suggests a very simple feature subset selection algorithm that gives good performance across different supervised learning schemes and compares well against one of the most common methods for feature subset selection.
Abstract: It has been our experience that in order to obtain useful results using supervised learning of real-world datasets it is necessary to perform feature subset selection and to perform many experiments using computed aggregates from the most relevant features. It is, therefore, important to look for selection algorithms that work quickly and accurately so that these experiments can be performed in a reasonable length of time, preferably interactively. This paper suggests a method to achieve this using a very simple algorithm that gives good performance across different supervised learning schemes and when compared to one of the most common methods for feature subset selection.

Journal ArticleDOI
Rong Chen1
TL;DR: A digression concept is introduced and two simple algorithms to classify the observations without knowing the threshold variable are proposed and used with several graphical procedures to search for the most suitable threshold variable.
Abstract: An open-loop threshold autoregressive model switches among autoregressive regimes according to the value of a threshold variable Zt. The main difficulty in building such a model is that the threshold variable Zt is usually unknown. In practice, there may exist many possible candidates for the threshold variable Zt. It is difficult and tedious, if not impossible, to search for the best among all the candidates using standard model selection procedures. In this paper, we introduce a digression concept and propose two simple algorithms to classify the observations without knowing the threshold variable. The classification is then used with several graphical procedures to search for the most suitable threshold variable. Simulated and real examples are included to illustrate the proposed procedures.

Proceedings Article
20 Aug 1995
TL;DR: It is proved that the stepwise backward selection algorithm finds a small subset of relevant features that are ideally sufficient and necessary to define target concepts with respect to a given threshold.
Abstract: In this paper, we investigate enhancements to an upper classifier - a decision algorithm generated by an upper classification method, which is one of the classification methods in rough set theory. Specifically, we consider two enhancements. First, we present a stepwise backward feature selection algorithm to preprocess a given set of features. This is important because rough classification methods are incapable of removing superfluous features. We prove that the stepwise backward selection algorithm finds a small subset of relevant features that are ideally sufficient and necessary to define target concepts with respect to a given threshold. This threshold value indicates an acceptable degradation in the quality of an upper classifier. Second, to make an upper classifier adaptive, we associate it with some kind of frequency information, which we call incremental information. An extended decision table is used to represent an adaptive upper classifier. It is also used for interpreting an upper classifier either deterministically or nondeterministically.
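A simplified reading of the backward selection idea is sketched below: keep dropping features while a rough-set quality-of-classification measure stays within a threshold of the full-feature value. The purity-based quality measure and all names are illustrative, not the paper's exact definitions.

```python
from collections import defaultdict

def approximation_quality(rows, labels, feats):
    """Rough-set style quality of classification: fraction of objects whose
    equivalence class (on the chosen features) is pure in the decision."""
    groups = defaultdict(set)
    for row, lab in zip(rows, labels):
        groups[tuple(row[f] for f in feats)].add(lab)
    pure = sum(1 for row, lab in zip(rows, labels)
               if len(groups[tuple(row[f] for f in feats)]) == 1)
    return pure / len(rows)

def backward_select(rows, labels, threshold=0.0):
    """Stepwise backward elimination: drop a feature as long as the quality
    does not fall more than `threshold` below the full-set quality."""
    feats = list(range(len(rows[0])))
    full = approximation_quality(rows, labels, feats)
    changed = True
    while changed and len(feats) > 1:
        changed = False
        for f in list(feats):
            rest = [g for g in feats if g != f]
            if approximation_quality(rows, labels, rest) >= full - threshold:
                feats = rest
                changed = True
                break
    return feats
```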

Patent
02 Mar 1995
TL;DR: In this paper, an image recognition and classification system includes a preprocessor in which a "top-down" method is used to extract features from an image; an associative learning neural network system, which groups the features into patterns and classifies the patterns; and a feedback mechanism which improves system performance by tuning preprocessor scale, feature detection, and feature selection.
Abstract: An image recognition and classification system includes a preprocessor in which a "top-down" method is used to extract features from an image; an associative learning neural network system, which groups the features into patterns and classifies the patterns; and a feedback mechanism which improves system performance by tuning preprocessor scale, feature detection, and feature selection.

Journal ArticleDOI
TL;DR: This review focuses on the last decade's development of the computational stereopsis for recovering three-dimensional information, and special attention is paid to parallelism as a way to achieve the required level of performance.
Abstract: This review focuses on the last decade's development of the computational stereopsis for recovering three-dimensional information. The main components of the stereo analysis are exposed: image acquisition and camera modeling, feature selection, feature matching and disparity interpretation. A brief survey is given of the well known feature selection approaches and the estimation parameters for this selection are mentioned. The difficulties in identifying correspondent locations in the two images are explained. Methods as to how effectively to constrain the search for the correct solution of the correspondence problem are discussed, as are strategies for the whole matching process. Reasons for the occurrence of matching errors are considered. Some recently proposed approaches, employing new ideas in the modeling of stereo matching in terms of energy minimization, are described. Acknowledging the importance of computation time for real-time applications, special attention is paid to parallelism as a way to achieve the required level of performance. The development of trinocular stereo analysis as an alternative to the conventional binocular one is described. Finally, a classification based on the test images for verification of the stereo matching algorithms is supplied.

Proceedings ArticleDOI
26 Jun 1995
TL;DR: It is shown that optimal electrode positions as well as frequency bands are strongly dependent on each subject and that subject-specific feature selection is very important for BCI systems.
Abstract: This paper describes a simple but very powerful method for feature selection. The Distinction Sensitive Learning Vector Quantizer (DSLVQ) is a learning classifier which focuses on relevant features according to its own instance-based classifications. Two different experiments describe the application of DSLVQ as a feature selector for an EEG-based Brain Computer Interface (BCI) system. It is shown that optimal electrode positions as well as frequency bands are strongly dependent on each subject and that subject-specific feature selection is very important for BCI systems.

Journal ArticleDOI
TL;DR: Several methods have been proposed in recent years for selecting variables in such mixed-variable discrimination situations, and a brief review of the possibilities can be found in this paper, where some of the methods are simply variations on the same basic underlying model (the location model).

Proceedings ArticleDOI
22 Oct 1995
TL;DR: Using a technique called projection pursuit, a pre-processing dimensional reduction method has been developed based on the optimization of a projection index; a method to estimate an initial value that can more quickly lead to the global maximum is presented.
Abstract: Supervised classification techniques use labeled samples to train the classifier. Often the number of such samples is limited, thus limiting the precision with which class characteristics can be estimated. As the number of spectral bands becomes large, the limitation on performance imposed by the limited number of training samples can become severe. Such consequences suggest the value of reducing the dimensionality by a pre-processing method that takes advantage of the asymptotic normality of projected data. Using a technique called projection pursuit, a pre-processing dimensional reduction method has been developed based on the optimization of a projection index. A method to estimate an initial value that can more quickly lead to the global maximum is presented for projection pursuit using the Bhattacharyya distance as the projection index.
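A hedged sketch of the projection index for the two-class, one-dimensional case follows: the Bhattacharyya distance between the projected class distributions under a Gaussian assumption, maximized here by a crude random search rather than the paper's optimization with an informed starting value.

```python
import numpy as np

def bhattacharyya_1d(a, b):
    """Bhattacharyya distance between two classes after projection,
    treating each projected class as (approximately) Gaussian."""
    m1, m2 = a.mean(), b.mean()
    v1, v2 = a.var(), b.var()
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * np.log((v1 + v2) / (2 * np.sqrt(v1 * v2))))

def projection_index(w, X1, X2):
    w = w / np.linalg.norm(w)
    return bhattacharyya_1d(X1 @ w, X2 @ w)

# Crude search over random directions, standing in for a proper optimizer.
rng = np.random.default_rng(0)
X1 = rng.normal([0, 0, 0], 1.0, size=(200, 3))
X2 = rng.normal([2, 0, 0], 1.0, size=(200, 3))
dirs = rng.normal(size=(500, 3))
best = max(dirs, key=lambda w: projection_index(w, X1, X2))
print(best / np.linalg.norm(best))  # should point roughly along the first axis (up to sign)
```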

Posted Content
TL;DR: In this article, a Bayesian approach that goes beyond the standard independence prior for variable selection is adopted, and preference for certain models is interpreted as prior information, which may then be incorporated in a model selection procedure.
Abstract: In data sets with many predictors, algorithms for identifying a good subset of predictors are often used. Most such algorithms do not account for any relationships between predictors. For example, stepwise regression might select a model containing an interaction AB but neither main effect A nor B. This paper develops mathematical representations of this and other relations between predictors, which may then be incorporated in a model selection procedure. A Bayesian approach that goes beyond the standard independence prior for variable selection is adopted, and preference for certain models is interpreted as prior information. Priors relevant to arbitrary interactions and polynomials, dummy variables for categorical factors, competing predictors, and restrictions on the size of the models are developed. Since the relations developed are for priors, they may be incorporated in any Bayesian variable selection algorithm for any type of linear model. The application of the methods here is illustrated via the Stochastic Search Variable Selection algorithm of George and McCulloch (1993), which is modified to utilize the new priors. The performance of the approach is illustrated with two constructed examples and a computer performance dataset. Keywords: Model Selection, Prior Distributions, Interaction, Dummy Variable
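As a toy illustration of one such relation (ours, not the paper's construction), here is a strong-heredity indicator prior that assigns zero mass to any model containing an interaction without both parent main effects; in a stochastic-search setting it would simply enter as a factor in the model prior.

```python
def heredity_prior(model, interactions):
    """Strong-heredity prior: an interaction term gets prior support only
    if both of its parent main effects are in the model; otherwise zero."""
    included = set(model)
    for term, (a, b) in interactions.items():
        if term in included and not ({a, b} <= included):
            return 0.0
    return 1.0

# Example: the model {AB} without A or B gets zero prior mass.
interactions = {"AB": ("A", "B")}
print(heredity_prior({"A", "B", "AB"}, interactions))  # 1.0
print(heredity_prior({"AB"}, interactions))            # 0.0
```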

Proceedings ArticleDOI
27 Nov 1995
TL;DR: A hierarchical clustering mechanism, arboART, builds a tree-structured graph of the sample classification results and finds the features of each cluster; it is applied to automatic rule generation for Kansei engineering expert systems.
Abstract: A hierarchical clustering mechanism based on ART-type neural networks is designed for analyzing multidimensional data and for feature selection. Prototypes of clusters obtained from an ART's top-down vectors are sent to another ART. Several ART networks with different similarity criteria are used for cluster combination. This scheme of hierarchical clustering (arboART) makes it possible to build a tree-structured graph of the sample classification results and to find the features of each cluster. arboART is applied to automatic rule generation for Kansei engineering expert systems. Results of a color evaluation experiment analyzed with arboART and a comparison with conventional multivariate analysis are shown.