
Showing papers on "Dimensionality reduction published in 1996"


Proceedings Article
03 Jul 1996
TL;DR: An efficient algorithm for feature selection which computes an approximation to the optimal feature selection criterion is given, showing that the algorithm effectively handles datasets with a very large number of features.
Abstract: In this paper, we examine a method for feature subset selection based on Information Theory. Initially, a framework for defining the theoretically optimal, but computationally intractable, method for feature subset selection is presented. We show that our goal should be to eliminate a feature if it gives us little or no additional information beyond that subsumed by the remaining features. In particular, this will be the case for both irrelevant and redundant features. We then give an efficient algorithm for feature selection which computes an approximation to the optimal feature selection criterion. The conditions under which the approximate algorithm is successful are examined. Empirical results are given on a number of data sets, showing that the algorithm effectively handles datasets with a very large number of features.

1,713 citations
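The paper's exact criterion is based on Markov blankets and conditional cross-entropy; as a rough, hedged illustration of the underlying idea (drop a feature whose class information is already subsumed by the remaining features), the sketch below removes a discrete feature whenever its conditional mutual information with the class, given some single kept feature, is near zero. The function names, the single-feature conditioning, and the threshold are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def mutual_info(x, y):
    """I(X;Y) in nats for discrete 1-D integer arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi

def backward_eliminate(X, y, eps=1e-3):
    """Drop feature j whenever some kept feature k leaves it with (almost)
    no extra class information, i.e. I(X_j; y | X_k) is approximately 0."""
    keep = list(range(X.shape[1]))
    changed = True
    while changed:
        changed = False
        for j in list(keep):
            for k in keep:
                if k == j:
                    continue
                # encode the pair (X_j, X_k) as a single discrete symbol
                joint = X[:, j] * (X[:, k].max() + 1) + X[:, k]
                cond_mi = mutual_info(joint, y) - mutual_info(X[:, k], y)
                if cond_mi < eps:
                    keep.remove(j)
                    changed = True
                    break
    return keep
```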


21 May 1996
TL;DR: This work presents an exact Expectation-Maximization algorithm for determining the parameters of this mixture of factor analyzers which concurrently performs clustering and dimensionality reduction, and can be thought of as a reduced dimension mixture of Gaussians.
Abstract: Factor analysis, a statistical method for modeling the covariance structure of high dimensional data using a small number of latent variables, can be extended by allowing different local factor models in different regions of the input space. This results in a model which concurrently performs clustering and dimensionality reduction, and can be thought of as a reduced dimension mixture of Gaussians. We present an exact Expectation-Maximization algorithm for fitting the parameters of this mixture of factor analyzers.

705 citations
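The EM updates themselves are given in the paper; the snippet below is only a hedged sketch of the model being fitted, i.e. sampling from a mixture of factor analyzers (a reduced-dimension mixture of Gaussians) in which component j generates x = Lambda_j z + mu_j + eps with diagonal noise. All dimensions and parameter values are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, J, N = 10, 2, 3, 500              # observed dim, latent dim, components, samples
pis = rng.dirichlet(np.ones(J))         # mixing proportions
mus = rng.normal(size=(J, D))           # component means
Lambdas = rng.normal(size=(J, D, d))    # factor loading matrices, one per component
psi = 0.1 * np.ones(D)                  # diagonal noise variances (shared here)

comp = rng.choice(J, size=N, p=pis)     # pick a local factor model per sample
Z = rng.normal(size=(N, d))             # latent factors
noise = rng.normal(scale=np.sqrt(psi), size=(N, D))
X = (Lambdas[comp] @ Z[:, :, None]).squeeze(-1) + mus[comp] + noise
```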


Journal ArticleDOI
TL;DR: A methodological framework is developed and algorithms that employ two types of feature-based compact representations; that is, representations that involve feature extraction and a relatively simple approximation architecture are developed.
Abstract: We develop a methodological framework and present a few different ways in which dynamic programming and compact representations can be combined to solve large scale stochastic control problems. In particular, we develop algorithms that employ two types of feature-based compact representations; that is, representations that involve feature extraction and a relatively simple approximation architecture. We prove the convergence of these algorithms and provide bounds on the approximation error. As an example, one of these algorithms is used to generate a strategy for the game of Tetris. Furthermore, we provide a counter-example illustrating the difficulties of integrating compact representations with dynamic programming, which exemplifies the shortcomings of certain simple approaches.

527 citations
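As a toy illustration (not the paper's Tetris controller or its error bounds), the sketch below runs approximate value iteration on a random MDP with a feature-based linear architecture V(s) ≈ θᵀφ(s): each iteration computes a Bellman backup and projects it back onto the feature space by least squares. As the paper's counter-example warns, such iterations are not guaranteed to converge in general.

```python
import numpy as np

rng = np.random.default_rng(1)
S, A, K = 50, 4, 6                          # states, actions, number of features
P = rng.dirichlet(np.ones(S), size=(S, A))  # transition probabilities P[s, a, s']
R = rng.normal(size=(S, A))                 # one-step rewards
Phi = rng.normal(size=(S, K))               # feature matrix; row s is phi(s)
gamma, theta = 0.95, np.zeros(K)

for _ in range(100):
    V = Phi @ theta                         # compact value estimate
    Q = R + gamma * (P @ V)                 # Bellman backup, shape (S, A)
    targets = Q.max(axis=1)                 # greedy one-step lookahead values
    # project the backed-up values onto the feature-based architecture
    theta, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
```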


Proceedings ArticleDOI
18 Jun 1996
TL;DR: An algorithm to automatically construct detectors for arbitrary parametric features is proposed and the results of detailed experiments are presented which demonstrate the robustness of feature detection and the accuracy of parameter estimation.
Abstract: We propose an algorithm to automatically construct feature detectors for arbitrary parametric features. To obtain a high level of robustness we advocate the use of realistic multi-parameter feature models and incorporate optical and sensing effects. Each feature is represented as a densely sampled parametric manifold in a low dimensional subspace of a Hilbert space. During detection, the brightness distribution around each image pixel is projected into the subspace. If the projection lies sufficiently close to the feature manifold, the feature is detected and the location of the closest manifold point yields the feature parameters. The concepts of parameter reduction by normalization, dimension reduction, pattern rejection, and heuristic search are all employed to achieve the required efficiency. By applying the algorithm to appropriate parametric feature models, detectors have been constructed for five features, namely, step edge, roof edge, line, corner, and circular disc. Detailed experiments are reported on the robustness of detection and the accuracy of parameter estimation.

139 citations
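A hedged, one-parameter toy version of the detection scheme described above: densely sample a parametric step-edge model, keep a low-dimensional PCA subspace, and detect by projecting a patch and checking its distance to the nearest sampled manifold point. The patch model, subspace dimension, and tolerance are illustrative assumptions, not the paper's feature models.

```python
import numpy as np

def step_patch(theta, size=9):
    """Synthetic step-edge patch at orientation theta (assumed toy model)."""
    y, x = np.mgrid[-1:1:size*1j, -1:1:size*1j]
    return (np.cos(theta) * x + np.sin(theta) * y > 0).astype(float).ravel()

thetas = np.linspace(0, np.pi, 180, endpoint=False)
M = np.stack([step_patch(t) for t in thetas])        # densely sampled manifold
mean = M.mean(axis=0)
_, _, Vt = np.linalg.svd(M - mean, full_matrices=False)
B = Vt[:8]                                           # 8-D subspace basis
manifold = (M - mean) @ B.T                          # manifold coordinates

def detect(patch, tol=1.0):
    c = (patch.ravel() - mean) @ B.T                 # project the image patch
    d = np.linalg.norm(manifold - c, axis=1)
    j = d.argmin()
    return thetas[j] if d[j] < tol else None         # parameter of closest point

print(detect(step_patch(thetas[50])))                # recovers an orientation near thetas[50]
```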


Proceedings Article
04 Jun 1996
TL;DR: A probabilistic wrapper model is proposed as an alternative to exhaustive search and heuristic approaches; it aims to avoid local minima and exhaustive search and can be used to improve the predictive accuracy of an induction algorithm.
Abstract: Feature selection is defined as the problem of finding a minimum set of M features with which an inductive algorithm achieves the highest predictive accuracy on data described by the original N features, where M ≤ N. A probabilistic wrapper model is proposed as another method besides exhaustive search and the heuristic approach. The aim of this model is to avoid local minima and exhaustive search. The highest predictive accuracy is the criterion in the search for the smallest M. Analysis and experiments show that this model can effectively find relevant features and remove irrelevant ones in the context of improving the predictive accuracy of an induction algorithm. It is simple and straightforward, and provides fast solutions while searching for the optimal one. Applications of such a model, its future work and some related issues are also discussed.

138 citations
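A minimal sketch of a probabilistic (random-search) wrapper in the spirit of the abstract, assuming a decision-tree classifier, 5-fold cross-validated accuracy as the criterion, and smaller-subset tie-breaking; these choices are assumptions, not the authors' exact settings.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def probabilistic_wrapper(X, y, n_trials=200, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    best_acc, best_subset = -np.inf, list(range(n_features))
    for _ in range(n_trials):
        mask = rng.random(n_features) < 0.5          # random candidate subset
        subset = np.flatnonzero(mask)
        if subset.size == 0:
            continue
        acc = cross_val_score(DecisionTreeClassifier(random_state=0),
                              X[:, subset], y, cv=5).mean()
        # prefer higher accuracy; break ties with the smaller subset
        if acc > best_acc or (acc == best_acc and subset.size < len(best_subset)):
            best_acc, best_subset = acc, list(subset)
    return best_subset, best_acc
```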


Proceedings ArticleDOI
25 Aug 1996
TL;DR: The results show that the sequential forward floating selection (SFFS) algorithm, proposed by Pudil et al. (1994), dominates the other algorithms tested, and illustrates the dangers of using feature selection in small sample size situations.
Abstract: A large number of algorithms have been proposed for doing feature subset selection. The goal of this paper is to evaluate the quality of the feature subsets generated by the various algorithms, and also to compare their computational requirements. Our results show that the sequential forward floating selection (SFFS) algorithm, proposed by Pudil et al. (1994), dominates the other algorithms tested. This paper also illustrates the dangers of using feature selection in small sample size situations. It gives the results of applying feature selection to land use classification of SAR satellite images using four different texture models. Pooling features derived from different texture models, followed by feature selection, results in a substantial improvement in the classification accuracy. Application of feature selection to classification of handprinted characters illustrates the value of feature selection in reducing the number of features needed for classifier design.

121 citations
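For reference, a simplified sketch of sequential forward floating selection (the winning algorithm above): greedy inclusion, followed by conditional exclusion whenever dropping a feature beats the best subset previously recorded at that size. J is any subset criterion (e.g. cross-validated accuracy); bookkeeping details of the original Pudil et al. formulation are trimmed for brevity.

```python
def sffs(J, n_features, target_size):
    """Simplified sequential forward floating selection over feature indices."""
    current, best_of_size = [], {}
    while len(current) < target_size:
        # inclusion: add the single feature that helps the criterion most
        f = max((f for f in range(n_features) if f not in current),
                key=lambda f: J(current + [f]))
        current = current + [f]
        best_of_size[len(current)] = max(best_of_size.get(len(current), float("-inf")),
                                         J(current))
        # conditional exclusion: drop features while that beats the best
        # subset previously found at the smaller size
        while len(current) > 2:
            g = max(current, key=lambda g: J([x for x in current if x != g]))
            reduced = [x for x in current if x != g]
            if J(reduced) > best_of_size.get(len(reduced), float("-inf")):
                current = reduced
                best_of_size[len(current)] = J(current)
            else:
                break
    return current
```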


Patent
Masaki Souma, Kenji Nagao
12 Dec 1996
TL;DR: In this paper, a feature extraction system for statistically analyzing a set of samples of feature vectors to calculate a feature being an index for a pattern identification, which is capable of identifying confusing data with a high robustness.
Abstract: A feature extraction system for statistically analyzing a set of samples of feature vectors to calculate a feature being an index for a pattern identification, which is capable of identifying confusing data with a high robustness. In this system, a storage section stores a feature vector inputted through an input section and a neighborhood vector selection section selects a specific feature vector from the feature vectors existing in the storage section. The specific feature is a neighborhood vector close in distance to the feature vector stored in the storage section. Further, the system is equipped with a feature vector space production section for outputting a partial vector space. The partial vector space is made to maximize the local scattering property of the feature vector when the feature vector is orthogonally projected to that space.

105 citations


Patent
29 Mar 1996
TL;DR: In this patent, local image sampling, a self-organizing map neural network, and a hybrid convolutional neural network are combined for object recognition, providing dimensionality reduction and partial invariance to translation, rotation, scale, and deformation.
Abstract: A hybrid neural network system for object recognition exhibiting local image sampling, a self-organizing map neural network, and a hybrid convolutional neural network. The self-organizing map provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample, and the hybrid convolutional neural network provides for partial invariance to translation, rotation, scale, and deformation. The hybrid convolutional network extracts successively larger features in a hierarchical set of layers. Alternative embodiments using the Karhunen-Loeve transform in place of the self-organizing map, and a multi-layer perceptron in place of the convolutional network are described.

90 citations


Journal ArticleDOI
TL;DR: A selection of classifiers and a selection of dimensionality-reducing techniques are applied to the discrimination of seagrass spectral data, and the results indicate a promising future for wavelets in discriminant analysis and for the recently introduced flexible and penalized discriminant analysis.

83 citations


Proceedings ArticleDOI
12 Nov 1996
TL;DR: It is shown that for even moderately large databases (in fact, only 1856 texture images), these approaches do not scale well for exact retrieval, but as a browsing tool, these dimensionality reduction techniques hold much promise.
Abstract: The management of large image databases poses several interesting and challenging problems. These problems range from ingesting the data and extracting meta-data to the efficient storage and retrieval of the data. Of particular interest are the retrieval methods and user interactions with an image database during browsing. In image databases, the response to a given query is not an exact well-defined set, rather, the user poses a query and expects a set of responses that should contain many possible candidates from which the user chooses the answer set. We first present the browsing model in Alexandria, a digital library for maps and satellite images. Designed for content-based retrieval, the relevant information in an image is encoded in the form of a multi-dimensional feature vector. Various techniques have been previously proposed for the efficient retrieval of such vectors by reducing the dimensionality of such vectors. We show that for even moderately large databases (in fact, only 1856 texture images), these approaches do not scale well for exact retrieval. However, as a browsing tool, these dimensionality reduction techniques hold much promise.

72 citations


Book ChapterDOI
01 Jan 1996
TL;DR: The new approach selects a subset of features that maximizes predictive accuracy prior to the network learning phase, and generates networks that are computationally simpler to evaluate and display predictive accuracy comparable to that of Bayesian networks which model all attributes.
Abstract: This paper introduces a novel enhancement for learning Bayesian networks with a bias for small, high-predictive-accuracy networks. The new approach selects a subset of features that maximizes predictive accuracy prior to the network learning phase. We examine explicitly the effects of two aspects of the algorithm, feature selection and node ordering. Our approach generates networks that are computationally simpler to evaluate and display predictive accuracy comparable to that of Bayesian networks which model all attributes.

03 Oct 1996
TL;DR: Local models or Gaussian mixture models can be efficient tools for dimension reduction, exploratory data analysis, feature extraction, classification and regression, and proposed algorithms for regularizing them are presented.
Abstract: In this dissertation, we present local linear models for dimension reduction and Gaussian mixture models for classification and regression. When the data has different structure in different parts of the input space, fitting one global model can be slow and inaccurate. Simple learning models can quickly learn the structure of the data in small (local) regions. Thus, local learning techniques can offer us faster and more accurate model fitting. Gaussian mixture models form a soft local model of the data; data points belong to all "local" regions (Gaussians) at once with differing degrees of membership. Thus, mixture models blend together the different (local) models. We show that local linear dimension reduction approximates maximum likelihood signal extraction for a mixture-of-Gaussians signal-plus-noise model. The thesis of this document is that "local learning models can perform efficient (fast and accurate) data processing". We propose local linear dimension reduction algorithms which partition the input space and build separate low dimensional coordinate systems in disjoint regions of the input space. We compare the local linear models with a global linear model (principal components analysis) and a global non-linear model (five-layered auto-associative neural networks). For speech and image data, the local linear models incur about half the error of the global models while training nearly an order of magnitude faster than the neural networks. Under certain conditions, the local linear models are related to a mixture-of-Gaussians data model. Motivated by the relation between local linear dimension reduction and Gaussian mixture models, we present Gaussian mixture models for classification and regression and propose algorithms for regularizing them. Our results with speech phoneme classification and some benchmark regression tasks indicate that the mixture models perform comparably with a global model (neural networks). To summarize, local models or Gaussian mixture models can be efficient tools for dimension reduction, exploratory data analysis, feature extraction, classification and regression.
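A hedged sketch of the "local linear" idea described above: partition the input space (k-means here, one possible choice) and fit a separate PCA in each region; the region count and target dimension are arbitrary assumptions rather than the dissertation's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def local_pca(X, n_regions=5, n_components=2):
    """Partition the input space, then build a low-dim coordinate system per region."""
    labels = KMeans(n_clusters=n_regions, n_init=10, random_state=0).fit_predict(X)
    models, codes = {}, np.zeros((X.shape[0], n_components))
    for r in range(n_regions):
        idx = labels == r
        models[r] = PCA(n_components=n_components).fit(X[idx])
        codes[idx] = models[r].transform(X[idx])     # local low-dim coordinates
    return labels, models, codes
```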

Journal ArticleDOI
TL;DR: The results suggest that PCA/FIT is a useful way to pretreat the data used as input to a neural network (NN), and that univariate feature selection followed by PCA somewhat reduces the size of the NN structure for some data sets.

Proceedings Article
01 Jan 1996
TL;DR: This paper applies two statistical learning algorithms, the Linear Least Squares Fit (LLSF) mapping and a Nearest Neighbor classifier named ExpNet, to a large collection of MEDLINE documents, and both LLSF and ExpNet successfully scaled to this very large problem.
Abstract: Whether or not high accuracy classification methods can be scaled to large applications is crucial for the ultimate usefulness of such methods in text categorization. This paper applies two statistical learning algorithms, the Linear Least Squares Fit (LLSF) mapping and a Nearest Neighbor classifier named ExpNet, to a large collection of MEDLINE documents. With the use of suitable dimensionality reduction techniques and efficient algorithms, both LLSF and ExpNet successfully scaled to this very large problem with a result significantly outperforming word-matching and other automatic learning methods applied to the same corpus.
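A minimal sketch of an LLSF-style mapping, assuming term-frequency document vectors and 0/1 category indicators; the dimensionality reduction and thresholding steps used in the paper are omitted.

```python
import numpy as np

def llsf_fit(D, C):
    """D: (n_docs, n_terms) term matrix, C: (n_docs, n_categories) 0/1 labels.
    Returns W mapping term vectors to category scores by least squares."""
    W, *_ = np.linalg.lstsq(D, C, rcond=None)
    return W

def llsf_rank(W, d):
    """Rank categories for a single document vector d by descending score."""
    return np.argsort(-(d @ W))
```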

Journal ArticleDOI
TL;DR: A method that reduces data vertically and horizontally, keeps the discriminating power of the original data, and paves the way for extracting concepts from the raw data is introduced.
Abstract: The existence of numeric data and large numbers of records in a database present a challenging task in terms of explicit concepts extraction from the raw data. The paper introduces a method that reduces data vertically and horizontally, keeps the discriminating power of the original data, and paves the way for extracting concepts. The method is based on discretization (vertical reduction) and feature selection (horizontal reduction). The experimental results show that (a) the data can be effectively reduced by the proposed method; (b) the predictive accuracy of a classifier (C4.5) can be improved after data and dimensionality reduction; and (c) the classification rules learned are simpler.
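An illustrative sketch of the two reductions described: quantile discretization of numeric columns (vertical reduction) followed by keeping only the features most associated with the class (horizontal reduction). The bin and feature counts are arbitrary, and a chi-square score stands in for the paper's selection criterion.

```python
import numpy as np
from sklearn.feature_selection import chi2

def reduce_data(X, y, n_bins=5, n_keep=10):
    # vertical reduction: discretize each numeric column into quantile bins
    edges = [np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
             for j in range(X.shape[1])]
    Xd = np.column_stack([np.digitize(X[:, j], edges[j]) for j in range(X.shape[1])])
    # horizontal reduction: keep the features most associated with the class
    scores, _ = chi2(Xd, y)
    keep = np.argsort(-scores)[:n_keep]
    return Xd[:, keep], keep
```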

Proceedings ArticleDOI
18 Jun 1996
TL;DR: This work presents a hybrid neural network solution which is capable of rapid classification, requires only fast, approximate normalization and preprocessing, and consistently exhibits better classification performance than the eigenfaces approach on the database.
Abstract: Faces represent complex, multidimensional, meaningful visual stimuli and developing a computational model for face recognition is difficult. We present a hybrid neural network solution which compares favorably with other methods. The system combines local image sampling, a self-organizing map neural network, and a convolutional neural network. The self-organizing map provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample, and the convolutional neural network provides for partial invariance to translation, rotation, scale, and deformation. The method is capable of rapid classification, requires only fast, approximate normalization and preprocessing, and consistently exhibits better classification performance than the eigenfaces approach on the database considered as the number of images per person in the training database is varied from 1 to 5. With 5 images per person the proposed method and eigenfaces result in 3.8% and 10.5% error respectively. The recognizer provides a measure of confidence in its output and classification error approaches zero when rejecting as few as 10% of the examples. We use a database of 400 images of 40 individuals which contains quite a high degree of variability in expression, pose, and facial details.

Proceedings Article
02 Aug 1996
TL;DR: ADHOC (Automatic Discoverer of Higher-Order Correlation), an algorithm that combines the advantages of both filter and feedback models to enhance the understanding of the given data and to increase the efficiency of the feature selection process is introduced.
Abstract: This paper introduces ADHOC (Automatic Discoverer of Higher-Order Correlation), an algorithm that combines the advantages of both filter and feedback models to enhance the understanding of the given data and to increase the efficiency of the feature selection process. ADHOC partitions the observed features into a number of groups, called factors, that reflect the major dimensions of the phenomenon under consideration. The set of learned factors defines the starting point of the search for the best-performing feature subset. A genetic algorithm is used to explore the feature space originated by the factors and to determine the set of most informative feature configurations. The feature subset evaluation function is the performance of the induction algorithm. This approach offers three main advantages: (i) the likelihood of selecting well-performing features grows; (ii) the complexity of search diminishes consistently; (iii) the possibility of selecting a bad feature subset due to overfitting problems decreases. Extensive experiments on real-world data have been conducted to demonstrate the effectiveness of ADHOC as a data reduction technique as well as a feature selection method.

Proceedings ArticleDOI
25 Aug 1996
TL;DR: The experiments reveal that Sammon's mapping, the multilayer perceptron (MLP) and the principal component analysis (PCA) based feature extractors yield similar classification performance, and the PCA based initialization affords better human chromosome classification performance even when using a few eigenvectors.
Abstract: Feature extraction for exploratory data projection aims for data visualization by a projection of a high-dimensional space onto a two- or three-dimensional space, while feature extraction for classification generally requires more than two or three features. We study the extraction of more than three features, using a neural network (NN) implementation of Sammon's mapping to be applied for classification. The experiments reveal that Sammon's mapping, the multilayer perceptron (MLP) and the principal component analysis (PCA) based feature extractors yield similar classification performance. We investigate random- and PCA-based initializations of Sammon's mapping. When the PCA is applied to initialize Sammon's projection, only one experiment is required and only a fraction of the training period is needed to achieve performance comparable with that of the random initialization. Furthermore, the PCA-based initialization affords better human chromosome classification performance even when using only a few eigenvectors.
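A hedged sketch of Sammon's mapping with either random or PCA initialization, minimizing the Sammon stress numerically with SciPy; the optimizer and its settings are illustrative assumptions, not the NN implementation studied in the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import pdist

def sammon_stress(flat_Y, d_orig, n, k):
    """Sammon stress between original distances d_orig and low-dim distances."""
    d_low = pdist(flat_Y.reshape(n, k))
    return np.sum((d_orig - d_low) ** 2 / np.maximum(d_orig, 1e-12)) / d_orig.sum()

def sammon(X, k=2, init="pca"):
    n = X.shape[0]
    d_orig = pdist(X)
    if init == "pca":                            # PCA initialization
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        Y0 = Xc @ Vt[:k].T
    else:                                        # random initialization
        Y0 = np.random.default_rng(0).normal(size=(n, k))
    # numerical gradient; adequate for small data sets in this sketch
    res = minimize(sammon_stress, Y0.ravel(), args=(d_orig, n, k), method="L-BFGS-B")
    return res.x.reshape(n, k), res.fun
```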

Proceedings ArticleDOI
25 Aug 1996
TL;DR: These feature extractors consider general dependencies between features and class labels, as opposed to well-known linear methods such as PCA, which does not consider class labels, and LDA, which uses only simple low-order dependencies.
Abstract: This paper presents and evaluates two linear feature extractors based on mutual information. These feature extractors consider general dependencies between features and class labels, as opposed to well-known linear methods such as PCA, which does not consider class labels, and LDA, which uses only simple low-order dependencies. As evidenced by several simulations on high-dimensional data sets, the proposed techniques provide superior feature extraction and better dimensionality reduction while having similar computational requirements.
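As a toy stand-in (not the authors' optimization), the sketch below scores random projection directions by the mutual information between the discretized projection and the class label and keeps the best one; the direction count and binning scheme are arbitrary assumptions.

```python
import numpy as np

def mi_discrete(a, y):
    """Plug-in mutual information for two discrete 1-D arrays, in nats."""
    mi = 0.0
    for av in np.unique(a):
        for yv in np.unique(y):
            p = np.mean((a == av) & (y == yv))
            if p > 0:
                mi += p * np.log(p / (np.mean(a == av) * np.mean(y == yv)))
    return mi

def best_mi_direction(X, y, n_dirs=500, n_bins=10, seed=0):
    rng = np.random.default_rng(seed)
    best_w, best_mi = None, -np.inf
    for _ in range(n_dirs):
        w = rng.normal(size=X.shape[1])
        w /= np.linalg.norm(w)
        z = X @ w                                   # 1-D projection of the data
        bins = np.quantile(z, np.linspace(0, 1, n_bins + 1)[1:-1])
        mi = mi_discrete(np.digitize(z, bins), y)   # MI of projection with class
        if mi > best_mi:
            best_w, best_mi = w, mi
    return best_w, best_mi
```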

Proceedings ArticleDOI
03 Jun 1996
TL;DR: These feature extractors consider general dependencies between features and class labels, as opposed to statistical techniques such as PCA, which does not consider class labels, and LDA, which uses only simple first-order dependencies.
Abstract: Presents and evaluates two linear feature extractors based on mutual information. These feature extractors consider general dependencies between features and class labels, as opposed to statistical techniques such as PCA, which does not consider class labels, and LDA, which uses only simple first-order dependencies. As evidenced by several simulations on high-dimensional data sets, the proposed techniques provide superior feature extraction and better dimensionality reduction while having similar computational requirements.

Journal ArticleDOI
TL;DR: This paper develops a formal statistical test for the 'scree plot'; a special case of this test is the classical test for equality of eigenvalues, which has been suggested in several texts as the criterion to decide the number of principal components to retain.
Abstract: Principal component analysis and factor analysis are the most widely used tools for dimension reduction in data analysis. Both methods require some good criterion to judge the number of dimensions to be kept. The classical method focuses on testing the equality of eigenvalues. As real data hardly have this property, practitioners turn to some ad hoc criterion in judging the dimensionality of their data. One such popular method, the ‘scree test’ or ‘scree plot’ as described in many texts and statistical programs, is based on the trend in the eigenvalues of the sample covariance (correlation) matrix. The principal components or common factors corresponding to eigenvalues which exhibit a slow linear decrease are discarded in further data analysis. This paper develops a formal statistical test for the ‘scree plot’. A special case of this test is the classical test for equality of eigenvalues which has been suggested in several texts as the criterion to decide the number of principal components to retain. Comparisons between equality of eigenvalues and the slow linear decrease in eigenvalues on some classical examples support the hypothesis of slow linear decrease. A physical background to such a phenomenon is also suggested.
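Not the paper's formal test, just a numerical illustration of the quantity it studies: the ordered eigenvalues of a sample correlation matrix, whose tail is inspected for a slow, roughly linear decrease (the "scree"), plus a crude elbow heuristic as one informal alternative.

```python
import numpy as np

def scree_eigenvalues(X):
    """Eigenvalues of the sample correlation matrix, largest first."""
    R = np.corrcoef(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(R))[::-1]

def naive_elbow(eigs):
    """Crude heuristic: number of components up to the single largest
    drop in the ordered eigenvalue sequence."""
    drops = -np.diff(eigs)
    return int(np.argmax(drops) + 1)
```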


Book ChapterDOI
23 Oct 1996
TL;DR: A case study of data preprocessing for a hybrid genetic algorithm shows that the elimination of irrelevant features can substantially improve the efficiency of learning and cost-sensitive feature elimination can be effective for reducing costs of induced hypotheses.
Abstract: This study is concerned with whether it is possible to detect what information contained in the training data and background knowledge is relevant for solving the learning problem, and whether irrelevant information can be eliminated in preprocessing before starting the learning process. A case study of data preprocessing for a hybrid genetic algorithm shows that the elimination of irrelevant features can substantially improve the efficiency of learning. In addition, cost-sensitive feature elimination can be effective for reducing costs of induced hypotheses.

Journal ArticleDOI
TL;DR: In this article, a Fourier transform (FT) was used as a tool to reduce the number of variables in pattern recognition of NIR data, and five procedures were designed to select the FT coefficients used as input to a regularized discriminant analysis classifier.

Journal ArticleDOI
TL;DR: In this paper, an approach to non-linear principal component analysis (NPCA) has been developed by combining the essential properties of linear PCA: the least-squares approximation property and the structure-preservation property.

Journal ArticleDOI
TL;DR: It is shown that internal representations of neural networks do not yield unique feature values but can provide the basis for facilitating a number of useful information management tasks, such as memorization, categorization, discovery, associative recall and others.
Abstract: The subject matter of this paper is one of long-standing interest to the Pattern Recognition and Artificial Intelligence research communities, namely that of "feature extraction" for facilitating the task of classification or various other tasks. We show that internal representations of neural networks do not yield unique feature values but can provide the basis for facilitating a number of useful information management tasks, such as memorization, categorization, discovery, associative recall and others. These matters are illustrated with three sets of data, one of a benchmark nature, another of the nature of real-world sensor data, and a third set consisting of semiconductor crystal structure parameters.

Proceedings ArticleDOI
25 Aug 1996
TL;DR: The results of the experiment show that the FKL provides the richest features in discriminating power for the limited-class problem when compared with other techniques, including canonical discriminant analysis, principal component analysis, and the orthonormal discriminant vector (ODV) method.
Abstract: The applicability of canonical discriminant analysis to a limited-class problem is restricted because the number of extracted features cannot equal or exceed the number of classes. In order to remove this restriction, a new feature extraction technique, FKL, is proposed and is tested in a handwritten numeral recognition experiment. While canonical discriminant analysis maximizes the variance ratio (F-ratio), and principal component analysis (K-L expansion) minimizes the mean square error of dimension reduction, the FKL optimizes both the F-ratio and the mean square error simultaneously. The results of the experiment show that the FKL provides the richest features in discriminating power for the limited-class problem when compared with other techniques, including canonical discriminant analysis, principal component analysis, and the orthonormal discriminant vector (ODV) method.

Journal ArticleDOI
TL;DR: An unsupervised learning network is developed by incorporating the idea of non-linear mapping (NLM) into a backpropagation (BP) algorithm, which makes the BP learning algorithms more competent for many supervised and unsupervised learning tasks provided that an appropriate criterion has been designed.
Abstract: An unsupervised learning network is developed by incorporating the idea of non-linear mapping (NLM) into a backpropagation (BP) algorithm. This network performs the learning process by iteratively adjusting its network parameters to minimize an appropriate criterion using a generalized BP (GBP) algorithm. This generalization makes the BP learning algorithms more competent for many supervised and unsupervised learning tasks provided that an appropriate criterion has been designed. Results of numerical simulation and real data show that the proposed technique is a promising approach to visualize multidimensional clusters by mapping the multidimensional data to a perceivable low-dimensional space.

Proceedings ArticleDOI
03 Jun 1996
TL;DR: A feature extraction algorithm based on self-organising maps that can be interpreted as a non-orthogonal basis spanning the space of the input vectors is presented.
Abstract: A feature extraction algorithm based on self-organising maps is presented. The converged feature map can be interpreted as a non-orthogonal basis spanning the space of the input vectors. The new algorithm can be shown to be a generalization of the generalised Hebbian algorithm (GHA).
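For reference, a hedged sketch of the plain generalized Hebbian algorithm (GHA) that the proposed SOM-based extractor is said to generalize; the learning rate, number of epochs, and initialization are arbitrary choices.

```python
import numpy as np

def gha(X, k=3, lr=1e-3, epochs=20, seed=0):
    """Sanger's generalized Hebbian algorithm: the rows of W converge (in order)
    to the leading principal components of the centered data."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    W = rng.normal(scale=0.1, size=(k, X.shape[1]))
    for _ in range(epochs):
        for x in Xc:
            y = W @ x
            # Sanger's rule: Hebbian term minus projections onto earlier outputs
            W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W
```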

Proceedings ArticleDOI
28 Jan 1996
TL;DR: Two feature extraction algorithms, the minimum entropy method and the Karhunen-Loève expansion, have been studied to examine their intraset clustering and interset class dispersion.
Abstract: This paper compares two feature extraction techniques for neural network classifiers. The techniques evaluated are used for the dynamic security assessment of power systems. The feature extraction methods are used to map the observation vectors from the measurement space into a lower dimension feature space. The patterns in the feature space can then be utilized to train a neural network (NN) classifier. The NN classifier is used to classify a given power system into either a "secure" or "insecure" class. The feature vectors not only represent a reduction in dimensionality, but also lead to an improvement in class dispersion, and hence to a better classification. Two feature extraction algorithms, the minimum entropy method and the Karhunen-Loève expansion, have been studied to examine their intraset clustering and interset class dispersion. A NN pattern classifier system is developed to illustrate the feasibility of classifying any given operating condition into either a secure or insecure class. Security assessment data from two utility power systems are used to test the proposed techniques.