
Showing papers on "Dimensionality reduction published in 1982"


Journal ArticleDOI
TL;DR: The method is based on successively predicting each element in the data matrix after deleting the corresponding row and column of the matrix, and makes use of recently published algorithms for updating a singular value decomposition.
Abstract: A method is described for choosing the number of components to retain in a principal component analysis when the aim is dimensionality reduction. The correspondence between principal component analysis and the singular value decomposition of the data matrix is used. The method is based on successively predicting each element in the data matrix after deleting the corresponding row and column of the matrix, and makes use of recently published algorithms for updating a singular value decomposition. These are very fast, which renders the proposed technique a practicable one for routine data analysis.
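The element-wise cross-validation idea can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: here each SVD is recomputed from scratch, whereas the paper's speed comes from fast SVD-updating formulas, and the score for the held-out row is obtained by least squares rather than the paper's mixed row/column predictor.

```python
import numpy as np

def pca_press(X, max_k):
    """Cross-validated PRESS for k = 1..max_k principal components.

    Each element x_ij is predicted without using x_ij itself: loadings
    come from the SVD of X with row i deleted, and the score for row i
    is fit by least squares on that row's remaining elements.  The k
    minimizing PRESS is the number of components to retain.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    press = np.zeros(max_k)
    for i in range(n):
        rest = np.delete(X, i, axis=0)
        mu = rest.mean(axis=0)              # centring estimated without row i
        _, _, Vt = np.linalg.svd(rest - mu, full_matrices=False)
        V = Vt.T                            # p x r loadings
        xi = X[i] - mu
        for j in range(p):
            mask = np.arange(p) != j        # hide element j of row i
            for k in range(1, max_k + 1):
                Vk = V[:, :k]
                # least-squares score for row i from its visible elements
                t, *_ = np.linalg.lstsq(Vk[mask], xi[mask], rcond=None)
                press[k - 1] += (xi[j] - Vk[j] @ t) ** 2
    return press
```

For data with genuine low-rank structure plus noise, PRESS drops sharply at the true rank and then flattens or rises, which is the signal used to choose the number of components.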

364 citations


Book ChapterDOI
TL;DR: This chapter presents an overview of Optical Character Recognition for statisticians interested in extending their endeavors from the traditional realm of pattern classification to the many other alluring aspects of OCR.
Abstract: This chapter presents an overview of Optical Character Recognition (OCR) for statisticians interested in extending their endeavors from the traditional realm of pattern classification to the many other alluring aspects of OCR. The most important dimensions of data entry from the point of view of a project manager considering the acquisition of an OCR system are described, the major applications are categorized according to the type of data to be converted to computer-readable form, and optical scanners are described. The chapter discusses the preprocessing necessary before the actual character classification can take place, and outlines the classical decision-theoretic formulation of the character classification problem. The various statistical approximations to the optimal classifier, including dimensionality reduction, feature extraction, and feature selection, are discussed with references to the appropriate statistical techniques. Finally, the importance of accurate estimation of the error and reject rates is discussed, and a fundamental relation between the error rate and the reject rate in optimal systems is described.
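The error-reject relation the chapter refers to can be illustrated with the standard threshold rule (Chow's rule): reject a pattern when its maximum class posterior falls below a threshold, so raising the threshold trades a higher reject rate for a lower error rate on the accepted patterns. The 1-D two-Gaussian problem below is an assumed toy example, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
y = rng.integers(0, 2, n)                  # two classes, equal priors
x = rng.normal(loc=2.0 * y, scale=1.0)     # class means 0 and 2, unit variance

# exact posterior P(class 1 | x) for this model
l0 = np.exp(-0.5 * x**2)
l1 = np.exp(-0.5 * (x - 2.0)**2)
p1 = l1 / (l0 + l1)

conf = np.maximum(p1, 1 - p1)              # maximum posterior
pred = (p1 > 0.5).astype(int)

# sweep the rejection threshold: reject when conf < t
for t in (0.5, 0.7, 0.9):
    accept = conf >= t
    reject_rate = 1 - accept.mean()
    error_rate = (pred[accept] != y[accept]).mean()
    print(f"t={t:.1f}  reject={reject_rate:.3f}  error={error_rate:.3f}")
```

With t = 0.5 nothing is rejected and the error is the Bayes rate; as t grows, the ambiguous region near the decision boundary is handed off for rejection and the conditional error on accepted patterns falls.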

44 citations



Journal ArticleDOI
TL;DR: A systematic feature extraction procedure is proposed, based on successive extractions of features, using the Gaussian minus-log-likelihood ratio as a basis for the extracted features.
Abstract: A systematic feature extraction procedure is proposed. It is based on successive extractions of features. At each stage a dimensionality reduction is made and a new feature is extracted. A specific example is given using the Gaussian minus-log-likelihood ratio as a basis for the extracted features. This form has the advantage that if both classes are Gaussianly distributed, only a single feature, the sufficient statistic, is extracted. If the classes are not Gaussianly distributed, additional features are extracted in an effort to improve the classification performance. Two examples are presented to demonstrate the performance of the procedure.
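The first extracted feature described above can be sketched directly: estimate each class's mean and covariance, and map every pattern to the scalar Gaussian minus-log-likelihood ratio. For truly Gaussian classes this single feature is the sufficient statistic; the iteration that extracts additional features when the Gaussian assumption fails is not reproduced here.

```python
import numpy as np

def gaussian_mllr_feature(X, X0, X1):
    """Gaussian minus-log-likelihood-ratio feature.

    Means and covariances are estimated from training samples X0 and X1;
    each row of X maps to -log p(x|class 1) + log p(x|class 0) (up to a
    shared constant).  Negative values favor class 1.
    """
    def fit(Z):
        mu = Z.mean(axis=0)
        S = np.cov(Z, rowvar=False)
        return mu, np.linalg.inv(S), np.linalg.slogdet(S)[1]

    mu0, P0, ld0 = fit(X0)
    mu1, P1, ld1 = fit(X1)
    # Mahalanobis distances of each row of X to the two class models
    d0 = np.einsum('ij,jk,ik->i', X - mu0, P0, X - mu0)
    d1 = np.einsum('ij,jk,ik->i', X - mu1, P1, X - mu1)
    return (d1 + ld1) - (d0 + ld0)
```

Classifying on the sign of this one feature reduces a p-dimensional problem to a single dimension, which is exactly the dimensionality reduction the abstract describes for the Gaussian case.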

18 citations


Journal ArticleDOI
TL;DR: A method of achieving dimensionality reduction is presented and it is demonstrated that the resulting cluster center set is similar to the simplex signal set in communication theory, which is a minimum energy signal set.
Abstract: A method of achieving dimensionality reduction is presented. The reduced dimensionality is achieved by utilizing a least squared error technique under the assumption that the goodness criterion is the maximum separation of classes. The criterion is met by first maximizing the spread of the cluster centers, and then minimizing the within class scatter. The derivation of the desired transformation from an arbitrary p-space to a space of lower dimension, say l, is completed with the assumption that the cluster centers are known. The criterion for the cluster center location is the minimization of the variance of the distance between the cluster center and the transformed pattern. It is demonstrated that the resulting cluster center set is similar to the simplex signal set in communication theory, which is a minimum energy signal set.
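A closely related, standard construction (Fisher's multiple discriminant analysis) also maximizes between-class spread while limiting within-class scatter, and is shown below as a stand-in for the paper's least-squared-error derivation, which it resembles but does not reproduce exactly.

```python
import numpy as np

def discriminant_transform(X, y, l):
    """Map p-dimensional patterns to l dimensions so that class centers
    are spread apart relative to the within-class scatter.

    Uses the leading eigenvectors of Sw^{-1} Sb (Fisher's multiple
    discriminant criterion); a sketch, not the paper's exact transform.
    """
    classes = np.unique(y)
    mu = X.mean(axis=0)
    p = X.shape[1]
    Sw = np.zeros((p, p))      # within-class scatter
    Sb = np.zeros((p, p))      # between-class (center-spread) scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mu)[:, None]
        Sb += len(Xc) * (d @ d.T)
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs[:, order[:l]].real      # p x l transformation matrix
```

Projecting well-separated clusters through this transform keeps them separable in the lower-dimensional range space, which is the precursor role the paper assigns to dimensionality reduction.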

8 citations


01 Jan 1982
TL;DR: In this article, a method of dimensionality reduction is presented, utilizing a least squared error technique under the assumption that the goodness criterion is the maximum separation of classes.
Abstract: A method of achieving dimensionality reduction is presented. The reduced dimensionality is achieved by utilizing a least squared error technique under the assumption that the goodness criterion is the maximum separation of classes. The criterion is met by first maximizing the spread of the cluster centers, and then minimizing the within class scatter. The derivation of the desired transformation from an arbitrary p-space to a space of lower dimension, say l, is completed with the assumption that the cluster centers are known. The criterion for the cluster center location is the minimization of the variance of the distance between the cluster center and the transformed pattern. It is demonstrated that the resulting cluster center set is similar to the simplex signal set in communication theory, which is a minimum energy signal set.

Index Terms- Clustering, dimensionality reduction, discriminant functions, least squared error.

I. INTRODUCTION

The comparison of various dimensionality reduction techniques from an analysis as well as performance standpoint continues without abatement. Obviously, this activity is based on the practical need for an effective dimensionality reduction technique, and on the theoretical interest in the subject. Certainly the attention is deserved simply because many pattern classification applications are untenable without having an effective dimensionality reduction technique as a precursor. The original space (domain) in which the pattern class is defined precludes the employment of many sophisticated pattern classification techniques because the computational requirements render them infeasible. These techniques can become very powerful and effective when dimensionality reduction is achieved by a transformation from the domain to a space of lower dimensionality. In this paper, we describe a simple and straightforward method of achieving dimensionality reduction which we employ in conjunction with a pattern classification technique in the range space.
We compare our results quantitatively as well as qualitatively with the results achieved by using the methods of other researchers (1), (2). It is demonstrated that the criterion presented here should be used when the aim is separation of classes, especially when the number of classes is greater than the dimensionality of the range by two or more.

6 citations


Proceedings ArticleDOI
01 Nov 1982
TL;DR: Optimum results were obtained by combining distance-based feature selection methods with nonlinear discriminant analysis, and the successive solution of 2-class problems improved the results compared to solving the 3-class problem directly.
Abstract: Numerical experiments were performed to find optimum feature extraction procedures for the classification of mouse L-fibroblasts into G1, S and G2 subpopulations. From images of these cells, different feature sets such as geometric, densitometric, textural and chromatin features were derived, which served as the database for the numerical experiments. Linear and nonlinear supervised stepwise learning techniques for the discrimination of the cells into G1, S and G2 were performed. The classification error was used as the criterion for evaluating the different numerical feature selection methods. Optimum results were obtained by combining distance-based feature selection methods with nonlinear discriminant analysis. The successive solution of 2-class problems improved the results compared to solving the 3-class problem directly; linear discriminant analysis then may surpass quadratic discriminant analysis. © 1982 SPIE--The International Society for Optical Engineering.
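The distance-based stepwise selection step can be sketched as a greedy forward search: repeatedly add whichever feature most improves a class-separability distance, then hand the selected subset to a discriminant classifier. The Fisher-trace criterion below is an assumed stand-in for the authors' distance measure, not their exact one.

```python
import numpy as np

def forward_select(X, y, n_feat, score):
    """Greedy forward feature selection: at each step add the feature
    that most improves score(X[:, subset], y)."""
    chosen = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_feat):
        best = max(remaining, key=lambda j: score(X[:, chosen + [j]], y))
        chosen.append(best)
        remaining.remove(best)
    return chosen

def fisher_distance(Xs, y):
    """Separability of class means relative to pooled within-class
    scatter: trace(Sw^{-1} Sb).  An assumed stand-in criterion."""
    classes = np.unique(y)
    mu = Xs.mean(axis=0)
    d = Xs.shape[1]
    Sw = 1e-6 * np.eye(d)
    Sb = np.zeros((d, d))
    for c in classes:
        Xc = Xs[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    return np.trace(np.linalg.solve(Sw, Sb))
```

On data where only a few columns carry class information, the search recovers those columns first, mirroring the role of feature selection as a precursor to the discriminant-analysis stage in the abstract.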

2 citations