scispace - formally typeset

Showing papers on "Feature extraction published in 1991"


BookDOI
01 May 1991
TL;DR: This dissertation describes a number of algorithms developed to increase the robustness of automatic speech recognition systems with respect to changes in the environment, including SNR-Dependent Cepstral Normalization (SDCN) and Codeword-Dependent Cepstral Normalization (CDCN).
Abstract: This dissertation describes a number of algorithms developed to increase the robustness of automatic speech recognition systems with respect to changes in the environment. These algorithms attempt to improve the recognition accuracy of speech recognition systems when they are trained and tested in different acoustical environments, and when a desk-top microphone (rather than a close-talking microphone) is used for speech input. Without such processing, mismatches between training and testing conditions produce an unacceptable degradation in recognition accuracy. Two kinds of environmental variability are introduced by the use of desk-top microphones and different training and testing conditions: additive noise and spectral tilt introduced by linear filtering. An important attribute of the novel compensation algorithms described in this thesis is that they provide joint rather than independent compensation for these two types of degradation. Acoustical compensation is applied in our algorithms as an additive correction in the cepstral domain. This allows a higher degree of integration within SPHINX, the Carnegie Mellon speech recognition system, which uses the cepstrum as its feature vector. Therefore, these algorithms can be implemented very efficiently. Processing in many of these algorithms is based on instantaneous signal-to-noise ratio (SNR), as the appropriate compensation represents a form of noise suppression at low SNRs and spectral equalization at high SNRs. The compensation vectors for additive noise and spectral transformations are estimated by minimizing the differences between speech feature vectors obtained from a "standard" training corpus of speech and feature vectors that represent the current acoustical environment. In our work this is accomplished by minimizing the distortion of vector-quantized cepstra that are produced by the feature extraction module in SPHINX.
In this dissertation we describe several algorithms, including SNR-Dependent Cepstral Normalization (SDCN) and Codeword-Dependent Cepstral Normalization (CDCN). With CDCN, the accuracy of SPHINX when trained on speech recorded with a close-talking microphone and tested on speech recorded with a desk-top microphone is essentially the same as that obtained when the system is trained and tested on speech from the desk-top microphone. An algorithm for frequency normalization has also been proposed, in which the parameter of the bilinear transformation that is used by the signal-processing stage to produce frequency warping is adjusted for each new speaker and acoustical environment. The optimum value of this parameter is again chosen to minimize the vector-quantization distortion between the standard environment and the current one. In preliminary studies, use of this frequency normalization produced a moderate additional decrease in the observed error rate.
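The additive cepstral correction idea can be illustrated with a short sketch. This is not the dissertation's implementation; the function name, SNR bin width, and the table of precomputed correction vectors are assumptions made for the example. An SDCN-style scheme simply selects an additive correction vector according to each frame's instantaneous SNR:

```python
def sdcn_correct(cepstra, snrs, correction_table, bin_width_db=5.0):
    """Apply an SDCN-style additive correction in the cepstral domain.

    cepstra: list of cepstral frames (lists of floats)
    snrs: per-frame instantaneous SNR in dB
    correction_table: one correction vector per SNR bin (precomputed
        offline by minimizing distortion against a "standard" corpus)
    """
    corrected = []
    for frame, snr in zip(cepstra, snrs):
        # pick the correction vector for this frame's SNR bin
        b = min(int(max(snr, 0.0) // bin_width_db), len(correction_table) - 1)
        corr = correction_table[b]
        corrected.append([c + d for c, d in zip(frame, corr)])
    return corrected
```

At low SNR the table's entries would approximate noise suppression, and at high SNR spectral equalization, matching the behavior the abstract describes.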

474 citations


Journal ArticleDOI
TL;DR: Different implementations of adaptive smoothing are presented, first on a serial machine, for which a multigrid algorithm is proposed to speed up the smoothing effect, then on a single instruction multiple data (SIMD) parallel machine such as the Connection Machine.
Abstract: A method to smooth a signal while preserving discontinuities is presented. This is achieved by repeatedly convolving the signal with a very small averaging mask weighted by a measure of the signal continuity at each point. Edge detection can be performed after a few iterations, and features extracted from the smoothed signal are correctly localized (hence, no tracking is needed). This last property allows the derivation of a scale-space representation of a signal using the adaptive smoothing parameter k as the scale dimension. The relation of this process to anisotropic diffusion is shown. A scheme to preserve higher-order discontinuities is proposed, and results on range images are presented. Different implementations of adaptive smoothing are presented, first on a serial machine, for which a multigrid algorithm is proposed to speed up the smoothing effect, then on a single instruction multiple data (SIMD) parallel machine such as the Connection Machine. Various applications of adaptive smoothing such as edge detection, range image feature extraction, corner detection, and stereo matching are discussed.
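A minimal 1-D sketch of the adaptive smoothing iteration may help: each pass averages with a tiny mask whose weights come from a continuity measure (here exp(-g²/2k²) of the local gradient), so smoothing is suppressed at discontinuities. The function and parameter names are illustrative, not the paper's code:

```python
import math

def adaptive_smooth_1d(signal, k, iterations=10):
    """Repeatedly average each sample with its neighbours, weighting the
    averaging mask by a continuity measure so edges are preserved."""
    s = list(signal)
    for _ in range(iterations):
        # continuity weight: small where the local gradient is large
        grads = [0.0] + [(s[i + 1] - s[i - 1]) / 2.0
                         for i in range(1, len(s) - 1)] + [0.0]
        w = [math.exp(-(g * g) / (2.0 * k * k)) for g in grads]
        new = s[:]
        for i in range(1, len(s) - 1):
            num = w[i - 1] * s[i - 1] + w[i] * s[i] + w[i + 1] * s[i + 1]
            new[i] = num / (w[i - 1] + w[i] + w[i + 1])
        s = new
    return s
```

A step edge survives many iterations because the weights on either side of it collapse toward zero, which is the discontinuity-preserving behavior the abstract describes.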

436 citations


Journal ArticleDOI
TL;DR: This paper proves that the SV feature vector has some important properties of algebraic and geometric invariance and insensitivity to noise; these properties are very useful for the description and recognition of images.

314 citations


Journal ArticleDOI
01 Feb 1991
TL;DR: The authors develop methodologies for the automatic selection of image features to be used to visually control the relative position and orientation (pose) between the end-effector of an eye-in-hand robot and a workpiece.
Abstract: The authors develop methodologies for the automatic selection of image features to be used to visually control the relative position and orientation (pose) between the end-effector of an eye-in-hand robot and a workpiece. A resolved motion rate control scheme is used to update the robot's pose based on the position of three features in the camera's image. The selection of these three features depends on a blend of image recognition and control criteria. The image recognition criteria include feature robustness, completeness, cost of feature extraction, and feature uniqueness. The control criteria include system observability, controllability, and sensitivity. A weighted criteria function is used to select the combination of image features that provides the best control of the end-effector of a general six-degrees-of-freedom manipulator. Both computer simulations and laboratory experiments on a PUMA robot arm were conducted to verify the performance of the feature-selection criteria.
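The weighted criteria function can be sketched as an exhaustive search over feature triples. This is an illustrative reconstruction, not the authors' code; the criterion functions and weights stand in for the recognition and control criteria listed above:

```python
from itertools import combinations

def select_features(features, criteria, weights):
    """Score every triple of candidate image features with a weighted
    sum of criterion functions and return the best-scoring triple."""
    best, best_score = None, float("-inf")
    for triple in combinations(features, 3):
        score = sum(w * c(triple) for w, c in zip(weights, criteria))
        if score > best_score:
            best, best_score = triple, score
    return best
```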

223 citations


Journal ArticleDOI
TL;DR: Preprocessing, feature extraction, and postprocessing techniques for commercial optical character recognition (OCR) reading machines are described, and problems related to handwritten and printed character recognition are pointed out.
Abstract: In order to highlight the interesting problems and actual results on the state of the art in optical character recognition (OCR), this paper describes and compares preprocessing, feature extraction and postprocessing techniques for commercial reading machines. Problems related to handwritten and printed character recognition are pointed out, and the functions and operations of the major components of an OCR system are described. Historical background on the development of character recognition is briefly given and the working of an optical scanner is explained. The specifications of several recognition systems that are commercially available are reported and compared.

221 citations


Journal ArticleDOI
TL;DR: It is proved that, even in the absence of image error, each model must be represented by a 2D surface in the index space, which places an unexpected lower bound on the space required to implement indexing and proves that no quantity is invariant for all projections of a model into the image.
Abstract: Model-based visual recognition systems often match groups of image features to groups of model features to form initial hypotheses, which are then verified. In order to accelerate recognition considerably, the model groups can be arranged in an index space (hashed) offline such that feasible matches are found by indexing into this space. For the case of 2D images and 3D models consisting of point features, bounds on the space required for indexing and on the speedup that such indexing can achieve are demonstrated. It is proved that, even in the absence of image error, each model must be represented by a 2D surface in the index space. This places an unexpected lower bound on the space required to implement indexing and proves that no quantity is invariant for all projections of a model into the image. Theoretical bounds on the speedup achieved by indexing in the presence of image error are also determined, and an implementation of indexing for measuring this speedup empirically is presented. It is found that indexing can produce only a minimal speedup on its own. However, when accompanied by a grouping operation, indexing can provide significant speedups that grow exponentially with the number of features in the groups.
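The indexing scheme the paper analyzes can be sketched as a hash table built offline from model feature groups. The key function and quantization step here are assumptions for illustration; in the paper's setting the keys would be computed from projections of 3D point groups:

```python
def build_index(model_groups, key_fn, q=0.1):
    """Offline: hash each (model_id, feature_group) pair into a table
    keyed by quantized values of key_fn."""
    index = {}
    for model_id, group in model_groups:
        key = tuple(round(v / q) for v in key_fn(group))
        index.setdefault(key, set()).add(model_id)
    return index

def lookup(index, image_group, key_fn, q=0.1):
    """Online: index an image feature group to get candidate models."""
    key = tuple(round(v / q) for v in key_fn(image_group))
    return index.get(key, set())
```

With image error, a real system would probe neighboring bins as well, which is where the paper's speedup bounds come in.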

147 citations


Patent
17 Jul 1991
TL;DR: In this article, a method of processing an image including the steps of locating within the image the position of at least one predetermined feature, extracting from the image data representing each feature, and calculating for each feature a feature vector representing the position in an N-dimensional space, such space being defined by a plurality of reference vectors each of which is an eigenvector of a training set of like features in which the image of each feature is modified to normalize the shape of the feature, which step is carried out before calculating the corresponding feature vector.
Abstract: A method of processing an image including the steps of: locating within the image the position of at least one predetermined feature; extracting from the image data representing each feature; and calculating for each feature a feature vector representing the position of the image data of the feature in an N-dimensional space, such space being defined by a plurality of reference vectors each of which is an eigenvector of a training set of like features in which the image data of each feature is modified to normalize the shape of each feature thereby to reduce its deviation from a predetermined standard shape of the feature, which step is carried out before calculating the corresponding feature vector.
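The final projection step of such an eigenfeature method reduces to dot products against the reference vectors. A hypothetical sketch (names assumed; the shape normalization described in the claim is presumed already applied):

```python
def feature_vector(image_vec, mean_vec, eigenvectors):
    """Project a (shape-normalized, flattened) feature image onto a set
    of reference eigenvectors to get its N-dimensional feature vector."""
    centered = [p - m for p, m in zip(image_vec, mean_vec)]
    return [sum(c * e for c, e in zip(centered, ev)) for ev in eigenvectors]
```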

128 citations


Journal ArticleDOI
TL;DR: Recent advances in and perspectives of research on speaker-dependent-feature extraction from speech waves, automatic speaker identification and verification, speaker adaptation in speech recognition, and voice conversion techniques are discussed.

108 citations


Journal ArticleDOI
TL;DR: An important aspect of the approach is to exhibit how a priori information regarding nonuniform class membership, uneven distribution between train and test sets, and misclassification costs may be exploited in a regularized manner in the training phase of networks.
Abstract: The problem of multiclass pattern classification using adaptive layered networks is addressed. A special class of networks, i.e., feed-forward networks with a linear final layer, that perform generalized linear discriminant analysis is discussed. This class is sufficiently generic to encompass the behavior of arbitrary feed-forward nonlinear networks. Training the network consists of a least-square approach which combines a generalized inverse computation to solve for the final layer weights, together with a nonlinear optimization scheme to solve for parameters of the nonlinearities. A general analytic form for the feature extraction criterion is derived, and it is interpreted for specific forms of target coding and error weighting. An important aspect of the approach is to exhibit how a priori information regarding nonuniform class membership, uneven distribution between train and test sets, and misclassification costs may be exploited in a regularized manner in the training phase of networks.
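The generalized-inverse step for the linear final layer is ordinary least squares. A self-contained sketch via the normal equations (illustrative only, not the paper's code; a production system would use a pseudo-inverse routine, and regularization would enter as an extra term in the normal matrix):

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w

def final_layer_weights(H, t):
    """Least-squares weights for a linear final layer: solve
    (H^T H) w = H^T t, where H holds hidden-layer activations."""
    n = len(H[0])
    HtH = [[sum(row[i] * row[j] for row in H) for j in range(n)] for i in range(n)]
    Htt = [sum(row[i] * ti for row, ti in zip(H, t)) for i in range(n)]
    return solve(HtH, Htt)
```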

99 citations


Proceedings ArticleDOI
W. Jang1, Zeungnam Bien1
09 Apr 1991
TL;DR: By means of various examples, the proposed method of feature-based robot servoing is shown to be very effective for conducting object-oriented robotic tasks.
Abstract: A method is presented for using image features in servoing a robot manipulator. Specifically, the concept of a feature is mathematically defined, and the differential relationship between the robot motion and feature vector is derived in terms of a feature Jacobian matrix and its generalized inverse. The feature-based PID (proportional-integral-derivative) controller is established with three scalar gains and an n*n matrix. By means of various examples, the proposed method of feature-based robot servoing is shown to be very effective for conducting object-oriented robotic tasks.

95 citations


Journal ArticleDOI
01 Mar 1991
TL;DR: Algorithms for dimensionality reduction and feature extraction and their applications as effective pattern recognizers in identifying computer users are presented and the applications of these algorithms could lead to better results in securing access to computer systems.
Abstract: Algorithms for dimensionality reduction and feature extraction and their applications as effective pattern recognizers in identifying computer users are presented. Fisher's linear discriminant technique was used for the reduction of dimensionality of the patterns. An approach for the extraction of physical features from pattern vectors is developed. This approach relies on shuffling two pattern vectors. The shuffling approach is competitive with the use of Fisher's technique in terms of speed and results. An online identification system was developed. The system was tested over a period of five weeks, used by ten participants, and in 1.17% of cases failed to reach a decision. The applications of these algorithms in identifying computer users could lead to better results in securing access to computer systems. The user types a password, and the system identifies not only the word but the time between each keystroke and the next.
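Fisher's linear discriminant, which the paper uses for dimensionality reduction, can be sketched for two classes of 2-D patterns (e.g. two keystroke-latency features). This is a generic textbook version, not the authors' implementation:

```python
def fisher_direction(class_a, class_b):
    """Fisher's linear discriminant direction for 2-D patterns:
    w = Sw^{-1} (m_a - m_b), with an explicit 2x2 inverse."""
    def mean(pts):
        n = len(pts)
        return [sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n]
    ma, mb = mean(class_a), mean(class_b)
    # pooled within-class scatter matrix
    S = [[0.0, 0.0], [0.0, 0.0]]
    for pts, m in ((class_a, ma), (class_b, mb)):
        for p in pts:
            d = [p[0] - m[0], p[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    S[i][j] += d[i] * d[j]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    inv = [[S[1][1] / det, -S[0][1] / det],
           [-S[1][0] / det, S[0][0] / det]]
    dm = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * dm[0] + inv[0][1] * dm[1],
            inv[1][0] * dm[0] + inv[1][1] * dm[1]]
```

Projecting patterns onto the returned direction collapses them to one dimension while keeping the two classes separated.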

Journal ArticleDOI
TL;DR: A hybrid decision tree classifier design procedure that produces efficient and accurate classifiers for remote sensing problems is proposed and empirical tests suggest that the hybrid design produces higher accuracy with fewer features.
Abstract: In applying pattern recognition methods in remote sensing problems, an inherent limitation is that there is almost always only a small number of training samples with which to design the classifier. A hybrid decision tree classifier design procedure that produces efficient and accurate classifiers for this situation is proposed. In doing so, several key questions are addressed, among them the question of the feature extraction techniques to be used and the mathematical relationship between sample size, dimensionality, and risk value. Empirical tests comparing the hybrid design classifier with a conventional single-layer one are presented. They suggest that the hybrid design produces higher accuracy with fewer features. The need for fewer features is an important advantage, because it reflects favorably on both the size of the training set needed and the amount of computation time that will be needed in analysis.

Patent
Arturo Pizano1, May-Inn Tan1, Naoto Gambo1
06 Aug 1991
TL;DR: In this article, a pattern recognition system is proposed to classify digitized images of business forms according to a predefined set of templates, which are then stored in a data dictionary and used to determine their class membership.
Abstract: Business forms are a special class of documents typically used to collect or distribute data; they represent a vast majority of the paperwork needed to conduct business. The present invention provides a pattern recognition system that classifies digitized images of business forms according to a predefined set of templates. The process involves a training phase, during which images of the template forms are scanned, analyzed, and stored in a data dictionary, and a recognition phase, during which images of actual forms are compared to the templates in the dictionary to determine their class membership. The invention provides the feature extraction and matching methods, as well as the organization of the form dictionary. The performance of the system was evaluated using a collection of computer-generated test forms. The methodology for creating these forms and the results of the evaluation are also described. Business forms are characterized by the presence of horizontal and vertical lines that delimit the usable space. The present invention identifies these so-called regular lines in bi-level digital images to separate text from graphics before applying an optical character recognizer, or to act as a feature extractor in a form recognition system. The approach differs from existing vectorization, line extraction, and text-graphics separation methods in that it focuses exclusively on the recognition of horizontal and vertical lines.
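Detecting the "regular" horizontal lines of a bi-level form image can be sketched with run-length counting; the vertical case is the same on columns. The function name and threshold are assumptions for illustration, not the patent's method:

```python
def horizontal_lines(bitmap, min_run):
    """Return row indices of a bi-level image (lists of 0/1 pixels)
    containing a horizontal run of set pixels at least min_run long."""
    rows = []
    for y, row in enumerate(bitmap):
        run = best = 0
        for px in row:
            run = run + 1 if px else 0  # extend or reset the current run
            best = max(best, run)
        if best >= min_run:
            rows.append(y)
    return rows
```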

Journal ArticleDOI
TL;DR: An 80386 PC-based system was designed to track automatically multiple, miniature radiopaque markers implanted in the heart wall, eliminating the need for tedious, time-consuming manual digitization of marker coordinates.

Journal ArticleDOI
TL;DR: The paper concludes that neural net paradigms have become sufficiently powerful to justify research programs aimed at implementing APR methods for geophysical inversion.
Abstract: This paper is a philosophical exploration of adaptive pattern recognition paradigms for geophysical data inversion, aimed at overcoming many of the problems faced by current inversion methods. APR (adaptive pattern recognition) methods are based upon encoding exemplar patterns in such a way that their features can be used to classify subsequent test patterns. These paradigms are adaptive in that they learn from experience and are capable of inferring rules to deal with incomplete data. APR paradigms can also be highly effective in dealing with noise and other data distortions through the use of exemplars which characterize such distortions. Rather than merely seeking to reduce the point by point mismatch between data and model curves, effective APR paradigms would match patterns by establishing a feature vocabulary and inferring rules to weight the relative importance of these features in interpreting data. They have the advantage that prototype data sets can include analogue modelling data and field survey data rather than being restricted to models for which a numerical forward model can be calculated. The success of this approach to inversion will depend upon the effectiveness of replacing continuous parameter estimation with microclassification (discretized parameter estimation). Once the viability of APR schemes has been established for inverting data from individual geophysical methods, the task of joint interpretation of data from different geophysical survey methods could be accomplished in an optimum fashion by using hierarchical adaptive schemes. The application of APR to inversion is explored from the standpoint of neural net implementations. The foundations and properties of seven well-known neural net paradigms are examined in terms of several attributes necessary to build an effective inversion system.
Different input space representation concepts for feature extraction and data compression are presented, including moment methods and a non-reversible, generalized Fourier transform method. Both parametric and non-parametric concepts for output space representation are explored as microclassification paradigms for quantitative estimation of Earth properties. The paper concludes that neural net paradigms have become sufficiently powerful to justify research programs aimed at implementing APR methods for geophysical inversion.

Journal ArticleDOI
TL;DR: It is shown how signal processing algorithms combined with pattern recognition techniques allow extraction of information about spectral characteristics and type of modulation.

Proceedings ArticleDOI
03 Jun 1991
TL;DR: The method for the derivation of such invariants, based on Lie group theory and applicable to a wide spectrum of transformation groups, is described and invariant curve parameterizations are developed for affine and projective transformations.
Abstract: Semidifferential invariants, combining coordinates in different points together with their derivatives, are used for the description of planar contours. Their use can be seen as a tradeoff between two extreme strategies currently used in shape recognition: (invariant) feature extraction methods, involving high-order derivatives, and invariant coordinate descriptions, leading to the correspondence problem of reference points. The method for the derivation of such invariants, based on Lie group theory and applicable to a wide spectrum of transformation groups, is described. As an example, invariant curve parameterizations are developed for affine and projective transformations. The usefulness of the approach is illustrated with two examples: (1) recognition of a test set of 12 planar objects viewed under conditions allowing affine approximations, and (2) the detection of symmetry in perspective projections of curves.

Journal ArticleDOI
TL;DR: An analog neural network that can be taught to recognize stimulus sequences is used to recognize the digits in connected speech, using linear circuits for signal filtering and nonlinear circuits for simple decisions, feature extraction, and noise suppression.
Abstract: An analog neural network that can be taught to recognize stimulus sequences is used to recognize the digits in connected speech. The circuit computes in the analog domain, using linear circuits for signal filtering and nonlinear circuits for simple decisions, feature extraction, and noise suppression. An analog perceptron learning rule is used to organize the subset of connections used in the circuit that are specific to the chosen vocabulary. Computer simulations of the learning algorithm and circuit demonstrate recognition scores >99% for a single-speaker connected-digit database. There is no clock; the circuit is data driven, and there is no necessity for endpoint detection or segmentation of the speech signal during recognition. Training in the presence of noise provides noise immunity up to the trained level. For the speech problem studied, the circuit connections need only be accurate to about 3-b digitization depth for optimum performance. The algorithm used maps efficiently onto analog neural network hardware.

Proceedings ArticleDOI
M.A. Shackleton1, W.J. Welsh1
03 Jun 1991
TL;DR: A facial feature classification technique that independently captures both the geometric configuration and the image detail of a particular feature is described and results show that features can be reliably recognized using the representation vectors obtained.
Abstract: A facial feature classification technique that independently captures both the geometric configuration and the image detail of a particular feature is described. The geometric configuration is first extracted by fitting a deformable template to the shape of the feature (for example, an eye) in the image. This information is then used to geometrically normalize the image in such a way that the feature in the image attains a standard shape. The normalized image of the facial feature is then classified in terms of a set of principal components previously obtained from a representative set of training images of similar features. This classification stage yields a representation vector which can be used for recognition matching of the feature in terms of image detail alone without the complication of changes in facial expression. Implementation of the system is described and results are given for its application to a set of test faces. These results show that features can be reliably recognized using the representation vectors obtained.

Journal ArticleDOI
TL;DR: A new method called transformation-ring-projection (TRP) is proposed to achieve the size-orientation-invariance characteristic; it requires only simple and regular operations and lends itself to VLSI implementation to speed up computation for real-time processing.
Abstract: The size-orientation-invariance characteristic plays an important role in pattern recognition. It has many applications in computer vision, optical character recognition (OCR), office automation, electronic publication, graphics, etc. In this paper, a new method called transformation-ring-projection (TRP) is proposed to achieve this characteristic. In TRP, a shape transformation technique is employed to center the pattern image and normalize its size; the ring-projection scheme is used to handle the orientation problem. An experiment was conducted to verify the proposed method in character recognition. The TRP algorithm requires only simple and regular operations, and lends itself to VLSI implementation to speed up computation for real-time processing. A study on VLSI architecture with extensive parallel processing and pipelining capabilities for the proposed TRP algorithm is also presented.
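The ring-projection step can be sketched directly: sum pixel mass in concentric rings about the image centroid, a signature that is unchanged by rotation. An illustrative pure-Python version (the names and ring count are assumed, not taken from the paper):

```python
import math

def ring_projection(image, n_rings):
    """Rotation-invariant signature: total pixel value in each of
    n_rings concentric rings about the image centroid."""
    pts = [(x, y, v) for y, row in enumerate(image)
           for x, v in enumerate(row) if v]
    total = sum(v for _, _, v in pts)
    cx = sum(x * v for x, y, v in pts) / total
    cy = sum(y * v for x, y, v in pts) / total
    rmax = max(math.hypot(x - cx, y - cy) for x, y, v in pts) or 1.0
    sig = [0.0] * n_rings
    for x, y, v in pts:
        b = min(int(math.hypot(x - cx, y - cy) / rmax * n_rings), n_rings - 1)
        sig[b] += v
    return sig
```

Because distances to the centroid are preserved under rotation, a rotated copy of a pattern yields the same signature.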

Journal ArticleDOI
Maan Ammar1
TL;DR: This paper compares the performances of parametric and reference pattern based features (RPBFs) in the verification of skillfully simulated handwritten signatures and shows that two-dimensional RPBFs will give much better performance.
Abstract: This paper compares the performances of parametric and reference-pattern-based features (RPBFs) in the verification of skillfully simulated handwritten signatures. The comparison shows that RPBFs significantly improve results and give about 90% correct verification using only shape features. The performance of the shape features used is independent of the signature's shape, language, and position in the document. Careful analysis of the experimental results of using RPBFs in verification has led to the conclusion that two-dimensional RPBFs will give much better performance.

Proceedings ArticleDOI
09 Apr 1991
TL;DR: A method is presented for feature extraction, which is important for future data matching/fusion procedures, and several techniques of surface reconstruction and their limitations are discussed.
Abstract: Deriving a terrain model from sensor data is an important task for the autonomous navigation of a mobile robot. An approach is presented for autonomous underwater vehicles using a side scan sonar system. Some general aspects of the type of data and filtering techniques to improve it are discussed. An estimated bottom contour is derived using a geometric reflection model and information about shadows and highlights. Several techniques of surface reconstruction and their limitations are presented. A method is presented for feature extraction which is important for future data matching/fusion procedures.

Proceedings ArticleDOI
02 Dec 1991
TL;DR: Preliminary results indicate that both accuracy and real-time response can be achieved using nonparametric statistical methods, which permit considerable data compression and support pattern recognition techniques for identifying user behavior.
Abstract: Obstacles to achieving anomaly detection in real time include the large volume of data associated with user behavior and the nature of that data. The paper describes preliminary results from a research project which is developing a new approach to handling such data. The approach involves nonparametric statistical methods, which permit considerable data compression and support pattern recognition techniques for identifying user behavior. This approach applies these methods to a combination of measurements of resource usage and structural information about the behavior of processes. Preliminary results indicate that both accuracy and real-time response can be achieved using these methods.

Journal ArticleDOI
TL;DR: A feature extraction algorithm and its application in signal analysis are described: two linear approximation methods are used interchangeably to fix the onset and offset of the ECG waves and their parameters.

Journal ArticleDOI
TL;DR: An algorithm for straight edge extraction from intensity images and an algorithm for straight line matching are presented; the matching function characterizes the similarity of edge lines of two images and is based on not only the geometrical relations of the lines but also the information from the intensity images.

Patent
23 Apr 1991
TL;DR: In this paper, a neural network system capable of performing integrated processing of a plurality of information includes a feature extractor group and an information processing unit for learning features of the learning data.
Abstract: A neural network system capable of performing integrated processing of a plurality of information includes a feature extractor group for extracting a plurality of learning feature data from learning data in a learning mode and a plurality of object feature data from object data to be processed in an execution mode, and an information processing unit for learning features of the learning data, based on the plurality of learning feature data from the feature extractor group and corresponding teacher data in the learning mode, and determining final learning result data from the plurality of object feature data from the feature extractor group in accordance with the learning result, including a logic representing relation between the plurality of object feature data in the execution mode.

Journal ArticleDOI
TL;DR: The algorithms developed under the concept of strokes are suitable for recognizing large sets of Chinese characters and do not have to be modified when the number of characters increases.
Abstract: This paper describes typical research on Chinese optical character recognition in Taiwan. Chinese characters can be represented by a set of basic line segments called strokes. Several approaches to the recognition of handwritten Chinese characters by stroke analysis are described here. A typical optical character recognition (OCR) system consists of four main parts: image preprocessing, feature extraction, radical extraction and matching. Image preprocessing is used to provide the suitable format for data processing. Feature extraction is used to extract stable features from the Chinese character. Radical extraction is used to decompose the Chinese character into radicals. Finally, matching is used to recognize the Chinese character. The reasons for using strokes as the features for Chinese character recognition are the following. First, all Chinese characters can be represented by a combination of strokes. Second, the algorithms developed under the concept of strokes do not have to be modified when the number of characters increases. Therefore, the algorithms described in this paper are suitable for recognizing large sets of Chinese characters.

Proceedings ArticleDOI
01 Jan 1991
TL;DR: In this paper, a general framework for the analysis of time sequences is presented, which uses spatio-temporal filtering in a hierarchical structure and features extracted include speed, acceleration and disparity/depth.
Abstract: The paper presents a general framework for the analysis of time sequences. Features extracted include speed, acceleration and disparity/depth. The method uses spatio-temporal filtering in a hierarchical structure. Synthetic and real world examples are included.

Patent
03 May 1991
TL;DR: One rotationally invariant feature extracted by the system is the number of intercepts between boundary transitions in the image with at least a selected one of a plurality of radii centered at the centroid of the character.
Abstract: A feature-based optical character recognition system, employing a feature-based recognition device such as a neural network or an absolute distance measure device, extracts a set of features from segmented character images in a document, at least some of the extracted features being at least nearly impervious to rotation or skew of the document image, so as to enhance the reliability of the system. One rotationally invariant feature extracted by the system is the number of intercepts between boundary transitions in the image with at least a selected one of a plurality of radii centered at the centroid of the character in the image.
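The radius-intercept feature can be sketched by counting sign changes of (distance − radius) along the ordered character boundary; the number of circle crossings is unchanged when the character rotates about its centroid. A hypothetical sketch, not the patent's implementation:

```python
import math

def radius_intercepts(boundary, cx, cy, radius):
    """Count how many times an ordered, closed boundary crosses the
    circle of given radius centred at the centroid (cx, cy)."""
    d = [math.hypot(x - cx, y - cy) - radius for x, y in boundary]
    # a sign change between consecutive boundary points is one crossing
    return sum(1 for a, b in zip(d, d[1:] + d[:1]) if (a < 0) != (b < 0))
```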

Proceedings ArticleDOI
08 Jul 1991
TL;DR: The authors compare the object classification performance in an automated cytology screener that consists of a Sun workstation, a DataCube image processing system, and an automatic stage/optics/illumination system.
Abstract: A squamous intraepithelial lesion (SIL) detection algorithm has been developed to process conventional Pap smears yielding a superior result (J.S.-J. Lee et al., 1990). The authors compare the object classification performance in an automated cytology screener. It consists of a Sun workstation, a DataCube image processing system, and an automatic stage/optics/illumination system. The system allows automated screening of 10 slides unattended. The main functional modules of the SIL algorithm include: image segmentation, feature extraction, and object classification. The classifiers used include neural network classifiers, statistical binary decision tree classifiers, a hybrid classifier, and the integration of multiple classifiers in an attempt to further improve algorithm performance.