
Showing papers on "Feature extraction published in 1997"


Journal ArticleDOI
TL;DR: The results demonstrate that subvoxel accuracy with respect to the stereotactic reference solution can be achieved completely automatically and without any prior segmentation, feature extraction, or other preprocessing steps, which makes this method very well suited for clinical applications.
Abstract: A new approach to the problem of multimodality medical image registration is proposed, using a basic concept from information theory, mutual information (MI), or relative entropy, as a new matching criterion. The method presented in this paper applies MI to measure the statistical dependence or information redundancy between the image intensities of corresponding voxels in both images, which is assumed to be maximal if the images are geometrically aligned. Maximization of MI is a very general and powerful criterion, because no assumptions are made regarding the nature of this dependence and no limiting constraints are imposed on the image content of the modalities involved. The accuracy of the MI criterion is validated for rigid body registration of computed tomography (CT), magnetic resonance (MR), and positron emission tomography (PET) images by comparison with the stereotactic registration solution, while robustness is evaluated with respect to implementation issues, such as interpolation and optimization, and image content, including partial overlap and image degradation. Our results demonstrate that subvoxel accuracy with respect to the stereotactic reference solution can be achieved completely automatically and without any prior segmentation, feature extraction, or other preprocessing steps, which makes this method very well suited for clinical applications.
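
A minimal sketch of the MI matching criterion described above, using a joint intensity histogram; the helper name, the 32-bin choice, and the use of natural logarithms are illustrative assumptions, not the authors' implementation. A registration loop would maximize this value over rigid-body transforms of one of the images.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Estimate MI between corresponding voxel intensities of two images."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()               # joint probability p(a, b)
    px = pxy.sum(axis=1, keepdims=True)     # marginal p(a)
    py = pxy.sum(axis=0, keepdims=True)     # marginal p(b)
    nz = pxy > 0                            # skip empty cells: 0 log 0 = 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```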

4,773 citations


Journal ArticleDOI
TL;DR: A novel fast algorithm for independent component analysis is introduced, which can be used for blind source separation and feature extraction, and the convergence speed is shown to be cubic.
Abstract: We introduce a novel fast algorithm for independent component analysis, which can be used for blind source separation and feature extraction. We show how a neural network learning rule can be transformed into a fixed-point iteration, which provides an algorithm that is very simple, does not depend on any user-defined parameters, and is fast to converge to the most accurate solution allowed by the data. The algorithm finds, one at a time, all nongaussian independent components, regardless of their probability distributions. The computations can be performed in either batch mode or a semiadaptive manner. The convergence of the algorithm is rigorously proved, and the convergence speed is shown to be cubic. Some comparisons to gradient-based algorithms are made, showing that the new algorithm is usually 10 to 100 times faster, sometimes giving the solution in just a few iterations.
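
A minimal sketch of the one-unit fixed-point iteration with the kurtosis-based contrast, assuming whitened, zero-mean data X of shape (mixtures x samples); deflation for further components and the convergence analysis are omitted, and all names here are illustrative.

```python
import numpy as np

def fastica_one_unit(X, n_iter=100, tol=1e-6, rng=np.random.default_rng(0)):
    """Estimate one independent component from whitened data X."""
    n, m = X.shape
    w = rng.standard_normal(n)
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        # Fixed point for the kurtosis contrast: w+ = E{x (w^T x)^3} - 3w
        w_new = (X * (w @ X) ** 3).mean(axis=1) - 3 * w
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1) < tol:   # converged (up to sign flip)
            return w_new
        w = w_new
    return w
```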

3,215 citations


Journal ArticleDOI
TL;DR: A hybrid neural-network system for human face recognition that compares favourably with other methods; the authors analyze its computational complexity and discuss how new classes could be added to the trained recognizer.
Abstract: We present a hybrid neural-network for human face recognition which compares favourably with other methods. The system combines local image sampling, a self-organizing map (SOM) neural network, and a convolutional neural network. The SOM provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample, and the convolutional neural network provides partial invariance to translation, rotation, scale, and deformation. The convolutional network extracts successively larger features in a hierarchical set of layers. We present results using the Karhunen-Loeve transform in place of the SOM, and a multilayer perceptron (MLP) in place of the convolutional network for comparison. We use a database of 400 images of 40 individuals which contains quite a high degree of variability in expression, pose, and facial details. We analyze the computational complexity and discuss how new classes could be added to the trained recognizer.

2,954 citations


Journal ArticleDOI
TL;DR: This work studies the problem of choosing an optimal feature set for land use classification based on SAR satellite images using four different texture models and shows that pooling features derived from different texture models, followed by feature selection, results in a substantial improvement in the classification accuracy.
Abstract: A large number of algorithms have been proposed for feature subset selection. Our experimental results show that the sequential forward floating selection algorithm, proposed by Pudil et al. (1994), dominates the other algorithms tested. We study the problem of choosing an optimal feature set for land use classification based on SAR satellite images using four different texture models. Pooling features derived from different texture models, followed by feature selection, results in a substantial improvement in the classification accuracy. We also illustrate the dangers of using feature selection in small sample size situations.
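
A simplified sketch of the sequential forward floating selection (SFFS) strategy the experiments favour, assuming a caller-supplied criterion J(subset) to maximize (e.g. cross-validated classification accuracy); the bookkeeping follows the usual textbook formulation, not any code published with the paper.

```python
def sffs(features, J, k_target):
    """features: set of feature indices; J: subset criterion to maximize."""
    subset = set()
    best = {}                               # best (subset, score) per size
    while len(subset) < k_target:
        # Inclusion: add the single most significant remaining feature.
        f_add = max(features - subset, key=lambda f: J(subset | {f}))
        subset.add(f_add)
        score = J(subset)
        if score > best.get(len(subset), (None, float('-inf')))[1]:
            best[len(subset)] = (frozenset(subset), score)
        # Conditional exclusion ("floating"): backtrack while dropping a
        # feature beats the best subset previously found at that size.
        while len(subset) > 2:
            f_rm = max(subset, key=lambda f: J(subset - {f}))
            if J(subset - {f_rm}) > best[len(subset) - 1][1]:
                subset.remove(f_rm)
                best[len(subset)] = (frozenset(subset), J(subset))
            else:
                break
    return best[k_target][0]
```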

2,238 citations


Book ChapterDOI
08 Oct 1997
TL;DR: A new method for performing a nonlinear form of Principal Component Analysis by the use of integral operator kernel functions is proposed and experimental results on polynomial feature extraction for pattern recognition are presented.
Abstract: A new method for performing a nonlinear form of Principal Component Analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map, for instance the space of all possible d-pixel products in images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.
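
A compact sketch of the technique on training data X (samples x dims) with a polynomial kernel: the Gram matrix is centred in feature space, eigendecomposed, and the normalized eigenvectors give the nonlinear principal component projections. The function name, kernel degree, and component count are illustrative choices.

```python
import numpy as np

def kernel_pca(X, n_components=2, degree=2):
    n = X.shape[0]
    K = (X @ X.T) ** degree                 # polynomial kernel k(x,y)=(x.y)^d
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # centre in feature space
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]  # leading eigenpairs
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                      # projections of training points
```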

2,223 citations


Journal ArticleDOI
TL;DR: An improved version of the minutia extraction algorithm proposed by Ratha et al. (1995), which is much faster and more reliable, is implemented for extracting features from an input fingerprint image captured with an online inkless scanner and an alignment-based elastic matching algorithm has been developed.
Abstract: Fingerprint verification is one of the most reliable personal identification methods. However, manual fingerprint verification is incapable of meeting today's increasing performance requirements. An automatic fingerprint identification system (AFIS) is needed. This paper describes the design and implementation of an online fingerprint verification system which operates in two stages: minutia extraction and minutia matching. An improved version of the minutia extraction algorithm proposed by Ratha et al. (1995), which is much faster and more reliable, is implemented for extracting features from an input fingerprint image captured with an online inkless scanner. For minutia matching, an alignment-based elastic matching algorithm has been developed. This algorithm is capable of finding the correspondences between minutiae in the input image and the stored template without resorting to exhaustive search and has the ability of adaptively compensating for the nonlinear deformations and inexact pose transformations between fingerprints. The system has been tested on two sets of fingerprint images captured with inkless scanners. The verification accuracy is found to be acceptable. Typically, a complete fingerprint verification procedure takes, on average, about eight seconds on a SPARC 20 workstation. These experimental results show that our system meets the response time requirements of online verification with high accuracy.

1,376 citations


Proceedings ArticleDOI
17 Jun 1997
TL;DR: This paper shows that a new variant of the k-d tree search algorithm makes indexing in higher-dimensional spaces practical, and is integrated into a fully developed recognition system, which is able to detect complex objects in real, cluttered scenes in just a few seconds.
Abstract: Shape indexing is a way of making rapid associations between features detected in an image and object models that could have produced them. When model databases are large, the use of high-dimensional features is critical, due to the improved level of discrimination they can provide. Unfortunately, finding the nearest neighbour to a query point rapidly becomes inefficient as the dimensionality of the feature space increases. Past indexing methods have used hash tables for hypothesis recovery, but only in low-dimensional situations. In this paper we show that a new variant of the k-d tree search algorithm makes indexing in higher-dimensional spaces practical. This Best Bin First, or BBF, search is an approximate algorithm which finds the nearest neighbour for a large fraction of the queries, and a very close neighbour in the remaining cases. The technique has been integrated into a fully developed recognition system, which is able to detect complex objects in real, cluttered scenes in just a few seconds.
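
A simplified sketch of the Best Bin First idea over NumPy points: the tree is descended greedily, unexplored branches are queued by their distance from the query to the splitting plane, and the search cuts off after a fixed number of point examinations. The `max_checks` parameter and all names are illustrative assumptions, not the paper's settings.

```python
import heapq
import numpy as np

class Node:
    def __init__(self, point, dim, left, right):
        self.point, self.dim, self.left, self.right = point, dim, left, right

def build(points, depth=0):
    """Build a standard k-d tree; points is an (N, d) array."""
    if len(points) == 0:
        return None
    dim = depth % points.shape[1]
    points = points[points[:, dim].argsort()]
    mid = len(points) // 2
    return Node(points[mid], dim,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))

def bbf_nearest(root, q, max_checks=50):
    best, best_d = None, np.inf
    heap, counter, checks = [(0.0, 0, root)], 1, 0
    while heap and checks < max_checks:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_d:
            break                           # nearest untried bin is too far
        while node is not None:
            d = float(np.sum((q - node.point) ** 2))
            checks += 1
            if d < best_d:
                best, best_d = node.point, d
            diff = q[node.dim] - node.point[node.dim]
            near, far = ((node.left, node.right) if diff < 0
                         else (node.right, node.left))
            if far is not None:             # defer far branch, keyed by its
                heapq.heappush(heap, (diff * diff, counter, far))  # plane dist
                counter += 1
            node = near
    return best
```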

1,044 citations


Proceedings ArticleDOI
21 Apr 1997
TL;DR: A real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input is constructed and extensive data on system performance and the cross-validated training/test setup used to evaluate the system is provided.
Abstract: We report on the construction of a real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input. We have examined 13 features intended to measure conceptually distinct properties of speech and/or music signals, and combined them in several multidimensional classification frameworks. We provide extensive data on system performance and the cross-validated training/test setup used to evaluate the system. For the datasets currently in use, the best classifier classifies with 5.8% error on a frame-by-frame basis, and 1.4% error when integrating long (2.4 second) segments of sound.
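
Two frame-level measurements of the kind such a discriminator draws on, sketched for illustration: zero-crossing rate (typically higher and more variable for speech) and spectral centroid. These are generic audio features; the paper's actual 13 features and its classifiers are not reproduced here.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    return float(np.mean(np.abs(np.diff(np.signbit(frame).astype(int)))))

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame's spectrum (Hz)."""
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return float((freqs * spectrum).sum() / (spectrum.sum() + 1e-12))
```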

1,028 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a technique for constructing random fields from a set of training samples, where each feature has a weight that is trained by minimizing the Kullback-Leibler divergence between the model and the empirical distribution of the training data.
Abstract: We present a technique for constructing random fields from a set of training samples. The learning paradigm builds increasingly complex fields by allowing potential functions, or features, that are supported by increasingly large subgraphs. Each feature has a weight that is trained by minimizing the Kullback-Leibler divergence between the model and the empirical distribution of the training data. A greedy algorithm determines how features are incrementally added to the field and an iterative scaling algorithm is used to estimate the optimal values of the weights. The random field models and techniques introduced in this paper differ from those common to much of the computer vision literature in that the underlying random fields are non-Markovian and have a large number of parameters that must be estimated. Relations to other learning approaches, including decision trees, are given. As a demonstration of the method, we describe its application to the problem of automatic word classification in natural language processing.

998 citations


Journal ArticleDOI
TL;DR: The discriminatory power of various human facial features is studied and a new scheme for Automatic Face Recognition (AFR) is proposed and an efficient projection-based feature extraction and classification scheme for AFR is proposed.
Abstract: In this paper the discriminatory power of various human facial features is studied and a new scheme for Automatic Face Recognition (AFR) is proposed. Using Linear Discriminant Analysis (LDA) of different aspects of human faces in spatial domain, we first evaluate the significance of visual information in different parts/features of the face for identifying the human subject. The LDA of faces also provides us with a small set of features that carry the most relevant information for classification purposes. The features are obtained through eigenvector analysis of scatter matrices with the objective of maximizing between-class and minimizing within-class variations. The result is an efficient projection-based feature extraction and classification scheme for AFR. Soft decisions made based on each of the projections are combined, using probabilistic or evidential approaches to multisource data analysis. For medium-sized databases of human faces, good classification accuracy is achieved using very low-dimensional feature vectors.
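
A minimal sketch of the scatter-matrix eigenanalysis behind such a projection-based extractor, assuming row-wise samples X and integer class labels y; the pseudo-inverse is a guard against a singular within-class scatter matrix, an implementation convenience rather than part of the original formulation.

```python
import numpy as np

def lda_projection(X, y, n_components):
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))                   # within-class scatter
    Sb = np.zeros((d, d))                   # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Directions maximizing between-class vs. within-class variation.
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(vals.real)[::-1][:n_components]
    W = vecs[:, order].real
    return X @ W                            # low-dimensional feature vectors
```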

892 citations


Journal ArticleDOI
TL;DR: This work employs the new geometric active contour models, previously formulated, for edge detection and segmentation of magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound medical imagery; defining feature-based metrics on a given image leads to a novel snake paradigm in which the feature of interest may be considered to lie at the bottom of a potential well.
Abstract: We employ the new geometric active contour models, previously formulated, for edge detection and segmentation of magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound medical imagery. Our method is based on defining feature-based metrics on a given image, which in turn leads to a novel snake paradigm in which the feature of interest may be considered to lie at the bottom of a potential well. Thus, the snake is attracted very quickly and efficiently to the desired feature.
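
A small sketch of the "potential well" intuition: an edge-stopping function that is near zero on strong edges and close to one in flat regions, so a curve evolving under the induced metric settles onto boundaries. The specific function below is a common textbook choice, assumed here for illustration rather than taken from the paper.

```python
import numpy as np

def edge_stopping(image):
    """g(I) = 1 / (1 + |grad I|^2): small at edges, ~1 in smooth regions."""
    gy, gx = np.gradient(image.astype(float))
    grad_mag2 = gx ** 2 + gy ** 2
    return 1.0 / (1.0 + grad_mag2)          # the "well" sits along edges
```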

Journal ArticleDOI
TL;DR: A new form of point representation for describing 3D free-form surfaces is proposed, which serves to describe the structural neighbourhood of a point in a more complete manner than just using the 3D coordinates of the point.
Abstract: Few systems capable of recognizing complex objects with free-form (sculptured) surfaces have been developed. The apparent lack of success is mainly due to the lack of a competent modelling scheme for representing such complex objects. In this paper, a new form of point representation for describing 3D free-form surfaces is proposed. This representation, which we call the point signature, serves to describe the structural neighbourhood of a point in a more complete manner than just using the 3D coordinates of the point. Being invariant to rotation and translation, the point signature can be used directly to hypothesize the correspondence to model points with similar signatures. Recognition is achieved by matching the signatures of data points representing the sensed surface to the signatures of data points representing the model surface. The use of point signatures is not restricted to the recognition of a single-object scene against a small library of models. Instead, it can be extended naturally to the recognition of scenes containing multiple partially-overlapping objects (which may also be juxtaposed with each other) against a large model library. No preliminary phase of segmenting the scene into the component objects is required. In searching for the appropriate candidate model, recognition need not proceed in a linear order, which can become prohibitive for a large model library. For a given scene, signatures are extracted at arbitrarily spaced seed points. Each of these signatures is used to vote for models that contain points having similar signatures. Inappropriate models with low votes can be rejected while the remaining candidate models are ordered according to the votes they received. In this way, efficient verification of the hypothesized candidates can proceed by testing the most likely model first. Experiments using real data obtained from a range finder have shown fast recognition from a library of fifteen models whose complexities vary from that of simple piecewise quadric shapes to complicated face masks. Results from the recognition of both single-object and multiple-object scenes are presented.

Journal ArticleDOI
TL;DR: The paper demonstrates a successful application of PDBNN to face recognition on two public (FERET and ORL) databases and one in-house (SCR) database; recognition accuracies as well as false rejection and false acceptance rates are reported for all three.
Abstract: This paper proposes a face recognition system based on probabilistic decision-based neural networks (PDBNN). With technological advances in microelectronics and vision systems, high-performance automatic techniques for biometric recognition are now becoming economically feasible. Among all the biometric identification methods, face recognition has attracted much attention in recent years because it has the potential to be the most nonintrusive and user-friendly. The PDBNN face recognition system consists of three modules: first, a face detector finds the location of a human face in an image; then an eye localizer determines the positions of both eyes in order to generate meaningful feature vectors. The facial region used contains the eyebrows, eyes, and nose, but excludes the mouth (eyeglasses are allowed). Lastly, the third module is a face recognizer. The PDBNN can be effectively applied to all three modules. It adopts a hierarchical network structure with nonlinear basis functions and a competitive credit-assignment scheme. The paper demonstrates a successful application of PDBNN to face recognition on two public (FERET and ORL) databases and one in-house (SCR) database. Experimental results on the three databases, including recognition accuracies as well as false rejection and false acceptance rates, are elaborated. As to processing speed, the whole recognition process (including PDBNN processing for eye localization, feature extraction, and classification) takes approximately one second on a Sparc 10, without using a hardware accelerator or co-processor.

Journal ArticleDOI
01 Oct 1997
TL;DR: Geometric hashing, a technique originally developed in computer vision for matching geometric features against a database of such features, finds use in a number of other areas.
Abstract: Geometric hashing, a technique originally developed in computer vision for matching geometric features against a database of such features, finds use in a number of other areas. Matching is possible even when the recognizable database objects have undergone transformations or when only partial information is present. The technique is highly efficient and of low polynomial complexity.
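
A toy sketch of the preprocessing stage for 2-D point features under similarity transformations: every ordered model point pair defines a basis, the remaining points are expressed in that basis, and the quantized coordinates index a hash table of (model, basis) entries. Names and the quantization step are illustrative. At recognition time, bases formed from scene points hash their remaining points into the same table and tally votes, so matching survives transformations and partial occlusion.

```python
from collections import defaultdict
import numpy as np

def basis_coords(p, origin, axis_pt):
    """Express point p in the similarity-invariant frame (origin, axis_pt)."""
    v = axis_pt - origin
    scale = np.hypot(*v)
    ang = np.arctan2(v[1], v[0])
    rot = np.array([[np.cos(-ang), -np.sin(-ang)],
                    [np.sin(-ang),  np.cos(-ang)]])
    return rot @ (p - origin) / scale

def build_table(models, q=0.25):
    """models: dict name -> (N, 2) array of feature points."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i in range(len(pts)):
            for j in range(len(pts)):
                if i == j:
                    continue
                for k, p in enumerate(pts):
                    if k in (i, j):
                        continue
                    u = basis_coords(p, pts[i], pts[j])
                    key = tuple(np.round(u / q).astype(int))  # quantized bin
                    table[key].append((name, i, j))           # vote entry
    return table
```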

Proceedings ArticleDOI
26 Oct 1997
TL;DR: A new method for invisibly watermarking high-quality color and gray-scale images intended for use in image verification applications, where one is interested in knowing whether the content of an image has been altered since some earlier time, perhaps because of the act of a malicious party.
Abstract: We propose a new method for invisibly watermarking high-quality color and gray-scale images. This method is intended for use in image verification applications, where one is interested in knowing whether the content of an image has been altered since some earlier time, perhaps because of the act of a malicious party. It consists of both a watermark stamping process which embeds a watermark in a source image, and a watermark extraction process which extracts a watermark from a stamped image. The extracted watermark can be used to determine whether the image has been altered. The processing used in the stamping and extraction processes is presented. We also discuss some advantages of this technique over other invisible watermarking techniques for the verification application; these include a high degree of invisibility, color preservation, ease of decoding, and a high degree of protection against retention of the watermark after unauthorized alterations.

Journal ArticleDOI
TL;DR: This paper proposes the use of a three-layer feedforward neural network to select those input attributes that are most useful for discriminating classes in a given set of input patterns.
Abstract: Feature selection is an integral part of most learning algorithms. Due to the existence of irrelevant and redundant attributes, by selecting only the relevant attributes of the data, higher predictive accuracy can be expected from a machine learning method. In this paper, we propose the use of a three-layer feedforward neural network to select those input attributes that are most useful for discriminating classes in a given set of input patterns. A network pruning algorithm is the foundation of the proposed algorithm. By adding a penalty term to the error function of the network, redundant network connections can be distinguished from the relevant ones by their small weights once the network training process has been completed. A simple criterion to remove an attribute based on the accuracy rate of the network is developed. The network is retrained after removal of an attribute, and the selection process is repeated until no attribute meets the criterion for removal. Our experimental results suggest that the proposed method works very well on a wide variety of classification problems.

Journal ArticleDOI
TL;DR: Chi2 is a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data and achieves feature selection via discretization.
Abstract: Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant and/or redundant attributes. Chi2 is a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data. It achieves feature selection via discretization. It can handle mixed attributes, work with multiclass data, and remove irrelevant and redundant attributes.
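
A sketch of the χ² statistic such a discretizer computes for a pair of adjacent intervals, assuming `counts` is a 2 x k array of per-class instance counts for the two intervals; the small-value guard for empty cells is an implementation convenience here, not necessarily the paper's exact convention. Adjacent intervals whose statistic falls below the current significance threshold get merged, and the threshold is relaxed round by round until further merging would make the data inconsistent.

```python
import numpy as np

def chi2_statistic(counts):
    """Pearson chi-square for a 2 x k contingency table of class counts."""
    counts = np.asarray(counts, dtype=float)
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    expected = row @ col / counts.sum()
    expected[expected == 0] = 1e-12          # guard against empty cells
    return float(((counts - expected) ** 2 / expected).sum())
```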

Journal ArticleDOI
TL;DR: Experimental results prove that the approach using the variable duration outperforms the method using fixed duration in terms of both accuracy and speed.
Abstract: A fast method of handwritten word recognition suitable for real time applications is presented in this paper. Preprocessing, segmentation and feature extraction are implemented using a chain code representation of the word contour. Dynamic matching between characters of a lexicon entry and segment(s) of the input word image is used to rank the lexicon entries in order of best match. A variable duration for each character is defined and used during the matching. Experimental results show that our approach using the variable duration outperforms the method using fixed duration in terms of both accuracy and speed. The entire recognition process takes about 200 msec on a single SPARC-10 platform, and a recognition accuracy of 96.8 percent is achieved for a lexicon size of 10, on a database of postal words captured at 212 dpi.

Journal Article
Armin Gruen1, Haihong Li
TL;DR: This paper deals with semi-automatic linear feature extraction from digital images for GIS data capture, where the identification task is performed manually on a single image, while a special automatic digital module performs the high-precision feature tracking in two-dimensional image space or even three-dimensional object space.
Abstract: This paper deals with semi-automatic linear feature extraction from digital images for GIS data capture, where the identification task is performed manually on a single image, while a special automatic digital module performs the high-precision feature tracking in two-dimensional (2-D) image space or even three-dimensional (3-D) object space. A human operator identifies the object from an on-screen display of a digital image, selects the particular class this object belongs to, and provides a very few coarsely distributed seed points. Subsequently, with these seed points as an approximation of the position and shape, the linear feature will be extracted automatically by either a dynamic programming approach or by LSB-Snakes (Least-Squares B-spline Snakes). With dynamic programming, the optimization problem is set up as a discrete multistage decision process and is solved by a "time-delayed" algorithm. It ensures global optimality, is numerically stable, and allows for hard constraints to be enforced on the solution. In the least-squares approach, we combine three types of observation equations: one radiometric, formulating the matching of a generic object model with image data, and two that express the internal geometric constraints of a curve and the location of operator-given seed points. The solution is obtained by solving a pair of independent normal equations to estimate the parameters of the spline curve. Both techniques can be used in a monoplotting mode, which combines one image with its underlying DTM. The LSB-Snakes approach is also implemented in a multi-image mode, which uses multiple images simultaneously and provides for a robust and mathematically sound full 3D approach. These techniques are not restricted to aerial images; they can be applied to satellite and close-range images as well. The issues related to the mathematical modeling of the proposed methods are discussed and experimental results are shown in this paper too.

Proceedings ArticleDOI
17 Jun 1997
TL;DR: An active-camera real-time system for tracking, shape description, and classification of the human face and mouth using only an SGI Indy computer using 2-D blob features, which are spatially-compact clusters of pixels that are similar in terms of low-level image properties.
Abstract: This paper describes an active-camera real-time system for tracking, shape description, and classification of the human face and mouth using only an SGI Indy computer. The system is based on use of 2-D blob features, which are spatially-compact clusters of pixels that are similar in terms of low-level image properties. Patterns of behavior (e.g., facial expressions and head movements) can be classified in real-time using Hidden Markov Model (HMM) methods. The system has been tested on hundreds of users and has demonstrated extremely reliable and accurate performance. Typical classification accuracies are near 100%.

Proceedings ArticleDOI
TL;DR: A relevance feedback based interactive retrieval approach which effectively takes into account the gap between high-level concepts and low-level features and the subjectivity of human perception of visual content; it greatly reduces the user's effort of composing a query and captures the user's information need more precisely.
Abstract: Content-based image retrieval (CBIR) has become one of the most active research areas in the past few years. Many visual feature representations have been explored and many systems built. While these research efforts establish the basis of CBIR, the usefulness of the proposed approaches is limited. Specifically, these efforts have relatively ignored two distinct characteristics of CBIR systems: (1) the gap between high level concepts and low level features; (2) subjectivity of human perception of visual content. This paper proposes a relevance feedback based interactive retrieval approach, which effectively takes into account the above two characteristics in CBIR. During the retrieval process, the user's high level query and perception subjectivity are captured by dynamically updated weights based on the user's relevance feedback. The experimental results show that the proposed approach greatly reduces the user's effort of composing a query and captures the user's information need more precisely.
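
A hedged sketch of one common form of the weight update such systems use, namely variance-based re-weighting over the images the user marks relevant; the paper's exact update rule is not reproduced here. Features that vary little across the relevant examples are treated as the ones the user cares about and receive high weight in the retrieval metric.

```python
import numpy as np

def update_weights(relevant_feats):
    """relevant_feats: (n_relevant x n_features) matrix of feedback images."""
    sigma = relevant_feats.std(axis=0)
    w = 1.0 / (sigma + 1e-6)                # stable features get high weight
    return w / w.sum()                      # normalize to unit sum

def weighted_distance(query, candidate, w):
    """Rank database images by this weighted metric after each feedback round."""
    return float(np.sqrt((w * (query - candidate) ** 2).sum()))
```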

Book ChapterDOI
17 Sep 1997
TL;DR: In this article, video-based recognition of isolated signs is proposed; concentrating on the manual parameters of sign language, the system aims for the signer-dependent recognition of 262 different signs.
Abstract: This paper is concerned with the video-based recognition of isolated signs. Concentrating on the manual parameters of sign language, the system aims for the signer-dependent recognition of 262 different signs. For hidden Markov modelling, a sign is considered a doubly stochastic process, represented by an unobservable state sequence. The observations emitted by the states are regarded as feature vectors that are extracted from video frames. The system achieves recognition rates up to 94%.

Journal ArticleDOI
01 Jun 1997
TL;DR: Three novel feature extraction schemes for texture classification are proposed, indicating that the wavelet-based approach is the most accurate, exhibits the best noise performance and has the lowest computational complexity.
Abstract: Three novel feature extraction schemes for texture classification are proposed. The schemes employ the wavelet transform, a circularly symmetric Gabor filter or a Gaussian Markov random field with a circular neighbour set to achieve rotation-invariant texture classification. The schemes are shown to give a high level of classification accuracy compared to most existing schemes, using both fewer features (four) and a smaller area of analysis (16×16). Furthermore, unlike most existing schemes, the proposed schemes are shown to be rotation invariant and to demonstrate a high level of robustness to noise. The performances of the three schemes are compared, indicating that the wavelet-based approach is the most accurate, exhibits the best noise performance and has the lowest computational complexity.
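
A sketch of wavelet-based texture features in the spirit of the first scheme: channel energies of a 2-D wavelet decomposition of a 16×16 patch, with horizontal and vertical detail energies pooled per level as one simple route toward rotation-insensitive features. This pooling and the Haar wavelet are assumptions for illustration, not the paper's exact feature set; requires the PyWavelets package.

```python
import numpy as np
import pywt

def wavelet_texture_features(patch, levels=2, wavelet='haar'):
    """Return per-level subband energies of a 2-D texture patch."""
    coeffs = pywt.wavedec2(patch, wavelet, level=levels)
    feats = []
    for cH, cV, cD in coeffs[1:]:
        # Pool horizontal + vertical detail energy so a 90-degree rotation
        # of the texture leaves the feature unchanged.
        feats.append(np.mean(cH ** 2) + np.mean(cV ** 2))
        feats.append(np.mean(cD ** 2))
    return np.array(feats)
```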

Journal ArticleDOI
TL;DR: A very large family of binary features for two-dimensional shapes determined by inductive learning during the construction of classification trees is introduced, which makes it possible to narrow the search for informative ones at each node of the tree.
Abstract: We introduce a very large family of binary features for two-dimensional shapes. The salient ones for separating particular shapes are determined by inductive learning during the construction of classification trees. There is a feature for every possible geometric arrangement of local topographic codes. The arrangements express coarse constraints on relative angles and distances among the code locations and are nearly invariant to substantial affine and nonlinear deformations. They are also partially ordered, which makes it possible to narrow the search for informative ones at each node of the tree. Different trees correspond to different aspects of shape. They are statistically and weakly dependent due to randomization and are aggregated in a simple way. Adapting the algorithm to a shape family is then fully automatic once training samples are provided. As an illustration, we classified handwritten digits from the NIST database; the error rate was 0.7 percent.

Journal ArticleDOI
TL;DR: The application of deformable templates to recognition of handprinted digits shows that there does exist a good low-dimensional representation space and methods to reduce the computational requirements, the primary limiting factor, are discussed.
Abstract: We investigate the application of deformable templates to recognition of handprinted digits. Two characters are matched by deforming the contour of one to fit the edge strengths of the other, and a dissimilarity measure is derived from the amount of deformation needed, the goodness of fit of the edges, and the interior overlap between the deformed shapes. Classification using the minimum dissimilarity results in recognition rates up to 99.25 percent on a 2,000 character subset of NIST Special Database 1. Additional experiments on independent test data were done to demonstrate the robustness of this method. Multidimensional scaling is also applied to the 2,000×2,000 proximity matrix, using the dissimilarity measure as a distance, to embed the patterns as points in low-dimensional spaces. A nearest neighbor classifier is applied to the resulting pattern matrices. The classification accuracies obtained in the derived feature space demonstrate that there does exist a good low-dimensional representation space. Methods to reduce the computational requirements, the primary limiting factor of this method, are discussed.

01 Jan 1997
TL;DR: This thesis proposes an alternate architecture that goes beyond the basilar-membrane model, and, using which, auditory features can be computed in real time, and presents a unified framework for the problem of dimension reduction and HMM parameter estimation by modeling the original features with reduced-rank HMM.
Abstract: Biologically motivated feature extraction algorithms have been found to provide significantly robust performance in speech recognition systems, in the presence of channel and noise degradation, when compared to standard features such as mel-cepstrum coefficients. However, auditory feature extraction is computationally expensive, which makes these features unusable in real-time speech recognition systems. In this thesis, I investigate the use of low power techniques and custom analog VLSI for auditory feature extraction. I first investigated the basilar-membrane model and the hair-cell model chips that were designed by Liu (Liu, 1992). I performed speech recognition experiments to evaluate how well these chips would perform as a front-end to a speech recognizer. Based on the experience gained from these experiments, I propose an alternate architecture that goes beyond the basilar-membrane model and with which auditory features can be computed in real time. These chips have been designed and tested, and consume only a few milliwatts of power, compared to general purpose digital machines that consume several Watts. I have also investigated Linear Discriminant Analysis (LDA) for dimension reduction of auditory features. Researchers have used Fisher-Rao linear discriminant analysis (LDA) to reduce the feature dimension. They model the low-dimensional features obtained from LDA as the outputs of a Markov process with hidden states (HMM). I present a unified framework for the problem of dimension reduction and HMM parameter estimation by modeling the original features with a reduced-rank HMM. This re-formulation also leads to a generalization of LDA that is consistent with the heteroscedastic state models used in HMMs, and gives better performance when tested on a digit recognition task.

Journal ArticleDOI
TL;DR: A new model-based vision (MBV) algorithm is developed to find regions of interest corresponding to masses in digitized mammograms and to classify the masses as malignant/benign, demonstrating that the MBV approach provides a structured order of integrating complex stages into a system for radiologists.
Abstract: A new model-based vision (MBV) algorithm is developed to find regions of interest (ROI's) corresponding to masses in digitized mammograms and to classify the masses as malignant/benign. The MBV algorithm is comprised of 5 modules to structurally identify suspicious ROI's, eliminate false positives, and classify the remaining as malignant or benign. The focus of attention module uses a difference of Gaussians (DoG) filter to highlight suspicious regions in the mammogram. The index module uses tests to reduce the number of nonmalignant regions from 8.39 to 2.36 per full breast image. Size, shape, contrast, and Laws texture features are used to develop the prediction module's mass models. Derivative-based feature saliency techniques are used to determine the best features for classification. Nine features are chosen to define the malignant/benign models. The feature extraction module obtains these features from all suspicious ROI's. The matching module classifies the regions using a multilayer perceptron neural network architecture to obtain an overall classification accuracy of 100% for the segmented malignant masses with a false-positive rate of 1.8 per full breast image. This system has a sensitivity of 92% for locating malignant ROI's. The database contains 272 images (12 b, 100 μm) with 36 malignant and 53 benign mass images. The results demonstrate that the MBV approach provides a structured order of integrating complex stages into a system for radiologists.
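
A minimal sketch of the focus-of-attention step: a difference of Gaussians band-pass filter that highlights blob-like regions at roughly the scale of a mass. The sigma values are illustrative placeholders, not the paper's tuned parameters; requires SciPy.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_highlight(image, sigma_small=6.0, sigma_large=10.0):
    """Band-pass the mammogram so mass-sized bright blobs stand out."""
    dog = (gaussian_filter(image.astype(float), sigma_small)
           - gaussian_filter(image.astype(float), sigma_large))
    return np.clip(dog, 0, None)            # keep bright (mass-like) responses
```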

Journal ArticleDOI
TL;DR: The role of signature shape description and shape similarity measure is discussed in the context of signature recognition and verification and the proposed method allows definite training control and at the same time significantly reduces the number of enrollment samples required to achieve a good performance.

Proceedings ArticleDOI
03 Nov 1997
TL;DR: This paper proposes an entropy measure for ranking features, and conducts extensive experiments to show that the method is able to find the important features and compares well with a similar feature-ranking method (Relief) that, unlike this method, requires class information.
Abstract: Dimensionality reduction is an important problem for efficient handling of large databases. Many feature selection methods exist for supervised data having class information. Little work has been done for dimensionality reduction of unsupervised data in which class information is not available. Principal component analysis (PCA) is often used. However, PCA creates new features. It is difficult to obtain intuitive understanding of the data using the new features only. We are concerned with the problem of determining and choosing the important original features for unsupervised data. Our method is based on the observation that removing an irrelevant feature from the feature set may not change the underlying concept of the data, but not so otherwise. We propose an entropy measure for ranking features, and conduct extensive experiments to show that our method is able to find the important features. It also compares well with a similar feature-ranking method (Relief) that, unlike our method, requires class information.
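
A sketch of the kind of entropy measure such unsupervised ranking uses: pairwise similarities are derived from distances, the entropy over all pairs is evaluated with each feature removed in turn, and features whose removal disturbs the cluster structure most (raising the entropy) rank highest. The exponential similarity and the fixed alpha follow one common simplified formulation, not necessarily the paper's exact definitions; requires SciPy.

```python
import numpy as np
from scipy.spatial.distance import pdist

def entropy_of(X, alpha=0.5):
    """Entropy of pairwise similarities; low when data form crisp clusters."""
    S = np.exp(-alpha * pdist(X))            # similarities in (0, 1]
    S = np.clip(S, 1e-12, 1 - 1e-12)
    return float(-(S * np.log(S) + (1 - S) * np.log(1 - S)).sum())

def rank_features(X):
    # Removing an important feature destroys cluster structure and raises
    # the entropy, so rank features by the entropy measured without them.
    effects = [entropy_of(np.delete(X, f, axis=1)) for f in range(X.shape[1])]
    return np.argsort(effects)[::-1]         # most important first
```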

Proceedings ArticleDOI
22 Jul 1997
TL;DR: The advanced mine detection and classification (AMDAC) algorithm consists of an improved detection density algorithm, a classification feature extractor that uses a stepwise feature selection strategy, a k-nearest neighbor attractor-based neural network (KNN) classifier, and an optimal discriminatory filter classifier.
Abstract: An advanced capability for automated detection and classification of sea mines in sonar imagery has been developed. The advanced mine detection and classification (AMDAC) algorithm consists of an improved detection density algorithm, a classification feature extractor that uses a stepwise feature selection strategy, a k-nearest neighbor attractor-based neural network (KNN) classifier, and an optimal discriminatory filter classifier. The detection stage uses a nonlinear matched filter to identify mine-size regions in the sonar image that closely match a mine's signature. For each detected mine-like region, the feature extractor calculates a large set of candidate classification features. A stepwise feature selection process then determines the subset features that optimizes probability of detection and probability of classification for each of the classifiers while minimizing false alarms.