
Showing papers in "Pattern Recognition in 1997"


Journal ArticleDOI
TL;DR: AUC exhibits a number of desirable properties when compared to overall accuracy: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; independence from the decision threshold; and invariance to a priori class probabilities.
Abstract: In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multi-layer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six "real world" medical diagnostics data sets. We compare and discuss the use of AUC against the more conventional overall accuracy and find that AUC exhibits a number of desirable properties: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; independence from the decision threshold; and invariance to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for "single number" evaluation of machine learning algorithms.

5,359 citations
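
As a concrete illustration of the threshold independence the authors highlight, AUC can be computed directly from classifier scores via the Mann-Whitney U statistic, with no decision threshold involved. The sketch below is illustrative only; the function name and toy data are ours, not the paper's:

```python
import numpy as np

def auc_mann_whitney(scores, labels):
    """AUC as the Mann-Whitney statistic: the probability that a randomly
    chosen positive example is scored above a randomly chosen negative one."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos, n_neg = labels.sum(), (~labels).sum()
    order = scores.argsort()
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for v in np.unique(scores):          # average ranks over tied scores
        tie = scores == v
        ranks[tie] = ranks[tie].mean()
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

# Perfectly separated scores give AUC = 1.0; chance performance gives 0.5.
print(auc_mann_whitney([0.1, 0.4, 0.8, 0.9], [0, 0, 1, 1]))
```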


Journal ArticleDOI
TL;DR: These processes and a set of tools to facilitate content-based video retrieval and browsing using the feature data set are presented in detail as functions of an integrated system.
Abstract: This paper presents an integrated system solution for computer assisted video parsing and content-based video retrieval and browsing. The effectiveness of this solution lies in its use of video content information derived from a parsing process, being driven by visual feature analysis. That is, parsing will temporally segment and abstract a video source, based on low-level image analyses; then retrieval and browsing of video will be based on key-frame, temporal and motion features of shots. These processes and a set of tools to facilitate content-based video retrieval and browsing using the feature data set are presented in detail as functions of an integrated system.

535 citations


Journal ArticleDOI
TL;DR: This paper presents a general technique for thresholding of digital images based on Renyi's entropy, which includes two of the previously proposed well known global thresholding methods.
Abstract: Image segmentation is an important and fundamental task in many digital image processing systems. Image segmentation by thresholding is the simplest technique and involves the basic assumption that objects and background in the digital image have distinct gray-level distributions. In this paper, we present a general technique for thresholding of digital images based on Renyi's entropy. Our method includes two of the previously proposed well known global thresholding methods. The effectiveness of the proposed method is demonstrated using examples from real-world and synthetic images.

414 citations
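
A minimal sketch of the idea, assuming a gray-level histogram: for each candidate threshold, compute the Renyi entropy of order alpha for the background and object distributions and keep the threshold maximizing their sum. As alpha tends to 1 this reduces to the Shannon-entropy (Kapur-style) criterion, which we presume is one of the two well-known methods the paper subsumes; the parameter values below are our choices:

```python
import numpy as np

def renyi_threshold(hist, alpha=0.5):
    """Maximize the sum of Renyi entropies (order alpha != 1) of the
    background [0..t] and object (t..G] gray-level distributions."""
    p = hist.astype(float) / hist.sum()
    cum = np.cumsum(p)
    best_t, best_h = 0, -np.inf
    for t in range(len(p) - 1):
        p1, p2 = cum[t], 1.0 - cum[t]
        if p1 <= 0.0 or p2 <= 0.0:
            continue
        q1, q2 = p[: t + 1] / p1, p[t + 1 :] / p2
        h1 = np.log((q1[q1 > 0] ** alpha).sum()) / (1.0 - alpha)
        h2 = np.log((q2[q2 > 0] ** alpha).sum()) / (1.0 - alpha)
        if h1 + h2 > best_h:
            best_h, best_t = h1 + h2, t
    return best_t  # pixels <= best_t are background, the rest are object

# Example on a synthetic bimodal histogram over 256 gray levels:
hist = np.histogram(np.r_[np.random.normal(60, 10, 5000),
                          np.random.normal(180, 20, 5000)],
                    bins=256, range=(0, 256))[0]
print(renyi_threshold(hist))
```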


Journal ArticleDOI
TL;DR: A color segmentation algorithm which combines region growing and region merging processes, merging spatially disconnected but colorimetrically similar regions to generate a non-partitioned segmentation of the image being processed.
Abstract: In this paper we present a color segmentation algorithm which combines region growing and region merging processes. This algorithm starts with the region growing process, which is based on criteria that take into account color similarity and spatial proximity. The resulting regions are then merged on the basis of a criterion that takes into account only color similarity, in order to merge spatially disconnected but colorimetrically similar regions into a non-partitioned segmentation of the image being processed. Non-representative regions are then merged into neighbouring ones to produce a segmentation that corresponds to visual judgement.

411 citations


Journal ArticleDOI
TL;DR: A new clustering algorithm called Competitive Agglomeration (CA), which minimizes an objective function that incorporates the advantages of both hierarchical and partitional clustering, is presented.
Abstract: We present a new clustering algorithm called Competitive Agglomeration (CA), which minimizes an objective function that incorporates the advantages of both hierarchical and partitional clustering. The CA algorithm produces a sequence of partitions with a decreasing number of clusters. The initial partition has an over-specified number of clusters, and the final one has the "optimal" number of clusters. The update equation in the CA algorithm creates an environment in which clusters compete for feature points and only clusters with large cardinalities survive. The algorithm can incorporate different distance measures in the objective function to find an unknown number of clusters of various shapes.

360 citations
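
The competitive update can be sketched as follows: memberships are the usual fuzzy c-means values plus a bias term proportional to how far a cluster's cardinality exceeds the local average, so large clusters win points and starved clusters are discarded. This is a simplified reconstruction, not the paper's exact algorithm; in particular the decay schedule for the competition weight alpha and the discard threshold are our simplifications:

```python
import numpy as np

def competitive_agglomeration(X, c_init=10, iters=50, card_min=2.0, seed=0):
    """Simplified CA sketch: FCM-style memberships plus a competition bias
    favouring clusters with large cardinality; starved clusters are dropped,
    so the cluster count shrinks from an over-specified initial value."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), c_init, replace=False)].copy()
    u = None
    for t in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-9
        inv = 1.0 / d2
        u_fcm = inv / inv.sum(1, keepdims=True)            # plain FCM memberships
        card = u_fcm.sum(0)                                # cluster cardinalities
        # Competition weight with a simple geometric decay schedule (ours).
        alpha = 0.9 ** t * (u_fcm ** 2 * d2).sum() / (card ** 2).sum()
        card_bar = (inv * card).sum(1, keepdims=True) / inv.sum(1, keepdims=True)
        u = np.clip(u_fcm + alpha * inv * (card - card_bar), 1e-12, 1.0)
        u /= u.sum(1, keepdims=True)
        keep = u.sum(0) > card_min                         # drop starved clusters
        u, centers = u[:, keep], centers[keep]
        u /= u.sum(1, keepdims=True)
        centers = ((u ** 2).T @ X) / (u ** 2).sum(0)[:, None]
    return centers, u
```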


Journal ArticleDOI
TL;DR: A feature-based segmentation approach to the object detection problem is pursued, where the features are computed over multiple spatial orientations and frequencies, which helps in the detection of objects located in complex backgrounds.
Abstract: This paper pertains to the detection of objects located in complex backgrounds. A feature-based segmentation approach to the object detection problem is pursued, where the features are computed over multiple spatial orientations and frequencies. The method proceeds as follows: a given image is passed through a bank of even-symmetric Gabor filters. A selection of these filtered images is made and each (selected) filtered image is subjected to a nonlinear (sigmoidal-like) transformation. Then, a measure of texture energy is computed in a window around each transformed image pixel. The texture energy (“Gabor features”) and the corresponding spatial locations are input to a squared-error clustering algorithm. This clustering algorithm yields a segmentation of the original image: it assigns to each pixel in the image a cluster label that identifies the amount of mean local energy the pixel possesses across different spatial orientations and frequencies. The method is applied to a number of visual and infrared images, each of which contains one or more objects. The region corresponding to the object is usually segmented correctly, and a unique signature of “Gabor features” is typically associated with the segment containing the object(s) of interest. Experimental results are provided to illustrate the usefulness of this object detection method in a number of problem domains. These problems arise in IVHS, military reconnaissance, fingerprint analysis, and image database query.

306 citations
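
The pipeline in the abstract (even-symmetric Gabor bank, sigmoid-like nonlinearity, windowed energy, squared-error clustering of features plus pixel coordinates) can be sketched roughly as below. The filter frequencies, window size and scaling constants are illustrative guesses, not the paper's tuned values:

```python
import numpy as np
from scipy.ndimage import uniform_filter
from scipy.signal import fftconvolve
from sklearn.cluster import KMeans

def gabor_kernel(freq, theta, sigma=4.0, size=21):
    """Even-symmetric (cosine) Gabor kernel with zero mean."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)
    return g - g.mean()

def gabor_segment(img, n_clusters=2, win=17):
    """Filter bank -> tanh nonlinearity -> windowed energy -> k-means,
    with scaled pixel coordinates appended for spatial coherence."""
    feats = []
    for freq in (0.05, 0.1, 0.2):                  # cycles/pixel (guesses)
        for theta in np.arange(4) * np.pi / 4:     # 0, 45, 90, 135 degrees
            r = fftconvolve(img, gabor_kernel(freq, theta), mode="same")
            feats.append(uniform_filter(np.tanh(0.25 * r) ** 2, size=win))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    F = np.stack(feats + [0.01 * yy, 0.01 * xx], -1).reshape(h * w, -1)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(F)
    return labels.reshape(h, w)
```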


Journal ArticleDOI
TL;DR: This paper describes an approach for integrating a large number of context-dependent features into a semi-automated tool that provides a learning algorithm for selecting and combining groupings of the data, where groupings can be induced by highly specialized features.
Abstract: Digital library access is driven by features, but the relevance of a feature for a query is not always obvious. This paper describes an approach for integrating a large number of context-dependent features into a semi-automated tool. Instead of requiring universal similarity measures or manual selection of relevant features, the approach provides a learning algorithm for selecting and combining groupings of the data, where groupings can be induced by highly specialized features. The selection process is guided by positive and negative examples from the user. The inherent combinatorics of using multiple features is reduced by a multistage grouping generation, weighting, and collection process. The stages closest to the user are trained fastest and slowly propagate their adaptations back to earlier stages, improving overall performance.

271 citations


Journal ArticleDOI
TL;DR: A novel data clustering algorithm is described, which is a hybrid approach combining a genetic algorithm with the classical c-means clustering algorithm; it is shown that substantial improvement of image quality is obtained by using the genetic approach.
Abstract: This paper describes a novel data clustering algorithm, which is a hybrid approach combining a genetic algorithm with the classical c-means clustering algorithm (CMA). The proposed technique is superior to CMA in the sense that it converges to a nearby global optimum rather than a local one. As an application, the problem of color image quantization is elaborated. Here, it is shown that substantial improvement of image quality is obtained by using the genetic approach.

222 citations
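
One way such a hybrid can work, sketched under our own assumptions about the genetic operators (the paper's exact encoding, crossover and mutation may differ): each individual is a set of k cluster centers, each generation applies one c-means refinement step before selection, and fitness is the quantization error:

```python
import numpy as np

def quant_error(X, centers):
    d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    return d2.min(1).mean()

def cmeans_step(X, centers):
    """One c-means iteration: assign points, recompute centers."""
    lab = ((X[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
    for j in range(len(centers)):
        if (lab == j).any():
            centers[j] = X[lab == j].mean(0)
    return centers

def ga_cmeans(X, k=8, pop=20, gens=30, seed=0):
    """Toy hybrid: evolve sets of k centers; each individual is polished by a
    c-means step, which is the hybridization the paper exploits."""
    rng = np.random.default_rng(seed)
    P = [X[rng.choice(len(X), k, replace=False)].copy() for _ in range(pop)]
    for g in range(gens):
        P = [cmeans_step(X, ind) for ind in P]                  # local refinement
        P.sort(key=lambda c: quant_error(X, c))
        elite, children = P[: pop // 2], []
        for _ in range(pop - pop // 2):
            a, b = rng.choice(len(elite), 2, replace=False)
            mask = rng.random((k, 1)) < 0.5                     # uniform crossover
            child = np.where(mask, elite[a], elite[b]).copy()
            child += rng.normal(0, 0.01 * X.std(), child.shape) # mutation
            children.append(child)
        P = elite + children
    return min(P, key=lambda c: quant_error(X, c))
```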


Journal ArticleDOI
TL;DR: Based on a 2-D clustering approach, a fast algorithm for point pattern matching is proposed that effectively finds optimal matches between two point patterns under geometrical transformation and correctly identifies missing or spurious points.
Abstract: Based on a 2-D clustering approach, a fast algorithm for point pattern matching is proposed that effectively solves the problem of finding optimal matches between two point patterns under geometrical transformation and correctly identifies the missing or spurious points of the patterns. Theorems and algorithms are developed to determine the matching-pair support of each point pair and its transformation parameters (scaling s and rotation ϑ) on a two-parameter space ( s,ϑ ). Experiments are conducted on both real and synthetic data. The experimental results show that the proposed matching algorithm can handle translation, rotation, and scaling differences under noisy or distorted conditions. The computational time is about 0.5 s for matching two 50-point patterns on a Sun-4 workstation.

210 citations
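
The idea of accumulating evidence on the two-parameter space (s, ϑ) can be illustrated as follows: every segment between two points in one pattern, paired with every segment in the other, implies a scale and a rotation, and consistent pairs pile up in one accumulator bin. This brute-force sketch is O(n²m²) and entirely ours; the paper's clustering-based algorithm is considerably faster:

```python
import numpy as np

def match_vote(A, B, s_bins=50, t_bins=72, s_max=4.0):
    """Vote in (scale, rotation) space: every point-pair segment in A paired
    with every segment in B casts one vote; the peak bin gives the dominant
    transform despite missing or spurious points."""
    acc = np.zeros((s_bins, t_bins))
    segs_A = [(i, j) for i in range(len(A)) for j in range(i + 1, len(A))]
    segs_B = [(k, l) for k in range(len(B)) for l in range(k + 1, len(B))]
    for i, j in segs_A:
        va = A[j] - A[i]
        la, aa = np.hypot(va[0], va[1]), np.arctan2(va[1], va[0])
        for k, l in segs_B:
            vb = B[l] - B[k]
            lb, ab = np.hypot(vb[0], vb[1]), np.arctan2(vb[1], vb[0])
            s = lb / (la + 1e-12)
            th = (ab - aa) % (2 * np.pi)
            si = min(int(s / s_max * s_bins), s_bins - 1)
            ti = int(th / (2 * np.pi) * t_bins) % t_bins
            acc[si, ti] += 1
    si, ti = np.unravel_index(acc.argmax(), acc.shape)
    return (si + 0.5) * s_max / s_bins, (ti + 0.5) * 2 * np.pi / t_bins
```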


Journal ArticleDOI
TL;DR: This paper examines different strategies that can be applied in this context to reach a decision (e.g. assignment to a class or rejection), provided the possible consequences of each action can be quantified.
Abstract: The Dempster-Shafer theory provides a convenient framework for decision making based on very limited or weak information. Such situations typically arise in pattern recognition problems when patterns have to be classified based on a small number of training vectors, or when the training set does not contain samples from all classes. This paper examines different strategies that can be applied in this context to reach a decision (e.g. assignment to a class or rejection), provided the possible consequences of each action can be quantified. The corresponding decision rules are analysed under different assumptions concerning the completeness of the training set. These approaches are then demonstrated using real data.

206 citations
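
For readers unfamiliar with the machinery, the sketch below shows Dempster's rule of combination together with one possible decision strategy (pignistic probability with a rejection threshold). This is only one illustration of the kind of strategy the paper analyses, and the class names and masses are toy values:

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule: intersect focal elements, renormalize by 1 - conflict
    (assumes the two sources are not totally conflicting)."""
    out, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            out[inter] = out.get(inter, 0.0) + x * y
        else:
            conflict += x * y
    return {a: v / (1.0 - conflict) for a, v in out.items()}

def pignistic_decide(m, reject_below=0.5):
    """Spread each mass evenly over its focal set, pick the best singleton,
    and reject (return None) if its probability is too low."""
    bet = {}
    for a, v in m.items():
        for w in a:
            bet[w] = bet.get(w, 0.0) + v / len(a)
    best = max(bet, key=bet.get)
    return best if bet[best] >= reject_below else None

# Two weak sources over the frame {'c1', 'c2'}; mass on the frame = ignorance.
theta = frozenset({'c1', 'c2'})
m1 = {frozenset({'c1'}): 0.6, theta: 0.4}
m2 = {frozenset({'c1'}): 0.3, frozenset({'c2'}): 0.2, theta: 0.5}
print(pignistic_decide(combine(m1, m2)))   # -> 'c1'
```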


Journal ArticleDOI
TL;DR: This paper looks at shot detection and characterization using compressed video data directly and proposes a scheme consisting of comparing intensity, row, and column histograms of successive I frames of MPEG video using the chi-square test.
Abstract: The organization of video information for video databases requires segmentation of a video into its constituent shots and their subsequent characterization in terms of content and camera work. In this paper, we look at these two steps using compressed video data directly. For shot detection, we suggest a scheme consisting of comparing intensity, row, and column histograms of successive I frames of MPEG video using the chi-square test. For characterization of segmented shots, we address the problem of classifying shot motion into different categories using a set of features derived from motion vectors of P and B frames of MPEG video. The central component of the proposed shot motion characterization scheme is a decision tree classifier built through a process of supervised learning. Experimental results using a variety of videos are presented to demonstrate the effectiveness of performing shot detection and characterization directly on compressed video.
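
The shot-detection test can be sketched as below on decoded grayscale frames. The paper applies the chi-square test to intensity, row, and column histograms of successive MPEG I frames; this simplified version uses only the intensity histogram, and the threshold value is an arbitrary placeholder:

```python
import numpy as np

def chi2(h1, h2):
    """Chi-square distance between two normalized histograms."""
    s = h1 + h2
    nz = s > 0
    return (((h1 - h2) ** 2)[nz] / s[nz]).sum()

def detect_cuts(frames, thresh=0.25, bins=64):
    """Flag a shot boundary wherever the chi-square distance between the
    intensity histograms of consecutive frames exceeds a threshold.
    `frames` is an iterable of 2-D grayscale arrays (e.g., decoded I frames)."""
    cuts, prev = [], None
    for i, f in enumerate(frames):
        h, _ = np.histogram(f, bins=bins, range=(0, 256))
        h = h / h.sum()
        if prev is not None and chi2(prev, h) > thresh:
            cuts.append(i)
        prev = h
    return cuts
```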

Journal ArticleDOI
TL;DR: An approach for selecting shape points and the outer layer used for erosion during each iteration of parallel thinning is introduced; it produces good skeletons for different types of corners.
Abstract: This paper is concerned with a new parallel thinning algorithm for three-dimensional digital images that preserves their topology and maintains their shape. We introduce an approach for selecting the shape points and the outer layer used for erosion during each iteration. The approach produces good skeletons for different types of corners. The concept of using two image versions in thinning is introduced and its necessity in parallel thinning is justified. The robustness of the algorithm under pseudo-random noise, as well as under rotation with respect to shape properties, is studied and the results are found to be satisfactory.

Journal ArticleDOI
TL;DR: The proposed approach utilizes a set of high-frequency channel energies to characterize texture features, followed by a multi-thresholding technique for coarse segmentation, and the number of texture classes is determined by an inter-scale fusion in which the segmentation results at multiple scales are integrated.
Abstract: In this paper, a mechanism for unsupervised texture segmentation is presented. The approach is based on the multiscale representation of the discrete (dyadic) wavelet transform, which can be implemented by a fast iterative algorithm. For unsupervised segmentation it is generally difficult to determine the number of classes to be identified; the proposed approach offers a way to circumvent this problem. Our method utilizes a set of high-frequency channel energies to characterize texture features, followed by a multi-thresholding technique for coarse segmentation. The coarsely segmented results at the same scale are incorporated by an intra-scale fusion procedure. A fine segmentation technique is then used to reclassify the ambiguously labeled pixels generated from the intra-scale fusion step. Finally, the number of texture classes is determined by an inter-scale fusion in which the segmentation results at multiple scales are integrated. The performance of this method is demonstrated by several experiments on synthetic images, natural textures from Brodatz's album and real-world textured images. Since the choice of wavelets is very extensive and open, we further explore various types of wavelets for texture segmentation. The time cost of the proposed method is also measured.

Journal ArticleDOI
TL;DR: The role of signature shape description and shape similarity measure is discussed in the context of signature recognition and verification and the proposed method allows definite training control and at the same time significantly reduces the number of enrollment samples required to achieve a good performance.
Abstract: In this paper a method for off-line signature verification based on geometric feature extraction and neural network classification is proposed. The role of signature shape description and shape similarity measures is discussed in the context of signature recognition and verification. Geometric features of the input signature image are simultaneously examined under several scales by a neural network classifier. An overall match rating is generated by combining the outputs at each scale. Artificially generated genuine and forgery samples from enrollment reference signatures are used to train the network, which allows definite training control and at the same time significantly reduces the number of enrollment samples required to achieve good performance. Experiments show that a 90% correct classification rate can be achieved on a database of over 3000 signature images.

Journal ArticleDOI
TL;DR: Two segmentation algorithms are presented and edge detection and region growing approaches are combined to find large and crisp segments for coarse segmentation towards other applications like object recognition and image understanding.
Abstract: Segmentation is one of the most important preprocessing steps towards pattern recognition and image understanding, and a significant step towards image compression and coding. By detecting edges, most of the large segments can be found and separated from one another by edge pixels. It is, however, the pixels at edge locations or those in highly detailed areas whose association with adjacent segments must be determined. A pixel can be a part of the closest segment or, in association with the neighboring pixels, form a new smaller segment. In this paper, two segmentation algorithms are presented. One is used for fine segmentation towards compression and coding of images, and the other for coarse segmentation towards other applications like object recognition and image understanding. Edge detection and region growing approaches are combined to find large and crisp segments for coarse segmentation. Segments can grow or expand based on two fuzzy criteria. The fuzzy region growing and expanding approaches presented here use histogram tables for fine segmentation. The procedures introduced here can be used in any order or combination to yield the best result for any particular application or image type.

Journal ArticleDOI
TL;DR: To assist human analysis of video data, a technique has been developed to perform automatic, content-based video indexing from object motion to analyse the semantic content of the video.
Abstract: To assist human analysis of video data, a technique has been developed to perform automatic, content-based video indexing from object motion. Moving objects are detected in the video sequence using motion segmentation methods. By tracking individual objects through the segmented data, a symbolic representation of the video is generated in the form of a directed graph describing the objects and their movement. This graph is then annotated using a rule-based classification scheme to identify events of interest, e.g., appearance/disappearance, deposit/removal, entrance/exit, and motion/rest of objects. One may then use an index into the motion graph instead of the raw data to analyse the semantic content of the video. Application of this technique to surveillance video analysis is discussed.

Journal ArticleDOI
Demin Wang
TL;DR: Experimental results indicate that watershed transformation with the algorithms proposed in this paper produces meaningful segmentations, even without a region merging step, which can efficiently improve segmentation accuracy and significantly reduce the computational cost of watershed-based image segmentation methods.
Abstract: Watershed transformation is a powerful tool for image segmentation. However, the effectiveness of the image segmentation methods based on watershed transformation is limited by the quality of the gradient image used in the methods. In this paper we present a multiscale algorithm for computing gradient images, with effective handling of both step and blurred edges. We also present an algorithm for eliminating irrelevant minima in the resulting gradient images. Experimental results indicate that watershed transformation with the algorithms proposed in this paper produces meaningful segmentations, even without a region merging step. The proposed algorithms can efficiently improve segmentation accuracy and significantly reduce the computational cost of watershed-based image segmentation methods.
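
A rough sketch of the multiscale gradient idea, under our assumptions about the exact operators: at each scale the morphological gradient (dilation minus erosion with a growing structuring element) responds to blurred edges, and eroding that response with the next smaller element keeps it thin before the scales are averaged. The paper's second algorithm, eliminating irrelevant minima, could then be approximated with, e.g., an h-minima transform before applying the watershed:

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def multiscale_gradient(img, n=3):
    """Average of morphological gradients computed at n scales; each scale's
    gradient is eroded with the next smaller structuring element so blurred
    edges still give thin, well-localized responses."""
    img = np.asarray(img, dtype=float)
    acc = np.zeros_like(img)
    for i in range(1, n + 1):
        g = grey_dilation(img, size=2 * i + 1) - grey_erosion(img, size=2 * i + 1)
        acc += grey_erosion(g, size=2 * (i - 1) + 1)   # size 1 acts as identity
    return acc / n
```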

Journal ArticleDOI
TL;DR: Experimental results show that combination of the classifiers increases reliability of the recognition results and is the unique feature of this work.
Abstract: This paper is concerned with signature verification. Three different types of global features have been used for the classification of signatures. Feed-forward neural net based classifiers have been used. The features used for the classification are projection moments and upper and lower envelope based characteristics. Output of the three classifiers is combined using a connectionist scheme. Combination of these feature based classifiers for signature verification is the unique feature of this work. Experimental results show that combination of the classifiers increases reliability of the recognition results.

Journal ArticleDOI
TL;DR: Dunn's index and the Davies-Bouldin index are generalized for cluster validation using graph structures such as the GG, RNG and MST, and the superiority of the generalized indices over some existing cluster validity indices is established.
Abstract: In this article we have generalized Dunn's index and the Davies-Bouldin index for cluster validation using graph structures, such as GG, RNG and MST. Unlike Dunn's index and the Davies-Bouldin index, the proposed indices are not sensitive to noisy points and are applicable to hyperspherical and structural clusters as well. The relationships between various indices have also been established. The effectiveness of the generalized indices and superiority over some existing cluster validity indices are established using eight data sets.
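
For reference, the two classical indices being generalized are computed as below; the paper's contribution is to replace the point-pair distances and diameters with graph-based quantities defined on the GG, RNG and MST, which this sketch does not attempt:

```python
import numpy as np

def dunn_index(X, labels):
    """Classic Dunn: minimum inter-cluster distance divided by the maximum
    intra-cluster diameter; larger is better."""
    ks = np.unique(labels)
    d_min = min(np.linalg.norm(a - b)
                for i in ks for j in ks if i < j
                for a in X[labels == i] for b in X[labels == j])
    diam = max(np.linalg.norm(a - b)
               for k in ks for a in X[labels == k] for b in X[labels == k])
    return d_min / diam

def davies_bouldin(X, labels):
    """Classic Davies-Bouldin: average over clusters of the worst
    (scatter_i + scatter_j) / centroid-distance ratio; lower is better."""
    ks = np.unique(labels)
    cents = np.array([X[labels == k].mean(0) for k in ks])
    s = np.array([np.linalg.norm(X[labels == k] - cents[i], axis=1).mean()
                  for i, k in enumerate(ks)])
    r = 0.0
    for i in range(len(ks)):
        r += max((s[i] + s[j]) / np.linalg.norm(cents[i] - cents[j])
                 for j in range(len(ks)) if j != i)
    return r / len(ks)
```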

Journal ArticleDOI
TL;DR: This paper casts the optimisation process into a Bayesian framework by exploiting the recently reported global consistency measure of Wilson and Hancock as a fitness measure, and demonstrates empirically that the method possesses polynomial convergence time and that the convergence rate is more rapid than simulated annealing.
Abstract: This paper describes a framework for performing relational graph matching using genetic search. There are three novel ingredients to the work. Firstly, we cast the optimisation process into a Bayesian framework by exploiting the recently reported global consistency measure of Wilson and Hancock as a fitness measure. The second novel idea is to realise the crossover process at the level of subgraphs, rather than employing string-based or random crossover. Finally, we accelerate convergence by employing a deterministic hill-climbing process prior to selection. Since we adopt the Bayesian consistency measure as a fitness function, the basic measure of relational distance underpinning the technique is Hamming distance. Our standpoint is that genetic search provides a more attractive means of performing stochastic discrete optimisation on the global consistency measure than alternatives such as simulated annealing. Moreover, the action of the optimisation process is easily understood in terms of its action in the Hamming distance domain. We demonstrate empirically not only that the method possesses polynomial convergence time but also that the convergence rate is more rapid than simulated annealing. We provide some experimental evaluation of the method in the matching of aerial stereograms and evaluate its sensitivity on synthetically generated graphs.

Journal ArticleDOI
TL;DR: Several algorithms for preprocessing, feature extraction, pre-classification, and main classification, including a modified Bayes classifier and the subspace method for robust main classification, are experimentally compared to improve the recognition accuracy of handwritten Japanese character recognition.
Abstract: Several algorithms for preprocessing, feature extraction, pre-classification, and main classification are experimentally compared to improve the recognition accuracy of handwritten Japanese character recognition. The compared algorithms are three types of nonlinear normalization for the preprocessing, the discriminant analysis and the principal component analysis for the feature extraction, the minimum distance classifiers and the linear classifier for the high-speed pre-classification, and modified Bayes classifier and subspace method for the robust main classification. The performance of the recognition algorithm is fully tested using the ETL9B character database. The recognition accuracy of 99.15% at the recognition speed of eight characters per second is achieved. This accuracy is the best one ever reported for the database.

Journal ArticleDOI
TL;DR: It is shown that, whilst both methods are capable of determining cluster validity for data sets in which clusters tend towards a multivariate Gaussian distribution, the parametric method inevitably fails for clusters which have a non-Gaussian structure whilst the scale-space method is more robust.
Abstract: Much work has been published on methods for assessing the probable number of clusters or structures within unknown data sets. This paper aims to look in more detail at two methods, a broad parametric method, based around the assumption of Gaussian clusters and the other a non-parametric method which utilises methods of scale-space filtering to extract robust structures within a data set. It is shown that, whilst both methods are capable of determining cluster validity for data sets in which clusters tend towards a multivariate Gaussian distribution, the parametric method inevitably fails for clusters which have a non-Gaussian structure whilst the scale-space method is more robust.

Journal ArticleDOI
TL;DR: This work presents two new algorithms for the detection of circles and ellipses which use the FHT algorithm as a basis: the Fast Circle Hough Transform (FCHT) and the Fast Ellipse Hough Transform (FEHT).
Abstract: In this work we present two new algorithms for the detection of circles and ellipses which use the FHT algorithm as a basis: the Fast Circle Hough Transform (FCHT) and the Fast Ellipse Hough Transform (FEHT). The first stage of these two algorithms, devoted to obtaining the centers of the figures, is computationally the most costly. To improve the execution times of this stage, it has been implemented using a new focusing algorithm instead of the typical polling process in a parameter space. This focusing strategy reduces the execution times, especially when multiple figures appear in the image or when they are of very different sizes. We also label the image points to discriminate which of them belong to each figure, saving computation in subsequent stages.
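
For context, the conventional center-polling stage that the focusing algorithm replaces looks roughly like this: each edge pixel votes for candidate centers along its gradient direction, and accumulator peaks mark circle centers. This sketch shows the baseline the paper improves on, not the FCHT itself; the array names and radius range are illustrative:

```python
import numpy as np

def circle_center_votes(edges, grad_y, grad_x, r_max, acc_shape):
    """Center-finding by polling: each edge pixel votes along its gradient
    direction for all plausible center positions on either side of the edge."""
    acc = np.zeros(acc_shape)
    ys, xs = np.nonzero(edges)
    for y, x in zip(ys, xs):
        g = np.hypot(grad_y[y, x], grad_x[y, x])
        if g == 0:
            continue
        dy, dx = grad_y[y, x] / g, grad_x[y, x] / g
        for r in range(3, r_max):
            for sign in (1, -1):                   # center may lie either way
                cy, cx = int(y + sign * r * dy), int(x + sign * r * dx)
                if 0 <= cy < acc_shape[0] and 0 <= cx < acc_shape[1]:
                    acc[cy, cx] += 1
    return acc  # local maxima of acc are candidate circle centers
```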

Journal ArticleDOI
TL;DR: Thresholding, i.e. the problem of pixel classification, is attempted here using fuzzy clustering algorithms; the segmented regions are fuzzy subsets, with soft partitions characterizing the region boundaries.
Abstract: Thresholding, i.e. the problem of pixel classification, is attempted here using fuzzy clustering algorithms. The segmented regions are fuzzy subsets, with soft partitions characterizing the region boundaries. The validity of the assumptions and thresholding schemes is investigated in the presence of distinct region proportions. The hard k-means and fuzzy c-means algorithms have been found useful when object and background regions are well balanced. Fuzzy thresholding is also formulated as the extraction of normal densities to provide optimal partitions. Regional imbalances in gray distributions are handled using region-normalized histograms.
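
A minimal sketch of fuzzy-clustering-based thresholding in the spirit of this paper: run fuzzy c-means with two prototypes on the gray-level histogram and place the threshold where the two soft memberships cross. The fuzzifier m and iteration count are our choices:

```python
import numpy as np

def fcm_threshold(hist, m=2.0, iters=100):
    """Fuzzy c-means with two prototypes on the gray-level histogram; the
    threshold is the gray level where the two memberships cross 0.5."""
    g = np.arange(len(hist), dtype=float)           # gray levels
    w = hist.astype(float)                          # histogram counts as weights
    v = np.array([g[w > 0].min(), g[w > 0].max()])  # initial prototypes
    p = 2.0 / (m - 1.0)
    u = None
    for _ in range(iters):
        d = np.abs(g[:, None] - v[None, :]) + 1e-9
        u = d ** -p / (d ** -p).sum(1, keepdims=True)      # soft memberships
        num = (w[:, None] * u ** m * g[:, None]).sum(0)
        v = num / (w[:, None] * u ** m).sum(0)             # updated prototypes
    return int(np.argmin(np.abs(u[:, 0] - 0.5)))   # membership crossover level
```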

Journal ArticleDOI
TL;DR: This paper addresses automatic interpretation of images of outdoor scenes: using a large database of ground-truth labelled images, a neural network is trained as a pattern classifier, thereby enabling image databases to be queried on scene content.
Abstract: This paper addresses automatic interpretation of images of outdoor scenes. The method allows instances of objects from a number of generic classes to be identified: vegetation, buildings, vehicles, roads, etc., thereby enabling image databases to be queried on scene content. The feature set is based, in part, on psychophysical principles and includes measures of colour, texture and shape. Using a large database of ground-truth labelled images, a neural network is trained as a pattern classifier. The method is demonstrated on a large test set to provide highly accurate image interpretations, with over 90% of the image area labelled correctly.

Journal ArticleDOI
TL;DR: Empirical investigations show that the proposed MLP-based scheme is superior to the other schemes implemented.
Abstract: In this paper a new scheme of feature ranking, and hence feature selection, using a Multilayer Perceptron (MLP) network has been proposed. The novelty of the proposed MLP-based scheme and its difference from another MLP-based feature ranking scheme have been analyzed. In addition, we have modified an existing feature ranking/selection scheme based on fuzzy entropy. Empirical investigations show that the proposed MLP-based scheme is superior to the other schemes implemented.
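
The abstract does not spell out the ranking rule, so the sketch below shows one common MLP-based feature-ranking heuristic for flavour only, not the authors' exact scheme: score each input feature by the total absolute weight it feeds into the first hidden layer of a trained network.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def mlp_feature_ranking(X, y):
    """Train an MLP and rank input features by first-layer weight magnitude.
    This saliency heuristic is an assumption, not the paper's published rule."""
    net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                        random_state=0).fit(X, y)
    saliency = np.abs(net.coefs_[0]).sum(axis=1)   # one score per input feature
    return np.argsort(saliency)[::-1]              # best-ranked feature first
```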

Journal ArticleDOI
TL;DR: A computationally efficient procedure for skew detection and text line position determination in digitized documents, based on the cross-correlation between the pixels of vertical lines in a document; it provides accurate results while requiring only a short computational time.
Abstract: This paper proposes a computationally efficient procedure for skew detection and text line position determination in digitized documents, which is based on the cross-correlation between the pixels of vertical lines in a document. The determination of the skew angle in documents is essential in optical character recognition systems. Due to the text skew, each horizontal text line intersects a predefined set of vertical lines at non-horizontal positions. Using only the pixels on these vertical lines we construct a correlation matrix and evaluate the skew angle of the document with high accuracy. In addition, using the same matrix, we compute the positions of text lines in the document. The proposed method is tested on a variety of mixed-type documents and it provides good and accurate results while it requires only a short computational time. We illustrate the effectiveness of the algorithm by presenting four characteristic examples.
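
The core of the method can be sketched as follows, with simplifications (the paper builds a full correlation matrix and also recovers text line positions, which this version omits): sample evenly spaced vertical lines, find the vertical shift that best aligns each adjacent pair of column profiles, and convert the median shift into a skew angle. The function name and parameters are ours:

```python
import numpy as np

def skew_angle(binary_doc, n_lines=20, max_shift=40):
    """Estimate document skew (in degrees) from vertical-line pixel profiles;
    `binary_doc` is a 2-D array with text pixels set to 1."""
    h, w = binary_doc.shape
    cols = np.linspace(0, w - 1, n_lines).astype(int)
    profiles = binary_doc[:, cols].astype(float)
    shifts = []
    for i in range(n_lines - 1):
        a, b = profiles[:, i], profiles[:, i + 1]
        best, best_s = -np.inf, 0
        for s in range(-max_shift, max_shift + 1):
            corr = (a * np.roll(b, s)).sum()       # correlation at shift s
            if corr > best:
                best, best_s = corr, s
        shifts.append(best_s)
    dx = cols[1] - cols[0]                          # spacing between lines
    return np.degrees(np.arctan2(np.median(shifts), dx))
```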

Journal ArticleDOI
TL;DR: An improved method for eye-feature extraction, description, and tracking using deformable templates, with region-based template deformation that avoids problems such as template shrinking, the need to adjust the weights of energy terms, and failure of orientation adjustment in exceptional cases.
Abstract: We propose an improved method for eye-feature extraction, description, and tracking using deformable templates. Some existing algorithms are exploited to locate the initial position of the eye features, and deformable templates are then used for extracting and describing them. Rather than using the original energy minimization for matching the templates, a region-based approach is proposed for template deformation. Based on the region properties, the new strategy avoids problems such as template shrinking, the need to adjust the weights of energy terms, and failure of orientation adjustment in exceptional cases. Our strategies are also coupled with the Canny edge operator to give a new back-end processing stage. By integrating the local edge information from the edge detection and the global collector from our region-based template deformation, this processing stage can generate accurate eye-feature descriptions. Finally, the template deformation process is applied to tracking eye features.

Journal ArticleDOI
TL;DR: The method can estimate qualitatively camera pan, tilt, zoom, roll, and horizontal and vertical tracking and can distinguish pan from horizontal tracking, and tilt from vertical tracking.
Abstract: We propose a simple technique for extracting camera motion parameters from a sequence of images. The method can estimate qualitatively camera pan, tilt, zoom, roll, and horizontal and vertical tracking. Unlike most other comparable techniques, the present method can distinguish pan from horizontal tracking, and tilt from vertical tracking. The technique can be applied to the automated indexing of video and film sequences.
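
A least-squares sketch of qualitative motion estimation from sparse optical flow, under a 4-parameter image motion model of our choosing. Note that this simple model cannot by itself distinguish pan from horizontal tracking (both appear as uniform translation); the paper's contribution is precisely the extra analysis needed to separate them:

```python
import numpy as np

def fit_camera_motion(points, flow):
    """Fit u = tx + s*x - r*y and v = ty + r*x + s*y to flow vectors:
    (tx, ty) mixes pan/tilt with tracking, s is zoom, r is roll.
    `points` and `flow` are (n, 2) arrays of positions and displacements."""
    x, y = points[:, 0], points[:, 1]
    u, v = flow[:, 0], flow[:, 1]
    A = np.zeros((2 * len(points), 4))
    A[0::2] = np.c_[np.ones_like(x), np.zeros_like(x), x, -y]   # u equations
    A[1::2] = np.c_[np.zeros_like(x), np.ones_like(x), y, x]    # v equations
    b = np.empty(2 * len(points))
    b[0::2], b[1::2] = u, v
    tx, ty, s, r = np.linalg.lstsq(A, b, rcond=None)[0]
    return {"pan_or_track_x": tx, "tilt_or_track_y": ty, "zoom": s, "roll": r}
```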

Journal ArticleDOI
TL;DR: Working from the double-loop type down to the arch type, the framework employs both a geometric grouping and a global geometric shape analysis of fingerprint ridges to accomplish the required task.
Abstract: Given a digitized fingerprint image, we would like to classify it into one of several types already established in the literature. In this paper, we consider five types for classification: double loop, whorl, left loop, right loop, and arch. We illustrate the use of a geometric framework for a systematic top-down classification of the foregoing types. From the double-loop type down to the arch type in the order given above, the framework employs both a geometric grouping and a global geometric shape analysis of fingerprint ridges to accomplish the required task. These processes are based on the framework's underlying B-spline representation and interpretation of the ridges.