
Showing papers in "IEEE Transactions on Pattern Analysis and Machine Intelligence in 1999"


Journal ArticleDOI
TL;DR: In this paper, a 3D shape-based object recognition system for simultaneous recognition of multiple objects in scenes containing clutter and occlusion is presented, which is based on matching surfaces by matching points using the spin image representation.
Abstract: We present a 3D shape-based object recognition system for simultaneous recognition of multiple objects in scenes containing clutter and occlusion. Recognition is based on matching surfaces by matching points using the spin image representation. The spin image is a data level shape descriptor that is used to match surfaces represented as surface meshes. We present a compression scheme for spin images that results in efficient multiple object recognition which we verify with results showing the simultaneous recognition of multiple objects from a library of 20 models. Furthermore, we demonstrate the robust performance of recognition in the presence of clutter and occlusion through analysis of recognition trials on 100 scenes.

2,798 citations
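
A rough illustrative sketch of the spin-image construction described above (not the authors' implementation): each mesh vertex is mapped into cylindrical coordinates relative to an oriented basis point and accumulated into a 2D histogram. The bin size and image width below are arbitrary assumptions.

```python
import numpy as np

def spin_image(p, n, vertices, bin_size=0.01, image_width=32):
    """Sketch of a spin image for an oriented point (p, n) over mesh vertices.

    Each vertex x is mapped to cylindrical coordinates relative to (p, n):
      beta  = n . (x - p)                 (signed distance along the normal)
      alpha = sqrt(|x - p|^2 - beta^2)    (radial distance from the normal axis)
    and accumulated into a 2D histogram (the spin image).
    """
    n = n / np.linalg.norm(n)
    d = vertices - p                        # (V, 3) offsets from the oriented point
    beta = d @ n                            # projection onto the normal
    alpha = np.sqrt(np.maximum((d * d).sum(axis=1) - beta**2, 0.0))

    img = np.zeros((image_width, image_width))
    i = np.floor(image_width / 2 - beta / bin_size).astype(int)   # rows: beta
    j = np.floor(alpha / bin_size).astype(int)                    # cols: alpha
    ok = (i >= 0) & (i < image_width) & (j >= 0) & (j < image_width)
    np.add.at(img, (i[ok], j[ok]), 1)       # accumulate vertex counts
    return img
```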


Journal ArticleDOI
TL;DR: Six well-known SFS algorithms are implemented and compared, and the performance of the algorithms was analyzed on synthetic images using mean and standard deviation of depth error, mean of surface gradient error, and CPU timing.
Abstract: Since the first shape-from-shading (SFS) technique was developed by Horn in the early 1970s, many different approaches have emerged. In this paper, six well-known SFS algorithms are implemented and compared. The performance of the algorithms was analyzed on synthetic images using mean and standard deviation of depth (Z) error, mean of surface gradient (p, q) error, and CPU timing. Each algorithm works well for certain images, but performs poorly for others. In general, minimization approaches are more robust, while the other approaches are faster.

1,879 citations
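
The evaluation criteria above are straightforward to reproduce. The sketch below computes depth-error and gradient-error statistics for an estimated depth map, assuming depth error is measured as an absolute difference and surface gradients are taken by finite differences; the paper's exact definitions may differ.

```python
import numpy as np

def sfs_errors(z_true, z_est):
    """Depth and surface-gradient error statistics for a shape-from-shading result."""
    depth_err = np.abs(z_true - z_est)
    # Surface gradients p = dz/dx, q = dz/dy estimated by finite differences.
    q_t, p_t = np.gradient(z_true)
    q_e, p_e = np.gradient(z_est)
    grad_err = np.sqrt((p_t - p_e) ** 2 + (q_t - q_e) ** 2)
    return {
        "mean_depth_error": depth_err.mean(),
        "std_depth_error": depth_err.std(),
        "mean_gradient_error": grad_err.mean(),
    }
```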


Journal ArticleDOI
TL;DR: Most major filtering approaches to texture feature extraction are reviewed and ranked on the basis of extensive experiments; the effect of the filtering itself is isolated by keeping the local energy function and the classification algorithm identical for most approaches.
Abstract: In this paper, we review most major filtering approaches to texture feature extraction and perform a comparative study. Filtering approaches included are Laws masks (1980), ring/wedge filters, dyadic Gabor filter banks, wavelet transforms, wavelet packets and wavelet frames, quadrature mirror filters, discrete cosine transform, eigenfilters, optimized Gabor filters, linear predictors, and optimized finite impulse response filters. The features are computed as the local energy of the filter responses. The effect of the filtering is highlighted, keeping the local energy function and the classification algorithm identical for most approaches. For reference, comparisons with two classical nonfiltering approaches, co-occurrence (statistical) and autoregressive (model based) features, are given. We present a ranking of the tested approaches based on extensive experiments.

1,567 citations
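
A hedged sketch of the shared "local energy" pipeline that the comparison above keeps fixed: convolve with a filter bank, apply a nonlinearity, then smooth locally. The filter bank, nonlinearity, and smoothing window below are illustrative choices, not the paper's exact configuration.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def local_energy_features(image, filter_bank, smoothing_size=15):
    """Per-pixel texture features: local energy of each filter response.

    filter_bank: list of 2D kernels (e.g., Laws masks or Gabor filters).
    Returns an array of shape (H, W, len(filter_bank)).
    """
    feats = []
    for kernel in filter_bank:
        response = convolve(image.astype(float), kernel, mode="reflect")
        energy = uniform_filter(np.abs(response), size=smoothing_size)  # local energy
        feats.append(energy)
    return np.stack(feats, axis=-1)
```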


Journal ArticleDOI
TL;DR: This work proposes a method for automatically classifying facial images based on labeled elastic graph matching, a 2D Gabor wavelet representation, and linear discriminant analysis, and a visual interpretation of the discriminant vectors.
Abstract: We propose a method for automatically classifying facial images based on labeled elastic graph matching, a 2D Gabor wavelet representation, and linear discriminant analysis. Results of tests with three image sets are presented for the classification of sex, "race", and expression. A visual interpretation of the discriminant vectors is provided.

1,095 citations


Journal ArticleDOI
TL;DR: This paper explores and compares techniques for automatically recognizing facial actions in sequences of images and provides converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions.
Abstract: The facial action coding system (FACS) is an objective method for quantifying facial movement in terms of component actions. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include: analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions.

1,086 citations


Journal ArticleDOI
TL;DR: A similarity measure is developed, based on fuzzy logic, that exhibits several features that match experimental findings in humans and is an extension to a more general domain of the feature contrast model due to Tversky (1977).
Abstract: With complex multimedia data, we see the emergence of database systems in which the fundamental operation is similarity assessment. Before database issues can be addressed, it is necessary to give a definition of similarity as an operation. We develop a similarity measure, based on fuzzy logic, that exhibits several features that match experimental findings in humans. The model is dubbed fuzzy feature contrast (FFC) and is an extension to a more general domain of the feature contrast model due to Tversky (1977). We show how the FFC model can be used to model similarity assessment from fuzzy judgment of properties, and we address the use of fuzzy measures to deal with dependencies among the properties.

834 citations
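
For orientation, Tversky's feature contrast model scores similarity as a weighted combination of common and distinctive features, S(A, B) = θ·f(A∩B) − α·f(A−B) − β·f(B−A). A minimal numeric sketch with fuzzy membership vectors, using min for the fuzzy intersection and a bounded difference (an assumption about how the FFC operators might be instantiated, not the paper's exact formulation):

```python
import numpy as np

def fuzzy_feature_contrast(a, b, theta=1.0, alpha=0.5, beta=0.5):
    """Tversky-style contrast similarity over fuzzy feature membership vectors a, b in [0, 1]."""
    common = np.minimum(a, b)          # fuzzy intersection A and B
    a_only = np.maximum(a - b, 0.0)    # fuzzy difference A - B
    b_only = np.maximum(b - a, 0.0)    # fuzzy difference B - A
    f = np.sum                          # salience of a fuzzy feature set (here: sigma-count)
    return theta * f(common) - alpha * f(a_only) - beta * f(b_only)

# Example: two items described by three fuzzy properties.
print(fuzzy_feature_contrast(np.array([0.9, 0.2, 0.7]), np.array([0.8, 0.1, 0.3])))
```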


Journal ArticleDOI
Hyeon-Kyu Lee1, Jin Hyung Kim2
TL;DR: A new method based on hidden Markov models (HMMs) is developed; a threshold model calculates the likelihood threshold of an input pattern and provides a confirmation mechanism for provisionally matched gesture patterns.
Abstract: A new method is developed using the hidden Markov model (HMM) based technique. To handle nongesture patterns, we introduce the concept of a threshold model that calculates the likelihood threshold of an input pattern and provides a confirmation mechanism for the provisionally matched gesture patterns. The threshold model is a weak model for all trained gestures in the sense that its likelihood is smaller than that of the dedicated gesture model for a given gesture. Consequently, the likelihood can be used as an adaptive threshold for selecting the proper gesture model. It has, however, a large number of states and needs to be reduced because the threshold model is constructed by collecting the states of all gesture models in the system. To overcome this problem, the states with similar probability distributions are merged, utilizing the relative entropy measure. Experimental results show that the proposed method can successfully extract trained gestures from continuous hand motion with 93.14% reliability.

704 citations
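
A minimal sketch of the decision rule implied above. The gesture models and the threshold model are assumed to be already trained and to expose a log-likelihood scoring function (e.g., the HMM forward algorithm); all names here are placeholders.

```python
import numpy as np

def recognize_gesture(observation_seq, gesture_models, threshold_model, score):
    """Return the best-matching gesture label, or None for a non-gesture pattern.

    score(model, seq) is assumed to return log P(seq | model), e.g., computed by
    the forward algorithm of a trained HMM.
    """
    labels = list(gesture_models)
    log_liks = np.array([score(gesture_models[g], observation_seq) for g in labels])
    best = int(np.argmax(log_liks))
    # The threshold model's likelihood acts as an adaptive threshold: accept the
    # provisional match only if it beats the threshold model.
    if log_liks[best] > score(threshold_model, observation_seq):
        return labels[best]
    return None
```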


Journal ArticleDOI
TL;DR: This work proposes a family of linear methods that yield a unique solution to 4- and 5-point pose determination for generic reference points, shows that they do not degenerate for coplanar configurations, and demonstrates that in practice they even outperform the special linear algorithm designed for coplanar configurations.
Abstract: The determination of camera position and orientation from known correspondences of 3D reference points and their images is known as pose estimation in computer vision and space resection in photogrammetry. It is well-known that from three corresponding points there are at most four algebraic solutions. Less appears to be known about the cases of four and five corresponding points. We propose a family of linear methods that yield a unique solution to 4- and 5-point pose determination for generic reference points. We first review the 3-point algebraic method. Then we present our two-step, 4-point and one-step, 5-point linear algorithms. The 5-point method can also be extended to handle more than five points. Finally, we demonstrate our methods on both simulated and real images. We show that they do not degenerate for coplanar configurations and even outperform the special linear algorithm for coplanar configurations in practice.

671 citations


Journal ArticleDOI
TL;DR: The approach is to extend the standard hidden Markov model method of gesture recognition by including a global parametric variation in the output probabilities of the HMM states; an expectation-maximization (EM) method is formulated for training the resulting parametric HMM.
Abstract: A method for the representation, recognition, and interpretation of parameterized gesture is presented. By parameterized gesture we mean gestures that exhibit a systematic spatial variation; one example is a point gesture where the relevant parameter is the two-dimensional direction. Our approach is to extend the standard hidden Markov model method of gesture recognition by including a global parametric variation in the output probabilities of the HMM states. Using a linear model of dependence, we formulate an expectation-maximization (EM) method for training the parametric HMM. During testing, a similar EM algorithm simultaneously maximizes the output likelihood of the PHMM for the given sequence and estimates the quantifying parameters. Using visually derived and directly measured three-dimensional hand position measurements as input, we present results that demonstrate the recognition superiority of the PHMM over standard HMM techniques, as well as greater robustness in parameter estimation with respect to noise in the input features. Finally, we extend the PHMM to handle arbitrary smooth (nonlinear) dependencies. The nonlinear formulation requires the use of a generalized expectation-maximization (GEM) algorithm for both training and the simultaneous recognition of the gesture and estimation of the value of the parameter. We present results on a pointing gesture, where the nonlinear approach permits the natural spherical coordinate parameterization of pointing direction.

646 citations
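
The core modification described above is that state output densities depend on a global parameter θ. A sketch under the linear-dependence assumption, with Gaussian outputs whose means shift linearly with θ; the symbols and shapes below are illustrative, not the paper's notation.

```python
import numpy as np

def phmm_state_loglik(x, theta, mu_bar, W, cov):
    """log N(x; mu_bar + W @ theta, cov): Gaussian output of one PHMM state,
    with the mean depending linearly on the global gesture parameter theta."""
    mu = mu_bar + W @ theta
    d = x - mu
    k = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (k * np.log(2 * np.pi) + logdet + d @ np.linalg.solve(cov, d))
```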


Journal ArticleDOI
TL;DR: This work presents a fingerprint classification algorithm, based on a novel representation and a two-stage classifier, that achieves an accuracy better than previously reported in the literature.
Abstract: Fingerprint classification provides an important indexing mechanism in a fingerprint database. An accurate and consistent classification can greatly reduce fingerprint matching time for a large database. We present a fingerprint classification algorithm which is able to achieve an accuracy better than previously reported in the literature. We classify fingerprints into five categories: whorl, right loop, left loop, arch, and tented arch. The algorithm uses a novel representation (FingerCode) and is based on a two-stage classifier to make a classification. It has been tested on 4000 images in the NIST-4 database. For the five-class problem, a classification accuracy of 90 percent is achieved (with a 1.8 percent rejection during the feature extraction phase). For the four-class problem (arch and tented arch combined into one class), we are able to achieve a classification accuracy of 94.8 percent (with 1.8 percent rejection). By incorporating a reject option at the classifier, the classification accuracy can be increased to 96 percent for the five-class classification task, and to 97.8 percent for the four-class classification task after a total of 32.5 percent of the images are rejected.

639 citations


Journal ArticleDOI
TL;DR: This paper addresses three major issues associated with conventional partitional clustering, namely, sensitivity to initialization, difficulty in determining the number of clusters, and sensitivity to noise and outliers with the proposed robust competitive agglomeration (RCA).
Abstract: This paper addresses three major issues associated with conventional partitional clustering, namely, sensitivity to initialization, difficulty in determining the number of clusters, and sensitivity to noise and outliers. The proposed robust competitive agglomeration (RCA) algorithm starts with a large number of clusters to reduce the sensitivity to initialization, and determines the actual number of clusters by a process of competitive agglomeration. Noise immunity is achieved by incorporating concepts from robust statistics into the algorithm. RCA assigns two different sets of weights for each data point: the first set of constrained weights represents degrees of sharing, and is used to create a competitive environment and to generate a fuzzy partition of the data set. The second set corresponds to robust weights, and is used to obtain robust estimates of the cluster prototypes. By choosing an appropriate distance measure in the objective function, RCA can be used to find an unknown number of clusters of various shapes in noisy data sets, as well as to fit an unknown number of parametric models simultaneously. Several examples, such as clustering/mixture decomposition, line/plane fitting, segmentation of range images, and estimation of motion parameters of multiple objects, are shown.
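
A heavily simplified sketch of the competitive-agglomeration idea (start over-clustered, let low-cardinality clusters die): this is fuzzy c-means with a cluster-discarding step, not the full RCA objective with robust weights.

```python
import numpy as np

def fcm_with_agglomeration(X, n_init=20, m=2.0, min_card=5.0, iters=100, seed=0):
    """Fuzzy c-means that starts with many clusters and discards clusters whose
    fuzzy cardinality (column sum of memberships) drops below min_card."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_init, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12
        u = 1.0 / (d2 ** (1.0 / (m - 1)))
        u /= u.sum(axis=1, keepdims=True)              # fuzzy memberships
        card = u.sum(axis=0)                            # fuzzy cardinality per cluster
        keep = card > min_card                          # competition: small clusters die
        if keep.any():
            u, centers = u[:, keep], centers[keep]
        um = u ** m
        centers = (um.T @ X) / um.sum(axis=0)[:, None]  # update cluster prototypes
    return centers, u
```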

Journal ArticleDOI
TL;DR: A robust system is proposed to automatically detect and extract text in images from different sources, including video, newspapers, advertisements, stock certificates, photographs, and checks.
Abstract: A robust system is proposed to automatically detect and extract text in images from different sources, including video, newspapers, advertisements, stock certificates, photographs, and checks. Text is first detected using multiscale texture segmentation and spatial cohesion constraints, then cleaned up and extracted using a histogram-based binarization algorithm. An automatic performance evaluation scheme is also proposed.
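
The extraction stage relies on a histogram-based binarization; a common histogram-based choice is Otsu's threshold, sketched below (the paper's specific binarization algorithm may differ).

```python
import numpy as np

def otsu_threshold(gray):
    """Histogram-based binarization threshold maximizing between-class variance."""
    hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
    p = hist / hist.sum()
    omega = np.cumsum(p)                     # class-0 probability up to each level
    mu = np.cumsum(p * np.arange(256))       # cumulative mean
    mu_t = mu[-1]
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan               # avoid division by zero at the ends
    sigma_b2 = (mu_t * omega - mu) ** 2 / denom
    return int(np.nanargmax(sigma_b2))

# binary = gray > otsu_threshold(gray)
```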

Journal ArticleDOI
TL;DR: It is proved that, in the new formulation, there is a one-to-one correspondence between maximal cliques and maximal subtree isomorphisms, which allows the tree matching problem to be cast as an indefinite quadratic program using the Motzkin-Straus theorem.
Abstract: It is well-known that the problem of matching two relational structures can be posed as an equivalent problem of finding a maximal clique in a (derived) "association graph." However, it is not clear how to apply this approach to computer vision problems where the graphs are hierarchically organized, i.e., are trees, since maximal cliques are not constrained to preserve the partial order. We provide a solution to the problem of matching two trees by constructing the association graph using the graph-theoretic concept of connectivity. We prove that, in the new formulation, there is a one-to-one correspondence between maximal cliques and maximal subtree isomorphisms. This allows us to cast the tree matching problem as an indefinite quadratic program using the Motzkin-Straus theorem, and we use "replicator" dynamical systems developed in theoretical biology to solve it. Such continuous solutions to discrete problems are attractive because they can motivate analog and biological implementations. The framework is also extended to the matching of attributed trees by using weighted association graphs. We illustrate the power of the approach by matching articulated and deformed shapes described by shock trees.
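
Per the Motzkin-Straus connection cited above, maximal cliques of the association graph correspond to local maximizers of x^T A x over the standard simplex, which replicator dynamics can locate. A minimal sketch, where A is assumed to be the 0/1 adjacency matrix of the association graph and the support extraction is deliberately crude:

```python
import numpy as np

def replicator_clique(A, iters=2000, tol=1e-10):
    """Replicator dynamics x_i <- x_i (A x)_i / (x^T A x) on the simplex.

    The support of the limit point indicates a maximal clique of the graph A.
    """
    n = A.shape[0]
    x = np.full(n, 1.0 / n)                      # start at the barycenter of the simplex
    for _ in range(iters):
        Ax = A @ x
        denom = x @ Ax
        if denom < tol:
            break
        x_new = x * Ax / denom
        if np.abs(x_new - x).sum() < tol:
            x = x_new
            break
        x = x_new
    return np.flatnonzero(x > 1.0 / (2 * n))     # crude support extraction
```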

Journal ArticleDOI
TL;DR: Based upon estimates of the short length scale spatial covariance of the image, a method utilizing indicator kriging to complete the image segmentation is developed.
Abstract: We consider the problem of segmenting a digitized image consisting of two univariate populations. Assume a priori knowledge allows incomplete assignment of voxels in the image, in the sense that a fraction of the voxels can be identified as belonging to population Π0, a second fraction to Π1, and the remaining fraction have no a priori identification. Based upon estimates of the short length scale spatial covariance of the image, we develop a method utilizing indicator kriging to complete the image segmentation.

Journal ArticleDOI
TL;DR: In this article, the RANSAC-based DARCES method is proposed to solve the partially overlapping 3D registration problem without any initial estimation, which can be used even for the case that there are no local features in the 3D data sets.
Abstract: In this paper, we propose a new method, the RANSAC-based DARCES method (data-aligned rigidity-constrained exhaustive search based on random sample consensus), which can solve the partially overlapping 3D registration problem without any initial estimation. For the noiseless case, the basic algorithm of our method can guarantee that the solution it finds is the true one, and its time complexity can be shown to be relatively low. An extra characteristic is that our method can be used even for the case that there are no local features in the 3D data sets.
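
The rigidity constraint at the heart of the method above rests on the fact that three point correspondences determine a rigid transform. A hedged sketch of that sub-step (SVD-based absolute orientation; the full DARCES search and its rigidity-based pruning are not shown):

```python
import numpy as np

def rigid_from_correspondences(P, Q):
    """Least-squares rigid transform (R, t) mapping points P onto Q (both k x 3, k >= 3)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                  # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cq - R @ cp
    return R, t

# Inside a RANSAC-style loop one would sample three points from one surface,
# hypothesize their mates on the other, estimate (R, t) as above, and count how
# many remaining points are aligned within a distance tolerance.
```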

Journal ArticleDOI
TL;DR: This work introduces a new approach to automatic fingerprint classification in which the directional image is partitioned into "homogeneous" connected regions according to the fingerprint topology, thus giving a synthetic representation which can be exploited as a basis for the classification.
Abstract: In this work, we introduce a new approach to automatic fingerprint classification. The directional image is partitioned into "homogeneous" connected regions according to the fingerprint topology, thus giving a synthetic representation which can be exploited as a basis for the classification. A set of dynamic masks, together with an optimization criterion, are used to guide the partitioning. The adaptation of the masks produces a numerical vector representing each fingerprint as a multidimensional point, which can be conceived as a continuous classification. Different search strategies are discussed to efficiently retrieve fingerprints both with continuous and exclusive classification. Experimental results have been given for the most commonly used fingerprint databases and the new method has been compared with other approaches known in the literature: As to fingerprint retrieval based on continuous classification, our method gives the best performance and exhibits a very high robustness.

Journal ArticleDOI
TL;DR: This paper demonstrates the feasibility of an end-to-end person tracking system using a unique combination of motion analysis on 3D geometry in different camera coordinates and other existing techniques in motion detection, segmentation, and pattern recognition.
Abstract: This paper presents a comprehensive framework for tracking coarse human models from sequences of synchronized monocular grayscale images in multiple camera coordinates. It demonstrates the feasibility of an end-to-end person tracking system using a unique combination of motion analysis on 3D geometry in different camera coordinates and other existing techniques in motion detection, segmentation, and pattern recognition. The system starts with tracking from a single camera view. When the system predicts that the active camera will no longer have a good view of the subject of interest, tracking will be switched to another camera which provides a better view and requires the least switching to continue tracking. The nonrigidity of the human body is addressed by matching points of the middle line of the human image, spatially and temporally, using Bayesian classification schemes. Multivariate normal distributions are employed to model class-conditional densities of the features for tracking, such as location, intensity, and geometric features. Limited degrees of occlusion are tolerated within the system. Experimental results using a prototype system are presented and the performance of the algorithm is evaluated to demonstrate its feasibility for real time applications.

Journal ArticleDOI
TL;DR: A new Kalman-filter based active contour model is proposed for tracking of nonrigid objects in combined spatio-velocity space and an optical-flow based detection mechanism is proposed to improve robustness to image clutter and to occlusions.
Abstract: A new Kalman-filter based active contour model is proposed for tracking of nonrigid objects in combined spatio-velocity space. The model employs measurements of gradient-based image potential and of optical flow along the contour as system measurements. In order to improve robustness to image clutter and to occlusions, an optical-flow-based detection mechanism is proposed. The method detects and rejects spurious measurements which are not consistent with previous estimation of image motion.

Journal ArticleDOI
TL;DR: An efficient feature extraction method based on the fast wavelet transform is presented that has been verified on a flank wear estimation problem in turning processes and on a problem of recognizing different kinds of lung sounds for diagnosis of pulmonary diseases.
Abstract: An efficient feature extraction method based on the fast wavelet transform is presented. The paper especially deals with the assessment of process parameters or states in a given application using the features extracted from the wavelet coefficients of measured process signals. Since the parameter assessment using all wavelet coefficients will often turn out to be tedious or leads to inaccurate results, a preprocessing routine that computes robust features correlated to the process parameters of interest is highly desirable. The method presented divides the matrix of computed wavelet coefficients into clusters equal to row vectors. The rows that represent important frequency ranges (for signal interpretation) have a larger number of clusters than the rows that represent less important frequency ranges. The features of a process signal are eventually calculated by the Euclidean norms of the clusters. The effectiveness of this new method has been verified on a flank wear estimation problem in turning processes and on a problem of recognizing different kinds of lung sounds for diagnosis of pulmonary diseases.
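
A condensed sketch of the feature computation described above, assuming the PyWavelets package and a per-level cluster count chosen by the user (more clusters for the frequency bands deemed important); wavelet, level, and cluster counts are illustrative.

```python
import numpy as np
import pywt

def wavelet_cluster_features(signal, wavelet="db4", level=4,
                             clusters_per_level=(8, 8, 4, 2, 2)):
    """Features = Euclidean norms of clusters of wavelet coefficients.

    clusters_per_level[i] is the number of clusters for the i-th coefficient
    array returned by pywt.wavedec (approximation first, then details).
    """
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    features = []
    for row, n_clusters in zip(coeffs, clusters_per_level):
        for chunk in np.array_split(row, n_clusters):
            features.append(np.linalg.norm(chunk))   # energy of this coefficient cluster
    return np.array(features)
```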

Journal ArticleDOI
TL;DR: A region snake approach is proposed and fast algorithms for the segmentation of an object in an image are determined by transforming the summations over a region, for the calculation of the statistics, into summations along the boundary of the region.
Abstract: Algorithms for object segmentation are crucial in many image processing applications. In recent years, active contour models (snakes) have been widely used for finding the contours of objects. This segmentation strategy is classically edge-based in the sense that the snake is driven to fit the maximum of an edge map of the scene. We propose a region snake approach and we determine fast algorithms for the segmentation of an object in an image. The algorithms, developed in a maximum likelihood approach, are based on the calculation of the statistics of the inner and the outer regions (defined by the snake). It has thus been possible to develop optimal algorithms adapted to the random fields which describe the gray levels in the input image, if we assume that their probability density function family is known. We demonstrate that this approach is still efficient when no edge exists along the object's boundary in the image. We also show that one can obtain fast algorithms by transforming the summations over a region, for the calculation of the statistics, into summations along the boundary of the region. Finally, we provide numerical simulation results for different physical situations in order to illustrate the efficiency of this approach.

Journal ArticleDOI
TL;DR: The number of costly computations needed to determine if an area of the viewing volume would be occluded from some scanning position is decoupled from the number of positions considered for the NBV, thus reducing the computational cost of choosing one.
Abstract: A solution to the "next best view" (NBV) problem for automated surface acquisition is presented. The NBV problem is to determine which areas of a scanner's viewing volume need to be scanned to sample all of the visible surfaces of an a priori unknown object and where to position/control the scanner to sample them. A method for determining the unscanned areas of the viewing volume is presented. In addition, a novel representation, positional space, is presented which facilitates a solution to the NBV problem by representing what must be and what can be scanned in a single data structure. The number of costly computations needed to determine if an area of the viewing volume would be occluded from some scanning position is decoupled from the number of positions considered for the NBV, thus reducing the computational cost of choosing one. An automated surface acquisition system designed to scan all visible surfaces of an a priori unknown object is demonstrated on real objects.

Journal ArticleDOI
TL;DR: An algorithm is developed to specifically align and mosaic images using parametric transformations in the presence of lens distortion to provide true multi-image alignment that does not rely on the measurements of a reference image being distortion free.
Abstract: Multiple images of a scene are related through 2D/3D view transformations and linear and nonlinear camera transformations. We present an algorithm for true multi-image alignment that does not rely on the measurements of a reference image being distortion free. The algorithm is developed to specifically align and mosaic images using parametric transformations in the presence of lens distortion. When lens distortion is present, none of the images can be assumed to be ideal. In our formulation, all the images are modeled as intensity measurements represented in their respective coordinate systems, each of which is related to an ideal coordinate system through an interior camera transformation and an exterior view transformation. The goal of the accompanying algorithm is to compute an image in the ideal coordinate system while solving for the transformations that relate the ideal system with each of the data images. Key advantages of the technique presented in this paper are: (i) no reliance on one distortion free image, (ii) ability to register images and compute coordinate transformations even when the multiple images are of an extended scene with no overlap between the first and last frame of the sequence, and (iii) ability to handle linear and nonlinear transformations within the same framework. Results of applying the algorithm are presented for the correction of lens distortion, and creation of video mosaics.
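
For reference, a standard polynomial radial-distortion model of the kind such an alignment algorithm must account for; the paper's interior-camera parameterization may differ. This maps ideal normalized image coordinates to distorted ones; estimating the distortion coefficients jointly with the view transformations, rather than treating one frame as distortion free, is the key point of the framework above.

```python
import numpy as np

def apply_radial_distortion(xy, k1, k2=0.0):
    """Apply radial lens distortion to ideal normalized coordinates xy (N x 2):
    x_d = x * (1 + k1*r^2 + k2*r^4), and likewise for y."""
    r2 = (xy ** 2).sum(axis=1, keepdims=True)
    return xy * (1.0 + k1 * r2 + k2 * r2 ** 2)
```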

Journal ArticleDOI
TL;DR: Two fuzzy models are built to describe skin color and hair color; the extracted skin and hair regions are then compared with prebuilt head-shape models using a fuzzy-theory-based pattern-matching method to detect face candidates.
Abstract: This paper describes a new method to detect faces in color images based on the fuzzy theory. We make two fuzzy models to describe the skin color and hair color, respectively. In these models, we use a perceptually uniform color space to describe the color information to increase the accuracy and stableness. We use the two models to extract the skin color regions and the hair color regions, and then compare them with the prebuilt head-shape models by using a fuzzy theory based pattern-matching method to detect face candidates.

Journal ArticleDOI
TL;DR: This paper rederives these algorithms as approximations of the Kalman filter and then carries out a thorough analysis of their performance, which shows the computational feasibility of these algorithms.
Abstract: In an earlier work (1999), we introduced the problem of reconstructing a super-resolution image sequence from a given low resolution sequence. We proposed two iterative algorithms, the R-SD and the R-LMS, to generate the desired image sequence. These algorithms assume the knowledge of the blur, the down-sampling, the sequences motion, and the measurements noise characteristics, and apply a sequential reconstruction process. It has been shown that the computational complexity of these two algorithms makes both of them practically applicable. In this paper, we rederive these algorithms as approximations of the Kalman filter and then carry out a thorough analysis of their performance. For each algorithm, we calculate a bound on its deviation from the Kalman filter performance. We also show that the propagated information matrix within the R-SD algorithm remains sparse in time, thus ensuring the applicability of this algorithm. To support these analytical results we present some computer simulations on synthetic sequences, which also show the computational feasibility of these algorithms.

Journal ArticleDOI
Horst Bunke1
TL;DR: It is shown that, for a given cost function, there are an infinite number of other cost functions that lead, for any given pair of graphs, to the same optimal error correcting matching.
Abstract: Investigates the influence of the cost function on the optimal match between two graphs. It is shown that, for a given cost function, there are an infinite number of other cost functions that lead, for any given pair of graphs, to the same optimal error correcting matching. Furthermore, it is shown that well-known concepts from graph theory, such as graph isomorphism, subgraph isomorphism, and maximum common subgraph, are special cases of optimal error correcting graph matching under particular cost functions.

Journal ArticleDOI
TL;DR: This paper presents a flexible algorithm to match curves under substantial deformations and arbitrarily large scaling and rigid transformations, reports extensive experiments on silhouette matching, and defines a dissimilarity measure which is used to organize the image database into shape categories.
Abstract: Curve matching is one instance of the fundamental correspondence problem. Our flexible algorithm is designed to match curves under substantial deformations and arbitrarily large scaling and rigid transformations. A syntactic representation is constructed for both curves and an edit transformation which maps one curve to the other is found using dynamic programming. We present extensive experiments where we apply the algorithm to silhouette matching. In these experiments, we examine partial occlusion, viewpoint variation, articulation, and class matching (where silhouettes of similar objects are matched). Based on the qualitative syntactic matching, we define a dissimilarity measure and we compute it for every pair of images in a database of 121 images. We use this experiment to objectively evaluate our algorithm. First, we compare our results to those reported by others. Second, we use the dissimilarity values in order to organize the image database into shape categories. The veridical hierarchical organization stands as evidence of the quality of our matching and similarity estimation.
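
The dynamic-programming step above is essentially an edit distance between the two syntactic curve descriptions. A generic sketch follows; the symbol alphabet and costs are placeholders, and the paper's edit operations and costs are richer than this.

```python
import numpy as np

def edit_distance(a, b, sub_cost=lambda x, y: 0.0 if x == y else 1.0, gap_cost=1.0):
    """Edit distance between symbol sequences a and b via dynamic programming."""
    D = np.zeros((len(a) + 1, len(b) + 1))
    D[:, 0] = gap_cost * np.arange(len(a) + 1)
    D[0, :] = gap_cost * np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            D[i, j] = min(D[i - 1, j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitute
                          D[i - 1, j] + gap_cost,                          # delete
                          D[i, j - 1] + gap_cost)                          # insert
    return D[len(a), len(b)]
```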

Journal ArticleDOI
TL;DR: An omnifont, unlimited-vocabulary OCR system for English and Arabic based on hidden Markov models (HMM), an approach that has proven to be very successful in the area of automatic speech recognition, is presented.
Abstract: We present an omnifont, unlimited-vocabulary OCR system for English and Arabic. The system is based on hidden Markov models (HMM), an approach that has proven to be very successful in the area of automatic speech recognition. We focus on two aspects of the OCR system. First, we address the issue of how to perform OCR on omnifont and multi-style data, such as plain and italic, without the need to have a separate model for each style. The amount of training data from each style, which is used to train a single model, becomes an important issue in the face of the conditional independence assumption inherent in the use of HMMs. We demonstrate mathematically and empirically how to allocate training data among the different styles to alleviate this problem. Second, we show how to use a word-based HMM system to perform character recognition with unlimited vocabulary. The method includes the use of a trigram language model on character sequences. Using all these techniques, we have achieved character error rates of 1.1 percent on data from the University of Washington English Document Image Database and 3.3 percent on data from the DARPA Arabic OCR Corpus.

Journal ArticleDOI
TL;DR: A hidden Markov model-based approach is designed to recognize off-line unconstrained handwritten words for large vocabularies; experiments on real-life data show that it can be successfully used for handwritten word recognition.
Abstract: Describes a hidden Markov model-based approach designed to recognize off-line unconstrained handwritten words for large vocabularies. After preprocessing, a word image is segmented into letters or pseudoletters and represented by two feature sequences of equal length, each consisting of an alternating sequence of shape-symbols and segmentation-symbols, which are both explicitly modeled. The word model is made up of the concatenation of appropriate letter models consisting of elementary HMMs and an HMM-based interpolation technique is used to optimally combine the two feature sets. Two rejection mechanisms are considered depending on whether or not the word image is guaranteed to belong to the lexicon. Experiments carried out on real-life data show that the proposed approach can be successfully used for handwritten word recognition.

Journal ArticleDOI
TL;DR: The notion of the histogram of forces is introduced; it generalizes and may supersede the histogram of angles, which is considered to provide a good representation of the relative position of one object with regard to another.
Abstract: The fuzzy qualitative evaluation of directional spatial relationships (such as "to the right of", "to the south of...") between areal objects often relies on the computation of a histogram of angles, which is considered to provide a good representation of the relative position of an object with regard to another. In this paper, the notion of the histogram of forces is introduced. It generalizes and may supersede the histogram of angles. The objects (2D entities) are handled as longitudinal sections (1D entities), not as points (0D entities). It is thus possible to fully benefit from the power of integral calculus and, so, ensure rapid processing of raster data, as well as of vector data, explicitly considering both angular and metric information.
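
For contrast with the histogram of forces, the classical histogram of angles treats the objects as point sets: over all pairs of points (a in A, b in B), accumulate the direction of the vector from a to b. A minimal sketch, with an arbitrary bin count:

```python
import numpy as np

def histogram_of_angles(A, B, n_bins=180):
    """Histogram of directions of vectors from points of object A to points of object B.

    A, B: (n, 2) and (m, 2) arrays of pixel coordinates (x, y).
    """
    d = B[None, :, :] - A[:, None, :]                   # all pairwise displacement vectors
    angles = np.arctan2(d[..., 1], d[..., 0]).ravel()   # directions in (-pi, pi]
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    return hist / hist.sum()
```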

Journal ArticleDOI
TL;DR: The main conclusion drawn from the analysis is that the data-closeness constraint improves the efficiency of shape-from-shading and that both the topographic and gradient consistency constraints improve the fidelity of the recovered needle-map.
Abstract: This paper makes two contributions to the problem of needle-map recovery using shape-from-shading. First, we provide a geometric update procedure which allows the image irradiance equation to be satisfied as a hard constraint. This not only improves the data closeness of the recovered needle-map, but also removes the necessity for extensive parameter tuning. Second, we exploit the improved ease of control of the new shape-from-shading process to investigate various types of needle-map consistency constraint. The first set of constraints are based on needle-map smoothness. The second avenue of investigation is to use curvature information to impose topographic constraints. Third, we explore ways in which the needle-map is recovered so as to be consistent with the image gradient field. In each case we explore a variety of robust error measures and consistency weighting schemes that can be used to impose the desired constraints on the recovered needle-map. We provide an experimental assessment of the new shape-from-shading framework on both real world images and synthetic images with known ground truth surface normals. The main conclusion drawn from our analysis is that the data-closeness constraint improves the efficiency of shape-from-shading and that both the topographic and gradient consistency constraints improve the fidelity of the recovered needle-map.