
Showing papers in "IEEE Transactions on Pattern Analysis and Machine Intelligence in 1988"


Journal ArticleDOI
TL;DR: Various types of moments have been used to recognize image patterns in a number of applications and some fundamental questions are addressed, such as image-representation ability, noise sensitivity, and information redundancy.
Abstract: Various types of moments have been used to recognize image patterns in a number of applications. A number of moments are evaluated and some fundamental questions are addressed, such as image-representation ability, noise sensitivity, and information redundancy. Moments considered include regular moments, Legendre moments, Zernike moments, pseudo-Zernike moments, rotational moments, and complex moments. Properties of these moments are examined in detail and the interrelationships among them are discussed. Both theoretical and experimental results are presented. >

1,522 citations
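Of the moment families the paper compares, the regular (geometric) moments are the simplest to state. A minimal pure-Python sketch of raw and central moments; the Legendre, Zernike, pseudo-Zernike, rotational, and complex moments discussed in the paper are omitted here.

```python
# Regular (geometric) and central moments of a small gray-level image,
# a minimal sketch in pure Python.

def raw_moment(img, p, q):
    """m_pq = sum over pixels of x^p * y^q * I(y, x)."""
    return sum((x ** p) * (y ** q) * val
               for y, row in enumerate(img)
               for x, val in enumerate(row))

def central_moment(img, p, q):
    """mu_pq, computed about the image centroid (translation-invariant)."""
    m00 = raw_moment(img, 0, 0)
    xc = raw_moment(img, 1, 0) / m00
    yc = raw_moment(img, 0, 1) / m00
    return sum(((x - xc) ** p) * ((y - yc) ** q) * val
               for y, row in enumerate(img)
               for x, val in enumerate(row))

img = [[0, 1, 0],
       [1, 2, 1],
       [0, 1, 0]]
print(raw_moment(img, 0, 0))      # total mass: 6
print(central_moment(img, 1, 0))  # first central moments vanish: 0.0
```

Higher-order central moments of this kind are the raw material from which the invariant and orthogonal moment sets in the paper are built.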


Journal ArticleDOI
TL;DR: The hierarchical chamfer matching algorithm matches edges by minimizing a generalized distance between them in a hierarchical structure, i.e. in a resolution pyramid, which reduces the computational load significantly.
Abstract: The algorithm matches edges by minimizing a generalized distance between them. The matching is performed in a series of images depicting the same scene with different resolutions, i.e. in a resolution pyramid. Using this hierarchical structure reduces the computational load significantly. The algorithm is reasonably simple to implement and is insensitive to noise and other disturbances. The algorithm has been tested in several applications. Two of them are briefly presented. In the first application the outlines of common tools are matched to gray-level images of the same tools, with overlapping. In the second application lake edges from aerial photographs are matched to lake edges from a map, with translation, rotation, scale, and perspective changes. The hierarchical chamfer matching algorithm gives correct results using a reasonable amount of computational resources in all tested applications. >

1,206 citations
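The core of chamfer matching is a distance transform of the image edge map plus a score that sums distances at the template's edge points. A single-level pure-Python sketch (the paper's resolution pyramid is omitted), using the standard two-pass 3-4 chamfer approximation:

```python
# Chamfer matching sketch: two-pass 3-4 distance transform of the image
# edge map, then a score averaging the distance at each translated template
# edge point. Single resolution level only.

INF = 10 ** 9

def chamfer_dt(edges):
    """3-4 chamfer approximation of the distance to the nearest edge pixel."""
    h, w = len(edges), len(edges[0])
    d = [[0 if edges[y][x] else INF for x in range(w)] for y in range(h)]
    for y in range(h):                       # forward pass
        for x in range(w):
            for dy, dx, c in ((0, -1, 3), (-1, -1, 4), (-1, 0, 3), (-1, 1, 4)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    d[y][x] = min(d[y][x], d[yy][xx] + c)
    for y in range(h - 1, -1, -1):           # backward pass
        for x in range(w - 1, -1, -1):
            for dy, dx, c in ((0, 1, 3), (1, 1, 4), (1, 0, 3), (1, -1, 4)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    d[y][x] = min(d[y][x], d[yy][xx] + c)
    return d

def match_score(dt, template_pts, ty, tx):
    """Mean chamfer distance of the template edge points placed at (ty, tx)."""
    return sum(dt[y + ty][x + tx] for y, x in template_pts) / len(template_pts)

# Image containing an L-shaped edge; the same shape serves as the template.
edges = [[0] * 8 for _ in range(8)]
shape = [(2, 2), (3, 2), (4, 2), (4, 3), (4, 4)]
for y, x in shape:
    edges[y][x] = 1
dt = chamfer_dt(edges)
best = min((match_score(dt, shape, ty, tx), ty, tx)
           for ty in range(3) for tx in range(3))
print(best[1:])  # best translation is (0, 0), where the score is exactly 0
```

In the hierarchical version, this minimization is run first on a coarse pyramid level and the result used to restrict the search at the next finer level.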


Journal ArticleDOI
TL;DR: A piecewise-smooth surface model for image data that possesses surface coherence properties is used to develop an algorithm that simultaneously segments a large class of images into regions of arbitrary shape and approximates image data with bivariate functions so that it is possible to compute a complete, noiseless image reconstruction based on the extracted functions and regions.
Abstract: The solution of the segmentation problem requires a mechanism for partitioning the image array into low-level entities based on a model of the underlying image structure. A piecewise-smooth surface model for image data that possesses surface coherence properties is used to develop an algorithm that simultaneously segments a large class of images into regions of arbitrary shape and approximates image data with bivariate functions so that it is possible to compute a complete, noiseless image reconstruction based on the extracted functions and regions. Surface curvature sign labeling provides an initial coarse image segmentation, which is refined by an iterative region-growing method based on variable-order surface fitting. Experimental results show the algorithm's performance on six range images and three intensity images. >

1,151 citations


Journal ArticleDOI
TL;DR: An approach for enforcing integrability, a particular implementation of the approach, an example of its application to extending an existing shape-from-shading algorithm, and experimental results showing the improvement that results from enforcing integrability are presented.
Abstract: An approach for enforcing integrability, a particular implementation of the approach, an example of its application to extending an existing shape-from-shading algorithm, and experimental results showing the improvement that results from enforcing integrability are presented. A possibly nonintegrable estimate of surface slopes is represented by a finite set of basis functions, and integrability is enforced by calculating the orthogonal projection onto a vector subspace spanning the set of integrable slopes. The integrability projection constraint was applied to extending an iterative shape-from-shading algorithm of M.J. Brooks and B.K.P. Horn (1985). Experimental results show that the extended algorithm converges faster and with less error than the original version. Good surface reconstructions were obtained with and without known boundary conditions and for fairly complicated surfaces. >

1,090 citations
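The integrability constraint can be demonstrated without the paper's basis-function projection: in this toy pure-Python sketch, a height map z is fitted to a possibly nonintegrable slope estimate (p, q) by least squares, and the corrected slopes are read back off z. Slopes derived from a single height map necessarily have (near-)zero discrete curl, which is exactly what integrability means.

```python
# Enforcing integrability by least squares (a stand-in for the paper's
# orthogonal projection onto a basis of integrable slope fields).

def curl(p, q, n):
    """Max |p_y - q_x| over the grid (forward differences); 0 iff integrable."""
    return max(abs((p[y + 1][x] - p[y][x]) - (q[y][x + 1] - q[y][x]))
               for y in range(n - 1) for x in range(n - 1))

def fit_height(p, q, n, iters=400, lr=0.05):
    """Gradient descent on the sum of squared slope residuals."""
    z = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        g = [[0.0] * n for _ in range(n)]
        for y in range(n):
            for x in range(n - 1):          # residuals of the x-slopes
                r = z[y][x + 1] - z[y][x] - p[y][x]
                g[y][x + 1] += 2 * r
                g[y][x] -= 2 * r
        for y in range(n - 1):
            for x in range(n):              # residuals of the y-slopes
                r = z[y + 1][x] - z[y][x] - q[y][x]
                g[y + 1][x] += 2 * r
                g[y][x] -= 2 * r
        for y in range(n):
            for x in range(n):
                z[y][x] -= lr * g[y][x]
    return z

n = 5
p = [[float(y) for x in range(n)] for y in range(n)]  # slopes of z = x*y
q = [[float(x) for x in range(n)] for y in range(n)]
p[2][2] += 0.5                                        # inject nonintegrability
z = fit_height(p, q, n)
p2 = [[z[y][x + 1] - z[y][x] if x < n - 1 else 0.0 for x in range(n)] for y in range(n)]
q2 = [[z[y + 1][x] - z[y][x] if y < n - 1 else 0.0 for x in range(n)] for y in range(n)]
print(curl(p, q, n) > 0.4, curl(p2, q2, n) < 1e-9)  # True True
```

The corrected (p2, q2) field is the kind of integrable estimate that the extended shape-from-shading iteration feeds back into each step.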


Journal ArticleDOI
TL;DR: A distributed architecture articulated around the CODGER (communication database with geometric reasoning) knowledge database is described for a mobile robot system that includes both perception and navigation tools.
Abstract: A distributed architecture articulated around the CODGER (communication database with geometric reasoning) knowledge database is described for a mobile robot system that includes both perception and navigation tools. Results are described for vision and navigation tests using a mobile testbed that integrates perception and navigation capabilities that are based on two types of vision algorithms: color vision for road following, and 3-D vision for obstacle detection and avoidance. The perception modules are integrated into a system that allows the vehicle to drive continuously in an actual outdoor environment. The resulting system is able to navigate continuously on roads while avoiding obstacles. >

780 citations


Journal ArticleDOI
TL;DR: An approximate solution to the weighted-graph-matching problem is discussed for both undirected and directed graphs and an analytic approach is used instead of a combinatorial or iterative approach to the optimum matching problem.
Abstract: An approximate solution to the weighted-graph-matching problem is discussed for both undirected and directed graphs. The weighted-graph-matching problem is that of finding the optimum matching between two weighted graphs, which are graphs with weights at each arc. The proposed method uses an analytic instead of a combinatorial or iterative approach to the optimum matching problem. Using the eigendecompositions of the adjacency matrices (in the case of the undirected-graph-matching problem) or Hermitian matrices derived from the adjacency matrices (in the case of the directed-graph-matching problem), a matching close to the optimum can be found efficiently when the graphs are sufficiently close to each other. Simulation results are given to evaluate the performance of the proposed method. >

777 citations
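For undirected graphs the analytic method can be sketched in pure Python: eigendecompose both adjacency matrices (here with a plain Jacobi iteration), form the similarity |U1||U2|^T, and pick the permutation maximizing it. Since the example graphs are tiny, the permutation is found exactly by enumeration in place of an optimal-assignment step; this is an illustrative sketch, not the paper's full procedure.

```python
# Eigendecomposition-based matching of two undirected weighted graphs.

import math
from itertools import permutations

def abs_eigvecs(a, sweeps=100):
    """|Eigenvector matrix| of a symmetric matrix, columns sorted by
    decreasing eigenvalue, via classical Jacobi rotations."""
    n = len(a)
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        p, q, biggest = 0, 1, 0.0
        for i in range(n):                  # largest off-diagonal element
            for j in range(i + 1, n):
                if abs(a[i][j]) > biggest:
                    biggest, p, q = abs(a[i][j]), i, j
        if biggest < 1e-12:
            break
        t = 0.5 * math.atan2(2 * a[p][q], a[p][p] - a[q][q])
        c, s = math.cos(t), math.sin(t)
        for k in range(n):                  # A <- A J
            akp, akq = a[k][p], a[k][q]
            a[k][p], a[k][q] = c * akp + s * akq, -s * akp + c * akq
        for k in range(n):                  # A <- J^T A
            apk, aqk = a[p][k], a[q][k]
            a[p][k], a[q][k] = c * apk + s * aqk, -s * apk + c * aqk
        for k in range(n):                  # accumulate eigenvectors
            vkp, vkq = v[k][p], v[k][q]
            v[k][p], v[k][q] = c * vkp + s * vkq, -s * vkp + c * vkq
    order = sorted(range(n), key=lambda i: -a[i][i])
    return [[abs(v[i][j]) for j in order] for i in range(n)]

def match(a, b):
    """Node mapping i -> m[i] maximizing the eigenvector similarity."""
    n = len(a)
    u1, u2 = abs_eigvecs(a), abs_eigvecs(b)
    s = [[sum(u1[i][k] * u2[j][k] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return max(permutations(range(n)),
               key=lambda m: sum(s[i][m[i]] for i in range(n)))

a = [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
b = [[0, 2, 0], [2, 0, 1], [0, 1, 0]]   # same graph with nodes 0 and 2 swapped
m = match(a, b)
print(m)  # (2, 1, 0)
```

As the abstract notes, the analytic estimate is close to optimal only when the two graphs are sufficiently close to each other; for very dissimilar graphs the eigenvector correspondence degrades.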


Journal ArticleDOI
TL;DR: The development and implementation of an algorithm for automated text string separation that is relatively independent of changes in text font style and size and of string orientation are described and showed superior performance compared to other techniques.
Abstract: The development and implementation of an algorithm for automated text string separation that is relatively independent of changes in text font style and size and of string orientation are described. It is intended for use in an automated system for document analysis. The principal parts of the algorithm are the generation of connected components and the application of the Hough transform in order to group components into logical character strings that can then be separated from the graphics. The algorithm outputs two images, one containing text strings and the other graphics. These images can then be processed by suitable character recognition and graphics recognition systems. The performance of the algorithm, in terms of both effectiveness and computational efficiency, was evaluated using several test images; it compared favorably with other techniques.

664 citations


Journal ArticleDOI
R.K. Lenz1, Roger Y. Tsai1
TL;DR: Three groups of techniques for center calibration are presented: Group I requires using a laser and a four-degree-of-freedom adjustment of its orientation, but is simplest in concept and is accurate and reproducible; Group II is simple to perform, but is less accurate than the other two; and the most general, Group III, is accurate, but requires a good calibration plate and accurate image feature extraction of calibration points.
Abstract: Techniques are described for calibrating certain intrinsic camera parameters for machine vision. The parameters to be calibrated are the horizontal scale factor and the image center. The scale factor calibration uses a one-dimensional fast Fourier transform and is accurate and efficient. It also permits the use of only one coplanar set of calibration points for general camera calibration. Three groups of techniques for center calibration are presented: Group I requires using a laser and a four-degree-of-freedom adjustment of its orientation, but is simplest in concept and is accurate and reproducible; Group II is simple to perform, but is less accurate than the other two; and the most general, Group III, is accurate and efficient, but requires a good calibration plate and accurate image feature extraction of calibration points. Group II is recommended most highly for machine vision applications. Results of experiments are presented and compared with theoretical predictions. Accuracy and reproducibility of the calibrated parameters are reported, as well as the improvement in actual 3-D measurement due to center calibration.

650 citations


Journal ArticleDOI
TL;DR: A well-posed variational formulation results from the use of a controlled-continuity surface model, and finite-element shape primitives yield a local discretization of the variational principle, giving an efficient algorithm for visible-surface reconstruction.
Abstract: A computational theory of visible-surface representations is developed. The visible-surface reconstruction process that computes these quantitative representations unifies formal solutions to the key problems of: (1) integrating multiscale constraints on surface depth and orientation from multiple-visual sources; (2) interpolating dense, piecewise-smooth surfaces from these constraints; (3) detecting surface depth and orientation discontinuities to apply boundary conditions on interpolation; and (4) structuring large-scale, distributed-surface representations to achieve computational efficiency. Visible-surface reconstruction is an inverse problem. A well-posed variational formulation results from the use of a controlled-continuity surface model. Discontinuity detection amounts to the identification of this generic model's distributed parameters from the data. Finite-element shape primitives yield a local discretization of the variational principle. The result is an efficient algorithm for visible-surface reconstruction. >

520 citations


Journal ArticleDOI
TL;DR: It is shown that there exists a tradeoff between the number of frequency components used per position and thenumber of such clusters (sampling rate) utilized along the spatial coordinate.
Abstract: A scheme suitable for visual information representation in a combined frequency-position space is investigated through image decomposition into a finite set of two-dimensional Gabor elementary functions (GEFs). The scheme is generalized to account for the position-dependent Gabor sampling rate, oversampling, logarithmic frequency scaling, and phase quantization characteristic of the visual system. Comparison of reconstructed signals highlights the advantages of the generalized Gabor scheme in coding typical bandlimited images. It is shown that there exists a tradeoff between the number of frequency components used per position and the number of such clusters (sampling rate) utilized along the spatial coordinate.

486 citations
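The building block of this decomposition is the two-dimensional Gabor elementary function: a complex sinusoid under a Gaussian envelope. A minimal pure-Python sketch with an isotropic envelope (the generalized scheme varies the sampling rate and frequency scaling over position, which is not shown here):

```python
# A two-dimensional Gabor elementary function: a complex sinusoid of
# frequency (u0, v0) under a Gaussian envelope of width sigma.

import math

def gabor(x, y, u0, v0, sigma, x0=0.0, y0=0.0):
    """Value of a GEF centered at (x0, y0); returns (real, imag)."""
    envelope = math.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    phase = 2 * math.pi * (u0 * x + v0 * y)
    return envelope * math.cos(phase), envelope * math.sin(phase)

re, im = gabor(0.0, 0.0, u0=0.25, v0=0.0, sigma=1.0)
print(re, im)  # at the center the envelope is 1 and the phase is 0 -> 1.0 0.0
```

An image is then coded by its projection coefficients onto a lattice of such functions shifted in position and frequency.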


Journal ArticleDOI
TL;DR: The authors discuss various road segmentation methods for video-based road-following, along with approaches to boundary extraction and transformation of boundaries in the image plane into a vehicle-centered three-dimensional scene model.
Abstract: A description is given of VITS (for vision task sequencer), the vision system for the autonomous land vehicle (ALV) Alvin, addressing in particular the task of road-following. The ALV vision system builds symbolic descriptions of road and obstacle boundaries using both video and range sensors. The authors discuss various road segmentation methods for video-based road-following, along with approaches to boundary extraction and transformation of boundaries in the image plane into a vehicle-centered three-dimensional scene model. >

Journal ArticleDOI
TL;DR: The problem of automatically generating the possible camera locations for observing an object and an approach to its solution is presented, which uses models of the object and the camera based on meeting the requirements that the spatial resolution be above a minimum value and all surface points be in focus.
Abstract: The problem of automatically generating the possible camera locations for observing an object is defined, and an approach to its solution is presented. The approach, which uses models of the object and the camera, is based on meeting the requirements that the spatial resolution be above a minimum value, all surface points be in focus, all surfaces lie within the sensor field of view, and no surface points be occluded. The approach converts each sensing requirement into a geometric constraint on the sensor location, from which the three-dimensional region of viewpoints that satisfies that constraint is computed. The intersection of these regions is the space where a sensor may be located. The extension of this approach to laser-scanner range sensors is also described. Examples illustrate the resolution, focus, and field-of-view constraints for two vision tasks.

Journal ArticleDOI
TL;DR: An algorithm is presented to perform connected-component labeling of images of arbitrary dimension that are represented by a linear bintree, a generalization of the quadtree data structure that enables dealing with images of any dimension.
Abstract: An algorithm is presented to perform connected-component labeling of images of arbitrary dimension that are represented by a linear bintree. The bintree is a generalization of the quadtree data structure that enables dealing with images of arbitrary dimension. The linear bintree is a pointerless representation. The algorithm uses an active border which is represented by linked lists instead of arrays. This results in a significant reduction in the space requirements, thereby making it feasible to process three- and higher-dimensional images. Analysis of the execution time of the algorithm shows almost linear behavior with respect to the number of leaf nodes in the image, and empirical tests are in agreement. The algorithm can be modified easily to compute a (d-1)-dimensional boundary measure (e.g. perimeter in two dimensions and surface area in three dimensions) with linear performance. >
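For contrast with the bintree version, connected-component labeling is easiest to see on an ordinary 2-D array; the sketch below is the classical two-pass algorithm with union-find and 4-connectivity, not the paper's linear-bintree/active-border method.

```python
# Classical two-pass connected-component labeling with union-find.

def label(img):
    h, w = len(img), len(img[0])
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    labels = [[0] * w for _ in range(h)]
    nxt = 1
    for y in range(h):                      # pass 1: provisional labels
        for x in range(w):
            if not img[y][x]:
                continue
            left = labels[y][x - 1] if x and img[y][x - 1] else 0
            up = labels[y - 1][x] if y and img[y - 1][x] else 0
            if left and up:
                labels[y][x] = left
                union(left, up)             # record label equivalence
            elif left or up:
                labels[y][x] = left or up
            else:
                labels[y][x] = nxt
                parent[nxt] = nxt
                nxt += 1
    for y in range(h):                      # pass 2: resolve equivalences
        for x in range(w):
            if labels[y][x]:
                labels[y][x] = find(labels[y][x])
    return labels

img = [[1, 1, 0, 1],
       [0, 1, 0, 1],
       [0, 0, 0, 1]]
out = label(img)
print(len({v for row in out for v in row if v}))  # 2 connected components
```

The paper's contribution is doing the same bookkeeping over bintree leaf nodes with a linked-list active border, which is what makes high-dimensional images tractable.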

Journal ArticleDOI
TL;DR: Two standard approaches for texture analysis are based on numerical features of second-order gray-level statistics and on first-order statistics of gray-level differences, respectively; some of these features allow discrimination of degrees of wear in wool carpet.
Abstract: Two standard approaches for texture analysis are based on numerical features of second-order gray-level statistics and on first-order statistics of gray-level differences, respectively. Feature sets of these types, all designed analogously, were used to analyze four sets of carpet samples exposed to different degrees of wear. It was found that some of the features extracted from the spatial gray-level-dependence matrix, the neighboring gray-level-dependence matrix, the gray-level difference method, and the gray-level run-length method allowed discrimination of degrees of wear in wool carpet. The methods developed could be of use in quality control.
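The spatial gray-level-dependence (co-occurrence) matrix underlying the second-order features can be sketched directly; below it is built for one displacement, with two of the classical features (contrast and angular second moment, or energy) computed from it. Pure Python, illustrative only.

```python
# Gray-level co-occurrence matrix and two classical texture features.

def glcm(img, dy, dx, levels):
    """P[i][j] = normalized count of pixel pairs (i, j) at offset (dy, dx)."""
    h, w = len(img), len(img[0])
    p = [[0.0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                p[img[y][x]][img[yy][xx]] += 1
                total += 1
    return [[v / total for v in row] for row in p]

def contrast(p):
    return sum((i - j) ** 2 * p[i][j]
               for i in range(len(p)) for j in range(len(p)))

def energy(p):
    return sum(v * v for row in p for v in row)

flat = [[0, 0], [0, 0]]          # uniform texture
busy = [[0, 1], [1, 0]]          # checkerboard texture
p_flat = glcm(flat, 0, 1, levels=2)
p_busy = glcm(busy, 0, 1, levels=2)
print(contrast(p_flat), contrast(p_busy))  # 0.0 1.0
```

Wear in a carpet sample shows up as a shift in such features: a worn, flattened pile yields co-occurrence mass nearer the diagonal (lower contrast) than a fresh one.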

Journal ArticleDOI
TL;DR: Experiments indicate that the procedure, which estimates the unconstrained three-dimensional location and orientation of an object with a known shape from a single image, is highly reliable, although optimization can further improve the parameter estimates.
Abstract: A procedure is presented to estimate the unconstrained three-dimensional location and orientation of an object with a known shape when it is visible in a single image. Using a generalized Hough transform, all six parameters of the object position are estimated from the distribution of values determined by matching triples of points on the object to possibly corresponding triples in the image. Most likely candidates for location are found, and then the remaining rotational parameters are evaluated. Two solutions are generally admitted to the distribution by every match of triangles. The number of possible matches is reduced by checking a simple geometric relation among triples. Even with partial occlusion, experiments indicate that the procedure is reliable and accurate, although optimization can further improve estimates of the parameters.

Journal ArticleDOI
TL;DR: A procedure for using moment-based feature vectors to identify a three-dimensional object from a two-dimensional image recorded at an arbitrary viewing angle and range is presented and a moment form called standard moments is considered, rather than the usual moment invariants.
Abstract: A procedure for using moment-based feature vectors to identify a three-dimensional object from a two-dimensional image recorded at an arbitrary viewing angle and range is presented. A moment form called standard moments, rather than the usual moment invariants, is considered. A standard six-airplane experiment was used to compare different techniques. Fourier descriptors and moment invariants were both compared to the present scheme for normalized moments. Various experiments were conducted using mixtures of silhouette and boundary moments and different normalization techniques. Standard moments gave slightly better results than Fourier descriptors for this experiment; both of these techniques were much better than moment invariants. >

Journal ArticleDOI
TL;DR: It is shown that the method proposed is better than the classical method for L classes and is a generalization of the optimal set of discriminant vectors proposed for two-class problems.
Abstract: A general method is proposed to describe multivariate data sets using discriminant analysis and principal-component analysis. First, the problem of finding K discriminant vectors in an L-class data set is solved and compared to the solution proposed in the literature for two-class problems and the classical solution for L-class data sets. It is shown that the method proposed is better than the classical method for L classes and is a generalization of the optimal set of discriminant vectors proposed for two-class problems. Then the method is combined with a generalized principal-component analysis to permit the user to define the properties of each successive computed vector. All the methods were tested using measurements made on various kinds of flowers (IRIS data). >
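The two-class case that the paper generalizes is the classical Fisher discriminant, whose single optimal discriminant vector is w = Sw^(-1)(m1 - m2). A minimal pure-Python sketch for 2-D features:

```python
# Two-class Fisher discriminant vector w = Sw^{-1}(m1 - m2) for 2-D data.

def mean(xs):
    n = len(xs)
    return [sum(v[0] for v in xs) / n, sum(v[1] for v in xs) / n]

def scatter(xs, m):
    """Within-class scatter contribution of one class."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in xs:
        d = [v[0] - m[0], v[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher(c1, c2):
    m1, m2 = mean(c1), mean(c2)
    sw = [[a + b for a, b in zip(r1, r2)]
          for r1, r2 in zip(scatter(c1, m1), scatter(c2, m2))]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    d = [m1[0] - m2[0], m1[1] - m2[1]]
    return [inv[0][0] * d[0] + inv[0][1] * d[1],
            inv[1][0] * d[0] + inv[1][1] * d[1]]

c1 = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]
c2 = [[3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
w = fisher(c1, c2)
p1 = [w[0] * v[0] + w[1] * v[1] for v in c1]
p2 = [w[0] * v[0] + w[1] * v[1] for v in c2]
print(min(p1) > max(p2))  # True: the projections separate the classes
```

The paper's method extends this to K discriminant vectors for L classes and combines it with a generalized principal-component step.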

Journal ArticleDOI
TL;DR: An approach to illumination and imaging of specular surfaces that yields three-dimensional shape information is described and the proposed structured highlight techniques are promising for many industrial tasks.
Abstract: An approach to illumination and imaging of specular surfaces that yields three-dimensional shape information is described. The structured highlight approach uses a scanned array of point sources and images of the resulting reflected highlights to compute local surface height and orientation. A prototype structured highlight inspection system, called SHINY, has been implemented. SHINY demonstrates the determination of surface shape for several test objects including solder joints. The current SHINY system makes the distant-source assumption and requires only one camera. A stereo structured highlight system using two cameras is proposed to determine surface-element orientation for objects in a much larger field of view. Analysis and description of the algorithms are included. The proposed structured highlight techniques are promising for many industrial tasks. >

Journal ArticleDOI
TL;DR: It is found that an algorithm using alternating mean thresholding and median filtering provides an acceptable method when the image is relatively highly contaminated, and seems to depend less on initial values than other procedures.
Abstract: Several model-based algorithms for threshold selection are presented, concentrating on the two-population univariate case in which an image contains an object and background. It is shown how the main ideas behind two important nonspatial thresholding algorithms follow from classical discriminant analysis. Novel thresholding algorithms that make use of available local/spatial information are then given. It is found that an algorithm using alternating mean thresholding and median filtering provides an acceptable method when the image is relatively highly contaminated, and seems to depend less on initial values than other procedures. The methods are also applicable to multispectral k-population images. >
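One classical nonspatial thresholding algorithm that follows from discriminant analysis is Otsu's criterion: choose the threshold maximizing the between-class variance. A pure-Python sketch of that criterion (the paper's spatial/median-filtering variants are not shown):

```python
# Otsu-style threshold selection: maximize between-class variance.

def otsu_threshold(pixels, levels=8):
    n = len(pixels)
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, s0 = 0, 0.0
    for t in range(levels - 1):
        w0 += hist[t]
        s0 += t * hist[t]
        w1 = n - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = s0 / w0, (total - s0) / w1
        var = w0 * w1 * (m0 - m1) ** 2   # between-class variance (scaled)
        if var > best_var:
            best_var, best_t = var, t
    return best_t                        # classify as background if v <= t

pixels = [0, 1, 1, 0, 1, 6, 7, 7, 6, 7]  # dark background, bright object
print(otsu_threshold(pixels))  # 1
```

The spatial algorithms in the paper alternate a thresholding step of this flavor with median filtering, which is what makes them robust under heavy contamination.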

Journal ArticleDOI
TL;DR: A novel approach to solving the stereo correspondence problem in computer vision and an entropy-based figure of merit for attribute selection and ordering are defined.
Abstract: A novel approach to solving the stereo correspondence problem in computer vision is described. Structural descriptions of two two-dimensional views of a scene are extracted by one of possibly several available low-level processes, and a new theory of inexact matching for such structures is derived. An entropy-based figure of merit for attribute selection and ordering is defined. Experimental results applying these techniques to real image pairs are presented. Some manipulation experiments are briefly presented. >

Journal ArticleDOI
TL;DR: A technique for automatically deriving the evidence rule base from training views of objects is shown to generate evidence conditions that successfully identify new views of those objects.
Abstract: An evidence-based recognition technique is defined that identifies 3-D objects by looking for their notable features. This technique makes use of an evidence rule base, which is a set of salient or evidence conditions with corresponding evidence weights for various objects in the database. A measure of similarity between the set of observed features and the set of evidence conditions for a given object in the database is used to determine the identity of an object in the scene or reject the object(s) in the scene as unknown. This procedure has polynomial time complexity and correctly identifies a variety of objects in both synthetic and real range images. A technique for automatically deriving the evidence rule base from training views of objects is shown to generate evidence conditions that successfully identify new views of those objects. >

Journal ArticleDOI
TL;DR: It is shown that W(P) can be computed in O(n log n + I) time and O(n) space, where I is the number of antipodal pairs of edges of the convex hull of P, and n is the number of vertices.

Abstract: For a set of points P in three-dimensional space, the width of P, W(P), is defined as the minimum distance between parallel planes of support of P. It is shown that W(P) can be computed in O(n log n + I) time and O(n) space, where I is the number of antipodal pairs of edges of the convex hull of P, and n is the number of vertices; in the worst case, I = O(n^2). For a convex polyhedron the time complexity becomes O(n + I). If P is a set of points in the plane, the complexity can be reduced to O(n log n). For simple polygons, linear time suffices.
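The planar case is easy to sketch by brute force (the paper's O(n log n) algorithm is far more refined): for every directed pair of points whose supporting line has all points on one side (a hull edge), take the farthest point's distance; the width is the minimum over such edges.

```python
# Width of a planar point set, O(n^3) brute-force sketch.

import math

def width(points):
    best = float("inf")
    n = len(points)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            (x1, y1), (x2, y2) = points[i], points[j]
            nx, ny = y2 - y1, x1 - x2             # normal to the line i -> j
            norm = math.hypot(nx, ny)
            ds = [(nx * (x - x1) + ny * (y - y1)) / norm for x, y in points]
            if min(ds) >= -1e-12:                 # all on one side: hull edge
                best = min(best, max(ds))
    return best

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(width(square))  # 1.0
```

The width is always realized at a hull edge (it cannot occur between two parallel vertex-supporting lines alone), which is why restricting attention to edges is valid.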

Journal ArticleDOI
TL;DR: An analysis is presented of the imaging of surfaces modeled by fractal Brownian elevation functions of the sort used in computer graphics, and it is shown that, if Lambertian reflectance, modest surface slopes, and the absence of occlusions and self-shadowing are assumed, a fractal surface with Fourier power spectrum proportional to f^(-beta) produces an image with power spectrum proportional to f^(2-beta).
Abstract: An analysis is presented of the imaging of surfaces modeled by fractal Brownian elevation functions of the sort used in computer graphics. It is shown that, if Lambertian reflectance, modest surface slopes, and the absence of occlusions and self-shadowing are assumed, a fractal surface with Fourier power spectrum proportional to f^(-beta) produces an image with power spectrum proportional to f^(2-beta); here, f is the spatial frequency and beta is related to the fractal dimension value. This allows one to use the spectral falloff of the images to predict the fractal dimension of the surface.

Journal ArticleDOI
TL;DR: The contrast and orientation estimation accuracy of several edge operators that have been proposed in the literature is examined both for the noiseless case and in the presence of additive Gaussian noise.
Abstract: The contrast and orientation estimation accuracy of several edge operators that have been proposed in the literature is examined both for the noiseless case and in the presence of additive Gaussian noise. The test image is an ideal step edge that has been sampled with a square-aperture grid. The effects of subpixel translations and rotations of the edge on the performance of the operators are studied. It is shown that the effect of subpixel translations of an edge can generate more error than moderate noise levels. Methods with improved results are presented for Sobel angle estimates and the Nevatia-Babu operator, and theoretical noise performance evaluations are also provided. An edge operator based on two-dimensional spatial moments is presented. All methods are compared according to worst-case and RMS error in an ideal noiseless situation and RMS error under various noise levels. >
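The Sobel estimates examined in the paper can be sketched directly: the operator's two 3x3 differences give a gradient whose magnitude relates to edge contrast and whose atan2 angle estimates the edge normal's orientation. A pure-Python example on an ideal vertical step edge:

```python
# Sobel contrast and orientation estimates at one pixel.

import math

def sobel(img, y, x):
    gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
          - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
    gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
          - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
    return gx / 8.0, gy / 8.0     # 1/8 normalizes to gray levels per pixel

# Ideal vertical step edge of contrast 10 (columns 0-1 dark, 2-3 bright).
img = [[0, 0, 10, 10] for _ in range(4)]
gx, gy = sobel(img, 1, 1)          # one pixel left of the step
mag = math.hypot(gx, gy)
angle = math.degrees(math.atan2(gy, gx))
print(mag, angle)  # 5.0 0.0 (gradient points along +x: a vertical edge)
```

Subpixel shifts of the step relative to the sampling grid change both numbers, which is exactly the translation sensitivity the paper quantifies.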

Journal ArticleDOI
TL;DR: The authors investigate the use of a priori knowledge about a scene to coordinate and control bilevel image segmentation, interpretation, and shape inspection of different objects in the scene.
Abstract: The authors investigate the use of a priori knowledge about a scene to coordinate and control bilevel image segmentation, interpretation, and shape inspection of different objects in the scene. The approach is composed of two main steps. The first step consists of proper segmentation and labeling of individual regions in the image for subsequent ease in interpretation. General as well as scene-specific knowledge is used to improve the segmentation and interpretation processes. Once every region in the image has been identified, the second step proceeds by testing different regions to ensure they meet the design requirements, which are formalized by a set of rules. Morphological techniques are used to extract certain features from the previously processed image for rule verification purposes. As a specific example, results for detecting defects in printed circuit boards are presented. >

Journal ArticleDOI
Luc Devroye1
TL;DR: The Vapnik-Chervonenkis method can be used to choose the smoothing parameter in kernel-based rules, to choose k in the k-nearest neighbor rule, and to choose between parametric and nonparametric rules.
Abstract: A test sequence is used to select the best rule from a class of discrimination rules defined in terms of the training sequence. The Vapnik-Chervonenkis and related inequalities are used to obtain distribution-free bounds on the difference between the probability of error of the selected rule and the probability of error of the best rule in the given class. The bounds are used to prove the consistency and asymptotic optimality for several popular classes, including linear discriminators, nearest-neighbor rules, kernel-based rules, histogram rules, binary tree classifiers, and Fourier series classifiers. In particular, the method can be used to choose the smoothing parameter in kernel-based rules, to choose k in the k-nearest neighbor rule, and to choose between parametric and nonparametric rules. >

Journal ArticleDOI
TL;DR: Two models based on multivariate Gaussian random fields are proposed for the fuzzy membership process of mixed-pixel data, and the problems of predicting group membership and estimating the parameters are discussed.
Abstract: In the usual statistical approach to spatial classification, it is assumed that each pixel belongs to precisely one of a small number of known groups. This framework is extended to include mixed-pixel data; then, only a proportion of each pixel belongs to each group. Two models based on multivariate Gaussian random fields are proposed to model this fuzzy membership process. The problems of predicting the group membership and estimating the parameters are discussed. Some simulations are presented to study the properties of this approach, and an example is given using Landsat remote-sensing data. >

Journal ArticleDOI
A.F. Korn1
TL;DR: The symbolic representation of gray-value variations is studied, with emphasis on the gradient of the image function, and a procedure is proposed to select automatically a suitable scale, and with that, the size of the right convolution kernel.
Abstract: The symbolic representation of gray-value variations is studied, with emphasis on the gradient of the image function. The goal is to relate the results of this analysis to the structure of the picture, which is determined by the physics of the image generation process. Candidates for contour points are the maximal magnitudes of the gray-value gradient for different scales in the direction of the gradient. Based on the output of such a bank of gradient filters, a procedure is proposed to select automatically a suitable scale, and with that, the size of the right convolution kernel. The application of poorly adapted filters, which make the exact localization of gray-value corners or T-, X-, and Y-junctions more difficult, is thus avoided. Possible gaps at such junctions are discussed for images of real scenes, and possibilities for the closure of some of these gaps are demonstrated when the extrema of the magnitudes of the gray-value gradients are used. >

Journal ArticleDOI
TL;DR: A partial-shape-recognition technique utilizing local features described by Fourier descriptors is introduced, and experimental results are discussed that indicate that partial contours can be recognized with reasonable accuracy.
Abstract: A partial-shape-recognition technique utilizing local features described by Fourier descriptors is introduced. A dynamic programming formulation for shape matching is developed, and a method for comparison of match quality is discussed. This technique is shown to recognize unknown contours that may be occluded or that may overlap other objects. Precise scale information is not required, and the unknown objects may appear at any orientation with respect to the camera. The segment-matching dynamic programming method is contrasted with other sequence-comparison techniques that utilize dynamic programming. Experimental results are discussed that indicate that partial contours can be recognized with reasonable accuracy. >

Journal ArticleDOI
TL;DR: A description is given of the system architecture of an autonomous vehicle and its real-time adaptive vision system for road-following, which is a 10-ton armored personnel carrier modified for robotic control.
Abstract: A description is given of the system architecture of an autonomous vehicle and its real-time adaptive vision system for road-following. The vehicle is a 10-ton armored personnel carrier modified for robotic control. A color transformation that best discriminates road and nonroad regions is derived from labeled data samples. A maximum-likelihood pixel classification technique is then used to classify pixels in the transformed color image. The vision system adapts itself to road changes in two ways; color transformation parameters are updated infrequently to accommodate significant road color changes, and classifier parameters are updated every processing cycle to deal with gradual color and intensity changes. To reduce unnecessary computation, only the most likely road region in the segmented image is selected, and a polygonal representation of the detected road region boundary is transformed from the image coordinate system to the local vehicle coordinate system based on a flat-earth assumption. >