
Showing papers in "IEEE Transactions on Pattern Analysis and Machine Intelligence in 1984"


Journal ArticleDOI
TL;DR: An analogy is made between images and statistical mechanics systems; annealing under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations, giving a highly parallel ``relaxation'' algorithm for MAP estimation.
Abstract: We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the equivalence between Gibbs distributions and Markov random fields (MRFs), this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, nonlinear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low energy states (``annealing''), or what is the same thing, the most probable states under the Gibbs distribution. The analogous operation under the posterior distribution yields the maximum a posteriori (MAP) estimate of the image given the degraded observations. The result is a highly parallel ``relaxation'' algorithm for MAP estimation. We establish convergence properties of the algorithm and we experiment with some simple pictures, for which good restorations are obtained at low signal-to-noise ratios.
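
As an illustration of the annealing idea, here is a minimal NumPy sketch: a binary scene with an Ising-style smoothness prior is observed in Gaussian noise, and a Gibbs sampler with a slowly decreasing temperature seeks the MAP restoration. The energy function, noise level, and logarithmic schedule are illustrative choices, not the paper's experimental settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# True scene: a square of +1 spins on a -1 background, seen in Gaussian noise.
x = -np.ones((32, 32), dtype=int)
x[8:24, 8:24] = 1
beta, sigma = 1.0, 0.7
y = x + rng.normal(0, sigma, x.shape)

est = np.where(y > 0, 1, -1)                 # initialize from the data

def local_energy(i, j, s):
    """Energy contribution of setting pixel (i, j) to spin s: an Ising
    smoothness term over the 4-neighborhood plus the Gaussian data term."""
    nb = 0.0
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ii, jj = i + di, j + dj
        if 0 <= ii < est.shape[0] and 0 <= jj < est.shape[1]:
            nb += est[ii, jj]
    return -beta * s * nb + (y[i, j] - s) ** 2 / (2 * sigma**2)

for sweep in range(50):
    T = 4.0 / np.log(2.0 + sweep)            # slow (logarithmic) annealing
    for i in range(est.shape[0]):
        for j in range(est.shape[1]):
            # Gibbs update: sample the pixel from its local conditional.
            d = local_energy(i, j, 1) - local_energy(i, j, -1)
            est[i, j] = 1 if rng.random() < 1.0 / (1.0 + np.exp(d / T)) else -1

print("pixel agreement with the true scene:", np.mean(est == x))
```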

18,761 citations


Journal ArticleDOI
TL;DR: The 3-D fractal model provides a characterization of 3-D surfaces and their images for which the appropriateness of the model is verifiable, and this characterization is stable over transformations of scale and linear transforms of intensity.
Abstract: This paper addresses the problems of 1) representing natural shapes such as mountains, trees, and clouds, and 2) computing their description from image data. To solve these problems, we must be able to relate natural surfaces to their images; this requires a good model of natural surface shapes. Fractal functions are a good choice for modeling 3-D natural surfaces because 1) many physical processes produce a fractal surface shape, 2) fractals are widely used as a graphics tool for generating natural-looking shapes, and 3) a survey of natural imagery has shown that the 3-D fractal surface model, transformed by the image formation process, furnishes an accurate description of both textured and shaded image regions. The 3-D fractal model provides a characterization of 3-D surfaces and their images for which the appropriateness of the model is verifiable. Furthermore, this characterization is stable over transformations of scale and linear transforms of intensity. The 3-D fractal model has been successfully applied to the problems of 1) texture segmentation and classification, 2) estimation of 3-D shape information, and 3) distinguishing between perceptually ``smooth'' and perceptually ``textured'' surfaces in the scene.
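
A small NumPy sketch of the fractal-surface idea: synthesize an approximately fractal (fractional-Brownian-like) surface spectrally, then recover its parameter from the scaling of intensity differences, E|I(x+d) - I(x)| ~ d^H, with fractal dimension D = 3 - H for a surface. The spectral-synthesis shortcut and the regression over a handful of lags are simplifications, not the authors' estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Spectral synthesis of an fBm-like surface: amplitude spectrum ~ f^-(H+1).
N, H_true = 256, 0.5
fy, fx = np.meshgrid(np.fft.fftfreq(N), np.fft.fftfreq(N), indexing="ij")
f = np.hypot(fx, fy)
f[0, 0] = 1.0                                     # avoid division by zero at DC
spectrum = f ** -(H_true + 1) * np.exp(2j * np.pi * rng.random((N, N)))
surface = np.fft.ifft2(spectrum).real

# Estimate H from the scaling law E|I(x+d) - I(x)| ~ d^H (log-log regression).
ds = np.array([1, 2, 4, 8, 16])
mean_diff = [np.mean(np.abs(surface[:, d:] - surface[:, :-d])) for d in ds]
H_est = np.polyfit(np.log(ds), np.log(mean_diff), 1)[0]
print(f"estimated H = {H_est:.2f} (true {H_true}), dimension D = {3 - H_est:.2f}")
```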

1,919 citations


Journal ArticleDOI
TL;DR: It is shown that under certain conditions the K-means algorithm may fail to converge to a local minimum, and that it converges under differentiability conditions to a Kuhn-Tucker point.
Abstract: The K-means algorithm is a commonly used technique in cluster analysis. In this paper, several questions about the algorithm are addressed. The clustering problem is first cast as a nonconvex mathematical program. Then, a rigorous proof of the finite convergence of the K-means-type algorithm is given for any metric. It is shown that under certain conditions the algorithm may fail to converge to a local minimum, and that it converges under differentiability conditions to a Kuhn-Tucker point. Finally, a method for obtaining a local-minimum solution is given.
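
The algorithm under study is easy to state in code. Below is a minimal NumPy version with the within-cluster objective monitored: each iteration can only decrease it, and since there are finitely many partitions the loop must terminate; as the paper shows, the resting point need not be a local minimum. The initialization and stopping rule here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def kmeans(X, k, iters=100):
    """Plain K-means. The objective is nonincreasing across iterations and
    only finitely many partitions exist, hence finite convergence; the
    fixed point need not be a local minimum of the clustering problem."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        obj = d2[np.arange(len(X)), labels].sum()
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):          # assignment repeats: stop
            break
        centers = new
    return centers, labels, obj

X = np.vstack([rng.normal(m, 0.3, (50, 2)) for m in ([0, 0], [3, 0], [0, 3])])
_, labels, obj = kmeans(X, 3)
print("within-cluster sum of squares:", round(float(obj), 2))
```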

1,180 citations


Journal ArticleDOI
Robert M. Haralick
TL;DR: The facet model is used to accomplish step edge detection; comparing this zero crossing of second directional derivative operator with the Prewitt gradient operator and the Marr-Hildreth zero crossing of the Laplacian operator, it is found to be the best performer, followed by the Prewitt gradient operator.
Abstract: We use the facet model to accomplish step edge detection. The essence of the facet model is that any analysis made on the basis of the pixel values in some neighborhood has its final authoritative interpretation relative to the underlying gray tone intensity surface of which the neighborhood pixel values are observed noisy samples. With regard to edge detection, we define an edge to occur in a pixel if and only if there is some point in the pixel's area having a negatively sloped zero crossing of the second directional derivative taken in the direction of a nonzero gradient at the pixel's center. Thus, to determine whether or not a pixel should be marked as a step edge pixel, its underlying gray tone intensity surface must be estimated on the basis of the pixels in its neighborhood. For this, we use a functional form consisting of a linear combination of the tensor products of discrete orthogonal polynomials of up to degree three. The appropriate directional derivatives are easily computed from this kind of a function. Upon comparing the performance of this zero crossing of second directional derivative operator with the Prewitt gradient operator and the Marr-Hildreth zero crossing of the Laplacian operator, we find that it is the best performer; next is the Prewitt gradient operator. The Marr-Hildreth zero crossing of the Laplacian operator performs the worst.
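
A condensed sketch of the facet-based test (Python/NumPy). For simplicity it fits a cubic in ordinary monomials by least squares over a 5x5 neighborhood rather than using the discrete orthogonal polynomial basis of the paper (the fitted surface is the same), then checks for a negatively sloped zero crossing of the second directional derivative, taken along the gradient, within the pixel. The 0.5 pixel radius and the small thresholds are illustrative.

```python
import numpy as np

EXPS = [(i, j) for i in range(4) for j in range(4) if i + j <= 3]

def fit_cubic(patch):
    """Least-squares bivariate cubic over a 5x5 patch, coordinates in [-2, 2].
    (Monomial basis for brevity; the paper's orthogonal-polynomial basis
    yields the same fitted surface.)"""
    r, c = np.meshgrid(np.arange(-2, 3), np.arange(-2, 3), indexing="ij")
    A = np.stack([r.ravel() ** i * c.ravel() ** j for i, j in EXPS], axis=1)
    coef, *_ = np.linalg.lstsq(A, patch.ravel().astype(float), rcond=None)
    return dict(zip(EXPS, coef))

def is_step_edge(patch, rho_max=0.5, grad_min=1e-3):
    k = fit_cubic(patch)
    g = np.hypot(k[(1, 0)], k[(0, 1)])          # gradient magnitude at center
    if g < grad_min:
        return False
    ur, uc = k[(1, 0)] / g, k[(0, 1)] / g       # unit vector along the gradient
    # Restrict f to the gradient line: f(rho) = ... + C rho^2 + D rho^3,
    # so the second directional derivative is f''(rho) = 2C + 6D rho.
    C = k[(2, 0)] * ur**2 + k[(1, 1)] * ur * uc + k[(0, 2)] * uc**2
    D = (k[(3, 0)] * ur**3 + k[(2, 1)] * ur**2 * uc
         + k[(1, 2)] * ur * uc**2 + k[(0, 3)] * uc**3)
    if abs(D) < 1e-12:
        return False
    rho0 = -C / (3 * D)                          # zero crossing of f''
    return D < 0 and abs(rho0) <= rho_max        # negatively sloped, in-pixel

step = np.tile([0.0, 0.0, 10.0, 10.0, 10.0], (5, 1))   # vertical step edge
print(is_step_edge(step))                                # True
```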

1,130 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that seven point correspondences are sufficient to uniquely determine from two perspective views the three-dimensional motion parameters (within a scale factor for the translations) of a rigid object with curved surfaces.
Abstract: Two main results are established in this paper. First, we show that seven point correspondences are sufficient to uniquely determine from two perspective views the three-dimensional motion parameters (within a scale factor for the translations) of a rigid object with curved surfaces. The seven points should not be traversed by two planes with one plane containing the origin, nor by a cone containing the origin. Second, a set of ``essential parameters'' are introduced which uniquely determine the motion parameters up to a scale factor for the translations, and can be estimated by solving a set of linear equations which are derived from the correspondences of eight image points. The actual motion parameters can subsequently be determined by computing the singular value decomposition (SVD) of a 3×3 matrix containing the essential parameters.
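
The second result is the classical linear eight-point computation, shown here as a NumPy sketch with synthetic correspondences (essential matrix E satisfying x2^T E x1 = 0 for unit-focal image coordinates). The recovery of the motion parameters from the SVD of E is omitted, and the simulated motion and point cloud are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)

def essential_from_points(p1, p2):
    """Linear eight-point estimate of E, with x2^T E x1 = 0 for image
    points (x, y, 1). The final SVD projects onto the essential manifold
    (two equal singular values, one zero)."""
    x1, y1 = p1.T
    x2, y2 = p2.T
    A = np.column_stack([x2 * x1, x2 * y1, x2, y2 * x1, y2 * y1, y2,
                         x1, y1, np.ones(len(x1))])
    E = np.linalg.svd(A)[2][-1].reshape(3, 3)     # null vector of A
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

def skew(t):
    return np.array([[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]])

# Simulate a small rotation about the y-axis plus a translation.
th = 0.1
R = np.array([[np.cos(th), 0, np.sin(th)], [0, 1, 0], [-np.sin(th), 0, np.cos(th)]])
t = np.array([1.0, 0.2, 0.1])
P = rng.uniform([-1, -1, 4], [1, 1, 8], (8, 3))   # 8 points in front of camera 1
Q = P @ R.T + t                                   # the same points in camera 2
p1, p2 = P[:, :2] / P[:, 2:], Q[:, :2] / Q[:, 2:]

E = essential_from_points(p1, p2)
E0 = skew(t) @ R                                   # ground truth, up to scale/sign
E, E0 = E / np.linalg.norm(E), E0 / np.linalg.norm(E0)
print(np.allclose(E, E0, atol=1e-6) or np.allclose(E, -E0, atol=1e-6))
```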

915 citations


Journal ArticleDOI
TL;DR: Textures are classified based on the change in their properties with changing resolution; the relation of a texture picture to its negative, and directional properties, are also discussed.
Abstract: Textures are classified based on the change in their properties with changing resolution. The area of the gray level surface is measured at several resolutions. This area decreases at coarser resolutions since fine details that contribute to the area disappear. Fractal properties of the picture are computed from the rate of this decrease in area, and are used for texture comparison and classification. The relation of a texture picture to its negative, and directional properties, are also discussed.
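
A compact NumPy rendering of the area-at-several-resolutions measurement, in the style of the paper's "blanket" construction: dilate and erode the gray level surface to get covers at increasing scale eps, take the volume between them over 2*eps as the area, and read a fractal dimension off the log-log slope (A(eps) ~ eps^(2-D)). The 4-neighborhood blanket and the simple test texture are illustrative choices.

```python
import numpy as np

def shift_max(z):
    p = np.pad(z, 1, mode="edge")
    return np.maximum.reduce([p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]])

def shift_min(z):
    p = np.pad(z, 1, mode="edge")
    return np.minimum.reduce([p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]])

def blanket_areas(img, eps_max=8):
    """Area of the gray level surface at resolutions 1..eps_max via upper and
    lower 'blankets': A(eps) = volume between the covers / (2 * eps)."""
    u = img.astype(float).copy()
    b = u.copy()
    areas = []
    for eps in range(1, eps_max + 1):
        u = np.maximum(u + 1, shift_max(u))     # upper blanket grows upward
        b = np.minimum(b - 1, shift_min(b))     # lower blanket grows downward
        areas.append((u - b).sum() / (2 * eps))
    return np.array(areas)

rng = np.random.default_rng(4)
texture = rng.normal(size=(128, 128)).cumsum(axis=0).cumsum(axis=1)
A = blanket_areas(texture)
eps = np.arange(1, len(A) + 1)
slope = np.polyfit(np.log(eps), np.log(A), 1)[0]   # negative: area shrinks
print(f"fractal dimension estimate: {2 - slope:.2f}")
```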

833 citations


Journal ArticleDOI
TL;DR: Moment invariants are evaluated as a feature space for pattern recognition in terms of discrimination power and noise tolerance, and properties of complex moments are used to characterize moment invariants.
Abstract: Moment invariants are evaluated as a feature space for pattern recognition in terms of discrimination power and noise tolerance. The notion of complex moments is introduced as a simple and straightforward way to derive moment invariants. Through this relation, properties of complex moments are used to characterize moment invariants. Aspects of information loss, suppression, and redundancy encountered in moment invariants are investigated and significant results are derived. The behavior of moment invariants in the presence of additive noise is also described.
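
The complex-moment route to rotation invariants is short enough to show directly. Under a rotation by theta, the central complex moment c_pq picks up a phase e^{i(p-q)theta}, so c_11 and |c_20|, for instance, are invariant; here is a NumPy sketch on a simple shape and its 90-degree rotation (the particular invariants shown are just the lowest-order examples):

```python
import numpy as np

def complex_moment(img, p, q):
    """Central complex moment c_pq = sum over pixels of z^p * conj(z)^q * f,
    where z = (x - xbar) + i(y - ybar); central, hence translation invariant."""
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m = img.sum()
    z = (xs - (xs * img).sum() / m) + 1j * (ys - (ys * img).sum() / m)
    return (z**p * np.conj(z) ** q * img).sum()

def rotation_invariants(img):
    # c_pq -> exp(i(p-q)theta) c_pq under rotation, so c_11 (real) and
    # the modulus |c_20| are rotation invariant.
    return np.array([complex_moment(img, 1, 1).real, abs(complex_moment(img, 2, 0))])

img = np.zeros((64, 64))
img[20:44, 28:36] = 1.0
print(rotation_invariants(img))          # matches the rotated copy below
print(rotation_invariants(np.rot90(img)))
```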

414 citations


Journal ArticleDOI
TL;DR: Algorithms to recover illuminant direction and estimate surface orientation have been evaluated on both natural and synthesized images, and have been found to produce useful information about the scene.
Abstract: Local analysis of image shading, in the absence of prior knowledge about the viewed scene, may be used to provide information about the scene. The following has been proved. Every image point has the same image intensity and first and second derivatives as the image of some point on a Lambertian surface with principal curvatures of equal magnitude. Further, if the principal curvatures are assumed to be equal there is a unique combination of image formation parameters (up to a mirror reversal) that will produce a particular set of image intensity and first and second derivatives. A solution for the unique combination of surface orientation, etc., is presented. This solution has been extended to natural imagery by using general position and regional constraints to obtain estimates of the following: surface orientation at each image point; the qualitative type of the surface, i.e., whether the surface is planar, cylindrical, convex, concave, or saddle; and the illuminant direction within a region. Algorithms to recover illuminant direction and estimate surface orientation have been evaluated on both natural and synthesized images, and have been found to produce useful information about the scene.
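
For flavor, here is one statistic in this spirit (a simplified stand-in, not the authors' estimator): for linearized Lambertian shading of an isotropic Gaussian surface, the second moments of the image gradient satisfy tan(2*tilt) = 2E[Ix*Iy] / (E[Ix^2] - E[Iy^2]), which recovers the illuminant tilt up to a 180-degree ambiguity; the paper's method additionally estimates slant and local surface type.

```python
import numpy as np

rng = np.random.default_rng(5)

# Smooth, roughly isotropic random surface: blurred white noise.
N = 256
z = rng.normal(size=(N, N))
for _ in range(30):
    z = (z + np.roll(z, 1, 0) + np.roll(z, -1, 0)
           + np.roll(z, 1, 1) + np.roll(z, -1, 1)) / 5

# Linearized Lambertian shading, slant 60 deg, tilt 30 deg.
tilt, slant = np.radians(30), np.radians(60)
zy, zx = np.gradient(z)
img = np.cos(slant) - np.sin(slant) * (zx * np.cos(tilt) + zy * np.sin(tilt))

# Second moments of the image gradient give the tilt modulo 180 degrees.
iy, ix = np.gradient(img)
tau = 0.5 * np.arctan2(2 * (ix * iy).mean(), (ix**2).mean() - (iy**2).mean())
print(f"estimated illuminant tilt: {np.degrees(tau):.1f} deg (true 30)")
```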

412 citations


Journal ArticleDOI
TL;DR: A new solution to the image segmentation problem that is based on the design of a rule-based expert system that dynamically alters the processing strategy is presented.
Abstract: A major problem in robotic vision is the segmentation of images of natural scenes in order to understand their content. This paper presents a new solution to the image segmentation problem that is based on the design of a rule-based expert system. General knowledge about low level properties of the image is embodied in rules that segment the image into uniform regions and connected lines. In addition to the knowledge rules, a set of control rules are also employed. These include metarules that embody inferences about the order in which the knowledge rules are matched. They also incorporate focus of attention rules that determine the path of processing within the image. Furthermore, an additional set of higher level rules dynamically alters the processing strategy. This paper discusses the structure and content of the knowledge and control rules for image segmentation.

380 citations


Journal ArticleDOI
TL;DR: It is shown that the edge location is related to the so-called ``Christoffel numbers,'' and the method is compared with the Sobel and Hueckel edge detectors in the presence and absence of noise.
Abstract: A new method for locating edges in digital data to subpixel values, and which is invariant to additive and multiplicative changes in the data, is presented. For one-dimensional edge patterns an ideal edge is fit to the data by matching moments. It is shown that the edge location is related to the so-called ``Christoffel numbers.'' Also presented is a study of the effect of additive noise on edge location. The method is extended to include two-dimensional edge patterns, where a line equation is derived to locate an edge. This in turn is compared with the standard Hueckel edge operator. An application of the new edge operator as an edge detector is also provided and is compared with the Sobel and Hueckel edge detectors in the presence and absence of noise.
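
The one-dimensional moment fit has a closed form worth spelling out. Matching the first three sample moments to an ideal two-level step gives the fraction p of samples at the first level directly from the skewness; the algebra below is a reconstruction from that matching condition (treat it as a sketch), and it assumes an ascending step.

```python
import numpy as np

def subpixel_edge(samples):
    """Fit an ideal step (level a for a fraction p of the samples, then
    level b > a) by matching the first three sample moments.
    Returns (edge location in samples, a, b)."""
    x = np.asarray(samples, dtype=float)
    m1 = x.mean()
    sigma = x.std()
    s = ((x - m1) ** 3).mean() / sigma**3        # sample skewness
    p = 0.5 * (1.0 + s / np.sqrt(4.0 + s**2))    # mass at the first level
    a = m1 - sigma * np.sqrt((1 - p) / p)
    b = m1 + sigma * np.sqrt(p / (1 - p))
    return p * len(x), a, b

data = [10, 10, 10, 10, 13, 20, 20, 20]          # a step, blurred at one sample
loc, a, b = subpixel_edge(data)
print(f"edge at {loc:.2f} samples, levels {a:.1f} -> {b:.1f}")
```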

359 citations


Journal ArticleDOI
TL;DR: A multiple resolution representation for the two-dimensional gray-scale shapes in an image is defined by detecting peaks and ridges in the difference of lowpass (DOLP) transform and the principles for determining the correspondence between symbols in pairs of such descriptions are described.
Abstract: This paper defines a multiple resolution representation for the two-dimensional gray-scale shapes in an image. This representation is constructed by detecting peaks and ridges in the difference of lowpass (DOLP) transform. Descriptions of shapes which are encoded in this representation may be matched efficiently despite changes in size, orientation, or position. Motivations for a multiple resolution representation are presented first, followed by the definition of the DOLP transform. Techniques are then presented for encoding a symbolic structural description of forms from the DOLP transform. This process involves detecting local peaks and ridges in each bandpass image and in the entire three-dimensional space defined by the DOLP transform. Linking adjacent peaks in different bandpass images gives a multiple resolution tree which describes shape. Peaks which are local maxima in this tree provide landmarks for aligning, manipulating, and matching shapes. Detecting and linking the ridges in each DOLP bandpass image provides a graph which links peaks within a shape in a bandpass image and describes the positions of the boundaries of the shape at multiple resolutions. Detecting and linking the ridges in the DOLP three-space describes elongated forms and links the largest peaks in the tree. The principles for determining the correspondence between symbols in pairs of such descriptions are then described. Such correspondence matching is shown to be simplified by using the correspondence at lower resolutions to constrain the possible correspondence at higher resolutions.

Journal ArticleDOI
TL;DR: This system is able to recognize a large class of 2-D mathematical formulas written on a graphic tablet; it starts the parsing by localizing the ``principal'' operator in the formula and attempts to partition it into subexpressions which are similarly analyzed by looking for a starting character.
Abstract: Mathematical formulas are good examples of two-dimensional patterns, as are pictures or graphics. The use of syntactic methods is useful for interpreting such complex patterns. In this paper we propose a system for the interpretation of 2-D mathematical formulas based on a syntactic parser. This system is able to recognize a large class of 2-D mathematical formulas written on a graphic tablet. It starts the parsing by localization of the ``principal'' operator in the formula and attempts to partition it into subexpressions which are similarly analyzed by looking for a starting character. The generalized parser used in the system has been developed in our group for continuous speech recognition and picture interpretation.

Journal ArticleDOI
TL;DR: The work is described systematically and analyzed in terms of so-called feature matching, which is likely to be the mainstream of the research and development of machine recognition of handprinted Chinese characters.
Abstract: Machine recognition of handprinted Chinese characters has recently become a very active area in Japan. From both the practical and the academic point of view, very encouraging results are reported. The work is described systematically and analyzed in terms of so-called feature matching, which is likely to be the mainstream of the research and development of machine recognition of handprinted Chinese characters. A database, ETL8 (881 Kanji, 71 hiragana, and 160 variations for each category), is explained, on which many experiments were performed. Recognition rates reported using this database can be compared, and so a somewhat qualitative evaluation of these methods is given. Based on the comparative study, the merits and demerits of both feature and structural matching are discussed and some future directions are mentioned.

Journal ArticleDOI
TL;DR: The technique is hierarchical and uses results obtained at low levels to speed up and improve the accuracy of results at higher levels, and has been applied to two-dimensional simple closed curves represented by polygons.
Abstract: In this paper we present results in the areas of shape matching of nonoccluded and occluded two-dimensional objects. Shape matching is viewed as a ``segment matching'' problem. Unlike previous work, the technique is based on a stochastic labeling procedure which explicitly maximizes a criterion function based on the ambiguity and inconsistency of classification. To reduce the computation time, the technique is hierarchical and uses results obtained at low levels to speed up and improve the accuracy of results at higher levels. This basic technique has been extended to the situation where various objects partially occlude each other to form an apparent object and our interest is to find all the objects participating in the occlusion. In such a case several hierarchical processes are executed in parallel for every object participating in the occlusion and are coordinated in such a way that the same segment of the apparent object is not matched to the segments of different actual objects. These techniques have been applied to two-dimensional simple closed curves represented by polygons, and the power of the techniques is demonstrated by examples taken from synthetic, aerial, industrial, and biological images where the matching is done on the output of actual segmentation methods.

Journal ArticleDOI
TL;DR: Two-stage template matching with cross correlation as the similarity measure is developed, yielding significant speed-up over the one-stage process.
Abstract: Two-stage template matching with the sum of absolute differences as the similarity measure has been developed by Vanderburg and Rosenfeld [1], [2]. This correspondence develops two-stage template matching with cross correlation as the similarity measure. The threshold value of the first stage is derived analytically and its validity is verified experimentally. Considerable speed-up over the one-stage process can be obtained by introducing only a small false dismissal probability.
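
The two-stage idea itself fits in a few lines (NumPy sketch): correlate a small subtemplate everywhere, and evaluate the full normalized cross correlation only where the first stage looks promising. The fixed 0.5 first-stage threshold here is an arbitrary illustration; deriving the threshold that trades speed against the false dismissal probability is the contribution of the paper.

```python
import numpy as np

rng = np.random.default_rng(6)

def ncc(a, b):
    a = a - a.mean()
    b = b - b.mean()
    d = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / d) if d > 0 else 0.0

def two_stage_match(image, template, sub_rows=3, threshold=0.5):
    th, tw = template.shape
    sub = template[:sub_rows]                      # first-stage subtemplate
    best, best_pos, full_evals = -1.0, None, 0
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            if ncc(image[r:r + sub_rows, c:c + tw], sub) < threshold:
                continue                           # cheap first-stage rejection
            full_evals += 1                        # costly second stage
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best, full_evals

image = rng.normal(size=(64, 64))
template = image[20:30, 30:40].copy()              # plant a known patch
pos, score, evals = two_stage_match(image, template)
print(pos, round(score, 3), "full correlations evaluated:", evals)
```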

Journal ArticleDOI
TL;DR: The Bayes smoothing algorithm presented here is valid for scene random fields consisting of multilevel (discrete) or continuous random variables, and it gives the optimum Bayes estimate for the scene value at each pixel.
Abstract: A new image segmentation algorithm is presented, based on recursive Bayes smoothing of images modeled by Markov random fields and corrupted by independent additive noise. The Bayes smoothing algorithm yields the a posteriori distribution of the scene value at each pixel, given the total noisy image, in a recursive way. The a posteriori distribution together with a criterion of optimality then determine a Bayes estimate of the scene. The algorithm presented is an extension of a 1-D Bayes smoothing algorithm to 2-D and it gives the optimum Bayes estimate for the scene value at each pixel. Computational concerns in 2-D, however, necessitate certain simplifying assumptions on the model and approximations on the implementation of the algorithm. In particular, the scene (noiseless image) is modeled as a Markov mesh random field, a special class of Markov random fields, and the Bayes smoothing algorithm is applied on overlapping strips (horizontal/vertical) of the image consisting of several rows (columns). It is assumed that the signal (scene values) vector sequence along the strip is a vector Markov chain. Since signal correlation in one of the dimensions is not fully used along the edges of the strip, estimates are generated only along the middle sections of the strips. The overlapping strips are chosen such that the union of the middle sections of the strips gives the whole image. The Bayes smoothing algorithm presented here is valid for scene random fields consisting of multilevel (discrete) or continuous random variables.
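
The 1-D ingredient of the recursion is the classical forward-backward smoother; extended over strips, it underlies the 2-D algorithm. Here is a NumPy sketch for a two-level Markov chain in additive Gaussian noise (the chain, noise level, and gray levels are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

levels = np.array([0.0, 1.0])                # scene gray levels
P = np.array([[0.95, 0.05], [0.05, 0.95]])  # Markov transition matrix
sigma = 0.6

n = 200
s = np.zeros(n, dtype=int)                   # simulate the scene chain
for k in range(1, n):
    s[k] = rng.choice(2, p=P[s[k - 1]])
y = levels[s] + rng.normal(0, sigma, n)      # noisy observations

lik = np.exp(-((y[:, None] - levels[None, :]) ** 2) / (2 * sigma**2))

alpha = np.zeros((n, 2))                     # forward pass
alpha[0] = 0.5 * lik[0]
alpha[0] /= alpha[0].sum()
for k in range(1, n):
    alpha[k] = lik[k] * (alpha[k - 1] @ P)
    alpha[k] /= alpha[k].sum()

beta = np.ones((n, 2))                       # backward pass
for k in range(n - 2, -1, -1):
    beta[k] = P @ (lik[k + 1] * beta[k + 1])
    beta[k] /= beta[k].sum()

post = alpha * beta                          # a posteriori distribution per pixel
post /= post.sum(axis=1, keepdims=True)
est = post.argmax(axis=1)                    # pixelwise optimal Bayes estimate
print("error rate:", float(np.mean(est != s)))
```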

Journal ArticleDOI
TL;DR: It is proved that every ``straight'' chaincode string can be represented by a set of four unique integer parameters, and a mathematical expression is derived for the set of all continuous line segments which could have generated a given chaincode string.
Abstract: If a continuous straight line segment is digitized on a regular grid, obviously a loss of information occurs. As a result, the discrete representation obtained (e.g., a chaincode string) can be coded more conveniently than the continuous line segment, but measurements of properties (such as line length) performed on the representation have an intrinsic inaccuracy due to the digitization process. In this paper, two fundamental properties of the quantization of straight line segments are treated. 1) It is proved that every ``straight'' chaincode string can be represented by a set of four unique integer parameters. Definitions of these parameters are given. 2) A mathematical expression is derived for the set of all continuous line segments which could have generated a given chaincode string. The relation with the chord property is briefly discussed.
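
Two classical necessary conditions for straightness are easy to check computationally and make the quantization effect concrete; the full four-parameter characterization proved in the paper is stronger. A small Python sketch (first octant only, i.e., slopes between 0 and 1):

```python
import numpy as np

def chain_code(slope, intercept, n):
    """Chain code of the digitization y = round(slope*x + intercept) for
    0 <= slope <= 1; code 0 = east, 1 = northeast (other octants by symmetry)."""
    ys = np.round(slope * np.arange(n) + intercept).astype(int)
    return [int(dy) for dy in np.diff(ys)]

def looks_straight(code):
    """Necessary (not sufficient) conditions for a straight chaincode string:
    at most two code values, differing by 1 mod 8, and at least one of the
    two never occurring twice in a row."""
    vals = sorted(set(code))
    if len(vals) == 1:
        return True
    if len(vals) != 2 or (vals[1] - vals[0]) % 8 not in (1, 7):
        return False
    repeats = {c for c, prev in zip(code[1:], code) if c == prev}
    return len(repeats) < 2

print(looks_straight(chain_code(0.4, 0.2, 20)))   # True: a digitized line
print(looks_straight([0, 0, 1, 1, 0, 0]))         # False: both codes run
```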

Journal ArticleDOI
Bir Bhanu
TL;DR: A three-dimensional scene analysis system for the shape matching of real world 3-D objects is presented and the results are shown on several unknown views of a complicated automobile casting.
Abstract: A three-dimensional scene analysis system for the shape matching of real world 3-D objects is presented. Various issues related to representation and modeling of 3-D objects are addressed. A new method for the approximation of 3-D objects by a set of planar faces is discussed. The major advantage of this method is that it is applicable to a complete object and not restricted to a single range view, which was the limitation of previous work in 3-D scene analysis. The method is a sequential region growing algorithm. It is not applied to range images, but rather to a set of 3-D points. The 3-D model of an object is obtained by combining the object points from a sequence of range data images corresponding to various views of the object, applying the necessary transformations and then approximating the surface by polygons. A stochastic labeling technique is used to do the shape matching of 3-D objects. The technique matches the faces of an unknown view against the faces of the model. It explicitly maximizes a criterion function based on the ambiguity and inconsistency of classification. It is hierarchical and uses results obtained at low levels to speed up and improve the accuracy of results at higher levels. The objective here is to match the individual views of the object taken from any vantage point. Details of the algorithm are presented and the results are shown on several unknown views of a complicated automobile casting.

Journal ArticleDOI
TL;DR: An extremum principle is developed that determines three-dimensional surface orientation from a two-dimensional contour; it interprets regular figures correctly, and it interprets skew symmetries as oriented real symmetries.
Abstract: An extremum principle is developed that determines three-dimensional surface orientation from a two-dimensional contour. The principle maximizes the ratio of the area to the square of the perimeter, a measure of the compactness or symmetry of the three-dimensional surface. The principle interprets regular figures correctly and it interprets skew symmetries as oriented real symmetries. The maximum likelihood method approximates the principle on irregular figures, but we show that it consistently overestimates the slant of an ellipse.
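
The principle invites a brute-force check. In the sketch below (NumPy), candidate orientations are scanned exhaustively: the image contour is rotated so the assumed zero-foreshortening axis lies along x, stretched by 1/cos(slant) to undo orthographic foreshortening, and the orientation maximizing area/perimeter^2 is kept. The grid resolution and 80-degree slant cap are arbitrary; the ellipse test mirrors the paper's observation that the principle interprets regular figures correctly.

```python
import numpy as np

def compactness(poly):
    """Area / perimeter^2 of a closed polygon (shoelace formula)."""
    x, y = poly[:, 0], poly[:, 1]
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    perim = np.linalg.norm(poly - np.roll(poly, -1, axis=0), axis=1).sum()
    return area / perim**2

def best_orientation(poly, n_tilt=90, n_slant=60):
    best = (-1.0, 0.0, 0.0)
    for tilt in np.linspace(0, np.pi, n_tilt, endpoint=False):
        c, s = np.cos(tilt), np.sin(tilt)
        rot = poly @ np.array([[c, -s], [s, c]])   # zero-foreshortening axis -> x
        for slant in np.linspace(0.0, np.radians(80), n_slant):
            undone = rot * np.array([1.0, 1.0 / np.cos(slant)])
            q = compactness(undone)
            if q > best[0]:
                best = (q, tilt, slant)
    return best

# An ellipse with axis ratio cos(60 deg) is the image of a slanted circle.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
ellipse = np.column_stack([np.cos(t), 0.5 * np.sin(t)])
_, tilt, slant = best_orientation(ellipse)
print(f"tilt {np.degrees(tilt):.1f} deg, slant {np.degrees(slant):.1f} deg (true 0, 60)")
```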

Journal ArticleDOI
TL;DR: The results of some experiments on estimating the 3-D motion parameters of a rigid body from two consecutive TV images are described, and several factors which affect the accuracy of the results are discussed.
Abstract: This paper describes the results of some experiments on estimating the 3-D motion parameters of a rigid body from two consecutive TV images, and discusses several factors which affect the accuracy of the results. These factors include the sizes of the motion parameters themselves, the accuracy of the raw data, and the number of point correspondences. In addition, we address two related topics: determining corner positions to subpixel accuracy and matching point patterns with different scales.

Journal ArticleDOI
TL;DR: This paper discusses how data from multiple tactile sensors may be used to identify and locate one object, from among a set of known objects, with three degrees of positioning freedom relative to the sensors.
Abstract: This paper discusses how data from multiple tactile sensors may be used to identify and locate one object, from among a set of known objects. We use only local information from sensors: 1) the position of contact points and 2) ranges of surface normals at the contact points. The recognition and localization process is structured as the development and pruning of a tree of consistent hypotheses about pairings between contact points and object surfaces. In this paper, we deal with polyhedral objects constrained to lie on a known plane, i.e., having three degrees of positioning freedom relative to the sensors. We illustrate the performance of the algorithm by simulation.
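
Here is a toy version of the hypothesis pruning (Python), for a planar polygon with unknown pose: a hypothesis pairs each contact with a face, and it survives only if, for every pair of contacts, the angle between measured normals matches the angle between face normals and the distance between contact points fits within the faces' distance span. The polygon, tolerance, sampled segment-distance bound, and exhaustive (rather than tree-structured) enumeration are all simplifications of the paper's method.

```python
import numpy as np
from itertools import product

def face_data(poly):                      # counterclockwise polygon
    edges = list(zip(poly, np.roll(poly, -1, axis=0)))
    normals = []
    for a, b in edges:
        d = b - a
        normals.append(np.array([d[1], -d[0]]) / np.linalg.norm(d))  # outward
    return edges, normals

def angle(u, v):
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

def dist_range(a1, b1, a2, b2, m=20):
    """Min/max distance between two segments, approximated by sampling."""
    s = np.linspace(0, 1, m)[:, None]
    p, q = a1 + s * (b1 - a1), a2 + s * (b2 - a2)
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)
    return d.min(), d.max()

def consistent(contacts, assign, edges, normals, tol=0.1):
    for i in range(len(assign)):
        for j in range(i + 1, len(assign)):
            (pi, ni), (pj, nj) = contacts[i], contacts[j]
            fi, fj = assign[i], assign[j]
            if abs(angle(ni, nj) - angle(normals[fi], normals[fj])) > tol:
                return False              # relative normal angles must agree
            lo, hi = dist_range(*edges[fi], *edges[fj])
            if not lo - tol <= np.linalg.norm(pi - pj) <= hi + tol:
                return False              # contact spacing must fit the faces
    return True

square = np.array([[0, 0], [2, 0], [2, 2], [0, 2]], float)
edges, normals = face_data(square)
contacts = [(np.array([1.0, 0.0]), np.array([0.0, -1.0])),
            (np.array([2.0, 1.0]), np.array([1.0, 0.0]))]
ok = [a for a in product(range(4), repeat=2)
      if consistent(contacts, a, edges, normals)]
print(f"{len(ok)} of 16 face pairings survive pruning:", ok)
```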

Journal ArticleDOI
TL;DR: This work gives results on complex aerial images which contain many image differences, caused by varying sun position, different seasons, and imaging environments, and also structural changes caused by man-made alterations such as new construction.
Abstract: We describe techniques for matching two images or an image and a map. This operation is basic for machine vision and is needed for the tasks of object recognition, change detection, map updating, passive navigation, and other tasks. Our system uses line-based descriptions, and matching is accomplished by a relaxation operation which computes the most similar geometrical structures. A more efficient variation, called the ``kernel'' method, is also described. We give results on complex aerial images which contain many image differences, caused by varying sun position, different seasons, and imaging environments, and also structural changes caused by man-made alterations such as new construction.

Journal ArticleDOI
TL;DR: This correspondence describes a method of image segmentation based on a ``pyramid'' of reduced-resolution versions of the image that defines link strengths between pixels at adjacent levels of the pyramid, based on proximity and similarity, and iteratively recomputes the pixel values and adjusts the link strengths.
Abstract: This correspondence describes a method of image segmentation based on a ``pyramid'' of reduced-resolution versions of the image. It defines link strengths between pixels at adjacent levels of the pyramid, based on proximity and similarity, and iteratively recomputes the pixel values and adjusts the link strengths. After a few iterations, the strengths stabilize, and the links that remain strong define subtrees of the pyramid; the leaves of each tree are the pixels belonging to a compact (piece of a) homogeneous region in the image.
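
The link-and-recompute iteration can be shown compactly. The sketch below (NumPy) uses plain 2x2 averaging with a small overlapping set of candidate parents, rather than the overlapped 4x4 neighborhoods of the paper, but the loop is the same: link every node to its most similar candidate parent, recompute each parent as the mean of its linked children, repeat, then read segments off the surviving trees.

```python
import numpy as np

rng = np.random.default_rng(8)

def candidates(i, size):                        # two nearest parents per axis
    return sorted({min(size - 1, max(0, (i - 1) // 2)),
                   min(size - 1, max(0, (i + 1) // 2))})

def pyramid_link(img, levels=4, iters=5):
    pyr = [img.astype(float)]
    for _ in range(levels - 1):                 # initialize by 2x2 averaging
        a = pyr[-1]
        pyr.append((a[::2, ::2] + a[1::2, ::2] + a[::2, 1::2] + a[1::2, 1::2]) / 4)
    links = [None] * (levels - 1)
    for _ in range(iters):
        for l in range(levels - 1):
            child, parent = pyr[l], pyr[l + 1]
            link = np.zeros(child.shape + (2,), dtype=int)
            total = np.zeros_like(parent)
            count = np.zeros_like(parent)
            for i in range(child.shape[0]):
                for j in range(child.shape[1]):
                    cand = [(pi, pj) for pi in candidates(i, parent.shape[0])
                                     for pj in candidates(j, parent.shape[1])]
                    # link to the most similar candidate parent
                    pi, pj = min(cand, key=lambda c: abs(parent[c] - child[i, j]))
                    link[i, j] = pi, pj
                    total[pi, pj] += child[i, j]
                    count[pi, pj] += 1
            links[l] = link
            # recompute parents as the mean of their linked children
            pyr[l + 1] = np.where(count > 0, total / np.maximum(count, 1), parent)
    seg = np.zeros(img.shape, dtype=int)        # label pixels by their root node
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            pi, pj = i, j
            for l in range(levels - 1):
                pi, pj = links[l][pi, pj]
            seg[i, j] = pi * pyr[-1].shape[1] + pj
    return seg

img = np.full((32, 32), 1.0)
img[:, 16:] = 4.0                               # two homogeneous regions
img += rng.normal(0, 0.3, img.shape)
seg = pyramid_link(img)
print("roots used:", len(np.unique(seg)),
      "| left and right regions separate:", seg[0, 0] != seg[0, -1])
```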

Journal ArticleDOI
TL;DR: The difference of low-pass (DOLP) transform is defined and a fast algorithm for its computation is described; a computation technique based on ``resampling'' reduces the DOLP transform complexity to O(N log(N)) multiplies and O(N) storage locations.
Abstract: This paper defines the difference of low-pass (DOLP) transform and describes a fast algorithm for its computation. The DOLP is a reversible transform which converts an image into a set of bandpass images. A DOLP transform is shown to require O(N²) multiplies and produce O(N log(N)) samples from an N sample image. When Gaussian low-pass filters are used, the result is a set of images which have been convolved with difference of Gaussian (DOG) filters from an exponential set of sizes. A fast computation technique based on ``resampling'' is described and shown to reduce the DOLP transform complexity to O(N log(N)) multiplies and O(N) storage locations. A second technique, ``cascaded convolution with expansion,'' is then defined and also shown to reduce the computational cost to O(N log(N)) multiplies. Combining these two techniques yields an algorithm for a DOLP transform that requires O(N) storage cells and requires O(N) multiplies.
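
A minimal NumPy sketch of the transform with the resampling idea: repeatedly low-pass with a small Gaussian, take the difference of successive low-pass images as the bandpass output, and halve the sampling each octave so the same small kernel acts at doubled effective size. Kernel width, level count, and the test image are illustrative; the exact reversible filter design and the cascaded-convolution variant are in the paper.

```python
import numpy as np

def gauss_kernel(sigma):
    x = np.arange(-int(3 * sigma), int(3 * sigma) + 1)
    k = np.exp(-(x**2) / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian low-pass filtering with reflect padding."""
    k = gauss_kernel(sigma)
    r = len(k) // 2
    p = np.pad(img, ((r, r), (0, 0)), mode="reflect")
    img = np.apply_along_axis(lambda col: np.convolve(col, k, "valid"), 0, p)
    p = np.pad(img, ((0, 0), (r, r)), mode="reflect")
    return np.apply_along_axis(lambda row: np.convolve(row, k, "valid"), 1, p)

def dolp(img, levels=4, sigma=1.0):
    low = img.astype(float)
    bands = []
    for _ in range(levels):
        nxt = blur(low, sigma)
        bands.append(low - nxt)      # bandpass image: difference of low-passes
        low = nxt[::2, ::2]          # resample: half the samples per octave,
    return bands, low                # so the fixed kernel doubles in effect

rng = np.random.default_rng(9)
img = rng.normal(size=(64, 64)) + np.linspace(0, 5, 64)
bands, residual = dolp(img)
print([b.shape for b in bands], residual.shape)
```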

Journal ArticleDOI
TL;DR: Several theorems related to the bounds on the search time, error rate, memory requirement and overlap factor in the design of a decision tree have been proposed and some principles have been established to analyze the behaviors of the decision tree.
Abstract: Based on a recursive process of reducing the entropy, the general decision tree classifier with overlap has been analyzed. Several theorems have been proposed and proved. When the number of pattern classes is very large, the theorems can reveal both the advantages of a tree classifier and the main difficulties in its implementation. Suppose H is Shannon's entropy measure of the given problem. The theoretical results indicate that the tree searching time can be minimized to the order O(H), but the error rate is also in the same order O(H) due to error accumulation. However, the memory requirement is in the order O(H exp(H)), which poses serious problems in the implementation of a tree classifier for a large number of classes. To solve these problems, several theorems related to the bounds on the search time, error rate, memory requirement and overlap factor in the design of a decision tree have been proposed and some principles have been established to analyze the behaviors of the decision tree. When applied to classify sets of 64, 450, and 3200 Chinese characters, respectively, the experimental results support the theoretical predictions. For 3200 classes, a very high recognition rate of 99.88 percent was achieved at a high speed of 873 samples/s when the experiment was conducted on a Cyber 172 computer using a high-level language.

Journal ArticleDOI
TL;DR: A quadratic metric d_AO(X, Y) = [(X - Y)^T A_O (X - Y)]^(1/2) is proposed which minimizes the mean-squared error between the nearest neighbor asymptotic risk and the finite sample risk.
Abstract: A quadratic metric d_AO(X, Y) = [(X - Y)^T A_O (X - Y)]^(1/2) is proposed which minimizes the mean-squared error between the nearest neighbor asymptotic risk and the finite sample risk. Under linearity assumptions, a heuristic argument is given which indicates that this metric produces lower mean-squared error than the Euclidean metric. A nonparametric estimate of A_O is developed. If samples appear to come from a Gaussian mixture, an alternative, parametrically directed distance measure is suggested for nearness decisions within a limited region of space. Examples of some two-class Gaussian mixture distributions are included.
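
The metric itself is one line. In the sketch below (NumPy), the quadratic-metric nearest neighbor rule is compared with the Euclidean one on an artificial two-class problem; for brevity A is set to the inverse pooled covariance (a common quadratic-metric choice), not the paper's optimal A_O, which is estimated from the data to minimize the finite-sample/asymptotic risk mismatch.

```python
import numpy as np

rng = np.random.default_rng(10)

def nn_classify(X, y, query, A):
    """1-NN under the quadratic metric d_A(u, v) = sqrt((u-v)^T A (u-v));
    A = I gives the usual Euclidean rule."""
    diff = X - query
    d2 = np.einsum("ni,ij,nj->n", diff, A, diff)
    return y[np.argmin(d2)]

# Two classes separated along a low-variance feature, swamped by a
# high-variance irrelevant feature.
n = 200
X = np.vstack([rng.normal([0, 0.0], [5.0, 0.5], (n, 2)),
               rng.normal([0, 1.5], [5.0, 0.5], (n, 2))])
y = np.repeat([0, 1], n)
T = np.vstack([rng.normal([0, 0.0], [5.0, 0.5], (250, 2)),
               rng.normal([0, 1.5], [5.0, 0.5], (250, 2))])
t = np.repeat([0, 1], 250)

A = np.linalg.inv(np.cov(X.T))          # inverse pooled covariance
for name, M in [("euclidean", np.eye(2)), ("quadratic", A)]:
    pred = np.array([nn_classify(X, y, q, M) for q in T])
    print(name, "accuracy:", float((pred == t).mean()))
```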

Journal ArticleDOI
TL;DR: This work presents an approach for picture indexing and abstraction, and illustrates by examples how to apply abstraction operations to obtain various picture indexes, and how to construct icons to facilitate accessing of pictorial data.
Abstract: We present an approach for picture indexing and abstraction. Picture indexing facilitates information retrieval from a pictorial database consisting of picture objects and picture relations. To construct picture indexes, abstraction operations to perform picture object clustering and classification are formulated. To substantiate the abstraction operations, we also formalize syntactic abstraction rules and semantic abstraction rules. We then illustrate by examples how to apply these abstraction operations to obtain various picture indexes, and how to construct icons to facilitate accessing of pictorial data.

Journal ArticleDOI
TL;DR: A transform coding scheme for closed image boundaries on a plane is described, in which a Gaussian circular autoregressive model represents the boundary data and the scheme is implemented using the variances of the Fourier coefficients and the Max quantizer.
Abstract: A transform coding scheme for closed image boundaries on a plane is described. The given boundary is approximated by a series of straight line segments. Depending on the shape, the boundary is represented by the (x, y) coordinates of the endpoints of the line segments or by the magnitude of the successive radii vectors that are equispaced in angle around the given boundary. Due to the circularity present in the data, the discrete Fourier transform is used to exactly decorrelate the finite boundary data. By fitting a Gaussian circular autoregressive model to represent the boundary data, estimates of the variances of the Fourier coefficients are obtained. Using the variances of the Fourier coefficients and the Max quantizer, the coding scheme is implemented. The scheme is illustrated by an example.
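
A toy version of the coder (NumPy): take the DFT of the equiangular radii (circular data, so the DFT decorrelates it), keep the low frequencies, and quantize. The paper allocates quantizer levels from the AR-model variance estimates and uses the Max quantizer; the fixed coefficient count and uniform step below are simplifications.

```python
import numpy as np

def code_boundary(radii, keep=8, step=0.05):
    """Keep the lowest `keep` DFT coefficients of the radii and quantize
    them uniformly (real and imaginary parts) with the given step."""
    C = np.fft.rfft(radii) / len(radii)
    return np.round(C[:keep] / step) * step

def decode_boundary(q, n):
    C = np.zeros(n // 2 + 1, dtype=complex)
    C[:len(q)] = q
    return np.fft.irfft(C * n, n)

# A closed blob: radius as a smooth periodic function of the angle.
n = 128
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
r = 1.0 + 0.3 * np.cos(2 * theta) + 0.1 * np.sin(5 * theta)
rec = decode_boundary(code_boundary(r), n)
print("max reconstruction error:", float(np.abs(rec - r).max()))
```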

Journal ArticleDOI
TL;DR: It is proved that the nonzero entry of the commutator of a pair of scatter matrices constructed from discrete arcs is related to the angle between their eigenspaces, and, in certain limiting cases, this entry is proportional to the analytical curvature of the plane curve from which the discrete data are drawn.
Abstract: This paper introduces a new theory for the tangential deflection and curvature of plane discrete curves. Our theory applies to discrete data in either rectangular boundary coordinate or chain coded formats; its rationale is drawn from the statistical and geometric properties associated with the eigenvalue-eigenvector structure of sample covariance matrices. Specifically, we prove that the nonzero entry of the commutator of a pair of scatter matrices constructed from discrete arcs is related to the angle between their eigenspaces. And further, we show that this entry is, in certain limiting cases, also proportional to the analytical curvature of the plane curve from which the discrete data are drawn. These results lend a sound theoretical basis to the notions of discrete curvature and tangential deflection; and moreover, they provide a means for computationally efficient implementation of algorithms which use these ideas in various image processing contexts. As a concrete example, we develop the commutator vertex detection (CVD) algorithm, which identifies the location of vertices in shape data based on excessive cumulative tangential deflection; and we compare its performance to several well established corner detectors that utilize the alternative strategy of finding (approximate) curvature extrema.
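
The commutator statistic is simple to compute. For symmetric 2x2 scatter matrices A and B, the commutator AB - BA is antisymmetric, and its single independent entry grows with the angle between the eigenspaces (vanishing when the arcs are collinear); this is the deflection signal behind the CVD idea. A NumPy sketch on a polyline, with an arbitrary window size and a 45-degree corner:

```python
import numpy as np

def scatter(points):
    c = points - points.mean(axis=0)
    return c.T @ c / len(points)

def commutator_entry(A, B):
    return (A @ B - B @ A)[0, 1]     # AB - BA is antisymmetric for symmetric A, B

def deflection_signal(curve, w=6):
    """Deflection at each point: |commutator entry| of the scatter matrices
    of the trailing and leading arcs of length w."""
    out = np.zeros(len(curve))
    for k in range(w, len(curve) - w):
        out[k] = abs(commutator_entry(scatter(curve[k - w:k + 1]),
                                      scatter(curve[k:k + w + 1])))
    return out

# A polyline: 51 points heading east, then a 45-degree turn at index 50.
a = np.column_stack([np.arange(51.0), np.zeros(51)])
d = np.array([np.cos(np.pi / 4), np.sin(np.pi / 4)])
b = np.array([50.0, 0.0]) + np.arange(1, 51)[:, None] * d
curve = np.vstack([a, b])
sig = deflection_signal(curve)
print("strongest deflection at index:", int(sig.argmax()), "(corner is at 50)")
```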

Journal ArticleDOI
TL;DR: It is shown that uncertainty can be derived from the fundamental constraints on the process of vision: the requirements for class-defining operations which are both shift-invariant and insensitive to changes in illumination.
Abstract: The uncertainty principle is recognized as one of the fundamental results in signal processing. Its role in inference is, however, less well known outside of quantum mechanics. It is the aim of this paper to provide a unified approach to the problem of uncertainty in image processing. It is shown that uncertainty can be derived from the fundamental constraints on the process of vision: the requirements for class-defining operations which are both shift-invariant and insensitive to changes in illumination. It is thus shown that uncertainty plays a key role in the language of vision, since it affects the choice of both the alphabet (the elementary signals) and the syntax (the inferential structure) of vision. The paper concludes with a number of practical illustrations of these ideas, taken from such image processing tasks as enhancement, data compression, and segmentation.
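
The central quantity is easy to measure numerically. The sketch below (NumPy) computes the effective duration-bandwidth product from second moments in space and spatial frequency for a Gaussian window and a rectangular one; the Gaussian sits at the 1/(4*pi) lower bound, which is why Gaussian-windowed elementary signals make a natural alphabet. The window sizes are arbitrary.

```python
import numpy as np

def uncertainty_product(signal, dt=1.0):
    """Delta-x * Delta-f from second moments of |s|^2 and |S(f)|^2.
    For a real, even-spectrum window the mean frequency is zero."""
    s = np.abs(signal) ** 2
    s /= s.sum()
    x = np.arange(len(signal)) * dt
    dx = np.sqrt(np.sum(s * (x - np.sum(s * x)) ** 2))
    S = np.abs(np.fft.fft(signal)) ** 2
    S /= S.sum()
    f = np.fft.fftfreq(len(signal), dt)
    df = np.sqrt(np.sum(S * f**2))
    return dx * df

n = 1024
t = np.arange(n)
gauss = np.exp(-0.5 * ((t - n / 2) / 40.0) ** 2)
box = (np.abs(t - n / 2) < 69).astype(float)
print(f"gaussian: {uncertainty_product(gauss):.4f}  (bound 1/(4*pi) = {1 / (4 * np.pi):.4f})")
print(f"box:      {uncertainty_product(box):.4f}")
```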