
Showing papers in "International Journal of Computer Vision in 1994"


Journal ArticleDOI
TL;DR: These comparisons are primarily empirical, and concentrate on the accuracy, reliability, and density of the velocity measurements; they show that performance can differ significantly among the techniques the authors implemented.
Abstract: While different optical flow techniques continue to appear, there has been a lack of quantitative evaluation of existing methods. For a common set of real and synthetic image sequences, we report the results of a number of regularly cited optical flow techniques, including instances of differential, matching, energy-based, and phase-based methods. Our comparisons are primarily empirical, and concentrate on the accuracy, reliability, and density of the velocity measurements; they show that performance can differ significantly among the techniques we implemented.
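
For readers who want to reproduce this style of comparison, the sketch below computes the angular error measure commonly associated with this benchmark: each 2-D velocity (u, v) is embedded as the 3-D direction (u, v, 1), and the error is the angle between the estimated and true unit vectors. This is an illustrative NumPy reimplementation, not the authors' code.

```python
import numpy as np

def angular_error_deg(u_est, v_est, u_true, v_true):
    """Angular error between estimated and true flow, in degrees.

    Each 2-D velocity (u, v) is embedded as the 3-D direction
    (u, v, 1) and normalized; the error is the angle between the
    two unit vectors.
    """
    est = np.stack([u_est, v_est, np.ones_like(u_est)], axis=-1)
    true = np.stack([u_true, v_true, np.ones_like(u_true)], axis=-1)
    est /= np.linalg.norm(est, axis=-1, keepdims=True)
    true /= np.linalg.norm(true, axis=-1, keepdims=True)
    cos = np.clip(np.sum(est * true, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

# Example: a 1 pixel/frame estimate against a 1.1 pixel/frame truth.
print(angular_error_deg(np.array([1.0]), np.array([0.0]),
                        np.array([1.1]), np.array([0.0])))
```

Density, the third quantity the paper measures, is then simply the fraction of pixels at which a technique reports a velocity at all; the error statistics are averaged only over those pixels.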

4,771 citations


Journal ArticleDOI
TL;DR: A heuristic method has been developed for registering two sets of 3-D curves obtained by using an edge-based stereo system, or two dense 3-D maps obtained by using a correlation-based stereo system; it is efficient and robust, and yields an accurate motion estimate.
Abstract: A heuristic method has been developed for registering two sets of 3-D curves obtained by using an edge-based stereo system, or two dense 3-D maps obtained by using a correlation-based stereo system. Geometric matching in general is a difficult unsolved problem in computer vision. Fortunately, in many practical applications, some a priori knowledge exists which considerably simplifies the problem. In visual navigation, for example, the motion between successive positions is usually approximately known. From this initial estimate, our algorithm computes observer motion with very good precision, which is required for environment modeling (e.g., building a Digital Elevation Map). Objects are represented by a set of 3-D points, which are considered as the samples of a surface. No constraint is imposed on the form of the objects. The proposed algorithm is based on iteratively matching points in one set to the closest points in the other. A statistical method based on the distance distribution is used to deal with outliers, occlusion, appearance and disappearance, which allows us to do subset-subset matching. A least-squares technique is used to estimate 3-D motion from the point correspondences, which reduces the average distance between points in the two sets. Both synthetic and real data have been used to test the algorithm, and the results show that it is efficient and robust, and yields an accurate motion estimate.
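
The core loop the abstract describes (match each point to its closest counterpart in the other set, then solve a least-squares rigid motion) is compact enough to sketch. The version below is a minimal illustration, assuming an SVD-based least-squares step and omitting the paper's statistical outlier rejection based on the distance distribution.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=50, tol=1e-6):
    """Minimal iterative-closest-point sketch (no outlier handling).

    src, dst: (N, 3) and (M, 3) point sets. Returns R, t such that
    src @ R.T + t is registered to dst.
    """
    tree = cKDTree(dst)
    R, t = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(iters):
        moved = src @ R.T + t
        d, idx = tree.query(moved)            # closest-point matching
        matched = dst[idx]
        # Least-squares rigid motion via SVD (Kabsch/Umeyama step).
        mu_s, mu_d = moved.mean(0), matched.mean(0)
        H = (moved - mu_s).T @ (matched - mu_d)
        U, _, Vt = np.linalg.svd(H)
        S = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])
        dR = Vt.T @ S @ U.T
        dt = mu_d - dR @ mu_s
        R, t = dR @ R, dR @ t + dt            # compose with running estimate
        err = d.mean()
        if abs(prev_err - err) < tol:         # average distance has converged
            break
        prev_err = err
    return R, t
```

As the abstract notes, this iteration only converges to the right registration when started from a reasonable initial estimate, which is exactly what the visual-navigation setting provides.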

2,177 citations


Journal ArticleDOI
TL;DR: A modified variational scheme for contour modeling is proposed, which uses no edge detection step, but local computations instead—only around contour neighborhoods—as well as an “anticipating” strategy that enhances the modeling activity of deformable contour curves.
Abstract: The variational method has been introduced by Kass et al. (1987) in the field of object contour modeling, as an alternative to the more traditional edge detection-edge thinning-edge sorting sequence. Since the method is based on a pre-processing of the image to yield an edge map, it shares the limitations of the edge detectors it uses. In this paper, we propose a modified variational scheme for contour modeling, which uses no edge detection step, but local computations instead—only around contour neighborhoods—as well as an “anticipating” strategy that enhances the modeling activity of deformable contour curves. Many of the concepts used were originally introduced to study the local structure of discontinuity, in a theoretical and formal statement by Leclerc and Zucker (1987), but never in a practical situation such as this one. The first part of the paper introduces a region-based energy criterion for active contours, and gives an examination of its implications, as compared to the gradient edge map energy of snakes. Then, a simplified optimization scheme is presented, accounting for internal and external energy in separate steps. This leads to a complete treatment, which is described in the last sections of the paper (4 and 5). The optimization technique used here is mostly heuristic, and is thus presented without a formal proof, but is believed to fill a gap between snakes and other useful image representations, such as split-and-merge regions or mixed line-labels image fields.

694 citations


Journal ArticleDOI
TL;DR: The major direct solutions to the three-point perspective pose estimation problem are reviewed from a unified perspective, beginning with the first solution, published in 1841 by a German mathematician, continuing through the solutions published in the German and then American photogrammetry literature, and most recently in the current computer vision literature.
Abstract: In this paper, the major direct solutions to the three-point perspective pose estimation problem are reviewed from a unified perspective, beginning with the first solution, which was published in 1841 by a German mathematician, continuing through the solutions published in the German and then American photogrammetry literature, and most recently in the current computer vision literature. The numerical stability of these three-point perspective solutions is also discussed. We show that even in cases where the solution is not near the geometrically unstable region, considerable care must be exercised in the calculation. Depending on the order of the substitutions utilized, the relative error can vary by over a factor of a thousand. This difference is due entirely to the way the calculations are performed and not to any geometric structural instability of any problem instance. We present an analysis method which produces a numerically stable calculation.
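
The paper's point that algebraically equivalent formulations can differ enormously in relative error is easy to demonstrate in miniature. The snippet below uses the classic quadratic-root cancellation example as an analogy; it is not the paper's P3P equations.

```python
import numpy as np

# Two algebraically equivalent ways to get the smaller root of
# x^2 - b*x + c = 0. For b >> c, the textbook formula suffers
# catastrophic cancellation, while the rearranged form stays
# accurate: the same phenomenon the paper reports for different
# substitution orders in the P3P system.
b, c = 1e8, 1.0
disc = np.sqrt(b * b - 4 * c)
naive = (b - disc) / 2           # subtracts two nearly equal numbers
stable = 2 * c / (b + disc)      # algebraically identical, no cancellation
print(naive, stable)             # relative errors differ enormously
```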

574 citations


Journal ArticleDOI
TL;DR: A new method named STM is described for determining the distance of objects and for rapid autofocusing of camera systems, based on a new Spatial-Domain Convolution/Deconvolution Transform that requires only two images taken with different camera parameters such as lens position, focal length, and aperture diameter.
Abstract: A new method named STM is described for determining the distance of objects and for rapid autofocusing of camera systems. STM uses image defocus information and is based on a new Spatial-Domain Convolution/Deconvolution Transform. The method requires only two images taken with different camera parameters such as lens position, focal length, and aperture diameter. Both images can be arbitrarily blurred and neither of them needs to be a focused image. Therefore STM is very fast in comparison with Depth-from-Focus methods, which search for the lens position or focal length of best focus. The method involves simple local operations and can be easily implemented in parallel to obtain the depth-map of a scene. STM has been implemented on an actual camera system named SPARCS. Experiments evaluating the performance of STM on real-world planar objects are presented. The results indicate that the accuracy of STM compares well with that of Depth-from-Focus methods and is useful in practical applications. The utility of the method is demonstrated for rapid autofocusing of electronic cameras.

514 citations


Journal Article
TL;DR: A multiscale representation of grey-level shape called the scale-space primal sketch is presented, which gives a qualitative description of image structure, which allows for detection of stable scales and associated regions of interest in a solely bottom-up data-driven way.
Abstract: This article presents: (i) a multiscale representation of grey-level shape called the scale-space primal sketch, which makes explicit both features in scale-space and the relations between structures at different scales, (ii) a methodology for extracting significant blob-like image structures from this representation, and (iii) applications to edge detection, histogram analysis, and junction classification demonstrating how the proposed method can be used for guiding later-stage visual processes. The representation gives a qualitative description of image structure, which allows for detection of stable scales and associated regions of interest in a solely bottom-up data-driven way. In other words, it generates coarse segmentation cues, and can hence be seen as preceding further processing, which can then be properly tuned. It is argued that once such information is available, many other processing tasks can become much simpler. Experiments on real imagery demonstrate that the proposed theory gives intuitive results.
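
As a rough illustration of scale selection in this spirit (a much-simplified stand-in, not the primal sketch itself), the sketch below computes scale-normalized Laplacian-of-Gaussian responses; extrema over both space and scale mark blob-like structures and their stable scales.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def scale_space_blob_responses(image, sigmas):
    """Scale-normalized Laplacian-of-Gaussian responses.

    Blob-like structure is strong where sigma^2 * LoG(image, sigma)
    is locally extremal over space AND scale; the extremal sigma is
    a rough estimate of the size of the structure there.
    """
    image = image.astype(float)
    return np.stack([(s ** 2) * gaussian_laplace(image, s) for s in sigmas])

# A bright square yields a strong response near its center, at a
# scale comparable to its size.
img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0
resp = scale_space_blob_responses(img, sigmas=[1, 2, 4, 8])
# (scale index, row, col) of the strongest blob response:
print(np.unravel_index(np.abs(resp).argmax(), resp.shape))
```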

449 citations


Journal ArticleDOI
TL;DR: A method is presented for detecting and tracking occluding and transparent moving objects, which uses temporal integration without assuming motion constancy to improve both motion analysis and segmentation.
Abstract: Computing the motions of several moving objects in image sequences involves simultaneous motion analysis and segmentation. This task can become complicated when image motion changes significantly between frames, as with camera vibrations. Such vibrations make tracking in longer sequences harder, as temporal motion constancy cannot be assumed. The problem becomes even more difficult in the case of transparent motions.

435 citations


Journal ArticleDOI
TL;DR: A new method is presented for determining the minimal nonrigid deformation between two 3-D surfaces, such as those which describe anatomical structures in 3-D medical images; it performs a least squares minimization of the distance between the two surfaces of interest.
Abstract: Presents a new method for determining the minimal nonrigid deformation between two 3D surfaces, such as those which describe anatomical structures in 3D medical images. Although the authors match surfaces, they represent the deformation as a volumetric transformation. Their method performs a least squares minimization of the distance between the two surfaces of interest. To quickly and accurately compute distances between points on the two surfaces, the authors use a precomputed distance map represented using an octree spline whose resolution increases near the surface. To quickly and robustly compute the deformation, the authors use a second octree-spline to model the deformation function. The coarsest level of the deformation encodes the global (e.g., affine) transformation between the two surfaces, while finer levels encode smooth local displacements which bring the two surfaces into closer registration. The authors present experimental results on both synthetic and real 3D surfaces.
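
The key efficiency trick, precomputing a distance map so that point-to-surface distances become cheap lookups during minimization, can be sketched with a dense voxel grid standing in for the paper's adaptive octree spline, and a translation-only motion standing in for its full deformation model. Purely illustrative, under those simplifications.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, map_coordinates
from scipy.optimize import least_squares

def make_distance_map(surface_mask):
    # Distance from every voxel to the nearest surface voxel.
    return distance_transform_edt(~surface_mask)

def register_translation(points, dist_map):
    def residuals(p):
        # p = (tx, ty, tz): translation-only stand-in for the
        # paper's octree-spline warp. Trilinear lookup makes each
        # point-to-surface distance O(1).
        q = points + p
        return map_coordinates(dist_map, q.T, order=1)
    return least_squares(residuals, np.zeros(3)).x

mask = np.zeros((32, 32, 32), dtype=bool)
mask[16, :, :] = True                        # a planar "surface"
pts = np.array([[12.0, 10.0, 10.0], [12.0, 20.0, 5.0]])
print(register_translation(pts, make_distance_map(mask)))  # ~ +4 on axis 0
```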

346 citations


Journal ArticleDOI
TL;DR: The TEA-1 selective vision system uses Bayes nets for representation and benefit-cost analysis for control of visual and nonvisual actions; its data structures and decision-making algorithms provide a general, reusable framework that solves the T-world problem.
Abstract: A selective vision system sequentially collects evidence to support a specified hypothesis about a scene, as long as the additional evidence is worth the effort of obtaining it. Efficiency comes from processing the scene only where necessary, to the level of detail necessary, and with only the necessary operators. Knowledge representation and sequential decision-making are central issues for selective vision, which takes advantage of prior knowledge of a domain's abstract and geometrical structure and models for the expected performance and cost of visual operators.

The TEA-1 selective vision system uses Bayes nets for representation and benefit-cost analysis for control of visual and non-visual actions. It is the high-level control for an active vision system, enabling purposive behavior, the use of qualitative vision modules and a pointable multiresolution sensor. TEA-1 demonstrates that Bayes nets and decision theoretic techniques provide a general, re-usable framework for constructing computer vision systems that are selective perception systems, and that Bayes nets provide a general framework for representing visual tasks. Control, or decision making, is the most important issue in a selective vision system. TEA-1's decisions about what to do next are based on general hand-crafted “goodness functions” constructed around core decision theoretic elements. Several goodness functions for different decisions are presented and evaluated.

The TEA-1 system solves a version of the T-world problem, an abstraction of a large set of domains and tasks. Some key factors that affect the success of selective perception are analyzed by examining how each factor affects the overall performance of TEA-1 when solving ensembles of randomly produced, simulated T-world domains and tasks. TEA-1's decision making algorithms are also evaluated in this manner. Experiments in the lab for one specific T-world domain, table settings, are also presented.
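
The benefit-cost control idea reduces to a simple rule: execute the action whose expected benefit most exceeds its cost, and stop when no action is worth its cost. The toy below illustrates only that rule; the action names and numbers are invented, not taken from TEA-1.

```python
# Toy benefit-cost action selection. In a real system the expected
# benefit would come from the Bayes net (expected change in belief
# about the hypothesis) and the cost from operator timing models.
def select_action(actions):
    """actions: list of (name, expected_benefit, cost) tuples.
    Returns the best action, or None when nothing is worthwhile."""
    best_score, best = max((b - c, name) for name, b, c in actions)
    return best if best_score > 0 else None

actions = [("foveate_table", 5.0, 1.0),
           ("run_edge_detector", 2.0, 2.5),
           ("move_camera", 3.0, 1.5)]
print(select_action(actions))  # -> 'foveate_table'
```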

203 citations


Journal ArticleDOI
TL;DR: This work builds upon the seminal work of Kishon et al. (1990), where curves are first smoothed using B-splines, with matching based on hashing using curvature and torsion measures, but introduces two enhancements that allow a more accurate estimation of position, curvature, torsion, and Frenet frames along the curve.
Abstract: We present a new approach to the problem of matching 3-D curves. The approach has a low algorithmic complexity in the number of models, and can operate in the presence of noise and partial occlusions. Our method builds upon the seminal work of Kishon et al. (1990), where curves are first smoothed using B-splines, with matching based on hashing using curvature and torsion measures. However, we introduce two enhancements to this scheme. We present experimental results using synthetic data and also using characteristic curves extracted from 3-D medical images. An earlier version of this article was presented at the 2nd European Conference on Computer Vision in Italy.

145 citations


Journal ArticleDOI
TL;DR: A general method is presented for space variant image processing, based on a connectivity graph that represents the neighbor relations in an arbitrarily structured sensor; it is suitable for real-time implementation and provides a generic solution to a wide range of image processing applications with space variant sensors.
Abstract: This paper describes a graph-based approach to image processing, intended for use with images obtained from sensors having space variant sampling grids. The connectivity graph (CG) is presented as a fundamental framework for posing image operations in any kind of space variant sensor. Partially motivated by the observation that human vision is strongly space variant, a number of research groups have been experimenting with space variant sensors. Such systems cover wide solid angles yet maintain high acuity in their central regions. Implementation of space variant systems poses at least two outstanding problems. First, such a system must be active, in order to utilize its high acuity region; second, there are significant image processing problems introduced by the non-uniform pixel size, shape and connectivity. Familiar image processing operations such as connected components, convolution, template matching, and even image translation, take on new and different forms when defined on space variant images. The present paper provides a general method for space variant image processing, based on a connectivity graph which represents the neighbor-relations in an arbitrarily structured sensor. We illustrate this approach with the following applications: (1) Connected components is reduced to its graph theoretic counterpart. We illustrate this on a logmap sensor, which possesses a difficult topology due to the branch cut associated with the complex logarithm function. (2) We show how to write local image operators in the connectivity graph that are independent of the sensor geometry. (3) We relate the connectivity graph to pyramids over irregular tessellations, and implement a local binarization operator in a 2-level pyramid. (4) Finally, we expand the connectivity graph into a structure we call a transformation graph, which represents the effects of geometric transformations in space variant image sensors. Using the transformation graph, we define an efficient algorithm for matching in the logmap images and solve the template matching problem for space variant images. Because of the very small number of pixels typical of logarithmic structured space variant arrays, the connectivity graph approach to image processing is suitable for real-time implementation, and provides a generic solution to a wide range of image processing applications with space variant sensors.
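
Application (1) shows how the geometry disappears from the algorithm: once neighbor relations are given as a graph, connected components is ordinary graph traversal. A minimal sketch, assuming the connectivity graph is supplied as an adjacency dictionary:

```python
from collections import deque

def connected_components(neighbors, foreground):
    """Connected components on an arbitrary sensor topology.

    neighbors: dict pixel -> iterable of adjacent pixels (the
    connectivity graph, which is where the sensor geometry lives).
    foreground: set of pixels that are 'on'. Note that no pixel
    coordinates or shapes appear below, which is exactly the
    sensor-independence the paper argues for.
    """
    labels, next_label = {}, 0
    for seed in foreground:
        if seed in labels:
            continue
        queue = deque([seed])
        labels[seed] = next_label
        while queue:                       # breadth-first flood fill
            p = queue.popleft()
            for q in neighbors[p]:
                if q in foreground and q not in labels:
                    labels[q] = next_label
                    queue.append(q)
        next_label += 1
    return labels

# Tiny hand-built graph standing in for a logmap sensor.
nbrs = {0: [1], 1: [0, 2], 2: [1], 3: []}
print(connected_components(nbrs, {0, 1, 3}))  # e.g. {0: 0, 1: 0, 3: 1}
```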

Journal ArticleDOI
TL;DR: It is argued that working in HSI space offers an effective method for segmenting scenes in the presence of confounding cues due to shading, transparency, highlights, and shadows; an analog CMOS VLSI circuit with on-board phototransistor input that computes normalized color and hue was designed and fabricated.
Abstract: Standard techniques for segmenting color images are based on finding normalized RGB discontinuities, color histogramming, or clustering techniques in RGB or CIE color spaces. The use of the psychophysical variable hue in HSI space has not been popular due to its numerical instability at low saturations. In this article, we propose the use of a simplified hue description suitable for implementation in analog VLSI. We demonstrate that if the integrated white condition holds, hue is invariant to certain types of highlights, shading, and shadows. This is due to the additive/shift invariance property, a property that other color variables lack. The more restrictive uniformly varying lighting model associated with the multiplicative/scale invariance property shared by both hue and normalized RGB allows invariance to transparencies, and to simple models of shading and shadows. Using binary hue discontinuities in conjunction with a first-order type of surface interpolation, we demonstrate these invariant properties and compare them against the performance of RGB, normalized RGB, and CIE color spaces. We argue that working in HSI space offers an effective method for segmenting scenes in the presence of confounding cues due to shading, transparency, highlights, and shadows. Based on this work, we designed and fabricated for the first time an analog CMOS VLSI circuit with on-board phototransistor input that computes normalized color and hue.
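
The additive/shift invariance the abstract appeals to can be checked directly: the standard opponent-axis hue formula depends only on channel differences, so adding the same offset to all three channels leaves it unchanged, and positive scaling does too. A small check (this particular formula is my choice of a standard hue definition, not necessarily the circuit's):

```python
import numpy as np

def hue(r, g, b):
    """Hue via the standard opponent-axis formula. Both arguments
    of arctan2 depend only on channel differences, so an equal
    offset on R, G, B (the additive/shift invariance) cancels, and
    a positive common scale (multiplicative/scale invariance)
    rescales both arguments equally, leaving the angle unchanged."""
    return np.arctan2(np.sqrt(3.0) * (g - b), 2.0 * r - g - b)

rgb = np.array([0.6, 0.3, 0.1])
print(hue(*rgb), hue(*(rgb + 0.2)))   # identical: shift invariance
print(hue(*rgb), hue(*(rgb * 1.5)))   # identical: scale invariance
```

Normalized RGB passes only the second test, which is the paper's point about hue being the stronger invariant.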

Journal ArticleDOI
TL;DR: A mathematical model of search efficiency is presented that identifies the factors affecting efficiency and can be used to predict their effects; it predicts that, in typical situations, indirect search provides up to an 8-fold increase in efficiency.
Abstract: When using a mobile camera to search for a target object, it is often important to maximize the efficiency of the search. We consider a method for increasing efficiency by searching only those subregions that are especially likely to contain the object. These subregions are identified via spatial relationships. Searches that use this method repeatedly find an “intermediate” object that commonly participates in a spatial relationship with the target object, and then look for the target in the restricted region specified by this relationship. Intuitively, such searches, called indirect searches, seem likely to provide efficiency increases when the intermediate objects can be recognized at low resolutions and hence can be found with little extra overhead, and when they significantly restrict the area that must be searched for the target. But what is the magnitude of this increase, and upon what other factors does efficiency depend? Although the idea of exploiting spatial relationships has been used in vision systems before, few have quantitatively examined these questions.
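
A back-of-the-envelope version of the argument (my own toy numbers and cost model, not the paper's) shows where the gain comes from: the intermediate object is found cheaply at low resolution, and the expensive high-resolution search is then confined to the restricted region.

```python
# Direct search scans the whole scene at target resolution; indirect
# search first finds the intermediate object (say, a table) at low
# resolution, then scans only the region the spatial relationship
# permits (say, 'on top of the table'). All values are illustrative.
scene_area      = 100.0   # arbitrary units
cost_high_res   = 1.0     # search cost per unit area at target resolution
cost_low_res    = 0.1     # cost per unit area when finding the intermediate
restricted_area = 10.0    # area permitted by the spatial relationship

direct   = scene_area * cost_high_res
indirect = scene_area * cost_low_res + restricted_area * cost_high_res
print(direct, indirect, direct / indirect)   # 100.0 20.0 5.0
```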

Journal ArticleDOI
TL;DR: An approach is presented for recovering surface shape from the occluding contour using an active (i.e., moving) observer; it does not require knowledge of the velocities or accelerations of the observer.
Abstract: We present an approach for recovering surface shape from the occluding contour using an active (i.e., moving) observer. It is based on a relation between the geometries of a surface in a scene and its occluding contour: If the viewing direction of the observer is along a principal direction for a surface point whose projection is on the contour, surface shape (i.e., curvature) at the surface point can be recovered from the contour. Unlike previous approaches for recovering shape from the occluding contour, we use an observer that purposefully changes viewpoint in order to achieve a well-defined geometric relationship with respect to a 3-D shape prior to its recognition. We show that there is a simple and efficient viewing strategy that allows the observer to align the viewing direction with one of the two principal directions for a point on the surface. This strategy depends only on curvature measurements on the occluding contour and therefore demonstrates that recovering quantitative shape information from the contour does not require knowledge of the velocities or accelerations of the observer. Experimental results demonstrate that our method can be easily implemented and can provide reliable shape information from the occluding contour.

Journal ArticleDOI
TL;DR: This work demonstrates that the algorithm developed by Horn and Weldon (1987), if appropriately modified, results in a robust algorithm that works in the case of general rigid motion with bounded rotation; it has the potential to replace expensive accelerometers, inertial systems, and inaccurate odometers in practical navigational systems for the problem of kinetic stabilization.
Abstract: If an observer is moving rigidly with bounded rotation then normal flow measurements (i.e., the spatiotemporal derivatives of the image intensity function) give rise to a constraint on the observer's translation. This novel constraint gives rise to a robust, qualitative solution to the problem of recovering the observer's heading direction, by providing an area where the Focus of Expansion lies. If the rotation of the observer is large then the solution area is large too, while small rotation causes the solution area to be small, thus giving rise to a robust solution. In the paper the relationship between the solution area and the rotation and translation vectors is studied and experimental results using synthetic and real calibrated image sequences are presented. This work demonstrates that the algorithm developed in (Horn and Weldon 1987) for the case of pure translation, if appropriately modified, results in a robust algorithm that works in the case of general rigid motion with bounded rotation. Consequently, it has the potential to replace expensive accelerometers, inertial systems and inaccurate odometers in practical navigational systems for the problem of kinetic stabilization, which is a prerequisite for any other navigational ability.
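
The constraint is easy to picture in the pure-translation case: image flow radiates from the Focus of Expansion, so the sign of each normal flow measurement confines the FOE to a half-plane, and intersecting the half-planes yields the solution area. The grid-based sketch below illustrates only that pure-translation case; in the paper's setting, bounded rotation enlarges the surviving region rather than shifting it.

```python
import numpy as np

def foe_region(points, normals, signs, grid):
    """Candidate focus-of-expansion region from normal flow signs.

    For purely translational motion toward the scene, the flow at
    pixel p points away from the FOE, so the measured normal flow
    along the unit gradient n satisfies
        sign(n . (p - FOE)) = sign of the normal flow.
    Each measurement is therefore a half-plane constraint; we keep
    the grid cells consistent with every measurement.
    """
    ok = np.ones(len(grid), dtype=bool)
    for p, n, s in zip(points, normals, signs):
        ok &= ((p - grid) @ n) * s >= 0    # keep the consistent half-plane
    return grid[ok]

# Synthetic check: flow radiating from a true FOE at the origin.
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, (200, 2))
nrm = rng.normal(size=(200, 2))
nrm /= np.linalg.norm(nrm, axis=1, keepdims=True)
sgn = np.sign(np.sum(pts * nrm, axis=1))   # sign of n . (p - FOE)
gx, gy = np.meshgrid(np.linspace(-1, 1, 41), np.linspace(-1, 1, 41))
cand = np.column_stack([gx.ravel(), gy.ravel()])
region = foe_region(pts, nrm, sgn, cand)
print(len(region), np.abs(region).max())   # small region around the origin
```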

Journal ArticleDOI
TL;DR: A procedure for color photometric stereo is described, which recovers the shape of a colored object from two or more color images of the object under white illumination, and the result is less sensitive to noise.
Abstract: Computer vision systems can be used to determine the shapes of real three-dimensional objects for purposes of object recognition and pose estimation or for CAD applications. One method that has been developed is photometric stereo. This method uses several images taken from the same viewpoint, but with different lightings, to determine the three-dimensional shape of an object. Most previous work in photometric stereo has been with gray-tone images; color images have only been used for dielectric materials. In this paper we describe a procedure for color photometric stereo, which recovers the shape of a colored object from two or more color images of the object under white illumination. This method can handle different types of materials, such as composites and metals, and can employ various reflection models such as the Lambertian, dichromatic, and Torrance-Sparrow models. For composite materials, colored metals, and dielectrics, there are two advantages of utilizing color information: at each pixel, there are more constraints on the orientation, and the result is less sensitive to noise. Consequently, the shape can be found more accurately. The method has been tested on both artificial and real images of objects of various materials, and on real images of a multi-colored object.
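
For orientation, the classical gray-tone Lambertian baseline that this color method generalizes fits in a few lines: with three or more known light directions, each pixel's albedo-scaled normal is a least-squares solve. This sketch is the standard textbook formulation, not the paper's color algorithm.

```python
import numpy as np

def lambertian_photometric_stereo(I, L):
    """Gray-tone Lambertian photometric stereo.

    I: (k, n) intensities of n pixels under k >= 3 images;
    L: (k, 3) unit directions of known distant lights.
    Solves I = L @ (rho * n) per pixel in the least-squares sense.
    """
    G, *_ = np.linalg.lstsq(L, I, rcond=None)   # (3, n) scaled normals
    rho = np.linalg.norm(G, axis=0)             # albedo per pixel
    normals = G / np.maximum(rho, 1e-12)        # unit surface normals
    return normals, rho

# Synthetic sanity check: one pixel with a known normal and albedo.
L = np.array([[0, 0, 1], [1, 0, 1], [0, 1, 1]], float)
L /= np.linalg.norm(L, axis=1, keepdims=True)
n_true, rho_true = np.array([0.0, 0.0, 1.0]), 0.8
I = (L @ n_true * rho_true)[:, None]
n_est, rho_est = lambertian_photometric_stereo(I, L)
print(n_est.ravel(), rho_est)   # ~ [0 0 1], ~ 0.8
```

The paper's observation is that color images give more such constraints per pixel (one per channel), which both over-determines the orientation and reduces sensitivity to noise.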

Journal ArticleDOI
TL;DR: This article presents initial results toward performing object recognition by using multiple observations to resolve ambiguities using a representation for planning that combines geometric information with viewpoint uncertainty and a sensor planner utilizing the representation was implemented.
Abstract: Most computer vision systems perform object recognition on the basis of the features extracted from a single image of the object. The problem with this approach is that it implicitly assumes that the available features are sufficient to determine the identity and pose of the object uniquely. If this assumption is not met, then the feature set is insufficient, and ambiguity results. Consequently, much research in computer vision has gone toward finding sets of features that are sufficient for specific tasks, with the result that each system has its own associated set of features. A single, general feature set would be desirable. However, research in automatic generation of object recognition programs has demonstrated that predetermined, fixed feature sets are often incapable of providing enough information to unambiguously determine either object identity or pose. One approach to overcoming the inadequacy of any feature set is to utilize multiple sensor observations obtained from different viewpoints, and combine them with knowledge of the 3-D structure of the object to perform unambiguous object recognition. This article presents initial results toward performing object recognition by using multiple observations to resolve ambiguities. Starting from the premise that sensor motions should be planned in advance, the difficulties involved in planning with ambiguous information are discussed. A representation for planning that combines geometric information with viewpoint uncertainty is presented. A sensor planner utilizing the representation was implemented, and the results of pose-determination experiments performed with the planner are discussed.

Journal ArticleDOI
TL;DR: A linear-time algorithm is presented for finding the axes of skew symmetry in O(n) time, where n is the number of contour points; it is especially suited to industrial applications where the degree of symmetry is often known a priori.
Abstract: Symmetry is pervasive in both man-made objects and nature. Since symmetries project to skew symmetries, finding axes of skew symmetry is an important vision task. This paper presents a linear time algorithm for finding the axes of skew symmetry, where the degree of symmetry is known. First, we present a review and critique of current methods for finding the axes of skew symmetry. Next, we decompose the problem of finding skew symmetry into the subproblems of solving for the rotational parameter of a “shear symmetry” and recovering the shear parameter of a reflexive symmetry. Using this approach, we derive a direct, non-heuristic moment-based technique for finding the axes of skew symmetry. For skew symmetric figures with degree of symmetry less than five we obtain a closed-form solution. The method does not rely on continuous contours but assumes there is no occlusion and requires knowing the contour's degree of symmetry. It is the first algorithm to find the axes of skew symmetry in O(n) time, where n is the number of contour points. The method is especially suited to industrial applications where the degree of symmetry is often known a priori. Examples of the method are presented for both real and synthetic images, and an error analysis of the method is given.

Journal ArticleDOI
TL;DR: This paper investigates the robustness of an algorithm due to Horn and Weldon (1988) for recovery of egomotion from optic-flow and shows that the algorithm can indeed be extended to cope with such uncertainty with graceful degradation in accuracy of estimated position of the focus of expansion.
Abstract: This paper investigates the robustness of an algorithm due to Horn and Weldon (1988) for recovery of egomotion from optic-flow. Assuming only normal components of flow vectors are available and that 3D angular velocity is known, tight constraints can be constructed on the direction of translational motion or, equivalently, on the focus of expansion. In practice however this is unacceptably restrictive. Some allowance must be made for uncertainty in angular velocity. We show that the algorithm can indeed be extended to cope with such uncertainty with graceful degradation in accuracy of estimated position of the focus of expansion. The shape of the error distribution depends on whether the focus of expansion is inside or outside the field of view. If it is inside, the error distribution is isotropic. As it moves outside the distribution becomes increasingly anisotropic. Results from an implementation of the algorithm confirm the validity of the error bounds.

Journal ArticleDOI
TL;DR: The design, performance, and application of The Real-time, Intelligently ControLled, Optical Positioning System (TRICLOPS), a multiresolution trinocular camera-pointing system which provides a center wide-angle view camera and two higher-resolution vergence cameras, are described.
Abstract: The design, performance, and application of The Real-time, Intelligently ControLled, Optical Positioning System (TRICLOPS) are described in this article. TRICLOPS is a multiresolution trinocular camera-pointing system which provides a center wide-angle view camera and two higher-resolution vergence cameras. It is a direct-drive system that exhibits dynamic performance comparable to the human visual system. The mechanical design and performance of various active vision systems are discussed and compared to those of TRICLOPS. The multiprocessor control system for TRICLOPS is described. The kinematics of the device are also discussed and calibration methods are given. Finally, as an example of real-time visual control, a problem in visual tracking with TRICLOPS is examined. In this example, TRICLOPS is shown to be capable of tracking a ball moving at 3 m/s, which results in rotational velocities of the vergence cameras in excess of 6 rad/s (344 deg/s).

Journal ArticleDOI
TL;DR: This paper synthesizes a new approach to shape recovery for 3-D object recognition that decouples recognition from localization by combining basic elements from these two approaches, and uses qualitative shape recovery and recognition techniques to provide strong fitting constraints on physics-based deformable model recovery techniques.
Abstract: Recent work in qualitative shape recovery and object recognition has focused on solving the “what is it” problem, while avoiding the “where is it” problem. In contrast, typical CAD-based recognition systems have focused on the “where is it” problem, while assuming they know what the object is. Although each approach addresses an important aspect of the 3-D object recognition problem, each falls short in addressing the complete problem of recognizing and localizing 3-D objects from a large database. In this paper, we first synthesize a new approach to shape recovery for 3-D object recognition that decouples recognition from localization by combining basic elements from these two approaches. Specifically, we use qualitative shape recovery and recognition techniques to provide strong fitting constraints on physics-based deformable model recovery techniques. Secondly, we extend our previously developed technique of fitting deformable models to occluding image contours to the case of image data captured under general orthographic, perspective, and stereo projections. On one hand, integrating qualitative knowledge of the object being fitted to the data, along with knowledge of occlusion supports a much more robust and accurate quantitative fitting. On the other hand, recovering object pose and quantitative surface shape not only provides a richer description for indexing, but supports interaction with the world when object manipulation is required. This paper presents the approach in detail and applies it to real imagery.

Journal ArticleDOI
TL;DR: An expression is obtained for the range of affine-invariant values that are consistent with a given set of four points, where each image point lies in an ε-disc of uncertainty.
Abstract: Affine transformations of the plane have been used in a number of model-based recognition systems. Because the underlying mathematics are based on exact data, in practice various heuristics are used to adapt the methods to real data where there is positional uncertainty. This paper provides a precise analysis of affine point matching under uncertainty. We obtain an expression for the range of affine-invariant values that are consistent with a given set of four points, where each image point lies in an ε-disc of uncertainty. This range is shown to depend on the actual x-y positions of the data points. In other words, given uncertainty in the data there are no representations that are invariant with respect to the Cartesian coordinate system of the data. This is problematic for methods, such as geometric hashing, that are based on affine-invariant representations. We also analyze the effect that uncertainty has on the probability that recognition methods using affine transformations will find false positive matches. We find that there is a significant probability of false positives with even moderate levels of sensor error, suggesting the importance of good verification techniques and good grouping techniques.
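
The affine invariants in question are the coordinates (alpha, beta) of a fourth point in the basis defined by three others. A quick Monte Carlo sketch (a numerical stand-in for the paper's analytic bound, using box rather than disc perturbations) shows how per-point uncertainty spreads into a range of invariant values:

```python
import numpy as np

def affine_coords(p1, p2, p3, p4):
    """Affine-invariant coordinates (alpha, beta) of p4 in the basis
    of p1, p2, p3: p4 = p1 + alpha*(p2 - p1) + beta*(p3 - p1).
    These are the invariants used by methods like geometric hashing."""
    A = np.column_stack([p2 - p1, p3 - p1])
    return np.linalg.solve(A, p4 - p1)

rng = np.random.default_rng(1)
pts = np.array([[0, 0], [10, 0], [0, 10], [4, 3]], float)
eps = 0.5
samples = []
for _ in range(2000):
    # eps-box perturbation per point (an eps-disc behaves similarly)
    noisy = pts + eps * rng.uniform(-1, 1, pts.shape)
    samples.append(affine_coords(*noisy))
samples = np.array(samples)
print(samples.min(0), samples.max(0))   # observed range of (alpha, beta)
```

Rerunning with the same eps but different absolute point positions changes the range, which is the paper's key negative result: the spread of the "invariant" depends on where the points sit in the image.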

Journal ArticleDOI
TL;DR: It is shown that all uncalibrated paraperspective images of an object can be constructed from a 3-D model of the object by applying an affine transformation to the model, and every affine image of the object represents some uncalibrated paraperspective image of the object.
Abstract: It is shown that the set of all paraperspective images with arbitrary reference point and the set of affine images of a 3-D object are identical. Consequently, all uncalibrated paraperspective images of an object can be constructed from a 3-D model of the object by applying an affine transformation to the model and every affine image of the object represents some uncalibrated paraperspective image of the object. It follows that the paraperspective images of an object can be expressed as linear combinations of any two non-degenerate images of the object. When the image position of the reference point is given the parameters of the affine transformation (and, likewise, the coefficients of the linear combinations) satisfy two quadratic constraints. Conversely, when the values of parameters are given the image position of the reference point is determined by solving a bi-quadratic equation.
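
The linear-combination property suggests a simple test: stack the coordinates of two model views (plus a constant column) as a basis and fit a novel view by least squares; for a true affine image the residual is near zero. A sketch under that affine-image assumption (illustrative, and ignoring the quadratic constraints the paper derives):

```python
import numpy as np

def combine_views(v1, v2, novel):
    """Fit a novel view as a linear combination of two model views.

    v1, v2, novel: (n, 2) image coordinates of the same n points.
    The basis columns x1, y1, x2, y2, 1 span all affine functionals
    of the 3-D points, so any affine image of the object fits with
    (numerically) zero residual.
    """
    B = np.column_stack([v1, v2, np.ones(len(v1))])
    coeff, res, *_ = np.linalg.lstsq(B, novel, rcond=None)
    return coeff, res

rng = np.random.default_rng(3)
P = rng.normal(size=(3, 8))                    # 8 rigid 3-D points

def affine_view(P, rng):
    A = rng.normal(size=(2, 3))                # arbitrary affine camera
    b = rng.normal(size=2)
    return (A @ P).T + b

v1, v2, novel = (affine_view(P, rng) for _ in range(3))
coeff, res = combine_views(v1, v2, novel)
print(res)   # ~0: the novel view lies in the span of the model views
```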

Journal ArticleDOI
TL;DR: A non-parametric analytical model of the intensity for a curved edge is considered, and the relations between the image data and some local characteristics of the edge are derived, in the discrete case, and this discrete approach corresponds to the notion of spatio-temporal surfaces in the continuous case.
Abstract: In this paper we consider a non-parametric analytical model of the intensity for a curved edge, and derive the relations between the image data and some local characteristics of the edge, in the discrete case. In order to identify this model we also study how to develop high order non-biased spatial derivative operators, with subpixel accuracy. In fact, this discrete approach corresponds to the notion of spatio-temporal surfaces in the continuous case, and provides a way to obtain some of the spatio-temporal parameters from an image sequence. An implementation is proposed, and experimental data are provided. Computed characteristics are subpixel localization, normal displacement between two frames, orientation and curvature, but the method is easy to extend to other geometrical or dynamical parameters of the edge. Results derived in this paper are always valid for step-like edges, but the computations of orientation and curvature are also valid for edges with more general profiles.

Journal ArticleDOI
TL;DR: This paper proves that all high order non-biased spatial intensity derivative operators in images can be computed using linear combinations of separable filters and discusses the performance of an edge detector using these derivatives for unsmoothed and smoothed step edges.
Abstract: In this paper we prove that all high order non-biased spatial intensity derivative operators in images can be computed using linear combinations of separable filters. The separable filters are the same as those used by Haralick (1984), but different linear combinations are taken. A comparison of the number of operations necessary to compute the derivatives using separable and non-separable filters is made. The conclusion of our analysis is that the optimal way to compute the needed derivatives depends on which derivatives we have to compute, on the size of the window and on the order of expansion. Finally, we discuss the performance of an edge detector using these derivatives for unsmoothed and smoothed step edges.
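
The saving from separability is concrete: convolving with two 1-D kernels of width w costs O(2w) multiplies per pixel versus O(w²) for the equivalent w×w 2-D filter. The sketch below computes a mixed derivative separably using simple central differences (illustrative only; the paper's operators are higher-order and differently weighted):

```python
import numpy as np
from scipy.ndimage import convolve1d

# Separable computation of d^2 I / (dx dy): convolve rows with a 1-D
# derivative kernel, then columns with the same kernel.
d = np.array([-0.5, 0.0, 0.5])   # 1-D central-difference kernel

def dxy(image):
    tmp = convolve1d(image.astype(float), d, axis=1)  # d/dx along rows
    return convolve1d(tmp, d, axis=0)                 # then d/dy

# For I(x, y) = x*y the mixed derivative is exactly 1, which the
# separable filter reproduces at interior pixels.
I = np.fromfunction(lambda y, x: x * y, (5, 5), dtype=float)
print(dxy(I))
```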

Journal ArticleDOI
TL;DR: It is demonstrated in this paper that this constrained pose refinement formulation is no more difficult to solve than the original problem, using numerical analysis techniques including active set methods and Lagrange multiplier analysis.
Abstract: Pose refinement is an essential task for computer vision systems that require the calibration and verification of model and camera parameters. Typical domains include the real-time tracking of objects and verification in model-based recognition systems. A technique is presented for recovering model and camera parameters of 3D objects from a single two-dimensional image. This basic problem is further complicated by the incorporation of simple bounds on the model and camera parameters and linear constraints restricting some subset of object parameters to a specific relationship. It is demonstrated in this paper that this constrained pose refinement formulation is no more difficult to solve than the original problem, using numerical analysis techniques including active set methods and Lagrange multiplier analysis. A number of bounded and linearly constrained parametric models are tested, and convergence to proper values occurs from a wide range of initial error, utilizing minimal matching information (relative to the number of parameters and components). The ability to recover model parameters in a constrained search space will thus simplify associated object recognition problems.
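
In modern terms, the bounded variant of such a problem can be prototyped with an off-the-shelf box-constrained least-squares solver. This is an illustration of the formulation only, not the paper's active-set and Lagrange-multiplier machinery, and the one-parameter "model" is a toy stand-in for full camera and model parameters.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(p, observed):
    # Toy "reprojection" residual: a single rotation angle predicts
    # a 2-D direction, compared against the observed direction.
    angle = p[0]
    predicted = np.array([np.cos(angle), np.sin(angle)])
    return predicted - observed

obs = np.array([0.8, 0.6])
fit = least_squares(residuals, x0=[0.1],
                    bounds=([0.0], [np.pi / 2]),   # simple box bounds
                    args=(obs,))
print(fit.x)   # ~0.6435 rad, recovered within the bounds
```

Restricting the search space this way is exactly why, as the abstract argues, constraints make the downstream recognition problem easier rather than harder.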

Journal ArticleDOI
TL;DR: It is proved that the mirror uncertainty for the three view problem also exists for a long sequence: if shape S is a solution, so is its mirror image S′, which is symmetric to S about the image plane.
Abstract: This paper presents new forms of necessary and sufficient conditions for determining shape and motion to within a mirror uncertainty from monocular orthographic projections of any number of point trajectories over any number of views. The new forms of conditions use image data only and can therefore be employed in any practical algorithms for shape and motion estimation. We prove that the mirror uncertainty for the three view problem also exists for a long sequence: if shape S is a solution, so is its mirror image S′, which is symmetric to S about the image plane. The necessary and sufficient conditions for determining the two sets of solutions are associated with the rank of the measurement matrix W.
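
The measurement matrix W stacks the registered (centroid-subtracted) image coordinates of the tracked points, two rows per view. Under orthographic projection of a rigid scene its rank is at most three, and that rank condition is easy to verify numerically. A synthetic check (illustrative construction in the factorization style, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.normal(size=(3, 10))          # 10 rigid 3-D points
blocks = []
for _ in range(5):                    # 5 views
    # Two orthonormal rows of a random rotation: an orthographic camera.
    R = np.linalg.qr(rng.normal(size=(3, 3)))[0][:2]
    x = R @ S
    blocks.append(x - x.mean(axis=1, keepdims=True))  # register to centroid
W = np.vstack(blocks)                 # (2F, P) measurement matrix
print(np.linalg.matrix_rank(W))       # -> 3, the rank condition on W
```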