
Showing papers on "Feature (computer vision) published in 1995"


Journal ArticleDOI
TL;DR: Compared to classic approaches making use of Newton's method, POSIT does not require starting from an initial guess, and computes the pose using an order of magnitude fewer floating point operations; it may therefore be a useful alternative for real-time operation.
Abstract: In this paper, we describe a method for finding the pose of an object from a single image. We assume that we can detect and match in the image four or more noncoplanar feature points of the object, and that we know their relative geometry on the object. The method combines two algorithms; the first algorithm, POS (Pose from Orthography and Scaling), approximates the perspective projection with a scaled orthographic projection and finds the rotation matrix and the translation vector of the object by solving a linear system; the second algorithm, POSIT (POS with ITerations), uses in its iteration loop the approximate pose found by POS in order to compute better scaled orthographic projections of the feature points, then applies POS to these projections instead of the original image projections. POSIT converges to accurate pose measurements in a few iterations. POSIT can be used with many feature points at once for added insensitivity to measurement errors and image noise. Compared to classic approaches making use of Newton's method, POSIT does not require starting from an initial guess, and computes the pose using an order of magnitude fewer floating point operations; it may therefore be a useful alternative for real-time operation. When speed is not an issue, POSIT can be written in 25 lines or less in Mathematica; the code is provided in an Appendix.
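The POS step and its perspective-correction loop are simple enough to sketch. Below is a minimal, hedged NumPy rendition; the function name, fixed iteration count, and use of a pseudo-inverse are our simplifications, not the paper's exact formulation:

```python
import numpy as np

def posit(object_pts, image_pts, focal, n_iter=20):
    """Minimal POSIT sketch: object_pts (N,3), image_pts (N,2), N >= 4 noncoplanar.
    The first point is used as the reference point M0."""
    M = object_pts - object_pts[0]            # vectors from the reference point M0
    A = np.linalg.pinv(M[1:])                 # (3, N-1) pseudo-inverse of the object matrix
    eps = np.zeros(len(object_pts) - 1)       # epsilon_i = (M0Mi . k) / Z0, zero at start
    for _ in range(n_iter):
        # POS step on the current scaled-orthographic corrections
        xs = image_pts[1:, 0] * (1 + eps) - image_pts[0, 0]
        ys = image_pts[1:, 1] * (1 + eps) - image_pts[0, 1]
        I, J = A @ xs, A @ ys                 # I = (f/Z0) i, J = (f/Z0) j
        s = np.sqrt(np.linalg.norm(I) * np.linalg.norm(J))  # scale f/Z0
        i, j = I / np.linalg.norm(I), J / np.linalg.norm(J)
        k = np.cross(i, j)
        k /= np.linalg.norm(k)
        eps = (M[1:] @ k) * s / focal         # refine the perspective correction
    R = np.vstack([i, j, k])                  # rows are the camera axes in object frame
    t = np.array([image_pts[0, 0], image_pts[0, 1], focal]) / s
    return R, t
```

With well-separated noncoplanar points the loop settles in a handful of iterations, mirroring the paper's claim of fast convergence without an initial guess.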

1,195 citations


Proceedings ArticleDOI
20 Jun 1995
TL;DR: This paper analyzes geometric active contour models discussed previously from a curve evolution point of view and proposes some modifications based on gradient flows relative to certain new feature-based Riemannian metrics, leading to a novel snake paradigm in which the feature of interest may be considered to lie at the bottom of a potential well.
Abstract: In this paper, we analyze the geometric active contour models discussed previously from a curve evolution point of view and propose some modifications based on gradient flows relative to certain new feature-based Riemannian metrics. This leads to a novel snake paradigm in which the feature of interest may be considered to lie at the bottom of a potential well. Thus the snake is attracted very naturally and efficiently to the desired feature. Moreover, we consider some 3-D active surface models based on these ideas.

754 citations


Journal ArticleDOI
TL;DR: A modified box-counting approach is proposed to estimate the FD, in combination with feature smoothing in order to reduce spurious regions and to segment a scene into the desired number of classes, an unsupervised K-means like clustering approach is used.
Abstract: This paper deals with the problem of recognizing and segmenting textures in images. For this purpose the authors employ a technique based on the fractal dimension (FD) and the multi-fractal concept. Six FD features are based on the original image, the above average/high gray level image, the below average/low gray level image, the horizontally smoothed image, the vertically smoothed image, and the multi-fractal dimension of order two. A modified box-counting approach is proposed to estimate the FD, in combination with feature smoothing in order to reduce spurious regions. To segment a scene into the desired number of classes, an unsupervised K-means like clustering approach is used. Mosaics of various natural textures from the Brodatz album as well as microphotographs of thin sections of natural rocks are considered, and the segmentation results show the efficiency of the technique. Supervised techniques such as minimum-distance and k-nearest neighbor classification are also considered. The results are compared with other techniques.
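As an illustration of the general idea (a plain differential box-counting estimator, not the authors' exact modification), the FD of a grey-level image viewed as an intensity surface can be sketched as:

```python
import numpy as np

def box_counting_fd(img, sizes=(2, 4, 8, 16)):
    """Differential box-counting estimate of the fractal dimension of a
    grey-level image. `sizes` are the box side lengths in pixels."""
    counts = []
    for s in sizes:
        h, w = img.shape
        H, W = (h // s) * s, (w // s) * s        # crop to a multiple of s
        blocks = img[:H, :W].reshape(H // s, s, W // s, s)
        gmax = blocks.max(axis=(1, 3)).astype(float)
        gmin = blocks.min(axis=(1, 3)).astype(float)
        # boxes of side s stacked to cover the intensity range in each cell
        counts.append(np.ceil((gmax - gmin + 1) / s).sum())
    # FD is the slope of log N(s) against log(1/s)
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope
```

A perfectly flat image yields FD close to 2 (a plane), while rough textures push the estimate toward 3.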

650 citations


Patent
Michael J. Black1, Yaser Yacoob1
15 Dec 1995
TL;DR: In this article, a planar model is used to recover motion parameters that estimate motion between the segmented face region in the first image and a second image in the sequence of images.
Abstract: A system tracks human head and facial features over time by analyzing a sequence of images. The system provides descriptions of motion of both head and facial features between two image frames. These descriptions of motion are further analyzed by the system to recognize facial movement and expression. The system analyzes motion between two images using parameterized models of image motion. Initially, a first image in a sequence of images is segmented into a face region and a plurality of facial feature regions. A planar model is used to recover motion parameters that estimate motion between the segmented face region in the first image and a second image in the sequence of images. The second image is warped or shifted back towards the first image using the estimated motion parameters of the planar model, in order to model the facial features relative to the first image. An affine model and an affine model with curvature are used to recover motion parameters that estimate the image motion between the segmented facial feature regions and the warped second image. The recovered motion parameters of the facial feature regions represent the relative motions of the facial features between the first image and the warped image. The face region in the second image is tracked using the recovered motion parameters of the face region. The facial feature regions in the second image are tracked using both the recovered motion parameters for the face region and the motion parameters for the facial feature regions. The parameters describing the motion of the face and facial features are filtered to derive mid-level predicates that define facial gestures occurring between the two images. These mid-level predicates are evaluated over time to determine facial expression and gestures occurring in the image sequence.
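The affine motion model at the heart of such feature tracking can be illustrated with a hedged sketch. The patent estimates parameters directly from image brightness; this toy version instead fits the six affine parameters to point correspondences by least squares:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine motion: dst ~ src @ A.T + t, with src, dst of shape (N,2).
    Returns the 2x2 linear part A and the translation t."""
    n = len(src)
    M = np.zeros((2 * n, 6))
    M[0::2, 0:2] = src          # rows predicting the x coordinates
    M[0::2, 2] = 1.0
    M[1::2, 3:5] = src          # rows predicting the y coordinates
    M[1::2, 5] = 1.0
    p, *_ = np.linalg.lstsq(M, dst.reshape(-1), rcond=None)
    A = np.array([[p[0], p[1]], [p[3], p[4]]])
    t = np.array([p[2], p[5]])
    return A, t
```

Warping the second image back with the recovered face-region parameters, as the patent describes, then leaves only the residual facial-feature motion to be modeled.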

373 citations


Patent
21 Feb 1995
TL;DR: The preferred embodiment of the present invention operates on a system (100) of interconnected gaming machines (101-108) and selectively provides one or more active features to all of the gaming machines linked to the system as mentioned in this paper.
Abstract: The preferred embodiment of the present invention operates on a system (100) of interconnected gaming machines (101-108) and selectively provides one or more active features to all of the gaming machines (101-108) linked to the system (100). The enablement of the feature may be based on the combined results from previous plays on the individual gaming machines (101-108). The feature may be a bonus award for a specific displayed combination, an increased possibility for a winning combination, or any other feature which alters the normal operation of the gaming machines (101-108). The first machine (101-108) to generate a winning game result to which the feature applies will be given an award based upon the feature. The feature is then disabled and the system returns to normal play mode until the feature is again enabled. Various other features are described.

360 citations


Proceedings Article
20 Aug 1995
TL;DR: This work introduces compound operators that dynamically change the topology of the search space to better utilize the information available from the evaluation of feature subsets and shows that compound operators unify previous approaches that deal with relevant and irrelevant features.
Abstract: In the wrapper approach to feature subset selection, a search for an optimal set of features is made using the induction algorithm as a black box. The estimated future performance of the algorithm is the heuristic guiding the search. Statistical methods for feature subset selection including forward selection, backward elimination, and their stepwise variants can be viewed as simple hill-climbing techniques in the space of feature subsets. We utilize best-first search to find a good feature subset and discuss overfitting problems that may be associated with searching too many feature subsets. We introduce compound operators that dynamically change the topology of the search space to better utilize the information available from the evaluation of feature subsets. We show that compound operators unify previous approaches that deal with relevant and irrelevant features. The improved feature subset selection yields significant improvements for real-world datasets when using the ID3 and the Naive-Bayes induction algorithms.
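A stripped-down wrapper with plain forward selection (the paper's best-first search and compound operators are omitted here) might look like the following; the evaluation function stands in for cross-validated accuracy of the black-box induction algorithm:

```python
def forward_select(features, evaluate):
    """Greedy forward selection in the wrapper setting. `evaluate` maps a
    feature subset (list) to a score, e.g. estimated accuracy of the
    induction algorithm; it is treated as a black box."""
    selected, best = [], float("-inf")
    remaining = list(features)
    while remaining:
        # try adding each remaining feature and keep the best single addition
        score, f = max((evaluate(selected + [f]), f) for f in remaining)
        if score <= best:        # hill-climbing stop: no single addition helps
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best
```

Backward elimination is the mirror image (start from all features, drop one at a time); both are the simple hill-climbers the abstract describes.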

358 citations




Patent
08 Jun 1995
TL;DR: In this article, an imaging device includes an array of plural imaging elements each of which is responsive to incident light flux to provide an output signal, and each of the imaging elements includes provision for conducting a variable time integration of incident light, and alternatively, also for selecting a time interval during which each imaging elements simultaneously conducts such a time integration.
Abstract: An imaging device includes an array of plural imaging elements each of which is responsive to incident light flux to provide an output signal. Each of the imaging elements includes provision for conducting a variable time integration of incident light flux, and alternatively, also for selecting a time interval during which each of the imaging elements simultaneously conducts such a time integration of incident light flux (i.e., takes a snap shot of an image scene). The imaging device includes provision for random access of each image element or group of image elements in the array so that output signals indicative of all or of only selected parts of an imaged scene can be processed for their image information, if desired. The other parts of an imaged scene may not be considered or may be considered for their image information at a lower sampling rate than the selected parts of the scene so that image information about the selected parts of the image scene can be accessed at a much higher rate than is conventionally possible. A variable gain feature allows selective canceling of fixed-pattern noise, interference, or unwanted image information. An anti-blooming feature prevents charge from an excessively bright image source from cascading across the array. Also, a control cache memory allows control commands to be fed to the device at a high rate and to be implemented at a slower rate on a first-in, first-out basis.

260 citations


Journal ArticleDOI
TL;DR: A system to read automatically the Italian license number of a car passing through a tollgate using a CCTV camera and a frame grabber card to acquire a rear-view image of the vehicle is presented.
Abstract: A system for the recognition of car license plates is presented. The aim of the system is to read automatically the Italian license number of a car passing through a tollgate. A CCTV camera and a frame grabber card are used to acquire a rear-view image of the vehicle. The recognition process consists of three main phases. First, a segmentation phase locates the license plate within the image. Then, a procedure based upon feature projection estimates some image parameters needed to normalize the license plate characters. Finally, the character recognizer extracts some feature points and uses template matching operators to get a robust solution under multiple acquisition conditions. A test has been done on more than three thousand real images acquired under different weather and illumination conditions, thus obtaining a recognition rate close to 91%.
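The feature-projection idea can be illustrated with a hedged sketch that splits a binarized plate strip into character candidates via its vertical projection profile (a simplification of the paper's procedure; the zero-threshold rule is ours):

```python
import numpy as np

def column_segments(binary):
    """Split a binarized plate strip (2-D 0/1 array) into character candidates
    using the vertical projection profile (per-column ink counts)."""
    cols = binary.sum(axis=0)
    segs, start = [], None
    for i, v in enumerate(cols):
        if v > 0 and start is None:
            start = i                      # entering an inked run
        elif v == 0 and start is not None:
            segs.append((start, i))        # leaving an inked run
            start = None
    if start is not None:
        segs.append((start, len(cols)))
    return segs
```

The row projection locates the text band the same way; together the two profiles give the normalization parameters the abstract mentions.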

258 citations


Proceedings ArticleDOI
15 Sep 1995
TL;DR: This paper first adopts a computer vision technique called snakes to reduce the burden of feature specification, and proposes the use of multilevel free-form deformations (MFFD) to achieve C²-continuous and one-to-one warps among feature point pairs.

Abstract: This paper presents new solutions to the following three problems in image morphing: feature specification, warp generation, and transition control. To reduce the burden of feature specification, we first adopt a computer vision technique called snakes. We next propose the use of multilevel free-form deformations (MFFD) to achieve C²-continuous and one-to-one warps among feature point pairs. The resulting technique, based on B-spline approximation, is simpler and faster than previous warp generation methods. Finally, we simplify the MFFD method to construct C²-continuous surfaces for deriving transition functions to control geometry and color blending.

256 citations


Patent
10 Oct 1995
TL;DR: In this paper, a process for identifying a single item from a family of items presents a user with a feature screen having a series of groupings, each grouping represents a feature having a set of alternatives from which to select.
Abstract: A process for identifying a single item from a family of items presents a user with a feature screen having a series of groupings. Each grouping represents a feature having a set of alternatives from which to select. Selected alternatives are used as a selection criteria in a search operation. The result of the search operation is a revised feature screen indicating alternatives that remain available to the user for further selection and searching. The feature screen and search process, therefore, presents the user with a guided nonhierarchical parametric search to identify matching items based upon user specified criteria and priorities. Also disclosed is an adaptation of the claimed method and system appropriate in an Internet environment.
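One refinement step of such a guided parametric search might be sketched as follows; the data layout (items as feature-to-alternative dicts) and names are hypothetical:

```python
def refine(items, criteria):
    """One step of guided parametric search: filter `items` (dicts mapping
    feature -> alternative) by `criteria` (feature -> set of accepted
    alternatives), then report which alternatives remain selectable, as on
    the revised feature screen."""
    hits = [it for it in items
            if all(it.get(f) in alts for f, alts in criteria.items())]
    remaining = {}
    for it in hits:
        for f, v in it.items():
            remaining.setdefault(f, set()).add(v)
    return hits, remaining
```

Because the remaining alternatives are recomputed after every selection, the user can narrow features in any order, which is what makes the search nonhierarchical.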


Patent
30 Aug 1995
TL;DR: In this paper, a system for automatically detecting and recognizing the identity of a deformable object such as a human face within an arbitrary image scene is presented, which consists of an object detector, implemented as a probabilistic DBNN, for determining whether the object is within the arbitrary image scene, and a feature localizer, also implemented as a probabilistic DBNN, for determining the position of an identifying feature on the object.
Abstract: A system for automatically detecting and recognizing the identity of a deformable object such as a human face, within an arbitrary image scene. The system comprises an object detector implemented as a probabilistic DBNN, for determining whether the object is within the arbitrary image scene and a feature localizer also implemented as a probabilistic DBNN, for determining the position of an identifying feature on the object such as the eyes. A feature extractor is coupled to the feature localizer and receives coordinates sent from the feature localizer which are indicative of the position of the identifying feature and also extracts from the coordinates information relating to other features of the object such as the eyebrows and nose, which are used to create a low resolution image of the object. A probabilistic DBNN based object recognizer for determining the identity of the object receives the low resolution image of the object inputted from the feature extractor to identify the object.

Journal ArticleDOI
TL;DR: A novel method for efficient image analysis that uses tuned matched Gabor filters that requires no a priori knowledge of the analyzed image so that the analysis is unsupervised.
Abstract: Recent studies have confirmed that the multichannel Gabor decomposition represents an excellent tool for image segmentation and boundary detection. Unfortunately, this approach when used for unsupervised image analysis tasks imposes excessive storage requirements due to the nonorthogonality of the basis functions and is computationally highly demanding. In this correspondence, we propose a novel method for efficient image analysis that uses tuned matched Gabor filters. The algorithmic determination of the parameters of the Gabor filters is based on the analysis of spectral feature contrasts obtained from iterative computation of pyramidal Gabor transforms with progressive dyadic decrease of elementary cell sizes. The method requires no a priori knowledge of the analyzed image so that the analysis is unsupervised. Computer simulations applied to different classes of textures illustrate the matching property of the tuned Gabor filters derived using our determination algorithm. Also, their capability to extract significant image information and thus enable an easy and efficient low-level image analysis will be demonstrated.
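For reference, an even (cosine-phase) Gabor kernel of the kind being tuned can be generated as follows; this is a common textbook parameterization, not necessarily the one used by the authors:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Even (cosine-phase) Gabor filter: a Gaussian-windowed plane wave.
    `theta` is the orientation in radians, `wavelength` the carrier period
    in pixels, `sigma` the envelope width."""
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength)
```

Tuning the filter to a texture then amounts to choosing `wavelength`, `theta`, and `sigma` so that the filter's passband sits on the texture's dominant spectral peak.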

Patent
25 May 1995
TL;DR: In this article, a method for recognizing handwritten characters in response to an input signal from a handwriting transducer is described, which relies on static or shape information, wherein the temporal order in which points are captured by an electronic tablet may be disregarded.
Abstract: Methods and apparatus are disclosed for recognizing handwritten characters in response to an input signal from a handwriting transducer. A feature extraction and reduction procedure is disclosed that relies on static or shape information, wherein the temporal order in which points are captured by an electronic tablet may be disregarded. A method of the invention generates and processes the tablet data with three independent sets of feature vectors which encode the shape information of the input character information. These feature vectors include horizontal (x-axis) and vertical (y-axis) slices of a bit-mapped image of the input character data, and an additional feature vector to encode an absolute y-axis displacement from a baseline of the bit-mapped image. It is shown that the recognition errors that result from the spatial or static processing are quite different from those resulting from temporal or dynamic processing. Furthermore, it is shown that these differences complement one another. As a result, a combination of these two sources of feature vector information provides a substantial reduction in an overall recognition error rate. Methods to combine probability scores from the dynamic and static character models are also disclosed.

Journal ArticleDOI
TL;DR: An automated approach to register CT and MR brain images that achieves an accurate match even if the scanned volumes to be matched do not completely overlap, or if some of the features in the images are not similar.
Abstract: Describes an automated approach to register CT and MR brain images. Differential operators in scale space are applied to each type of image data, so as to produce feature images depicting "ridgeness". The resulting CT and MR feature images show similarities which can be used for matching. No segmentation is needed and the method is devoid of human interaction. The matching is accomplished by hierarchical correlation techniques. Results of 2-D and 3-D matching experiments are presented. The correlation function ensures an accurate match even if the scanned volumes to be matched do not completely overlap, or if some of the features in the images are not similar.
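The similarity measure underlying such correlation matching can be sketched as a plain normalized cross-correlation score between two equally sized feature images (the paper's hierarchical, windowed scheme is more elaborate):

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized feature images.
    Returns a value in [-1, 1]; 1 means a perfect linear match."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float((a * b).mean())
```

Evaluating this score over a grid of candidate translations (coarse-to-fine) is the essence of hierarchical correlation registration.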

Proceedings ArticleDOI
20 Jun 1995
TL;DR: An algorithm is developed that uniquely recovers 3D surface profiles using a single virtual feature tracked from the occluding boundary of the object and a closed-form relation is derived between the image trajectory of a virtual feature and the geometry of the specular surface it travels on.
Abstract: A theoretical framework is introduced for the perception of specular surface geometry. When an observer moves in three-dimensional space, real scene features, such as surface markings, remain stationary with respect to the surfaces they belong to. In contrast, a virtual feature, which is the specular reflection of a real feature, travels on the surface. Based on the notion of caustics, a novel feature classification algorithm is developed that distinguishes real and virtual features from their image trajectories that result from observer motion. Next, using support functions of curves, a closed-form relation is derived between the image trajectory of a virtual feature and the geometry of the specular surface it travels on. It is shown that in the 2D case where camera motion and the surface profile are coplanar, the profile is uniquely recovered by tracking just two unknown virtual features. Finally, these results are generalized to the case of arbitrary 3D surface profiles that are travelled by virtual features when camera motion is not confined to a plane. An algorithm is developed that uniquely recovers 3D surface profiles using a single virtual feature tracked from the occluding boundary of the object. All theoretical derivations and proposed algorithms are substantiated by experiments.

Journal ArticleDOI
TL;DR: Spatial attention was measured in visual search tasks using a spatial probe as discussed by the authors, and it was shown that spatial attention was allocated to locations according to the presence of target features in conjunction search.
Abstract: Spatial attention was measured in visual search tasks using a spatial probe. Both speed and accuracy measures showed that in a conjunction task, spatial attention was allocated to locations according to the presence of target features. Also, contrary to some predictions, spatial attention was used when a clearly distinguishable feature defined the target. The results raise questions about any account that assumes separate mechanisms for feature and conjunction search. The probe method demonstrated here allows a very direct measurement of attentional allocation, and may uncover aspects of selection not revealed by visual search.

Journal ArticleDOI
TL;DR: An algorithm for pose estimation based on the volume measurement of tetrahedra composed of feature-point triplets extracted from an arbitrary quadrangular target and the lens center of the vision system is proposed.
Abstract: Pose estimation is an important operation for many vision tasks. In this paper, the authors propose an algorithm for pose estimation based on the volume measurement of tetrahedra composed of feature-point triplets extracted from an arbitrary quadrangular target and the lens center of the vision system. The inputs to this algorithm are the six distances joining all feature pairs and the image coordinates of the quadrangular target. The outputs of this algorithm are the effective focal length of the vision system, the interior orientation parameters of the target, the exterior orientation parameters of the camera with respect to an arbitrary coordinate system if the target coordinates are known in this frame, and the final pose of the camera. The authors have also developed a shape restoration technique which is applied prior to pose recovery in order to reduce the effects of inaccuracies caused by image projection. An evaluation of the method has shown that this pose estimation technique is accurate and robust. Because it is based on a unique and closed form solution, its speed makes it a potential candidate for solving a variety of landmark-based tracking problems.
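The basic volume measurement the method builds on is the scalar triple product; a minimal sketch:

```python
import numpy as np

def tetra_volume(a, b, c, d):
    """Volume of the tetrahedron with vertices a, b, c, d (3-vectors),
    computed from the scalar triple product of three edge vectors."""
    return abs(np.linalg.det(np.stack([b - a, c - a, d - a]))) / 6.0
```

In the paper's setting one vertex is the lens center and the other three are feature points of the quadrangular target, so each feature-point triplet contributes one such volume constraint.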

Book ChapterDOI
03 Apr 1995
TL;DR: A unifying definition and a classification scheme for existing VB matching criteria and a new matching criterion: the entropy of the grey-level scatter-plot, which requires no segmentation or feature extraction and no a priori knowledge of photometric model parameters.
Abstract: In this paper, 3D voxel-similarity-based (VB) registration algorithms that optimize a feature-space clustering measure are proposed to combine the segmentation and registration process. We present a unifying definition and a classification scheme for existing VB matching criteria and propose a new matching criterion: the entropy of the grey-level scatter-plot. This criterion requires no segmentation or feature extraction and no a priori knowledge of photometric model parameters. The effects of practical implementation issues concerning grey-level resampling, scatter-plot binning, Parzen windowing and resampling frequencies are discussed in detail and evaluated using real world data (CT and MRI).
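The proposed criterion, the entropy of the grey-level scatter-plot, is easy to sketch; the bin count below is an arbitrary choice, not the authors':

```python
import numpy as np

def joint_entropy(a, b, bins=32):
    """Entropy (in bits) of the grey-level scatter-plot (joint histogram)
    of two images of equal size."""
    h, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = h[h > 0] / h.sum()          # keep occupied cells, normalize to probabilities
    return float(-(p * np.log2(p)).sum())
```

When the images are well registered, corresponding grey levels cluster into few scatter-plot cells and the entropy is low; misalignment disperses the joint histogram and raises it, which is what makes the entropy usable as a matching criterion.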

Journal ArticleDOI
01 Feb 1995
TL;DR: This paper presents a technique that poses the vision sensor planning problem in an optimization setting and determines viewpoints that satisfy all previous requirements simultaneously and with a margin and presents experimental results of this technique when applied to a robotic vision system that consists of a camera mounted on a robot manipulator in a hand-eye configuration.
Abstract: The MVP (machine vision planner) model-based sensor planning system for robotic vision is presented. MVP automatically synthesizes desirable camera views of a scene based on geometric models of the environment, optical models of the vision sensors, and models of the task to be achieved. The generic task of feature detectability has been chosen since it is applicable to many robot-controlled vision systems. For such a task, features of interest in the environment are required to simultaneously be visible, inside the field of view, in focus, and magnified as required. In this paper, we present a technique that poses the vision sensor planning problem in an optimization setting and determines viewpoints that satisfy all previous requirements simultaneously and with a margin. In addition, we present experimental results of this technique when applied to a robotic vision system that consists of a camera mounted on a robot manipulator in a hand-eye configuration.

Patent
26 Jan 1995
TL;DR: In this article, an instrument is placed in relation to the designated object and which is capable of sending information about the object to a computer, which can be used as input to robotic devices or can be rendered, in various ways, to a human user.
Abstract: The invention comprises an instrument which is placed in relation to the designated object and which is capable of sending information about the object to a computer. Image processing methods are used to generate images of the object and determine positional information about it. This information can be used as input to robotic devices or can be rendered, in various ways (video graphics, speech synthesis), to a human user. Various input apparatus are attached to the transmitting or other instruments in use to provide control inputs to the computer.

Patent
31 Oct 1995
TL;DR: In this article, an optical system in which optical radiation containing a wavelength λ is directed onto a patterned mask, in order to form an image feature on a photoresist layer located on the image plane of the system is described.
Abstract: This invention involves an optical system in which optical radiation containing a wavelength λ is directed onto a patterned mask, in order to form an image feature on a photoresist layer located on the image plane of the system. The patterned mask has a main object feature, which has the form of the image feature. The object feature has a portion whose width is everywhere less than (1.5)λ/NA, where NA is the numerical aperture of the image side of the system. An assist feature whose width is everywhere less than (0.5)λ/NA is located on the mask in a neighborhood of the portion of the main object feature. Advantageously, the optical radiation is directed through an annular aperture ("off-axis illumination") in an opaque screen and through a collimating lens onto the mask. In one exemplary situation, the assist feature is located outside the main object feature and has a distance of closest approach to the main object feature that is everywhere less than or equal to λ/NA. In another exemplary situation, the assist feature is located inside the main object feature. In addition, advantageously the object and the assist features in the mask are defined by localized clear regions (optical transmission coefficient is equal to approximately unity) located in a relatively opaque field (optical transmission coefficient T=approximately 0.10), or are defined by localized relatively opaque regions located in a clear field, as the case may be.

Book
02 Jan 1995
TL;DR: In this paper, a robotic system for object recognition that uses passive stereo vision and active exploratory tactile sensing is described, where the complementary nature of these sensing modalities allows the system to discover the underlying 3D structure of the objects to be recognized.
Abstract: A robotic system for object recognition is described that uses passive stereo vision and active exploratory tactile sensing. The complementary nature of these sensing modalities allows the system to discover the underlying 3-D structure of the objects to be recognized. This structure is embodied in rich, hierarchical, viewpoint-independent 3-D models of the objects which include curved surfaces, concavities and holes. The vision processing provides sparse 3-D data about regions of interest that are then actively explored by the tactile sensor mounted on the end of a six-degree-of-freedom manipulator. A robust, hierarchical procedure has been developed to integrate the visual and tactile data into accurate 3-D surface and feature primitives. This integration of vision and touch provides geometric measures of the surfaces and features that are used in a matching phase to find model objects that are consistent with the sensory data. Methods for verification of the hypothesis are presented, including the sen...

Journal ArticleDOI
03 Apr 1995
TL;DR: With the techniques developed in this paper, interactive video, which transmits images of a patient to the expert and sends them back with some image overlay, can be realized.
Abstract: This paper presents computer vision based techniques for object registration, real-time tracking, and image overlay. The capability can be used to superimpose registered images such as those from CT or MRI onto a video image of a patient’s body. Real-time object registration enables an image to be overlaid consistently onto objects even while the object or the viewer is moving. The video image of a patient’s body is used as input for object registration. Reliable real-time object registration at frame rate (30 Hz) is realized by a combination of techniques, including template matching based feature detection, feature correspondence by geometric constraints, and pose calculation of objects from feature positions in the image. Two types of image overlay systems are presented. The first one registers objects in the image and projects preoperative model data onto a raw camera image. The other computes the position of image overlay directly from 2D feature positions without any prior models. With the techniques developed in this paper, interactive video, which transmits images of a patient to the expert and sends them back with some image overlay, can be realized.

Proceedings ArticleDOI
29 Oct 1995
TL;DR: This work presents a conceptual framework and a process model for feature extraction and iconic visualization, and describes some generic techniques to generate attribute sets, such as volume integrals and medial axis transforms.
Abstract: This paper presents a conceptual framework and a process model for feature extraction and iconic visualization. Feature extraction is viewed as a process of data abstraction, which can proceed in multiple stages and corresponding data abstraction levels. The features are represented by attribute sets, which play a key role in the visualization process. Icons are symbolic parametric objects, designed as visual representations of features. The attributes are mapped to the parameters (or degrees of freedom) of an icon. We describe some generic techniques to generate attribute sets, such as volume integrals and medial axis transforms. A simple but powerful modeling language was developed to create icons, and to link the attributes to the icon parameters. We present illustrative examples of iconic visualization created with the techniques described, showing the effectiveness of this approach.
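The attribute-to-parameter mapping described above can be sketched as a small lookup that linearly maps each feature attribute onto one degree of freedom of an icon. The attribute names, scales, and the arrow icon below are all hypothetical, not taken from the paper's modeling language.

```python
def map_attributes_to_icon(attrs, mapping):
    """Map a feature's attribute set onto icon parameters.

    attrs:   dict of attribute name -> scalar value
    mapping: dict of icon parameter -> (attribute name, scale, offset),
             i.e. a linear link from one attribute to one degree of freedom
    """
    return {param: attrs[name] * scale + offset
            for param, (name, scale, offset) in mapping.items()}

# Hypothetical feature attributes (e.g. from a volume integral):
feature = {"volume": 12.5, "mean_vorticity": 0.8}

# An arrow icon whose length encodes volume and whose hue encodes vorticity:
arrow_icon = map_attributes_to_icon(
    feature,
    {"length": ("volume", 0.1, 1.0),
     "hue":    ("mean_vorticity", 0.5, 0.0)})
```

Keeping the mapping as data (rather than code) mirrors the paper's idea of a modeling language that links attributes to icon parameters without touching the extraction stage.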

Journal ArticleDOI
TL;DR: In this paper, an original language for the symbolic representation of the contents of image sequences is presented, referred to as spatio-temporal logic, which allows for treatment and operation of content structures at a higher level than pixels or image features.
Abstract: The emergence of advanced multimedia applications is emphasizing the relevance of retrieval by contents within databases of images and image sequences. Matching the inherent visuality of the information stored in such databases, visual specification by example provides an effective and natural way to express content-oriented queries. To support this querying approach, the system must be able to interpret example scenes reproducing the contents of images and sequences to be retrieved, and to match them against the actual contents of the database. In the accomplishment of this task, to avoid a direct access to raw image data, the system must be provided with an appropriate description language supporting the representation of the contents of pictorial data. An original language for the symbolic representation of the contents of image sequences is presented. This language, referred to as spatio-temporal logic, comprises a framework for the qualitative representation of the contents of image sequences, which allows for treatment and operation of content structures at a higher level than pixels or image features. Organization and operation principles of a prototype system exploiting spatio-temporal logic to support querying by example through visual iconic interaction are expounded.
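The idea of qualitative relations above pixel level can be illustrated with toy spatial and temporal predicates over bounding boxes and frame indices. These predicates are an invented simplification for illustration; the paper's spatio-temporal logic is a richer formal framework.

```python
def left_of(a, b):
    """True if box a lies entirely left of box b (boxes as (x0, y0, x1, y1))."""
    return a[2] < b[0]

def before(track_a, track_b):
    """True if object a's last appearance precedes object b's first
    (tracks as lists of frame indices)."""
    return max(track_a) < min(track_b)

def holds_throughout(frames, a_id, b_id, relation):
    """Check that a spatial relation holds between two objects in every
    frame -- a simplified stand-in for a temporal 'always' operator.

    frames: list of dicts mapping object id -> bounding box
    """
    return all(relation(f[a_id], f[b_id]) for f in frames)
```

A query-by-example system could compile an example scene into a conjunction of such predicates and evaluate them against the symbolic descriptions stored for each sequence, never touching raw image data.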

Journal ArticleDOI
TL;DR: A new method for the detection of vanishing points based on sub-pixel line descriptions which recognizes the existence of errors in feature detection and which does not rely on supervision or the arbitrary specification of thresholds is presented.
Abstract: This paper presents a new method for the detection of vanishing points based on sub-pixel line descriptions which recognizes the existence of errors in feature detection and which does not rely on supervision or the arbitrary specification of thresholds. Image processing and image analysis are integrated into a coherent scheme which extracts straight line structure from images, develops a measure of line quality for each line, estimates the number of vanishing points and their approximate orientations, and then computes optimal vanishing point estimates through combined clustering and numerical optimization. Both qualitative and quantitative evaluation of the algorithm's performance is included in the presentation.
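The core geometric step, estimating a vanishing point from a cluster of lines, can be sketched in homogeneous coordinates: a point v lies on line l when l·v = 0, so the least-squares vanishing point of several lines is the smallest right singular vector of the stacked line matrix. This sketch omits the paper's line-quality weighting, clustering, and threshold-free estimation.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through two points."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(lines):
    """Least-squares intersection of homogeneous lines.

    Each line is scaled so its normal (a, b) has unit length, making the
    residual l . v a perpendicular distance; the best point is the right
    singular vector with the smallest singular value.
    """
    L = np.array([l / np.linalg.norm(l[:2]) for l in lines])
    _, _, vt = np.linalg.svd(L)
    v = vt[-1]
    return v[:2] / v[2]
```

With noisy line detections the smallest singular value no longer vanishes, and its magnitude gives a natural residual for deciding whether a cluster of lines really shares a vanishing point.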

Patent
19 Dec 1995
TL;DR: In this paper, a distortion-correction map is generated for each acquired calibration target image, and a combined map can be generated that both corrects image distortion, and transforms local camera coordinates into global coordinates.
Abstract: A method is provided for use in a multi-camera machine vision system wherein each of a plurality of cameras simultaneously acquires an image of a different portion of an object of interest. The invention makes it possible to precisely coordinate the fields of view of the plurality of cameras so that accurate measurements can be precisely performed across multiple fields of view, even in the presence of image distortion within each field of view. The method includes the steps of, at calibration-time, fixing the plurality of cameras with respect to a substantially rigid dimensionally-stable substrate including a plurality of calibration targets each having a reference feature. For each camera, an image of a calibration target is acquired to provide a plurality of acquired calibration target images. Then a distortion-correction map is generated for each acquired calibration target image. At run-time, for each camera, an image is acquired, at least two of the images including a portion of the object to provide a plurality of partial object images. These partial object images are then transformed by a distortion-correction map to provide a plurality of corrected partial object images. Next, relative displacement information is used to determine the relative displacement of a first point in a first corrected partial object image with respect to a second point in a second corrected partial object image. A combined map can be generated that both corrects image distortion, and transforms local camera coordinates into global coordinates.
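The combined map in the final step can be sketched as a function composition: a per-camera distortion-correction map followed by a rigid local-to-global transform. The interface below is an assumption for illustration; the patent does not prescribe this representation.

```python
import numpy as np

def make_combined_map(correct, R, t):
    """Compose a distortion-correction map with a local-to-global transform.

    correct: function mapping a raw pixel (x, y) to corrected local coords
    R, t:    2x2 rotation and 2-vector translation placing this camera's
             corrected frame into the shared global frame
    """
    def combined(p):
        local = np.asarray(correct(p), dtype=float)
        return R @ local + t
    return combined

# Toy example: identity correction, camera rotated 90 degrees and offset 10
# units along global x (all values hypothetical).
R = np.array([[0.0, -1.0], [1.0, 0.0]])
t = np.array([10.0, 0.0])
cam_map = make_combined_map(lambda p: p, R, t)
```

Building one such map per camera lets a measurement span two fields of view: each partial-object point is pushed through its own camera's combined map, and the displacement is then computed directly in global coordinates.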

Proceedings ArticleDOI
20 Jun 1995
TL;DR: The paper describes how image sequences taken by a moving video camera may be processed to detect and track moving objects against a moving background in real-time.
Abstract: The paper describes how image sequences taken by a moving video camera may be processed to detect and track moving objects against a moving background in real-time. The motion segmentation and shape tracking system is known as ASSET-2 (A Scene Segmenter Establishing Tracking, Version 2). Motion is found by tracking image features, and segmentation is based on first-order (i.e., six-parameter) flow fields. Shape tracking is performed using a two-dimensional radial map representation. The system runs in real-time, and is accurate and reliable. It requires no camera calibration and no knowledge of the camera's motion.
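A first-order (six-parameter) flow field as used for the segmentation above is the affine model u = a1 + a2·x + a3·y, v = a4 + a5·x + a6·y. Fitting it to tracked feature displacements is a linear least-squares problem; this sketch shows the fit only, not ASSET-2's segmentation or outlier handling.

```python
import numpy as np

def fit_affine_flow(pts, flow):
    """Least-squares fit of a six-parameter flow field to feature tracks.

    pts:  (N, 2) feature positions (x, y)
    flow: (N, 2) measured displacements (u, v) at those positions
    Returns the parameter vector (a1, a2, a3, a4, a5, a6) for
    u = a1 + a2*x + a3*y and v = a4 + a5*x + a6*y.
    """
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([np.ones_like(x), x, y])
    au, *_ = np.linalg.lstsq(A, flow[:, 0], rcond=None)
    av, *_ = np.linalg.lstsq(A, flow[:, 1], rcond=None)
    return np.concatenate([au, av])
```

Segmentation then proceeds by grouping features whose displacements are consistent with one fitted parameter set and re-fitting the remainder, so each independently moving region ends up with its own six parameters.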