
Showing papers on "Orientation (computer vision) published in 1997"


Journal ArticleDOI
TL;DR: A feature-based algorithm for detecting faces that is sufficiently generic and is also easily extensible to cope with more demanding variations of the imaging conditions is proposed.

422 citations


Journal ArticleDOI
TL;DR: This paper describes techniques to perform efficient and accurate target recognition in difficult domains using a version of the Hausdorff measure that incorporates both location and orientation information to determine which positions of each object model are reported as possible target locations.
Abstract: This paper describes techniques to perform efficient and accurate target recognition in difficult domains. In order to accurately model small, irregularly shaped targets, the target objects and images are represented by their edge maps, with a local orientation associated with each edge pixel. Three dimensional objects are modeled by a set of two-dimensional (2-D) views of the object. Translation, rotation, and scaling of the views are allowed to approximate full three-dimensional (3-D) motion of the object. A version of the Hausdorff measure that incorporates both location and orientation information is used to determine which positions of each object model are reported as possible target locations. These positions are determined efficiently through the examination of a hierarchical cell decomposition of the transformation space. This allows large volumes of the space to be pruned quickly. Additional techniques are used to decrease the computation time required by the method when matching is performed against a catalog of object models. The probability that this measure will yield a false alarm and efficient methods for estimating this probability at run time are considered in detail. This information can be used to maintain a low false alarm rate or to rank competing hypotheses based on their likelihood of being a false alarm. Finally, results of the system recognizing objects in infrared and intensity images are given.
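The orientation-augmented matching described above can be sketched as a directed Hausdorff distance over (x, y, θ) edge features. This is an illustrative sketch, not the paper's exact formulation: the weight `w` mixing pixel distance and radians of orientation mismatch is an assumption.

```python
import numpy as np

def orientation_difference(a, b):
    # undirected edge orientations wrap around at pi
    d = np.abs(a - b) % np.pi
    return np.minimum(d, np.pi - d)

def oriented_hausdorff(model, image, w=5.0):
    """Directed Hausdorff distance from model to image over oriented
    edge pixels; each row of the inputs is (x, y, theta).  The weight w
    converts orientation mismatch (radians) into pixel units."""
    loc = np.linalg.norm(model[:, None, :2] - image[None, :, :2], axis=2)
    ang = orientation_difference(model[:, None, 2], image[None, :, 2])
    combined = loc + w * ang            # one simple way to mix the two cues
    return combined.min(axis=1).max()   # worst best-match over model pixels
```

A candidate model position would be reported when this distance falls below a threshold; the paper's hierarchical cell decomposition prunes large regions of the transformation space rather than scanning positions exhaustively as a naive loop would.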

408 citations


Journal ArticleDOI
TL;DR: It is concluded that in the light of the vast hardware resources available in the ventral stream of the primate visual system relative to those exercised here, the appealingly simple feature-space conjecture remains worthy of serious consideration as a neurobiological model.
Abstract: Severe architectural and timing constraints within the primate visual system support the conjecture that the early phase of object recognition in the brain is based on a feedforward feature-extraction hierarchy. To assess the plausibility of this conjecture in an engineering context, a difficult three-dimensional object recognition domain was developed to challenge a pure feedforward, receptive-field-based recognition model called SEEMORE. SEEMORE is based on 102 viewpoint-invariant nonlinear filters that as a group are sensitive to contour, texture, and color cues. The visual domain consists of 100 real objects of many different types, including rigid (shovel), nonrigid (telephone cord), and statistical (maple leaf cluster) objects and photographs of complex scenes. Objects were individually presented in color video images under normal room lighting conditions. Based on 12 to 36 training views, SEEMORE was required to recognize unnormalized test views of objects that could vary in position, orientation in the image plane and in depth, and scale (factor of 2); for nonrigid objects, recognition was also tested under gross shape deformations. Correct classification performance on a test set consisting of 600 novel object views was 97 percent (chance was 1 percent) and was comparable for the subset of 15 nonrigid objects. Performance was also measured under a variety of image degradation conditions, including partial occlusion, limited clutter, color shift, and additive noise. Generalization behavior and classification errors illustrated the emergence of several striking natural shape categories that are not explicitly encoded in the dimensions of the feature space. It is concluded that in the light of the vast hardware resources available in the ventral stream of the primate visual system relative to those exercised here, the appealingly simple feature-space conjecture remains worthy of serious consideration as a neurobiological model.

371 citations


Journal ArticleDOI
TL;DR: A state-based technique for the representation and recognition of gesture is presented, using techniques for computing a prototype trajectory of an ensemble of trajectories and for defining configuration states along the prototype and for recognizing gestures from an unsegmented, continuous stream of sensor data.
Abstract: A state-based technique for the representation and recognition of gesture is presented. We define a gesture to be a sequence of states in a measurement or configuration space. For a given gesture, these states are used to capture both the repeatability and variability evidenced in a training set of example trajectories. Using techniques for computing a prototype trajectory of an ensemble of trajectories, we develop methods for defining configuration states along the prototype and for recognizing gestures from an unsegmented, continuous stream of sensor data. The approach is illustrated by application to a range of gesture-related sensory data: the two-dimensional movements of a mouse input device, the movement of the hand measured by a magnetic spatial position and orientation sensor, and, lastly, the changing eigenvector projection coefficients computed from an image sequence.
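A toy version of the state-sequence idea above can be sketched as follows. The paper derives states and their extents from an ensemble of training trajectories; here the state centres and the common acceptance radius are hypothetical fixed inputs, used only to illustrate recognizing a gesture as an ordered traversal of states.

```python
import numpy as np

def passes_through(stream, states, radius):
    """True if the trajectory visits each state centre in order, within
    the given radius -- a toy stand-in for matching an unsegmented
    sensor stream against a sequence of configuration states."""
    i = 0
    for pt in stream:
        if i < len(states) and np.linalg.norm(pt - states[i]) <= radius:
            i += 1                      # advance to the next expected state
    return i == len(states)
```

Run against a continuous stream, a gesture would be reported whenever the full state sequence is consumed; a reversed or different trajectory fails to advance through the states in order.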

339 citations


Patent
28 Nov 1997
TL;DR: In this article, the position and orientation of an ultrasound transducer are tracked in a frame of reference by a spatial determinator; this tracking information is used to generate processed images from the images acquired by the transducer.
Abstract: The present invention provides a system and method for visualizing internal images of an anatomical body. Internal images of the body are acquired by an ultrasound imaging transducer. The position and orientation of the ultrasound imaging transducer is tracked in a frame of reference by a spatial determinator. The position of the images in the frame of reference is determined by calibrating the ultrasound imaging transducer to produce a vector position of the images with respect to a fixed point on the transducer. This vector position can then be added to the position and orientation of the fixed point of the transducer in the frame of reference determined by the spatial determinator. The position and orientation of a medical instrument used on the patient are also tracked in the frame of reference by spatial determinators. The position and orientation of the instrument is mapped onto the position and orientation of the images. This information is used to generate processed images from the images acquired by the transducer. The processed images are generated from a view spatially related to the position of the instrument. The system is expandable so that more than one instrument and more than one transducer can be used.

313 citations


Journal ArticleDOI
TL;DR: This work addresses the problem of obtaining dense surface information from a sparse set of 3D data in the presence of spurious noise samples, and proposes to impose additional perceptual constraints such as good continuity and "cosurfacity" to not only infer surfaces, but also to detect surface orientation discontinuities, as well as junctions, all at the same time.
Abstract: We address the problem of obtaining dense surface information from a sparse set of 3D data in the presence of spurious noise samples. The input can be in the form of points, or points with an associated tangent or normal, allowing both position and direction to be corrupted by noise. Most approaches treat the problem as an interpolation problem, which is solved by fitting a surface such as a membrane or thin plate to minimize some function. We argue that these physical constraints are not sufficient, and propose to impose additional perceptual constraints such as good continuity and "cosurfacity". These constraints allow us to not only infer surfaces, but also to detect surface orientation discontinuities, as well as junctions, all at the same time. The approach imposes no restriction on genus, number of discontinuities, number of objects, and is noniterative. The result is in the form of three dense saliency maps for surfaces, intersections between surfaces (i.e., 3D curves), and 3D junctions, respectively. These saliency maps are then used to guide a "marching" process to generate a description (e.g., a triangulated mesh) making information about surfaces, space curves, and 3D junctions explicit. The traditional marching process needs to be refined as the polarity of the surface orientation is not necessarily locally consistent. These three maps are currently not integrated, and this is the topic of our ongoing research. We present results on a variety of computer-generated and real data, having varying curvature, of different genus, and multiple objects.

187 citations


Patent
02 Dec 1997
TL;DR: A display with an orientation-dependent rotatable image presents a properly oriented image in a first mounted fold-down position and a second upright table-top position, as mentioned in this paper; the display folds up into a base unit when not in use, for compact storage of the system.
Abstract: A display with an orientation-dependent rotatable image presents a properly oriented image in a first mounted fold-down position and a second upright table-top position. The display folds up into a base unit when not in use, for compact storage of the system. An orientation determining device is included for determining the current orientation of the display and properly orienting the image based on that current orientation. The orientation determining device is either a mechanically flipped switch, an automatic switch or an acceleration sensor. The display screen is preferably an LCD screen. Alternatively, the display screen is a light-valve type display including a grating light valve system. The display is for use in a television system, computer system, video phone or browser. Infrared input devices are used to control the display and provide data to the computer system. In an alternate embodiment, a touch-sensitive screen is also used as an input device.

183 citations


Patent
31 Jul 1997
TL;DR: In this article, a method and apparatus for automatically rotating a graphical user interface for managing portrait and landscape captures in an image capture unit is presented.
Abstract: The present invention provides a method and apparatus for automatically rotating a graphical user interface for managing portrait and landscape captures in an image capture unit. A method and apparatus for viewing an image in an image capture unit including a display comprises the steps of providing a first orientation associated with the image and providing a second orientation associated with the image capture unit. It is then determined whether the first orientation is different from the second orientation, and the image is displayed in the second orientation if the first and second orientations are different from each other.

170 citations


Journal ArticleDOI
TL;DR: In this paper, an image analysis technique using the Fourier transform of the image to evaluate orientation in a fibrous assembly is presented, and the results are compared with those for the tracking method presented in Part II.
Abstract: This paper addresses the development of an image analysis technique using the Fourier transform of the image to evaluate orientation in a fibrous assembly. The algorithms are evaluated using simulated images presented in Part I of the series. The results are compared with those for the tracking method presented in Part II.
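A minimal sketch of the Fourier approach, assuming a grey-level image whose spectral energy is binned by angle: fibres oriented at angle θ concentrate energy 90° away in the spectrum, so the peak of the angular histogram sits perpendicular to the dominant fibre direction. The bin count and folding convention here are illustrative choices, not the paper's.

```python
import numpy as np

def fft_orientation_histogram(img, nbins=180):
    """Angular distribution of spectral energy.  Fibres at orientation t
    concentrate power at t + 90 deg in the frequency plane, so the
    histogram peak is perpendicular to the dominant fibre direction."""
    F = np.fft.fftshift(np.fft.fft2(img - img.mean()))   # centre the DC term
    power = np.abs(F) ** 2
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    ang = np.arctan2(y - h // 2, x - w // 2) % np.pi     # fold to [0, pi)
    bins = (ang / np.pi * nbins).astype(int) % nbins
    return np.bincount(bins.ravel(), weights=power.ravel(), minlength=nbins)
```

For an image of vertical fibres, the energy lies along the horizontal frequency axis, so the peak lands in bin 0 and the recovered fibre orientation is the peak angle plus 90 degrees.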

151 citations


Journal ArticleDOI
TL;DR: In this article, a new edge-based approach for efficient image registration is proposed, which applies wavelet transform to extract a number of feature points as the basis for registration, each selected feature point is an edge point whose edge response is the maximum within a neighborhood.

144 citations


Journal ArticleDOI
TL;DR: A systematic procedure is proposed in this paper for generating good color stripe patterns, in which a pattern of color stripes is projected onto the objects when taking images, after which edge segments are extracted from the acquired stereo image pair and then used for finding the correct stereo correspondence.

Patent
30 Sep 1997
TL;DR: In this article, a flat panel display that can change three-dimensional orientation in a continuous way, such as when the display is horizontally rotated on a turntable, is presented.
Abstract: A system that includes a flat panel display that can change three-dimensional orientation in a continuous way, such as when the display is horizontally rotated on a turntable. Position or orientation of the display relative to a reference orientation is sensed by orientation sensors coupled to the display. A computer compares the orientation of the display to a fixed reference orientation. When the orientation of the display has changed from the reference, the computer maps the orientation of a user interface onto the display in such a way as to maintain the same orientation of the interface with respect to the reference. User input through an overlaid input device correlates to the oriented user interface. In this way if the display is rotated into, for example, a sideways or upside-down orientation, the user interface elements will still be displayed in a normal upright orientation for the user and the user inputs in a normal manner. When multiple users interact with the display, the orientation of each user with respect to the reference is tracked by the computer and the particular user's interface orientation is maintained constant with respect to the particular user's reference.

Journal ArticleDOI
TL;DR: In this paper recent developments and the state of the art in automatic image orientation are presented.
Abstract: Considerable progress has been achieved in the automation of image orientation for photogrammetry and remote sensing over the last few years. Today, autonomous software modules for interior and relative orientation are commercially available in digital photogrammetric workstations (DPWS), and so is automatic aerial triangulation. The absolute orientation has been successfully automated for a number of applications. In this paper recent developments and the state of the art in automatic image orientation are presented.

Journal ArticleDOI
TL;DR: This contribution addresses the problem of pose estimation and tracking of vehicles in image sequences from traffic scenes recorded by a stationary camera by directly matching polyhedral vehicle models to image gradients without an edge segment extraction process.
Abstract: This contribution addresses the problem of pose estimation and tracking of vehicles in image sequences from traffic scenes recorded by a stationary camera. In a new algorithm, the vehicle pose is estimated by directly matching polyhedral vehicle models to image gradients without an edge segment extraction process. The new approach is significantly more robust than approaches that rely on feature extraction since the new approach exploits more information from the image data. We successfully tracked vehicles that were partially occluded by textured objects, e.g., foliage, where a previous approach based on edge segment extraction failed. Moreover, the new pose estimation approach is also used to determine the orientation and position of the road relative to the camera by matching an intersection model directly to image gradients. Results from various experiments with real world traffic scenes are presented.

Journal ArticleDOI
TL;DR: The correctness of the method is demonstrated theoretically and in practice by applying it to a number of "degenerate" images which have failed other previously reported techniques for image normalization.
Abstract: We provide a generalized image normalization technique which basically solves all problems in image normalization. The orientation of any image can be uniquely defined by at most three non-zero generalized complex moments. The correctness of our method is demonstrated theoretically as well as in practice by applying it to a number of "degenerate" images which have failed other previously reported techniques for image normalization.
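The moment machinery behind this can be sketched as follows: a generalized complex moment c_pq picks up a phase factor e^{i(p−q)φ} when the image is rotated by φ, so the phase of one nonzero moment with p − q = 1 pins down the orientation. This is a sketch of the general idea only, not the paper's full normalization procedure for degenerate cases.

```python
import numpy as np

def complex_moment(img, p, q):
    """Generalized complex moment c_pq about the image centroid.
    Under an image rotation by phi, c_pq gains a phase e^{i(p-q)phi},
    while its magnitude stays invariant."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    m = img.sum()
    xc, yc = (x * img).sum() / m, (y * img).sum() / m
    z = (x - xc) + 1j * (y - yc)
    return ((z ** p) * (np.conj(z) ** q) * img).sum()

def normalization_angle(img, p=2, q=1):
    """Rotation angle that zeroes the phase of c_pq; one nonzero moment
    with p - q = 1 suffices for a non-degenerate image."""
    return -np.angle(complex_moment(img, p, q)) / (p - q)
```

Rotating the image by 90° (e.g., `np.rot90`) leaves |c_21| unchanged and shifts its phase by exactly 90°, which is the property the normalization exploits.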

Journal ArticleDOI
TL;DR: An algorithm based on a set of linear filters sensitive to vessels of different orientation and thickness, which can be integrated to obtain images in which vessels are highly enhanced independently of their direction and thickness is presented.

Journal ArticleDOI
Armin Gruen1
TL;DR: Besides being a highly automated measurement technique, videogrammetry provides for high accuracy and truly real-time data processing capabilities, which is of great interest for applications in biomechanics, sport, animation, and virtual reality generation and control.

Proceedings ArticleDOI
17 Jun 1997
TL;DR: A new real-time method to estimate the posture of a human from thermal images acquired by an infrared camera, regardless of the background and lighting conditions, is introduced, and the robustness of the proposed algorithm and its real-time performance are demonstrated.
Abstract: This paper introduces a new real-time method to estimate the posture of a human from thermal images acquired by an infrared camera, regardless of the background and lighting conditions. Distance transformation is performed on the human body area extracted from the thresholded thermal image for the calculation of the center of gravity. After the orientation of the upper half of the body is obtained by calculating the moment of inertia, significant points such as the top of the head and the tips of the hands and feet are heuristically located. In addition, the elbow and foot positions are estimated from the detected (significant) points using a genetic-algorithm-based learning procedure. The experimental results demonstrate the robustness of the proposed algorithm and its real-time (faster than 20 frames per second) performance.
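The orientation step of this pipeline (the moment-of-inertia computation; the distance transform and genetic-algorithm stages are omitted here) can be sketched from a binary silhouette's central second moments:

```python
import numpy as np

def body_axis(mask):
    """Centre of gravity and principal-axis angle of a binary
    silhouette, from its central second moments (moment of inertia).
    The angle is measured from the x axis, in radians."""
    ys, xs = np.nonzero(mask)
    xc, yc = xs.mean(), ys.mean()
    dx, dy = xs - xc, ys - yc
    mu20, mu02, mu11 = (dx ** 2).mean(), (dy ** 2).mean(), (dx * dy).mean()
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)   # major-axis angle
    return (xc, yc), theta
```

A horizontal bar yields θ ≈ 0 and a vertical bar θ ≈ π/2, matching the intuition that the major axis of the upper body gives its orientation.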

Patent
12 Sep 1997
TL;DR: In this article, a computer system for producing computer generated displays, including window-format displays permitting visualization of changes to a building or structure in an actual environment, is presented, with a background display of digital images originating from either an image capture device or from other sources to which changes are to be made for visualization purposes.
Abstract: A computer system (252) for producing computer generated displays (248), including window-format displays permitting visualization of changes to a building or structure in an actual environment (250). The system provides a background display (250) of digital images originating from either an image capture device or from other sources to which changes are to be made for visualization purposes; a product catalog in the form of a database of objects (260), together with features in the computer system operable to record and store digital images of the objects as well as detailed information (262) related to the objects within the database (260); and a mechanism to copy and move and removably place the object (260) over the background (250). The realistic visualization is facilitated by means of a number of tools associated with the system (252) that permit resizing of objects (260), fitting objects (260) into user designated areas (250), perspective orientation, and other tools useful by the system user to obtain this objective.

Patent
07 Oct 1997
TL;DR: In this paper, a machine vision method analyzes a calibration target of the type having two or more regions, each having an "imageable characteristic," e.g., a different color, contrast, or brightness, from its neighboring region(s). Each region has at least two edges that are linear and that are directed toward and, optionally, meet at a reference point (e.g., the center of the target or some other location of interest).
Abstract: A machine vision method analyzes a calibration target of the type having two or more regions, each having an "imageable characteristic," e.g., a different color, contrast, or brightness, from its neighboring region(s). Each region has at least two edges--referred to as "adjoining edges"--that are linear and that are directed toward and, optionally, meet at a reference point (e.g., the center of the target or some other location of interest). The method includes generating an image of the target, identifying in the image features corresponding to the adjoining edges, fitting lines to those edges, and determining the orientation and/or position of the target from those lines.

Journal ArticleDOI
TL;DR: A technique has been developed for accurate estimation of three-dimensional (3D) biplane imaging geometry and reconstruction of 3D objects based on two perspective projections acquired at arbitrary orientations, without the need for calibration.
Abstract: A technique has been developed for accurate estimation of three-dimensional (3D) biplane imaging geometry and reconstruction of 3D objects based on two perspective projections acquired at arbitrary orientations, without the need for calibration. The required prior information (i.e., the intrinsic parameters of each single-plane imaging system) for determination of biplane imaging geometry includes (a) the distance between each focal spot and its image plane, SID (the focal-spot to imaging-plane distance); (b) the pixel size, p_size (e.g., 0.3 mm/pixel); (c) the distance between the two focal spots, or the known 3D distance between two points in the projection images; and (d) for each view, an approximation of the magnification factor, MF (e.g., 1.2), which is the ratio of the SID and the approximate distance of the object to the focal spot. Item (d) is optional but may provide a more accurate estimation if it is available. Given five or more corresponding object points in both views, a constrained nonlinear optimization algorithm is applied to obtain an optimal estimate of the biplane imaging geometry in the form of a rotation matrix R and a translation vector t that characterize the position and orientation of one imaging system relative to the other. With the calculated biplane imaging geometry, 3D spatial information concerning the object can then be reconstructed. The accuracy of this method was evaluated by using a computer-simulated coronary arterial tree and a cube phantom object. Our simulation study showed that a computer-simulated coronary tree can be reconstructed from two views with less than 2 mm and 8.4 mm root-mean-square (rms) configuration (or relative-position) error and absolute-position error, respectively, even if the input errors in the corresponding 2D points are fairly large (more than two pixels = 0.6 mm). In contrast, input image error of more than one pixel (= 0.3 mm) can yield 3D position errors of 10 cm or more when other existing methods based on linear approaches are employed. For the cube phantom images acquired from a routine biplane system, rms errors in the 3D configuration of the cube and the 3D absolute position were 0.6–0.9 mm and 3.9–5.0 mm, respectively.

Patent
21 Aug 1997
TL;DR: In this article, a method and system for providing data transfers which support image rotation is disclosed, which includes determining the orientation of the image capture device and defining an image area of the sensor based on its orientation.
Abstract: A method and system for providing data transfers which support image rotation is disclosed. In one aspect, the method and system include determining the orientation of the image capture device and transferring the data from a memory in an order. The order depends on the orientation of the image capture device. In a second aspect, the method and system include determining the orientation of the image capture device and defining an image area of the image sensor based on the orientation of the image capture device. In a third aspect, the method and system include transferring data in a plurality of computational units and processing each computational unit of the plurality of computational units of data. At least a portion of a computational unit is processed while at least a portion of a subsequent computational unit is transferred.

Patent
21 Feb 1997
TL;DR: In this article, a shear-warp factorization process is used to derive a 2D projection for a rotated volume, which is then implemented to produce the projection of the rotated volume.
Abstract: A 3D image is generated in real-time on an ultrasound medical imaging system which performs acquisition, volume reconstruction, and image visualization tasks using multiple processors. The acquisition task includes deriving position and orientation indicators for each gathered image frame. Volume reconstruction includes defining a reference coordinate system within which each image frame in a sequence of image frames is registered. The reference coordinate system is the coordinate system for a 3D volume encompassing the image planes to be used in the 3D image. The first image frame is used to define the reference coordinate system. As each image plane is registered, a 2D projection of the incremental volume is displayed. A shear-warp factorization process is used to derive a 2D projection for a rotated volume. A viewing transformation matrix is factorized into a 3D shear which is parallel to slices of the reference volume. A 2D warp then is implemented to produce the projection of the rotated volume.

Journal ArticleDOI
TL;DR: A robust yet fast algorithm for skew detection in binary document images based on interline cross-correlation in the scanned image, which does not require prior segmentation of the document into text and graphics regions.
Abstract: We describe a robust yet fast algorithm for skew detection in binary document images. The method is based on interline cross-correlation in the scanned image. Instead of finding the correlation for the entire image, it is calculated over small regions selected randomly. The proposed method does not require prior segmentation of the document into text and graphics regions. The maximum median of cross-correlation is used as the criterion to obtain the skew, and a Monte Carlo sampling technique is chosen to determine the number of regions over which the correlations have to be calculated. Experimental results on detecting skews in various types of documents containing different linguistic scripts are presented here.
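A toy sketch of the interline cross-correlation idea: for column pairs a fixed distance apart, find the vertical shift that best aligns them; the median shift over randomly sampled pairs gives the skew angle. The pair spacing, shift range and sample count here are illustrative assumptions, and the paper's Monte Carlo stopping rule and maximum-median criterion are simplified to a plain median.

```python
import numpy as np

def estimate_skew(img, d=8, max_shift=10, n_pairs=50, seed=0):
    """Estimate document skew (radians) by interline cross-correlation:
    for column pairs d pixels apart, find the vertical shift that best
    aligns them, then take the median shift over the sampled pairs."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    shifts = []
    for _ in range(n_pairs):
        x = int(rng.integers(0, w - d))
        a = img[:, x].astype(float)
        b = img[:, x + d].astype(float)
        # score each candidate shift by correlation between the columns
        scores = [np.dot(a, np.roll(b, -s)) for s in range(-max_shift, max_shift + 1)]
        shifts.append(int(np.argmax(scores)) - max_shift)
    return np.arctan2(np.median(shifts), d)
```

Because only a handful of short column pairs are correlated, the cost is far below a full-image correlation, which is the efficiency argument the abstract makes for region sampling.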

Patent
Dennis L. Venable1
21 Jan 1997
TL;DR: In this paper, an intelligent scanning system for processing a digital input image to automatically characterize a plurality of objects therein is presented, and the system then employs the characterizations as the basis for rudimentary image editing operations so as to produce a digital document.
Abstract: The present invention is an intelligent scanning system for processing a digital input image to automatically characterize a plurality of objects therein. The system then employs the characterizations as the basis for rudimentary image editing operations so as to produce a digital document. In the digital document, the objects may be derotated, shifted, cropped or otherwise aligned in a predetermined fashion in accordance with a template. The scanning apparatus of the present invention not only enables the scanning of a plurality of objects, but does so in an intelligent manner so as to enable further processing and manipulation of the images associated with the objects to create an output document.

Proceedings ArticleDOI
18 Aug 1997
TL;DR: A fast algorithm is presented for skew and slant correction in printed document images by searching for a peak in the histogram of the gradient orientation of the input grey-level image.
Abstract: A fast algorithm is presented for skew and slant correction in printed document images. The algorithm employs only the gradient information. The skew angle is obtained by searching for a peak in the histogram of the gradient orientation of the input grey-level image. The skewness of the document is corrected by a rotation through that angle. The slant of characters can also be detected using the same technique and corrected by a shear operation. A second method for character slant correction, by fitting parallelograms to the connected components, is also described. Document images with different contents (tables, figures, and photos) have been tested for skew correction; the algorithm gives accurate results on all the test images and is easy to implement.
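The histogram step can be sketched as follows, assuming a grey-level image and simple central-difference gradients (the rotation and shear corrections that follow the peak search are omitted). The bin count is an illustrative choice.

```python
import numpy as np

def gradient_skew(img, nbins=180):
    """Skew angle (degrees) from the gradient-orientation histogram:
    for horizontal text lines the gradients point vertically, so the
    histogram peak sits 90 deg away from the text-line direction."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi                 # fold opposite gradients
    bins = (ang / np.pi * nbins).astype(int) % nbins
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=nbins)
    return np.argmax(hist) * (180.0 / nbins) - 90.0  # text-line angle
```

For an unskewed page the magnitude-weighted peak lies at 90° (vertical gradients across horizontal strokes), so the returned skew is zero; a tilted page shifts the peak by the skew angle.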

Patent
19 Nov 1997
TL;DR: An interactive photo kiosk as discussed by the authors allows the user to pose and freeze a selected image made visible to the user on a display screen, and the selected frozen image is processed electronically to form a single digital multiple image of the same image in a selected area format, which, when delivered to a printing apparatus for hard copy print-out, produces a multiple image on a single sheet wherein each of the multiple images can be peeled off the single sheet and used separately from each other.
Abstract: An interactive photo kiosk which, in one embodiment, presents an upright open face that enables a user to stand directly in front of the kiosk, optically defocuses the background image and substitutes a selected computer-generated image, and enables the user to pose and freeze a selected image made visible to the user on a display screen. Users may choose from among a menu of different computer-generated background images. In a preferred embodiment, the selected frozen image is in digital form and is processed electronically to form a single digital multiple image of the same image in a selected area format which, when delivered to a printing apparatus for hard-copy print-out, produces a multiple image of the same frozen image on a single sheet, wherein each of the multiple images can be peeled off the single sheet and used separately from the others. An interactive image-adjusting feature is provided to allow a user to modify the size and orientation of the user's image within the composite image. The composite image is formed by providing a uniform colored backdrop behind the user, imaging the user with the backdrop, and electronically replacing the backdrop color with the selected background image.

Journal ArticleDOI
TL;DR: Psychophysical data are compared with predictions from four schemes for extracting features from Glass patterns: token matching, isotropic filtering, oriented filtering, and "adaptive" filtering (selection of local peak output from multiply oriented filters).

Journal ArticleDOI
TL;DR: The subpixel interpolation technique is demonstrated to be far better than traditional methods when multiple or entire curves are present in a very small neighborhood, and to avoid the propagation of information across singularities.

Proceedings ArticleDOI
16 Jun 1997
TL;DR: In this paper, an approach to extract quantitative geometric descriptions of the movements of persons by fitting the projection of a 3D human model to consecutive frames of an image sequence is presented.
Abstract: We present an approach to extract quantitative geometric descriptions of the movements of persons by fitting the projection of a 3-dimensional human model to consecutive frames of an image sequence. The kinematics of the human model is given by a homogeneous transformation tree and its body parts are modeled by right-elliptical cones. The proposed approach can determine the values of a varying number of degrees of freedom (DOFs; body joints, position and orientation of the person relative to the camera) according to the application and the kind of image sequence. The determination of the DOFs is understood as an estimation problem which is solved by an iterated extended Kalman filter (IEKF). For this purpose, the human model is augmented by a simple motion model of constant velocity for all DOFs which is used in the prediction step of the IEKF. In the update step both region and edge information is used.