
Showing papers by "Takeo Kanade published in 1994"


Journal ArticleDOI
TL;DR: In this paper, the authors propose a method that selects an appropriate stereo matching window by evaluating the local variation of intensity and disparity, based on a statistical model of the disparity distribution within the window.
Abstract: A central problem in stereo matching by computing correlation or sum of squared differences (SSD) lies in selecting an appropriate window size. The window size must be large enough to include enough intensity variation for reliable matching, but small enough to avoid the effects of projective distortion. If the window is too small and does not cover enough intensity variation, it gives a poor disparity estimate, because the signal (intensity variation) to noise ratio is low. If, on the other hand, the window is too large and covers a region in which the depth of scene points (i.e., disparity) varies, then the position of maximum correlation or minimum SSD may not represent correct matching due to different projective distortions in the left and right images. For this reason, a window size must be selected adaptively depending on local variations of intensity and disparity. The authors present a method to select an appropriate window by evaluating the local variation of the intensity and the disparity. The authors employ a statistical model of the disparity distribution within the window. This modeling enables the authors to assess how disparity variation, as well as intensity variation, within a window affects the uncertainty of disparity estimate at the center point of the window. As a result, the authors devise a method which searches for a window that produces the estimate of disparity with the least uncertainty for each pixel of an image: the method controls not only the size but also the shape (rectangle) of the window. The authors have embedded this adaptive-window method in an iterative stereo matching algorithm: starting with an initial estimate of the disparity map, the algorithm iteratively updates the disparity estimate for each point by choosing the size and shape of a window till it converges. 
The stereo matching algorithm has been tested on both synthetic and real images, and the quality of the disparity maps obtained demonstrates the effectiveness of the adaptive window method.
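The window-size trade-off described above can be illustrated with a minimal fixed-window SSD sketch (NumPy assumed; function names are hypothetical, and the paper's statistical window selection is omitted):

```python
import numpy as np

def ssd(left, right, x, y, d, half):
    """SSD between a (2*half+1)^2 window centered at (y, x) in the left
    image and the window shifted by disparity d in the right image."""
    wl = left[y - half:y + half + 1, x - half:x + half + 1]
    wr = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
    return float(((wl - wr) ** 2).sum())

def best_disparity(left, right, x, y, d_max, half):
    """Brute-force search for the disparity with minimum SSD at (y, x)."""
    costs = [ssd(left, right, x, y, d, half) for d in range(d_max + 1)]
    return int(np.argmin(costs))
```

The adaptive-window method would repeat this search over several window sizes and shapes at each pixel and keep the estimate with the least predicted uncertainty; the fixed `half` parameter here stands in for exactly what it adapts.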

1,081 citations


Book ChapterDOI
07 May 1994
TL;DR: A model-based hand tracking system, called DigitEyes, that can recover the state of a 27 DOF hand model from ordinary gray scale images at speeds of up to 10 Hz is described.
Abstract: Passive sensing of human hand and limb motion is important for a wide range of applications, from human-computer interaction to athletic performance measurement. High degree of freedom articulated mechanisms like the human hand are difficult to track because of their large state space and complex image appearance. This article describes a model-based hand tracking system, called DigitEyes, that can recover the state of a 27 DOF hand model from ordinary gray scale images at speeds of up to 10 Hz.

516 citations


Book ChapterDOI
02 May 1994
TL;DR: A paraperspective factorization method that can be applied to a much wider range of motion scenarios, such as image sequences containing significant translational motion toward the camera or across the image, is developed.
Abstract: The factorization method, first developed by Tomasi and Kanade, recovers both the shape of an object and its motion from a sequence of images, using many images and tracking many feature points to obtain highly redundant feature position information. The method robustly processes the feature trajectory information using singular value decomposition (SVD), taking advantage of the linear algebraic properties of orthographic projection. However, an orthographic formulation limits the range of motions the method can accommodate. Paraperspective projection, first introduced by Ohta, is a projection model that closely approximates perspective projection by modelling several effects not modelled under orthographic projection, while retaining linear algebraic properties. We have developed a paraperspective factorization method that can be applied to a much wider range of motion scenarios, such as image sequences containing significant translational motion toward the camera or across the image. We present the results of several experiments which illustrate the method's performance in a wide range of situations, including an aerial image sequence of terrain taken from a low-altitude airplane.
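The linear-algebraic core of the factorization method can be sketched as a toy rank-3 SVD truncation (orthographic-style only, with no metric or paraperspective constraints; names are hypothetical):

```python
import numpy as np

def factorize(W):
    """Rank-3 factorization of a measurement matrix W (2F x P) of tracked
    feature coordinates: registers W by subtracting each row's centroid,
    then uses the SVD to split it into motion M (2F x 3) and shape
    S (3 x P), recovered only up to a common affine transform."""
    W = W - W.mean(axis=1, keepdims=True)    # register to the centroid
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    M = U[:, :3] * np.sqrt(s[:3])            # motion (affine ambiguity)
    S = np.sqrt(s[:3])[:, None] * Vt[:3]     # shape (affine ambiguity)
    return M, S
```

The full method then imposes additional metric constraints on the motion rows to resolve the affine ambiguity; the paraperspective variant changes those constraints, not the SVD step.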

289 citations


Proceedings ArticleDOI
11 Nov 1994
TL;DR: The DigitEyes system has demonstrated tracking performance at speeds of up to 10 Hz, using line and point features extracted from gray scale images of unadorned, unmarked hands, and an application of the sensor to a 3D mouse user-interface problem is described.
Abstract: Computer sensing of hand and limb motion is an important problem for applications in human-computer interaction (HCI), virtual reality, and athletic performance measurement. Commercially available sensors are invasive, and require the user to wear gloves or targets. We have developed a noninvasive vision-based hand tracking system, called DigitEyes. Employing a kinematic hand model, the DigitEyes system has demonstrated tracking performance at speeds of up to 10 Hz, using line and point features extracted from gray scale images of unadorned, unmarked hands. We describe an application of our sensor to a 3D mouse user-interface problem.

215 citations


Journal ArticleDOI
TL;DR: This paper presents new algorithms for extracting topographic maps consisting of topographic features (peaks, pits, ravines, and ridges) and contour maps, and develops new definitions for those topographic features based on the contour map.
Abstract: Some applications, such as autonomous navigation in natural terrain and the automation of the map-making process, require high-level scene descriptions as well as geometrical representation of natural terrain environments. In this paper, we present methods for building high-level terrain descriptions, referred to as topographic maps, by extracting terrain features like “peaks,” “pits,” “ridges,” and “ravines” from the contour map. The resulting topographic map contains the location and type of terrain features as well as the ground topography. We present new algorithms for extracting topographic maps consisting of topographic features (peaks, pits, ravines, and ridges) and contour maps. We develop new definitions for those topographic features based on the contour map. We build a contour map from an elevation map and generate the connectivity tree of all regions separated by the contours. We use this connectivity tree, called a topographic change tree, to extract the topographic features. Experimental results on a digital elevation model (DEM) support our definitions for topographic features and the approach. © 1994 Academic Press, Inc.

172 citations


Proceedings ArticleDOI
08 May 1994
TL;DR: A system which can perform full 3-D pose estimation of a single arbitrarily shaped, rigid object at rates up to 10 Hz using an enhanced implementation of the Iterative Closest Point Algorithm introduced by Besl and McKay (1992).
Abstract: This paper describes a system which can perform full 3-D pose estimation of a single arbitrarily shaped, rigid object at rates up to 10 Hz. A triangular mesh model of the object to be tracked is generated offline using conventional range sensors. Real-time range data of the object is sensed by the CMU high speed VLSI range sensor. Pose estimation is performed by registering the real-time range data to the triangular mesh model using an enhanced implementation of the Iterative Closest Point (ICP) Algorithm introduced by Besl and McKay (1992). The method does not require explicit feature extraction or specification of correspondence. Pose estimation accuracies of the order of 1% of the object size in translation, and 1 degree in rotation have been measured.
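A minimal point-to-point ICP loop, in the spirit of Besl and McKay's algorithm, might look as follows (a sketch with hypothetical names; the actual system registers range data against a triangular mesh rather than a second point set):

```python
import numpy as np

def best_rigid_transform(P, Q):
    """Least-squares rotation R and translation t with Q ~ P @ R.T + t,
    for corresponding point rows, via the SVD (Arun/Horn) method."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                            # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])   # avoid reflections
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

def icp(P, Q, iters=20):
    """Minimal point-to-point ICP: match each point of P to its nearest
    neighbor in Q, solve for the rigid transform, apply, and repeat."""
    P = P.copy()
    for _ in range(iters):
        d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(axis=2)
        matched = Q[d2.argmin(axis=1)]                   # closest points
        R, t = best_rigid_transform(P, matched)
        P = P @ R.T + t
    return P
```

Like the abstract notes, no explicit features or correspondences are needed: the nearest-neighbor step supplies provisional correspondences that improve as the pose estimate improves.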

161 citations


18 May 1994
TL;DR: A new approach to telepresence is presented in which a multitude of stationary cameras are used to acquire both photometric and depth information; systems based on this approach may exhibit more natural and intuitive interaction among participants than current 2D teleconferencing systems.
Abstract: A new approach to telepresence is presented in which a multitude of stationary cameras are used to acquire both photometric and depth information. A virtual environment is constructed by displaying the acquired data from the remote site in accordance with the head position and orientation of a local participant. Shown are preliminary results of a depth image of a human subject calculated from closely spaced video camera positions. A user wearing a head-mounted display walks around this 3D data, which has been inserted into a 3D model of a simple room. Future systems based on this approach may exhibit more natural and intuitive interaction among participants than current 2D teleconferencing systems.

107 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose an adaptive control scheme in joint space and present a simulation study to demonstrate its effectiveness and computational procedure; they also identify two potential problems: unavailability of the joint trajectory, because the mapping from the inertial space trajectory is dynamics-dependent and subject to uncertainty, and nonlinear parameterization in inertial space.
Abstract: In space applications, robot systems are subject to unknown or unmodeled dynamics, for example in the tasks of transporting an unknown payload or catching an unmodeled moving object. We discuss the parameterization problem in the dynamic structure and adaptive control of a space robot system with an attitude-controlled base to which the robot is attached. We first derive the system kinematic and dynamic equations based on Lagrangian dynamics and the linear momentum conservation law. Based on the dynamic model developed, we discuss the problem of linear parameterization in terms of dynamic parameters, and find that in joint space the dynamics can be linearized by a set of combined dynamic parameters; however, in inertial space linear parameterization is impossible in general. We then propose an adaptive control scheme in joint space, and present a simulation study to demonstrate its effectiveness and computational procedure. Because most tasks are specified in inertial space instead of joint space, we discuss the issues associated with adaptive control in inertial space and identify two potential problems: unavailability of the joint trajectory, because the mapping from the inertial space trajectory is dynamics-dependent and subject to uncertainty; and nonlinear parameterization in inertial space. We approach the problem by making use of the proposed joint space adaptive controller and updating the joint trajectory with the estimated dynamic parameters and the given trajectory in inertial space.
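The linear parameterization that makes joint-space adaptive control tractable — torque written as tau = Y(q, q̇, q̈) θ for a regressor Y and combined dynamic parameters θ — can be illustrated with a toy gradient estimator (a sketch only, not the authors' control law; all names are hypothetical):

```python
import numpy as np

def estimate_parameters(Y, tau, gamma=0.1, passes=50):
    """Gradient-style estimate of the combined dynamic parameters theta
    in the linear parameterization tau = Y @ theta, driven by the
    joint-torque prediction error on each regressor sample."""
    theta = np.zeros(Y.shape[1])
    for _ in range(passes):
        for y_t, tau_t in zip(Y, tau):
            err = tau_t - y_t @ theta          # prediction error
            theta = theta + gamma * err * y_t  # gradient update
    return theta
```

With persistently exciting trajectories the estimate converges to the true combined parameters, which is what lets the joint trajectory be updated from the inertial-space specification as the abstract describes.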

78 citations


Journal ArticleDOI
TL;DR: The self-mobile space manipulator, (SM)/sup 2/, is a simple, 5-DOF, 1/3 scale, laboratory version of a robot designed to walk on the trusswork and other exterior surfaces of Space Station Freedom.
Abstract: The self-mobile space manipulator, (SM)/sup 2/, is a simple, 5-DOF, 1/3 scale, laboratory version of a robot designed to walk on the trusswork and other exterior surfaces of Space Station Freedom. It will be capable of routine tasks such as inspection, parts transportation and simple maintenance procedures. The authors have designed and built the robot and gravity compensation system to permit simulated zero-gravity experiments. The authors have developed the control system for the (SM)/sup 2/, including the control hardware architecture and operating system, a control station with various interfaces, a hierarchical control structure, a multi-phase control strategy for step motion, and various low-level controllers. The system provides operator-friendly, real-time monitoring and robust control of the 3-D locomotion of the flexible robot. The hierarchical structure allows control to be executed at various levels autonomously or by teleoperation, and the multi-phase control strategy facilitates control in different tasks. Based on the dynamic model developed, a linear-structured joint-level controller and a model-based control scheme with acceleration feedback are being implemented to provide stable and fast motion. The configuration-independent control scheme allows the control parameters to adapt to changes in system dynamics due to robot configuration variation. With the variety of low-level controllers which the authors developed, the system has demonstrated robustness to uncertainties in modeling and in payload.

74 citations


Proceedings ArticleDOI
12 Sep 1994
TL;DR: A fast computation method of the normalized correlation for multiple rotated templates using multiresolution eigenimages, which allows both the location and orientation of an object in a scene to be detected accurately, at a faster rate than conventional template matching with rotated templates.
Abstract: Presents a fast computation method of the normalized correlation for multiple rotated templates using multiresolution eigenimages. This method allows the authors to accurately detect both the location and orientation of an object in a scene at a faster rate than applying conventional template matching with rotated templates. Since the correlation among slightly rotated templates is high, the authors first apply the Karhunen-Loeve expansion to a set of rotated templates and extract "eigenimages" from them. Each template in this set can be approximated by a linear combination of these eigenimages, which substitutes for the template in computing the normalized correlation. The number of eigenimages is smaller than that of the original templates, so the computation cost becomes small. Second, the authors employ a multiresolution image structure to reduce the number of rotated templates and the location search area. In the lower resolution image, the position and angle are coarsely obtained over a wide region; then not only the search area for the position but also the range of rotation angles of the templates at the next layer can be limited to the neighborhood of the prior results. The authors implemented the proposed algorithm on a vision system, achieving a computation time of around 600 msec, sub-pixel resolution for translation, and a 0.3 degree maximum error over 360 degrees of rotation on a 512 by 480 gray scale image. Experimental results are shown to demonstrate the accuracy, efficiency and feasibility of the proposed method.
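The eigenimage step can be sketched with a small Karhunen-Loeve (PCA-by-SVD) example (hypothetical names; the normalized-correlation computation and the multiresolution search are omitted):

```python
import numpy as np

def eigenimages(templates, k):
    """Karhunen-Loeve expansion of a stack of templates: returns the mean
    (flattened) image and the k leading eigenimages as an orthonormal
    row basis, computed with the SVD of the mean-centered stack."""
    T = np.stack([t.ravel() for t in templates])  # (n_templates, n_pixels)
    mean = T.mean(axis=0)
    _, _, Vt = np.linalg.svd(T - mean, full_matrices=False)
    return mean, Vt[:k]

def reconstruct(template, mean, basis):
    """Approximate a template by its projection onto the eigenimage basis."""
    coeffs = basis @ (template.ravel() - mean)
    return (mean + coeffs @ basis).reshape(template.shape)
```

Because slightly rotated templates are highly correlated, a small `k` captures most of the stack, which is why correlating against k eigenimages is cheaper than correlating against every rotated template.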

70 citations


01 Sep 1994
TL;DR: In this article, a four-camera multibaseline stereo system in a convergent configuration is described and a parallel depth recovery scheme for this system is implemented, which is capable of image capture at video rate.
Abstract: We describe our four-camera multibaseline stereo system in a convergent configuration and our implementation of a parallel depth recovery scheme for this system. Our system is capable of image capture at video rate. This is critical in applications that require three-dimensional tracking. We obtain dense stereo depth data by projecting a light pattern of frequency modulated sinusoidally varying intensity onto the scene, thus increasing the local discriminability at each pixel and facilitating matches. In addition, we make the most of the camera view areas by converging them on a volume of interest. Results indicate that we are able to extract stereo depth data that are, on average, less than 1 mm in error at distances between 1.5 and 3.5 m from the cameras.

01 Mar 1994
TL;DR: Integration of computer vision with on-board sensors to autonomously fly helicopters was researched and custom designed vision processing hardware and an indoor testbed provided convenient calibrated experimentation in constructing real autonomous systems.
Abstract: Integration of computer vision with on-board sensors to autonomously fly helicopters was researched. The key components developed were custom designed vision processing hardware and an indoor testbed. The custom designed hardware provided flexible integration of on-board sensors with real-time image processing resulting in a significant improvement in vision-based state estimation. The indoor testbed provided convenient calibrated experimentation in constructing real autonomous systems.

01 Dec 1994
TL;DR: A framework for local tracking of self-occluding motion, in which parts of the mechanism obstruct each other's visibility to the camera, is described; it uses a kinematic model to predict occlusion and windowed templates to track partially occluded objects.
Abstract: Computer sensing of hand and limb motion is an important problem for applications in human-computer interaction, virtual reality, and athletic performance measurement. We describe a framework for local tracking of self-occluding motion, in which parts of the mechanism obstruct each other's visibility to the camera. Our approach uses a kinematic model to predict occlusion and windowed templates to track partially occluded objects. We analyze our model of self-occlusion, discuss the implementation of our algorithm, and give experimental results for 3D hand tracking under significant amounts of self-occlusion. These results extend the DigitEyes system for articulated tracking, which we have previously developed, to handle self-occluding motions.

Proceedings ArticleDOI
K. Kemmotsu, Takeo Kanade
08 May 1994
TL;DR: This paper presents a method for finding an optimal sensor placement off-line to accurately determine the pose of an object when using three light-stripe range finders using a Monte Carlo method.
Abstract: The pose (position and orientation) of a polyhedral object can be determined with range data obtained from simple light-stripe range finders. However, localization results are sensitive to where those range finders are placed in the workspace, that is, sensor placement. It is advantageous for vision tasks in a factory environment to plan optimal sensing positions off-line all at once rather than on-line sequentially. This paper presents a method for finding an optimal sensor placement off-line to accurately determine the pose of an object when using three light-stripe range finders. We evaluate a sensor placement on the basis of average performance measures such as an error rate of object recognition, recognition speed and pose uncertainty over the state space of object pose by a Monte Carlo method. An optimal sensor placement which is given a maximal score by a scalar function of the performance measures is selected by another Monte Carlo method. We emphasize that the expected performance of our system under an optimal sensor placement can be characterized completely via simulation.

Journal ArticleDOI
TL;DR: The robot hardware development, gravity compensation system, control structure and teleoperation functions of SM2 system, and its capabilities of locomotion and manipulation in space applications are discussed.

01 Jan 1994
TL;DR: Material transfer robots first appeared in the mid-1960s for use in traditional industrial applications; by the 1980s, robots found use in more demanding industrial applications such as welding, assembly, and inspection, with the help of vision and other sensors, as discussed by the authors.
Abstract: Material transfer robots first appeared in the mid-1960s for use in traditional industrial applications. By the 1980s, robots found use in more demanding industrial applications such as welding, assembly, and inspection, with the help of vision and other sensors. It is estimated that 46,000 industrial robots have been installed in the U.S.; Japan has six to eight times as many robots.

Journal ArticleDOI
TL;DR: Material transfer robots first appeared in the mid-1960s for use in traditional industrial applications, but by the 1980s, robots found use in more demanding industrial applications such as welding, assembly, and inspection, with the help of vision and other sensors.
Abstract: Material transfer robots first appeared in the mid-1960s for use in traditional industrial applications. By the 1980s, robots found use in more demanding industrial applications such as welding, assembly, and inspection, with the help of vision and other sensors. It is estimated that 46,000 industrial robots have been installed in the U.S.; Japan has six to eight times as many robots.

01 Mar 1994
TL;DR: In this paper, the authors found that Japan has in place a broad base of robotics research and development, ranging from components to working systems for manufacturing, construction, and human service industries; from this base, Japan looks to the use of robotics in space applications and has funded work in space robotics since the mid-1980s.
Abstract: Japan has been one of the most successful countries in the world in the realm of terrestrial robot applications. The panel found that Japan has in place a broad base of robotics research and development, ranging from components to working systems for manufacturing, construction, and human service industries. From this base, Japan looks to the use of robotics in space applications and has funded work in space robotics since the mid-1980's. The Japanese are focusing on a clear image of what they hope to achieve through three objectives for the 1990's: developing long-reach manipulation for tending experiments on Space Station Freedom, capturing satellites using a free-flying manipulator, and surveying part of the moon with a mobile robot. This focus and a sound robotics infrastructure is enabling the young Japanese space program to develop relevant systems for extraterrestrial robotics applications.

01 Jan 1994
TL;DR: This paper presents an algorithm which performs the three-dimensional reconstruction task and examines one of the main steps of the algorithm: detecting vessels in single images, finding the positions of the vessels in three dimensions, and finally performing diameter measurements.
Abstract: Angiography is a method used by radiologists to examine the structure and health of blood vessels. The method consists of injecting an x-ray opaque contrast agent into the bloodstream, and taking one or more x-ray images of the vessel. Although these images are clear and have high resolution, they individually show structures in only two dimensions. The goal of our work is to automatically reconstruct the three-dimensional structure of a blood vessel using a small number of angiogram images taken from different angles. Among the types of information about blood vessels that are useful to physicians, the most important for diagnostic purposes is a measure of the cross-sectional area of the vessel at every location along its length. This information reveals constrictions caused by blood clots, cholesterol build-ups, or injuries. Treatment of these conditions may involve inserting a catheter, with a small video camera and scraping tool mounted at its tip, into the vessel to remove the obstruction. A computer can aid in planning this surgery by simulating what the tip of the catheter would encounter in the vessel, and by computing how far the catheter must be inserted to begin repair. This paper presents an algorithm which performs the three-dimensional reconstruction task. Each section of the paper will examine one of the main steps of the algorithm: detecting vessels in single images, finding the positions of the vessels in three dimensions, and finally performing diameter measurements. The paper concludes with some reconstruction results.

Book
07 Mar 1994
TL;DR: In the fall of 1977, I was invited by Raj Reddy to spend a year at the Computer Science Department of Carnegie Mellon University as a visiting scientist from Kyoto University in Japan, and my plan for research during the stay was to develop a model-based object recognition program.
Abstract: In the fall of 1977, I was invited by Raj Reddy to spend a year at the Computer Science Department of Carnegie Mellon University as a visiting scientist from Kyoto University in Japan. My plan for research during the stay was to develop a model-based object recognition program. Upon arrival, I chose an image of an office scene (Fig. 1) as an example image; the image was one of a set that Ron Ohlander had used in his research on color image segmentation. The task I set for my program was to recognize the chair in this image. I began to write a "knowledge-based" program for chair recognition by creating a set of heuristic rules for the task. It seemed that in addition to geometric relationships, a good source of constraints was color information, such as "the back and the seat of a chair have the same color". The effort of creating heuristic rules one after another, however, was not a satisfying game, since every time I came up with a reasonably functioning program, I could also find a chair that was an exception to the rules.