
Showing papers by "Takeo Kanade published in 1992"


Journal Article•DOI•
TL;DR: In this paper, the singular value decomposition (SVD) technique is used to factor the measurement matrix into two matrices which represent object shape and camera rotation, respectively, and two of the three translation components are computed in a preprocessing stage.
Abstract: Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion under orthography without computing depth as an intermediate step. An image stream can be represented by the 2FxP measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3. Based on this observation, the factorization method uses the singular-value decomposition technique to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively. Two of the three translation components are computed in a preprocessing stage. The method can also handle and obtain a full solution from a partially filled-in measurement matrix that may result from occlusions or tracking failures. The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.
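The rank-3 observation at the heart of the factorization method can be checked numerically. The following toy sketch (synthetic data, not the authors' code) builds a registered measurement matrix from random orthographic views, confirms its rank via the SVD, and factors it into motion and shape matrices:

```python
import numpy as np

# Toy numerical check of the rank-3 property: P points viewed in F frames
# under orthographic projection.
rng = np.random.default_rng(0)
F, P = 10, 30
shape = rng.standard_normal((3, P))                # true 3 x P object shape

rows = []
for _ in range(F):
    # a random rotation; its first two rows serve as the frame's image axes
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    rows.append(q[:2] @ shape)                     # 2 x P image coordinates
W = np.vstack(rows)                                # the 2F x P measurement matrix

# Registering each row about its centroid removes translation; the
# registered matrix has rank (at most) 3.
W -= W.mean(axis=1, keepdims=True)
s = np.linalg.svd(W, compute_uv=False)
print(int(np.sum(s > 1e-9 * s[0])))                # → 3

# Factor W into motion (2F x 3) and shape (3 x P) from the top three
# singular components.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
motion = U[:, :3] * np.sqrt(S[:3])
shape_hat = np.sqrt(S[:3])[:, None] * Vt[:3]
assert np.allclose(motion @ shape_hat, W)
```

Note that the SVD alone determines motion and shape only up to an invertible 3x3 transform; the paper's metric constraints on the rotation rows, which resolve that ambiguity, are omitted here.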

2,696 citations


01 Mar 1992
TL;DR: In this article, the singular value decomposition (SVD) is used to factor the measurement matrix into two matrices, which represent object shape and camera motion, respectively.
Abstract: Inferring scene geometry and camera motion from a stream of images is possible in principle, but it is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion without computing depth as an intermediate step. An image stream can be represented by the 2F x P measurement matrix of the image coordinates of P points tracked through F frames. Under orthographic projection this matrix is of rank 3. Using this observation, the factorization method uses the singular value decomposition technique to factor the measurement matrix into two matrices, which represent object shape and camera motion, respectively. The method can also handle and obtain a full solution from a partially filled-in measurement matrix, which occurs when features appear and disappear in the image sequence due to occlusions or tracking failures. The method gives accurate results and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.

871 citations


Journal Article•DOI•
TL;DR: In this paper, the authors present an approach to color image understanding that accounts for color variations due to highlights and shading and demonstrate that reflected light from every point on a dielectric object, such as plastic, can be described as a linear combination of the object color and the highlight color.
Abstract: In this paper, we present an approach to color image understanding that accounts for color variations due to highlights and shading. We demonstrate that the reflected light from every point on a dielectric object, such as plastic, can be described as a linear combination of the object color and the highlight color. The colors of all light rays reflected from one object then form a planar cluster in the color space. The shape of this cluster is determined by the object and highlight colors and by the object shape and illumination geometry. We present a method that exploits the difference between object color and highlight color to separate the color of every pixel into a matte component and a highlight component. This generates two intrinsic images, one showing the scene without highlights, and the other one showing only the highlights. The intrinsic images may be a useful tool for a variety of algorithms in computer vision, such as stereo vision, motion analysis, shape from shading, and shape from highlights. Our method combines the analysis of matte and highlight reflection with a sensor model that accounts for camera limitations. This enables us to successfully run our algorithm on real images taken in a laboratory setting. We show and discuss the results.
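The linear two-color model lends itself to a small numerical sketch. Assuming the object and highlight color vectors are known (the RGB values below are illustrative, not from the paper), each pixel can be split by least squares into matte and highlight components:

```python
import numpy as np

# Sketch of the linear two-component model (all values illustrative):
# pixel color = m_body * object_color + m_highlight * highlight_color.
object_color = np.array([0.8, 0.2, 0.1])       # assumed known matte color
highlight_color = np.array([1.0, 1.0, 1.0])    # assumed illumination color

A = np.column_stack([object_color, highlight_color])   # 3 x 2 color basis

def separate(pixel_rgb):
    """Least-squares split of one RGB pixel into matte and highlight parts."""
    coeffs, *_ = np.linalg.lstsq(A, pixel_rgb, rcond=None)
    m_body, m_highlight = np.clip(coeffs, 0.0, None)   # mixing must be additive
    return m_body * object_color, m_highlight * highlight_color

# A pixel that is half object color plus some highlight separates cleanly.
matte, highlight = separate(0.5 * object_color + 0.3 * highlight_color)
```

Applying `separate` to every pixel of a region would yield the two intrinsic images the abstract describes: one matte, one highlight-only.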

350 citations


Book•DOI•
03 Jan 1992
TL;DR: An approach to color image understanding that can be used to segment and analyze surfaces with color variations due to highlights and shading, and that is capable of generating physical descriptions of the reflection processes occurring in the scene.
Abstract: This thesis uses a physical, intrinsic reflection model to analyze the effects of shading and highlights on dielectric, non-uniform materials, such as plastics and paints. Since black-and-white images do not provide sufficient information to interpret these effects appropriately, this work uses color images. The colors seen on a dielectric object are an additive mixture of the object color and the illumination color, i.e., the colors are a linear combination of those two color vectors. They therefore lie in a plane in the color space. Within the plane, they form a cluster that looks like a skewed T, due to the geometric properties of shading and highlight reflection. When combined with a camera model, this description of physically possible color changes on one object can be used to guide an algorithm in interpreting color images. It is the basis of an algorithm that analyzes color variation in real color images. The algorithm looks for characteristic color clusters in the image and relates their shape to hypotheses on the object color and the amount of shading, as well as the position, strength, and color of highlights on the object. It verifies the hypotheses by reapplying them to the image, testing whether and how they apply to the pixels in the local image area from which the respective hypothesis was derived. In this way, it adapts the image interpretation process to local scene characteristics and reacts differently to color and intensity changes at different places in the image. The result is an image segmentation that ignores color changes due to shading and highlights, as well as a set of intrinsic images, one describing the amount of shading at every pixel in the image, and the other showing the amount of highlight reflection at every pixel. Such intrinsic images can be a useful tool for other areas of computer vision that are currently disturbed by highlights in images, such as stereo vision and motion analysis.
The approach can also be used to determine the illumination color from the highlights and as a preprocessor for methods to determine object shape from shading or highlights. This line of research may lead to physics-based image understanding methods that are both more reliable and more useful than traditional methods. (Abstract shortened with permission of author.)

332 citations


Journal Article•DOI•
TL;DR: 3-D vision techniques for incrementally building an accurate 3-D representation of rugged terrain using multiple sensors and the locus method, which is used to estimate the vehicle position in the digital elevation map (DEM), are presented.
Abstract: The authors present 3-D vision techniques for incrementally building an accurate 3-D representation of rugged terrain using multiple sensors. They have developed the locus method to model the rugged terrain. The locus method exploits sensor geometry to efficiently build a terrain representation from multiple sensor data. The locus method is used to estimate the vehicle position in the digital elevation map (DEM) by matching a sequence of range images with the DEM. Experimental results from large-scale real and synthetic terrains demonstrate the feasibility and power of the 3-D mapping techniques for rugged terrain. In real-world experiments, a composite terrain map was built by merging 125 real range images. Using synthetic range images, a composite map of 150 m was produced from 159 images. With the proposed system, mobile robots operating in rugged environments can build accurate terrain models from multiple sensor data.

185 citations


Journal Article•DOI•
TL;DR: A signal matching algorithm that can select an appropriate window size adaptively so as to obtain both precise and stable estimation of correspondences is presented, together with analytical and experimental results that demonstrate their effectiveness.
Abstract: This article presents a signal matching algorithm that can select an appropriate window size adaptively so as to obtain both precise and stable estimation of correspondences. Matching two signals by calculating the sum of squared differences (SSD) over a certain window is a basic technique in computer vision. Given the signals and a window, there are two factors that determine the difficulty of obtaining precise matching. The first is the variation of the signal within the window, which must be large enough, relative to noise, that the SSD values exhibit a clear and sharp minimum at the correct disparity. The second factor is the variation of disparity within the window, which must be small enough that signals of corresponding positions are duly compared. These two factors present conflicting requirements to the size of the matching window, since a larger window tends to increase the signal variation, but at the same time tends to include points of different disparity. A window size must be adaptively selected depending on local variations of signal and disparity in order to compute a most-certain estimate of disparity at each point. There has been little work on a systematic method for automatic window-size selection. The major difficulty is that, while the signal variation is measurable from the input, the disparity variation is not, since disparities are what we wish to calculate. We introduce here a statistical model of disparity variation within a window, and employ it to establish a link between the window size and the uncertainty of the computed disparity. This allows us to choose the window size that minimizes uncertainty in the disparity computed at each point. This article presents a theory for the model and the resultant algorithm, together with analytical and experimental results that demonstrate their effectiveness.
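The SSD matching the article builds on can be sketched as follows: a fixed-window 1-D matcher on synthetic signals. The signals, window half-width, and disparity range are made up for illustration; the article's contribution, statistically adaptive selection of the window size, is not reproduced here:

```python
import numpy as np

# Minimal fixed-window 1-D SSD matcher (illustrative only).
def ssd_disparity(left, right, x, half, max_d):
    """Return the disparity in [0, max_d] minimizing SSD over a window at x."""
    win = left[x - half : x + half + 1]
    scores = [np.sum((win - right[x - d - half : x - d + half + 1]) ** 2)
              for d in range(max_d + 1)]
    return int(np.argmin(scores))

# Synthetic pair: the left signal is the right signal shifted by 5 samples.
x = np.linspace(0, 4 * np.pi, 200)
right = np.sin(3 * x) + 0.01 * np.random.default_rng(1).standard_normal(200)
left = np.roll(right, 5)

print(ssd_disparity(left, right, 100, half=8, max_d=10))   # → 5
```

Shrinking `half` toward 0 makes the noise term dominate the SSD minimum (imprecise), while growing it would mix points of different disparity in a real scene, which is exactly the trade-off the article's adaptive-window method addresses.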

145 citations


Book•
03 Jan 1992
TL;DR: In this article, the Basic Shadow Problem is posed and solved under orthography, and the analysis is extended to shadows cast by polyhedra and curved surfaces, where the constraints provided by shadows can be analyzed in a manner analogous to the Basic Shadow Problem.
Abstract: Given a line drawing from an image with shadow regions identified, the shapes of the shadows can be used to generate constraints on the orientations of the surfaces involved. This paper describes the theory which governs those constraints under orthography. A 'Basic Shadow Problem' is first posed, in which there is a single light source, and a single surface casts a shadow on another (background) surface. There are six parameters to determine: the orientation (2 parameters) for each surface, and the direction of the vector (2 parameters) pointing at the light source. If some set of 3 of these are given in advance, the remaining 3 can then be determined geometrically. The solution method consists of identifying 'illumination surfaces' consisting of illumination vectors, assigning Huffman-Clowes line labels to their edges, and applying the corresponding constraints in gradient space. The analysis is extended to shadows cast by polyhedra and curved surfaces. In both cases, the constraints provided by shadows can be analyzed in a manner analogous to the Basic Shadow Problem. When the shadow falls upon a polyhedron or curved surface, similar techniques apply. The consequences of varying the position and number of light sources are also discussed. Finally, some methods are presented for combining shadow geometry with other gradient space techniques for 3D shape inference.

91 citations


Book Chapter•DOI•
12 May 1992
TL;DR: In this article, adaptive control of a space robot system with an attitude-controlled base, on which the robot is mounted, is proposed. Since most tasks are specified in inertia space rather than joint space, two potential problems are identified: unavailability of the joint trajectory (since the mapping from an inertia-space trajectory is dynamics-dependent and subject to uncertainty), and nonlinear parameterization in inertia space.
Abstract: The authors discuss adaptive control of a space robot system with an attitude-controlled base on which the robot is mounted. An adaptive control scheme in joint space is proposed. Since most tasks are specified in inertia space, instead of joint space, the authors discuss the issues associated with adaptive control in inertia space and identify two potential problems: unavailability of the joint trajectory (since the mapping from an inertia-space trajectory is dynamics-dependent and subject to uncertainty), and nonlinear parameterization in inertia space. For a planar system, the linear parameterization problem is investigated, the design procedure of the controller is illustrated, and the validity and effectiveness of the proposed control scheme are demonstrated.

51 citations


Proceedings Article•DOI•
01 Jan 1992
TL;DR: This thesis demonstrates that intelligent data acquisition makes possible new approaches to sensing that can significantly improve sensor performance, convincingly demonstrating the power of the technique.
Abstract: VLSI technology makes possible a powerful new sensing methodology--the smart sensor. In a smart sensor, transducers are integrated with processing circuitry so that desired information can be intelligently extracted at the point of sensing. Physical limitations force traditional systems to artificially partition sensing and processing functions. By eliminating such partitioning, VLSI smart sensing adds a new dimension to the design of both sensors and sensing algorithms. In this research, a high-performance VLSI range-image sensor has been built using the smart sensing methodology. This sensor measures range via light-stripe triangulation, a mature technology widely used in robotic systems. VLSI-based smart sensing made practical a cell-parallel implementation of the light-stripe method. Experiments with the cell-parallel sensor show that its performance is substantially better than that of traditional light-stripe systems. Range image acquisition time is decreased by two orders of magnitude. Furthermore, the range measurement process is qualitatively different, providing more robust and more accurate 3-D measurements. The success of the cell-parallel sensor can be attributed directly to the use of smart sensing and convincingly demonstrates the power of the technique. One of the most distinguishing features of this work is that it is not just a re-implementation of established algorithms using VLSI. Rather, this thesis demonstrates that intelligent data acquisition makes possible new approaches to sensing that can significantly improve sensor performance.

45 citations


Book•
03 Jan 1992
TL;DR: In this article, the authors focus on sensor modeling and its relationship to strategy generation, and propose a representation method for sensor detectability and reliability in the configuration space, and investigate how to use the proposed sensor model in automatic generation of object recognition programs.
Abstract: One of the most important and systematic methods of building model-based vision systems is that of generating object recognition programs automatically from given geometric models. Automatic generation of object recognition programs requires several key components to be developed: object models to describe the geometric and photometric properties of the object to be recognized, sensor models to predict object appearances from the object model under a given sensor, strategy generation using the predicted appearances to produce a recognition strategy, and program generation converting the recognition strategy into an executable code. This paper concentrates on sensor modeling and its relationship to strategy generation, because we regard it as the bottleneck to automatic generation of object recognition programs. We consider two aspects of sensor characteristics: sensor detectability and sensor reliability. Sensor detectability specifies what kinds of features can be detected and under what conditions the features are detected; sensor reliability is a confidence level for the detected features. We define a configuration space to represent sensor characteristics. Then, we propose a representation method for sensor detectability and reliability in the configuration space. Finally, we investigate how to use the proposed sensor model in automatic generation of object recognition programs.

41 citations


Proceedings Article•DOI•
12 May 1992
TL;DR: The authors have designed and built the robot and gravity compensation system to permit simulated zero-gravity experiments, and have developed the control system for the SM^2 that provides operator-friendly real-time monitoring and robust control for 3D locomotion movements of the flexible robot.
Abstract: Self-Mobile Space Manipulator (SM^2) is a simple, 5-DOF (degree-of-freedom), 1/3-scale, laboratory version of a robot designed to walk on the trusswork and other exterior surfaces of Space Station Freedom. It will be capable of routine tasks such as inspection, parts transportation, and simple maintenance procedures. The authors have designed and built the robot and gravity compensation system to permit simulated zero-gravity experiments. They have developed the control system for the SM^2, including control hardware architecture and operating system, control station with various interfaces, hierarchical control structure, multiphase control strategy for step motion, and various low-level controllers. The system provides operator-friendly real-time monitoring and robust control for 3D locomotion movements of the flexible robot.

Book Chapter•DOI•
03 Jan 1992
TL;DR: The Carnegie Mellon University Navigational Laboratory (the CMU Navlab) project integrates sensing, image understanding, planning, control, and software systems architectures onto a self-contained mobile robot.
Abstract: The Carnegie Mellon University Navigational Laboratory (the CMU Navlab) project integrates sensing, image understanding, planning, control, and software systems architectures onto a self-contained mobile robot. The Navlab drives autonomously along a variety of roads (dirt, gravel, unmarked bicycle paths, city streets, rural roads) and cross-country. This chapter describes the Navlab and its contributions in color vision, neural nets, 3-D perception, planning, and robot architectures.

Journal Article•DOI•
TL;DR: This paper actively uses physics-based simulation and deformable template matching techniques for specular object recognition; simulating object appearances with a physics-based specular reflection model on top of a geometric modeler allows specular reflections to be predicted quite accurately.
Abstract: Recognizing shiny objects with specular reflections is a hard problem for computer vision. Specular reflections appear, disappear, or change their shapes abruptly, due to tiny movements of the viewer. Traditionally, such specular reflections are discarded as annoying noise for recognition purposes. This paper actively uses such specular reflections for recognition. Specular reflections provide distinct clues for object recognition, if properly used. Some advanced sensors, such as underwater sonar or SAR sensors, provide images due to only specular reflections of emitted signals. It is important to establish a technique to recognize objects from such specular images. Although specular reflections are quite unstable, simulating object appearances by using a physics-based specular reflection model on top of a geometric modeler allows us to predict specular reflections quite accurately. Recently, several robust matching techniques, such as deformable template matching, have been developed. We employ such a physics-based simulator and deformable template matching techniques for specular object recognition. Our system follows the precompilation method. From the specular appearances of an object, the system extracts specular features, collections of pixels arising from specular reflections, and generates a set of (specular) aspects of the object. Specular features comprise stable, distinct ones as well as unstable, useless ones. The system analyzes the detectability and stability of each specular feature and determines a set of effective specular features to be used for specular aspect classification. For each specular aspect, the system prepares deformable matching templates. At runtime, an input image is first classified into a few candidate aspects using the predetermined effective features. Some of the specular features are still unstable, and misclassification of aspects might occur if we used a binary decision for aspect classification. Instead, we employ continuous decision making for classification based on the Dempster-Shafer theory. Then the system finds the deformable template which provides the best match to verify the existence of the object. In order to demonstrate the usefulness of our system for specular object recognition, we present two experimental results using two different sensors: a TV image of a shiny object and a synthetic aperture radar (SAR) image of an airplane. The results show the flexibility of the proposed model-based approach for specular object recognition.

Proceedings Article•DOI•
07 Jul 1992
TL;DR: The robot hardware development, gravity compensation system, control structure and teleoperation functions of the SM2 system, and its capabilities of locomotion and manipulation in space applications are discussed.
Abstract: We have developed a light-weight space manipulator, Self-Mobile Space Manipulator (SM2), in the Robotics Institute at Carnegie Mellon University. SM2 is a 7-degree-of-freedom (DOF), 1/3-scale, laboratory version of a robot designed to walk on the trusswork and other exterior surfaces of Space Station Freedom, and to perform manipulation tasks that are required for inspection, maintenance, and construction. Combining the mobility and manipulation functions in one body as a mobile manipulator, SM2 is capable of routine tasks such as inspection, parts transportation, object lighting, and simple assembly procedures. The system will provide assistance to astronauts and greatly reduce the need for astronaut extra-vehicular activity (EVA). This paper discusses the robot hardware development, gravity compensation system, control structure, and teleoperation functions of the SM2 system, and demonstrates its capabilities of locomotion and manipulation in space applications.

01 Jan 1992
TL;DR: The factorization method uses the singular value decomposition technique to factor the measurement matrix into two matrices, which represent object shape and camera motion, respectively, and gives accurate results and does not introduce smoothing in either shape or motion.
Abstract: Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion without computing depth as an intermediate step. An image stream can be represented by the 2F x P measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3. Using this observation, the factorization method uses the singular value decomposition technique to factor the measurement matrix into two matrices which represent object shape and camera motion, respectively. The method can also handle and obtain a full solution from a partially filled-in measurement matrix, which occurs when features appear and disappear in the image sequence due to occlusions or tracking failures. The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.

01 Jan 1992
TL;DR: A relatively simple, modular, low-mass, low-cost robot is being developed for space EVA that is large enough to be independently mobile on a space station or platform exterior, yet versatile enough to accomplish many vital tasks.
Abstract: A relatively simple, modular, low-mass, low-cost robot is being developed for space EVA that is large enough to be independently mobile on a space station or platform exterior, yet versatile enough to accomplish many vital tasks. The robot comprises two long flexible links connected by a rotary joint, with 2-DOF 'wrist' joints and grippers at each end. It walks by gripping pre-positioned attachment points, such as trusswork nodes, and alternately shifting its base of support from one foot (gripper) to the other. The robot can perform useful tasks such as visual inspection, material transport, and light assembly by manipulating objects with one gripper, while stabilizing itself with the other. At SOAR '90, we reported development of 1/3-scale robot hardware, modular trusswork to serve as a locomotion substrate, and a gravity compensation system to allow laboratory tests of locomotion strategies on the horizontal face of the trusswork. In this paper, we report on project progress, including the development of: (1) adaptive control for automatic adjustment to loads; (2) enhanced manipulation capabilities; (3) machine vision, including the use of neural nets, to guide autonomous locomotion; (4) locomotion between orthogonal trusswork faces; and (5) improved facilities for gravity compensation and telerobotic control.

01 Aug 1992
TL;DR: In this paper, the authors presented the results of their experiments with a unique multiple-baseline stereo technique, which is further applied to complex outdoor scenes with variable lighting conditions and large depth ranges.
Abstract: This paper presents the results of our experiments with a unique multiple-baseline stereo technique. This algorithm for producing precise, unambiguous depth maps from a set of multiple stereo pairs was developed by Okutomi and Kanade. Early versions of the algorithm were shown to perform well under controlled conditions in the Calibrated Imaging Laboratory (CIL). In this paper, the algorithm is further applied to complex outdoor scenes with variable lighting conditions and large depth ranges. While Okutomi and Kanade used stereo pairs acquired by moving a camera horizontally, we also investigated the use of stereo pairs taken by moving a camera in both horizontal and vertical directions. The use of stereo images with two orthogonal baseline orientations removes ambiguity and increases precision without the problems associated with the orientation of the features in a scene. We also show that the shape of the sum-of-squared-differences (SSD) curve indicates the reliability of the match, and suggest a method to classify matches into various types and to improve estimates when a false match occurs. Finally, results are presented to show the effectiveness of this algorithm and the classification method.
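The way multiple baselines suppress matching ambiguity can be sketched on a synthetic 1-D signal. Everything below (the periodic scene, the shift, the window size) is made up for illustration: candidate disparities are indexed by a common variable z with disparity = baseline * z, so SSD curves from different baselines, each individually ambiguous on periodic texture, share a single minimum when summed:

```python
import numpy as np

# Illustrative sketch of summing SSD curves over multiple baselines.
n = 400
scene = np.sin(np.arange(n))            # periodic texture -> ambiguous SSD
true_z = 6                              # true disparity per unit baseline

def ssd_curve(baseline, x=200, half=10, max_z=12):
    """SSD between a reference window and candidate windows for one baseline."""
    ref = scene[x - half : x + half + 1]
    shifted = np.roll(scene, baseline * true_z)   # image seen at this baseline
    return np.array([
        np.sum((ref - shifted[x + baseline * z - half :
                              x + baseline * z + half + 1]) ** 2)
        for z in range(max_z + 1)
    ])

# Each curve has near-zero dips at spurious candidates, but the dips fall at
# different z for different baselines; only the true z survives the sum.
sssd = ssd_curve(1) + ssd_curve(2) + ssd_curve(3)
print(int(np.argmin(sssd)))             # → 6
```

The sharpness of the summed curve around its minimum is what the paper's classification method reads off as match reliability.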

Book•
03 Jan 1992
TL;DR: In this paper, the Dichromatic Reflection model is used to model the relationship between highlights and color in images of dielectrics such as plastic and painted surfaces, which gives rise to a mathematical relationship in color space to separate highlights from object color.
Abstract: Research in early (low-level) vision, both for machines and humans, has traditionally been based on the study of idealized images or image patches such as step edges, gratings, flat fields, and Mondrians. Real images, however, exhibit much richer and more complex structure, whose nature is determined by the physical and geometric properties of illumination, reflection, and imaging. By understanding these physical relationships, a new kind of early vision analysis is made possible. In this paper, we describe a progression of models of imaging physics that present a much more complex and realistic set of image relationships than are commonly assumed in early vision research. We begin with the Dichromatic Reflection Model, which describes how highlights and color are related in images of dielectrics such as plastic and painted surfaces. This gives rise to a mathematical relationship in color space to separate highlights from object color. Perceptions of shape, surface roughness/texture, and illumination color are readily derived from this analysis. We next show how this can be extended to images of several objects, by deriving local color variation relationships from the basic model. The resulting method for color image analysis has been successfully applied in machine vision experiments in our laboratory. Yet another extension is to account for inter-reflection among multiple objects. We have derived a simple model of color inter-reflection that accounts for the basic phenomena, and report on this model and how we are applying it. In general, the concept of illumination for vision should account for the entire "illumination environment", rather than being restricted to a single light source. This work shows that the basic physical relationships give rise to very structured image properties, which can be a more valid basis for early vision than the traditional idealized image patterns.

Book•
03 Jan 1992
TL;DR: In this paper, the effects of surface roughness on the three primary components of a reflectance model are analyzed in detail, and the conditions that determine the validity of each model are clearly stated.
Abstract: Reflectance models based on physical optics and geometrical optics are studied. Specifically, the authors consider the Beckmann-Spizzichino (physical optics) model and the Torrance-Sparrow (geometrical optics) model. These two models were chosen because they have been reported to fit experimental data well. Each model is described in detail, and the conditions that determine the validity of the model are clearly stated. By studying reflectance curves predicted by the two models, the authors propose a reflectance framework comprising three components: the diffuse lobe, the specular lobe, and the specular spike. The effects of surface roughness on the three primary components are analyzed in detail.
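The three-component framework can be visualized with a toy radiance curve. The functional forms and constants below are illustrative stand-ins, not the actual Beckmann-Spizzichino or Torrance-Sparrow expressions:

```python
import numpy as np

# Toy radiance curve: diffuse lobe + specular lobe + specular spike.
# All shapes and magnitudes here are illustrative only.
theta = np.linspace(-np.pi / 2, np.pi / 2, 181)        # viewing angle (rad)
theta_s = 0.3                                          # specular direction

diffuse_lobe = 0.5 * np.cos(theta)                     # broad, Lambertian-like
specular_lobe = 0.8 * np.exp(-((theta - theta_s) / 0.2) ** 2)  # widened by roughness
specular_spike = np.where(np.abs(theta - theta_s) < 0.01, 2.0, 0.0)  # mirror-like

radiance = diffuse_lobe + specular_lobe + specular_spike
```

As surface roughness grows, the spike vanishes and the specular lobe widens, which is the qualitative behavior the paper analyzes in detail; here that would correspond to shrinking the spike amplitude and increasing the lobe width parameter.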

Journal Article•DOI•
TL;DR: A statistical model is introduced for evaluating the impact of local intensity variation and scene disparity variation, and a method of selecting an appropriate window size to minimize the uncertainty of the estimation is proposed.
Abstract: This paper describes a stereo matching algorithm capable of selecting an appropriate window size to achieve both objectives of precise localization and stable estimation in scene correspondence. Window size is an important parameter that depends on two local attributes: local intensity variation and scene disparity variation. A statistical model is introduced to evaluate the impact of these two types of variation on the uncertainty of disparity estimation, and a method is proposed for selecting an appropriate window size to minimize that uncertainty. Experiments have been conducted for various window sizes. The experimental results demonstrate the effectiveness of the proposed model and the matching algorithm with an adaptive window.

01 Sep 1992
TL;DR: This report reviews progress on the Perception for Outdoor Navigation project at Carnegie Mellon University, sponsored by DARPA, DoD, and monitored by the U.S. Army Topographic Engineering Center under contract DACA76-89-C-0014.
Abstract: This report reviews progress at Carnegie Mellon from August 16, 1991 to August 15, 1992 on research sponsored by DARPA, DoD, monitored by the U.S. Army Topographic Engineering Center under contract DACA76-89-C-0014, titled Perception for Outdoor Navigation. Research supported by this contract includes perception for road following, terrain mapping for off-road navigation, and systems software for building integrated mobile robots. We overview our efforts for the year, list our publications and personnel, and then provide further detail on several of our subprojects.