
Showing papers by "Takeo Kanade published in 1999"


Proceedings Article•DOI•
20 Sep 1999
TL;DR: This work presents a framework for the computation of dense, non-rigid scene flow from optical flow and shows that multiple estimates of the normal flow cannot be used to estimate dense scene flow directly without some form of smoothing or regularization.
Abstract: Scene flow is the three-dimensional motion field of points in the world, just as optical flow is the two-dimensional motion field of points in an image. Any optical flow is simply the projection of the scene flow onto the image plane of a camera. We present a framework for the computation of dense, non-rigid scene flow from optical flow. Our approach leads to straightforward linear algorithms and a classification of the task into three major scenarios: complete instantaneous knowledge of the scene structure; knowledge only of correspondence information; and no knowledge of the scene structure. We also show that multiple estimates of the normal flow cannot be used to estimate dense scene flow directly without some form of smoothing or regularization.
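The abstract's key relation is that any optical flow is the projection of the scene flow onto the image plane. A minimal sketch of that relation (not the paper's algorithm), assuming a simple pinhole camera with a hypothetical 3x4 projection matrix P:

```python
import numpy as np

def optical_flow_from_scene_flow(P, X, dX):
    """Project a 3-D point X and its scene flow dX through a camera with
    3x4 projection matrix P, returning the image point and its optical flow
    (the derivative of the projected point along dX, via the quotient rule)."""
    Xh = np.append(X, 1.0)                 # homogeneous 3-D point
    u, v, w = P @ Xh
    x = np.array([u / w, v / w])           # image coordinates

    du, dv, dw = P[:, :3] @ dX             # derivative of (u, v, w) along dX
    flow = np.array([(du * w - u * dw) / w**2,
                     (dv * w - v * dw) / w**2])
    return x, flow

# Toy example with an identity-rotation camera at the origin (assumed values).
P = np.hstack([np.eye(3), np.zeros((3, 1))])
x, flow = optical_flow_from_scene_flow(P, X=np.array([0.2, 0.1, 2.0]),
                                        dX=np.array([0.0, 0.0, -0.5]))
print(x, flow)
```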

335 citations


Journal Article•DOI•
TL;DR: Name-It, a system that associates faces and names in news videos, takes a multimodal video analysis approach: face sequence extraction and similarity evaluation from videos, name extraction from transcripts, and video-caption recognition.
Abstract: We developed Name-It, a system that associates faces and names in news videos. It processes information from the videos and can infer possible name candidates for a given face or locate a face in news videos by name. To accomplish this task, the system takes a multimodal video analysis approach: face sequence extraction and similarity evaluation from videos, name extraction from transcripts, and video-caption recognition.
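One plausible ingredient of such a face/name association is temporal co-occurrence between face tracks and transcript name mentions. The sketch below is only illustrative; the intervals, names, window size, and scoring rule are assumptions, not Name-It's actual analysis:

```python
from collections import defaultdict

# Hypothetical face tracks (start, end in seconds) and name-mention times.
face_tracks = {"face_A": [(10, 14), (120, 125)],
               "face_B": [(300, 304)]}
name_mentions = {"CLINTON": [12, 122, 500],
                 "YELTSIN": [302]}

def cooccurrence_score(intervals, mention_times, window=5.0):
    """Count name mentions falling within `window` seconds of a face track."""
    return sum(1 for (s, e) in intervals for t in mention_times
               if s - window <= t <= e + window)

scores = defaultdict(dict)
for face, intervals in face_tracks.items():
    for name, times in name_mentions.items():
        scores[face][name] = cooccurrence_score(intervals, times)

for face, by_name in scores.items():
    best = max(by_name, key=by_name.get)
    print(face, "->", best, by_name)
```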

311 citations


Journal Article•DOI•
TL;DR: An automated method of facial display analysis by feature point tracking demonstrated high concurrent validity with manual FACS coding.
Abstract: The face is a rich source of information about human behavior. Available methods for coding facial displays, however, are human-observer dependent, labor intensive, and difficult to standardize. To enable rigorous and efficient quantitative measurement of facial displays, we have developed an automated method of facial display analysis. In this report, we compare the results of this automated system with those of manual FACS (Facial Action Coding System, Ekman & Friesen, 1978a) coding. One hundred university students were videotaped while performing a series of facial displays. The image sequences were coded from videotape by certified FACS coders. Fifteen action units and action unit combinations that occurred a minimum of 25 times were selected for automated analysis. Facial features were automatically tracked in digitized image sequences using a hierarchical algorithm for estimating optical flow. The measurements were normalized for variation in position, orientation, and scale. The image sequences were randomly divided into a training set and a cross-validation set, and discriminant function analyses were conducted on the feature point measurements. In the training set, average agreement with manual FACS coding was 92% or higher for action units in the brow, eye, and mouth regions. In the cross-validation set, average agreement was 91%, 88%, and 81% for action units in the brow, eye, and mouth regions, respectively. Automated face analysis by feature point tracking demonstrated high concurrent validity with manual FACS coding.
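The classification step described here is a discriminant function analysis on normalized feature-point measurements. A hedged sketch of that step using scikit-learn's LinearDiscriminantAnalysis on synthetic displacement features (the original study used its own statistics pipeline; the data and class labels below are made up):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for normalized feature-point displacements: each row
# holds (dx, dy) values for tracked brow/eye/mouth points, each label a
# hypothetical action-unit code.
n_per_class, n_features = 60, 12
X = np.vstack([rng.normal(loc=mu, scale=0.3, size=(n_per_class, n_features))
               for mu in (0.0, 0.8, -0.8)])
y = np.repeat([0, 1, 2], n_per_class)     # e.g. AU1, AU2, AU4 (illustrative)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)
print("agreement on held-out set:", lda.score(X_test, y_test))
```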

287 citations


30 Nov 1999
TL;DR: An overview is presented of the VSAM system, which uses multiple, cooperative video sensors to provide continuous coverage of people and vehicles in a cluttered environment, and of the technical accomplishments that have been achieved.
Abstract: Under the three-year Video Surveillance and Monitoring (VSAM) project, the Robotics Institute at Carnegie Mellon University (CMU) and the Sarnoff Corporation have developed a system for autonomous Video Surveillance and Monitoring. The technical approach uses multiple, cooperative video sensors to provide continuous coverage of people and vehicles in a cluttered environment. This final report presents an overview of the system, and of the technical accomplishments that have been achieved. Details can be found in a set of previously published papers that together comprise Appendix A.

279 citations


01 Jan 1999
TL;DR: In this paper, an accurate, high-bandwidth, linear state-space model was derived for the hover condition of a fully-instrumented model-scale unmanned helicopter (Yamaha R-50 with 10 ft. diameter rotor) for dynamic model identification.
Abstract: Flight testing of a fully-instrumented model-scale unmanned helicopter (Yamaha R-50 with 10 ft. diameter rotor) was conducted for the purpose of dynamic model identification. This paper describes the application of CIFER system identification techniques, which have been developed for full size helicopters, to this aircraft. An accurate, high-bandwidth, linear state-space model was derived for the hover condition. The model structure includes the explicit representation of regressive rotor-flap dynamics, rigid-body fuselage dynamics, and the yaw damper. The R-50 configuration and identified dynamics are compared with those of a dynamically scaled UH-1H. The identified model shows excellent predictive capability and is well suited for flight control design and simulation applications.
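CIFER works in the frequency domain; as a loose, time-domain illustration of what "deriving a linear state-space model from flight data" means, the sketch below fits x_{k+1} = A x_k + B u_k to logged states and inputs by least squares. The system, noise level, and data are made up, and this is not the paper's identification method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate logged "hover" data from a made-up 2-state, 1-input system.
A_true = np.array([[0.98, 0.05], [-0.04, 0.95]])
B_true = np.array([[0.0], [0.1]])
x = np.zeros((200, 2)); u = rng.normal(size=(200, 1))
for k in range(199):
    x[k + 1] = A_true @ x[k] + (B_true @ u[k]).ravel() + 0.001 * rng.normal(size=2)

# Stack x_{k+1} = [A B] [x_k; u_k] and solve one least-squares problem.
Z = np.hstack([x[:-1], u[:-1]])            # regressors, shape (N-1, 3)
Theta, *_ = np.linalg.lstsq(Z, x[1:], rcond=None)
A_hat, B_hat = Theta[:2].T, Theta[2:].T
print("A estimate:\n", A_hat, "\nB estimate:\n", B_hat)
```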

218 citations


Journal Article•DOI•
TL;DR: To solve two problems of character recognition for videos, low-resolution characters and extremely complex backgrounds, an interpolation filter, multi-frame integration, and character extraction filters are applied; the overall recognition results are satisfactory for use in news indexing.
Abstract: The automatic extraction and recognition of news captions and annotations can be of great help in locating topics of interest in digital news video libraries. To achieve this goal, we present a technique, called Video OCR (Optical Character Reader), which detects, extracts, and reads text areas in digital video data. In this paper, we address these problems, describe the method by which Video OCR operates, and suggest applications for its use in digital news archives. To solve two problems of character recognition for videos, low-resolution characters and extremely complex backgrounds, we apply an interpolation filter, multi-frame integration, and character extraction filters. Character segmentation is performed by a recognition-based segmentation method, and intermediate character recognition results are used to improve the segmentation. We also include a method for locating text areas using text-like properties and a language-based postprocessing technique to increase word recognition rates. The overall recognition results are satisfactory for use in news indexing. Performing Video OCR on news video and combining its results with other video understanding techniques will improve the overall understanding of the news video content.
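A rough sketch of the two enhancement ideas named above, interpolation and multi-frame integration, assuming already-aligned caption crops, bright captions over a mostly darker changing background, and OpenCV's resize; the actual filters in the paper are more elaborate:

```python
import cv2
import numpy as np

def integrate_caption_frames(frames, scale=4):
    """Enhance a static caption from several video frames (illustrative).

    frames : list of uint8 grayscale crops of the same caption region.
    Upsamples each crop (interpolation) and takes a per-pixel minimum
    (multi-frame integration), assuming bright text whose background
    clutter varies from frame to frame.
    """
    ups = [cv2.resize(f, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_CUBIC) for f in frames]
    return np.minimum.reduce(ups)

# Usage with made-up frames (replace with real caption crops):
frames = [np.random.randint(0, 255, (24, 120), dtype=np.uint8) for _ in range(8)]
enhanced = integrate_caption_frames(frames)
print(enhanced.shape)  # (96, 480)
```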

215 citations


01 Jan 1999
TL;DR: The objective is to develop a cooperative, multi-sensor video surveillance system that provides continuous coverage over battlefield areas; achievements have been demonstrated during VSAM Demo I.
Abstract: Carnegie Mellon University (CMU) and the Sarnoff Corporation (Sarnoff) are performing an integrated feasibility demonstration of Video Surveillance and Monitoring (VSAM). The objective is to develop a cooperative, multi-sensor video surveillance system that provides continuous coverage over battlefield areas. Significant achievements have been demonstrated during VSAM Demo I in November 1997, and in the intervening year leading up to Demo II in October 1998.

206 citations


Journal Article•DOI•
31 Aug 1999
TL;DR: A visual odometer for autonomous helicopter flight that estimates helicopter position by visually locking on to and tracking ground objects and the philosophy behind the odometer as well as its tracking algorithm and implementation are described.
Abstract: This paper presents a visual odometer for autonomous helicopter flight. The odometer estimates helicopter position by visually locking on to and tracking ground objects. The paper describes the philosophy behind the odometer as well as its tracking algorithm and implementation. The paper concludes by presenting test flight data of the odometer's performance on-board indoor and outdoor prototype autonomous helicopters.
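One plausible building block of "visually locking on to and tracking ground objects" is template tracking by normalized cross-correlation, sketched below with OpenCV (OpenCV-style API assumed); the odometer itself uses dedicated tracking hardware and stereo scaling, so this is only illustrative:

```python
import cv2
import numpy as np

def track_patch(prev_frame, next_frame, top_left, size=64, search=32):
    """Track a square ground patch between frames with normalized
    cross-correlation; returns the patch's new top-left corner."""
    x, y = top_left
    template = prev_frame[y:y + size, x:x + size]
    y0, y1 = max(0, y - search), y + size + search
    x0, x1 = max(0, x - search), x + size + search
    window = next_frame[y0:y1, x0:x1]
    res = cv2.matchTemplate(window, template, cv2.TM_CCOEFF_NORMED)
    _, _, _, max_loc = cv2.minMaxLoc(res)
    return (x0 + max_loc[0], y0 + max_loc[1])

# Accumulating the per-frame shift gives an image-space displacement estimate,
# which an odometer would scale by range (e.g. from stereo) to obtain position.
```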

177 citations


Proceedings Article•DOI•
01 Jan 1999
TL;DR: The quality of the virtual view images re-synthesized from the projective shape demonstrates the effectiveness of the proposed scheme for projective reconstruction from a large number of images.
Abstract: This paper proposes a new scheme for multi-image projective reconstruction based on a projective grid space. The projective grid space is defined by two basis views and the fundamental matrix relating these views. Given fundamental matrices relating other views to each of the two basis views, this projective grid space can be related to any view. Because the projective grid space is a general space related to all images, a projective shape can be reconstructed from all the images of weakly calibrated cameras. Projective reconstruction is one way to reduce the calibration effort because it does not need Euclidean metric information, but only correspondences of several points between the images. To demonstrate the effectiveness of the proposed projective grid definition, we modify the voxel coloring algorithm for the projective voxel scheme. The quality of the virtual view images re-synthesized from the projective shape demonstrates the effectiveness of our proposed scheme for projective reconstruction from a large number of images.
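In the projective grid space, a grid point is tied to a pixel in basis view 1 and to a position on its epipolar line in basis view 2; given fundamental matrices to any other view, the point projects to the intersection of two epipolar lines. A minimal sketch of that projection (the fundamental matrices below are placeholders, and the convention F maps a point to its epipolar line in view i is assumed):

```python
import numpy as np

def project_grid_point(x1, x2, F1i, F2i):
    """Project a projective-grid point into view i.

    x1, x2 : homogeneous image points of the grid point in the two basis
             views (x2 lies on the epipolar line of x1 in basis view 2).
    F1i, F2i : fundamental matrices from basis views 1 and 2 to view i.
    The projection is the intersection of the two epipolar lines in view i.
    """
    l1 = F1i @ x1                     # epipolar line of x1 in view i
    l2 = F2i @ x2                     # epipolar line of x2 in view i
    xi = np.cross(l1, l2)             # homogeneous intersection point
    return xi / xi[2]

# Example with placeholder (uncalibrated, made-up) fundamental matrices:
x1 = np.array([100.0, 80.0, 1.0])
x2 = np.array([140.0, 82.0, 1.0])
F1i = np.array([[0, -1e-4, 0.02], [1e-4, 0, -0.03], [-0.02, 0.03, 1]])
F2i = np.array([[0, -2e-4, 0.01], [2e-4, 0, -0.05], [-0.01, 0.05, 1]])
print(project_grid_point(x1, x2, F1i, F2i))
```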

105 citations


Journal Article•DOI•
TL;DR: This paper proposes a method that can realize correct visual/haptic registration, namely WYSIWYF, by using a vision-based, object-tracking technique and a video-keying technique and provides realistic haptic sensations, such as free-to-touch and move-and-collide.
Abstract: To build a VR training system for visuomotor skills, an image displayed by a visual interface should be correctly registered to a haptic interface so that the visual sensation and the haptic sensation are both spatially and temporally consistent. In other words, it is desirable that what you see is what you feel (WYSIWYF). In this paper, we propose a method that can realize correct visual/haptic registration, namely WYSIWYF, by using a vision-based, object-tracking technique and a video-keying technique. Combining an encountered-type haptic device with a motion-command-type haptic rendering algorithm makes it possible to deal with two extreme cases (free motion and rigid constraint). This approach provides realistic haptic sensations, such as free-to-touch and move-and-collide. We describe a first prototype and illustrate its use with several demonstrations. The user encounters the haptic device exactly when his or her hand reaches a virtual object in the display. Although this prototype has some remaining technical problems to be solved, it serves well to show the validity of the proposed approach.

82 citations


Proceedings Article•DOI•
Hideo Saito1, S. Baba1, M. Kimura1, Sundar Vedula1, Takeo Kanade1 •
04 Oct 1999
TL;DR: An "appearance based" virtual view generation method for temporally-varying events taken by multiple cameras of the "3D Room", developed by the group and presented for demonstrating the performance of the virtual view image generation in the 3D Room.
Abstract: We present an "appearance based" virtual view generation method for temporally-varying events taken by multiple cameras of the "3D Room", developed by our group. With this method we can generate images from any virtual view point between two selected real views. The virtual appearance view generation method is based on simple interpolation between two selected views. The correspondence between the views are automatically generated from the multiple images by use of the volumetric model shape reconstruction framework. Since the correspondences are obtained by the recovered volumetric model, even occluded regions in the views can be correctly interpolated in the virtual view images. The virtual view image sequences are presented for demonstrating the performance of the virtual view image generation in the 3D Room.


Book Chapter•DOI•
21 Sep 1999
TL;DR: A Geometric Equivalence Relationship is derived with which covariances under different parametrizations and gauges can be compared, based on their true geometric uncertainty, and it is shown that the uncertainty of gauge invariants exactly captures the geometric uncertainty of the solution, and hence provides useful measures for evaluating the uncertaintyof the solution.
Abstract: The parameters estimated by Structure from Motion (SFM) contain inherent indeterminacies which we call gauge freedoms. Under a perspective camera, shape and motion parameters are only recovered up to an unknown similarity transformation. In this paper we investigate how covariance-based uncertainty can be represented under these gauge freedoms. Past work on uncertainty modeling has implicitly imposed gauge constraints on the solution before considering covariance estimation. Here we examine the effect of selecting a particular gauge on the uncertainty of parameters. We show potentially dramatic effects of gauge choice on parameter uncertainties. However the inherent geometric uncertainty remains the same irrespective of gauge choice. We derive a Geometric Equivalence Relationship with which covariances under different parametrizations and gauges can be compared, based on their true geometric uncertainty. We show that the uncertainty of gauge invariants exactly captures the geometric uncertainty of the solution, and hence provides useful measures for evaluating the uncertainty of the solution. Finally we propose a fast method for covariance estimation and show its correctness using the Geometric Equivalence Relationship.
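A small numerical illustration of the central observation, that parameter covariance depends on the gauge chosen while a gauge invariant's uncertainty does not. The 2-D point set, noise model, and gauge-fixing rules below are synthetic assumptions, not the paper's SFM estimator; the distance ratio is a similarity invariant, so its variance is the same whichever gauge is imposed:

```python
import numpy as np

rng = np.random.default_rng(2)
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.3, 0.8], [0.9, 0.6]])  # "true" points

def fix_gauge(Q, i, j):
    """Fix translation and scale by pinning point i to the origin and
    point j to unit distance (rotation gauge left free for brevity)."""
    Q = Q - Q[i]
    return Q / np.linalg.norm(Q[j])

coords_g1, coords_g2, invariant = [], [], []
for _ in range(2000):
    Q = P + 0.01 * rng.normal(size=P.shape)       # noisy "reconstruction"
    coords_g1.append(fix_gauge(Q, 0, 1).ravel())  # gauge choice 1
    coords_g2.append(fix_gauge(Q, 2, 3).ravel())  # gauge choice 2
    # Ratio of two inter-point distances: unchanged by any similarity gauge.
    invariant.append(np.linalg.norm(Q[0] - Q[2]) / np.linalg.norm(Q[1] - Q[3]))

print("coord covariance trace, gauge 1:", np.cov(np.array(coords_g1).T).trace())
print("coord covariance trace, gauge 2:", np.cov(np.array(coords_g2).T).trace())
print("variance of distance ratio (gauge-free):", np.var(invariant))
```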

Journal Article•DOI•
01 Feb 1999
TL;DR: A sorting image computational sensor, a VLSI chip which senses an image and sorts all pixels by their intensities, is presented; the global cumulative histogram is used internally on-chip in a top-down fashion to adapt the values of individual pixels so as to reflect the index of the incoming light, thus computing an "image of indices".
Abstract: Presents a new intensity-to-time processing paradigm suitable for very large scale integration (VLSI) computational sensor implementation of global operations over sensed images. Global image quantities usually describe images with fewer data. When computed at the point of sensing, global quantities result in low-latency performance due to the reduced data transfer requirements between an image sensor and a processor. The global quantities also help global top-down adaptation: the quantities are continuously computed on-chip, and are readily available to sensing for adaptation. As an example, we have developed a sorting image computational sensor, a VLSI chip which senses an image and sorts all pixels by their intensities. The first sorting sensor prototype is a 21 × 26 array of cells. It receives an image optically, senses it, and computes the image's cumulative histogram, a global quantity which can be quickly routed off chip via one pin. In addition, the global cumulative histogram is used internally on-chip in a top-down fashion to adapt the values of individual pixels so as to reflect the index of the incoming light, thus computing an "image of indices". The image of indices never saturates and has a uniform histogram.
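The chip's two outputs, the cumulative histogram and the "image of indices", are easy to reproduce in software; the sketch below computes the equivalent quantities with NumPy (the chip itself does this with intensity-to-time conversion, so this is only the reference computation on made-up data):

```python
import numpy as np

rng = np.random.default_rng(3)
img = rng.integers(0, 256, size=(21, 26))          # stand-in for the 21x26 sensor

# Cumulative histogram: the global quantity the chip streams off one pin.
hist = np.bincount(img.ravel(), minlength=256)
cum_hist = np.cumsum(hist)

# "Image of indices": each pixel replaced by its rank among all pixels.
ranks = np.empty(img.size, dtype=int)
ranks[np.argsort(img, axis=None, kind="stable")] = np.arange(img.size)
indices = ranks.reshape(img.shape)

# Every rank occurs exactly once, i.e. the index image has a flat histogram
# and never saturates.
print(np.unique(indices).size == img.size)
```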

Book Chapter•DOI•
19 Sep 1999
TL;DR: This work characterizes such anatomical variations to achieve accurate registration between 3-D images of human anatomies and shows how innate differences in the appearance and location of anatomical structures between individuals make accurate registration difficult.
Abstract: Registration between 3-D images of human anatomies enables cross-subject diagnosis. However, innate differences in the appearance and location of anatomical structures between individuals make accurate registration difficult. We characterize such anatomical variations to achieve accurate registration.

01 Jan 1999
TL;DR: This thesis focuses on characterizing non-pathological variations in human brain anatomy and applying such knowledge to achieve accurate 3D deformable registration, reducing the overall error on 40 test cases by 34%.
Abstract: Registering medical images of different individuals is difficult due to inherent anatomical variabilities and possible pathologies. This thesis focuses on characterizing non-pathological variations in human brain anatomy, and applying such knowledge to achieve accurate 3D deformable registration. Inherent anatomical variations are automatically extracted by deformably registering training data with an expert-segmented 3-D image, a digital brain atlas. Statistical properties of the density and geometric variations in brain anatomy are measured and encoded into the atlas to build a statistical atlas. These statistics can function as prior knowledge to guide the automatic registration process. Compared to an algorithm with no knowledge guidance, registration using the statistical atlas reduces the overall error on 40 test cases by 34%. Automatic registration between the atlas and a subject’s data adapts the expert segmentation for the subject, thus reducing the months-long manual segmentation process to minutes. Accurate and efficient segmentation of medical images enables quantitative study of anatomical differences between populations, as well as detection of abnormal variations indicative of pathologies.

Journal Article•DOI•
TL;DR: A system that automatically segments and classifies features in brain MRI volumes using an atlas, a hand-segmented and classified MRI of a normal brain, which is warped in 3-D using a hierarchical deformable matching algorithm until it closely matches the subject.

01 Jan 1999
TL;DR: In this paper, a cooperative, multi-sensor video surveillance system that provides continuous coverage over large battlefield areas is presented; the authors have begun a joint, integrated feasibility demonstration in the area of Video Surveillance and Monitoring (VSAM).
Abstract: Carnegie Mellon University (CMU) and the David Sarnoff Research Center (Sarnoff) have begun a joint, integrated feasibility demonstration in the area of Video Surveillance and Monitoring (VSAM). The objective is to develop a cooperative, multi-sensor video surveillance system that provides continuous coverage over large battlefield areas. Image Understanding (IU) technologies will be developed to: 1) coordinate multiple sensors to seamlessly track moving targets over an extended area, 2) actively control sensor and platform parameters to track multiple moving targets, 3) integrate multisensor output with collateral data to maintain an evolving, scene-level representation of all targets and platforms, and 4) monitor the scene for unusual "trigger" events and activities. These technologies will be integrated into an experimental testbed to support evaluation, data collection, and demonstration of other VSAM technologies developed within the DARPA IU community.

Patent•
23 Apr 1999
TL;DR: In this paper, a CCD camera is used to produce video image data involving a license plate obtained by photographing a front and rear portion of a motor vehicle, and a literal recognition device recognizes letters from the literal image (571) of the literal positional region obtained from the literal region extracting device.
Abstract: In a license plate information reader device (A) for motor vehicles, a CCD camera (1) is provided to produce video image data (11) involving a license plate obtained by photographing a front and rear portion of a motor vehicle. An A/D converter (3) produces a digital multivalue image data (31) by A/D converting the video image data (11). A license plate extracting device (4) is provided to produce a digital multivalue image data (41) corresponding to an area in which the license plate occupies. A literal region extracting device (5) extracts a literal positional region of a letter sequence of the license plate based on the image obtained from the license plate extracting device (4). A literal recognition device (6) is provided to recognize a letter from a literal image (571) of the literal positional region obtained from the literal region extracting device (5). An image emphasis device is provided to emphasize the literal image (571) of the literal positional region by replacing a part of the literal region extracting device (5) with a filter net which serves as a neural network.
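The patent describes a pipeline of plate-region extraction, character-region extraction, and recognition. As a loose sketch of the middle stages only, the snippet below binarizes a cropped plate image and keeps character-sized connected regions with OpenCV; the size limits are arbitrary guesses and nothing here reproduces the patented circuitry or its neural-network filter:

```python
import cv2

def candidate_character_regions(bgr_plate):
    """Rough sketch: binarize a cropped plate image and return bounding boxes
    of character-sized connected regions (size limits are arbitrary guesses)."""
    gray = cv2.cvtColor(bgr_plate, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # OpenCV 4.x return signature assumed (contours, hierarchy).
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if 10 < h < 80 and 5 < w < 60:        # plausible character sizes
            boxes.append((x, y, w, h))
    return sorted(boxes)                       # left-to-right reading order
```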

Proceedings Article•DOI•
01 Dec 1999
TL;DR: The concept of projective 3D voxel, which makes it possible to handle 3D geometric data without complete 3D geometry information, is described.
Abstract: In this paper, we propose an approach for constructing a projective 3D voxel space based on the epipolar geometry obtained with weak calibration. In this voxel space, the epipolar lines follow the coordinate axes. In the field of computer vision, it is common to reconstruct 3D geometry data based on camera calibration data and disparities of matching points in each image. When the goal of the system is to generate images from another point of view, complete 3D geometry reconstruction is not necessarily required. However, detecting consistent matching points in several pairs of images without complete 3D geometry information is difficult. This paper describes the concept of the projective 3D voxel, which makes it possible to handle 3D geometric data without complete 3D geometry information.

Journal Article•DOI•
TL;DR: Experiments using real image sequences taken by a hand-held camcorder show that the proposed automatic line tracking method is robust against line extraction problems, closely-spaced lines, and large motion.
Abstract: We propose an automatic line tracking method which can deal with broken or closely-spaced line segments more accurately than previous methods over an image sequence. The method uses both grey scale information of the original images and geometric attributes of line segments. By using our hierarchical optical flow technique, we can get a good prediction of line segments in a consecutive frame even with large motion. The line attribute of direction, not the orientation, discriminates closely-spaced line segments because when lines are crowded or closely-spaced, their directions are opposite in many cases, even though their orientations are the same. A proposed new matching cost function enables us to deal with multiple collinear line segment matching easily instead of using one-to-one matching. Experiments using real image sequences taken by a hand-held camcorder show that our method is robust against line extraction problems, closely-spaced lines, and large motion.
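The direction/orientation distinction drawn above is easy to make concrete: orientation is an undirected angle in [0, π), direction a directed angle in [0, 2π), so two nearly coincident segments traversed in opposite senses share an orientation but differ in direction by about π. A small sketch (the endpoints are made up):

```python
import numpy as np

def orientation(p0, p1):
    """Undirected angle of a segment, in [0, pi)."""
    d = np.subtract(p1, p0)
    return np.arctan2(d[1], d[0]) % np.pi

def direction(p0, p1):
    """Directed angle of a segment, in [0, 2*pi)."""
    d = np.subtract(p1, p0)
    return np.arctan2(d[1], d[0]) % (2 * np.pi)

# Two nearly coincident segments traversed in opposite senses
# (e.g. the two sides of a thin dark bar):
a0, a1 = (0.0, 0.0), (10.0, 0.1)
b0, b1 = (10.0, 1.1), (0.0, 1.0)
print(orientation(a0, a1), orientation(b0, b1))  # nearly equal
print(direction(a0, a1), direction(b0, b1))      # differ by about pi
```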

01 Jan 1999
TL;DR: A computer vision system that automatically recognizes facial action units (AUs) or AU combinations using Hidden Markov Models (HMMs) and uses principal component analysis (PCA) to compress the data.
Abstract: We developed a computer vision system that automatically recognizes facial action units (AUs) or AU combinations using Hidden Markov Models (HMMs). AUs are defined as visually discriminable muscle movements. The facial expressions are recognized in digitized image sequences of arbitrary length. In this paper, we use two approaches to extract the expression information: (1) facial feature point tracking, which is sensitive to subtle feature motion, in the mouth region, and (2) pixel-wise flow tracking, which includes more motion information, in the forehead and brow regions. In the latter approach, we use principal component analysis (PCA) to compress the data. We accurately recognize 93% of the lower face expressions and 91% of the upper face expressions.
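A hedged sketch of the classification scheme described above, one HMM per action unit with the highest-likelihood model winning, using PCA-compressed features and the hmmlearn package as a stand-in for whatever HMM implementation the authors used; the data, dimensions, and class separations below are synthetic:

```python
import numpy as np
from sklearn.decomposition import PCA
from hmmlearn import hmm

rng = np.random.default_rng(4)

def make_sequences(offset, n_seq=20, length=15, dim=40):
    """Synthetic stand-in for per-frame flow features of one action unit."""
    return [offset + rng.normal(scale=0.3, size=(length, dim))
            for _ in range(n_seq)]

train = {"AU1": make_sequences(0.0), "AU4": make_sequences(0.5)}

# PCA compresses the per-frame features (as in the flow-based branch above).
pca = PCA(n_components=5).fit(
    np.vstack([s for seqs in train.values() for s in seqs]))

models = {}
for au, seqs in train.items():
    X = np.vstack([pca.transform(s) for s in seqs])
    lengths = [len(s) for s in seqs]
    models[au] = hmm.GaussianHMM(n_components=3, covariance_type="diag",
                                 n_iter=50).fit(X, lengths)

test = pca.transform(make_sequences(0.5, n_seq=1)[0])
print(max(models, key=lambda au: models[au].score(test)))  # should prefer "AU4"
```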

01 Jan 1999
TL;DR: This work presents progress toward a robust system to detect and track facial features, including both permanent and transient facial features, in a nearly frontal image sequence by combining color, shape, edge, and motion information.
Abstract: Accurate and robust tracking of facial features must cope with the large variation in appearance across subjects and the combination of rigid and non-rigid motion. We present work toward a robust system to detect and track facial features, including both permanent (e.g. mouth, eye, and brow) and transient (e.g. furrows and wrinkles) features, in a nearly frontal image sequence. Multi-state facial component models are proposed for tracking and modeling different facial features. Based on these multi-state models, and without any artificial enhancement, we detect and track the facial features, including mouth, eyes, brows, cheeks, and their related wrinkles and facial furrows, by combining color, shape, edge and motion information. Given the initial location of the facial features in the first frame, the facial features can be detected or tracked automatically in the remaining frames. Our system is tested on 500 image sequences from the Pittsburgh-Carnegie Mellon University (Pitt-CMU) Facial Expression Action Unit (AU) Coded Database, which includes image sequences from children and adults of European, African, and Asian ancestry. Accurate tracking results are obtained in 98% of image sequences.

01 Jan 1999
TL;DR: A robust homography algorithm is described which incorporates contrast/brightness adjustment and robust estimation into image registration, and the Levenberg-Marquardt method is applied to generate a dense projective depth map.
Abstract: We propose a framework to recover projective depth based on image homography and discuss its application to scene analysis of video sequences. We describe a robust homography algorithm which incorporates contrast/brightness adjustment and robust estimation into image registration. We present a camera motion solver to obtain the ego-motion and the real/virtual plane position from homography. We then apply the Levenberg-Marquardt method to generate a dense projective depth map. We also discuss temporal integration over video sequences. Finally we present the results of applying the homography-based video analysis to motion detection. 1 Introduction: Temporal information redundancy of video sequences allows us to use efficient, incremental methods which perform temporal integration of information for gradual refinement. Approaches handling 3D scene analysis of video sequences with camera motion can be classified into two categories: algorithms which use 2D transformation or model fitting, and algorithms which use 3D geometry analysis. Video sequences of our interest are taken from a moving airborne platform where the ego-motion is complex and the scene is relatively distant but not necessarily flat;
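A sketch of the first two stages on point matches, with OpenCV's RANSAC-based findHomography standing in for the paper's robust estimator (which additionally adjusts contrast/brightness during registration). The residual after warping is the parallax cue that the paper goes on to turn into projective depth:

```python
import cv2
import numpy as np

def dominant_plane_residuals(pts_src, pts_dst):
    """Fit a homography to point matches between two frames and return the
    per-point residual after warping; large residuals indicate parallax
    (off-plane structure) or independent motion.

    pts_src, pts_dst : (N, 2) float32 arrays of matched points.
    """
    H, inlier_mask = cv2.findHomography(pts_src, pts_dst, cv2.RANSAC, 3.0)
    warped = cv2.perspectiveTransform(pts_src.reshape(-1, 1, 2), H).reshape(-1, 2)
    residuals = np.linalg.norm(warped - pts_dst, axis=1)
    return H, residuals, inlier_mask.ravel().astype(bool)
```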

01 Jan 1999
TL;DR: This work represents anatomical variations in the form of statistical models and embeds these statistics into a 3-D digital brain atlas; the models are built by registering a training set of brain MRI volumes with the atlas.
Abstract: Registration between 3-D images of human anatomies enables cross-subject diagnosis. However, innate differences in the appearance and location of anatomical structures between individuals make accurate registration difficult. We characterize such anatomical variations to achieve accurate registration. We represent anatomical variations in the form of statistical models, and embed these statistics into a 3-D digital brain atlas which we use as a reference. These models are built by registering a training set of brain MRI volumes with the atlas. This associates each voxel in the atlas with multi-dimensional distributions of variations in intensity and geometry of the training set. We evaluate statistical properties of these distributions to build a statistical atlas. When we register the statistical atlas with a particular subject, the embedded statistics function as prior knowledge to guide the deformation process. This allows the deformation to tolerate variations between individuals while retaining discrimination between different structures. This method gives an overall voxel mis-classification rate of 2.9% on 40 test cases; this is a 34% error reduction over the performance of our previous algorithm without using anatomical knowledge. Besides achieving accurate registration, statistical models of anatomical variations also enable quantitative study of anatomical differences between populations.
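The statistical atlas associates each atlas voxel with distributions measured from registered training volumes. A minimal sketch of that idea, restricted to intensity statistics and tiny synthetic stand-in volumes (the actual work also models geometric variations and uses the statistics inside a deformable registration):

```python
import numpy as np

# Suppose `training` holds K brain volumes already warped into atlas space;
# here they are tiny random stand-ins rather than real MRI scans.
rng = np.random.default_rng(5)
K, shape = 10, (4, 4, 4)
training = rng.normal(loc=100.0, scale=8.0, size=(K,) + shape)

# Per-voxel statistics of the training intensities.
mean = training.mean(axis=0)
std = training.std(axis=0) + 1e-6

def intensity_prior_penalty(subject):
    """Squared Mahalanobis-style distance of a subject's intensities from the
    atlas statistics; usable as (part of) a prior term during registration."""
    return ((subject - mean) / std) ** 2

subject = rng.normal(loc=100.0, scale=8.0, size=shape)
print(intensity_prior_penalty(subject).mean())
```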

Proceedings Article•DOI•
01 Jul 1999

Proceedings Article•DOI•
04 Oct 1999
TL;DR: PALM, a portable sensor-augmented vision system for large-scene modeling, solves the problem of recovering large structures in arbitrary scenes from video streams taken by a sensor-augmented camera through the use of multiple constraints derived from GPS measurements, camera orientation sensor readings, and image features.
Abstract: We propose PALM, a portable sensor-augmented vision system for large-scene modeling. The system solves the problem of recovering large structures in arbitrary scenes from video streams taken by a sensor-augmented camera. Central to the solution method is the use of multiple constraints derived from GPS measurements, camera orientation sensor readings, and image features. The knowledge of camera orientation enhances computational efficiency by making a linear formulation of perspective ray constraints possible. The overall shape is constructed by merging smaller shape segments. Shape merging errors are minimized using the concept of shape hierarchy, which is realized through a "landmarking" technique. The features of the system include its use of a small number of images and feature points, its portability, and its low-cost interface for synchronizing sensor measurements with the video stream. Example reconstructions of a football stadium and two large buildings are presented and these results are compared with the ground truth.
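The point about linearity can be illustrated simply: with camera orientation known from sensors and position from GPS, each observation constrains a 3-D point to lie on a known ray, and intersecting rays is a linear least-squares problem. A small sketch under those assumptions (intrinsics, rotations, and pixels are placeholders; this is not the PALM pipeline itself):

```python
import numpy as np

def ray_direction(R, K, pixel):
    """World-frame viewing ray for a pixel, given world-to-camera rotation R
    and intrinsics K, both assumed known from the augmented sensors."""
    x = np.array([pixel[0], pixel[1], 1.0])
    return R.T @ np.linalg.inv(K) @ x

def triangulate_from_rays(centers, directions):
    """Least-squares 3-D point closest to a set of rays X = C_j + t * d_j.
    Each ray contributes the linear constraint (I - d d^T)(X - C) = 0;
    at least two non-parallel rays are needed."""
    A = np.zeros((3, 3)); b = np.zeros(3)
    for C, d in zip(centers, directions):
        d = d / np.linalg.norm(d)
        M = np.eye(3) - np.outer(d, d)
        A += M
        b += M @ C
    return np.linalg.solve(A, b)
```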

Proceedings Article•DOI•
28 May 1999
TL;DR: This work describes a method to extract quantitative information from optically-sectioned DIC microscope images and attempts to reconstruct the three-dimensional structure and refractive index distribution throughout the specimen.
Abstract: Summary form only given. Differential interference contrast (DIC) microscopy, a method pioneered by Georges Nomarski, is widely used to study live biological specimens. However, to date, biologists only qualitatively interpret DIC microscope images. In this work, we describe a method to extract quantitative information from optically-sectioned DIC microscope images. Specifically, given a set of images of a specimen, we attempt to reconstruct the three-dimensional structure and refractive index distribution throughout the specimen.

Proceedings Article•DOI•
13 Oct 1999
TL;DR: A computational model is developed and verified for the image formation process of differential interference contrast microscopy and it is planned to use this model to reconstruct the properties of unknown specimens.
Abstract: Biologists often use differential interference contrast (DIC) microscopy to study live cells. However, they are limited to qualitative observations due to the inherent nonlinear relation between the object properties and image intensity. As a first step towards quantitatively measuring optical properties of objects from DIC images, we develop and verify a computational model for the image formation process. Next, we plan to use this model to reconstruct the properties of unknown specimens.
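A common qualitative simplification of DIC contrast is a bias plus the directional derivative of the specimen's optical path length along the shear direction; the sketch below renders a crude "DIC-like" image from a synthetic phase bump under that assumption. The paper develops and validates a far more complete physical model, so this is only an assumed first-order picture:

```python
import numpy as np

def dic_like_image(opl, shear_dir=(1.0, 1.0), bias=0.3):
    """Render a crude DIC-style image from an optical path length map `opl`
    using the simplification: intensity ~ bias + directional derivative of
    OPL along the shear direction."""
    gy, gx = np.gradient(opl)
    sx, sy = np.asarray(shear_dir) / np.linalg.norm(shear_dir)
    return bias + sx * gx + sy * gy

# Synthetic "cell": a smooth bump in optical path length.
y, x = np.mgrid[-1:1:128j, -1:1:128j]
opl = np.exp(-(x**2 + y**2) / 0.2)
img = dic_like_image(opl)
print(img.min(), img.max())   # shadow-cast relief appearance around the bump
```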