
Showing papers by Takeo Kanade published in 2010


Proceedings Article•DOI•
13 Jun 2010
TL;DR: The Cohn-Kanade (CK+) database is presented, with baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier using a leave-one-out subject cross-validation for both AU and emotion detection for the posed data.
Abstract: In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limitations have become apparent: 1) While AU codes are well validated, emotion labels are not, as they refer to what was requested rather than what was actually performed, 2) The lack of a common performance metric against which to evaluate new algorithms, and 3) Standard protocols for common databases have not emerged. As a consequence, the CK database has been used for both AU and emotion detection (even though labels for the latter have not been validated), comparison with benchmark algorithms is missing, and use of random subsets of the original database makes meta-analyses difficult. To address these and other concerns, we present the Extended Cohn-Kanade (CK+) database. The number of sequences is increased by 22% and the number of subjects by 27%. The target expression for each sequence is fully FACS coded and emotion labels have been revised and validated. In addition to this, non-posed sequences for several types of smiles and their associated metadata have been added. We present baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier using a leave-one-out subject cross-validation for both AU and emotion detection for the posed data. The emotion and AU labels, along with the extended image data and tracked landmarks will be made available July 2010.
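
As a rough illustration of the evaluation protocol described above, the sketch below runs leave-one-subject-out cross-validation with a linear SVM in scikit-learn. The `features`, `labels`, and `subject_ids` arrays are random placeholders standing in for AAM-derived features and CK+ annotations, so this shows only the protocol, not the published baseline.

```python
# Minimal sketch of leave-one-subject-out evaluation with a linear SVM.
# Feature extraction (AAM) is not shown; all arrays are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 68 * 2))    # stand-in for landmark/appearance features
labels = rng.integers(0, 7, size=300)        # e.g., 7 emotion categories
subject_ids = rng.integers(0, 30, size=300)  # one group per subject

accuracies = []
for train_idx, test_idx in LeaveOneGroupOut().split(features, labels, subject_ids):
    clf = LinearSVC(max_iter=5000).fit(features[train_idx], labels[train_idx])
    accuracies.append(clf.score(features[test_idx], labels[test_idx]))

print(f"mean leave-one-subject-out accuracy: {np.mean(accuracies):.3f}")
```

Grouping folds by subject, rather than by sequence, prevents frames of the same person from appearing in both the training and test sets.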

3,439 citations


Journal Article•DOI•
TL;DR: This paper introduces the database, describes the recording procedure, and presents results from baseline experiments using PCA and LDA classifiers to highlight similarities and differences between PIE and Multi-PIE.

1,333 citations


01 Jan 2010
TL;DR: A new voting-based object pose extraction algorithm that does not rely on 2D/3D feature correspondences and thus reduces the early-commitment problem plaguing the generality of traditional vision-based pose extraction algorithms is shown.
Abstract: Society is becoming more automated with robots beginning to perform most tasks in factories and starting to help out in home and office environments. One of the most important functions of robots is the ability to manipulate objects in their environment. Because the space of possible robot designs, sensor modalities, and target tasks is huge, researchers end up having to manually create many models, databases, and programs for their specific task, an effort that is repeated whenever the task changes. Given a specification for a robot and a task, the presented framework automatically constructs the necessary databases and programs required for the robot to reliably execute manipulation tasks. It includes contributions in three major components that are critical for manipulation tasks. The first is a geometric-based planning system that analyzes all necessary modalities of manipulation planning and offers efficient algorithms to formulate and solve them. This allows identification of the information needed from the task and robot specifications. Using this set of analyses, we build a planning knowledge-base that allows informative geometric reasoning about the structure of the scene and the robot's goals. We show how to efficiently generate and query the information for planners. The second is a set of efficient algorithms considering the visibility of objects in cameras when choosing manipulation goals. We show results with several robot platforms using gripper-mounted cameras to boost the accuracy of object detection and to reliably complete the tasks. Furthermore, we use the presented planning and visibility infrastructure to develop a completely automated extrinsic camera calibration method and a method for detecting insufficient calibration data. The third is a vision-centric database that can analyze a rigid object's surface for stable and discriminable features to be used in pose extraction programs. Furthermore, we show work towards a new voting-based object pose extraction algorithm that does not rely on 2D/3D feature correspondences and thus reduces the early-commitment problem plaguing the generality of traditional vision-based pose extraction algorithms. In order to reinforce our theoretical contributions with a solid implementation basis, we discuss the open-source planning environment OpenRAVE, which began and evolved as a result of the work done in this thesis. We present an analysis of its architecture and provide insight into successful robotics software environments.

540 citations


Journal Article•DOI•
TL;DR: Frequency analysis allows for greater accuracy in the removal of dynamic weather and in the performance of feature extraction than previous pixel-based or patch-based methods and is effective for videos with both scene and camera motions.
Abstract: Dynamic weather such as rain and snow causes complex spatio-temporal intensity fluctuations in videos. Such fluctuations can adversely impact vision systems that rely on small image features for tracking, object detection and recognition. While these effects appear to be chaotic in space and time, we show that dynamic weather has a predictable global effect in frequency space. For this, we first develop a model of the shape and appearance of a single rain or snow streak in image space. Detecting individual streaks is difficult even with an accurate appearance model, so we combine the streak model with the statistical characteristics of rain and snow to create a model of the overall effect of dynamic weather in frequency space. Our model is then fit to a video and is used to detect rain or snow streaks first in frequency space, and the detection result is then transferred to image space. Once detected, the amount of rain or snow can be reduced or increased. We demonstrate that our frequency analysis allows for greater accuracy in the removal of dynamic weather and in the performance of feature extraction than previous pixel-based or patch-based methods. We also show that unlike previous techniques, our approach is effective for videos with both scene and camera motions.
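
The toy sketch below illustrates only the general mechanics of operating on a video volume in 3-D frequency space and attenuating components attributed to streaks; the paper's streak appearance model and its statistical fit are replaced by a hypothetical `streak_mask`.

```python
# Toy illustration of frequency-space processing of a video volume.
# `streak_mask` is a hypothetical stand-in for the fitted rain/snow model.
import numpy as np

def suppress_in_frequency_space(video, streak_mask):
    """video: (T, H, W) float array; streak_mask: (T, H, W) in [0, 1],
    1 where the fitted model attributes spectral energy to rain or snow."""
    spectrum = np.fft.fftn(video)             # 3-D spatio-temporal FFT
    cleaned = spectrum * (1.0 - streak_mask)  # attenuate streak frequencies
    return np.real(np.fft.ifftn(cleaned))     # back to image space

video = np.random.rand(16, 64, 64)
mask = np.zeros_like(video)                   # placeholder: no suppression
restored = suppress_in_frequency_space(video, mask)
print(restored.shape)
```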

357 citations


Proceedings Article•DOI•
06 Dec 2010
TL;DR: This paper argues for a parametric representation of objects in 3D, which allows us to incorporate volumetric constraints of the physical world, and shows that augmenting current structured prediction techniques with volumetric reasoning significantly improves the performance of the state-of-the-art.
Abstract: There has been a recent push in extraction of 3D spatial layout of scenes. However, none of these approaches model the 3D interaction between objects and the spatial layout. In this paper, we argue for a parametric representation of objects in 3D, which allows us to incorporate volumetric constraints of the physical world. We show that augmenting current structured prediction techniques with volumetric reasoning significantly improves the performance of the state-of-the-art.

319 citations


Journal Article•DOI•
26 Jul 2010
TL;DR: In this paper, a multi-layered display that uses water drops as voxels is presented, where a single projector-camera system and a set of linear drop generator manifolds are tightly synchronized and controlled using a computer.
Abstract: We present a multi-layered display that uses water drops as voxels. Water drops refract most incident light, making them excellent wide-angle lenses. Each 2D layer of our display can exhibit arbitrary visual content, creating a layered-depth (2.5D) display. Our system consists of a single projector-camera system and a set of linear drop generator manifolds that are tightly synchronized and controlled using a computer. Following the principles of fluid mechanics, we are able to accurately generate and control drops so that, at any time instant, no two drops occupy the same projector pixel's line-of-sight. This drop control is combined with an algorithm for space-time division of projector light rays. Our prototype system has up to four layers, with each layer consisting of a row of 50 drops that can be generated at up to 60 Hz. The effective resolution of the display is 50 × projector vertical resolution × number of layers. We show how this water drop display can be used for text, videos, and interactive games.
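
As a back-of-the-envelope illustration of the numbers quoted above, the snippet below computes the effective voxel count and shows a naive round-robin time-division of projector frames across layers. The projector resolution and the scheduling loop are illustrative assumptions, not the authors' control algorithm.

```python
# Rough illustration of the display's effective resolution and of time-dividing
# projector frames across drop layers. Numbers follow the abstract (4 layers,
# 50 drops per row); the projector resolution and schedule are assumptions.
num_layers = 4
drops_per_row = 50
projector_vertical_res = 768   # hypothetical projector resolution
refresh_hz = 60

effective_voxels = drops_per_row * projector_vertical_res * num_layers
print(f"effective resolution: {effective_voxels} addressable voxels")

# Naive round-robin assignment of frame slots to layers.
for frame in range(8):
    active_layer = frame % num_layers
    print(f"frame {frame} ({frame / refresh_hz:.3f} s): content for layer {active_layer}")
```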

85 citations


Proceedings Article•DOI•
14 Apr 2010
TL;DR: This work presents a pixel classification approach that is independent of cell type or imaging modality, and demonstrates the effectiveness of this approach on four cell types with diverse morphologies under different microscopy imaging modalities.
Abstract: Cell segmentation in microscopy imagery is essential for many bioimage applications such as cell tracking. To segment cells from the background accurately, we present a pixel classification approach that is independent of cell type or imaging modality. We train a set of Bayesian classifiers from clustered local training image patches. Each Bayesian classifier is an expert that makes decisions in its specific domain. The decision from the mixture of experts determines how likely a new pixel is to be a cell pixel. We demonstrate the effectiveness of this approach on four cell types with diverse morphologies under different microscopy imaging modalities.
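
One plausible way to realize the mixture-of-experts idea is sketched below: local training patches are clustered, one Bayesian classifier is trained per cluster, and the experts' posteriors are combined with distance-based weights. The patch features, clustering, and combination rule are simplifying assumptions, not the paper's exact recipe.

```python
# Sketch of a mixture of Bayesian experts for pixel classification.
# Patches and labels are random placeholders for labeled training pixels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)
patches = rng.normal(size=(2000, 25))       # 5x5 patches around labeled pixels
is_cell = rng.integers(0, 2, size=2000)     # 1 = cell pixel, 0 = background

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(patches)
experts = [GaussianNB().fit(patches[kmeans.labels_ == k], is_cell[kmeans.labels_ == k])
           for k in range(kmeans.n_clusters)]

def cell_probability(patch):
    # Weight each expert by its cluster's proximity to the patch (softmax of -distance).
    dists = np.linalg.norm(kmeans.cluster_centers_ - patch, axis=1)
    weights = np.exp(-dists) / np.exp(-dists).sum()
    probs = np.array([e.predict_proba(patch[None, :])[0, 1] for e in experts])
    return float(weights @ probs)

print(cell_probability(patches[0]))
```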

82 citations


Proceedings Article•DOI•
11 Nov 2010
TL;DR: Methods for assessment of exercise quality using body-worn tri-axial accelerometers will form the basis for an at-home rehabilitation device that will recognize errors in patient exercise performance, provide appropriate feedback on the performance, and motivate the patient to continue the prescribed regimen.
Abstract: In this paper, we describe methods for assessment of exercise quality using body-worn tri-axial accelerometers. We assess exercise quality by building a classifier that labels incorrect exercises. The incorrect performances are divided into a number of classes of errors as defined by a physical therapist. We focus on exercises commonly prescribed for knee osteoarthritis: standing hamstring curl, reverse hip abduction, and lying straight leg raise. The methods presented here will form the basis for an at-home rehabilitation device that will recognize errors in patient exercise performance, provide appropriate feedback on the performance, and motivate the patient to continue the prescribed regimen.
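
A minimal sketch of such a setup follows, assuming simple per-window statistics of the tri-axial signal as features and an SVM over error classes as the classifier; both are illustrative choices rather than the methods reported in the paper.

```python
# Sketch: window a tri-axial accelerometer stream, compute per-axis statistics,
# and train a multiclass classifier whose classes are exercise-error categories.
import numpy as np
from sklearn.svm import SVC

def window_features(acc, win=128):
    """acc: (N, 3) accelerometer samples -> (num_windows, 12) feature matrix."""
    feats = []
    for start in range(0, len(acc) - win + 1, win):
        w = acc[start:start + win]
        feats.append(np.hstack([w.mean(0), w.std(0), w.min(0), w.max(0)]))
    return np.array(feats)

rng = np.random.default_rng(2)
acc = rng.normal(size=(128 * 40, 3))    # placeholder recording
X = window_features(acc)
y = rng.integers(0, 4, size=len(X))     # e.g., correct form + 3 error classes
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:5]))
```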

68 citations


Proceedings Article•DOI•
14 Apr 2010
TL;DR: A fully-automated mitosis event detector using hidden conditional random fields for cell populations imaged with time-lapse phase contrast microscopy that achieved 95% precision and 85% recall in very challenging image sequences of multipolar-shaped C3H10T1/2 mesenchymal stem cells is proposed.
Abstract: We propose a fully-automated mitosis event detector using hidden conditional random fields for cell populations imaged with time-lapse phase contrast microscopy. The method consists of two stages that jointly optimize recall and precision. First, we apply model-based microscopy image preconditioning and volumetric segmentation to identify candidate spatiotemporal sub-regions in the input image sequence where mitosis potentially occurred. Then, we apply a learned hidden conditional random field classifier to classify each candidate sequence as mitosis or not. The proposed detection method achieved 95% precision and 85% recall in very challenging image sequences of multipolar-shaped C3H10T1/2 mesenchymal stem cells. The superiority of the method was further demonstrated by comparisons with conditional random field and support vector machine classifiers. Moreover, the proposed method does not depend on empirical parameters, ad hoc image processing, or cell tracking; and can be straightforwardly adapted to different cell types.

67 citations


Book Chapter•DOI•
20 Sep 2010
TL;DR: It turns out that the phase contrast imaging system can be relatively well explained by a linear imaging model; based on this model, a quadratic optimization function with sparseness and smoothness regularizations restores the "authentic" phase contrast images that directly correspond to the specimen's optical path length, without phase contrast artifacts such as halo and shade-off.
Abstract: Image segmentation is essential for many automated microscopy image analysis systems. Rather than treating microscopy images as general natural images and rushing into the image processing warehouse for solutions, we propose to first study a microscope's optical properties to model its image formation process, using phase contrast microscopy as an exemplar. It turns out that the phase contrast imaging system can be relatively well explained by a linear imaging model. Using this model, we formulate a quadratic optimization function with sparseness and smoothness regularizations to restore the "authentic" phase contrast images that directly correspond to the specimen's optical path length, free of phase contrast artifacts such as halo and shade-off. With artifacts removed, high quality segmentation can be achieved by simply thresholding the restored images. The imaging model and restoration method are quantitatively evaluated on two sequences with thousands of cells captured over several days.
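
One plausible way to write the restoration objective described above is given below; the notation and regularization weights are ours rather than the paper's.

```latex
% g : observed phase contrast image (vectorized)
% H : linear imaging model of the phase contrast optics
% f : restored image, proportional to the specimen's optical path length
% L : graph Laplacian encoding spatial smoothness between neighboring pixels
\min_{f \ge 0} \; \|Hf - g\|_2^2 \;+\; \omega_s \|f\|_1 \;+\; \omega_r\, f^{\top} L f
```

The sparseness term favors a mostly dark restored image with bright cell regions, which is what makes the final thresholding step effective.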

61 citations


Proceedings Article•DOI•
13 Jun 2010
TL;DR: The proposed Heterogeneous Conditional Random Field model successfully realizes joint detection and segmentation of the cell regions into individual cells whether the cells are separate or touch one another.
Abstract: Detecting and segmenting cell regions in microscopic images is a challenging task, because cells typically do not have rich features, and their shapes and appearances are highly irregular and flexible. Furthermore, cells often form clusters, rendering existing joint detection and segmentation algorithms unable to segment out individual cells. We address these difficulties by proposing a Heterogeneous Conditional Random Field (HCRF), in which different nodes have different state sets. The state sets are designed in such a way that the resulting HCRF model can encode all possible detection/segmentation cases while remaining identifiable and compact. Owing to the provably optimal design of the state sets, the proposed model successfully realizes joint detection and segmentation of cell regions into individual cells, whether the cells are separate or touch one another. Experiments on two different types of cell images show that the HCRF outperforms several recently proposed methods.

Proceedings Article•DOI•
25 Oct 2010
TL;DR: This work addresses the problem of interactive search for a target of interest in surveillance imagery by iteratively learning a distance metric for retrieval, based on user feedback, and employs rank based constraints and convex optimization to efficiently learn the distance metric.
Abstract: We address the problem of interactive search for a target of interest in surveillance imagery. Our solution consists of iteratively learning a distance metric for retrieval, based on user feedback. The approach employs (retrieval) rank based constraints and convex optimization to efficiently learn the distance metric. The algorithm uses both user labeled and unlabeled examples in the learning process. The method is fast enough for a new metric to be learned interactively for each target query. In order to reduce the burden on the user, a model-independent active learning method is used to select key examples, for response solicitation. This leads to a significant reduction in the number of user-interactions required for retrieving the target of interest. The proposed method is evaluated on challenging pedestrian and vehicle data sets, and compares favorably to the state of the art in target re-acquisition algorithms.
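
The sketch below shows a generic recipe in the same spirit: a Mahalanobis metric is learned from "this pair should rank closer than that pair" constraints via hinge losses and projection onto the positive semidefinite cone. It is not the paper's exact convex program, and the construction of constraints from user feedback is omitted.

```python
# Generic sketch of Mahalanobis metric learning from rank-style triplet
# constraints, with projection onto the PSD cone after each gradient step.
import numpy as np

def learn_metric(X, triplets, lr=0.01, iters=200, margin=1.0):
    """triplets: list of (anchor, positive, negative) row indices into X."""
    d = X.shape[1]
    M = np.eye(d)
    for _ in range(iters):
        grad = np.zeros((d, d))
        for a, p, n in triplets:
            dp, dn = X[a] - X[p], X[a] - X[n]
            if dp @ M @ dp + margin > dn @ M @ dn:   # rank constraint violated
                grad += np.outer(dp, dp) - np.outer(dn, dn)
        M -= lr * grad
        w, V = np.linalg.eigh(M)                     # project onto PSD cone
        M = (V * np.clip(w, 0, None)) @ V.T
    return M

X = np.random.default_rng(3).normal(size=(50, 8))
M = learn_metric(X, [(0, 1, 2), (3, 4, 5), (6, 7, 8)])
print(np.all(np.linalg.eigvalsh(M) >= -1e-9))
```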

Journal Article•DOI•
TL;DR: This paper analytically and empirically derive the conditions under which a multi-camera system can be modeled as a single spherical camera and shows that spherical approximation is applicable to a surprisingly larger extent than currently expected.

Journal Article•DOI•
TL;DR: This work proposes a method to find degenerate cases of the linear seventeen-point algorithm by decomposing a measurement matrix used in the algorithm into two matrices concerning ray directions and centers of projection.
Abstract: In estimating motions of multi-centered optical systems using the generalized camera model, one can use the linear seventeen-point algorithm for obtaining a generalized essential matrix, the counterpart of the eight-point algorithm for the essential matrix of a pair of cameras. Like the eight-point algorithm, the seventeen-point algorithm has degenerate cases. However, the mechanisms of the degeneracy of this algorithm have not been investigated. We propose a method to find degenerate cases of the algorithm by decomposing a measurement matrix used in the algorithm into two matrices concerning ray directions and centers of projection. This decomposition method allows us not only to prove the degeneracy of the previously known degenerate cases, but also to find a new degenerate configuration.
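
For context, a standard Pless-style way to write the generalized epipolar constraint that the seventeen-point algorithm linearizes is shown below; the notation is ours, not the paper's.

```latex
% q, q' : unit ray directions in the two generalized-camera frames
% c, c' : centers of projection of the corresponding rays
% R, t  : relative rotation and translation, with E = [t]_{\times} R
q'^{\top} E\, q \;+\; q'^{\top} R\,(q \times c) \;+\; (q' \times c')^{\top} R\, q \;=\; 0
```

Each ray correspondence contributes one such equation, linear in the entries of E and R, and its coefficients involve only ray directions and centers of projection, which is exactly the structure exposed by the measurement-matrix decomposition studied in this paper.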

Proceedings Article•DOI•
19 Jul 2010
TL;DR: A fully-automated detection method for cells imaged with phase contrast microscopy that does not depend on empirical parameters, ad hoc image processing, or explicit cell tracking; and can be straightforwardly adapted to different cell types is proposed.
Abstract: High-throughput automated analysis of cell population behaviors in vitro is of great importance to biological research. In particular, automated quantification of cellular mitosis in time-lapse microscopy video is useful for multiple applications such as tissue engineering, cancer research, and developmental biology. Accurate localization and counting of mitosis are challenging since cells undergo drastic morphological and appearance changes during mitosis. To tackle this challenge, we propose a fully-automated detection method for cells imaged with phase contrast microscopy. The method consists of three stages: image preconditioning, spatiotemporal volume extraction, and SVM-based mitosis event detection. First, the input images are transformed based on the physics of phase contrast image formation such that potential mitosis regions are assigned high values. Second, volumetric region growing is performed on the transformed images to extract candidate mitosis regions. Third, mitosis events are detected among the candidates using a Support Vector Machine (SVM) classifier. The proposed method does not depend on empirical parameters, ad hoc image processing, or explicit cell tracking; and can be straightforwardly adapted to different cell types. It was validated with 10 image sequences consisting of 8000 images, and achieved excellent performance with 90.6% average precision and 95.6% average recall.
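
The skeleton below mirrors the three-stage structure (precondition, extract candidate volumes, classify) with deliberately simplified stand-ins: plain normalization in place of the physics-based preconditioning, 3-D connected components in place of volumetric region growing, and random placeholder labels.

```python
# Schematic three-stage pipeline sketch; every stage is a simplified stand-in.
import numpy as np
from scipy import ndimage
from sklearn.svm import SVC

def precondition(frame):
    # Stand-in for the physics-based phase contrast preconditioning step.
    return (frame - frame.mean()) / (frame.std() + 1e-8)

def candidate_volumes(video, thresh=2.5):
    vol = np.stack([precondition(f) for f in video]) > thresh
    labeled, n = ndimage.label(vol)                  # 3-D connected components
    return [np.argwhere(labeled == i + 1) for i in range(n)]

def volume_features(voxels):
    return np.hstack([voxels.mean(0), np.ptp(voxels, axis=0), [len(voxels)]])

rng = np.random.default_rng(4)
video = rng.normal(size=(20, 64, 64))                # placeholder image sequence
candidates = candidate_volumes(video)
X = np.array([volume_features(v) for v in candidates])
y = rng.integers(0, 2, size=len(X))                  # placeholder mitosis labels
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:3]))
```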


Proceedings Article•DOI•
14 Apr 2010
TL;DR: A method for robustly detecting hematopoietic stem cells in phase contrast microscopy images by modeling the profile of each filter response as a quadratic surface and exploring the variations of peak curvatures and peak values of the filter responses when the ring radius varies is presented.
Abstract: We present a method for robustly detecting hematopoietic stem cells (HSCs) in phase contrast microscopy images. HSCs might seem easy to detect since they typically appear as round objects. However, when HSCs touch or overlap, their shape and appearance vary considerably, and standard pattern detection methods, such as the Hough transform and correlation, do not perform well. The proposed method exploits the output pattern of a ring filter bank, consisting of a series of matched filters with multiple-radius ring-shaped templates, applied to the input image. By modeling the profile of each filter response as a quadratic surface, we explore the variations of the peak curvatures and peak values of the filter responses as the ring radius varies. The method is validated on thousands of phase contrast microscopy images with different acquisition settings, achieving 96.5% precision and 94.4% recall.
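
A simplified sketch of the multi-radius ring matched-filter bank is shown below; the template construction and the quadratic-surface fit of the response peaks are reduced to their bare essentials and are not the paper's exact formulation.

```python
# Sketch: correlate an image with zero-mean ring templates of several radii and
# examine how the response peak behaves as the radius varies.
import numpy as np
from scipy.signal import fftconvolve

def ring_template(radius, width=1.5, size=None):
    size = size or int(2 * radius + 7)
    yy, xx = np.mgrid[:size, :size] - (size - 1) / 2.0
    r = np.hypot(xx, yy)
    ring = np.exp(-((r - radius) ** 2) / (2 * width ** 2))
    return ring - ring.mean()                 # zero-mean matched filter

image = np.random.default_rng(5).normal(size=(128, 128))
responses = {rad: fftconvolve(image, ring_template(rad), mode="same")
             for rad in (6, 8, 10, 12)}

for rad, resp in responses.items():
    print(f"radius {rad}: peak response {resp.max():.2f}")
```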

Proceedings Article•DOI•
03 May 2010
TL;DR: A new method for detecting object boundaries using planar laser scanners (LIDARs) and, optionally, co-registered imagery and features derived from the LIDAR and imagery are used to train a support vector machine (SVM) classifier to label pairs of range measurements as boundary or non-boundary.
Abstract: Detecting the boundaries of objects is a key step in separating foreground objects from the background, which is useful for robotics and computer vision applications, such as object detection, recognition, and tracking. We propose a new method for detecting object boundaries using planar laser scanners (LIDARs) and, optionally, co-registered imagery. We formulate boundary detection as a classification problem, in which we estimate whether a boundary exists in the gap between two consecutive range measurements. Features derived from the LIDAR and imagery are used to train a support vector machine (SVM) classifier to label pairs of range measurements as boundary or non-boundary. We compare this approach to an existing boundary detection algorithm that uses dynamically adjusted thresholds. Experiments show that the new method performs better even when only LIDAR features are used, and additional improvement occurs when image-based features are included, too. The new algorithm performs better on difficult boundary cases, such as obliquely viewed objects.
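
A minimal sketch of the classification setup follows, using a few gap features computed from consecutive range returns and an SVM; the feature set is an illustrative assumption and omits the image-based features discussed above.

```python
# Sketch: label each pair of consecutive LIDAR returns as boundary / non-boundary
# from simple gap features. Ranges, angles, and labels are random placeholders.
import numpy as np
from sklearn.svm import SVC

def pair_features(ranges, angles):
    """ranges, angles: 1-D arrays of a single scan -> one feature row per adjacent pair."""
    r1, r2 = ranges[:-1], ranges[1:]
    dtheta = np.diff(angles)
    gap = np.sqrt(r1**2 + r2**2 - 2 * r1 * r2 * np.cos(dtheta))  # Euclidean gap
    return np.column_stack([gap, np.abs(r1 - r2), np.minimum(r1, r2)])

rng = np.random.default_rng(6)
ranges = rng.uniform(1.0, 20.0, size=361)
angles = np.deg2rad(np.arange(361) * 0.5)
X = pair_features(ranges, angles)
y = rng.integers(0, 2, size=len(X))       # placeholder boundary labels
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict(X[:5]))
```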

Proceedings Article•DOI•
03 Dec 2010
TL;DR: It is shown analytically that joint torques can supply energy to the system composed of the robot arm and the object efficiently near singular configurations of the arm.
Abstract: This paper discusses the advantages of singular configurations of a two-link robot arm in tasks of pulling or lifting a heavy object. The optimal base location and arm motion for minimizing the joint torques are examined by numerical simulations, which show that the optimal base location is one where the robot arm is near a singular configuration at the start of the task. It is shown analytically that, near singular configurations of the arm, joint torques can efficiently supply energy to the system composed of the robot arm and the object. The energy supply rates at two singular configurations are derived from the equations of motion of the system.
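
For reference, the two standard manipulator relations behind this argument are shown below (our notation): the power supplied by the joint torques and the static torque-force map through the Jacobian.

```latex
% q, \dot{q} : joint angles and velocities of the two-link arm
% \tau       : joint torques;  F : force applied at the end effector
P \;=\; \tau^{\top}\dot{q}, \qquad \tau \;=\; J^{\top}(q)\,F
```

Near a singular configuration J(q) loses rank, so along the degenerate direction large end-effector forces correspond to small joint torques, which is consistent with the torque savings observed in the simulations.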

Journal Article•DOI•
TL;DR: In this paper, a method is presented for removing glare caused by water droplets, or other foreign objects, adhering to the protective glass of an imaging lens, targeting images obtained with a camera under adverse weather conditions.
Abstract: View-disturbing noise must be removed from images obtained with a camera under adverse weather conditions. In this paper, we present a method for removing glare caused by water droplets, or other foreign objects, adhering to the protective glass of an imaging lens. We have designed and implemented an electronically controlled optical shutter array that detects and removes glare. We also present the possibility of applying this technique to remove general glare caused by the imaging lens itself.

Proceedings Article•DOI•
03 Dec 2010
TL;DR: This paper evaluates the dynamic and kinematic properties of a prismatic mechanism and shows its capabilities in performing home manipulation tasks when integrated into a robotic arm and verifies that translational motion is more energy efficient with PRISM.
Abstract: This paper evaluates the dynamic and kinematic properties of a prismatic mechanism and shows its capabilities in performing home manipulation tasks when integrated into a robotic arm. Our design is motivated by the observation that human hand motions often follow a linear trajectory when manipulating everyday objects. We present the mechanical design for a light-weight, energy-efficient robot named PRISM that emphasizes translational motion. By simulating the dynamics equations and comparing the structure of commonly used anthropomorphic arms with our proposed arm, we verify that translational motion is more energy efficient with PRISM and that the robot can maneuver in narrower spaces. Through simulation experiments using state-of-the-art manipulation planning algorithms, we analyze the success rates of PRISM and an anthropomorphic robot arm in performing basic tasks. The simulation experiments center on pick-and-place tasks in cluttered kitchen scenes. We show a real-world prototype of PRISM and perform several manipulation experiments with it.

Proceedings Article•DOI•
03 Dec 2010
TL;DR: The responsiveness of the singularity-based mechanism (SBM) is clarified using dynamics analysis, and the SBM's characteristic of generating a large acceleration at the start of motion is similar to that of a muscle-driven human arm.
Abstract: We propose a singularity-based mechanism (SBM) that exploits the singular configuration to improve angular acceleration rather than treating it as a constraint on movement. The tradeoff between responsiveness and range of motion is controlled by varying the length of a linkage in the SBM. In this paper, we clarify the responsiveness of the SBM using dynamics analysis. For demonstration, we build an experimental SBM system with high responsiveness, a practical range of motion, and a size comparable to a human arm. In the experiment, the effectiveness of the SBM is shown in a vertical lifting task. The SBM's characteristic of generating a large acceleration at the start of motion is similar to that of a muscle-driven human arm. The similarity between the SBM and the human arm is analyzed in terms of static torque.