Showing papers by "Takeo Kanade" published in 2014


Book ChapterDOI
06 Sep 2014
TL;DR: An ultra-low latency reactive visual system that can sense, react, and adapt quickly to any environment while moving at highway speeds is introduced and can be programmed to perform a variety of tasks.
Abstract: The primary goal of an automotive headlight is to improve safety in low light and poor weather conditions. But, despite decades of innovation on light sources, more than half of accidents occur at night even with less traffic on the road. Recent developments in adaptive lighting have addressed some limitations of standard headlights; however, they have limited flexibility - switching between high and low beams, turning off beams toward the opposing lane, or rotating the beam as the vehicle turns - and are not designed for all driving environments. This paper introduces an ultra-low latency reactive visual system that can sense, react, and adapt quickly to any environment while moving at highway speeds. Our single hardware design can be programmed to perform a variety of tasks. Anti-glare high beams, improved driver visibility during snowstorms, increased contrast of lanes, markings, and sidewalks, and early visual warning of obstacles are demonstrated.

42 citations


Book ChapterDOI
06 Sep 2014
TL;DR: This work proposes a Kernel Structured Sparsity method that can handle both the temporal alignment problem and the structured sparse reconstruction within a common framework, and it can rely on simple features.
Abstract: In many behavioral domains, such as facial expression and gesture, sparse structure is prevalent. This sparsity would be well suited for event detection but for one problem. Features typically are confounded by alignment error in space and time. As a consequence, high-dimensional representations such as SIFT and Gabor features have been favored despite their much greater computational cost and potential loss of information. We propose a Kernel Structured Sparsity (KSS) method that can handle both the temporal alignment problem and the structured sparse reconstruction within a common framework, and it can rely on simple features. We characterize spatio-temporal events as time-series of motion patterns and by utilizing time-series kernels we apply standard structured-sparse coding techniques to tackle this important problem. We evaluated the KSS method using both gesture and facial expression datasets that include spontaneous behavior and differ in degree of difficulty and type of ground truth coding. KSS outperformed both sparse and non-sparse methods that utilize complex image features and their temporal extensions. In the case of early facial event classification KSS had 10% higher accuracy as measured by F1 score over kernel SVM methods.

36 citations


Proceedings ArticleDOI
01 Apr 2014
TL;DR: Experimental results performed on three types of cell populations validate that the interactive cell segmentation method proposed quickly reaches high quality results with minimal human interventions, and thus is significantly more efficient than alternative methods.
Abstract: Automatic cell segmentation can hardly be flawless due to the complexity of the image data, particularly when time-lapse experiments last for a long time without biomarkers. To address this issue, we propose an interactive cell segmentation method that actively selects uncertain regions and requests human validation on them. Once erroneous segmentation is detected and subsequently corrected, the information is propagated over affinity graphs in order to fix analogous errors. We present a systematic method for correction propagation based on active and semi-supervised learning. Experimental results performed on three types of cell populations validate that our interactive cell segmentation quickly reaches high quality results with minimal human interventions, and thus is significantly more efficient than alternative methods.

9 citations
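
The idea of propagating a human correction over an affinity graph can be illustrated with a generic label-propagation sketch. This is not the paper's algorithm: it is a standard semi-supervised scheme chosen for illustration, and the function name `propagate_corrections`, the affinity matrix layout, and the 0/1 label encoding are all assumptions.

```python
def propagate_corrections(affinity, labels, n_iter=50):
    """Spread validated labels over a graph by iterated weighted averaging.

    affinity: square list of lists of non-negative edge weights (assumed layout).
    labels:   1.0 or 0.0 for human-validated nodes, None for unvalidated ones.
    Returns one score in [0, 1] per node.
    """
    n = len(labels)
    # Unvalidated nodes start at an uninformative 0.5.
    scores = [0.5 if y is None else float(y) for y in labels]
    for _ in range(n_iter):
        updated = list(scores)
        for i in range(n):
            if labels[i] is not None:
                continue  # validated nodes stay clamped to the human answer
            total = sum(affinity[i][j] for j in range(n) if j != i)
            if total == 0:
                continue  # isolated node: nothing to propagate
            weighted = sum(affinity[i][j] * scores[j] for j in range(n) if j != i)
            updated[i] = weighted / total
        scores = updated
    return scores


# Hypothetical chain of four regions: 0 - 1 - 2 - 3.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = propagate_corrections(chain, [1.0, None, None, 0.0])
```

In this toy chain, the two unvalidated regions end up with scores that fall off with distance from the region validated as 1.0, which is the sense in which one correction can influence analogous regions.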


Proceedings ArticleDOI
07 Mar 2014
TL;DR: This talk will describe a new DMD-based design for a headlight that can be programmed to perform several tasks simultaneously and that can sense, react and adapt quickly to any environment with the goal of increasing safety for all drivers on the road.
Abstract: The primary goal of a vehicular headlight is to improve safety in low-light and poor weather conditions. The typical headlight however has very limited flexibility - switching between high and low beams, turning off beams toward the opposing lane or rotating the beam as the vehicle turns - and is not designed for all driving environments. Thus, despite decades of innovation in light source technology, more than half of the vehicular accidents still happen at night even with much less traffic on the road. We will describe a new DMD-based design for a headlight that can be programmed to perform several tasks simultaneously and that can sense, react and adapt quickly to any environment with the goal of increasing safety for all drivers on the road. For example, we will be able to drive with high-beams without glaring any other driver and we will be able to see better during rain and snowstorms when the road is most treacherous to drive. The headlight can also increase contrast of lanes, markings and sidewalks and can alert drivers to sudden obstacles. In this talk, we will lay out the engineering challenges in building this headlight and share our experiences with the prototypes developed over the past two years.

5 citations


Proceedings ArticleDOI
29 Sep 2014
TL;DR: The proposed representation creates a viewpoint-invariant and scale-normalized model approximately describing an unknown object with multimodal sensors that facilitates 3D tracking of the object using 2D-to-2D image matching.
Abstract: Object representation is useful for many computer vision tasks, such as object detection, recognition, and tracking. Computer vision tasks must handle situations where unknown objects appear and must detect and track some object which is not in the trained database. In such cases, the system must learn, or otherwise derive, descriptions of new objects. In this paper, we investigate creating a representation of previously unknown objects that newly appear in the scene. The representation creates a viewpoint-invariant and scale-normalized model approximately describing an unknown object with multimodal sensors. Those properties of the representation facilitate 3D tracking of the object using 2D-to-2D image matching. The representation has the benefits of both an implicit model (referred to as a view-based model) and an explicit model (referred to as a shape-based model). Experimental results demonstrate the viability of the proposed representation and show that it outperforms the existing approaches for 3D-pose estimation.

1 citation


Proceedings ArticleDOI
27 Oct 2014
TL;DR: This talk will present the idea, approach, and current status of the Smart Headlight Project, a combination of computer vision and projector-based illumination that opens the possibility for a new type of augmented reality.
Abstract: Summary form only given. A combination of computer vision and projector-based illumination opens the possibility for a new type of augmented reality: selectively illuminating the scene to improve or manipulate how the reality itself, rather than its display, appears to a human. One such example is the Smart Headlight being developed at Carnegie Mellon University's Robotics Institute. The project team has been working on a new set of capabilities for the headlight, such as making rain drops and snowflakes disappear, allowing for the high beams to always be on without glare, and enhancing the appearance of objects of interest. This talk will present the idea, approach, and current status of the Smart Headlight Project.

1 citation


Proceedings ArticleDOI
24 Mar 2014
TL;DR: Marvin is a system for searching physical objects using a mobile or wearable device; it integrates HOG-based object recognition, SURF-based localization information, automatic speech recognition, and user feedback information with a probabilistic model to recognize the “object of interest” at high accuracy and at interactive speeds.
Abstract: We present Marvin, a system that can search physical objects using a mobile or wearable device. It integrates HOG-based object recognition, SURF-based localization information, automatic speech recognition, and user feedback information with a probabilistic model to recognize the “object of interest” at high accuracy and at interactive speeds. Once the object of interest is recognized, the information that the user is querying, e.g. reviews, options, etc., is displayed on the user's mobile or wearable device. We tested this prototype in a real-world retail store during business hours, with varied degree of background noise and clutter. We show that this multi-modal approach achieves superior recognition accuracy compared to using a vision system alone, especially in cluttered scenes where a vision system would be unable to distinguish which object is of interest to the user without additional input. It is computationally able to scale to large numbers of objects by focusing compute-intensive resources on the objects most likely to be of interest, inferred from user speech and implicit localization information. We present the system architecture, the probabilistic model that integrates the multi-modal information, and empirical results showing the benefits of multi-modal integration.
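
How several modalities can sharpen a decision that vision alone leaves ambiguous can be sketched with a simple Bayesian fusion. The abstract does not specify Marvin's probabilistic model, so this is only an illustration under a conditional-independence assumption; `fuse_modalities`, the object names, and all numbers are hypothetical.

```python
def fuse_modalities(prior, *likelihoods):
    """Combine per-object likelihoods from several modalities into a posterior.

    prior:       dict mapping object name -> prior probability.
    likelihoods: one dict per modality, mapping object name -> likelihood of
                 that modality's observation given the object.
    Assumes the modalities are conditionally independent given the object.
    """
    unnormalized = dict(prior)
    for lik in likelihoods:
        for obj in unnormalized:
            # Objects a modality says nothing about get likelihood 0.
            unnormalized[obj] *= lik.get(obj, 0.0)
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no object is consistent with all modalities")
    return {obj: p / total for obj, p in unnormalized.items()}


# Hypothetical cluttered shelf: vision cannot tell which item is of interest,
# but the spoken query favors one of them.
prior = {"shampoo": 0.5, "soap": 0.5}
vision = {"shampoo": 0.5, "soap": 0.5}
speech = {"shampoo": 0.9, "soap": 0.1}
posterior = fuse_modalities(prior, vision, speech)
```

Here the vision likelihoods are uninformative, so the posterior is driven by the speech evidence, mirroring the cluttered-scene case the abstract describes.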

01 Jan 2014
TL;DR: This document provides implementation details of the Kernel Structured Sparsity (KSS) method for FISTA (fast iterative shrinkage-thresholding algorithm) and shows scaling results on the 6D Motion Gesture Database.
Abstract: This document accompanies the paper: "Spatio-temporal Event Classification using Time-series Kernel based Structured Sparsity". We provide implementation details of the Kernel Structured Sparsity (KSS) method for FISTA (fast iterative shrinkage-thresholding algorithm) (2). We also show scaling results on the 6D Motion Gesture Database (3). Note that information given in this document is not necessary to understand the content of the main paper.
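
For readers unfamiliar with FISTA, a minimal sketch of the generic algorithm is given below for a plain l1-regularized least-squares problem. It is not the KSS implementation, which the abstract only references: the kernel and structured-sparsity parts are omitted, and the function names, problem form, and step-size convention are assumptions made for illustration.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def fista_lasso(A, b, lam, step, n_iter=200):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with FISTA.

    A is a list of rows, b a list. `step` should be at most 1 / L, where L is
    the largest eigenvalue of A^T A (caller's responsibility in this sketch).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    y = list(x)  # extrapolated point
    t = 1.0      # momentum parameter
    for _ in range(n_iter):
        # Gradient of the smooth term at y: A^T (A y - b).
        residual = [sum(A[i][j] * y[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Shrinkage-thresholding (proximal) step.
        x_new = soft_threshold([y[j] - step * grad[j] for j in range(n)], step * lam)
        # Momentum update that distinguishes FISTA from plain ISTA.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [x_new[j] + ((t - 1.0) / t_new) * (x_new[j] - x[j]) for j in range(n)]
        x, t = x_new, t_new
    return x


# Toy problem with A = identity, where the minimizer is soft_threshold(b, lam).
x_hat = fista_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0, step=1.0)
```

A structured-sparse variant would typically replace `soft_threshold` with a group-wise proximal step; how KSS does this is described in the document itself, not here.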