Author

Kenji Oka

Other affiliations: Panasonic
Bio: Kenji Oka is an academic researcher at the University of Tokyo. He has contributed to research on topics including gesture recognition and augmented desk interfaces. He has an h-index of 11 and has co-authored 20 publications receiving 894 citations. His previous affiliations include Panasonic.

Papers
Journal ArticleDOI
TL;DR: A method is introduced for discerning fingertip locations in image frames and measuring fingertip trajectories across image frames, and a mechanism is proposed for combining direct manipulation and symbolic gestures based on multiple fingertip motions.
Abstract: Augmented desk interfaces and other virtual reality systems depend on accurate, real-time hand and fingertip tracking for seamless integration between real objects and associated digital information. We introduce a method for discerning fingertip locations in image frames and measuring fingertip trajectories across image frames. We also propose a mechanism for combining direct manipulation and symbolic gestures based on multiple fingertip motions. Our method uses a filtering technique, in addition to detecting fingertips in each image frame, to predict fingertip locations in successive image frames and to examine the correspondences between the predicted locations and detected fingertips. This lets us obtain multiple complex fingertip trajectories in real time and improves fingertip tracking. This method can track multiple fingertips reliably even on a complex background under changing lighting conditions without invasive devices or color markers.
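To make the prediction-and-correspondence idea concrete, here is a minimal sketch, not the authors' implementation: each tracked fingertip carries a constant-velocity Kalman filter, its predicted position is compared against the fingertips detected in the next frame, and detections are assigned greedily within a distance gate. The class and function names, the noise parameters, and the greedy matcher are illustrative assumptions.

```python
# Sketch of prediction-based fingertip correspondence (illustrative, not the authors' code).
import numpy as np

class FingertipTrack:
    def __init__(self, xy, dt=1.0, q=1.0, r=4.0):
        # State: [x, y, vx, vy]; simple constant-velocity motion model.
        self.x = np.array([xy[0], xy[1], 0.0, 0.0])
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * q        # process noise (assumed value)
        self.R = np.eye(2) * r        # measurement noise (assumed value)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]             # predicted (x, y) in the next frame

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

def match_tracks(tracks, detections, gate=30.0):
    """Greedy nearest-neighbour assignment of detected fingertips to predicted positions."""
    preds = [t.predict() for t in tracks]
    unused = list(range(len(detections)))
    for ti, p in enumerate(preds):
        if not unused:
            break
        d = min(unused, key=lambda j: np.linalg.norm(np.asarray(detections[j]) - p))
        if np.linalg.norm(np.asarray(detections[d]) - p) < gate:
            tracks[ti].update(detections[d])
            unused.remove(d)
    # Remaining detections could start new tracks; omitted for brevity.
```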

308 citations

Journal Article
TL;DR: A fast and robust method is presented for tracking a user's hand and multiple fingertips, together with gesture recognition based on the measured fingertip trajectories, for augmented desk interface systems; the approach is particularly advantageous for human-computer interaction (HCI).

170 citations

Proceedings ArticleDOI
20 May 2002
TL;DR: In this paper, each fingertip is located in each input infrared image frame based on its geometrical features, and correspondences of detected fingertips between successive frames are determined using a prediction technique; the resulting system is particularly advantageous for human-computer interaction.
Abstract: We propose a fast and robust method for tracking a user's hand and multiple fingertips; we then demonstrate gesture recognition based on measured fingertip trajectories for augmented desk interface systems. Our tracking method can track multiple fingertips reliably even against a complex background under dynamically changing lighting conditions, without any markers. First, each fingertip is located in each input infrared image frame based on its geometrical features. Then, correspondences of detected fingertips between successive image frames are determined based on a prediction technique. Our gesture recognition system is particularly advantageous for human-computer interaction (HCI) in that users can perform interactions based on symbolic gestures while simultaneously carrying out direct manipulation with their own hands and fingers. The effectiveness of the proposed method has been demonstrated through a number of experiments.
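As a rough illustration of detecting fingertips from geometrical features in a binarized infrared frame, the sketch below scores each window by how well a filled disk of fingertip radius fits while the window's corners stay mostly empty. The template radius, thresholds, and scoring rule are assumptions for illustration, not the paper's exact detector.

```python
# Sketch: fingertip candidates from a disk-shaped template over a binarized IR hand mask.
import numpy as np

def circular_template(radius):
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    disk = (x * x + y * y) <= radius * radius
    corners = ~disk                       # square corners outside the circle
    return disk.astype(float), corners.astype(float)

def fingertip_candidates(hand_mask, radius=6, top_k=5, min_score=0.6):
    """hand_mask: 2-D {0,1} array from thresholding the infrared image.
    A fingertip scores high when the disk is filled by hand pixels while the
    window corners are mostly background; palm pixels fill both and score low."""
    disk, corners = circular_template(radius)
    h, w = hand_mask.shape
    size = 2 * radius + 1
    scores = np.full((h - size + 1, w - size + 1), -np.inf)
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            patch = hand_mask[i:i + size, j:j + size]
            scores[i, j] = (patch * disk).sum() / disk.sum() \
                         - (patch * corners).sum() / corners.sum()
    picked = []
    for idx in np.argsort(scores, axis=None)[::-1]:        # best score first
        cy, cx = np.unravel_index(idx, scores.shape)
        if scores[cy, cx] < min_score:
            break
        cy_c, cx_c = cy + radius, cx + radius              # window centre in image coords
        if all((cy_c - py) ** 2 + (cx_c - px) ** 2 > (2 * radius) ** 2 for py, px in picked):
            picked.append((cy_c, cx_c))
        if len(picked) == top_k:
            break
    return picked
```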

167 citations

Proceedings ArticleDOI
02 Apr 2005
TL;DR: Head tracking is used to switch the mouse pointer between monitors while the mouse moves the pointer within each monitor; in the experiment, users required significantly less mouse movement with the tracking system and preferred using it, although task time actually increased.
Abstract: The use of multiple LCD monitors is becoming popular as prices are reduced, but this creates problems for window management and switching between applications. For a single monitor, eye tracking can be combined with the mouse to reduce the amount of mouse movement, but with several monitors the head is moved through a large range of positions and angles which makes eye tracking difficult. We thus use head tracking to switch the mouse pointer between monitors and use the mouse to move within each monitor. In our experiment users required significantly less mouse movement with the tracking system, and preferred using it, although task time actually increased. A graphical prompt (flashing star) prevented the user losing the pointer when switching monitors. We present discussions on our results and ideas for further developments.
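A hedged sketch of the interaction logic described above: head yaw selects the monitor, the mouse moves within it, and a graphical prompt marks the pointer after a switch. The yaw thresholds, monitor layout, and the warp_pointer / flash_prompt callbacks are placeholders for platform-specific APIs, not part of the paper.

```python
# Sketch of head-tracking-based pointer switching between monitors (illustrative only).
from dataclasses import dataclass

@dataclass
class Monitor:
    name: str
    yaw_min: float   # degrees of head yaw that "face" this monitor (assumed ranges)
    yaw_max: float
    center: tuple    # pixel position to warp the pointer to on a switch

MONITORS = [
    Monitor("left",   -60.0, -15.0, (960, 600)),
    Monitor("center", -15.0,  15.0, (2880, 600)),
    Monitor("right",   15.0,  60.0, (4800, 600)),
]

def monitor_for_yaw(yaw_deg):
    for m in MONITORS:
        if m.yaw_min <= yaw_deg < m.yaw_max:
            return m
    return None

def on_head_pose(yaw_deg, state, warp_pointer, flash_prompt):
    """Call once per tracker update. `warp_pointer` and `flash_prompt` are
    platform-specific callbacks supplied by the application (assumed here)."""
    target = monitor_for_yaw(yaw_deg)
    if target is not None and target is not state.get("current"):
        warp_pointer(*target.center)      # jump the pointer to the newly faced monitor
        flash_prompt(target.center)       # graphical prompt so the pointer isn't lost
        state["current"] = target
```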

86 citations

Proceedings Article
01 Dec 2005
TL;DR: A new tracking system based on a stochastic filtering framework is proposed for reliably estimating the 3D pose of a user's head in real time; the estimation method is designed to adaptively control the diffusion factor of a motion model.
Abstract: In this paper, we propose a new tracking system based on a stochastic filtering framework for reliably estimating the 3D pose of a user's head in real time. Our system estimates the head pose in each image frame using a 3D model of the head that is obtained automatically at an initialization step. In particular, our estimation method is designed to adaptively control the diffusion factor of the motion model. This technique contributes significantly to improving two aspects of performance simultaneously: robust tracking against abrupt head motion, and accurate pose estimation when the user is staring at a point in the scene. The performance of the proposed method has been demonstrated through experiments.
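The adaptive-diffusion idea can be sketched as a particle filter whose process noise scales with the most recent estimated motion, so the filter stays tight while the user fixates and widens its search during abrupt head movement. The state representation, noise parameters, and observation likelihood below are illustrative assumptions, not the paper's formulation.

```python
# Sketch of one step of a particle filter with motion-adaptive diffusion (illustrative).
import numpy as np

def adaptive_particle_filter_step(particles, weights, observe_likelihood,
                                  prev_estimate, curr_estimate,
                                  base_sigma=0.5, gain=2.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    # 1. Resample particles according to the current weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    # 2. Diffuse: the noise scale adapts to the magnitude of the last estimated motion.
    motion = np.linalg.norm(curr_estimate - prev_estimate)
    sigma = base_sigma + gain * motion
    particles = particles + rng.normal(0.0, sigma, size=particles.shape)
    # 3. Re-weight with an observation likelihood (image matching in the paper; assumed here).
    weights = np.array([observe_likelihood(p) for p in particles])
    weights = weights / weights.sum()
    # 4. The new pose estimate is the weighted mean of the particles.
    estimate = (weights[:, None] * particles).sum(axis=0)
    return particles, weights, estimate
```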

49 citations


Cited by
Journal ArticleDOI
TL;DR: This paper discusses the inherent difficulties in head pose estimation and presents an organized survey describing the evolution of the field, comparing systems by focusing on their ability to estimate coarse and fine head pose and highlighting approaches well suited for unconstrained environments.
Abstract: The capacity to estimate the head pose of another person is a common human ability that presents a unique challenge for computer vision systems. Compared to face detection and recognition, which have been the primary foci of face-related vision research, identity-invariant head pose estimation has fewer rigorously evaluated systems or generic solutions. In this paper, we discuss the inherent difficulties in head pose estimation and present an organized survey describing the evolution of the field. Our discussion focuses on the advantages and disadvantages of each approach and spans 90 of the most innovative and characteristic papers that have been published on this topic. We compare these systems by focusing on their ability to estimate coarse and fine head pose, highlighting approaches that are well suited for unconstrained environments.

1,402 citations

Journal ArticleDOI
TL;DR: A literature review of the second research direction, which aims to capture the real 3D motion of the hand, a very challenging problem in the context of HCI.

901 citations

Patent
13 May 2005
TL;DR: Sign-understanding technology can be used for remote control of home devices, mouse-less operation of computer consoles, gaming, and man-robot communication for giving instructions, among other applications.
Abstract: Communication is an important issue in man-to-robot interaction. Signs can be used to interact with machines by providing user instructions or commands. Embodiments of the present invention include human detection, human body part detection, hand shape analysis, trajectory analysis, orientation determination, gesture matching, and the like. Many types of shapes and gestures are recognized in a non-intrusive manner based on computer vision. This sign-understanding technology makes a number of applications feasible, including remote control of home devices, mouse-less (and touch-less) operation of computer consoles, gaming, and man-robot communication for giving instructions, among others. Active sensing hardware is used to capture a stream of depth images at video rate, which is then analyzed for information extraction.
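One concrete piece of such a pipeline is matching a measured hand trajectory against stored gesture templates. The sketch below uses dynamic time warping (DTW) for that step; DTW is an illustrative choice, as the abstract does not specify the matching algorithm.

```python
# Sketch: trajectory-based gesture matching with dynamic time warping (illustrative).
import numpy as np

def dtw_distance(a, b):
    """a, b: (N, D) and (M, D) arrays of trajectory points (e.g. 2-D or 3-D positions)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)          # length-normalised alignment cost

def match_gesture(trajectory, templates):
    """templates: dict mapping gesture name -> (M, D) array. Returns the best-matching name."""
    return min(templates, key=lambda name: dtw_distance(trajectory, templates[name]))
```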

587 citations

Proceedings ArticleDOI
27 Jun 2004
TL;DR: A method is presented for robust tracking in highly cluttered environments that makes effective use of 3D depth sensing technology, resulting in illumination-invariant tracking.
Abstract: A method is presented for robust tracking in highly cluttered environments. The method makes effective use of 3D depth sensing technology, resulting in illumination-invariant tracking. A few applications of the tracking method are presented, including face tracking and hand tracking.
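A minimal sketch of why depth data yields illumination-invariant tracking: the target is segmented by distance alone, so lighting changes and background clutter in the color image do not affect the mask. The depth band and the simple track-by-centroid scheme below are assumptions for illustration, not the paper's method.

```python
# Sketch: illumination-invariant tracking by segmenting a depth band around the target.
import numpy as np

def track_step(depth_m, prev_depth, band=0.15):
    """depth_m: 2-D array of depth in metres (0 = no reading).
    Keeps pixels within +/- band metres of the previously tracked depth,
    then reports the blob centroid and its updated mean depth."""
    mask = (depth_m > 0) & (np.abs(depth_m - prev_depth) < band)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None                            # target left the working volume
    new_depth = float(depth_m[ys, xs].mean())  # follow the object in depth
    centroid = (float(xs.mean()), float(ys.mean()))
    return centroid, new_depth
```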

507 citations

Proceedings ArticleDOI
15 Jun 2000
TL;DR: A multi-PC/camera system that can perform 3D reconstruction and ellipsoid fitting of moving humans in real time and using a simple and user-friendly interface, the user can display and observe, in realTime and from any view-point, the 3D models of the moving human body.
Abstract: We present a multi-PC/camera system that can perform 3D reconstruction and ellipsoid fitting of moving humans in real time. The system consists of five cameras. Each camera is connected to a PC, which locally extracts the silhouette of the moving person in the image captured by that camera. The five silhouette images are then sent, via a local network, to a host computer that performs 3D voxel-based reconstruction using an algorithm called SPOT. Ellipsoids are then fitted to the reconstructed data. Through a simple and user-friendly interface, the user can display and observe, in real time and from any viewpoint, the 3D models of the moving human body. Running at more than 15 frames per second, the system can capture a sequence of human motions non-intrusively.
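The silhouette-based reconstruction step can be sketched as voxel carving: a voxel is kept only if it projects inside the foreground silhouette of every camera. The projection-matrix interface and voxel grid below are assumptions; the paper's SPOT algorithm and the ellipsoid-fitting stage are not reproduced here.

```python
# Sketch of silhouette-based voxel carving across multiple calibrated cameras (illustrative).
import numpy as np

def carve_voxels(voxels_xyz, cameras):
    """
    voxels_xyz: (N, 3) array of voxel centres in world coordinates.
    cameras: list of (P, silhouette) pairs, where P is a 3x4 projection matrix and
             silhouette is a 2-D boolean mask from one camera's foreground extraction.
    Returns a boolean occupancy array of length N.
    """
    occupied = np.ones(len(voxels_xyz), dtype=bool)
    homog = np.hstack([voxels_xyz, np.ones((len(voxels_xyz), 1))])   # (N, 4) homogeneous
    for P, silhouette in cameras:
        uvw = homog @ P.T                        # (N, 3) homogeneous image coordinates
        u = (uvw[:, 0] / uvw[:, 2]).round().astype(int)
        v = (uvw[:, 1] / uvw[:, 2]).round().astype(int)
        h, w = silhouette.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(voxels_xyz), dtype=bool)
        hit[inside] = silhouette[v[inside], u[inside]]
        occupied &= hit                          # must be foreground in every view
    return occupied
```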

447 citations