
Showing papers by "Matthew Turk published in 2012"


Proceedings ArticleDOI
21 Sep 2012
TL;DR: Describes a framework and prototype implementation for unobtrusive mobile remote collaboration on tasks that involve the physical environment; the system uses the Augmented Reality paradigm and model-free, markerless visual tracking to facilitate decoupled, live-updated views of the environment and world-stabilized annotations while supporting a moving camera and unknown, unprepared environments.
Abstract: We describe a framework and prototype implementation for unobtrusive mobile remote collaboration on tasks that involve the physical environment. Our system uses the Augmented Reality paradigm and model-free, markerless visual tracking to facilitate decoupled, live updated views of the environment and world-stabilized annotations while supporting a moving camera and unknown, unprepared environments. In order to evaluate our concept and prototype, we conducted a user study with 48 participants in which a remote expert instructed a local user to operate a mock-up airplane cockpit. Users performed significantly better with our prototype (40.8 tasks completed on average) as well as with static annotations (37.3) than without annotations (28.9). 79% of the users preferred our prototype despite noticeably imperfect tracking.

139 citations
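The core mechanism behind world-stabilized annotations is re-projecting a 3D anchor point with every new camera pose delivered by the markerless tracker, so the annotation stays attached to the scene as the camera moves. A minimal sketch, assuming a pinhole camera with illustrative intrinsics and a pose (R, t) from the tracker; this is not the authors' code, and the anchor is assumed to have been obtained once by back-projecting the remote expert's click onto the tracked scene:

```python
import numpy as np
import cv2

K = np.array([[800.0,   0.0, 320.0],     # assumed pinhole intrinsics (illustrative)
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(point_world, R, t):
    """Project a 3D world point into the image given camera rotation R and translation t."""
    p_cam = R @ point_world + t           # world -> camera coordinates
    if p_cam[2] <= 0:                     # behind the camera: not visible
        return None
    uv = K @ (p_cam / p_cam[2])           # perspective division, then intrinsics
    return uv[:2]

def render_annotation(frame, anchor_world, R_cur, t_cur):
    """Re-draw a world-anchored annotation at its projected location in the current frame."""
    uv = project(anchor_world, R_cur, t_cur)
    if uv is not None:
        cv2.circle(frame, (int(uv[0]), int(uv[1])), 8, (0, 255, 0), 2)
    return frame
```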


Proceedings ArticleDOI
05 Nov 2012
TL;DR: A user study evaluating the benefits of geometrically correct user-perspective rendering using an Augmented Reality (AR) magic lens finds that a tablet-sized display allows for significantly faster performance of a selection task and that a user-perspective lens has benefits over a device-perspective lens for such a task.
Abstract: In this paper we present a user study evaluating the benefits of geometrically correct user-perspective rendering using an Augmented Reality (AR) magic lens. In simulation we compared a user-perspective magic lens against the common device-perspective magic lens on both phone-sized and tablet-sized displays. Our results indicate that a tablet-sized display allows for significantly faster performance of a selection task and that a user-perspective lens has benefits over a device-perspective lens for a selection task. Based on these promising results, we created a proof-of-concept prototype, engineered with current off-the-shelf devices and software. To our knowledge, this is the first geometrically correct user-perspective magic lens.

71 citations
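A geometrically correct user-perspective lens requires an off-axis projection whose apex sits at the tracked eye position and whose frustum passes exactly through the device screen. The sketch below follows the standard generalized perspective projection construction (Kooima-style); it is one common way to realize the idea, not necessarily the authors' implementation. `pa`, `pb`, `pc` are the screen's lower-left, lower-right, and upper-left corners in world coordinates:

```python
import numpy as np

def off_axis_projection(pa, pb, pc, eye, near=0.01, far=100.0):
    """Off-axis (user-perspective) projection matrix for a screen given the eye position."""
    vr = pb - pa; vr /= np.linalg.norm(vr)          # screen right axis
    vu = pc - pa; vu /= np.linalg.norm(vu)          # screen up axis
    vn = np.cross(vr, vu); vn /= np.linalg.norm(vn) # screen normal (toward the eye)

    va, vb, vc = pa - eye, pb - eye, pc - eye       # eye -> corner vectors
    d = -np.dot(va, vn)                             # eye-to-screen distance
    l = np.dot(vr, va) * near / d                   # frustum extents on the near plane
    r = np.dot(vr, vb) * near / d
    b = np.dot(vu, va) * near / d
    t = np.dot(vu, vc) * near / d

    # Standard OpenGL-style frustum matrix for those extents.
    return np.array([
        [2 * near / (r - l), 0.0,                (r + l) / (r - l),       0.0],
        [0.0,                2 * near / (t - b), (t + b) / (t - b),       0.0],
        [0.0,                0.0,               -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0.0,                0.0,               -1.0,                     0.0]])
```

The full lens would additionally rotate the view into the screen's coordinate frame and translate by the eye position before rendering; the eye position itself would come from head/face tracking with the user-facing camera.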


Proceedings ArticleDOI
05 Nov 2012
TL;DR: This work presents an approach to real-time tracking and mapping that supports any type of camera motion in 3D environments, that is, general as well as rotation-only camera movements, and effectively generalizes both a panorama mapping and tracking system and a keyframe-based Simultaneous Localization and Mapping system.
Abstract: We present an approach to real-time tracking and mapping that supports any type of camera motion in 3D environments, that is, general (parallax-inducing) as well as rotation-only (degenerate) motions. Our approach effectively generalizes both a panorama mapping and tracking system and a keyframe-based Simultaneous Localization and Mapping (SLAM) system, behaving like one or the other depending on the camera movement. It seamlessly switches between the two and is thus able to track and map through arbitrary sequences of general and rotation-only camera movements.

53 citations
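One practical way to realize the switching behavior described above (my assumption, not the authors' algorithm) is per-frame-pair model selection: fit both a homography (rotation-only / panorama mapping) and an essential matrix (parallax-inducing / SLAM mapping) to the matched keypoints and keep whichever model explains the data, falling back to the homography when there is too little parallax to triangulate:

```python
import cv2
import numpy as np

def select_motion_model(pts_prev, pts_cur, K, thresh=2.0):
    """pts_prev, pts_cur: Nx2 arrays of matched keypoints; K: 3x3 camera intrinsics."""
    H, h_mask = cv2.findHomography(pts_prev, pts_cur, cv2.RANSAC, thresh)
    E, e_mask = cv2.findEssentialMat(pts_prev, pts_cur, K,
                                     method=cv2.RANSAC, threshold=thresh)
    h_score = int(h_mask.sum()) if h_mask is not None else 0
    e_score = int(e_mask.sum()) if e_mask is not None else 0

    # If the homography explains (almost) as many matches as the essential matrix,
    # the motion is effectively rotation-only: extend the panorama map instead of
    # triangulating new 3D points.
    if h_score >= 0.9 * e_score:
        return "rotation_only", H
    return "general", E
```

In the "general" case the essential matrix would be decomposed into a relative pose and new landmarks triangulated into the 3D map; in the "rotation_only" case the frame would instead be registered against the panorama map.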


Journal ArticleDOI
TL;DR: The ubiquity of high-quality mobile cameras with ever-increasing computational capacity, packaged along with other rich sensors, is providing many new opportunities for applications of computer vision technologies, especially those that support human activities such as location-based and user-centered services.
Abstract: Years ago, it was difficult to imagine that one day the digital camera would become a standard component of mobile phones. Nowadays, smart phones are equipped not only with camera sensors but also with various other sensors such as accelerometers, gyroscopes, and GPS receivers. The ubiquity of high-quality mobile cameras with ever-increasing computational capacity, packaged along with other rich sensors, is providing many new opportunities for applications of computer vision technologies, especially those that support human activities such as location-based and user-centered services.

23 citations


Journal ArticleDOI
01 Aug 2012
TL;DR: The recognition results for the real-world data set show that the proposed detector gives similar performance to the method using manually located eye coordinates, showing that the accuracy of the proposed eye detector is comparable with that of the ground-truth data.
Abstract: We propose a new biased discriminant analysis (BDA) using composite vectors for eye detection. A composite vector consists of several pixels inside a window on an image. The covariance of composite vectors is obtained from their inner product and can be considered as a generalization of the covariance of pixels. The proposed composite BDA (C-BDA) method is a BDA using the covariance of composite vectors. We construct a hybrid cascade detector for eye detection, using Haar-like features in the earlier stages and composite features obtained from C-BDA in the later stages. The proposed detector runs in real time; its execution time is 5.5 ms on a typical PC. The experimental results for the CMU PIE database and our own real-world data set show that the proposed detector provides robust performance to several kinds of variations such as facial pose, illumination, eyeglasses, and partial occlusion. On the whole, the detection rate per pair of eyes is 98.0% for the 3604 face images of the CMU PIE database and 95.1% for the 2331 face images of the real-world data set. In particular, it provides a 99.7% detection rate for the 2120 CMU PIE images without glasses. Face recognition performance is also investigated using the eye coordinates from the proposed detector. The recognition results for the real-world data set show that the proposed detector gives similar performance to the method using manually located eye coordinates, showing that the accuracy of the proposed eye detector is comparable with that of the ground-truth data.

15 citations
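A hedged reading of the composite-vector construction described in the abstract (this is an interpretation for illustration, not the authors' code): the image window associated with each composite vector is treated as a small vector of pixels, and the "covariance" between two composite vectors is the expected inner product of their mean-subtracted values. With 1x1 windows this reduces to the ordinary pixel covariance matrix, which matches the claim that it generalizes the covariance of pixels:

```python
import numpy as np

def composite_covariance(samples, windows):
    """
    samples : (N, D) array of vectorized image patches (N training samples, D pixels each)
    windows : list of equal-length index arrays, one per composite vector
    returns : (K, K) composite covariance matrix, K = number of windows
    """
    centered = samples - samples.mean(axis=0, keepdims=True)
    comps = [centered[:, idx] for idx in windows]   # per window: N x window_size
    K = len(windows)
    C = np.zeros((K, K))
    for i in range(K):
        for j in range(K):
            # expected inner product of mean-subtracted composite vectors i and j
            C[i, j] = np.mean(np.sum(comps[i] * comps[j], axis=1))
    return C
```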


Proceedings ArticleDOI
01 Sep 2012
TL;DR: This work presents an approach based on noncooperative game theory for computing the locations of every binary feature in a pattern, improving the performance of binary-feature-based matchers; the evaluation shows an improvement in matching keypoints, in particular those with similar texture.
Abstract: Many applications in computer vision rely on determining the correspondence between two images that share an overlapping region. One way to establish this correspondence is by matching binary keypoint descriptors from both images. Although these descriptors are efficiently computed with bits produced by an arrangement of binary features (a pattern), their matching performance falls short in comparison with more elaborate descriptors such as SIFT. We present an approach based on noncooperative game theory for computing the locations of every binary feature in a pattern, improving the performance of binary-feature-based matchers. We propose a simultaneous two-player zero-sum game in which a maximizer wants to increase a payoff by selecting the possible locations for the features; a minimizer wants to decrease the payoff by selecting a pair of keypoints to confuse the maximizer; and the payoff matrix is computed from the pixel intensities across the pixel neighborhood of the keypoints. We use the best locations from the obtained maximizer's optimal policy for locating every binary feature in the pattern. Our evaluation of this approach coupled with Ferns shows an improvement in matching keypoints, in particular those with similar texture. Moreover, our approach improves the matching performance when fewer bits are required.
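The maximizer's optimal policy for a zero-sum matrix game can be obtained with a small linear program. The sketch below is a generic way to compute that mixed strategy and keep the highest-probability feature locations; the paper does not specify this solver, so the use of scipy here is an assumption for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def maximizer_policy(payoff):
    """
    payoff : (m, n) matrix; rows = candidate feature locations (maximizer),
             columns = keypoint pairs chosen by the minimizer.
    returns: length-m mixed strategy (probabilities) over locations.
    """
    m, n = payoff.shape
    # Variables: x (m location probabilities) and v (game value). Objective: maximize v.
    c = np.concatenate([np.zeros(m), [-1.0]])                  # linprog minimizes, so use -v
    # For every minimizer column j: v - sum_i payoff[i, j] * x_i <= 0
    A_ub = np.hstack([-payoff.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)  # probabilities sum to 1
    b_eq = [1.0]
    bounds = [(0, None)] * m + [(None, None)]                  # x >= 0, v unconstrained
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m]

# Usage sketch: pick the top-k locations by probability mass to place the binary features.
# locations = np.argsort(-maximizer_policy(payoff_matrix))[:k]
```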

Proceedings ArticleDOI
21 Sep 2012
TL;DR: Demonstrates a framework and prototype implementation for unobtrusive mobile remote collaboration on tasks that involve the physical environment, using model-free, markerless visual tracking to facilitate decoupled, live-updated views of the environment and world-stabilized annotations while supporting a moving camera and unknown, unprepared environments.
Abstract: In the accompanying paper [1], we describe a framework and prototype implementation for unobtrusive mobile remote collaboration on tasks that involve the physical environment. Our system uses model-free, markerless visual tracking to facilitate decoupled, live-updated views of the environment and world-stabilized annotations while supporting a moving camera and unknown, unprepared environments. We conducted a user study with 48 participants to evaluate our concept. In this demo, we will present our system prototype and the setup used in the user study: a remote expert instructs a local user to operate a mock-up airplane. Users will be able to try out our interface as well as the two interfaces used as baselines in the study.