
Showing papers by "Sebastian Thrun published in 2010"


Journal ArticleDOI
27 Aug 2010-Science
TL;DR: Using a bioengineered substrate to recapitulate key biophysical and biochemical niche features in conjunction with a highly automated single-cell tracking algorithm, it is shown that substrate elasticity is a potent regulator of MuSC fate in culture.
Abstract: Stem cells that naturally reside in adult tissues, such as muscle stem cells (MuSCs), exhibit robust regenerative capacity in vivo that is rapidly lost in culture. Using a bioengineered substrate to recapitulate key biophysical and biochemical niche features in conjunction with a highly automated single-cell tracking algorithm, we show that substrate elasticity is a potent regulator of MuSC fate in culture. Unlike MuSCs on rigid plastic dishes (approximately 10^6 kilopascals), MuSCs cultured on soft hydrogel substrates that mimic the elasticity of muscle (12 kilopascals) self-renew in vitro and contribute extensively to muscle regeneration when subsequently transplanted into mice and assayed histologically and quantitatively by noninvasive bioluminescence imaging. Our studies provide novel evidence that by recapitulating physiological tissue rigidity, propagation of adult muscle stem cells is possible, enabling future cell-based therapies for muscle-wasting diseases.

1,428 citations


Proceedings ArticleDOI
03 May 2010
TL;DR: This work proposes an extension to map-based vehicle localization that yields substantial improvements over previous work, including higher precision, the ability to learn and improve maps over time, and increased robustness to environment changes and dynamic obstacles.
Abstract: Autonomous vehicle navigation in dynamic urban environments requires localization accuracy exceeding that available from GPS-based inertial guidance systems. We have shown previously that GPS, IMU, and LIDAR data can be used to generate a high-resolution infrared remittance ground map that can be subsequently used for localization [4]. We now propose an extension to this approach that yields substantial improvements over previous work in vehicle localization, including higher precision, the ability to learn and improve maps over time, and increased robustness to environment changes and dynamic obstacles. Specifically, we model the environment not as a spatial grid of fixed infrared remittance values, but as a probabilistic grid in which every cell is represented by its own Gaussian distribution over remittance values. Subsequently, Bayesian inference is able to preferentially weight parts of the map most likely to be stationary and of consistent angular reflectivity, thereby reducing uncertainty and catastrophic errors. Furthermore, by using offline SLAM to align multiple passes of the same environment, possibly separated in time by days or even months, it is possible to build an increasingly robust understanding of the world that can then be exploited for localization. We validate the effectiveness of our approach by using these algorithms to localize our vehicle against probabilistic maps in various dynamic environments, achieving RMS accuracy in the 10 cm range and thus outperforming previous work. Importantly, this approach has enabled us to autonomously drive our vehicle for hundreds of miles in dense traffic on narrow urban roads that were formerly unnavigable with previous localization methods.
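The per-cell model described above — each grid cell as its own Gaussian over remittance values — can be maintained online with a running mean and variance. Below is a minimal sketch (class and method names are illustrative, not from the paper) using Welford's algorithm; cells with high variance, which likely correspond to dynamic or reflectivity-inconsistent surfaces, yield flat likelihoods and therefore contribute weak evidence during localization:

```python
import math

class RemittanceCell:
    """One map cell: an online Gaussian over observed remittance values.

    Illustrative sketch of the per-cell model described in the abstract;
    names and interface are assumptions, not the authors' code.
    """
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford)

    def update(self, z):
        """Incorporate one remittance observation z."""
        self.n += 1
        delta = z - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (z - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else float('inf')

    def log_likelihood(self, z):
        """Log-likelihood of a new measurement under this cell's Gaussian.
        High-variance (inconsistent) cells give nearly flat likelihoods,
        so Bayesian inference automatically down-weights them."""
        var = self.variance
        if math.isinf(var) or var == 0.0:
            return 0.0  # uninformative cell
        return -0.5 * (math.log(2 * math.pi * var) + (z - self.mean) ** 2 / var)
```

During localization, summing `log_likelihood` over the cells a candidate pose projects onto scores that pose; stable, consistently reflective cells dominate the score.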

615 citations


Journal ArticleDOI
TL;DR: A practical path-planning algorithm for an autonomous vehicle operating in an unknown semi-structured (or unstructured) environment, where obstacles are detected online by the robot’s sensors is described, leading to faster search and final trajectories better suited to the structure of the environment.
Abstract: We describe a practical path-planning algorithm for an autonomous vehicle operating in an unknown semi-structured (or unstructured) environment, where obstacles are detected online by the robot’s sensors. This work was motivated by and experimentally validated in the 2007 DARPA Urban Challenge, where robotic vehicles had to autonomously navigate parking lots. The core of our approach to path planning consists of two phases. The first phase uses a variant of A* search (applied to the 3D kinematic state space of the vehicle) to obtain a kinematically feasible trajectory. The second phase then improves the quality of the solution via numeric non-linear optimization, leading to a local (and frequently global) optimum. Further, we extend our algorithm to use prior topological knowledge of the environment to guide path planning, leading to faster search and final trajectories better suited to the structure of the environment. We present experimental results from the DARPA Urban Challenge, where our robot demonstrated near-flawless performance in complex general path-planning tasks such as navigating parking lots and executing U-turns on blocked roads. We also present results on autonomous navigation of real parking lots. In those latter tasks, which are significantly more complex than the ones in the DARPA Urban Challenge, the time of a full replanning cycle of our planner is in the range of 50–300 ms.
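The first phase can be illustrated with a plain grid A*. This is deliberately simplified: the paper applies a variant of A* to the vehicle's 3D kinematic state space (x, y, heading) and then smooths the result with non-linear optimization, whereas this sketch searches a 2D occupancy grid with a Manhattan heuristic:

```python
import heapq

def astar(grid, start, goal):
    """Minimal 4-connected grid A* with a Manhattan-distance heuristic.

    Illustrative only — not the paper's kinematic-state-space variant.
    `grid` is a list of strings where '#' marks an obstacle; `start` and
    `goal` are (row, col) tuples.
    """
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    came_from = {start: None}
    g_cost = {start: 0}
    frontier = [(h(start), 0, start)]
    while frontier:
        f, g, cur = heapq.heappop(frontier)
        if g > g_cost[cur]:
            continue  # stale heap entry
        if cur == goal:  # reconstruct the path back to start
            path = [cur]
            while came_from[cur] is not None:
                cur = came_from[cur]
                path.append(cur)
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != '#':
                ng = g + 1
                if ng < g_cost.get(nxt, float('inf')):
                    g_cost[nxt] = ng
                    came_from[nxt] = cur
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None  # no path exists
```

In the paper's setting the search states additionally carry heading and motion direction, and edge costs penalize reversing and steering changes; the A* skeleton itself is unchanged.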

594 citations


Proceedings ArticleDOI
03 May 2010
TL;DR: A semi-reactive trajectory generation method that can be tightly integrated into the behavioral layer of the holistic autonomous system and that realizes long-term objectives such as velocity keeping, merging, following, and stopping, in combination with reactive collision avoidance, by means of optimal-control strategies within the Frenet frame of the street.
Abstract: Safe handling of dynamic highway and inner-city scenarios with autonomous vehicles involves the problem of generating traffic-adapted trajectories. In order to account for the practical requirements of the holistic autonomous system, we propose a semi-reactive trajectory generation method, which can be tightly integrated into the behavioral layer. The method realizes long-term objectives such as velocity keeping, merging, following, and stopping, in combination with a reactive collision avoidance by means of optimal-control strategies within the Frenet frame [12] of the street. The capabilities of this approach are demonstrated in the simulation of a typical high-speed highway scenario.
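Optimal-control trajectory generation in the Frenet frame is commonly built on jerk-optimal quintic polynomials that connect a start state (position, velocity, acceleration) to an end state over a horizon T. A dependency-free sketch of that building block (illustrative code, not the authors' implementation; the full method samples many end states and picks the minimum-cost collision-free candidate):

```python
def quintic_coeffs(s0, v0, a0, sT, vT, aT, T):
    """Coefficients of d(t) = c0 + c1*t + ... + c5*t^5 matching position,
    velocity and acceleration at t=0 and t=T — the standard jerk-optimal
    connection used in Frenet-frame trajectory generation (sketch)."""
    c0, c1, c2 = s0, v0, a0 / 2.0
    # The remaining boundary conditions at t=T form a 3x3 linear system
    # for c3, c4, c5; solve it with Cramer's rule.
    T2, T3, T4, T5 = T**2, T**3, T**4, T**5
    b = (sT - (c0 + c1 * T + c2 * T2),
         vT - (c1 + 2 * c2 * T),
         aT - 2 * c2)
    A = [[T3, T4, T5],
         [3 * T2, 4 * T3, 5 * T4],
         [6 * T, 12 * T2, 20 * T3]]
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    D = det(A)
    def col(i):
        m = [row[:] for row in A]
        for r in range(3):
            m[r][i] = b[r]
        return m
    c3, c4, c5 = (det(col(i)) / D for i in range(3))
    return [c0, c1, c2, c3, c4, c5]

def poly_eval(c, t):
    """Evaluate the polynomial with coefficient list c at time t."""
    return sum(ci * t**i for i, ci in enumerate(c))
```

For instance, connecting a lateral offset of 0 to 1 with zero boundary velocities and accelerations reproduces the familiar minimum-jerk profile 10τ³ − 15τ⁴ + 6τ⁵ in normalized time τ = t/T.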

567 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: This paper derives an efficient filtering algorithm for tracking human pose using a stream of monocular depth images and describes a novel algorithm for propagating noisy evidence about body part locations up the kinematic chain using the unscented transform.
Abstract: Markerless tracking of human pose is a hard yet relevant problem. In this paper, we derive an efficient filtering algorithm for tracking human pose using a stream of monocular depth images. The key idea is to combine an accurate generative model — which is achievable in this setting using programmable graphics hardware — with a discriminative model that provides data-driven evidence about body part locations. In each filter iteration, we apply a form of local model-based search that exploits the nature of the kinematic chain. As fast movements and occlusion can disrupt the local search, we utilize a set of discriminatively trained patch classifiers to detect body parts. We describe a novel algorithm for propagating this noisy evidence about body part locations up the kinematic chain using the unscented transform. The resulting distribution of body configurations allows us to reinitialize the model-based search. We provide extensive experimental results on 28 real-world sequences using automatic ground-truth annotations from a commercial motion capture system.
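The unscented transform propagates a Gaussian through a nonlinear function by pushing a small set of deterministically chosen sigma points through it and re-estimating mean and variance from the outputs. A one-dimensional, stdlib-only sketch (the paper's filter is multivariate and operates along the kinematic chain; 1-D keeps the example dependency-free):

```python
import math

def unscented_transform_1d(mean, var, f, alpha=1.0, kappa=2.0):
    """Propagate a 1-D Gaussian (mean, var) through nonlinear f using
    sigma points — a minimal sketch of the unscented transform, not the
    authors' multivariate implementation."""
    n = 1
    lam = alpha**2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * var)
    sigmas = [mean, mean + spread, mean - spread]
    w0 = lam / (n + lam)
    wi = 1.0 / (2 * (n + lam))
    weights = [w0, wi, wi]
    ys = [f(x) for x in sigmas]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var
```

For affine f the transform is exact, and with kappa = 2 (so n + lam = 3) it also matches the true mean of quadratic functions of a standard Gaussian — the property that makes it attractive for passing noisy part detections through joint-angle nonlinearities.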

406 citations


Proceedings ArticleDOI
03 May 2010
TL;DR: Experiments show that the interest points in conjunction with a boosted patch classifier are significantly better in detecting body parts in depth images than state-of-the-art sliding-window based detectors.
Abstract: We deal with the problem of detecting and identifying body parts in depth images at video frame rates. Our solution involves a novel interest point detector for mesh and range data that is particularly well suited for analyzing human shape. The interest points, which are based on identifying geodesic extrema on the surface mesh, coincide with salient points of the body, which can be classified as, e.g., hand, foot or head using local shape descriptors. Our approach also provides a natural way of estimating a 3D orientation vector for a given interest point. This can be used to normalize the local shape descriptors to simplify the classification problem as well as to directly estimate the orientation of body parts in space. Experiments involving ground truth labels acquired via an active motion capture system show that our interest points in conjunction with a boosted patch classifier are significantly better in detecting body parts in depth images than state-of-the-art sliding-window based detectors.
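Geodesic extrema on a surface mesh can be found by running Dijkstra's algorithm over the mesh's vertex graph and taking the farthest vertex; repeating with the found extremum added as a source yields further extrema. A small sketch of that core step (the adjacency-list interface is an assumption standing in for a real mesh):

```python
import heapq

def geodesic_extremum(adj, source):
    """Return (farthest vertex, distance map) for graph-geodesic distance
    from `source`. `adj` maps vertex -> list of (neighbor, edge_length)
    pairs, a stand-in for a surface mesh. Repeatedly extracting the
    farthest vertex yields the geodesic extrema — typically hands, feet
    and head on a human mesh — used as interest points (sketch, not the
    authors' code)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return max(dist, key=dist.get), dist
```

On a real mesh the edge lengths are Euclidean distances between adjacent vertices, and the direction of the final geodesic path segment at an extremum gives the 3D orientation vector mentioned in the abstract.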

335 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: The surprising result is shown that 3D scans of reasonable quality can be obtained even with a sensor of such low data quality, using a new combination of a 3D superresolution method with a probabilistic scan alignment approach that explicitly takes into account the sensor's noise characteristics.
Abstract: We describe a method for 3D object scanning by aligning depth scans that were taken from around an object with a time-of-flight camera. These ToF cameras can measure depth scans at video rate. Because of their comparatively simple technology, they have the potential for low-cost production in large volumes. Our easy-to-use, cost-effective scanning solution based on such a sensor could make 3D scanning technology more accessible to everyday users. The algorithmic challenge we face is that the sensor's level of random noise is substantial and there is a non-trivial systematic bias. In this paper we show the surprising result that 3D scans of reasonable quality can nonetheless be obtained with a sensor of such low data quality. Established filtering and scan alignment techniques from the literature fail to achieve this goal. In contrast, our algorithm is based on a new combination of a 3D superresolution method with a probabilistic scan alignment approach that explicitly takes into account the sensor's noise characteristics.
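A basic ingredient of superresolution from noisy scans is combining repeated, aligned measurements of the same surface point in proportion to their reliability. The sketch below shows only that inverse-variance-weighted averaging building block; the paper's full method additionally models the sensor's systematic bias and performs the alignment itself probabilistically:

```python
def fuse_point_depths(measurements):
    """Inverse-variance-weighted fusion of aligned depth readings of one
    surface point. `measurements` is a list of (depth, variance) pairs;
    the result is the maximum-likelihood estimate under independent
    Gaussian noise (illustrative sketch, not the authors' pipeline)."""
    weight_sum = sum(1.0 / var for _, var in measurements)
    return sum(d / var for d, var in measurements) / weight_sum
```

Equal-variance readings reduce to a plain average, while a low-noise reading dominates noisier ones — which is why many cheap, noisy ToF frames can still produce a clean surface estimate.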

308 citations


Journal ArticleDOI
TL;DR: It is argued that enormous societal benefits can be reaped by deploying this emerging autopilot technology for cars in the marketplace, and some of the key remaining technology obstacles are discussed.
Abstract: This article advocates self-driving, robotic technology for cars. Recent challenges organized by DARPA have induced a significant advance in technology for autopilots for cars, similar to those already used in aircraft and marine vessels. This article reviews this technology and argues that enormous societal benefits can be reaped by deploying this emerging technology in the marketplace. It lays out a vision for deployment and discusses some of the key remaining technology obstacles.

226 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work presents a flexible method for fusing information from optical and range sensors based on an accelerated high-dimensional filtering approach, and describes how to integrate priors on object motion and appearance and how to achieve an efficient implementation using parallel processing hardware such as GPUs.
Abstract: We present a flexible method for fusing information from optical and range sensors based on an accelerated high-dimensional filtering approach. Our system takes as input a sequence of monocular camera images as well as a stream of sparse range measurements as obtained from a laser or other sensor system. In contrast with existing approaches, we do not assume that the depth and color data streams have the same data rates or that the observed scene is fully static. Our method produces a dense, high-resolution depth map of the scene, automatically generating confidence values for every interpolated depth point. We describe how to integrate priors on object motion and appearance and how to achieve an efficient implementation using parallel processing hardware such as GPUs.
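The fusion idea can be illustrated with a toy 1-D guided interpolation: each output depth is a weighted average of the known sparse depths, with weights falling off in both pixel distance and color difference, and the total weight serving as a confidence value. The paper accelerates the analogous high-dimensional filter on the GPU; everything below (function name, parameters) is an illustrative assumption:

```python
import math

def densify_depth_row(colors, sparse_depth, sigma_s=2.0, sigma_c=0.1):
    """Densify a sparse depth row guided by a co-registered color row.

    Toy 1-D sketch of joint-bilateral-style depth/color fusion: weights
    decay with pixel distance (sigma_s) and color difference (sigma_c).
    `sparse_depth` maps pixel index -> measured depth. Returns a dense
    depth list and a per-pixel confidence (the accumulated weight)."""
    dense, confidence = [], []
    for i, c in enumerate(colors):
        weight_sum = depth_sum = 0.0
        for j, d in sparse_depth.items():
            w = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)
                         - ((c - colors[j]) ** 2) / (2 * sigma_c ** 2))
            weight_sum += w
            depth_sum += w * d
        dense.append(depth_sum / weight_sum if weight_sum > 0 else float('nan'))
        confidence.append(weight_sum)
    return dense, confidence
```

Because weights respect color edges, depth does not bleed across object boundaries: a pixel inherits depth from samples on its own side of the edge, and pixels far from any sample get low confidence.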

175 citations


Journal ArticleDOI
26 Jul 2010
TL;DR: The modularity of the proposed method allows customization of a character's gesture repertoire, animation of non-human characters, and the use of additional inputs such as speech recognition or direct user control.
Abstract: We introduce gesture controllers, a method for animating the body language of avatars engaged in live spoken conversation. A gesture controller is an optimal-policy controller that schedules gesture animations in real time based on acoustic features in the user's speech. The controller consists of an inference layer, which infers a distribution over a set of hidden states from the speech signal, and a control layer, which selects the optimal motion based on the inferred state distribution. The inference layer, consisting of a specialized conditional random field, learns the hidden structure in body language style and associates it with acoustic features in speech. The control layer uses reinforcement learning to construct an optimal policy for selecting motion clips from a distribution over the learned hidden states. The modularity of the proposed method allows customization of a character's gesture repertoire, animation of non-human characters, and the use of additional inputs such as speech recognition or direct user control.
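In its simplest form, the control layer's decision rule selects the motion clip with the highest expected value under the inferred distribution over hidden body-language states. The sketch below uses a fixed reward table as a hypothetical stand-in for the policy the paper learns with reinforcement learning:

```python
def select_motion(state_dist, reward):
    """Pick the motion clip maximizing expected reward under the inferred
    hidden-state distribution. `state_dist` maps state -> probability;
    `reward` maps state -> {motion: value}. Illustrative sketch — the
    paper's control layer learns this policy rather than reading it from
    a table."""
    motions = next(iter(reward.values())).keys()
    def expected(motion):
        return sum(p * reward[s][motion] for s, p in state_dist.items())
    return max(motions, key=expected)
```

The inference layer (the conditional random field) supplies `state_dist` from the speech signal at each time step; swapping the reward table changes the character's gesture repertoire without touching inference, which is the modularity the abstract highlights.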

143 citations


Patent
21 Jun 2010
TL;DR: In this article, a system and method provides maps identifying the 3D location of traffic lights, which can then be used to assist robotic vehicles or human drivers to identify the location and status of a traffic signal.
Abstract: A system and method provides maps identifying the 3D location of traffic lights. The position, location, and orientation of a traffic light may be automatically extrapolated from two or more images. The maps may then be used to assist robotic vehicles or human drivers to identify the location and status of a traffic signal.

Proceedings ArticleDOI
15 Dec 2010
TL;DR: A new performance capture approach that incorporates a physically-based cloth model to reconstruct a rigged fully-animatable virtual double of a real person in loose apparel from multi-view video recordings and can now also create new real-time animations of actors captured in general apparel.
Abstract: We present a new performance capture approach that incorporates a physically-based cloth model to reconstruct a rigged fully-animatable virtual double of a real person in loose apparel from multi-view video recordings. Our algorithm only requires a minimum of manual interaction. Without the use of optical markers in the scene, our algorithm first reconstructs skeleton motion and detailed time-varying surface geometry of a real person from a reference video sequence. These captured reference performance data are then analyzed to automatically identify non-rigidly deforming pieces of apparel on the animated geometry. For each piece of apparel, parameters of a physically-based real-time cloth simulation model are estimated, and surface geometry of occluded body regions is approximated. The reconstructed character model comprises a skeleton-based representation for the actual body parts and a physically-based simulation model for the apparel. In contrast to previous performance capture methods, we can now also create new real-time animations of actors captured in general apparel.

Proceedings ArticleDOI
03 May 2010
TL;DR: This work applies its approach to the task of autonomous sideways sliding into a parking spot, and shows that it can repeatedly and accurately control the system, placing the car within about 2 feet of the desired location; this represents the state of the art in terms of accurately controlling a vehicle in such a maneuver.
Abstract: We consider the task of accurately controlling a complex system, such as autonomously sliding a car sideways into a parking spot. Although certain regions of this domain are extremely hard to model (i.e., the dynamics of the car while skidding), we observe that in practice such systems are often remarkably deterministic over short periods of time, even in difficult-to-model regions. Motivated by this intuition, we develop a probabilistic method for combining closed-loop control in the well-modeled regions and open-loop control in the difficult-to-model regions. In particular, we show that by combining 1) an inaccurate model of the system and 2) a demonstration of the desired behavior, our approach can accurately and robustly control highly challenging systems, without the need to explicitly model the dynamics in the most complex regions and without the need to hand-tune the switching control law. We apply our approach to the task of autonomous sideways sliding into a parking spot, and show that we can repeatedly and accurately control the system, placing the car within about 2 feet of the desired location; to the best of our knowledge, this represents the state of the art in terms of accurately controlling a vehicle in such a maneuver.

Proceedings ArticleDOI
03 Dec 2010
TL;DR: This paper finds experimentally that fusion of multiple sensor modalities is necessary for optimal performance and demonstrates sub-meter localization accuracy.
Abstract: The interpretation of uncertain sensor streams for localization is usually considered in the context of a robot. Increasingly, however, portable consumer electronic devices, such as smartphones, are equipped with sensors including WiFi radios, cameras, and inertial measurement units (IMUs). Many tasks typically associated with robots, such as localization, would be valuable to perform on such devices. In this paper, we present an approach for indoor localization exclusively using the low-cost sensors typically found on smartphones. Environment modification is not needed. We rigorously evaluate our method using ground truth acquired using a laser range scanner. Our evaluation includes overall accuracy and a comparison of the contribution of individual sensors. We find experimentally that fusion of multiple sensor modalities is necessary for optimal performance and demonstrate sub-meter localization accuracy.
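Why fusing modalities helps can be seen in a single predict/update cycle of a 1-D grid Bayes filter: dead reckoning (e.g. an IMU-derived step estimate) shifts and blurs the belief, and a WiFi signal-strength observation re-anchors it to the map. This is purely illustrative — the paper does not publish this exact filter, and all names and parameters are assumptions:

```python
import math

def bayes_filter_step(belief, step, rssi_obs, rssi_map,
                      step_var=0.1, rssi_sigma=4.0):
    """One predict/update cycle of a 1-D grid Bayes filter fusing a step
    estimate (cells moved) with a WiFi RSSI observation (sketch).
    `belief` is a probability per grid cell; `rssi_map` gives the
    expected RSSI per cell."""
    n = len(belief)
    # Predict: shift the belief by `step` cells with Gaussian motion noise.
    pred = [0.0] * n
    for i, p in enumerate(belief):
        for k in range(n):
            pred[k] += p * math.exp(-((k - i - step) ** 2) / (2 * step_var))
    # Update: reweight by the WiFi likelihood, then renormalize.
    post = [p * math.exp(-((rssi_obs - rssi_map[k]) ** 2)
                         / (2 * rssi_sigma ** 2))
            for k, p in enumerate(pred)]
    z = sum(post)
    return [p / z for p in post]
```

With motion alone the belief only diffuses; with WiFi alone it stays multi-modal wherever signal strengths are ambiguous; together they concentrate on the true cell, which mirrors the paper's experimental finding that multi-sensor fusion is necessary for optimal performance.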

Proceedings ArticleDOI
13 Jun 2010
TL;DR: An algorithm that learns invariant features from real data in an entirely unsupervised fashion that can be applied without human intervention to a particular application or data set, learning the specific invariances necessary for excellent feature performance on that data.
Abstract: We present an algorithm that learns invariant features from real data in an entirely unsupervised fashion. The principal benefit of our method is that it can be applied without human intervention to a particular application or data set, learning the specific invariances necessary for excellent feature performance on that data. Our algorithm relies on the ability to track image patches over time using optical flow. With the wide availability of high frame rate video (e.g., on the web or from a robot), good tracking is straightforward to achieve. The algorithm then optimizes feature parameters such that patches corresponding to the same physical location have feature descriptors that are as similar as possible while simultaneously maximizing the distinctness of descriptors for different locations. Thus, our method captures data- or application-specific invariances yet does not require any manual supervision. We apply our algorithm to learn domain-optimized versions of SIFT and HOG. SIFT and HOG features are excellent and widely used. However, they are general and by definition not tailored to a specific domain. Our domain-optimized versions offer a substantial performance increase for the classification and correspondence tasks we consider. Furthermore, we show that the features our method learns are close to the optimum that would be achieved by directly optimizing the test-set performance of a classifier. Finally, we demonstrate that the learning often allows fewer features to be used for some tasks, which has the potential to dramatically reduce computational costs for very large data sets.
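The optimization criterion — descriptors of patches tracked to the same physical point should be similar, descriptors of different points distinct — can be written as a simple contrastive score over tracked patches. The sketch below grid-searches a single feature parameter against that score; the `descriptor(patch, theta)` interface and all names are hypothetical, standing in for parameterized SIFT/HOG variants:

```python
def contrastive_score(theta, tracks, descriptor):
    """Average squared descriptor distance between patches of *different*
    tracks minus that between patches of the *same* track — the quantity
    the method maximizes (illustrative sketch). `tracks` is a list of
    lists of raw patches; `descriptor(patch, theta)` is any
    parameterized feature (hypothetical interface)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    same = diff = 0.0
    same_n = diff_n = 0
    descs = [[descriptor(p, theta) for p in track] for track in tracks]
    for ti, track in enumerate(descs):
        for i in range(len(track)):
            for j in range(i + 1, len(track)):  # same physical point
                same += dist(track[i], track[j]); same_n += 1
            for tj in range(ti + 1, len(descs)):  # different points
                for q in descs[tj]:
                    diff += dist(track[i], q); diff_n += 1
    return diff / diff_n - same / same_n

def learn_theta(thetas, tracks, descriptor):
    """Pick the parameter value that best separates the tracks."""
    return max(thetas, key=lambda th: contrastive_score(th, tracks, descriptor))
```

In the toy test below, the third patch dimension is pure noise, so the learned parameter correctly truncates the descriptor to the two informative dimensions — a miniature version of "learning the specific invariances necessary" for the data.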

Book ChapterDOI
01 Jan 2010
TL;DR: This chapter describes a new mesh-based performance capture algorithm that uses a combination of deformable surface and volume models for high-quality reconstruction of people in general apparel, i.e., including wide dresses and skirts.
Abstract: Nowadays, increasing performance of computing hardware makes it feasible to simulate ever more realistic humans even in real-time applications for the end-user. To fully capitalize on these computational resources, all aspects of the human, including textural appearance and lighting, and, most importantly, dynamic shape and motion have to be simulated at high fidelity in order to convey the impression of a realistic human being. In consequence, the increase in computing power is flanked by increasing demands on the skills of animators. In this chapter, we describe several recently developed performance capture techniques that enable animators to measure detailed animations from real-world subjects recorded on multi-view video. In contrast to classical motion capture, performance capture approaches not only measure motion parameters without the use of optical markers, but also measure detailed spatio-temporally coherent dynamic geometry and surface texture of a performing subject. This chapter gives an overview of recent state-of-the-art performance capture approaches from the literature. The core of the chapter describes a new mesh-based performance capture algorithm that uses a combination of deformable surface and volume models for high-quality reconstruction of people in general apparel, i.e., including wide dresses and skirts. The chapter concludes with a discussion of the different approaches, pointers to additional literature, and a brief outline of open research questions for the future.

Patent
25 Feb 2010
TL;DR: In this paper, the lengths of possible paths from a plurality of points on the three-dimensional surface mesh to a common reference point are categorized and used to identify a subset of the points as salient points.
Abstract: A variety of methods, systems, devices and arrangements are implemented for use with motion capture. One such method is implemented for identifying salient points from three-dimensional image data. The method involves the execution of instructions on a computer system to generate a three-dimensional surface mesh from the three-dimensional image data. Lengths of possible paths from a plurality of points on the three-dimensional surface mesh to a common reference point are categorized. The categorized lengths of possible paths are used to identify a subset of the plurality of points as salient points.

Patent
25 Feb 2010
TL;DR: In this paper, depth-based image data is used in a system that includes a processing circuit configured and arranged to render a plurality of orientations for at least one object.
Abstract: Systems, devices, methods and arrangements are implemented in a variety of embodiments to facilitate motion capture of objects. Consistent with one such system, three-dimensional representations are determined for at least one object. Depth-based image data is used in the system, which includes a processing circuit configured and arranged to render a plurality of orientations for at least one object. Orientations from the plurality of orientations are assessed against the depth-based image data. An orientation is selected from the plurality of orientations as a function of the assessment of orientations from the plurality of orientations.

Patent
01 Jun 2010
TL;DR: In this paper, a Markov Random Field (MRF) model is defined for estimating a number of mobile devices being used within a geographic area, and the estimated population density can then be used to provide location-based services.
Abstract: The population density for a geographic area is predicted using a Markov Random Field (MRF) model. A MRF model is defined for estimating a number of mobile devices being used within a geographic area. The MRF model includes a set of rules describing how to use current data describing mobile devices currently observed in the area, and historical data describing mobile devices historically observed in the area, to produce the estimate. Values of weight parameters in the MRF model are learned using the historical data. The current and historical data are applied to the MRF model having the learned weight parameters, and cost minimization is used to estimate the number of mobile devices currently being used within the area. This estimate is used to predict the population density for the area. The predicted population density can then be used to provide location-based services.
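The cost-minimization step can be sketched as a quadratic MRF over a chain of cells: data terms pull each cell's estimate toward its current and historical observations, and a pairwise term smooths neighboring cells. The weights below are fixed, illustrative values (in the patent they are learned from historical data), and coordinate descent stands in for the unspecified minimizer:

```python
def estimate_density(current, historical, w_cur=1.0, w_hist=0.5,
                     smooth=0.2, iters=50):
    """Estimate per-cell device counts by minimizing
        sum_i [ w_cur*(x_i - cur_i)^2 + w_hist*(x_i - hist_i)^2 ]
        + smooth * sum_neighbors (x_i - x_j)^2
    via coordinate descent (illustrative sketch of the MRF cost
    minimization; weights are hypothetical, not learned here)."""
    x = list(current)
    n = len(x)
    for _ in range(iters):
        for i in range(n):
            # Closed-form minimizer of the quadratic cost in x[i] alone.
            num = w_cur * current[i] + w_hist * historical[i]
            den = w_cur + w_hist
            for j in (i - 1, i + 1):
                if 0 <= j < n:
                    num += smooth * x[j]
                    den += smooth
            x[i] = num / den
    return x
```

A cell whose current observation drops to zero (e.g. a momentary sensing gap) is pulled back up by its historical value and its neighbors, rather than being taken at face value — the regularizing behavior the rule set is meant to encode.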

Patent
25 Feb 2010
TL;DR: In this paper, a system for tracking at least one object articulated in three-dimensional space is implemented using data obtained from a depth sensor, which includes at least a processing circuit configured and arranged to determine location probabilities for a plurality of object parts by identifying, from image data, features of the object parts.
Abstract: Methods, systems, devices and arrangements are implemented for motion tracking. One such system for tracking at least one object articulated in three-dimensional space is implemented using data obtained from a depth sensor. The system includes at least one processing circuit configured and arranged to determine location probabilities for a plurality of object parts by identifying, from image data obtained from the depth sensor, features of the object parts. The processing circuit selects a set of poses for the at least one object based upon the determined location probabilities and generates modeled depth sensor data by applying the selected set of poses to a model of the at least one object. The processing circuit selects a pose for the at least one object based upon a probabilistic comparison between the data obtained from the depth sensor and the modeled depth sensor data.

Proceedings Article
23 Mar 2010
TL;DR: RASCL combines state-of-the-art sensing and localization techniques with an accurate map describing road structure to detect and track other cars, determine whether or not a lane change to either side is safe, and communicate these safety statuses to the user using a variety of audio and visual interfaces.
Abstract: Lane changing on highways is stressful. In this paper, we present RASCL, the Robotic Assistance System for Changing Lanes. RASCL combines state-of-the-art sensing and localization techniques with an accurate map describing road structure to detect and track other cars, determine whether or not a lane change to either side is safe, and communicate these safety statuses to the user using a variety of audio and visual interfaces. The user can interact with the system by specifying the size of their "comfort zone", by engaging the turn signal, or simply by driving across lane dividers. Additionally, RASCL provides speed change recommendations that are predicted to turn an unsafe lane change situation into a safe one, and enables communication with other vehicles by automatically controlling the turn signal when the driver attempts to change lanes without using it.

Proceedings ArticleDOI
02 Nov 2010
TL;DR: A description of the Google Street View Project is provided, from the beginnings to its current state, and some of the implications of this new database, and key challenges in moving ahead are discussed.
Abstract: A description of the Google Street View Project is provided, from its beginnings to its current state. Google Street View is perhaps the largest image database ever collected. The goal of this project is to take panoramic images at every public place in the world and to make these images accessible through the Internet, so that people can "tele-port" themselves to anywhere, anytime. While the general idea behind Street View can be traced back to the MIT Aspen Movie Map project that photographed the major roads of Aspen in the late 1970s, a project of this scale and proportion has never been tried before. Many of the issues that arose in building up a fleet of vehicles and in processing endless petabytes of image data in endless Google computer racks are reviewed. In addition, some of the implications of this new database, and key challenges in moving ahead, are discussed.