
Showing papers by "Sebastian Thrun" published in 2013


Patent
28 Feb 2013
TL;DR: A passenger in an automated vehicle may relinquish control of the vehicle to a control computer when the control computer has determined that it can maneuver the vehicle safely to a destination.
Abstract: A passenger in an automated vehicle may relinquish control of the vehicle to a control computer when the control computer has determined that it may maneuver the vehicle safely to a destination. The passenger may relinquish or regain control of the vehicle by applying different degrees of pressure, for example, on a steering wheel of the vehicle. The control computer may convey status information to a passenger in a variety of ways including by illuminating elements of the vehicle. The color and location of the illumination may indicate the status of the control computer, for example, whether the control computer has been armed, is ready to take control of the vehicle, or is currently controlling the vehicle.
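
As an illustration of the handoff logic the patent describes, here is a minimal state-machine sketch; the state names, pressure threshold, and status colors are hypothetical, not values taken from the patent.

```python
# Minimal sketch of the pressure-based handoff logic described above.
# All thresholds, states, and colors are hypothetical illustrations,
# not values from the patent.

ARMED, READY, ENGAGED = "armed", "ready", "engaged"

STATUS_COLORS = {ARMED: "amber", READY: "green", ENGAGED: "blue"}

class HandoffController:
    # A firm grip on the wheel signals intent to retake manual control;
    # a light touch lets the control computer keep or take control.
    TAKEOVER_PRESSURE = 20.0   # hypothetical units

    def __init__(self):
        self.state = ARMED

    def on_safety_check(self, route_is_safe: bool):
        if self.state == ARMED and route_is_safe:
            self.state = READY

    def on_wheel_pressure(self, pressure: float):
        if self.state == READY and pressure < self.TAKEOVER_PRESSURE:
            self.state = ENGAGED            # passenger relinquishes control
        elif self.state == ENGAGED and pressure >= self.TAKEOVER_PRESSURE:
            self.state = READY              # firm grip returns manual control

    def status_color(self) -> str:
        # Illumination color conveys the controller's current status.
        return STATUS_COLORS[self.state]
```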

346 citations


Proceedings ArticleDOI
23 Jun 2013
TL;DR: Two new real-time techniques are introduced that enable camera-laser calibration online, automatically, and in arbitrary environments: a probabilistic monitoring algorithm that can detect a sudden miscalibration in a fraction of a second, and a continuous calibration optimizer that adjusts transform offsets in real time, tracking gradual sensor drift as it occurs.
Abstract: The combined use of 3D scanning lasers with 2D cameras has become increasingly popular in mobile robotics, as the sparse depth measurements of the former augment the dense color information of the latter. Sensor fusion requires precise 6DOF transforms between the sensors, but hand-measuring these values is tedious and inaccurate. In addition, autonomous robots can be rendered inoperable if their sensors’ calibrations change over time. Yet previously published camera-laser calibration algorithms are offline only, requiring significant amounts of data and/or specific calibration targets; they are thus unable to correct calibration errors that occur during live operation. In this paper, we introduce two new real-time techniques that enable camera-laser calibration online, automatically, and in arbitrary environments. The first is a probabilistic monitoring algorithm that can detect a sudden miscalibration in a fraction of a second. The second is a continuous calibration optimizer that adjusts transform offsets in real time, tracking gradual sensor drift as it occurs. Although the calibration objective function is not globally convex and cannot be optimized in real time, in practice it is always locally convex around the global optimum, and almost everywhere else. Thus, the local shape of the objective function at the current parameters can be used to determine whether the sensors are calibrated, and allows the parameters to be adjusted gradually so as to maintain the global optimum. In several online experiments on thousands of frames in real markerless scenes, our method automatically detects miscalibrations within one second of the error exceeding 0.25 deg or 10 cm, with an accuracy of 100%. In addition, rotational sensor drift can be tracked in real time with a mean error of just 0.10 deg. Together, these techniques allow significantly greater flexibility and adaptability of robots in unknown and potentially harsh environments.
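
The local-convexity idea lends itself to a compact sketch. Assuming a user-supplied objective(transform) that scores camera-laser alignment for the current frames (the paper correlates laser depth discontinuities with image edges), one can test whether the current 6DOF parameters sit at a local optimum and, if not, hill-climb toward it; the perturbation step sizes below are illustrative, not the paper's values.

```python
# Sketch of the local-convexity test described above. `transform` is a
# 6-vector (3 rotations, 3 translations); `objective` is assumed to
# score camera-laser alignment for the current sensor data.
import itertools
import numpy as np

def perturbations(transform, rot_step=0.25, trans_step=0.05):
    """Yield transforms with each of the 6 DOF nudged up or down."""
    for axis, sign in itertools.product(range(6), (-1.0, 1.0)):
        delta = np.zeros(6)
        delta[axis] = sign * (rot_step if axis < 3 else trans_step)
        yield transform + delta

def monitor_and_track(transform, objective):
    """Return (is_calibrated, updated_transform) for the current frame."""
    current = objective(transform)
    scores = [(objective(t), t) for t in perturbations(transform)]
    better = [s for s in scores if s[0] > current]
    # If the current parameters sit at a local optimum, we judge the
    # sensors calibrated; otherwise take a small hill-climbing step
    # toward the best neighboring perturbation (gradual drift tracking).
    if not better:
        return True, transform
    _, best = max(better, key=lambda s: s[0])
    return False, best
```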

286 citations


Proceedings ArticleDOI
06 May 2013
TL;DR: A novel method to combine laser and camera data to achieve accurate velocity estimates of moving vehicles and a color-augmented search algorithm is used to align the dense color point clouds from successive time frames for a moving vehicle, thereby obtaining a precise estimate of the tracked vehicle's velocity.
Abstract: Precision tracking is important for predicting the behavior of other cars in autonomous driving. We present a novel method to combine laser and camera data to achieve accurate velocity estimates of moving vehicles. We combine sparse laser points with a high-resolution camera image to obtain a dense colored point cloud. We use a color-augmented search algorithm to align the dense color point clouds from successive time frames for a moving vehicle, thereby obtaining a precise estimate of the tracked vehicle's velocity. Using this alignment method, we obtain velocity estimates at a much higher accuracy than previous methods. Through pre-filtering, we are able to achieve near real-time results. We also present an online method for real-time use with accuracies close to that of the full method. We present a novel approach to quantitatively evaluate our velocity estimates by tracking a parked car in a local reference frame in which it appears to be moving relative to the ego vehicle. We use this evaluation method to automatically and quantitatively evaluate our tracking performance on 466 separate tracked vehicles. Our method obtains a mean absolute velocity error of 0.27 m/s and an RMS error of 0.47 m/s on this test set. We can also qualitatively evaluate our method by building color 3D car models from moving vehicles. We have thus demonstrated that our method can be used for precision car tracking with applications to autonomous driving and behavior modeling.
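
A rough sketch of the color-augmented alignment search might look as follows; the grid-search ranges, the color weight, and the nearest-neighbor scoring are assumptions of this illustration, and the actual system uses a more sophisticated search and pre-filtering.

```python
# Sketch of a color-augmented alignment search in the spirit of the
# method above: score candidate motions of a tracked vehicle's colored
# point cloud between frames; velocity = best offset / dt.
import numpy as np
from scipy.spatial import cKDTree

def alignment_score(prev_pts, prev_rgb, curr_pts, curr_rgb, offset,
                    color_weight=0.1):
    """Higher is better: penalize spatial and color disagreement."""
    tree = cKDTree(curr_pts)
    dists, idx = tree.query(prev_pts + offset)   # nearest neighbors
    color_err = np.linalg.norm(prev_rgb - curr_rgb[idx], axis=1)
    return -(dists + color_weight * color_err).mean()

def estimate_velocity(prev_pts, prev_rgb, curr_pts, curr_rgb, dt,
                      search=1.0, step=0.1):
    """Grid search over planar offsets (an assumption of this sketch)."""
    best, best_offset = -np.inf, np.zeros(3)
    for dx in np.arange(-search, search + step, step):
        for dy in np.arange(-search, search + step, step):
            offset = np.array([dx, dy, 0.0])
            s = alignment_score(prev_pts, prev_rgb,
                                curr_pts, curr_rgb, offset)
            if s > best:
                best, best_offset = s, offset
    return best_offset / dt
```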

95 citations


Proceedings ArticleDOI
23 Jun 2013
TL;DR: A new, generic approach to the calibration of depth sensor intrinsics is presented that requires only the ability to run SLAM; this is advantageous both for individuals who wish to calibrate their own sensors and for a robot that needs to calibrate automatically while in the field.
Abstract: We present a new, generic approach to the calibration of depth sensor intrinsics that requires only the ability to run SLAM. In particular, no specialized hardware, calibration target, or hand measurement is required. Essential to this approach is the idea that certain intrinsic parameters, identified here as myopic, govern distortions that increase with range. We demonstrate these ideas on the calibration of the popular Kinect and Xtion Pro Live RGBD sensors, which typically exhibit significant depth distortion at ranges greater than three meters. Making use of the myopic property, we show how to efficiently learn a discrete grid of 32,000 depth multipliers that resolve this distortion. Compared to the most similar unsupervised calibration work in the literature, this is a 100-fold increase in the maximum number of calibration parameters previously learned. Compared to the supervised calibration approach, the work of this paper means the difference between A) printing a poster of a checkerboard, mounting it to a rigid plane, and recording data of it from many different angles and ranges, a process that often requires two people or repeated use of a special easel, versus B) recording a few minutes of data from unmodified, natural environments. This is advantageous both for individuals who wish to calibrate their own sensors and for a robot that needs to calibrate automatically while in the field.
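
The myopic-calibration idea can be sketched as accumulating expected-over-measured depth ratios in a coarse grid, assuming a SLAM pass provides an expected depth image (the map reprojected into the sensor) for each measured frame; the tiny grid here stands in for the paper's 32,000-cell version.

```python
# Sketch of learning a discrete grid of depth multipliers. `measured`
# and `expected` are same-size depth images; zeros mark invalid pixels.
# The 8x6x10 grid and 10 m maximum range are illustrative choices.
import numpy as np

U_BINS, V_BINS, R_BINS, MAX_RANGE = 8, 6, 10, 10.0

ratio_sum = np.zeros((U_BINS, V_BINS, R_BINS))
count = np.zeros_like(ratio_sum)

def accumulate(measured, expected):
    """Bin multiplier samples by image location and measured range."""
    h, w = measured.shape
    for v in range(h):
        for u in range(w):
            m, e = measured[v, u], expected[v, u]
            if m <= 0 or e <= 0 or m >= MAX_RANGE:
                continue
            ub = u * U_BINS // w
            vb = v * V_BINS // h
            rb = int(m / MAX_RANGE * R_BINS)
            ratio_sum[ub, vb, rb] += e / m     # multiplier sample
            count[ub, vb, rb] += 1

def multipliers():
    # Cells with no evidence fall back to the identity multiplier.
    return np.where(count > 0, ratio_sum / np.maximum(count, 1), 1.0)
```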

92 citations


Journal ArticleDOI
TL;DR: It is shown, surprisingly, that 3D scans of reasonable quality can be obtained even with a sensor of such low data quality, which could make 3D scanning technology more accessible to everyday users.
Abstract: We describe a method for 3D object scanning by aligning depth scans that were taken from around an object with a Time-of-Flight (ToF) camera. These ToF cameras can measure depth scans at video rate. Due to comparably simple technology, they bear potential for economical production in big volumes. Our easy-to-use, cost-effective scanning solution, which is based on such a sensor, could make 3D scanning technology more accessible to everyday users. The algorithmic challenge we face is that the sensor's level of random noise is substantial and there is a nontrivial systematic bias. In this paper, we show the surprising result that 3D scans of reasonable quality can also be obtained with a sensor of such low data quality. Established filtering and scan alignment techniques from the literature fail to achieve this goal. In contrast, our algorithm is based on a new combination of a 3D superresolution method with a probabilistic scan alignment approach that explicitly takes into account the sensor's noise characteristics.
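
A per-pixel median over many registered depth frames gives the flavor of the superresolution step, though it is only a stand-in for the paper's full 3D superresolution and probabilistic alignment; the validity threshold below is an arbitrary choice.

```python
# Sketch: fuse many noisy ToF depth frames of a (near-)static pose to
# suppress per-pixel random noise before scan alignment. A plain
# per-pixel median stands in for the paper's superresolution method.
import numpy as np

def fuse_depth_frames(frames, min_valid=5):
    """frames: list of HxW depth images (0 marks invalid pixels)."""
    stack = np.stack(frames).astype(float)
    stack[stack <= 0] = np.nan               # ignore invalid pixels
    fused = np.nanmedian(stack, axis=0)
    valid = np.sum(~np.isnan(stack), axis=0) >= min_valid
    fused[~valid] = 0.0                      # too few samples: drop pixel
    return fused
```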

85 citations


Posted Content
TL;DR: The test is suitable for Bayesian network learning algorithms that use independence tests to infer the network structure, in domains that contain any mix of continuous, ordinal discrete, and categorical variables.
Abstract: In this paper we present a method of computing the posterior probability of conditional independence of two or more continuous variables from data, examined at several resolutions. Our approach is motivated by the observation that the appearance of continuous data varies widely at various resolutions, producing very different independence estimates between the variables involved. Therefore, it is difficult to ascertain independence without examining data at several carefully selected resolutions. In our paper, we accomplish this using the exact computation of the posterior probability of independence, calculated analytically given a resolution. At each examined resolution, we assume a multinomial distribution with Dirichlet priors for the discretized table parameters, and compute the posterior using Bayesian integration. Across resolutions, we use a search procedure to approximate the Bayesian integral of probability over an exponential number of possible histograms. Our method generalizes to an arbitrary number of variables in a straightforward manner. The test is suitable for Bayesian network learning algorithms that use independence tests to infer the network structure, in domains that contain any mix of continuous, ordinal and categorical variables.
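
For a single resolution, the multinomial-Dirichlet computation can be sketched as a comparison of marginal likelihoods: an independent model scores the two marginal count vectors separately, while a joint model scores the full contingency table. Uniform Dirichlet priors and equal model priors are assumptions of this illustration; the paper's search over resolutions is omitted.

```python
# Sketch of a per-resolution Bayesian independence test using
# Dirichlet-multinomial marginal likelihoods.
import numpy as np
from scipy.special import gammaln

def log_marginal(counts, alpha=1.0):
    """log P(counts) under a multinomial with Dirichlet(alpha) prior."""
    counts = np.asarray(counts, dtype=float).ravel()
    a = np.full_like(counts, alpha)
    return (gammaln(a.sum()) - gammaln(a.sum() + counts.sum())
            + np.sum(gammaln(a + counts) - gammaln(a)))

def posterior_independence(table):
    """table: 2D contingency table of x vs y at one resolution."""
    table = np.asarray(table, dtype=float)
    # Independent model: the two marginal tables are scored separately.
    log_indep = (log_marginal(table.sum(axis=1))
                 + log_marginal(table.sum(axis=0)))
    # Joint model: all cells of the table are scored together.
    log_joint = log_marginal(table)
    # Equal prior odds: P(independence | data) from the log Bayes factor.
    return 1.0 / (1.0 + np.exp(log_joint - log_indep))
```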

40 citations


Book ChapterDOI
TL;DR: It is shown that it is possible to achieve an order of magnitude speedup and thus real-time performance on a laptop computer by applying simple algorithmic optimizations to the original work, which makes this approach applicable to a broader range of tasks.
Abstract: We consider the problem of segmenting and tracking deformable objects in color video with depth (RGBD) data available from commodity sensors such as the Asus Xtion Pro Live or Microsoft Kinect. We frame this problem with very few assumptions (no prior object model, no stationary sensor, and no prior 3-D map), thus making a solution potentially useful for a large number of applications, including semi-supervised learning, 3-D model capture, and object recognition. Our approach makes use of a rich feature set, including local image appearance, depth discontinuities, optical flow, and surface normals to inform the segmentation decision in a conditional random field model. In contrast to previous work in this field, the proposed method learns how to best make use of these features from ground-truth segmented sequences. We provide qualitative and quantitative analyses which demonstrate substantial improvement over the state of the art. This paper is an extended version of our previous work; building on it, we show that it is possible to achieve an order of magnitude speedup and thus real-time performance (~20 FPS) on a laptop computer by applying simple algorithmic optimizations to the original work. This speedup comes at only a minor cost in overall accuracy and thus makes this approach applicable to a broader range of tasks. We demonstrate one such task: real-time, online, interactive segmentation to efficiently collect training data for an off-the-shelf object detector.
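
To make the energy terms concrete, here is a toy sketch of node and pairwise potentials over the cues the abstract lists; the cue names, weights, and the depth-gap smoothness rule are illustrative stand-ins for the learned model in the paper.

```python
# Toy sketch of CRF potentials for foreground/background segmentation.
# Lower energy is better; labels are 1 (object) or 0 (background).
import numpy as np

def node_energy(features, weights, label):
    """features: dict cue -> foreground score in [0, 1], e.g.
    {'appearance': 0.8, 'flow': 0.6, 'normals': 0.7}."""
    score = sum(weights[cue] * f for cue, f in features.items())
    return -score if label == 1 else score

def pairwise_energy(label_a, label_b, depth_gap, w_smooth=0.5):
    # Discourage label changes between neighbors, except across
    # depth discontinuities, where changes are cheap.
    if label_a == label_b:
        return 0.0
    return w_smooth * np.exp(-depth_gap)
```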

37 citations


Proceedings ArticleDOI
01 Nov 2013
TL;DR: This work proposes an entirely unsupervised procedure for calibrating the relative pose and time offsets of a pair of depth sensors, using the unstructured motion of objects in the scene to find potential correspondences between the sensor pair.
Abstract: While inexpensive depth sensors are becoming increasingly ubiquitous, field of view and self-occlusion constraints limit the information a single sensor can provide. For many applications one may instead require a network of depth sensors, registered to a common world frame and synchronized in time. Historically such a setup has required a tedious manual calibration procedure, making it infeasible to deploy these networks in the wild, where spatial and temporal drift are common. In this work, we propose an entirely unsupervised procedure for calibrating the relative pose and time offsets of a pair of depth sensors. In doing so, we make no use of an explicit calibration target or any intentional activity on the part of a user. Rather, we use the unstructured motion of objects in the scene to find potential correspondences between the sensor pair. This yields a rough transform which is then refined with an occlusion-aware energy minimization. We compare our results against the standard checkerboard technique, and provide qualitative examples for scenes in which such a technique would be impossible.
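
The rough-transform stage can be sketched with tracked object centroids as correspondences: scan over candidate time offsets and fit a closed-form rigid transform (Kabsch) at each. This is an assumption-laden stand-in for the paper's pipeline, and the occlusion-aware refinement is omitted.

```python
# Sketch: estimate relative pose and time offset of two depth sensors
# from per-frame centroids of a moving object seen by both.
import numpy as np

def kabsch(A, B):
    """Rigid R, t minimizing ||R @ a + t - b|| over Nx3 arrays A, B."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cb - R @ ca

def calibrate_pair(traj_a, traj_b, max_offset=30):
    """traj_*: Nx3 centroid tracks sampled at the same rate.
    Returns (R, t, frame_offset) with the lowest mean residual."""
    best = (None, None, 0, np.inf)
    for k in range(-max_offset, max_offset + 1):
        n = min(len(traj_a), len(traj_b)) - abs(k)
        if n < 3:
            continue
        A = traj_a[max(k, 0):max(k, 0) + n]
        B = traj_b[max(-k, 0):max(-k, 0) + n]
        R, t = kabsch(A, B)
        err = np.linalg.norm((R @ A.T).T + t - B, axis=1).mean()
        if err < best[3]:
            best = (R, t, k, err)
    return best[:3]
```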

26 citations


Proceedings Article
01 Nov 2013
TL;DR: Group induction is presented: a new mathematical framework that rigorously encodes the intuition of [14] as an alternating optimization problem similar to expectation maximization (EM), but with the assumption that the unlabeled data comes in groups of instances that share the same hidden label.
Abstract: Machine perception often requires a large amount of user-annotated data which is time-consuming, difficult, or expensive to collect. Perception systems should be easy to train by regular users, and this is currently far from the case. Our previous work, tracking-based semi-supervised learning [14], helped reduce the labeling burden by using tracking information to harvest new and useful training examples. However, [14] was designed for offline use; it assumed a fixed amount of unlabeled data and did not allow for corrections from users. In many practical robot perception scenarios we A) desire continuous learning over a long period of time, B) have a stream of unlabeled sensor data available rather than a fixed dataset, and C) are willing to periodically provide a small number of new training examples. In light of this, we present group induction, a new mathematical framework that rigorously encodes the intuition of [14] in an alternating optimization problem similar to expectation maximization (EM), but with the assumption that the unlabeled data comes in groups of instances that share the same hidden label. The mathematics suggest several improvements to the original heuristic algorithm, and make clear how to handle user interaction and streams of unlabeled data. We evaluate group induction on a track classification task from natural street scenes, demonstrating its ability to learn continuously, adapt to user feedback, and accurately recognize objects of interest.
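
A minimal sketch of the alternating optimization, assuming a scikit-learn-style classifier: confidently classified groups are inducted with a single shared label and the classifier is retrained. The pooling rule, confidence threshold, and round count are illustrative choices, not the paper's.

```python
# Sketch of group induction: each unlabeled group (e.g., a track)
# shares one hidden label; confident groups join the training set.
import numpy as np

def group_induction(clf, X_lab, y_lab, groups, threshold=0.9, rounds=5):
    """clf: classifier with fit/predict_proba; groups: list of
    (n_i x d) arrays of instances sharing an unknown label."""
    X_tr, y_tr = list(X_lab), list(y_lab)
    pending = list(groups)
    for _ in range(rounds):
        clf.fit(np.array(X_tr), np.array(y_tr))
        remaining = []
        for g in pending:
            proba = clf.predict_proba(g).mean(axis=0)   # pool over group
            label = int(np.argmax(proba))
            if proba[label] >= threshold:   # induct the whole group
                X_tr.extend(g)
                y_tr.extend([label] * len(g))
            else:
                remaining.append(g)
        pending = remaining
    return clf
```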

9 citations


Posted Content
TL;DR: This presentation will introduce the audience to a new, emerging body of research on sequential Monte Carlo techniques in robotics, and discuss specific tricks necessary to make these techniques work in real-world domains.
Abstract: This presentation will introduce the audience to a new, emerging body of research on sequential Monte Carlo techniques in robotics. In recent years, particle filters have solved several hard perceptual robotic problems. Early successes were limited to low-dimensional problems, such as the problem of robot localization in environments with known maps. More recently, researchers have begun exploiting structural properties of robotic domains that have led to successful particle filter applications in spaces with as many as 100,000 dimensions. The presentation will discuss specific tricks necessary to make these techniques work in real-world domains, as well as open challenges for researchers in the UAI community.
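
A minimal particle filter for 1D localization illustrates the predict/weight/resample cycle the abstract refers to; the Gaussian noise models and the vectorized measure_fn signature are assumptions of this sketch.

```python
# Minimal particle-filter step for 1D robot localization with a known
# map. `measure_fn` maps an array of particle positions to expected
# sensor readings (an assumption of this sketch).
import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, control, measurement, measure_fn,
            motion_noise=0.1, meas_noise=0.5):
    # Predict: propagate each particle through the noisy motion model.
    particles = particles + control + rng.normal(
        0, motion_noise, len(particles))
    # Weight: likelihood of the observation under each particle.
    expected = measure_fn(particles)
    w = np.exp(-0.5 * ((measurement - expected) / meas_noise) ** 2)
    w = (w + 1e-12) / (w + 1e-12).sum()      # guard against underflow
    # Resample: draw particles in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]
```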

3 citations