
Showing papers by "Sebastian Thrun" published in 2011


Proceedings ArticleDOI
05 Jun 2011
TL;DR: In order to achieve autonomous operation of a vehicle in urban situations with unpredictable traffic, several realtime systems must interoperate, including environment perception, localization, planning, and control.
Abstract: In order to achieve autonomous operation of a vehicle in urban situations with unpredictable traffic, several realtime systems must interoperate, including environment perception, localization, planning, and control. In addition, a robust vehicle platform with appropriate sensors, computational hardware, networking, and software infrastructure is essential.

1,199 citations


Proceedings ArticleDOI
09 May 2011
TL;DR: This work presents a GraphSLAM-like algorithm for signal strength SLAM, which shares many of the benefits of Gaussian processes, yet is viable for a broader range of environments since it makes no signature uniqueness assumptions.
Abstract: The widespread deployment of wireless networks presents an opportunity for localization and mapping using only signal-strength measurements. The current state of the art is to use Gaussian process latent variable models (GP-LVM). This method works well, but relies on a signature uniqueness assumption which limits its applicability to only signal-rich environments. Moreover, it does not scale computationally to large sets of data, requiring O(N^3) operations per iteration. We present a GraphSLAM-like algorithm for signal strength SLAM. Our algorithm shares many of the benefits of Gaussian processes, yet is viable for a broader range of environments since it makes no signature uniqueness assumptions. It is also more tractable to larger map sizes, requiring O(N^2) operations per iteration. We compare our algorithm to a laser-SLAM ground truth, showing it produces excellent results in practice.

238 citations


Proceedings ArticleDOI
09 May 2011
TL;DR: This paper presents a new track classification method, based on a mathematically principled method of combining log odds estimators, that is fast enough for real time use, is non-specific to object class, and performs well on the task of classifying correctly-tracked, well-segmented objects into car, pedestrian, bicyclist, and background classes.
Abstract: Object recognition is a critical next step for autonomous robots, but a solution to the problem has remained elusive. Prior 3D-sensor-based work largely classifies individual point cloud segments or uses class-specific trackers. In this paper, we take the approach of classifying the tracks of all visible objects. Our new track classification method, based on a mathematically principled method of combining log odds estimators, is fast enough for real time use, is non-specific to object class, and performs well (98.5% accuracy) on the task of classifying correctly-tracked, well-segmented objects into car, pedestrian, bicyclist, and background classes. We evaluate the classifier's performance using the Stanford Track Collection, a new dataset of about 1.3 million labeled point clouds in about 14,000 tracks recorded from an autonomous vehicle research platform. This dataset, which we make publicly available, contains tracks extracted from about one hour of 360-degree, 10Hz depth information recorded both while driving on busy campus streets and parked at busy intersections.

221 citations
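
The abstract above mentions combining log odds estimators across a track but does not spell out the estimator. As a rough illustration only, the following sketch fuses hypothetical per-frame class probabilities under a naive conditional-independence assumption; the function names, the `prior` parameter, and the independence assumption are ours, not the paper's.

```python
import math

def log_odds(p):
    """Convert a probability into log odds."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Convert log odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def classify_track(frame_probs, prior=0.5):
    """Fuse per-frame probabilities for one class into a track-level probability.

    Assumption (ours): frames are conditionally independent given the class,
    so each frame contributes its log odds relative to the prior log odds.
    """
    l0 = log_odds(prior)
    total = l0 + sum(log_odds(p) - l0 for p in frame_probs)
    return sigmoid(total)

# Three weakly confident frames add up to a more confident track-level estimate.
track_prob = classify_track([0.6, 0.6, 0.6])
```

Working in log odds turns the product of per-frame likelihood ratios into a sum, which is cheap enough to update every time a new frame of the track arrives.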


Proceedings ArticleDOI
09 May 2011
TL;DR: This work introduces a convenient technique for mapping traffic light locations from recorded video data using tracking, back-projection, and triangulation, and is the first to account for multiple lights per intersection, which yields superior results by probabilistically combining evidence from all available lights.
Abstract: Detection of traffic light state is essential for autonomous driving in cities. Currently, the only reliable systems for determining traffic light state information are non-passive proofs of concept, requiring explicit communication between a traffic signal and vehicle. Here, we present a passive camera-based pipeline for traffic light state detection, using (imperfect) vehicle localization and assuming prior knowledge of traffic light location. First, we introduce a convenient technique for mapping traffic light locations from recorded video data using tracking, back-projection, and triangulation. In order to achieve robust real-time detection results in a variety of lighting conditions, we combine several probabilistic stages that explicitly account for the corresponding sources of sensor and data uncertainty. In addition, our approach is the first to account for multiple lights per intersection, which yields superior results by probabilistically combining evidence from all available lights. To evaluate the performance of our method, we present several results across a variety of lighting conditions in a real-world environment. The techniques described here have for the first time enabled our autonomous research vehicle to successfully navigate through traffic-light-controlled intersections in real traffic.

177 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a version of Markov localization which provides accurate position estimates and which is tailored towards dynamic environments, where a probability density over the space of all locations of a robot in its environment is maintained.
Abstract: Localization, that is, the estimation of a robot's location from sensor data, is a fundamental problem in mobile robotics. This paper presents a version of Markov localization which provides accurate position estimates and which is tailored towards dynamic environments. The key idea of Markov localization is to maintain a probability density over the space of all locations of a robot in its environment. Our approach represents this space metrically, using a fine-grained grid to approximate densities. It is able to globally localize the robot from scratch and to recover from localization failures. It is robust to approximate models of the environment (such as occupancy grid maps) and noisy sensors (such as ultrasound sensors). Our approach also includes a filtering technique which allows a mobile robot to reliably estimate its position even in densely populated environments in which crowds of people block the robot's sensors for extended periods of time. The method described here has been implemented and tested in several real-world applications of mobile robots, including the deployments of two mobile robots as interactive museum tour-guides.

128 citations
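
The key idea described above, a probability density over all robot locations approximated on a grid, can be illustrated with a toy one-dimensional example. This is a minimal sketch under our own assumptions: the cyclic five-cell world, the landmark labels, and all noise parameters are hypothetical, and the paper's filtering technique for crowded environments is not reproduced.

```python
def sense(belief, world, measurement, p_hit=0.9, p_miss=0.1):
    """Measurement update: weight each cell by how well it explains the reading."""
    weighted = [b * (p_hit if cell == measurement else p_miss)
                for b, cell in zip(belief, world)]
    total = sum(weighted)
    return [w / total for w in weighted]

def move(belief, step, p_exact=0.8, p_under=0.1, p_over=0.1):
    """Motion update on a cyclic grid: the robot may undershoot or overshoot by one cell."""
    n = len(belief)
    return [p_exact * belief[(i - step) % n]
            + p_under * belief[(i - step + 1) % n]
            + p_over * belief[(i - step - 1) % n]
            for i in range(n)]

world = ["door", "wall", "wall", "door", "wall"]   # hypothetical map
belief = [1.0 / len(world)] * len(world)            # uniform prior: no idea where we are
belief = sense(belief, world, "door")
belief = move(belief, 1)
belief = sense(belief, world, "wall")
```

Starting from a uniform belief corresponds to global localization from scratch: the two cells that lie one step past a door end up with most of the probability mass, and further measurements would be needed to disambiguate them.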


Proceedings ArticleDOI
01 Oct 2011
TL;DR: This paper is meant as an overview of the recent object recognition work done on Stanford's autonomous vehicle and the primary challenges along this particular path.
Abstract: This paper is meant as an overview of the recent object recognition work done on Stanford’s autonomous vehicle and the primary challenges along this particular path. The eventual goal is to provide practical object recognition systems that will enable new robotic applications such as autonomous taxis that recognize hailing pedestrians, personal robots that can learn about specific objects in your home, and automated farming equipment that is trained on-site to recognize the plants and materials that it must interact with. Recent work has made some progress towards object recognition that could fulfill these goals, but advances in model-free segmentation and tracking algorithms are required for applicability beyond scenarios like driving in which model-free segmentation is often available. Additionally, online learning may be required to make use of the large amounts of labeled data made available by tracking-based semi-supervised learning.

89 citations


Journal ArticleDOI
TL;DR: It is shown that image mosaicing can be a powerful tool for widening the field of view and creating image maps of microanatomical structures and a global alignment algorithm that draws upon techniques commonly used in probabilistic robotics is presented.
Abstract: Recent advances in optical imaging have led to the development of miniature microscopes that can be brought to the patient for visualizing tissue structures in vivo. These devices have the potential to revolutionize health care by replacing tissue biopsy with in vivo pathology. One of the primary limitations of these microscopes, however, is that the constrained field of view can make image interpretation and navigation difficult. In this paper, we show that image mosaicing can be a powerful tool for widening the field of view and creating image maps of microanatomical structures. First, we present an efficient algorithm for pairwise image mosaicing that can be implemented in real time. Then, we address two of the main challenges associated with image mosaicing in medical applications: cumulative image registration errors and scene deformation. To deal with cumulative errors, we present a global alignment algorithm that draws upon techniques commonly used in probabilistic robotics. To accommodate scene deformation, we present a local alignment algorithm that incorporates deformable surface models into the mosaicing framework. These algorithms are demonstrated on image sequences acquired in vivo with various imaging devices including a hand-held dual-axes confocal microscope, a miniature two-photon microscope, and a commercially available confocal microendoscope.

83 citations


Patent
11 Jan 2011
TL;DR: In this article, the authors present an approach to arrange for free or discounted transportation to an advertiser's business location by automatically comparing the cost of transportation and the potential profit from a completed transaction using a number of real-time calculations.
Abstract: The present invention relates generally to arranging for free or discounted transportation to an advertiser's business location. More specifically, the invention involves automatically comparing the cost of transportation and the potential profit from a completed transaction using a number of real-time calculations. For example, the calculation may consider various factors including a consumer's current location, the consumer's most likely route and form of transportation (such as train, personal car, taxi, rental car, or shared vehicle), the consumer's daily agenda, the price competing advertisers are willing to pay for the customer to be delivered to alternate locations, and other costs. In this regard, the customer's obstacles to entering a business location are reduced while routing and cost calculations are automatically handled based on the demand for the advertiser's goods and potential profit margins.

49 citations


Proceedings ArticleDOI
27 Jun 2011
TL;DR: In this paper, a semi-supervised approach is proposed for track classification in dense 3D range data, which is based on the expectation-maximization algorithm, iteratively training a classifier and extracting useful training examples from unlabeled data by exploiting tracking information.
Abstract: We consider a semi-supervised approach to the problem of track classification in dense three-dimensional range data. This problem involves the classification of objects that have been segmented and tracked without the use of a class-specific tracker. This paper is an extended version of our previous work. We propose a method based on the expectation-maximization algorithm: iteratively (1) train a classifier, and (2) extract useful training examples from unlabeled data by exploiting tracking information. We evaluate our method on a large multiclass problem in dense range data collected from natural street scenes. When given only three hand-labeled training tracks of each object class, the final accuracy of the semi-supervised algorithm is comparable to that of the fully supervised equivalent which uses two orders of magnitude more. Further, we show experimentally that the accuracy of a classifier considered as a function of human labeling effort can be substantially improved using this method. Finally, we show that a simple algorithmic speedup based on incrementally updating a boosting classifier can reduce learning time by a factor of three.

41 citations


Patent
22 Nov 2011
TL;DR: In this article, a head-mounted display (HMD) system includes at least one processor and data storage comprising user-interface logic executable by the at least one processor to receive data corresponding to a first position of an HMD and responsively cause the HMD to display a user interface comprising a view region, a content region, and a history region located below the view region.
Abstract: Methods and devices for providing a user-interface are disclosed. In one aspect, a head-mounted-device system includes at least one processor and data storage comprising user-interface logic executable by the at least one processor to receive data corresponding to a first position of a head-mounted display (HMD) and responsively cause the HMD to display a user-interface comprising a view region, at least one content region located above the view region, and a history region located below the view region. The user-interface logic is further executable to receive data corresponding to a left or right movement of the HMD and responsively cause the HMD to move the field of view such that the at least one content region becomes more visible, for example, scrolling an item in a user interface. The scrolling may have a non-linear relationship with the head movement speed.

37 citations


Patent
08 Dec 2011
TL;DR: In this paper, chord-based authentication on a touch-based interface is presented, where the user interface comprises a plurality of input regions and each chord is defined by touch interaction with a certain combination of one or more of the input regions.
Abstract: Exemplary methods and systems involve chord-based authentication on a touch-based interface. An exemplary method may involve: (a) providing a user-interface on a touch-based interface of a computing device, wherein the user-interface comprises a plurality of input regions; (b) receiving input data corresponding to a plurality of touch interactions on the touch-based interface; (c) determining a sequence of chords from the input data, wherein each chord is defined by touch interaction with a certain combination of one or more of the input regions; (d) determining that the sequence of chords substantially matches a predetermined chord authentication sequence; and (e) responsive to the match, causing a computing device to make at least one function accessible.
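
Steps (b) through (d) above can be sketched in a few lines. The following is an illustration under our own assumptions rather than the patented method: touch events are hypothetical `(timestamp, region)` pairs, touches landing within a hypothetical `window` of a chord's first touch are grouped into that chord, and "substantially matches" is simplified to an exact match.

```python
def chords_from_events(events, window=0.05):
    """Group (timestamp, region) touch events into a sequence of chords.

    Assumption (ours): touches that begin within `window` seconds of the
    first touch of a chord belong to that chord.
    """
    chords = []
    current, start = set(), None
    for t, region in sorted(events):
        if start is not None and t - start > window:
            chords.append(frozenset(current))   # close the previous chord
            current, start = set(), None
        if start is None:
            start = t
        current.add(region)
    if current:
        chords.append(frozenset(current))
    return chords

def authenticate(events, secret):
    """Return True when the observed chord sequence equals the stored one."""
    return chords_from_events(events) == [frozenset(c) for c in secret]

# Hypothetical input: regions A and C pressed together, then B, then A and B together.
events = [(0.00, "A"), (0.01, "C"), (0.50, "B"), (1.00, "A"), (1.02, "B")]
secret = [{"A", "C"}, {"B"}, {"A", "B"}]
unlocked = authenticate(events, secret)
```

Representing each chord as a set makes the comparison independent of which finger lands first within a chord, while the list preserves the order of chords in the sequence.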

Patent
30 Nov 2011
TL;DR: In this article, the authors describe a system that allows a head-mounted display (HMD) to provide a graphical interface, the graphical interface comprising a view port having a view-port orientation and at least one navigable area having at least one border, the at least one border having a first border orientation.
Abstract: Methods and systems involving navigation of a graphical interface are disclosed herein. An example system may be configured to: (a) cause a head-mounted display (HMD) to provide a graphical interface, the graphical interface comprising (i) a view port having a view-port orientation and (ii) at least one navigable area having at least one border, the at least one border having a first border orientation; (b) receive input data that indicates movement of the view port towards the at least one border; (c) determine that the view-port orientation is within a predetermined threshold distance from the first border orientation; and (d) based on at least the determination that the view-port orientation is within a predetermined threshold distance from the first border orientation, adjust the first border orientation from the first border orientation to a second border orientation.

Patent
08 Jul 2011
TL;DR: In this paper, the authors used a particle filter in conjunction with one or more orientation devices to identify a location of a client device with respect to a map of an indoor space.
Abstract: Aspects of the present disclosure relate generally to indoor localization, for example, where GPS or other localization signals are unavailable. More specifically, aspects relate to using a particle filter in conjunction with one or more orientation devices to identify a location of a client device with respect to a map of an indoor space. This location may then be used to identify the path of the client device through the indoor space. The paths of a plurality of different client devices through the same indoor space may be used to update the map based on common patterns or inconsistencies between the map and the paths of the plurality of client devices.
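
To make the particle-filter idea above concrete, here is a toy one-dimensional sketch under our own assumptions: the indoor map is a hypothetical straight corridor, the orientation device reports a heading of +1 or -1, each detected step has a fixed stride, and particles that would pass through an end wall are discarded. None of these specifics come from the disclosure.

```python
import random

def step(particles, heading, stride, corridor_length, rng, noise=0.2):
    """One particle-filter update for a single detected step.

    heading: +1 or -1, as a stand-in for the orientation device reading.
    Particles leaving the corridor contradict the map and get zero weight.
    """
    moved = [x + heading * stride + rng.gauss(0.0, noise) for x in particles]
    weights = [1.0 if 0.0 <= x <= corridor_length else 0.0 for x in moved]
    if sum(weights) == 0.0:
        return particles   # every hypothesis left the map; keep the old set
    # Resample in proportion to weight so that surviving hypotheses multiply.
    return rng.choices(moved, weights=weights, k=len(moved))

rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]   # unknown start position
for _ in range(8):
    particles = step(particles, heading=+1, stride=1.0,
                     corridor_length=10.0, rng=rng)
estimate = sum(particles) / len(particles)
```

Even without GPS or any absolute position measurement, walking eight steps forward in a ten-meter corridor rules out most starting positions, so the surviving particles cluster near the far end.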

Patent
19 Mar 2011
TL;DR: In this article, a system is presented for localizing human body parts such as hands, arms, shoulders, or even the full body, with a processing device such as a computer along with a computer display, to provide visual feedback on the display that encourages a user to maintain an ergonomically preferred position with ergonomically preferred motions.
Abstract: With the advent of touch-free interfaces such as described in the present disclosure, it is no longer necessary for computer interfaces to be in predefined locations (e.g., desktops) or configurations (e.g., rectangular keyboard). The present invention makes use of touch-free interfaces to encourage users to interface with a computer in an ergonomically sound manner. Among other things, the present invention implements a system for localizing human body parts such as hands, arms, shoulders, or even the full body, with a processing device such as a computer along with a computer display to provide visual feedback on the display that encourages a user to maintain an ergonomically preferred position with ergonomically preferred motions. For example, the present invention encourages a user to maintain his motions within an ergonomically preferred range without having to reach out excessively or repetitively.

Patent
08 Jul 2011
TL;DR: In this article, a particle filter is used in conjunction with one or more orientation devices to identify a location of a client device with respect to a map of an indoor space where GPS or other localization signals are unavailable. This location may then be used to identify the path of the client device through the indoor space.
Abstract: Aspects of the present disclosure relate generally to indoor localization, for example, where GPS or other localization signals are unavailable. More specifically, aspects relate to using a particle filter in conjunction with one or more orientation devices to identify a location of a client device with respect to a map of an indoor space. This location may then be used to identify the path of the client device through the indoor space.

Patent
17 Feb 2011
TL;DR: In this paper, the authors presented a system and computerized method for receiving image information and translating it to computer inputs, where image information is received for a predetermined action space to identify an active body part.
Abstract: The present invention provides a system and computerized method for receiving image information and translating it to computer inputs. In an embodiment of the invention, image information is received for a predetermined action space to identify an active body part. From such image information, depth information is extracted to interpret the actions of the active body part. Predetermined gestures can then be identified to provide input to a computer. For example, gestures can be interpreted to mimic computerized touchscreen operation. Touchpad or mouse operations can also be mimicked.

Journal ArticleDOI
TL;DR: In this article, the authors use Exponential family Principal Components Analysis (E-PCA) to represent sparse, high-dimensional belief spaces using small sets of learned features of the belief state.
Abstract: Standard value function approaches to finding policies for Partially Observable Markov Decision Processes (POMDPs) are generally considered to be intractable for large models. The intractability of these algorithms is to a large extent a consequence of computing an exact, optimal policy over the entire belief space. However, in real-world POMDP problems, computing the optimal policy for the full belief space is often unnecessary for good control even for problems with complicated policy classes. The beliefs experienced by the controller often lie near a structured, low-dimensional subspace embedded in the high-dimensional belief space. Finding a good approximation to the optimal value function for only this subspace can be much easier than computing the full value function. We introduce a new method for solving large-scale POMDPs by reducing the dimensionality of the belief space. We use Exponential family Principal Components Analysis (Collins, Dasgupta and Schapire, 2002) to represent sparse, high-dimensional belief spaces using small sets of learned features of the belief state. We then plan only in terms of the low-dimensional belief features. By planning in this low-dimensional space, we can find policies for POMDP models that are orders of magnitude larger than models that can be handled by conventional techniques. We demonstrate the use of this algorithm on a synthetic problem and on mobile robot navigation tasks.

Journal ArticleDOI
TL;DR: In this article, the authors describe state-of-the-art solutions to challenging tasks from the area of mobile robotics, autonomous cars, and activity recognition, which are all based on the paradigm of probabilistic state estimation.
Abstract: One of the ultimate goals of the field of artificial intelligence and robotics is to develop systems that assist us in our everyday lives by autonomously carrying out a variety of different tasks. To achieve this and to generate appropriate actions, such systems need to be able to accurately interpret their sensory input and estimate their state or the state of the environment to be successful. In recent years, probabilistic approaches have emerged as a key technology for these problems. In this article, we will describe state-of-the-art solutions to challenging tasks from the area of mobile robotics, autonomous cars, and activity recognition, which are all based on the paradigm of probabilistic state estimation.