Home
/
Authors
/
Jeannette Bohg

Author

Jeannette Bohg

Other affiliations: Royal Institute of Technology, Max Planck Society, Carnegie Mellon University

Bio: Jeannette Bohg is an academic researcher from Stanford University. The author has contributed to research in topics: Computer science & GRASP. The author has an hindex of 32, co-authored 132 publications receiving 3727 citations. Previous affiliations of Jeannette Bohg include Royal Institute of Technology & Max Planck Society.

Topics: Computer science, GRASP, Reinforcement learning, Robot, Video tracking ...read more

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Data-Driven Grasp Synthesis—A Survey

[...]

Jeannette Bohg, Antonio Morales, Tamim Asfour¹, Danica Kragic•Institutions (1)

Karlsruhe Institute of Technology¹

01 Apr 2014-IEEE Transactions on Robotics

TL;DR: A review of the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps and an overview of the different methodologies are provided, which draw a parallel to the classical approaches that rely on analytic formulations.

...read moreread less

Abstract: We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps. We divide the approaches into three groups based on whether they synthesize grasps for known, familiar, or unknown objects. This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique. In the case of known objects, we concentrate on the approaches that are based on object recognition and pose estimation. In the case of familiar objects, the techniques use some form of a similarity matching to a set of previously encountered objects. Finally, for the approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps. Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping. We also draw a parallel to the classical approaches that rely on analytic formulations.

...read moreread less

859 citations

Proceedings Article•DOI•

Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks

[...]

Michelle A. Lee¹, Yuke Zhu¹, Krishnan Srinivasan¹, Parth Shah¹, Silvio Savarese¹, Li Fei-Fei¹, Animesh Garg¹, Jeannette Bohg¹ - Show less +4 more•Institutions (1)

Stanford University¹

20 May 2019

TL;DR: This work uses self-supervision to learn a compact and multimodal representation of sensory inputs, which can then be used to improve the sample efficiency of the policy learning of deep reinforcement learning algorithms.

...read moreread less

Abstract: Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. However, it is non-trivial to manually design a robot controller that combines modalities with very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots due to sample complexity. We use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. We evaluate our method on a peg insertion task, generalizing over different geometry, configurations, and clearances, while being robust to external perturbations. We present results in simulation and on a real robot.

...read moreread less

322 citations

Proceedings Article•DOI•

Leveraging big data for grasp planning

[...]

Daniel Kappler¹, Jeannette Bohg¹, Stefan Schaal¹•Institutions (1)

Max Planck Society¹

26 May 2015

TL;DR: A deep learning method is applied and it is shown that it can better leverage the large-scale database for prediction of grasp success compared to logistic regression and suggest that labels based on the physics-metric are less noisy than those from the υ-metrics and therefore lead to a better classification performance.

...read moreread less

Abstract: We propose a new large-scale database containing grasps that are applied to a large set of objects from numerous categories. These grasps are generated in simulation and are annotated with different grasp stability metrics. We use a descriptive and efficient representation of the local object shape at which each grasp is applied. Given this data, we present a two-fold analysis: (i) We use crowdsourcing to analyze the correlation of the metrics with grasp success as predicted by humans. The results show that the metric based on physics simulation is a more consistent predictor for grasp success than the standard υ-metric. The results also support the hypothesis that human labels are not required for good ground truth grasp data. Instead the physics-metric can be used to generate datasets in simulation that may then be used to bootstrap learning in the real world. (ii) We apply a deep learning method and show that it can better leverage the large-scale database for prediction of grasp success compared to logistic regression. Furthermore, the results suggest that labels based on the physics-metric are less noisy than those from the υ-metric and therefore lead to a better classification performance.

...read moreread less

284 citations

Journal Article•DOI•

Interactive Perception: Leveraging Action in Perception and Perception in Action

[...]

Jeannette Bohg¹, Karol Hausman², Bharath Sankaran², Oliver Brock³, Danica Kragic⁴, Stefan Schaal², Gaurav S. Sukhatme² - Show less +3 more•Institutions (4)

Max Planck Society¹, University of Southern California², Technical University of Berlin³, Royal Institute of Technology⁴

11 Aug 2017-IEEE Transactions on Robotics

TL;DR: This survey postulates this as a principle for robot perception and collects evidence in its support by analyzing and categorizing existing work in this area, and provides an overview of the most important applications of IP.

...read moreread less

Abstract: Recent approaches in robot perception follow the insight that perception is facilitated by interaction with the environment. These approaches are subsumed under the term Interactive Perception (IP). This view of perception provides the following benefits. First, interaction with the environment creates a rich sensory signal that would otherwise not be present. Second, knowledge of the regularity in the combined space of sensory data and action parameters facilitates the prediction and interpretation of the sensory signal. In this survey, we postulate this as a principle for robot perception and collect evidence in its support by analyzing and categorizing existing work in this area. We also provide an overview of the most important applications of IP. We close this survey by discussing remaining open questions. With this survey, we hope to help define the field of Interactive Perception and to provide a valuable resource for future research.

...read moreread less

258 citations

Proceedings Article•DOI•

MeteorNet: Deep Learning on Dynamic 3D Point Cloud Sequences

[...]

Xingyu Liu¹, Mengyuan Yan¹, Jeannette Bohg¹•Institutions (1)

Stanford University¹

01 Oct 2019

TL;DR: This work proposes a novel neural network architecture called MeteorNet for learning representations for dynamic 3D point cloud sequences that shows stronger performance than previous grid-based methods while achieving state-of-the-art performance on Synthia.

...read moreread less

Abstract: Understanding dynamic 3D environment is crucial for robotic agents and many other applications. We propose a novel neural network architecture called MeteorNet for learning representations for dynamic 3D point cloud sequences. Different from previous work that adopts a grid-based representation and applies 3D or 4D convolutions, our network directly processes point clouds. We propose two ways to construct spatiotemporal neighborhoods for each point in the point cloud sequence. Information from these neighborhoods is aggregated to learn features per point. We benchmark our network on a variety of 3D recognition tasks including action recognition, semantic segmentation and scene flow estimation. MeteorNet shows stronger performance than previous grid-based methods while achieving state-of-the-art performance on Synthia. MeteorNet also outperforms previous baseline methods that are able to process at most two consecutive point clouds. To the best of our knowledge, this is the first work on deep learning for dynamic raw point cloud sequences.

...read moreread less

169 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

Collapse

Cited by

PDF

Open Access

More filters

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

On robust estimation of the location parameter

[...]

Frederick R. Forst

01 Jan 1980

3,652 citations

The theory of affordances

[...]

博之三嶋

01 Nov 2008

2,686 citations

Journal Article•

A sensorimotor account of vision and visual consciousness-Authors' Response-Acting out our sensory experience

[...]

J. Kevin O'Regan, Alva Noë

01 Jan 2001-Behavioral and Brain Sciences

TL;DR: In this article, the authors propose that the brain produces an internal representation of the world, and the activation of this internal representation is assumed to give rise to the experience of seeing, but it leaves unexplained how the existence of such a detailed internal representation might produce visual consciousness.

...read moreread less

Abstract: Many current neurophysiological, psychophysical, and psychological approaches to vision rest on the idea that when we see, the brain produces an internal representation of the world. The activation of this internal representation is assumed to give rise to the experience of seeing. The problem with this kind of approach is that it leaves unexplained how the existence of such a detailed internal representation might produce visual consciousness. An alternative proposal is made here. We propose that seeing is a way of acting. It is a particular way of exploring the environment. Activity in internal representations does not generate the experience of seeing. The outside world serves as its own, external, representation. The experience of seeing occurs when the organism masters what we call the governing laws of sensorimotor contingency. The advantage of this approach is that it provides a natural and principled way of accounting for visual consciousness, and for the differences in the perceived quality of sensory experience in the different sensory modalities. Several lines of empirical evidence are brought forward in support of the theory, in particular: evidence from experiments in sensorimotor adaptation, visual \"filling in,\" visual stability despite eye movements, change blindness, sensory substitution, and color perception.

...read moreread less

2,271 citations

Posted Content•

nuScenes: A multimodal dataset for autonomous driving

[...]

Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, Oscar Beijbom - Show less +6 more

26 Mar 2019-arXiv: Learning

TL;DR: nuScenes as mentioned in this paper is the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view.

...read moreread less

Abstract: Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the environment. Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar. As machine learning based methods for detection and tracking become more prevalent, there is a need to train and evaluate such methods on datasets containing range sensor data along with images. In this work we present nuTonomy scenes (nuScenes), the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view. nuScenes comprises 1000 scenes, each 20s long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes. It has 7x as many annotations and 100x as many images as the pioneering KITTI dataset. We define novel 3D detection and tracking metrics. We also provide careful dataset analysis as well as baselines for lidar and image based detection and tracking. Data, development kit and more information are available online.

...read moreread less

1,939 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse