Author

Alvaro Collet

Other affiliations: Microsoft

Bio: Alvaro Collet is an academic researcher from Carnegie Mellon University. The author has contributed to research on topics including pose and the cognitive neuroscience of visual object recognition. The author has an h-index of 15 and has co-authored 17 publications receiving 2,016 citations. Previous affiliations of Alvaro Collet include Microsoft.

Papers

Journal Article

High-quality streamable free-viewpoint video

Alvaro Collet¹, Ming Chuang¹, Patrick Sweeney¹, Don Gillett¹, Dennis Evseev¹, David J. Calabrese¹, Hugues Hoppe¹, Adam G. Kirk¹, Steve Sullivan¹

Microsoft¹

27 Jul 2015

TL;DR: This work presents the first end-to-end solution for creating high-quality free-viewpoint video encoded as a compact data stream: the system records performances using a dense set of RGB and IR video cameras, generates dynamic textured surfaces, and compresses these to a streamable 3D video format.

Abstract: We present the first end-to-end solution to create high-quality free-viewpoint video encoded as a compact data stream. Our system records performances using a dense set of RGB and IR video cameras, generates dynamic textured surfaces, and compresses these to a streamable 3D video format. Four technical advances contribute to high fidelity and robustness: multimodal multi-view stereo fusing RGB, IR, and silhouette information; adaptive meshing guided by automatic detection of perceptually salient areas; mesh tracking to create temporally coherent subsequences; and encoding of tracked textured meshes as an MPEG video stream. Quantitative experiments demonstrate geometric accuracy, texture fidelity, and encoding efficiency. We release several datasets with calibrated inputs and processed results to foster future research.

520 citations

Journal Article

The MOPED framework: Object recognition and pose estimation for manipulation

Alvaro Collet¹, Manuel Martinez¹, Siddhartha S. Srinivasa²

Carnegie Mellon University¹, Intel²

01 Sep 2011-The International Journal of Robotics Research

TL;DR: This paper presents MOPED, a framework for Multiple Object Pose Estimation and Detection that seamlessly integrates single-image and multi-image object recognition and pose estimation in one optimized, robust, and scalable framework.

Abstract: We present MOPED, a framework for Multiple Object Pose Estimation and Detection that seamlessly integrates single-image and multi-image object recognition and pose estimation in one optimized, robust, and scalable framework. We address two main challenges in computer vision for robotics: robust performance in complex scenes, and low latency for real-time operation. We achieve robust performance with Iterative Clustering Estimation (ICE), a novel algorithm that iteratively combines feature clustering with robust pose estimation. Feature clustering quickly partitions the scene and produces object hypotheses. The hypotheses are used to further refine the feature clusters, and the two steps iterate until convergence. ICE is easy to parallelize, and easily integrates single- and multi-camera object recognition and pose estimation. We also introduce a novel object hypothesis scoring function based on M-estimator theory, and a novel pose clustering algorithm that robustly handles recognition outliers. We achieve scalability and low latency with an improved feature matching algorithm for large databases, a GPU/CPU hybrid architecture that exploits parallelism at all levels, and an optimized resource scheduler. We provide extensive experimental results demonstrating state-of-the-art performance in terms of recognition, scalability, and latency in real-world robotic applications.

455 citations

Journal Article

HERB: a home exploring robotic butler

Siddhartha S. Srinivasa¹, Dave Ferguson¹, Casey J. Helfrich¹, Dmitry Berenson², Alvaro Collet², Rosen Diankov², Garratt Gallagher², Geoffrey A. Hollinger², James J. Kuffner², Michael Vande Weghe²

Intel¹, Carnegie Mellon University²

01 Jan 2010-Autonomous Robots

TL;DR: New algorithms for searching for objects, learning to navigate in cluttered dynamic indoor scenes, recognizing and registering objects accurately in high clutter using vision, manipulating doors and other constrained objects using caging grasps, grasp planning and execution in clutter, and manipulation on pose and torque constraint manifolds are presented.

Abstract: We describe the architecture, algorithms, and experiments with HERB, an autonomous mobile manipulator that performs useful manipulation tasks in the home. We present new algorithms for searching for objects, learning to navigate in cluttered dynamic indoor scenes, recognizing and registering objects accurately in high clutter using vision, manipulating doors and other constrained objects using caging grasps, grasp planning and execution in clutter, and manipulation on pose and torque constraint manifolds. We also present numerous severe real-world test results from the integration of these algorithms into a single mobile manipulator.

337 citations

Proceedings Article

Object recognition and full pose registration from a single image for robotic manipulation

Alvaro Collet¹, Dmitry Berenson¹, Siddhartha S. Srinivasa², Dave Ferguson²

Carnegie Mellon University¹, Intel²

12 May 2009

TL;DR: This paper presents an approach for building metric 3D models of objects using local descriptors from several images, optimized to fit a set of calibrated training images, thus obtaining the best possible alignment between the 3D model and the real object.

Abstract: Robust perception is a vital capability for robotic manipulation in unstructured scenes. In this context, full pose estimation of relevant objects in a scene is a critical step towards the introduction of robots into household environments. In this paper, we present an approach for building metric 3D models of objects using local descriptors from several images. Each model is optimized to fit a set of calibrated training images, thus obtaining the best possible alignment between the 3D model and the real object. Given a new test image, we match the local descriptors to our stored models online, using a novel combination of the RANSAC and Mean Shift algorithms to register multiple instances of each object. A robust initialization step allows for arbitrary rotation, translation and scaling of objects in the test images. The resulting system provides markerless 6-DOF pose estimation for complex objects in cluttered scenes. We provide experimental results demonstrating orientation and translation accuracy, as well as a physical implementation of the pose output being used by an autonomous robot to perform grasping in highly cluttered scenes.

310 citations

Proceedings Article

Manipulation planning with Workspace Goal Regions

Dmitry Berenson¹, Siddhartha S. Srinivasa², Dave Ferguson², Alvaro Collet¹, James J. Kuffner¹

Carnegie Mellon University¹, Intel²

12 May 2009

TL;DR: This paper presents an approach to path planning for manipulators that uses Workspace Goal Regions (WGRs) to specify goal end-effector poses, and shows that planning with WGRs provides an intuitive and powerful method of specifying goals for a variety of tasks without sacrificing efficiency or desirable completeness properties.

Abstract: We present an approach to path planning for manipulators that uses Workspace Goal Regions (WGRs) to specify goal end-effector poses. Instead of specifying a discrete set of goals in the manipulator's configuration space, we specify goals more intuitively as volumes in the manipulator's workspace. We show that WGRs provide a common framework for describing goal regions that are useful for grasping and manipulation. We also describe two randomized planning algorithms capable of planning with WGRs. The first is an extension of RRT-JT that interleaves exploration using a Rapidly-exploring Random Tree (RRT) with exploitation using Jacobian-based gradient descent toward WGR samples. The second is the IKBiRRT algorithm, which uses a forward-searching tree rooted at the start and a backward-searching tree that is seeded by WGR samples. We demonstrate both simulation and experimental results for a 7DOF WAM arm with a mobile base performing reaching and pick-and-place tasks. Our results show that planning with WGRs provides an intuitive and powerful method of specifying goals for a variety of tasks without sacrificing efficiency or desirable completeness properties.

148 citations

Cited by

Journal Article

I and i

Kevin Barraclough

08 Dec 2001-BMJ

TL;DR: There is, I think, something ethereal about i, the square root of minus one, which seemed an odd beast when first encountered at school: an intruder hovering on the edge of reality.

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Proceedings Article

ORB: An efficient alternative to SIFT or SURF

Ethan Rublee¹, Vincent Rabaud¹, Kurt Konolige¹, Gary Bradski¹

Willow Garage¹

06 Nov 2011

TL;DR: This paper proposes a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise, and demonstrates through experiments that ORB is two orders of magnitude faster than SIFT, while performing as well in many situations.

Abstract: Feature matching is at the base of many computer vision problems, such as object recognition or structure from motion. Current methods rely on costly descriptors for detection and matching. In this paper, we propose a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise. We demonstrate through experiments how ORB is two orders of magnitude faster than SIFT, while performing as well in many situations. The efficiency is tested on several real-world applications, including object detection and patch-tracking on a smart phone.

8,702 citations

Proceedings Article

Domain randomization for transferring deep neural networks from simulation to the real world

Josh Tobin¹, Rachel Fong², Alex Ray², Jonas Schneider², Wojciech Zaremba², Pieter Abbeel¹

University of California, Berkeley¹, OpenAI²

20 Mar 2017

TL;DR: This paper explores domain randomization, a simple technique for training models on simulated images that transfer to real images by randomizing rendering in the simulator, and achieves the first successful transfer of a deep neural network trained only on simulated RGB images to the real world for the purpose of robotic control.

Abstract: Bridging the ‘reality gap’ that separates simulated robotics from experiments on hardware could accelerate robotic research through improved data availability. This paper explores domain randomization, a simple technique for training models on simulated images that transfer to real images by randomizing rendering in the simulator. With enough variability in the simulator, the real world may appear to the model as just another variation. We focus on the task of object localization, which is a stepping stone to general robotic manipulation skills. We find that it is possible to train a real-world object detector that is accurate to 1.5 cm and robust to distractors and partial occlusions using only data from a simulator with non-realistic random textures. To demonstrate the capabilities of our detectors, we show they can be used to perform grasping in a cluttered environment. To our knowledge, this is the first successful transfer of a deep neural network trained only on simulated RGB images (without pre-training on real images) to the real world for the purpose of robotic control.

2,079 citations

Proceedings Article

A morphable model for the synthesis of 3D faces

Matthew Turk

01 Jan 1999

2,010 citations

Journal Article

Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection

Sergey Levine¹, Peter Pastor, Alex Krizhevsky¹, Julian Ibarz¹, Deirdre Quillen¹

Google¹

01 Apr 2018-The International Journal of Robotics Research

TL;DR: The approach achieves effective real-time control, can successfully grasp novel objects, and corrects mistakes by continuous servoing, and illustrates that data from different robots can be combined to learn more reliable and effective grasping.

Abstract: We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural netwo...

1,402 citations
