Author

Manuel Martinez

Other affiliations: Carnegie Mellon University
Bio: Manuel Martinez is an academic researcher from Karlsruhe Institute of Technology. The author has contributed to research in topics: Codec & Pose. The author has an h-index of 12 and has co-authored 34 publications receiving 796 citations. Previous affiliations of Manuel Martinez include Carnegie Mellon University.

Papers
Journal ArticleDOI
TL;DR: This paper presents MOPED, a framework for Multiple Object Pose Estimation and Detection that seamlessly integrates single-image and multi-image object recognition and pose estimation in one optimized, robust, and scalable framework.
Abstract: We present MOPED, a framework for Multiple Object Pose Estimation and Detection that seamlessly integrates single-image and multi-image object recognition and pose estimation in one optimized, robust, and scalable framework. We address two main challenges in computer vision for robotics: robust performance in complex scenes, and low latency for real-time operation. We achieve robust performance with Iterative Clustering Estimation (ICE), a novel algorithm that iteratively combines feature clustering with robust pose estimation. Feature clustering quickly partitions the scene and produces object hypotheses. The hypotheses are used to further refine the feature clusters, and the two steps iterate until convergence. ICE is easy to parallelize, and easily integrates single- and multi-camera object recognition and pose estimation. We also introduce a novel object hypothesis scoring function based on M-estimator theory, and a novel pose clustering algorithm that robustly handles recognition outliers. We achieve scalability and low latency with an improved feature matching algorithm for large databases, a GPU/CPU hybrid architecture that exploits parallelism at all levels, and an optimized resource scheduler. We provide extensive experimental results demonstrating state-of-the-art performance in terms of recognition, scalability, and latency in real-world robotic applications.
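The core of ICE is an EM-style alternation between partitioning features and fitting a model per partition. The toy sketch below is a hedged illustration rather than MOPED's actual code: it uses a centroid fit where MOPED performs robust 6-DoF pose estimation, but the loop structure is the same: cluster, fit hypotheses, reassign features to the hypothesis that explains them best, and repeat until stable.

```python
import numpy as np

def ice(points, k=3, iters=20, seed=0):
    """Toy Iterative Clustering Estimation loop (illustrative, not MOPED's code).

    Alternates (1) assigning features to the nearest hypothesis and
    (2) refitting each hypothesis from its cluster. Here a hypothesis is a
    2D centroid; in MOPED it is a robustly estimated 6-DoF object pose.
    """
    rng = np.random.default_rng(seed)
    models = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Clustering step: partition the scene by the current hypotheses.
        dists = np.linalg.norm(points[:, None, :] - models[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Estimation step: refit each hypothesis from its assigned features.
        new_models = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else models[j]
            for j in range(k)
        ])
        if np.allclose(new_models, models):  # converged
            break
        models = new_models
    return models, labels

# Example: three noisy 2D feature clumps standing in for three objects.
rng = np.random.default_rng(1)
pts = np.concatenate([rng.normal(c, 0.2, (40, 2)) for c in [(0, 0), (3, 0), (0, 3)]])
hypotheses, assignment = ice(pts)
print(hypotheses)
```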

455 citations

Proceedings ArticleDOI
03 May 2010
TL;DR: MOPED builds on POSESEQ, a state-of-the-art object recognition algorithm, demonstrating a massive improvement in scalability and latency without sacrificing robustness, with both algorithmic and architectural improvements.
Abstract: The latency of a perception system is crucial for a robot performing interactive tasks in dynamic human environments. We present MOPED, a fast and scalable perception system for object recognition and pose estimation. MOPED builds on POSESEQ, a state-of-the-art object recognition algorithm, demonstrating a massive improvement in scalability and latency without sacrificing robustness. We achieve this with both algorithmic and architectural improvements: a novel feature matching algorithm, a hybrid GPU/CPU architecture that exploits parallelism at all levels, and an optimized resource scheduler. Using the same standard hardware, we achieve up to a 30x improvement on real-world scenes.
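The scalable matching idea can be illustrated with an off-the-shelf approximate nearest-neighbour index. The sketch below uses OpenCV's FLANN randomized kd-trees plus Lowe's ratio test; the database size, tree count, and `checks` budget are assumptions chosen to show the accuracy/latency knob, not MOPED's actual matcher.

```python
import numpy as np
import cv2

# Stand-in data: a large model database of SIFT-like 128-D descriptors and
# one image's worth of query descriptors.
db_descriptors = np.random.rand(100_000, 128).astype(np.float32)
query_descriptors = np.random.rand(500, 128).astype(np.float32)

# Randomized kd-trees give approximate nearest neighbours in sub-linear time;
# 'checks' bounds the search effort, trading accuracy for latency.
flann = cv2.FlannBasedMatcher(
    dict(algorithm=1, trees=8),   # algorithm=1: randomized kd-trees
    dict(checks=64),
)
matches = flann.knnMatch(query_descriptors, db_descriptors, k=2)

# Lowe's ratio test keeps only matches clearly better than the runner-up.
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
print(f"{len(good)} confident matches out of {len(matches)}")
```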

100 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This work introduces DriveAHead, a novel dataset designed to develop and evaluate head pose monitoring algorithms in real driving conditions, and presents the Head Pose Network, a deep learning model that achieves better performance than current state-of-the-art algorithms.
Abstract: Head pose monitoring is an important task for driver assistance systems, since it is a key indicator for human attention and behavior. However, current head pose datasets either lack complexity or do not adequately represent the conditions that occur while driving. Therefore, we introduce DriveAHead, a novel dataset designed to develop and evaluate head pose monitoring algorithms in real driving conditions. We provide frame-by-frame head pose labels obtained from a motion-capture system, as well as annotations about occlusions of the driver's face. To the best of our knowledge, DriveAHead is the largest publicly available driver head pose dataset, and also the only one that provides 2D and 3D data aligned at the pixel level using the Kinect v2. Existing performance metrics are based on the mean error without any consideration of the bias towards one position or another. Here, we suggest a new performance metric, named Balanced Mean Angular Error, that addresses the bias towards the forward looking position existing in driving datasets. Finally, we present the Head Pose Network, a deep learning model that achieves better performance than current state-of-the-art algorithms, and we analyze its performance when using our dataset.
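The abstract does not spell out the metric's formula; one plausible reading of Balanced Mean Angular Error is to average the per-bin mean absolute error over fixed-width ground-truth angle bins, so abundant forward-looking frames cannot swamp rare large head turns. The sketch below implements that reading; the 5-degree bin width is an assumption, not the paper's specification.

```python
import numpy as np

def balanced_mean_angular_error(y_true, y_pred, bin_width=5.0):
    """Hedged sketch of a balanced angular error metric.

    Mean absolute error is computed per ground-truth angle bin, and the bin
    means are averaged, weighting sparse extreme poses equally with the
    dominant forward-looking pose. Binning details are assumptions.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = np.abs(y_pred - y_true)
    bins = np.floor(y_true / bin_width)
    return float(np.mean([errors[bins == b].mean() for b in np.unique(bins)]))

# Four near-frontal frames and two large head turns: the plain mean error is
# 6.5 degrees, while the balanced metric (9.25) exposes the large-pose errors.
gt = [0.0, 1.0, -2.0, 0.0, 45.0, 60.0]
pred = [1.0, 0.0, -1.0, 1.0, 30.0, 40.0]
print(balanced_mean_angular_error(gt, pred))
```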

58 citations

Proceedings ArticleDOI
01 Dec 2016
TL;DR: This work suggests a non-intrusive and cost-efficient approach to detect sleep position based on a single depth camera, which outperforms current state-of-the-art algorithms and even the contact sensor from the sleep laboratory.
Abstract: Sleep position is an important feature used to assess the quality and quantity of an individual's sleep. Furthermore, it is related to sleep disorders like sleep apnoea and snoring, and needs to be tracked in nursing homes to avoid pressure ulcers. Therefore, a gravity sensor attached to the chest is generally used to register body position during sleep studies. We suggest a non-intrusive and cost-efficient approach to detect sleep position based on a single depth camera. Compared to alternative state-of-the-art approaches, ours requires no calibration, and it has been evaluated in a real setting comprising 78 patients from a sleep laboratory. We use Bed Aligned Maps to extract a low-resolution descriptor from a depth map aligned to the bed position. We perform classification using Convolutional Neural Networks, achieving an accuracy of 94.0%, thus outperforming current state-of-the-art algorithms and even the contact sensor from the sleep laboratory, which achieves an accuracy of 91.9%.
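As a sketch of the classification stage described above, the small PyTorch network below maps a low-resolution, bed-aligned depth descriptor to a sleep-position class. The 8x16 grid, layer sizes, and four classes (e.g. supine, prone, left, right) are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SleepPositionNet(nn.Module):
    """Minimal CNN over a Bed Aligned Map; all sizes are illustrative."""

    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                               # 8x16 -> 4x8
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Linear(32 * 4 * 8, n_classes)

    def forward(self, x):                                  # x: (batch, 1, 8, 16)
        return self.classifier(self.features(x).flatten(1))

# A batch of two stand-in Bed Aligned Maps yields two 4-way class scores.
bam = torch.randn(2, 1, 8, 16)
print(SleepPositionNet()(bam).shape)                       # torch.Size([2, 4])
```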

52 citations

Proceedings Article
01 Nov 2012
TL;DR: This work presents a vision-based method to estimate the respiration rate of subjects from their chest movements that is fully automated, non-invasive, robust to occlusions, and only depends on off-the-shelf hardware.
Abstract: We present a vision-based method to estimate the respiration rate of subjects from their chest movements. In contrast to alternative approaches, our method is fully automated, non-invasive, robust to occlusions, and only depends on off-the-shelf hardware. We project a fixed infrared (IR) dot pattern. The dots are detected using a camera with a matching IR filter. We estimate the dots' barycenters with sub-pixel precision and track them over a 30-second sliding window. We merge all trajectories using Principal Component Analysis (PCA) and use Autoregressive (AR) Spectral Analysis to estimate the respiratory rate. The system was evaluated on 9 subjects and on a range of simulated scenarios using an artificial chest.
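The pipeline compresses into a short numerical sketch: merge the dot trajectories with PCA, fit an autoregressive model via the Yule-Walker equations, and read the respiration rate off the AR spectrum's peak inside a plausible breathing band. The trajectories below are synthetic stand-ins, and the AR order and frequency band are assumptions.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

fs, win_s = 30.0, 30.0                      # assumed 30 Hz camera, 30 s window
t = np.arange(int(fs * win_s)) / fs

# Stand-in trajectories: 50 tracked dots moving with a 0.25 Hz breathing
# component (15 breaths/min) plus per-dot noise.
rng = np.random.default_rng(0)
traj = np.sin(2 * np.pi * 0.25 * t) * rng.uniform(0.5, 1.5, (50, 1)) \
    + 0.1 * rng.standard_normal((50, len(t)))

# PCA merge: the first principal component is the common breathing motion.
centered = traj - traj.mean(axis=1, keepdims=True)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
signal = vt[0]

# Yule-Walker AR fit: solve the Toeplitz autocorrelation system for the
# AR coefficients, then evaluate the model's power spectrum on a fine grid.
order = 8
acf = np.correlate(signal, signal, "full")[len(signal) - 1:][: order + 1]
ar = solve_toeplitz((acf[:order], acf[:order]), acf[1 : order + 1])

freqs = np.linspace(0.05, 1.0, 500)         # assumed breathing band, in Hz
z = np.exp(-2j * np.pi * freqs / fs)
denom = 1 - np.sum(ar[:, None] * z ** np.arange(1, order + 1)[:, None], axis=0)
psd = 1.0 / np.abs(denom) ** 2
print(f"estimated rate: {freqs[psd.argmax()] * 60:.1f} breaths/min")
```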

52 citations


Cited by
Proceedings ArticleDOI
06 Nov 2011
TL;DR: This paper proposes a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise, and demonstrates through experiments that ORB is two orders of magnitude faster than SIFT while performing as well in many situations.
Abstract: Feature matching is at the base of many computer vision problems, such as object recognition or structure from motion. Current methods rely on costly descriptors for detection and matching. In this paper, we propose a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise. We demonstrate through experiments that ORB is two orders of magnitude faster than SIFT, while performing as well in many situations. The efficiency is tested on several real-world applications, including object detection and patch-tracking on a smartphone.
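Since ORB ships with OpenCV, the recipe can be shown directly: detect keypoints, compute binary descriptors, and match with Hamming distance, which is what makes ORB matching so much cheaper than SIFT's floating-point L2 comparisons. Random stand-in images are used below; real grayscale frames apply in practice.

```python
import numpy as np
import cv2

# Stand-in grayscale frames; noise contains plenty of FAST corners.
img1 = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
img2 = np.random.randint(0, 256, (480, 640), dtype=np.uint8)

orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Binary descriptors are matched with Hamming distance; cross-checking keeps
# only mutual best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} cross-checked matches")
```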

8,702 citations

Proceedings ArticleDOI
20 Mar 2017
TL;DR: This paper explores domain randomization, a simple technique for training models on simulated images that transfer to real images by randomizing rendering in the simulator, and achieves the first successful transfer of a deep neural network trained only on simulated RGB images to the real world for the purpose of robotic control.
Abstract: Bridging the ‘reality gap’ that separates simulated robotics from experiments on hardware could accelerate robotic research through improved data availability. This paper explores domain randomization, a simple technique for training models on simulated images that transfer to real images by randomizing rendering in the simulator. With enough variability in the simulator, the real world may appear to the model as just another variation. We focus on the task of object localization, which is a stepping stone to general robotic manipulation skills. We find that it is possible to train a real-world object detector that is accurate to 1.5 cm and robust to distractors and partial occlusions using only data from a simulator with non-realistic random textures. To demonstrate the capabilities of our detectors, we show they can be used to perform grasping in a cluttered environment. To our knowledge, this is the first successful transfer of a deep neural network trained only on simulated RGB images (without pre-training on real images) to the real world for the purpose of robotic control.
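The randomization loop is simple to sketch: each synthetic training image is rendered under freshly sampled nuisance parameters, so the real world looks to the model like just another variation. The parameter list below mirrors the factors the paper randomizes (textures, lights, camera, distractors), but the ranges and the `render()` hook are illustrative assumptions supplied by whatever simulator is used.

```python
import random

def sample_scene_params():
    """Draw one random rendering configuration (ranges are illustrative)."""
    return {
        "object_texture": random.choice(["noise", "checker", "flat", "gradient"]),
        "n_distractors": random.randint(0, 10),
        "light_intensity": random.uniform(0.2, 2.0),
        "light_position": [random.uniform(-1.0, 1.0) for _ in range(3)],
        "camera_jitter_deg": random.uniform(-5.0, 5.0),
    }

def make_training_set(n_images, render):
    """render(params) -> (image, object_location) is the simulator hook."""
    return [render(sample_scene_params()) for _ in range(n_images)]

# Example with a stub renderer that just echoes the sampled texture.
dataset = make_training_set(3, render=lambda p: (f"img<{p['object_texture']}>", (0, 0)))
print(dataset)
```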

2,079 citations

Journal ArticleDOI
TL;DR: This work presents a two-step cascaded system with two deep networks, where the top detections from the first are re-evaluated by the second, and shows that this method improves performance on an RGBD robotic grasping dataset, and can be used to successfully execute grasps on two different robotic platforms.
Abstract: We consider the problem of detecting robotic grasps in an RGB-D view of a scene containing objects. In this work, we apply a deep learning approach to solve this problem, which avoids time-consuming hand-design of features. This presents two main challenges. First, we need to evaluate a huge number of candidate grasps. In order to make detection fast and robust, we present a two-step cascaded system with two deep networks, where the top detections from the first are re-evaluated by the second. The first network has fewer features, is faster to run, and can effectively prune out unlikely candidate grasps. The second, with more features, is slower but has to run only on the top few detections. Second, we need to handle multimodal inputs effectively, for which we present a method that applies structured regularization on the weights based on multimodal group regularization. We show that our method improves performance on an RGBD robotic grasping dataset, and can be used to successfully execute grasps on two different robotic platforms.
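The cascade structure is straightforward to sketch: a small, fast network scores every candidate grasp cheaply, and only the top-k survivors are re-scored by a larger network. The PyTorch toy below assumes 64-D candidate features and k=10; the sizes are illustrative, not the paper's networks.

```python
import torch
import torch.nn as nn

# Stage 1 is deliberately small (cheap to run on every candidate);
# stage 2 is larger but only sees the shortlist.
small = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
large = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 1))

def cascade(candidates, k=10):
    """candidates: (N, 64) grasp features; returns the index of the best grasp."""
    with torch.no_grad():
        coarse = small(candidates).squeeze(1)           # cheap pass over all N
        shortlist = coarse.topk(min(k, len(candidates))).indices
        fine = large(candidates[shortlist]).squeeze(1)  # expensive pass over k
    return shortlist[fine.argmax()].item()

grasps = torch.randn(500, 64)                           # stand-in candidate features
print("best grasp candidate:", cascade(grasps))
```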

1,144 citations

Journal ArticleDOI
TL;DR: This survey reviews the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps, provides an overview of the different approaches, and draws a parallel to the classical approaches that rely on analytic formulations.
Abstract: We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps. We divide the approaches into three groups based on whether they synthesize grasps for known, familiar, or unknown objects. This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique. In the case of known objects, we concentrate on the approaches that are based on object recognition and pose estimation. In the case of familiar objects, the techniques use some form of similarity matching to a set of previously encountered objects. Finally, for the approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps. Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping. We also draw a parallel to the classical approaches that rely on analytic formulations.

859 citations