Journal ArticleDOI

Vision for robotic object manipulation in domestic settings

31 Jul 2005-Robotics and Autonomous Systems (North-Holland)-Vol. 52, Iss: 1, pp 85-100
TL;DR: A vision system is presented for robotic object manipulation tasks in natural, domestic environments; one important property is that the step from object recognition to pose estimation is completely automatic, combining both appearance and geometric models.
About: This article is published in Robotics and Autonomous Systems. The article was published on 2005-07-31. It has received 118 citations to date. The article focuses on the topics: Pose & 3D single-object recognition.
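
The TL;DR above describes an automatic step from object recognition to pose estimation that combines appearance and geometric models. As a rough illustration only (not the authors' implementation), such a pipeline could be sketched with OpenCV: SIFT matching for the appearance step and PnP with RANSAC for the geometric step. All names, thresholds, and the assumed stored model view with known 3D keypoint coordinates are placeholders.

```python
# Hypothetical sketch: recognize an object via SIFT matching, then estimate its
# 6-DoF pose with PnP + RANSAC. Illustrative only, not the article's implementation.
import cv2
import numpy as np

def recognize_and_estimate_pose(query_img, model_img, model_points_3d, K):
    """model_points_3d[i] is the 3D point (object frame) corresponding to the
    i-th keypoint/descriptor of the stored model view; K is the camera matrix."""
    sift = cv2.SIFT_create()
    kp_q, des_q = sift.detectAndCompute(query_img, None)
    _, des_m = sift.detectAndCompute(model_img, None)

    # Appearance step: match descriptors with Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(des_m, des_q, k=2)
    good = [p[0] for p in pairs
            if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    if len(good) < 6:
        return None  # object not recognized

    # Geometric step: 2D-3D correspondences -> pose via PnP with RANSAC.
    obj_pts = np.float32([model_points_3d[m.queryIdx] for m in good])
    img_pts = np.float32([kp_q[m.trainIdx].pt for m in good])
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, None)
    return (rvec, tvec) if ok else None
```
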
Citations
12 Nov 2015
TL;DR: In this article, a Deep Q Network (DQN) is used to learn target reaching with a three-joint robot manipulator from external visual observation; the network is demonstrated to perform target reaching after training in simulation.
Abstract: This paper introduces a machine learning based system for controlling a robotic manipulator with visual perception only. The capability to autonomously learn robot controllers solely from raw-pixel images and without any prior knowledge of configuration is shown for the first time. We build upon the success of recent deep reinforcement learning and develop a system for learning target reaching with a three-joint robot manipulator using external visual observation. A Deep Q Network (DQN) was demonstrated to perform target reaching after training in simulation. Transferring the network to real hardware and real observation in a naive approach failed, but experiments show that the network works when replacing camera images with synthetic images.
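
As a loose sketch of the approach summarized in this abstract (not the paper's code), a single DQN update for a discrete-action reaching task might look as follows, assuming PyTorch; the network shape, replay-buffer layout, and hyperparameters are all assumptions.

```python
# Minimal DQN update sketch for learning reaching from images (illustrative only).
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(256), nn.ReLU(),
            nn.Linear(256, n_actions),  # one Q-value per discrete joint command
        )

    def forward(self, x):
        return self.net(x)

def dqn_update(q_net, target_net, optimizer, replay, batch_size=32, gamma=0.99):
    """One gradient step on the standard DQN temporal-difference target.
    `replay` holds tuples (obs, action, reward, next_obs, done) of tensors;
    `action` entries are int64 indices, `done` entries are 0./1. floats."""
    batch = random.sample(replay, batch_size)
    obs, act, rew, next_obs, done = map(torch.stack, zip(*batch))
    q = q_net(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rew + gamma * (1 - done) * target_net(next_obs).max(1).values
    loss = nn.functional.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In a full training loop the action set would be a discretization of joint commands and the target network would be synchronized with the online network periodically.
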

156 citations

Journal ArticleDOI
TL;DR: The results show that a combination of a descriptor based on shape context with a non-linear classification algorithm leads to a stable detection of grasping points for a variety of objects.
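
A minimal sketch of the idea behind that result, assuming scikit-learn, a simplified log-polar shape-context descriptor, and an RBF-kernel SVM as the non-linear classifier (not the cited authors' exact features or settings):

```python
# Illustrative sketch: simplified shape-context descriptors classified with a
# non-linear SVM to label candidate grasping points (not the cited implementation).
import numpy as np
from sklearn.svm import SVC

def shape_context(points, center, n_r=5, n_theta=12, r_max=1.0):
    """Log-polar histogram of edge points relative to one candidate point."""
    d = points - center
    r = np.linalg.norm(d, axis=1) + 1e-9
    theta = np.arctan2(d[:, 1], d[:, 0])                 # angles in [-pi, pi]
    r_bins = np.logspace(np.log10(r_max / 8), np.log10(r_max), n_r)
    r_idx = np.clip(np.searchsorted(r_bins, np.clip(r, None, r_max)), 0, n_r - 1)
    t_idx = ((theta + np.pi) / (2 * np.pi) * n_theta).astype(int) % n_theta
    hist = np.zeros((n_r, n_theta))
    for ri, ti in zip(r_idx, t_idx):
        hist[ri, ti] += 1
    return (hist / max(len(points), 1)).ravel()

def train_grasp_point_classifier(X, y):
    """X: (n_samples, n_r*n_theta) descriptors, y: 1 = grasping point, 0 = not."""
    clf = SVC(kernel="rbf", C=10.0, gamma="scale", probability=True)
    clf.fit(X, y)
    return clf
```
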

152 citations


Cites background from "Vision for robotic object manipulat..."

  • ...Although the problem is still unsolved for general scenes, we have demonstrated in our previous work how simple assumptions about the environment help in segmentation of table-top scenes, [46, 47, 48]....




Journal ArticleDOI
TL;DR: A novel framework is proposed for a data-driven grasp planner that indexes partial sensor data into a database of 3D models with known grasps and transfers grasps from those models to novel objects.
Abstract: This thesis introduces a new framework for data-driven grasping. We assume, with neuropsychological justification, that human grasping is fundamentally example based rather than rule based. This means that planning grasps for a novel object will usually reduce to identifying it as similar to known objects with known grasp affordances. Our framework is intended to allow robots to mimic this approach.

For robots to succeed in the real world, it is essential that they be able to grasp objects based on realistically available sensor data. However, most existing grasp planners assume that the robot has access to the full 3D geometry of all objects to be grasped, which is unscalable, or abandon 3D geometry entirely to plan grasps based on appearance, which is difficult to extend to dexterous hands. The core advantage of our data-driven framework is that it naturally allows grasps to be planned for partially sensed objects. We accomplish this by using the partial sensor data to find similar 3D models, which can be used as proxy geometries for grasp planning.

Along with the framework, we present a new set of shape descriptors suitable for matching partial sensor data to similar - but not identical - 3D models. This is in contrast to most previous descriptors for partial matching, which tend to rely on local feature correspondences that will often not exist in our problem setting. In a similar vein we also present new algorithms for aligning the pose and scale of partial sensor data to the best matching models, where no local correspondences may be assumed to exist.

Our grasp planner makes use of a grasp database, consisting of example grasps for a large number of 3D models. As no such database has previously existed, this thesis introduces the Columbia Grasp Database, a freely available collection of hundreds of thousands of grasps for thousands of 3D models using a variety of robotic hands. To construct this database we modified the Eigengrasp grasp planner, which uses a low dimensional control space to simplify the grasp search space. We also discuss some limitations of this planner and show how they can be addressed by model decomposition.

Our use of a database of 3D models annotated with precomputed grasps suggests the possibility of annotating the models with other forms of information as well. With this in mind, we show how to leverage noisy user data downloaded from the internet to suggest likely text tags for previously unlabeled 3D models. Although this work has not yet been applied to the problem of grasp planning, we demonstrate a content-based 3D model search engine that implements our automatic labeling algorithm.

The title of this thesis, “Data-Driven Grasping,” represents a vision of robotic grasping that is larger than our particular implementation. Instead, the major contribution of this thesis is bridging the worlds of robotics and 3D model similarity search. Content based search is an important area of research in its own right, but the combination of content based search and robotics is particularly exciting because it opens up the possibility of using the entire internet as a knowledge base for intelligent robots.
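
To make the retrieval idea in this abstract concrete, here is a toy sketch, not the thesis' code: a crude global shape descriptor for a partial point cloud is matched against a database of models with precomputed grasps, and the best match's grasps are returned as candidates. The descriptor, database layout, and names are all placeholders.

```python
# Illustrative sketch of the data-driven idea: index a partial point cloud into a
# database of 3D models with precomputed grasps, then reuse the best match's grasps.
import numpy as np

def partial_shape_descriptor(points, n_bins=16):
    """Very rough global descriptor: histogram of pairwise distances,
    normalized by the cloud's bounding-sphere radius."""
    p = points - points.mean(axis=0)
    radius = np.linalg.norm(p, axis=1).max() + 1e-9
    idx = np.random.choice(len(p), size=min(len(p), 500), replace=False)
    q = p[idx] / radius
    d = np.linalg.norm(q[:, None, :] - q[None, :, :], axis=-1)
    hist, _ = np.histogram(d[np.triu_indices(len(q), k=1)], bins=n_bins, range=(0, 2))
    return hist / max(hist.sum(), 1)

def plan_grasps_for_partial_scan(scan_points, database):
    """database: list of dicts {'descriptor': ..., 'grasps': [...]} built offline
    from full 3D models (e.g. a Columbia-Grasp-Database-style collection)."""
    desc = partial_shape_descriptor(scan_points)
    dists = [np.linalg.norm(desc - entry["descriptor"]) for entry in database]
    best = database[int(np.argmin(dists))]
    # The matched model acts as proxy geometry; its stored grasps are candidates
    # that would still need pose/scale alignment and reachability checks.
    return best["grasps"]
```
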

120 citations


Cites background from "Vision for robotic object manipulat..."

  • ...The authors of [Kragic et al., 2005] proposed reverting to simple grasping heuristics for unknown objects since it is “likely to be that the shape of an object has to be determined in order to successfully grasp it....


  • ..., 2009] and [Kragic et al., 2005] which can find exactly known objects in the presence of strong occlusion....


References
Journal ArticleDOI
TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form; these results provide the basis for an automatic system that can solve the Location Determination Problem under difficult viewing conditions.
Abstract: A new paradigm, Random Sample Consensus (RANSAC), for fitting a model to experimental data is introduced. RANSAC is capable of interpreting/smoothing data containing a significant percentage of gross errors, and is thus ideally suited for applications in automated image analysis where interpretation is based on the data provided by error-prone feature detectors. A major portion of this paper describes the application of RANSAC to the Location Determination Problem (LDP): Given an image depicting a set of landmarks with known locations, determine that point in space from which the image was obtained. In response to a RANSAC requirement, new results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form. These results provide the basis for an automatic system that can solve the LDP under difficult viewing
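
A generic RANSAC loop, here fitting a 2D line to points contaminated with gross errors, illustrates the paradigm; the model, thresholds, and refinement step are illustrative choices, not the paper's LDP algorithms.

```python
# Generic RANSAC sketch: fit a 2D line to points containing gross outliers.
import numpy as np

def ransac_line(points, n_iters=500, inlier_tol=0.02, min_inliers=10, rng=None):
    rng = rng or np.random.default_rng()
    best_model, best_inliers = None, np.array([], dtype=int)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        direction = q - p
        norm = np.linalg.norm(direction)
        if norm < 1e-12:
            continue
        # Distance of every point to the line through the minimal sample (p, q).
        normal = np.array([-direction[1], direction[0]]) / norm
        dist = np.abs((points - p) @ normal)
        inliers = np.flatnonzero(dist < inlier_tol)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (p, q), inliers
    if len(best_inliers) < min_inliers:
        return None  # consensus set too small -> no reliable fit
    # Refinement: least-squares fit over the consensus set only.
    x, y = points[best_inliers, 0], points[best_inliers, 1]
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept, best_inliers
```
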

23,396 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
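
With a modern OpenCV build, the staged detection and nearest-neighbour matching described above can be approximated in a few lines; this uses OpenCV's SIFT implementation and Lowe's ratio test, not the original 1999 code.

```python
# Sketch: SIFT keypoints + nearest-neighbour matching with Lowe's ratio test.
import cv2

def match_sift(img_model, img_scene, ratio=0.75):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img_model, None)
    kp2, des2 = sift.detectAndCompute(img_scene, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des1, des2, k=2)
    good = [p[0] for p in knn
            if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return kp1, kp2, good

# A verification step analogous to the paper's least-squares model fit could then
# estimate a geometric transform from the matches and keep only its inliers,
# e.g. with cv2.findHomography(..., cv2.RANSAC).
```
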

16,989 citations


"Vision for robotic object manipulat..." refers methods or result in this paper

  • ...Since the observed matching scores did not significantly differ from those already published in Lowe [24] and Mikolajczyk and Schmid [30] we have chosen not to include any additional quantitative results....


  • ...Two recognition modules are available for this purpose: (i) a feature based module based on Scale Invariant Feature Transform (SIFT) features Lowe [24], and (ii) an appearance based module using color histograms, Ekvall et al. [25]....


  • ...For a more thorough analysis on the SIFT recognition performance we refer to Lowe [24]....


Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.
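
The "extraction and tracking of image features" step can be sketched with present-day OpenCV calls, corner detection followed by pyramidal Lucas-Kanade tracking; this is an illustration, not the Alvey project's implementation.

```python
# Sketch: detect corner features in one frame and track them into the next frame.
import cv2
import numpy as np

def track_corners(prev_gray, next_gray, max_corners=200):
    # Corner detection (Shi-Tomasi / Harris-style response).
    corners = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                      qualityLevel=0.01, minDistance=7)
    if corners is None:
        return np.empty((0, 2)), np.empty((0, 2))
    # Pyramidal Lucas-Kanade optical flow to follow the detected features.
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray,
                                                   corners, None)
    ok = status.ravel() == 1
    return corners[ok].reshape(-1, 2), next_pts[ok].reshape(-1, 2)
```
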

13,993 citations

Journal ArticleDOI
TL;DR: In this article, a visual attention system inspired by the behavior and the neuronal architecture of the early primate visual system is presented, where multiscale image features are combined into a single topographical saliency map.
Abstract: A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented. Multiscale image features are combined into a single topographical saliency map. A dynamical neural network then selects attended locations in order of decreasing saliency. The system breaks down the complex problem of scene understanding by rapidly selecting, in a computationally efficient manner, conspicuous locations to be analyzed in detail.
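
A toy version of the center-surround idea, intensity channel only and fixed scales, gives a feel for how such a saliency map is assembled; it is far simpler than the full model described above.

```python
# Toy saliency sketch: multiscale center-surround differences on image intensity.
import cv2
import numpy as np

def intensity_saliency(gray, levels=6):
    gray = gray.astype(np.float32) / 255.0
    # Gaussian pyramid.
    pyr = [gray]
    for _ in range(levels):
        pyr.append(cv2.pyrDown(pyr[-1]))
    h, w = gray.shape
    saliency = np.zeros((h, w), np.float32)
    # Center-surround: |finer scale - coarser scale|, accumulated at full resolution.
    for c in (1, 2):
        for delta in (2, 3):
            s = c + delta
            if s >= len(pyr):
                continue
            center = cv2.resize(pyr[c], (w, h), interpolation=cv2.INTER_LINEAR)
            surround = cv2.resize(pyr[s], (w, h), interpolation=cv2.INTER_LINEAR)
            saliency += np.abs(center - surround)
    saliency -= saliency.min()
    if saliency.max() > 0:
        saliency /= saliency.max()
    return saliency  # attend to locations in order of decreasing saliency
```
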

10,525 citations

Journal ArticleDOI
01 Oct 1996
TL;DR: This article provides a tutorial introduction to visual servo control of robotic manipulators by reviewing the prerequisite topics from robotics and computer vision, including a brief review of coordinate transformations, velocity representation, and a description of the geometric aspects of the image formation process.
Abstract: This article provides a tutorial introduction to visual servo control of robotic manipulators. Since the topic spans many disciplines our goal is limited to providing a basic conceptual framework. We begin by reviewing the prerequisite topics from robotics and computer vision, including a brief review of coordinate transformations, velocity representation, and a description of the geometric aspects of the image formation process. We then present a taxonomy of visual servo control systems. The two major classes of systems, position-based and image-based systems, are then discussed in detail. Since any visual servo system must be capable of tracking image features in a sequence of images, we also include an overview of feature-based and correlation-based methods for tracking. We conclude the tutorial with a number of observations on the current directions of the research field of visual servo control.
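
For the image-based class of systems discussed in the tutorial, the classical control law computes a camera twist from the feature error via the pseudo-inverse of the interaction matrix; a minimal sketch for point features with assumed known depths (illustrative, not the tutorial's code):

```python
# Sketch of image-based visual servoing (IBVS) for point features.
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction (image Jacobian) matrix of one normalized image point (x, y)
    at depth Z, relating its image velocity to the 6-DoF camera twist."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])

def ibvs_velocity(features, desired, depths, gain=0.5):
    """Camera twist [vx, vy, vz, wx, wy, wz] driving features toward desired."""
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    error = (np.asarray(features) - np.asarray(desired)).ravel()
    return -gain * np.linalg.pinv(L) @ error
```
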

3,619 citations


"Vision for robotic object manipulat..." refers background in this paper

  • ...Our current research considers the problem of mobile manipulation in domestic settings where, in order for the robot to be able to detect and manipulate objects in the environment, robust visual feedback is of key importance....
