Proceedings Article

Treasure hunting for humanoids robot

TL;DR: The current status of the group's effort to make a humanoid robot autonomously build an internal representation of an object, and later find that object in an unknown environment, is described; the problem is named "treasure hunting".
Abstract: This paper describes the current status of our group's effort to make a humanoid robot autonomously build an internal representation of an object, and later on find it in an unknown environment. This problem is named "treasure hunting". In both cases, the main difficulty is to find the next best position of the vision sensor in order to realize the behavior while taking care of the robot's limitations. We briefly describe the models and the processes we are currently investigating in building this overall behavior. Along the way, we stress the key problems currently faced while trying to solve this problem.
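Both behaviors hinge on choosing the next best view under the robot's constraints. The sketch below is a generic greedy next-best-view loop, not the authors' exact formulation; the information-gain, feasibility, and motion-cost functions are caller-supplied placeholders.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class ViewPose:
    x: float
    y: float
    yaw: float

def next_best_view(candidates, info_gain, feasible, motion_cost, w=0.1):
    """Greedy next-best-view: among feasible candidate sensor poses,
    pick the one maximizing expected information gain minus a weighted
    motion cost. All three callables are supplied by the caller."""
    best, best_score = None, -math.inf
    for pose in candidates:
        if not feasible(pose):
            continue  # respects the robot's limitations (reach, balance)
        score = info_gain(pose) - w * motion_cost(pose)
        if score > best_score:
            best, best_score = pose, score
    return best

# Toy usage: random candidates, synthetic gain/cost functions.
random.seed(0)
cands = [ViewPose(random.uniform(-1, 1), random.uniform(-1, 1),
                  random.uniform(-math.pi, math.pi)) for _ in range(50)]
pose = next_best_view(
    cands,
    info_gain=lambda p: 1.0 - abs(p.yaw) / math.pi,   # prefer facing forward
    feasible=lambda p: abs(p.x) < 0.8,                # crude reachability
    motion_cost=lambda p: math.hypot(p.x, p.y),       # distance from base
)
print(pose)
```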
Citations
Proceedings ArticleDOI
03 May 2010
TL;DR: This paper presents a system for autonomous acquisition of visual object representations, which endows a humanoid robot with the ability to enrich its internal object representation and allows the realization of complex visual tasks.
Abstract: The autonomous acquisition of object representations which allow recognition, localization and grasping of objects in the environment is a challenging task, which has proven to be difficult. In this paper, we present a system for autonomous acquisition of visual object representations, which endows a humanoid robot with the ability to enrich its internal object representation and allows the realization of complex visual tasks. More precisely, we present techniques for segmentation and modeling of objects held in the five-fingered robot hand. Multiple object views are generated by rotating the held objects in the robot's field of view. The acquired object representations are evaluated in the context of visual search and object recognition tasks in cluttered environments. Experimental results show successful implementation of the complete cycle from object exploration to object recognition on a humanoid robot.
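The core data structure here is an object model that pools features from many rotated views. A minimal sketch under stated assumptions: a caller-supplied feature extractor stands in for the paper's segmentation-plus-description pipeline, and random arrays stand in for real descriptors.

```python
import numpy as np

class MultiViewObjectModel:
    """Pools local feature descriptors from many views of one object.
    extract_fn maps a (segmented) image to an (N, D) descriptor array."""
    def __init__(self, name, extract_fn):
        self.name = name
        self.extract = extract_fn
        self.views = []                    # one (N_i, D) array per view

    def add_view(self, image):
        self.views.append(self.extract(image))

    def match_score(self, image, thresh=3.0):
        """Fraction of query descriptors with a close match in any
        stored view (the threshold depends on the descriptor used)."""
        query = self.extract(image)
        model = np.vstack(self.views)
        dists = np.linalg.norm(query[:, None, :] - model[None, :, :], axis=2)
        return float(np.mean(dists.min(axis=1) < thresh))

# Toy usage: random arrays stand in for descriptors of a rotating object.
rng = np.random.default_rng(0)
model = MultiViewObjectModel("held_object", lambda img: rng.normal(size=(64, 8)))
for _ in range(8):                         # eight in-hand rotations
    model.add_view(None)
print(model.match_score(None))
```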

52 citations

Proceedings ArticleDOI
01 Nov 2017
TL;DR: This paper introduces a novel inverse reachability map representation that can be used for fast pose generation, combines it with a next-best-view algorithm, and shows that this approach enables the humanoid to efficiently cover room-sized environments with its camera.
Abstract: Covering a known 3D environment with a robot's camera is a commonly required task, for example in inspection and surveillance, mapping, or object search applications. In addition to the problem of finding a complete and efficient set of view points for covering the whole environment, humanoid robots also need to observe balance, energy, and kinematic constraints for reaching the desired view poses. In this paper, we approach this high-dimensional planning problem by introducing a novel inverse reachability map representation that can be used for fast pose generation and combine it with a next-best-view algorithm. We implemented our approach in ROS and tested it with a Nao robot on both simulated and real-world scenes. The experiments show that our approach enables the humanoid to efficiently cover room-sized environments with its camera.
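The inverse reachability idea can be illustrated under strong simplifications (planar base poses, a coarse grid, hypothetical reachable camera offsets): offline, record which camera poses the whole-body controller can realize relative to the base; online, invert each stored offset to enumerate base placements for a desired world camera pose.

```python
import math
from collections import defaultdict

STEP = 0.1  # grid resolution in meters

def build_inverse_map(reachable_offsets):
    """reachable_offsets: camera poses (dx, dy, dyaw) in the base frame
    that the whole-body controller can realize (found offline). The map
    is keyed by a discretized offset for fast lookup."""
    inv = defaultdict(list)
    for dx, dy, dyaw in reachable_offsets:
        inv[(round(dx / STEP), round(dy / STEP))].append((dx, dy, dyaw))
    return inv

def base_poses_for_view(cam_x, cam_y, cam_yaw, inv):
    """Enumerate base poses (x, y, yaw) that put the camera at the
    requested world pose, by inverting every stored offset."""
    out = []
    for offsets in inv.values():
        for dx, dy, dyaw in offsets:
            yaw = cam_yaw - dyaw          # base yaw realizing the view
            # subtract the offset, rotated into the world frame
            out.append((cam_x - (dx * math.cos(yaw) - dy * math.sin(yaw)),
                        cam_y - (dx * math.sin(yaw) + dy * math.cos(yaw)),
                        yaw))
    return out

inv = build_inverse_map([(0.2, 0.0, 0.0), (0.15, 0.1, 0.3)])
print(base_poses_for_view(1.0, 2.0, 0.5, inv))
```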

17 citations

Journal ArticleDOI
TL;DR: An original method for a humanoid robot to build a visual model of unknown objects is proposed; it addresses the problem as an active coupling between computer vision and whole-body posture generation and uses different optimization algorithms to find an optimal viewpoint.
Abstract: An original method to build a visual model for unknown objects by a humanoid robot is proposed. The algorithm ensures successful autonomous realization of this goal by addressing the problem as an active coupling between computer vision and whole-body posture generation. The visual model is built through the repeated execution of two processes. The first one considers the current knowledge about the visual aspects and the shape of the object to deduce a preferred viewpoint with the aim of reducing the uncertainty of the shape and appearance of the object. This is done while considering the constraints related to the embodiment of the vision sensors in the humanoid head. The second process generates a whole robot posture using the desired head pose while solving additional constraints such as collision avoidance and joint limitations. The main contribution of our approach lies in the use of different optimization algorithms to find an optimal viewpoint by including the humanoid specificities in terms of constraints, an embedded vision sensor, and redundant motion capabilities. This approach differs significantly from those of traditional works addressing the problem of autonomously building an object model.
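The repeated two-process execution described above amounts to an alternating loop. The sketch below captures only that control flow with caller-supplied functions; all names are hypothetical, not the paper's API.

```python
def model_building_loop(select_view, generate_posture, execute, update,
                        uncertainty, eps=0.05, max_iters=20):
    """Alternates the two processes: (1) pick a preferred camera viewpoint
    expected to reduce model uncertainty, (2) solve a whole-body posture
    realizing that head pose under collision and joint-limit constraints,
    then observe and refine the object model."""
    rejected = []
    for _ in range(max_iters):
        if uncertainty() < eps:
            return True                   # model is good enough, stop
        view = select_view(rejected)      # process 1: preferred viewpoint
        posture = generate_posture(view)  # process 2: whole-body posture
        if posture is None:               # unreachable under constraints
            rejected.append(view)
            continue
        update(execute(posture))          # observe, update the model
    return uncertainty() < eps
```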

13 citations

Proceedings ArticleDOI
06 Jul 2014
TL;DR: A method for learning specific object representations that can be applied (and reused) in visual detection and identification tasks is proposed; it relies on Cartesian Genetic Programming (CGP) and on object manipulation actions to improve the learned detectors.
Abstract: We propose a method for learning specific object representations that can be applied (and reused) in visual detection and identification tasks. A machine learning technique called Cartesian Genetic Programming (CGP) is used to create these models based on a series of images. Our research investigates how manipulation actions might allow for the development of better visual models and therefore better robot vision. This paper describes how visual object representations can be learned and improved by performing object manipulation actions, such as poke, push, and pick-up, with a humanoid robot. The improvement can be measured and allows the robot to select and perform the 'right' action, i.e. the action with the best possible improvement of the detector.
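Selecting the 'right' action can be framed as a small bandit-style loop over candidate manipulations, choosing the action with the best observed improvement of the detector. This is a hedged sketch of that selection idea only; the paper evolves the detector itself with CGP, for which retrain() merely stands in.

```python
import random

ACTIONS = ["poke", "push", "pick_up"]

def select_and_learn(detector_score, apply_action, retrain, episodes=10, eps=0.2):
    """Track the average detector-score improvement each manipulation
    action has produced so far, and (epsilon-greedily) keep choosing
    the most promising one."""
    gains = {a: [0.0] for a in ACTIONS}
    for _ in range(episodes):
        if random.random() < eps:
            action = random.choice(ACTIONS)          # explore
        else:                                        # exploit best mean gain
            action = max(ACTIONS, key=lambda a: sum(gains[a]) / len(gains[a]))
        before = detector_score()
        apply_action(action)      # poke/push/pick-up to get new object views
        retrain()                 # improve the detector on the new data
        gains[action].append(detector_score() - before)
    return gains

# Toy usage: the detector score creeps up after every retraining step.
random.seed(0)
state = {"score": 0.5}
gains = select_and_learn(
    lambda: state["score"],
    lambda a: None,
    lambda: state.update(score=min(1.0, state["score"] + 0.03)))
print({a: round(sum(g) / len(g), 3) for a, g in gains.items()})
```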

11 citations

Proceedings ArticleDOI
06 May 2013
TL;DR: A novel saliency measure is proposed that includes accuracy requirements from the manipulation task in the saliency calculation, and an iterative procedure based on spherical graphs is developed in order to decide on the best gaze direction.
Abstract: A major strength of humanoid robotics platforms consists in their potential to perform a wide range of manipulation tasks in human-centered environments thanks to their anthropomorphic design. Further, they offer active head-eye systems which allow extending the observable workspace by employing active gaze control. In this work, we address the question of where to look during manipulation tasks while exploiting these two key capabilities of humanoid robots. We present a solution to the gaze selection problem, which takes into account constraints derived from manipulation tasks. Thereby, three different subproblems are addressed: the representation of the acquired visual input, the calculation of saliency based on this representation, and the selection of the most suitable gaze direction. As representation of the visual input, a probabilistic environmental model is discussed, which allows taking into account the dynamic nature of manipulation tasks. At the core of the gaze selection mechanism, a novel saliency measure is proposed that includes accuracy requirements from the manipulation task in the saliency calculation. Finally, an iterative procedure based on spherical graphs is developed in order to decide on the best gaze direction. The feasibility of the approach is experimentally evaluated in the context of bimanual manipulation tasks on the humanoid robot ARMAR-III.
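A drastically simplified version of the gaze selection step: discretize candidate gaze directions (a yaw/pitch grid stands in for the paper's spherical graph and iterative procedure), score each by a task-weighted saliency, and keep the best feasible one. The Gaussian saliency model and head limits below are illustrative assumptions.

```python
import math

def gaze_directions(n_yaw=24, n_pitch=7):
    """Coarse yaw/pitch grid of candidate gaze directions (the paper
    uses a spherical graph; a grid is the simplest stand-in)."""
    return [(-math.pi + 2 * math.pi * i / n_yaw,
             -0.6 + 1.2 * j / (n_pitch - 1))
            for i in range(n_yaw) for j in range(n_pitch)]

def select_gaze(targets, yaw_limit=math.pi / 2, pitch_limit=0.6):
    """Pick the gaze direction maximizing task-weighted saliency.
    targets: (yaw, pitch, weight) triples, where weight encodes the
    accuracy the manipulation task requires for that target."""
    best, best_s = None, -1.0
    for yaw, pitch in gaze_directions():
        if abs(yaw) > yaw_limit or abs(pitch) > pitch_limit:
            continue                       # outside the head's joint range
        s = sum(w * math.exp(-((yaw - ty) ** 2 + (pitch - tp) ** 2))
                for ty, tp, w in targets)
        if s > best_s:
            best, best_s = (yaw, pitch), s
    return best

# Toy usage: two hand targets, the right hand needing higher accuracy.
print(select_gaze([(0.3, -0.2, 2.0), (-0.4, -0.1, 1.0)]))
```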

10 citations

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
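The pipeline in this abstract (feature matching, Hough clustering, least-squares verification) maps closely onto standard tooling. Below is a minimal OpenCV sketch, with the caveat that RANSAC homography fitting replaces Lowe's Hough-plus-least-squares verification as the common off-the-shelf stand-in.

```python
import cv2
import numpy as np

def match_object(query_img, scene_img, ratio=0.75):
    """SIFT matching in the spirit of Lowe's pipeline: detect features,
    ratio-test nearest-neighbor matching, then geometric verification."""
    sift = cv2.SIFT_create()
    kq, dq = sift.detectAndCompute(query_img, None)
    ks, ds = sift.detectAndCompute(scene_img, None)
    pairs = cv2.BFMatcher().knnMatch(dq, ds, k=2)
    # Lowe's ratio test: keep matches clearly better than the runner-up
    good = [m for m, n in pairs if m.distance < ratio * n.distance]
    if len(good) < 4:
        return None                       # too few matches for a homography
    src = np.float32([kq[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([ks[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return None
    return H, int(mask.sum())             # pose hypothesis + inlier count

# Usage (grayscale images):
# result = match_object(cv2.imread("object.png", 0), cv2.imread("scene.png", 0))
```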

46,906 citations

Journal ArticleDOI
TL;DR: This article reviews Markov chain methods for sampling from the posterior distribution of a Dirichlet process mixture model and presents two new classes of methods, which are simple to implement and more efficient than previous ways of handling general Dirichlet process mixture models with non-conjugate priors.
Abstract: This article reviews Markov chain methods for sampling from the posterior distribution of a Dirichlet process mixture model and presents two new classes of methods. One new approach is to make Metropolis-Hastings updates of the indicators specifying which mixture component is associated with each observation, perhaps supplemented with a partial form of Gibbs sampling. The other new approach extends Gibbs sampling for these indicators by using a set of auxiliary parameters. These methods are simple to implement and are more efficient than previous ways of handling general Dirichlet process mixture models with non-conjugate priors.

2,320 citations


"Treasure hunting for humanoids robo..." refers methods in this paper

  • ...The model is estimated by a Gibbs sampling algorithm [23] (specific case of Markov Chain Monte Carlo (MCMC) method)....

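As the quoted sentence notes, the citing paper estimates its model with a Gibbs sampler. The toy below illustrates only the indicator-resampling idea, on a finite Gaussian mixture with known variance; Neal's article treats the harder Dirichlet-process case, so this is a simplified finite analogue, not his algorithm.

```python
import numpy as np

def gibbs_mixture(x, K=3, iters=100, sigma=1.0, tau=3.0, alpha=1.0, seed=0):
    """Gibbs sampling for a finite Gaussian mixture with known variance:
    alternately resample each point's component indicator given the
    component means, then each mean given its assigned points. The prior
    on indicators is symmetric Dirichlet(alpha/K, ..., alpha/K)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    z = rng.integers(K, size=n)                    # component indicators
    mu = rng.normal(0.0, tau, size=K)              # component means
    for _ in range(iters):
        for i in range(n):                         # resample indicators
            counts = np.bincount(np.delete(z, i), minlength=K)
            logp = (np.log(counts + alpha / K)
                    - 0.5 * (x[i] - mu) ** 2 / sigma**2)
            p = np.exp(logp - logp.max())
            z[i] = rng.choice(K, p=p / p.sum())
        for k in range(K):                         # resample means
            xs = x[z == k]
            prec = 1 / tau**2 + len(xs) / sigma**2
            mean = (xs.sum() / sigma**2) / prec
            mu[k] = rng.normal(mean, 1 / np.sqrt(prec))
    return z, mu

# Toy usage: two well-separated clusters; the means should land near ±3.
x = np.concatenate([np.random.default_rng(1).normal(-3, 1, 50),
                    np.random.default_rng(2).normal(3, 1, 50)])
print(np.round(gibbs_mixture(x)[1], 2))
```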

Proceedings ArticleDOI
01 Dec 2001
TL;DR: This paper presents a method for combining multiple images of a 3D object into a single model representation that provides for recognition of 3D objects from any viewpoint, the generalization of models to non-rigid changes, and improved robustness through the combination of features acquired under a range of imaging conditions.
Abstract: There have been important recent advances in object recognition through the matching of invariant local image features. However, the existing approaches are based on matching to individual training images. This paper presents a method for combining multiple images of a 3D object into a single model representation. This provides for recognition of 3D objects from any viewpoint, the generalization of models to non-rigid changes, and improved robustness through the combination of features acquired under a range of imaging conditions. The decision of whether to cluster a training image into an existing view representation or to treat it as a new view is based on the geometric accuracy of the match to previous model views. A new probabilistic model is developed to reduce the false positive matches that would otherwise arise due to loosened geometric constraints on matching 3D and non-rigid models. A system has been developed based on these approaches that is able to robustly recognize 3D objects in cluttered natural images in sub-second times.
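The key decision in this method, merging a training image into an existing model view or starting a new view based on geometric match accuracy, fits in a short sketch; match_error and the pixel threshold are hypothetical stand-ins for the paper's probabilistic criterion.

```python
def integrate_view(model_views, new_features, match_error, max_err=5.0):
    """Decide whether a training image joins the closest existing model
    view or starts a new view. match_error(view, features) returns a
    mean reprojection error in pixels, or None when no consistent
    geometric match exists."""
    best_i, best_e = None, float("inf")
    for i, view in enumerate(model_views):
        e = match_error(view, new_features)
        if e is not None and e < best_e:
            best_i, best_e = i, e
    if best_i is not None and best_e < max_err:
        model_views[best_i].extend(new_features)   # merge into that view
        return best_i
    model_views.append(list(new_features))         # treat as a new view
    return len(model_views) - 1
```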

582 citations


"Treasure hunting for humanoids robo..." refers background in this paper

  • ...Many existing works focus on the environment exploration [4] or object recognition problems [5]....


Journal ArticleDOI
01 Feb 1995
TL;DR: A survey of research in the area of vision sensor planning is presented, and a brief description of representative sensing strategies for the tasks of object recognition and scene reconstruction are presented.
Abstract: A survey of research in the area of vision sensor planning is presented. The problem can be summarized as follows: given information about the environment as well as information about the task that the vision system is to accomplish, develop strategies to automatically determine sensor parameter values that achieve this task with a certain degree of satisfaction. With such strategies, sensor parameter values can be selected and can be purposefully changed in order to effectively perform the task at hand. The focus here is on vision sensor planning for the task of robustly detecting object features. For this task, camera and illumination parameters such as position, orientation, and optical settings are determined so that object features are, for example, visible, in focus, within the sensor field of view, magnified as required, and imaged with sufficient contrast. References to, and a brief description of, representative sensing strategies for the tasks of object recognition and scene reconstruction are also presented. For these tasks, sensor configurations are sought that will prove most useful when trying to identify an object or reconstruct a scene.
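A concrete taste of the constraints this survey covers: given a feature point in the camera frame, test whether it is in front of the camera, inside the field of view, and within the in-focus depth range. All numeric limits below are illustrative assumptions, not values from the survey.

```python
import math

def feature_ok(p_cam, hfov=math.radians(60), vfov=math.radians(45),
               near=0.3, far=2.0):
    """Generate-and-test style feasibility check for one feature point
    p_cam = (x, y, z) in the camera frame, in meters."""
    x, y, z = p_cam
    if z <= 0:
        return False                     # behind the camera
    if abs(math.atan2(x, z)) > hfov / 2:
        return False                     # outside horizontal field of view
    if abs(math.atan2(y, z)) > vfov / 2:
        return False                     # outside vertical field of view
    return near <= z <= far              # inside the in-focus depth range

print(feature_ok((0.1, -0.05, 1.0)), feature_ok((1.5, 0.0, 1.0)))
```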

493 citations


"Treasure hunting for humanoids robo..." refers background in this paper

  • ...Hypothesis and limits of such works are detailed in these two surveys: [9] and [10]....


Book ChapterDOI
07 May 2006
TL;DR: The results show that color descriptors remain reliable under photometric and geometrical changes, and with decreasing image quality, and for all experiments a combination of color and shape outperforms a pure shape-based approach.
Abstract: Although color is commonly experienced as an indispensable quality in describing the world around us, state-of-the-art local feature-based representations are mostly based on shape description and ignore color information. The description of color is hampered by the large amount of variations which causes the measured color values to vary significantly. In this paper we aim to extend the description of local features with color information. To accomplish a wide applicability of the color descriptor, it should be robust to: (1) photometric changes commonly encountered in the real world, and (2) varying image quality, from high-quality images to snapshot photo quality and compressed internet images. Based on these requirements we derive a set of color descriptors. The set of proposed descriptors are compared by extensive testing on multiple application areas, namely matching, retrieval and classification, and on a wide variety of image qualities. The results show that color descriptors remain reliable under photometric and geometrical changes, and with decreasing image quality. For all experiments a combination of color and shape outperforms a pure shape-based approach.

488 citations


"Treasure hunting for humanoids robo..." refers methods in this paper

  • ...We also produce visual words based on color information by clustering color descriptors [22]....

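The quoted sentence builds visual words by clustering color descriptors. The standard bag-of-words construction it alludes to looks like the sketch below (scikit-learn k-means on pooled descriptors; the descriptor dimensionality and vocabulary size are arbitrary here, and random arrays stand in for real color descriptors).

```python
import numpy as np
from sklearn.cluster import KMeans

def build_color_vocabulary(descriptors, n_words=100, seed=0):
    """Quantize local color descriptors into 'visual words' via k-means.
    descriptors: (N, D) array pooled from many training images."""
    return KMeans(n_clusters=n_words, n_init=10, random_state=seed).fit(descriptors)

def bag_of_words(vocab, image_descriptors):
    """L1-normalized histogram of word occurrences for one image."""
    words = vocab.predict(image_descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Toy usage with random stand-in descriptors.
rng = np.random.default_rng(0)
vocab = build_color_vocabulary(rng.normal(size=(2000, 9)), n_words=20)
print(bag_of_words(vocab, rng.normal(size=(150, 9))).round(3))
```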