
Showing papers on "Object model published in 2015"


Proceedings ArticleDOI
07 Jun 2015
TL;DR: This paper proposes an efficient discriminative object model that identifies potentially distracting regions in advance and adapts the object representation beforehand, so that distractors are suppressed and the risk of drifting is significantly reduced, while allowing an efficient implementation for real-time online object tracking.
Abstract: In this paper, we address the problem of model-free online object tracking based on color representations. According to the findings of recent benchmark evaluations, such trackers often tend to drift towards regions which exhibit a similar appearance compared to the object of interest. To overcome this limitation, we propose an efficient discriminative object model which allows us to identify potentially distracting regions in advance. Furthermore, we exploit this knowledge to adapt the object representation beforehand so that distractors are suppressed and the risk of drifting is significantly reduced. We evaluate our approach on recent online tracking benchmark datasets demonstrating state-of-the-art results. In particular, our approach performs favorably both in terms of accuracy and robustness compared to recent tracking algorithms. Moreover, the proposed approach allows for an efficient implementation to enable online object tracking in real-time.
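A minimal sketch of the kind of discriminative color model described above: object and surrounding regions are summarized by color histograms, the per-bin likelihood ratio scores pixels in new frames, and bins that also respond strongly inside previously identified distractor regions are down-weighted. The function names, bin layout and suppression weighting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def color_hist(pixels, bins=16):
    """RGB pixels (N, 3) with values in [0, 255] -> normalized joint histogram (bins**3,)."""
    idx = (pixels // (256 // bins)).astype(int)
    flat = idx[:, 0] * bins * bins + idx[:, 1] * bins + idx[:, 2]
    h = np.bincount(flat, minlength=bins ** 3).astype(float)
    return h / max(h.sum(), 1e-9)

def build_model(obj_pixels, surround_pixels, distractor_pixels, lam=0.5):
    """Object-vs-surround likelihood ratio per bin, with distractor-like bins suppressed."""
    h_obj = color_hist(obj_pixels)
    h_sur = color_hist(surround_pixels)
    h_dis = color_hist(distractor_pixels)
    ratio = h_obj / (h_obj + h_sur + 1e-9)        # discriminative object likelihood
    suppress = h_dis / (h_obj + h_dis + 1e-9)     # how "distractor-like" each bin is
    return ratio * (1.0 - lam * suppress)         # adapted object representation

def score_pixels(pixels, model, bins=16):
    """Per-pixel object score used to localize the target in a new frame."""
    idx = (pixels // (256 // bins)).astype(int)
    flat = idx[:, 0] * bins * bins + idx[:, 1] * bins + idx[:, 2]
    return model[flat]
```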

366 citations


Proceedings ArticleDOI
07 Jun 2015
TL;DR: A new framework is presented - task-oriented modeling, learning and recognition - which aims at understanding the underlying functions, physics and causality in using objects as “tools”; from this perspective, any object can be viewed as a hammer or a shovel.
Abstract: In this paper, we present a new framework - task-oriented modeling, learning and recognition - which aims at understanding the underlying functions, physics and causality in using objects as “tools”. Given a task, such as cracking a nut or painting a wall, we represent each object, e.g. a hammer or brush, in a generative spatio-temporal representation consisting of four components: i) an affordance basis to be grasped by hand; ii) a functional basis to act on a target object (the nut); iii) the imagined actions with typical motion trajectories; and iv) the underlying physical concepts, e.g. force, pressure, etc. In a learning phase, our algorithm observes only one RGB-D video, in which a rational human picks up one object (i.e. tool) among a number of candidates to accomplish the task. From this example, our algorithm learns the essential physical concepts in the task (e.g. forces in cracking nuts). In an inference phase, our algorithm is given a new set of objects (daily objects or stones), and picks the best choice available together with the inferred affordance basis, functional basis, imagined human actions (sequence of poses), and the expected physical quantity that it will produce. From this new perspective, any object can be viewed as a hammer or a shovel, and object recognition is not merely memorizing typical appearance examples for each category but reasoning about the physical mechanisms in various tasks to achieve generalization.

163 citations


Proceedings ArticleDOI
Suha Kwak, Minsu Cho, Ivan Laptev, Jean Ponce, Cordelia Schmid
07 Dec 2015
TL;DR: In this article, the authors address the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision, and formulate the problem as a combination of two complementary processes: discovery and tracking.
Abstract: This paper addresses the problem of automatically localizing dominant objects as spatio-temporal tubes in a noisy collection of videos with minimal or even no supervision. We formulate the problem as a combination of two complementary processes: discovery and tracking. The first one establishes correspondences between prominent regions across videos, and the second one associates similar object regions within the same video. Interestingly, our algorithm also discovers the implicit topology of frames associated with instances of the same object class across different videos, a role normally left to supervisory information in the form of class labels in conventional image and video understanding methods. Indeed, as demonstrated by our experiments, our method can handle video collections featuring multiple object classes, and substantially outperforms the state of the art in colocalization, even though it tackles a broader problem with much less supervision.

126 citations


Patent
27 Apr 2015
TL;DR: In this paper, a 3D printing system comprises a coarse 3D printing interface to form a 3D object core and a fine 3D printing interface to form a 3D object shell around at least some of the core.
Abstract: In at least some examples, a three-dimensional (3D) printing system comprises a coarse 3D printing interface to form a 3D object core. The 3D printing system also comprises a fine 3D printing interface to form a 3D object shell around at least some of the 3D object core. The 3D printing system also comprises a controller to receive a dataset corresponding to a 3D object model and to direct the coarse 3D printing interface to form the 3D object core based on the dataset.

115 citations


Journal ArticleDOI
TL;DR: This work gradually extends the successful deformable part model to include viewpoint information and part-level 3D geometry information, resulting in several different models with different levels of expressiveness, which provide consistently better joint object localization and viewpoint estimation than the state-of-the-art multi-view and 3D object detectors on various benchmarks.
Abstract: As objects are inherently 3D, they were modeled in 3D in the early days of computer vision. Due to the ambiguities arising from mapping 2D features to 3D models, 3D object representations have since been neglected and 2D feature-based models are the predominant paradigm in object detection nowadays. While such models have achieved outstanding bounding box detection performance, they come with limited expressiveness, as they are clearly limited in their capability of reasoning about 3D shape or viewpoints. In this work, we bring the worlds of 3D and 2D object representations closer by building an object detector which leverages the expressive power of 3D object representations while at the same time being robustly matchable to image evidence. To that end, we gradually extend the successful deformable part model [1] to include viewpoint information and part-level 3D geometry information, resulting in several different models with different levels of expressiveness. We end up with a 3D object model, consisting of multiple object parts represented in 3D and a continuous appearance model. We experimentally verify that our models, while providing richer object hypotheses than the 2D object models, provide consistently better joint object localization and viewpoint estimation than the state-of-the-art multi-view and 3D object detectors on various benchmarks (KITTI [2], 3D object classes [3], Pascal3D+ [4], Pascal VOC 2007 [5], EPFL multi-view cars [6]).

90 citations


Journal ArticleDOI
TL;DR: A feature fusion method via multi-modal graph learning for view-based 3D object retrieval is proposed; experiments demonstrate its superior performance compared to state-of-the-art approaches.

81 citations


Patent
17 Sep 2015
TL;DR: In this paper, a method for converting 2D video to 3D video using 3D object models is presented, where the object models are obtained from 3D scanner data; planes, polygons, or surfaces may be fit to this data to generate a 3D model.
Abstract: Method for converting 2D video to 3D video using 3D object models. Embodiments of the invention obtain a 3D object model for one or more objects in a 2D video scene, such as a character. Object models may for example be derived from 3D scanner data; planes, polygons, or surfaces may be fit to this data to generate a 3D model. In each frame in which a modeled object appears, the location and orientation of the 3D model may be determined in the frame, and a depth map for the object may be generated from the model. 3D video may be generated using the depth map. Embodiments may use feature tracking to automatically determine object location and orientation. Embodiments may use rigged 3D models with degrees of freedom to model objects with parts that move relative to one another.
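The patent leaves the final step, generating 3D video from the per-object depth map, at a high level. One common way to do it is depth-image-based rendering, where each pixel is shifted horizontally by a disparity proportional to its inverse depth; the rough sketch below assumes arbitrary camera units and leaves disocclusion holes unfilled.

```python
import numpy as np

def render_stereo_view(image, depth, baseline_px=20.0):
    """Shift each pixel by a disparity inversely proportional to its depth.

    image: (H, W, 3) uint8, depth: (H, W) positive depths (arbitrary units).
    Returns a crude right-eye view; holes are left black (real systems inpaint them).
    """
    h, w = depth.shape
    disparity = (baseline_px / np.maximum(depth, 1e-6)).astype(int)
    out = np.zeros_like(image)
    xs = np.arange(w)
    for y in range(h):
        new_x = np.clip(xs - disparity[y], 0, w - 1)
        out[y, new_x] = image[y, xs]
    return out
```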

67 citations


Patent
12 Feb 2015
TL;DR: In this article, a system and a method for real-time monitoring and identification of defects occurring in a 3D object built via an additive manufacturing process are presented, where a plurality of functional tool heads possessing freedom of motion in arbitrary planes and approaches are automatically and independently controlled based on a feedback analysis from the printing process.
Abstract: The present invention provides a system and a method for real-time monitoring and identification of defects occurring in a three-dimensional object built via an additive manufacturing process. Further, the present invention provides in-situ correction of such defects by a plurality of functional tool heads possessing freedom of motion in arbitrary planes and approaches, where the functional tool heads are automatically and independently controlled based on a feedback analysis from the printing process, implementing analysis techniques. Furthermore, the present invention provides a mechanism for analyzing defect data collected from detection devices and correcting tool path instructions and the object model in-situ during construction of a 3D object. A build report is also generated that displays, in 3D space, the structural geometry and inherent properties of the final built object along with the features of corrected and uncorrected defects. Advantageously, the build report helps in improving the 3D printing process for subsequent objects.

65 citations


Patent
27 Apr 2015
TL;DR: In this paper, a 3D printing head is used to fabricate a first portion of the 3D object by forming a plurality of successive layers of a first material, and a delivery head is also used for fabricating a second portion by dispensing onto the first part a continuous-fiber reinforced second material.
Abstract: An apparatus for forming a three-dimensional (3D) object includes a 3D printing head for fabricating a first portion of the 3D object by forming a plurality of successive layers of a first material. The apparatus also includes a delivery head for fabricating a second portion of the 3D object by dispensing onto the first portion of the 3D object a plurality of layers of a continuous-fiber reinforced second material. Further, the delivery head comprises a roller for pressing the continuous-fiber reinforced second material into place during the dispensing thereof. A controller controls the 3D printing head and the delivery head to cooperatively form the 3D object, based on a dataset corresponding to a 3D object model.

64 citations


Posted Content
TL;DR: This work presents an object tracker that is not limited to a local search window and can efficiently probe the entire frame, providing improved robustness for fast moving objects as well as for ultra low-frame-rate videos.
Abstract: Most tracking-by-detection methods employ a local search window around the predicted object location in the current frame, assuming the previous location is accurate, the trajectory is smooth, and the computational capacity permits a search radius that can accommodate the maximum speed yet is small enough to reduce mismatches. These assumptions, however, are not always valid, in particular for fast and irregularly moving objects. Here, we present an object tracker that is not limited to a local search window and can efficiently probe the entire frame. Our method generates a small number of "high-quality" proposals by a novel instance-specific objectness measure and evaluates them against the object model, which can be adopted from an existing tracking-by-detection approach as a core tracker. During the tracking process, we update the object model concentrating on hard false positives supplied by the proposals, which help suppress distractors caused by difficult background clutter, and learn how to re-rank proposals according to the object model. Since we significantly reduce the number of hypotheses the core tracker evaluates, we can use richer object descriptors and a stronger detector. Our method outperforms most recent state-of-the-art trackers on popular tracking benchmarks, and provides improved robustness for fast moving objects as well as for ultra low-frame-rate videos.
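A hedged sketch of the re-ranking and hard-negative update loop the abstract outlines: proposals are scored against a running appearance template, the best-scoring one becomes the new target location, and high-scoring non-overlapping proposals are treated as hard false positives that push the model away from distractors. The cosine-similarity features, thresholds and learning rates are assumptions for illustration, not the paper's actual objectness measure or core tracker.

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / max(area, 1e-9)

def track_frame(proposal_feats, proposal_boxes, obj_model, lr=0.1):
    """Rank proposals against the object model, pick the best, and mine hard negatives.

    proposal_feats: (N, D) features of the "high-quality" proposals for this frame.
    obj_model: (D,) running template of the target appearance.
    """
    scores = np.array([cosine(f, obj_model) for f in proposal_feats])
    best = int(np.argmax(scores))

    # Hard false positives: proposals scoring almost as high as the winner
    # but not overlapping it; they drive the update that suppresses distractors.
    hard_neg = [i for i in range(len(scores))
                if i != best and scores[i] > 0.8 * scores[best]
                and iou(proposal_boxes[i], proposal_boxes[best]) < 0.3]

    new_model = obj_model + lr * (proposal_feats[best] - obj_model)
    for i in hard_neg:
        new_model -= lr * 0.5 * proposal_feats[i]
    return proposal_boxes[best], new_model
```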

64 citations


Journal ArticleDOI
TL;DR: The proposed SPCNN-RBOR method overcomes the drawback of feature-based methods, which inevitably include background information in local invariant feature descriptors when keypoints are located near object boundaries, and is robust to diverse complex variations, even under partial occlusion and in highly cluttered environments.
Abstract: In this paper, we propose a region-based object recognition (RBOR) method to identify objects in complex real-world scenes. First, the proposed method performs color image segmentation by a simplified pulse-coupled neural network (SPCNN) for the object model image and the test image, and then conducts a region-based matching between them. Hence, we name it RBOR with SPCNN (SPCNN-RBOR). The values of the SPCNN parameters are set automatically for each object model by our previously proposed method. In order to reduce the effects of varying light intensity and to take advantage of the high resolution of the SPCNN at low intensities for optimized color segmentation, a transformation integrating normalized Red Green Blue (RGB) with opponent color spaces is introduced. A novel image segmentation strategy is suggested to group the pixels firing synchronously throughout all the transformed channels of an image. Based on the segmentation results, a series of adaptive thresholds, which are adjustable according to the specific object model, is employed to remove outlier region blobs, form potential clusters, and refine the clusters in test images. The proposed SPCNN-RBOR method overcomes the drawback of feature-based methods, which inevitably include background information in local invariant feature descriptors when keypoints are located near object boundaries. A large number of experiments have shown that the proposed SPCNN-RBOR method is robust to diverse complex variations, even under partial occlusion and in highly cluttered environments. In addition, the SPCNN-RBOR method works well in identifying not only textured objects but also less-textured ones, and significantly outperforms current feature-based methods.
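The color transformation step can be illustrated with the standard normalized-RGB and opponent color definitions; the paper's exact transform and channel weighting may differ, so treat this as an assumption-laden sketch.

```python
import numpy as np

def color_channels(img):
    """Stack normalized-RGB and opponent channels for an (H, W, 3) RGB image.

    Textbook definitions are used here; the combination actually used in the
    SPCNN-RBOR pipeline may differ.
    """
    img = img.astype(float)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    s = r + g + b + 1e-9
    nr, ng, nb = r / s, g / s, b / s                 # normalized RGB (intensity invariant)
    o1 = (r - g) / np.sqrt(2.0)                      # opponent channels
    o2 = (r + g - 2.0 * b) / np.sqrt(6.0)
    o3 = (r + g + b) / np.sqrt(3.0)
    return np.stack([nr, ng, nb, o1, o2, o3], axis=-1)
```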

Proceedings ArticleDOI
01 Jan 2015
TL;DR: A novel approach that can track human hands in interaction with unknown objects, with accuracy close to that of [2], although the latter assumes that the object model is known a priori.
Abstract: The analysis and the understanding of object manipulation scenarios based on computer vision techniques can be greatly facilitated if we can gain access to the full articulation of the manipulating hands and the 3D pose of the manipulated objects. Currently, there exist methods for tracking hands in interaction with objects whose 3D models are known [2]. There are also methods that can reconstruct 3D models of objects that are partially observable in each frame of a sequence [3]. However, no method can track hands in interaction with unknown objects, i.e., objects whose 3D model is not known a priori. In this paper we propose a novel approach that can track human hands in interaction with unknown objects. As illustrated in Fig. 1, the input to the method is a sequence of RGBD frames showing the interaction of one or two hands with an unknown object. Starting with the raw depth map (left) we perform a pre-processing step and compute the scene point cloud. We employ an appropriately modified model-based hand tracker [4] and temporal information to track the hand 3D positions and posture (middle bottom). In this process, a progressively built object model is also taken into account to cope with hand-object occlusions. We use the estimated fingertip positions of the hand to segment the manipulated object from the rest of the scene (middle top). The segmented object points are used to update the object position and orientation in the current frame and are integrated into the object 3D representation (right). More specifically, the work flow of the proposed approach consists of five main components linked together as shown in Fig. 2. At a first, preprocessing stage, the raw depth information from the sensor is prepared to enter the pipeline. A point cloud is computed along with the normals for each vertex. Then, the user's hands are tracked in the scene. An articulated model for the left and right hands, with 26 degrees of freedom each, is fit to the pre-processed depth input. The current, possibly incomplete (or even empty, for the first frame) object model is incorporated into hand tracking to assist in handling hand/object occlusions. Using the computed 3D location of the user's hands as well as the last position of the (possibly incomplete) object model, the region of the object is segmented in the input depth map. The hands are masked out from the observation by comparing it to the rendered hand models. Object tracking is achieved using a multi-scale ICP [1]. The segmented object depth is used for a coarse-to-fine alignment with the (partially reconstructed) object model. Finally, the segmented and aligned depth data of the object are merged with the current, partial 3D model. The object's 3D model is maintained in a voxel grid with a Truncated Signed Distance Function (TSDF) [3] representation.

Table 1: Hand tracking accuracy (in cm, mean/median error) measured on the synthetic datasets.

Experiment         | Proposed    | [2], GT model | [2], Scanned model
Single hand, cat   | 0.42 / 0.39 | 0.47 / 0.43   | 0.45 / 0.43
Single hand, spray | 0.65 / 0.63 | 0.70 / 0.53   | 0.63 / 0.47
Two hands, cat     | 0.38 / 0.34 | 0.33 / 0.31   | 0.44 / 0.39
Two hands, spray   | 0.59 / 0.44 | 0.51 / 0.38   | 0.62 / 0.41

The accuracy of the method is close to that of [2], although the latter assumes that the object model is known a priori.
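The object-tracking component relies on ICP alignment of the segmented object points against the partial model. Below is a single-scale point-to-point ICP sketch (the paper uses a multi-scale variant [1]); correspondence selection, weighting and convergence checks are simplified.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=30):
    """Rigid alignment of src (N, 3) onto dst (M, 3) with point-to-point ICP.

    Returns (R, t) such that src @ R.T + t approximates its matches in dst.
    Single-scale and unweighted; a sketch, not the paper's multi-scale version.
    """
    tree = cKDTree(dst)
    R, t = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)                 # nearest-neighbor correspondences
        matched = dst[idx]
        mu_s, mu_d = cur.mean(axis=0), matched.mean(axis=0)
        H = (cur - mu_s).T @ (matched - mu_d)    # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        R_step = Vt.T @ U.T
        if np.linalg.det(R_step) < 0:            # avoid reflections
            Vt[-1] *= -1
            R_step = Vt.T @ U.T
        t_step = mu_d - R_step @ mu_s
        cur = cur @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step   # accumulate the total transform
    return R, t
```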

Journal ArticleDOI
TL;DR: By accurately modeling the object geometry using polygonal lines instead of a 3-D box and by separating position and speed tracking from geometry tracking at the estimator level, the proposed solution combines the efficiency of the rigid model with the benefits of a flexible object model.
Abstract: In this paper we present a stereovision-based approach for tracking multiple objects in crowded environments where, typically, the road lane markings are not visible and the surrounding infrastructure is not known. The proposed technique relies on measurement data provided by an intermediate occupancy grid derived from processing a stereovision-based elevation map and on free-form object delimiters extracted from this grid. Unlike other existing methods that track rigid objects using equally rigid representations, we present a particle filter-based solution for tracking visual appearance-based free-form obstacle representations. At each step, the particle state is described by two components, i.e., the object's dynamic parameters and its estimated geometry. In order to solve the high-dimensional state-space problem, a Rao-Blackwellized particle filter is used. By accurately modeling the object geometry using polygonal lines instead of a 3-D box and, at the same time, separating the position and speed tracking from the geometry tracking at the estimator level, the proposed solution combines the efficiency of the rigid model with the benefits of a flexible object model.
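For readers unfamiliar with the estimator, the sketch below shows a bare-bones particle filter predict/weight/resample cycle for a position/velocity state. It deliberately omits the Rao-Blackwellized part, in which each particle additionally carries an analytically updated geometry estimate, and all noise parameters are made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, measurement, dt=0.1,
                         proc_std=0.5, meas_std=1.0):
    """One predict/weight/resample cycle for a state [x, y, vx, vy].

    measurement: observed (x, y) position, e.g. the centroid of an object delimiter.
    """
    # Predict: constant-velocity motion plus process noise.
    particles[:, 0:2] += dt * particles[:, 2:4]
    particles += rng.normal(0.0, proc_std, particles.shape)

    # Weight: Gaussian likelihood of the measurement under each particle.
    d2 = np.sum((particles[:, 0:2] - measurement) ** 2, axis=1)
    weights = weights * np.exp(-0.5 * d2 / meas_std ** 2) + 1e-12
    weights /= weights.sum()

    # Resample when the effective sample size degenerates.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
        idx = rng.choice(len(weights), size=len(weights), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights
```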

Proceedings ArticleDOI
07 Dec 2015
TL;DR: In this article, the authors propose a probabilistic framework for learning category-specific object size distributions from available annotations and leverage these in conjunction with amodal completions to infer veridical sizes of objects in novel images.
Abstract: We consider the problem of enriching current object detection systems with veridical object sizes and relative depth estimates from a single image. There are several technical challenges to this, such as occlusions, lack of calibration data and the scale ambiguity between object size and distance. These have not been addressed in full generality in previous work. Here we propose to tackle these issues by building upon advances in object recognition and using recently created large-scale datasets. We first introduce the task of amodal bounding box completion, which aims to infer the full extent of the object instances in the image. We then propose a probabilistic framework for learning category-specific object size distributions from available annotations and leverage these in conjunction with amodal completions to infer veridical sizes of objects in novel images. Finally, we introduce a focal length prediction approach that exploits scene recognition to overcome inherent scale ambiguities and demonstrate qualitative results on challenging real-world scenes.
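The scale ambiguity mentioned above follows from the pinhole relation pixel_height ≈ focal_length · real_height / depth: a category-level size prior plus a predicted focal length pins down depth. A tiny illustration (units and numbers are assumed, not from the paper):

```python
def infer_depth(pixel_height, real_height_m, focal_px):
    """Pinhole relation: pixel_height ≈ focal_px * real_height / depth.

    With a category-level size prior (real_height_m) and an estimated focal
    length in pixels, the object's depth follows; without either, size and
    depth are only determined up to scale.
    """
    return focal_px * real_height_m / pixel_height

# Example: an amodally completed person box 300 px tall, assumed 1.7 m tall,
# with a predicted focal length of 1000 px, sits roughly 5.7 m from the camera.
print(infer_depth(300.0, 1.7, 1000.0))
```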

Proceedings ArticleDOI
17 Dec 2015
TL;DR: This work presents a flexible system to reconstruct 3D models of objects captured with an RGB-D sensor; it allows the user to acquire a full 3D model of the object, and the resulting models can be directly used by state-of-the-art object instance recognition and object tracking modules.
Abstract: This work presents a flexible system to reconstruct 3D models of objects captured with an RGB-D sensor. A major advantage of the method is that, unlike other modelling tools, our reconstruction pipeline allows the user to acquire a full 3D model of the object. This is achieved by acquiring several partial 3D models in different sessions, each presenting the object of interest in a different configuration that reveals occluded parts of the object; the partial models are then automatically merged together to reconstruct a full 3D model. In addition, the 3D models acquired by our system can be directly used by state-of-the-art object instance recognition and object tracking modules, providing object-perception capabilities to complex applications requiring these functionalities (e.g. human-object interaction analysis, robot grasping, etc.). The system does not impose constraints on the appearance of objects (textured, untextured) nor on the modelling setup (moving camera with static object or turn-table setups with static camera). The proposed reconstruction system has been used to model a large number of objects, resulting in metrically accurate and visually appealing 3D models.

Posted Content
TL;DR: In this paper, the authors describe a method by which a robot can acquire an object model by capturing depth imagery of the object as a human moves it through its range of motion.
Abstract: Many functional elements of human homes and workplaces consist of rigid components which are connected through one or more sliding or rotating linkages. Examples include doors and drawers of cabinets and appliances; laptops; and swivel office chairs. A robotic mobile manipulator would benefit from the ability to acquire kinematic models of such objects from observation. This paper describes a method by which a robot can acquire an object model by capturing depth imagery of the object as a human moves it through its range of motion. We envision that in future, a machine newly introduced to an environment could be shown by its human user the articulated objects particular to that environment, inferring from these "visual demonstrations" enough information to actuate each object independently of the user. Our method employs sparse (markerless) feature tracking, motion segmentation, component pose estimation, and articulation learning; it does not require prior object models. Using the method, a robot can observe an object being exercised, infer a kinematic model incorporating rigid, prismatic and revolute joints, then use the model to predict the object's motion from a novel vantage point. We evaluate the method's performance, and compare it to that of a previously published technique, for a variety of household objects.
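One building block of such articulation learning is deciding, from the estimated relative poses of a moving part, whether the joint is revolute or prismatic and along which axis it acts. The sketch below is a simplified stand-in for that step; the threshold and the axis-averaging scheme are assumptions, not the authors' estimator.

```python
import numpy as np

def rotation_angle_axis(R):
    """Angle and unit axis of a 3x3 rotation matrix."""
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    n = np.linalg.norm(axis)
    return angle, axis / n if n > 1e-9 else np.zeros(3)

def classify_joint(rel_rotations, rel_translations, angle_thresh=0.1):
    """Label a part's motion as 'revolute' or 'prismatic' and return its axis.

    rel_rotations: list of 3x3 rotations of the part relative to its first pose.
    rel_translations: list of 3-vectors of the corresponding translations.
    """
    angles_axes = [rotation_angle_axis(R) for R in rel_rotations]
    max_angle = max(a for a, _ in angles_axes)
    if max_angle > angle_thresh:
        # Revolute: average the observed rotation axes, weighted by rotation angle.
        axis = sum(a * ax for a, ax in angles_axes)
        return "revolute", axis / (np.linalg.norm(axis) + 1e-9)
    # Prismatic: dominant direction of the translations via PCA (SVD).
    T = np.asarray(rel_translations, float)
    T = T - T.mean(axis=0)
    _, _, Vt = np.linalg.svd(T)
    return "prismatic", Vt[0]
```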

Journal ArticleDOI
TL;DR: This is the first work to jointly explore view-based and model-based relevance among 3D objects in a graph-based framework; experiments demonstrate the effectiveness of the proposed 3D object retrieval method in terms of retrieval accuracy.
Abstract: 3D object retrieval has attracted extensive research efforts and become an important task in recent years. However, how to measure the relevance between 3D objects is still a difficult issue. Most of the existing methods employ just the model-based or view-based approaches, which may lead to incomplete information for 3D object representation. In this paper, we propose to jointly learn the view-model relevance among 3D objects for retrieval, in which the 3D objects are formulated in different graph structures. With the view information, the multiple views of 3D objects are employed to formulate the 3D object relationship in an object hypergraph structure. With the model data, the model-based features are extracted to construct an object graph to describe the relationship among the 3D objects. Learning on the two graphs is conducted to estimate the relevance among the 3D objects, in which the view/model graph weights can also be optimized in the learning process. This is the first work to jointly explore the view-based and model-based relevance among 3D objects in a graph-based framework. The proposed method has been evaluated on three datasets. The experimental results and the comparison with state-of-the-art methods demonstrate the effectiveness of the proposed 3D object retrieval method in terms of retrieval accuracy.

Journal ArticleDOI
TL;DR: An efficient approach capable of learning and recognizing object categories in an interactive and open-ended manner is presented; the system is able to interact with human users and learns new object categories continuously over time.
Abstract: 3D object detection and recognition is increasingly used for manipulation and navigation tasks in service robots. It involves segmenting the objects present in a scene, estimating a feature descriptor for the object view and, finally, recognizing the object view by comparing it to the known object categories. This paper presents an efficient approach capable of learning and recognizing object categories in an interactive and open-ended manner. In this paper, “open-ended” implies that the set of object categories to be learned is not known in advance. The training instances are extracted from on-line experiences of a robot, and thus become gradually available over time, rather than at the beginning of the learning process. This paper focuses on two state-of-the-art questions: (1) How to automatically detect, conceptualize and recognize objects in 3D scenes in an open-ended manner? (2) How to acquire and use high-level knowledge obtained from the interaction with human users, namely when they provide category labels, in order to improve the system performance? This approach starts with a pre-processing step to remove irrelevant data and prepare a suitable point cloud for the subsequent processing. Clustering is then applied to detect object candidates, and object views are described based on a 3D shape descriptor called spin-image. Finally, a nearest-neighbor classification rule is used to predict the categories of the detected objects. A leave-one-out cross validation algorithm is used to compute precision and recall, in a classical off-line evaluation setting, for different system parameters. Also, an on-line evaluation protocol is used to assess the performance of the system in an open-ended setting. Results show that the proposed system is able to interact with human users, learning new object categories continuously over time.
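The instance-based, open-ended flavor of the classifier can be sketched as a category memory that grows as a user teaches new labeled views and answers queries with a nearest-neighbor rule. The class name, descriptor length and distance metric here are illustrative assumptions; the paper uses spin-image descriptors and its own evaluation protocol.

```python
import numpy as np

class OpenEndedMemory:
    """Instance-based category memory: categories are added and extended online."""

    def __init__(self):
        self.instances = {}          # category label -> list of stored descriptors

    def teach(self, label, descriptor):
        """Store a new labeled object view (e.g., provided by a human user)."""
        self.instances.setdefault(label, []).append(np.asarray(descriptor, float))

    def classify(self, descriptor):
        """Nearest-neighbor rule over all stored views; (None, inf) if nothing is known."""
        descriptor = np.asarray(descriptor, float)
        best_label, best_dist = None, np.inf
        for label, views in self.instances.items():
            d = min(np.linalg.norm(descriptor - v) for v in views)
            if d < best_dist:
                best_label, best_dist = label, d
        return best_label, best_dist

memory = OpenEndedMemory()
memory.teach("mug", np.random.rand(153))      # 153-D spin-image-sized descriptor
memory.teach("book", np.random.rand(153))
print(memory.classify(np.random.rand(153)))
```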

Proceedings ArticleDOI
17 Dec 2015
TL;DR: The technique combines the benefits of simple, adaptive robot grippers (which can grasp successfully without prior knowledge of the hand or the object model) with an advanced machine learning technique (Random Forests) to discriminate between different object classes.
Abstract: In this paper we present a methodology for discriminating between different objects using only a single force closure grasp with an underactuated robot hand equipped with force sensors. The technique combines the benefits of simple, adaptive robot grippers (which can grasp successfully without prior knowledge of the hand or the object model) with an advanced machine learning technique (Random Forests). Unlike prior work in the literature, the proposed methodology does not require object exploration, release or re-grasping and works for arbitrary object positions and orientations within the reach of a grasp. A two-fingered compliant, underactuated robot hand is controlled in an open-loop fashion to grasp objects with various shapes, sizes and stiffness. The Random Forests classification technique is used to discriminate between different object classes. The feature space used consists only of the actuator positions and the force sensor measurements at two specific time instances of the grasping process. A feature-variable importance calculation procedure facilitates the identification of the most crucial features, yielding the minimum number of sensors required. The efficiency of the proposed method is validated with two experimental paradigms involving two sets of fabricated model objects with different shapes, sizes and stiffness, and a set of everyday objects.
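A hedged sketch of the classification step using scikit-learn's Random Forests on a grasp feature vector (actuator positions plus force readings at two time instants). The synthetic data generator and feature layout are invented for illustration; only the classifier choice follows the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Toy stand-in for the paper's features: 2 actuator positions + 2 force readings,
# each sampled at two instants of the grasp -> 8 numbers per grasp.
def fake_grasp(object_class):
    base = np.array([0.2, 0.2, 1.0, 1.0, 0.5, 0.5, 2.0, 2.0]) * (object_class + 1)
    return base + rng.normal(0.0, 0.1, 8)

X = np.array([fake_grasp(c) for c in range(3) for _ in range(50)])
y = np.array([c for c in range(3) for _ in range(50)])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([fake_grasp(1)]))          # -> class 1
print(clf.feature_importances_)              # which sensor readings matter most
```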

Posted Content
TL;DR: This paper proposes a joint solution that tackles semantic object and part segmentation simultaneously, in which higher object-level context is provided to guide part segmentation, and more detailed part-level localization is utilized to refine object segmentation.
Abstract: Segmenting semantic objects from images and parsing them into their respective semantic parts are fundamental steps towards detailed object understanding in computer vision. In this paper, we propose a joint solution that tackles semantic object and part segmentation simultaneously, in which higher object-level context is provided to guide part segmentation, and more detailed part-level localization is utilized to refine object segmentation. Specifically, we first introduce the concept of semantic compositional parts (SCP) in which similar semantic parts are grouped and shared among different objects. A two-channel fully convolutional network (FCN) is then trained to provide the SCP and object potentials at each pixel. At the same time, a compact set of segments can also be obtained from the SCP predictions of the network. Given the potentials and the generated segments, in order to explore long-range context, we finally construct an efficient fully connected conditional random field (FCRF) to jointly predict the final object and part labels. Extensive evaluation on three different datasets shows that our approach can mutually enhance the performance of object and part segmentation, and outperforms the current state-of-the-art on both tasks.

Proceedings ArticleDOI
07 Jun 2015
TL;DR: The 3D object class detection method consists of several stages gradually enriching the object detection output with object viewpoint, keypoints and 3D shape estimates, which achieves state-of-the-art performance in simultaneous 2D bounding box and viewpoint estimation on the challenging Pascal3D+ dataset.
Abstract: Object class detection has been a synonym for 2D bounding box localization for the longest time, fueled by the success of powerful statistical learning techniques, combined with robust image representations. Only recently has there been a growing interest in revisiting the promise of computer vision from the early days: to precisely delineate the contents of a visual scene, object by object, in 3D. In this paper, we draw from recent advances in object detection and 2D-3D object lifting in order to design an object class detector that is particularly tailored towards 3D object class detection. Our 3D object class detection method consists of several stages, gradually enriching the object detection output with object viewpoint, keypoints and 3D shape estimates. Following careful design, each stage consistently improves performance, and the full method achieves state-of-the-art performance in simultaneous 2D bounding box and viewpoint estimation on the challenging Pascal3D+ [50] dataset.

Proceedings ArticleDOI
17 Dec 2015
TL;DR: A system for depalletizing and a complete pipeline for detecting and localizing objects, as well as verifying that the found object does not deviate from the known object model, e.g., because it is not the object to be picked.
Abstract: Depalletizing is a challenging task for manipulation robots. Key to successful application are not only the robustness of the approach, but also achievable cycle times in order to keep up with the rest of the process. In this paper, we propose a system for depalletizing and a complete pipeline for detecting and localizing objects, as well as verifying that the found object does not deviate from the known object model, e.g., because it is not the object to be picked. In order to achieve high robustness (e.g., with respect to different lighting conditions) and generality with respect to the objects to pick, our approach is based on multi-resolution surfel models. All components (both software and hardware) allow operation at high frame rates and, thus, allow for low cycle times. In experiments, we demonstrate depalletizing of automotive and other prefabricated parts with both high reliability (w.r.t. success rates) and efficiency (w.r.t. low cycle times).

Patent
16 Feb 2015
TL;DR: In this article, a system and a method for optimizing printing parameters, such as slicing parameters and tool path instructions, for additive manufacturing is presented, which includes a property analysis module that predicts and analyses properties of a filament object model, representing a constructed 3D object.
Abstract: The present invention relates to a system and a method for optimizing printing parameters, such as slicing parameters and tool path instructions, for additive manufacturing. The present invention comprises a property analysis module that predicts and analyses properties of a filament object model, representing a constructed 3D object. The filament object model is generated based on the tool path instructions and user specified object properties. Analysis includes comparing the predicted filament object model properties with the user specified property requirements; and further modifying the printing parameters in order to meet the user specified property requirements.

Journal ArticleDOI
Guofeng Wang, Bin Wang, Fan Zhong, Xueying Qin, Baoquan Chen
TL;DR: A new method based on global optimization for searching 3D–2D correspondence between a known 3D object model and 2D scene edges in an image is proposed, which performs favorably compared to the state-of-the-art methods in highly cluttered backgrounds.
Abstract: Tracking the position and orientation of a textureless 3D object is a considerably challenging problem, for which a 3D model is commonly used. The 3D–2D correspondence between a known 3D object model and 2D scene edges in an image is standardly used to locate the 3D object, and establishing it is one of the most important problems in model-based 3D object tracking. State-of-the-art methods solve this problem by searching correspondences independently. However, this often fails in highly cluttered backgrounds, owing to the presence of numerous local minima. To overcome this problem, we propose a new method based on global optimization for searching these correspondences. With our search mechanism, a graph model based on an energy function is used to establish the relationship of the candidate correspondences. Then, the optimal correspondences can be efficiently searched with dynamic programming. Qualitative and quantitative experimental results demonstrate that the proposed method performs favorably compared to the state-of-the-art methods in highly cluttered backgrounds.
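The dynamic-programming idea can be illustrated on a simplified chain-structured version of the problem: each model point has several candidate image-edge correspondences, unary costs score each candidate, and a pairwise term discourages large jumps between consecutive choices; the globally optimal assignment is then found by a Viterbi-style recursion. The energy terms below are placeholders, not the paper's graph model.

```python
import numpy as np

def dp_correspondences(unary, candidates, smooth_weight=1.0):
    """Choose one candidate correspondence per model point along a chain.

    unary: (N, K) matching cost of each of K candidates for each of N points.
    candidates: (N, K, 2) image positions of the candidates.
    The pairwise cost penalizes distance jumps between consecutive choices.
    Returns the index of the chosen candidate for each point.
    """
    n, k = unary.shape
    cost = unary[0].copy()
    back = np.zeros((n, k), dtype=int)
    for i in range(1, n):
        # pair[j_prev, j] = distance between consecutive candidate positions
        diff = candidates[i - 1][:, None, :] - candidates[i][None, :, :]
        pair = smooth_weight * np.linalg.norm(diff, axis=2)
        total = cost[:, None] + pair                  # (K_prev, K)
        back[i] = np.argmin(total, axis=0)
        cost = unary[i] + np.min(total, axis=0)
    # Backtrack the optimal assignment.
    choice = np.empty(n, dtype=int)
    choice[-1] = int(np.argmin(cost))
    for i in range(n - 1, 0, -1):
        choice[i - 1] = back[i, choice[i]]
    return choice
```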

Patent
19 Feb 2015
TL;DR: In this paper, a method of interacting with a virtual object in an augmented reality space is described, which involves identifying a physical location of a device in at least one image of the AR space and generating for display a control coincident with a surface of the device.
Abstract: The technology disclosed relates to a method of interacting with a virtual object. In particular, it relates to referencing a virtual object in an augmented reality space, identifying a physical location of a device in at least one image of the augmented reality space, generating for display a control coincident with a surface of the device, sensing interactions between at least one control object and the control coincident with the surface of the device, and generating data signaling manipulations of the control coincident with the surface of the device.

Patent
16 Feb 2015
TL;DR: In this paper, an object recognition ingestion system is presented, which combines the canonical shape object with the image data to create a model of the object and then generates recognition descriptors from each of the model PoVs, which are combined into key frame bundles having sufficient information to allow other computing devices to recognize the object at a later time.
Abstract: An object recognition ingestion system is presented. The object ingestion system captures image data of objects, possibly in an uncontrolled setting. The image data is analyzed to determine if one or more a priori known canonical shape objects match the object represented in the image data. The canonical shape object also includes one or more reference PoVs indicating perspectives from which to analyze objects having the corresponding shape. An object ingestion engine combines the canonical shape object with the image data to create a model of the object. The engine generates a desirable set of model PoVs from the reference PoVs, and then generates recognition descriptors from each of the model PoVs. The descriptors, image data, model PoVs, or other contextually relevant information are combined into key frame bundles having sufficient information to allow other computing devices to recognize the object at a later time.

Journal ArticleDOI
TL;DR: A stand-alone convolution surface-based modeling approach to model complex heterogeneous objects with multi-functional heterogeneities, entailing stratified sub-analytic boundary-representation, convolution material primitives, membership functions and material-potential functions is presented.
Abstract: The possibility to attain diverse applications from heterogeneous objects calls for a generic and systematic modeling approach for design, analysis and rapid manufacturing of heterogeneous objects. The available heterogeneous object modeling techniques model simple material-distributions only, and just a few of them are capable of modeling heterogeneous objects with complex geometries. Even these approaches have, at times, shown some glitches while modeling complex objects with compound and irregular material variations. This paper unfolds the development of a stand-alone convolution surface-based modeling approach to model complex heterogeneous objects with multi-functional heterogeneities, entailing stratified sub-analytic boundary-representation, convolution material primitives, membership functions and material-potential functions. One-dimensional (associative and non-associative) and compound two- and three-dimensional material-distribution schemas are formulated and outlined to model simple, compound and irregular material-distributions in simple/complex geometry objects. The paper also illustrates a few examples of modeling complex heterogeneous objects by implementing the approach using specialized languages and software tools. In summary, a material convolution surface-based approach is presented for modeling complex heterogeneous objects; complex one-dimensional material-distributions are modeled with material primitives and field functions; a schema for compound and irregular heterogeneities in two and three dimensions is formulated and outlined; and a few examples of complex heterogeneous object modeling are reported to validate the proposed approach.

Patent
Jianfeng Ren, Feng Guo, Ruiduo Yang
18 Aug 2015
TL;DR: In this paper, a method performed by an electronic device is described, which includes obtaining a first frame of a scene and performing object recognition of at least one object within a first bounding region of the first frame.
Abstract: A method performed by an electronic device is described. The method includes obtaining a first frame of a scene. The method also includes performing object recognition of at least one object within a first bounding region of the first frame. The method further includes performing object tracking of the at least one object within the first bounding region of the first frame. The method additionally includes determining a second bounding region of a second frame based on the object tracking. The second frame is subsequent to the first frame. The method also includes determining whether the second bounding region is valid based on a predetermined object model.

Book ChapterDOI
01 Jan 2015
TL;DR: Windows Communication Foundation is the name of the API designed specifically for the process of building distributed systems, providing a single, unified, and extendable programming object model that you can use to interact with a number of previously diverse distributed technologies.
Abstract: Windows Communication Foundation (WCF) is the name of the API designed specifically for the process of building distributed systems. Unlike other specific distributed APIs you might have used in the past (e.g., DCOM, .NET remoting, XML web services, message queuing), WCF provides a single, unified, and extendable programming object model that you can use to interact with a number of previously diverse distributed technologies.

Proceedings ArticleDOI
17 Dec 2015
TL;DR: Results show that the proposed system with concurrent learning of object categories and codebooks is capable of learning more categories, requiring fewer examples, and with similar accuracy, when compared to the classical Bag of Words approach using codebooks constructed offline.
Abstract: In open-ended domains, robots must continuously learn new object categories. When the training sets are created offline, it is not possible to ensure their representativeness with respect to the object categories and features the system will find when operating online. In the Bag of Words model, visual codebooks are usually constructed from training sets created offline. This might lead to non-discriminative visual words and, as a consequence, to poor recognition performance. This paper proposes a visual object recognition system which concurrently learns, in an incremental and online fashion, both the visual object category representations and the codebook words used to encode them. The codebook is defined using Gaussian Mixture Models which are updated using new object views. The approach bears similarities to the human visual object recognition system: evidence suggests that the development of recognition capabilities occurs on multiple levels and is sustained over large periods of time. Results show that the proposed system with concurrent learning of object categories and codebooks is capable of learning more categories, requiring fewer examples, and with similar accuracy, when compared to the classical Bag of Words approach using codebooks constructed offline.
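A much-simplified illustration of an incrementally learned codebook: each visual word keeps a running mean and variance (hard nearest-word assignment, diagonal covariance), and new words are spawned for features far from all existing ones. This is a stand-in for the paper's Gaussian-mixture update, with the distance threshold and update rule as assumptions.

```python
import numpy as np

class OnlineCodebook:
    """Visual words as running Gaussians, updated incrementally from new object views."""

    def __init__(self, dim, new_word_dist=2.0):
        self.means, self.vars, self.counts = [], [], []
        self.dim, self.new_word_dist = dim, new_word_dist

    def update(self, features):
        """Add a batch of local descriptors (N, dim) from a newly seen object view."""
        for f in np.asarray(features, float):
            if not self.means:
                self._add(f)
                continue
            d = [np.linalg.norm(f - m) for m in self.means]
            j = int(np.argmin(d))
            if d[j] > self.new_word_dist:
                self._add(f)                      # far from all words: spawn a new word
            else:                                 # Welford-style running mean/variance
                self.counts[j] += 1
                delta = f - self.means[j]
                self.means[j] += delta / self.counts[j]
                self.vars[j] += (delta * (f - self.means[j]) - self.vars[j]) / self.counts[j]

    def _add(self, f):
        self.means.append(f.copy())
        self.vars.append(np.ones(self.dim))
        self.counts.append(1)

    def encode(self, features):
        """Bag-of-words histogram of a view against the current codebook."""
        hist = np.zeros(len(self.means))
        for f in np.asarray(features, float):
            j = int(np.argmin([np.linalg.norm(f - m) for m in self.means]))
            hist[j] += 1
        return hist / max(hist.sum(), 1e-9)
```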