Journal ArticleDOI
3D attention-driven depth acquisition for object identification
Kai Xu, Yifei Shi, Lintao Zheng, Junyu Zhang, Min Liu, Hui Huang, Hao Su, Daniel Cohen-Or, Baoquan Chen
- Vol. 35, Iss: 6, pp 238
TL;DR: A 3D Attention Model is developed that selects the best views to scan from, as well as the most informative regions in each view to focus on, to achieve efficient object recognition; the region-level attention yields focus-driven features that are quite robust against object occlusion.
Abstract:
We address the problem of autonomously exploring unknown objects in a scene by consecutive depth acquisitions. The goal is to reconstruct the scene while online identifying the objects from among a large collection of 3D shapes. Fine-grained shape identification demands a meticulous series of observations attending to varying views and parts of the object of interest. Inspired by the recent success of attention-based models for 2D recognition, we develop a 3D Attention Model that selects the best views to scan from, as well as the most informative regions in each view to focus on, to achieve efficient object recognition. The region-level attention leads to focus-driven features which are quite robust against object occlusion. The attention model, trained with the 3D shape collection, encodes the temporal dependencies among consecutive views with deep recurrent networks. This facilitates order-aware view planning accounting for robot movement cost. In achieving instance identification, the shape collection is organized into a hierarchy, associated with pre-trained hierarchical classifiers. The effectiveness of our method is demonstrated on an autonomous robot (PR) that explores a scene and identifies the objects to construct a 3D scene model.
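The idea of order-aware view planning that accounts for robot movement cost can be illustrated with a toy sketch. This is not the paper's recurrent attention model: it is a greedy stand-in that scores each candidate view by an assumed information gain minus a weighted movement cost. The names `plan_views`, `move_cost`, `info_gain`, and `cost_weight`, the four view positions, and all numeric values are hypothetical, and the gains are kept fixed for simplicity, whereas a real system would update them after each scan.

```python
import math

def move_cost(a, b):
    """Shortest angular distance (radians) between two azimuths on a circle."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def plan_views(start, candidates, info_gain, cost_weight=0.5, budget=3):
    """Greedily pick up to `budget` views.

    candidates: dict mapping view name -> azimuth in radians (hypothetical layout)
    info_gain:  dict mapping view name -> assumed expected gain (made-up values)
    """
    current = start
    remaining = dict(candidates)
    plan = []
    while remaining and len(plan) < budget:
        # Score = assumed gain minus weighted travel from the current position.
        best = max(
            remaining,
            key=lambda v: info_gain[v] - cost_weight * move_cost(current, remaining[v]),
        )
        plan.append(best)
        current = remaining.pop(best)
    return plan

views = {"front": 0.0, "left": math.pi / 2, "back": math.pi, "right": 3 * math.pi / 2}
gains = {"front": 0.2, "left": 0.9, "back": 1.0, "right": 0.4}

print(plan_views(0.0, views, gains, cost_weight=0.5))  # ['front', 'left', 'back']
print(plan_views(0.0, views, gains, cost_weight=0.0))  # ['back', 'left', 'right']
```

With the movement cost switched off, the planner jumps straight to the highest-gain view; with the cost enabled, it visits nearby views first and reaches the same high-gain view by a shorter path.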
Citations
Journal ArticleDOI
3D2SeqViews: Aggregating Sequential Views for 3D Global Feature Learning by CNN With Hierarchical Attention Aggregation
Zhizhong Han, Hong-Lei Lu, Zhenbao Liu, Chi-Man Vong, Yu-Shen Liu, Matthias Zwicker, Junwei Han, C. L. Philip Chen
TL;DR: 3D to Sequential Views (3D2SeqViews) is proposed to more effectively aggregate the sequential views using convolutional neural networks with a novel hierarchical attention aggregation to resolve the discriminability of learned features.
Journal ArticleDOI
Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey
TL;DR: This survey paper provides a comprehensive background to the developed techniques according to a taxonomy based on the scene understanding tasks, and summarizes the performance metrics used for evaluation in different tasks and a quantitative comparison among the recent state-of-the-art techniques.
Journal ArticleDOI
A multi-view recurrent neural network for 3D mesh segmentation
Truc Le, Giang Bui, Ye Duan
TL;DR: A multi-view recurrent neural network (MV-RNN) approach for 3D mesh segmentation that combines the convolutional neural networks and a two-layer long short term memory to yield coherent segmentation of 3D shapes is introduced.
Journal ArticleDOI
Language-driven synthesis of 3D scenes from scene databases
Rui Ma, Akshay Gadi Patil, Matthew Fisher, Manyi Li, Sören Pirk, Binh-Son Hua, Sai-Kit Yeung, Xin Tong, Leonidas J. Guibas, Hao Zhang
TL;DR: A novel framework for using natural language to generate and edit 3D indoor scenes, harnessing scene semantics and text-scene grounding knowledge learned from large annotated 3D scene databases is introduced.
Journal ArticleDOI
View planning in robot active vision: A survey of systems, algorithms, and applications
TL;DR: Some basic concepts of active robot vision are summarized, and representative work on systems, algorithms, and applications is reviewed from four perspectives: object reconstruction, scene reconstruction, object recognition, and pose estimation.
References
Journal ArticleDOI
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Journal ArticleDOI
Gradient-based learning applied to document recognition
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Journal ArticleDOI
ImageNet classification with deep convolutional neural networks
TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Journal ArticleDOI
Control of goal-directed and stimulus-driven attention in the brain
TL;DR: Evidence for partially segregated networks of brain areas that carry out different attentional functions is reviewed, finding that one system is involved in preparing and applying goal-directed selection for stimuli and responses, and the other is specialized for the detection of behaviourally relevant stimuli.
Journal ArticleDOI
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units that are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.
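As a rough illustration of the last entry, the update rule for a single Bernoulli-logistic stochastic unit can be sketched as follows. This is a minimal sketch under stated assumptions, not code from any of the listed papers: the function names `sigmoid` and `reinforce_step`, the reward function, the learning rate, and the baseline are all hypothetical choices. The weight change is the learning rate times (reward minus baseline) times the unit's characteristic eligibility, which for this kind of unit is `(y - p) * x`.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def reinforce_step(w, x, reward_fn, rng, lr=0.1, baseline=0.0):
    """One update for a single Bernoulli-logistic unit.

    w, x:      weight and input lists of equal length
    reward_fn: hypothetical function mapping the sampled output y to a scalar reward
    rng:       any object with a random() method returning a float in [0, 1)
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))  # firing probability
    y = 1 if rng.random() < p else 0                   # stochastic output
    r = reward_fn(y)
    # Eligibility of weight i is (y - p) * x_i; no gradient estimate is formed explicitly.
    new_w = [wi + lr * (r - baseline) * (y - p) * xi for wi, xi in zip(w, x)]
    return new_w, y, r

# Toy task (assumed): reward 1 when the unit fires, 0 otherwise.
rng = random.Random(0)
w = [0.0]
for _ in range(500):
    w, _, _ = reinforce_step(w, [1.0], lambda y: float(y), rng)
print(sigmoid(w[0]))  # firing probability after training
```

In this toy task the weight only ever moves when the unit fires and is rewarded, so the firing probability drifts upward from its initial value of 0.5.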