Mining actionlet ensemble for action recognition with depth cameras

doi:10.1109/CVPR.2012.6247813

Proceedings Article

Jiang Wang, +3 more

- pp 1290-1297

TLDR

An actionlet ensemble model is learnt to represent each action and to capture the intra-class variance, and novel features that are suitable for depth data are proposed.

Abstract:

Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities of dealing with this problem but also present some unique challenges. The depth maps captured by the depth cameras are very noisy and the 3D positions of the tracked joints may be completely wrong if serious occlusions occur, which increases the intra-class variations in the actions. In this paper, an actionlet ensemble model is learnt to represent each action and to capture the intra-class variance. In addition, novel features that are suitable for depth data are proposed. They are robust to noise, invariant to translational and temporal misalignments, and capable of characterizing both the human motion and the human-object interactions. The proposed approach is evaluated on two challenging action recognition datasets captured by commodity depth cameras, and another dataset captured by a MoCap system. The experimental evaluations show that the proposed approach achieves superior performance to the state of the art algorithms.
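
The abstract's claim of invariance to translational misalignment can be illustrated with a small sketch. It assumes one common choice for skeleton data, namely pairwise relative joint positions (the difference between every pair of 3D joints in a frame). The abstract itself does not spell out the feature definition, so the function names and the toy three-joint frame below are hypothetical and serve only to show why such differences do not change when the whole skeleton is shifted.

```python
from itertools import combinations


def pairwise_relative_positions(joints):
    """Return p_i - p_j for every joint pair (i < j) in a single frame.

    `joints` is a list of (x, y, z) tuples. This is an illustrative feature,
    not necessarily the exact one used in the paper.
    """
    feats = []
    for (xi, yi, zi), (xj, yj, zj) in combinations(joints, 2):
        feats.append((xi - xj, yi - yj, zi - zj))
    return feats


def translate(joints, offset):
    """Shift every joint by the same (dx, dy, dz) offset."""
    dx, dy, dz = offset
    return [(x + dx, y + dy, z + dz) for x, y, z in joints]


# Hypothetical toy frame with three joints (coordinates are made up).
frame = [(0.0, 1.0, 2.0), (0.5, 1.5, 2.0), (1.0, 0.0, 2.5)]
shifted = translate(frame, (2.0, -1.0, 4.0))

original_feats = pairwise_relative_positions(frame)
shifted_feats = pairwise_relative_positions(shifted)

# Compare with a small tolerance, since real coordinates are floating point.
same = all(
    abs(a - b) < 1e-9
    for fa, fb in zip(original_feats, shifted_feats)
    for a, b in zip(fa, fb)
)
```

Because the common offset cancels in each difference, `same` holds for any translation of the skeleton. This sketch covers only the translational part; it says nothing about temporal misalignment or noise.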

Citations

Proceedings Article

Going deeper with convolutions

Christian Szegedy, +8 more

TL;DR: This paper proposes Inception, a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Proceedings Article

Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

Sijie Yan, +2 more

TL;DR: This paper proposes a novel model of dynamic skeletons called Spatial-Temporal Graph Convolutional Networks (ST-GCN), which moves beyond the limitations of previous methods by automatically learning both the spatial and temporal patterns from data.

Proceedings Article

Hierarchical recurrent neural network for skeleton based action recognition

Yong Du, +2 more

TL;DR: This paper proposes an end-to-end hierarchical RNN for skeleton based action recognition, and demonstrates that the model achieves the state-of-the-art performance with high computational efficiency.

Journal Article

Enhanced Computer Vision With Microsoft Kinect Sensor: A Review

Jungong Han, +3 more

- 25 Jun 2013 -

IEEE Transactions on Systems, Man, and C...

TL;DR: A comprehensive review of recent Kinect-based computer vision algorithms and applications covering topics including preprocessing, object tracking and recognition, human activity analysis, hand gesture analysis, and indoor 3-D mapping.

Posted Content

NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

Amir Shahroudy, +3 more

- 11 Apr 2016 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this paper, a large-scale dataset for RGB+D human action recognition was introduced with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects.

References

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Proceedings Article

Fast algorithms for mining association rules

Rakesh Agrawal, +1 more

TL;DR: Two new algorithms for mining association rules that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that they outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

Book

Discrete-Time Signal Processing

Alan V. Oppenheim, +1 more

TL;DR: In this paper, the authors provide a thorough treatment of the fundamental theorems and properties of discrete-time linear systems, filtering, sampling, and discrete time Fourier analysis.

Proceedings Article

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Svetlana Lazebnik, +2 more

TL;DR: This paper presents a method for recognizing scene categories based on approximate global geometric correspondence that exceeds the state of the art on the Caltech-101 database and achieves high accuracy on a large database of fifteen natural scene categories.

Proceedings Article

Learning realistic human actions from movies

Ivan Laptev, +3 more

TL;DR: A new method for video classification is presented that builds upon and extends several recent ideas, including local space-time features, space-time pyramids, and multi-channel non-linear SVMs, and it is shown to improve state-of-the-art results on the standard KTH action dataset.

Related Papers (5)

Action recognition based on a bag of 3D points

View invariant human action recognition using histograms of 3D joints

Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group

Real-time human pose recognition in parts from single depth images

Hierarchical recurrent neural network for skeleton based action recognition