RGB-D-based action recognition datasets
TLDR
In this article, a comprehensive review of the most commonly used action recognition related RGB-D video datasets, including 27 single-view, 10 multi-view and 7 multi-person datasets, is presented.About:
This article is published in Pattern Recognition.The article was published on 2016-12-01 and is currently open access. It has received 244 citations till now.read more
Citations
More filters
Journal ArticleDOI
Recent advances in convolutional neural networks
Jiuxiang Gu,Zhenhua Wang,Jason Kuen,Lianyang Ma,Amir Shahroudy,Bing Shuai,Ting Liu,Xingxing Wang,Gang Wang,Jianfei Cai,Tsuhan Chen +10 more
TL;DR: A broad survey of the recent advances in convolutional neural networks can be found in this article, where the authors discuss the improvements of CNN on different aspects, namely, layer design, activation function, loss function, regularization, optimization and fast computation.
Posted Content
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
TL;DR: In this paper, a large-scale dataset for RGB+D human action recognition was introduced with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects.
Proceedings ArticleDOI
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
TL;DR: A large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects is introduced and a new recurrent neural network structure is proposed to model the long-term temporal correlation of the features for each body part, and utilize them for better action classification.
Posted Content
Recent Advances in Convolutional Neural Networks
Jiuxiang Gu,Zhenhua Wang,Jason Kuen,Lianyang Ma,Amir Shahroudy,Bing Shuai,Ting Liu,Xingxing Wang,Li Wang,Gang Wang,Jianfei Cai,Tsuhan Chen +11 more
TL;DR: This paper details the improvements of CNN on different aspects, including layer design, activation function, loss function, regularization, optimization and fast computation, and introduces various applications of convolutional neural networks in computer vision, speech and natural language processing.
Journal ArticleDOI
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
TL;DR: This work introduces a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames, and investigates a novel one-shot 3D activity recognition problem on this dataset.
References
More filters
Proceedings ArticleDOI
Large-Scale Video Classification with Convolutional Neural Networks
TL;DR: This work studies multiple approaches for extending the connectivity of a CNN in time domain to take advantage of local spatio-temporal information and suggests a multiresolution, foveated architecture as a promising way of speeding up the training.
Proceedings ArticleDOI
Actions as space-time shapes
TL;DR: The method is fast, does not require video alignment and is applicable in many scenarios where the background is known, and the robustness of the method is demonstrated to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action and low quality video.
Proceedings ArticleDOI
ActivityNet: A large-scale video benchmark for human activity understanding
TL;DR: This paper introduces ActivityNet, a new large-scale video benchmark for human activity understanding that aims at covering a wide range of complex human activities that are of interest to people in their daily living.
Proceedings ArticleDOI
Hierarchical recurrent neural network for skeleton based action recognition
Yong Du,Wei Wang,Liang Wang +2 more
TL;DR: This paper proposes an end-to-end hierarchical RNN for skeleton based action recognition, and demonstrates that the model achieves the state-of-the-art performance with high computational efficiency.