Skeleton-based action recognition with convolutional neural networks

doi:10.1109/ICMEW.2017.8026285

Proceedings Article


Chao Li, +3 more

pp. 597-600


TLDR

This paper proposes a novel convolutional neural network (CNN) based framework for both action classification and action detection from skeleton data, including a window proposal network that extracts temporal segment proposals, which are then classified within the same network.

Abstract:

Current state-of-the-art approaches to skeleton-based action recognition are mostly based on recurrent neural networks (RNN). In this paper, we propose a novel convolutional neural network (CNN) based framework for both action classification and detection. Raw skeleton coordinates as well as skeleton motion are fed directly into the CNN for label prediction. A novel skeleton transformer module is designed to rearrange and select important skeleton joints automatically. With a simple 7-layer network, we obtain 89.3% accuracy on the validation set of the NTU RGB+D dataset. For action detection in untrimmed videos, we develop a window proposal network to extract temporal segment proposals, which are further classified within the same network. On the recent PKU-MMD dataset, we achieve 93.7% mAP, surpassing the baseline by a large margin.
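The two inputs and the joint-rearranging step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define "skeleton motion" or the internals of the skeleton transformer, so the sketch assumes motion is the frame-to-frame difference of joint coordinates and that the transformer is a linear map from N input joints to M output joints. The function names, the random weight matrix (which a real network would learn), and the sizes T, N, M are hypothetical.

```python
import numpy as np

def skeleton_motion(coords: np.ndarray) -> np.ndarray:
    """Frame-to-frame difference of joint coordinates (assumed definition).

    coords has shape (T, N, 3): T frames, N joints, xyz coordinates.
    The output keeps the same shape; the last frame is zero-padded.
    """
    motion = np.zeros_like(coords)
    motion[:-1] = coords[1:] - coords[:-1]
    return motion

def skeleton_transform(coords: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Linearly mix N input joints into M output joints (assumed form).

    weight has shape (N, M). A permutation-like weight reorders joints,
    and near-zero columns effectively drop them.
    """
    # (T, N, 3) combined with (N, M) gives (T, M, 3)
    return np.einsum("tnc,nm->tmc", coords, weight)

rng = np.random.default_rng(0)
T, N, M = 32, 25, 30                      # hypothetical sizes
coords = rng.standard_normal((T, N, 3))   # stand-in for real skeleton data
weight = rng.standard_normal((N, M))      # would be learned in a network

motion = skeleton_motion(coords)
joints = skeleton_transform(coords, weight)
# The two arrays a CNN would consume under these assumptions.
cnn_inputs = (joints, skeleton_transform(motion, weight))
```

Passing an identity matrix as `weight` returns the input unchanged, which is a convenient sanity check for the joint-mixing step.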

Citations


Proceedings Article

Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

Sijie Yan, +2 more

TL;DR: This work proposes a novel model of dynamic skeletons called Spatial-Temporal Graph Convolutional Networks (ST-GCN), which moves beyond the limitations of previous methods by automatically learning both the spatial and temporal patterns from data.


Proceedings Article

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition

Lei Shi, +3 more

TL;DR: This paper proposes a two-stream adaptive graph convolutional network (2s-AGCN) that models both first-order and second-order information simultaneously, which notably improves recognition accuracy.
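The summary above does not define its terms. One common reading, assumed here, is that first-order information means joint coordinates and second-order information means bone vectors between connected joints. The sketch below derives bones from joints under that assumption; the function name, the `parents` array, and the three-joint chain are hypothetical.

```python
import numpy as np

def bones_from_joints(joints: np.ndarray, parents) -> np.ndarray:
    """Bone vector for each joint: the joint minus its parent joint.

    joints has shape (T, N, 3). parents[i] is the index of joint i's
    parent; the root lists itself, so its bone is the zero vector.
    """
    return joints - joints[:, parents, :]

# One frame of a three-joint chain: 0 -> 1 -> 2 (hypothetical skeleton).
joints = np.array([[[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0],
                    [1.0, 2.0, 0.0]]])
parents = [0, 0, 1]
bones = bones_from_joints(joints, parents)
```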


Proceedings Article

Skeleton-Based Action Recognition With Directed Graph Neural Networks

Lei Shi, +3 more

TL;DR: A novel directed graph neural network is designed to extract information about joints, bones, and their relations, and to make predictions from the extracted features. Tested on two large-scale datasets, NTU-RGBD and Skeleton-Kinetics, it exceeds state-of-the-art performance on both.


Proceedings Article

Skeleton-Based Action Recognition With Shift Graph Convolutional Network

Ke Cheng, +5 more

TL;DR: The proposed Shift-GCN, composed of novel shift graph operations and lightweight point-wise convolutions, notably exceeds state-of-the-art methods with more than 10 times less computational complexity.


Proceedings Article

Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation

Chao Li, +3 more

TL;DR: This paper introduces a global spatial aggregation scheme that learns better joint co-occurrence features than local aggregation, and consistently outperforms other state-of-the-art methods on action recognition and detection benchmarks such as NTU RGB+D, SBU Kinect Interaction, and PKU-MMD.


References


Journal Article

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 01 Jun 2017 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.


Posted Content

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 04 Jun 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Faster R-CNN introduces a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.


Proceedings Article

Two-Stream Convolutional Networks for Action Recognition in Videos

Karen Simonyan, +1 more

TL;DR: This work proposes a two-stream ConvNet architecture which incorporates spatial and temporal networks and demonstrates that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.


Proceedings Article

Maxout Networks

Ian Goodfellow, +4 more

TL;DR: This work defines a simple new model called maxout, designed both to facilitate optimization by dropout and to improve the accuracy of dropout's fast approximate model averaging technique.
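As a rough illustration of the maxout idea, a maxout unit outputs the maximum over a set of k affine functions of its input. The sketch below is a minimal NumPy version, not the authors' implementation; the function name, argument shapes, and example weights are hypothetical.

```python
import numpy as np

def maxout(x: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Maxout layer: element-wise max over k affine pieces.

    x: (batch, d_in), W: (k, d_in, d_out), b: (k, d_out).
    Returns an array of shape (batch, d_out).
    """
    # Compute all k affine maps at once: (batch, k, d_out)
    z = np.einsum("bi,kio->bko", x, W) + b
    # Keep the largest piece for every output unit
    return z.max(axis=1)

# With pieces +x and -x, maxout reproduces the absolute value.
W = np.stack([np.eye(2), -np.eye(2)])   # k = 2 pieces
b = np.zeros((2, 2))
x = np.array([[1.0, -2.0]])
y = maxout(x, W, b)
```

Choosing the two pieces as x and 0 instead would give a ReLU, which shows how maxout can represent familiar activations as special cases.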


Posted Content

NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

Amir Shahroudy, +3 more

- 11 Apr 2016 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper introduces a large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects.


Related Papers (5)

NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

Hierarchical recurrent neural network for skeleton based action recognition

Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group