Open Access Journal Article (DOI)

A Comprehensive Survey of Vision-Based Human Action Recognition Methods.

TLDR
This survey paper provides a comprehensive overview of recent approaches in human action recognition research, including progress in hand-designed action features in RGB and depth data, current deep learning-based action feature representation methods, advances in human–object interaction recognition methods, and the current prominent research topic of action detection methods.
Abstract
Although widely used in many applications, accurate and efficient human action recognition remains a challenging area of research in the field of computer vision. Most recent surveys have focused on narrow problems such as human action recognition methods using depth data, 3D-skeleton data, still image data, spatiotemporal interest point-based methods, and human walking motion recognition. However, there has been no systematic survey of human action recognition. To this end, we present a thorough review of human action recognition methods and provide a comprehensive overview of recent approaches in human action recognition research, including progress in hand-designed action features in RGB and depth data, current deep learning-based action feature representation methods, advances in human–object interaction recognition methods, and the current prominent research topic of action detection methods. Finally, we present several analysis recommendations for researchers. This survey paper provides an essential reference for those interested in further research on human action recognition.


Citations
Journal Article (DOI)

Vision-based human activity recognition: a survey

TL;DR: Most computer vision applications, such as human–computer interaction, virtual reality, security, video surveillance, and home monitoring, are highly correlated with HAR tasks, which establishes new trends and milestones in the development cycle of HAR systems.
Book

Synthetic Data for Deep Learning

TL;DR: The synthetic-to-real domain adaptation problem that inevitably arises in applications of synthetic data is discussed, including synthetic-to-real refinement with GAN-based models and domain adaptation at the feature/model level without explicit data transformations.
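One common way to realize feature-level adaptation without transforming the data itself is a domain-adversarial (gradient-reversal) head. The minimal PyTorch sketch below is illustrative only and is not taken from the book; the module names, layer sizes, and the choice of DANN-style training are assumptions.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses (and scales) gradients on backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class DomainAdversarialHead(nn.Module):
    """Feature-level adaptation sketch: a domain classifier trained through a
    gradient-reversal layer, pushing the backbone toward domain-invariant features."""
    def __init__(self, feat_dim=256, lambd=1.0):  # hypothetical dimensions
        super().__init__()
        self.lambd = lambd
        self.domain_clf = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 2)
        )

    def forward(self, features):
        reversed_feats = GradReverse.apply(features, self.lambd)
        return self.domain_clf(reversed_feats)  # logits: synthetic vs. real
```

In training, the domain-classification loss on these logits is added to the task loss; the reversal makes the backbone maximize domain confusion while the head minimizes it.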
Journal Article (DOI)

Skeleton-based action recognition via spatial and temporal transformer networks

TL;DR: This work proposes a novel Spatial-Temporal Transformer network (ST-TR) which models dependencies between joints using the Transformer self-attention operator, outperforming the state-of-the-art on NTU-RGB+D w.r.t. models using the same input data.
Book Chapter (DOI)

Spatial Temporal Transformer Network for Skeleton-based Action Recognition

TL;DR: This work proposes a novel Spatial-Temporal Transformer network (ST-TR) which models dependencies between joints using the Transformer self-attention operator, outperforming the state-of-the-art on NTU-RGB+D w.r.t. models using the same input data consisting of joint information.
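As a rough illustration of the core idea, modelling dependencies between joints with self-attention, here is a minimal PyTorch sketch of a single spatial attention layer over skeleton joints. It is not the authors' ST-TR implementation; the joint count, embedding size, and head count are assumed values.

```python
import torch
from torch import nn

class SpatialJointSelfAttention(nn.Module):
    """One frame of skeleton data: every joint attends to every other joint."""
    def __init__(self, in_channels=3, embed_dim=64, num_heads=8):
        super().__init__()
        self.embed = nn.Linear(in_channels, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, joints):
        # joints: (batch, num_joints, in_channels), e.g. 25 NTU-RGB+D joints with (x, y, z)
        x = self.embed(joints)
        out, _ = self.attn(x, x, x)  # joint-to-joint dependencies via self-attention
        return out

# usage: a batch of 4 skeleton frames with 25 joints each
out = SpatialJointSelfAttention()(torch.randn(4, 25, 3))
print(out.shape)  # torch.Size([4, 25, 64])
```

A full model would stack such layers, add a temporal counterpart over frames, and pool into action logits.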
Journal Article (DOI)

Vision-based human action recognition: An overview and real world challenges

TL;DR: This paper provides an overview of existing methods organized according to the kind of issue they address, and presents a comparison of the datasets introduced for the human action recognition field.
References
Journal Article (DOI)

Representation Learning: A Review and New Perspectives

TL;DR: Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.
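For readers new to unsupervised feature learning, a toy autoencoder shows the basic idea of learning a representation from reconstruction alone. The PyTorch sketch below is a generic illustration, not code from the review; the layer sizes and the MSE objective are assumptions.

```python
import torch
from torch import nn

class AutoEncoder(nn.Module):
    """Encode the input into a low-dimensional code, decode it back, and train
    on reconstruction error alone; the code serves as a learned feature."""
    def __init__(self, in_dim=784, code_dim=32):  # hypothetical dimensions
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 128), nn.ReLU(),
                                     nn.Linear(128, in_dim))

    def forward(self, x):
        code = self.encoder(x)           # learned representation
        return self.decoder(code), code

model = AutoEncoder()
x = torch.rand(16, 784)
recon, code = model(x)
loss = nn.functional.mse_loss(recon, x)  # reconstruction objective, no labels needed
```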
Proceedings Article

Two-Stream Convolutional Networks for Action Recognition in Videos

TL;DR: This work proposes a two-stream ConvNet architecture which incorporates spatial and temporal networks and demonstrates that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.
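A minimal sketch of the two-stream idea, one ConvNet on a single RGB frame and another on a stack of optical-flow fields, with class scores fused by averaging, is given below in PyTorch. The tiny per-stream network, the 10-frame flow stack, and the averaging fusion are simplifying assumptions rather than the paper's exact architecture.

```python
import torch
from torch import nn

def small_convnet(in_channels, num_classes):
    """Tiny stand-in for each stream's ConvNet (the paper uses much deeper networks)."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes),
    )

class TwoStreamNet(nn.Module):
    """Spatial stream sees one RGB frame; temporal stream sees a stack of L
    optical-flow fields (2L channels); class scores are fused late by averaging."""
    def __init__(self, num_classes=101, flow_len=10):
        super().__init__()
        self.spatial = small_convnet(3, num_classes)
        self.temporal = small_convnet(2 * flow_len, num_classes)

    def forward(self, rgb_frame, flow):
        return (self.spatial(rgb_frame) + self.temporal(flow)) / 2

net = TwoStreamNet()
scores = net(torch.randn(2, 3, 224, 224), torch.randn(2, 20, 224, 224))
```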
Posted Content

UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild

TL;DR: This work introduces UCF101, which is currently the largest dataset of human actions, and provides baseline action recognition results on this new dataset using a standard bag-of-words approach, with an overall performance of 44.5%.
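A bag-of-words video baseline of this kind can be sketched as: cluster local descriptors into a visual vocabulary, histogram each video against it, and train a linear SVM. The scikit-learn sketch below assumes the local spatiotemporal descriptors have already been extracted, and the vocabulary size is an arbitrary choice, so it should not be read as the exact UCF101 baseline.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def bovw_histogram(descriptors, kmeans):
    """Quantize one video's local descriptors against the codebook and return
    a normalized visual-word histogram."""
    words = kmeans.predict(descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    return hist / (hist.sum() + 1e-8)

def train_bovw_baseline(descriptors_per_video, labels, vocab_size=400):
    """descriptors_per_video: list of (n_i, d) arrays of local spatiotemporal
    features (their extraction is assumed); labels: per-video action classes."""
    all_desc = np.vstack(descriptors_per_video)
    kmeans = KMeans(n_clusters=vocab_size, n_init=4).fit(all_desc)   # visual vocabulary
    X = np.stack([bovw_histogram(d, kmeans) for d in descriptors_per_video])
    clf = LinearSVC().fit(X, labels)                                 # linear classifier
    return kmeans, clf
```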
Posted Content

Long-term Recurrent Convolutional Networks for Visual Recognition and Description

TL;DR: A novel recurrent convolutional architecture suitable for large-scale visual learning that is end-to-end trainable is proposed, and such models are shown to have distinct advantages over state-of-the-art models for recognition or generation whose components are separately defined and/or optimized.
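A rough PyTorch sketch of the recurrent-convolutional pattern, per-frame CNN features fed into an LSTM and classified from the final hidden state, follows. The tiny CNN, hidden sizes, and last-state readout are assumptions for illustration, not the configuration from the paper.

```python
import torch
from torch import nn

class RecurrentConvSketch(nn.Module):
    """Per-frame CNN features are passed to an LSTM; both parts train end to end,
    and the clip-level class score is read from the last hidden state."""
    def __init__(self, num_classes=101, feat_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, 256, batch_first=True)
        self.classifier = nn.Linear(256, num_classes)

    def forward(self, clip):
        # clip: (batch, time, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)  # per-frame features
        _, (h, _) = self.lstm(feats)                          # temporal aggregation
        return self.classifier(h[-1])

logits = RecurrentConvSketch()(torch.randn(2, 16, 3, 112, 112))
```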
Posted Content

Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

TL;DR: This work presents an approach to efficiently detect the 2D pose of multiple people in an image using a nonparametric representation, which it refers to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image.
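One way to use a PAF for association is to score each candidate limb by how well the field aligns with the segment joining two detected joints. The NumPy sketch below assumes the field is a (2, H, W) array of unit vectors and that the sampled coordinates stay inside the image; it simplifies the paper's line-integral scoring.

```python
import numpy as np

def paf_limb_score(paf, joint_a, joint_b, num_samples=10):
    """Score a candidate limb between two detected joints by sampling the Part
    Affinity Field along the segment and measuring alignment with its direction.
    paf: (2, H, W) vector field; joint_a, joint_b: (x, y) pixel coordinates."""
    joint_a = np.asarray(joint_a, dtype=float)
    joint_b = np.asarray(joint_b, dtype=float)
    direction = joint_b - joint_a
    norm = np.linalg.norm(direction)
    if norm < 1e-8:
        return 0.0
    direction /= norm
    score = 0.0
    for t in np.linspace(0.0, 1.0, num_samples):
        x, y = (joint_a + t * (joint_b - joint_a)).round().astype(int)
        score += float(np.dot(paf[:, y, x], direction))  # alignment at this sample
    return score / num_samples
```

These scores feed a matching step that assigns body parts to individuals, which is the part-to-person association the TL;DR refers to.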