A Grid-based Representation for Human Action Recognition

doi:10.1109/ICPR48806.2021.9413136

Open Access, Proceedings Article

Soufiane Lamghari, +2 more

pp. 10500-10507

TLDR

This paper proposes a grid-based representation for action recognition that encodes the most discriminative appearance information of an action, with explicit attention on representative pose features.

Abstract:

Human action recognition (HAR) in videos is a fundamental research topic in computer vision. It consists mainly of understanding actions performed by humans based on a sequence of visual observations. In recent years, HAR has witnessed significant progress, especially with the emergence of deep learning models. However, most existing approaches to action recognition rely on information that is not always relevant to the task, and are limited in the way they fuse temporal information. In this paper, we propose a novel method for human action recognition that efficiently encodes the most discriminative appearance information of an action, with explicit attention on representative pose features, into a new compact grid representation. Our GRAR (Grid-based Representation for Action Recognition) method is tested on several benchmark datasets, demonstrating that our model can accurately recognize human actions despite intra-class appearance variations and occlusion challenges.
