A Grid-based Representation for Human Action Recognition
Soufiane Lamghari, Guillaume-Alexandre Bilodeau, Nicolas Saunier
- pp 10500-10507
TLDR
The authors propose a grid-based representation for action recognition that encodes the most discriminative appearance information of an action, with explicit attention on representative pose features.
Abstract:
Human action recognition (HAR) in videos is a fundamental research topic in computer vision. It consists mainly of understanding actions performed by humans based on a sequence of visual observations. In recent years, HAR has witnessed significant progress, especially with the emergence of deep learning models. However, most existing approaches to action recognition rely on information that is not always relevant to the task, and are limited in the way they fuse temporal information. In this paper, we propose a novel method for human action recognition that efficiently encodes the most discriminative appearance information of an action, with explicit attention on representative pose features, into a new compact grid representation. Our GRAR (Grid-based Representation for Action Recognition) method is tested on several benchmark datasets, demonstrating that our model can accurately recognize human actions despite intra-class appearance variations and occlusion challenges.
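The abstract does not say how the grid is constructed, so the following is only a loose illustration of the general idea of packing a few selected frames into one compact grid image. It is not the GRAR method itself. The names `select_frames` and `build_grid`, the per-frame `scores` (standing in for some pose-based relevance measure), and the 2x2 layout are all assumptions made for this sketch.

```python
from typing import List, Sequence

Frame = List[List[int]]  # a single-channel H x W image as nested lists


def select_frames(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames, in temporal order."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of frames")
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def build_grid(frames: Sequence[Frame], rows: int, cols: int) -> Frame:
    """Tile rows*cols equally sized frames into one image, row-major."""
    if len(frames) != rows * cols:
        raise ValueError("need exactly rows * cols frames")
    height = len(frames[0])
    grid: Frame = []
    for r in range(rows):
        tiles = frames[r * cols:(r + 1) * cols]
        for y in range(height):
            line: List[int] = []
            for tile in tiles:
                line.extend(tile[y])  # place tiles side by side
            grid.append(line)
    return grid


# Toy input: six 2x2 "frames" whose pixels all equal the frame index.
video = [[[t, t], [t, t]] for t in range(6)]
# Hypothetical per-frame relevance scores (assumed, not from the paper).
scores = [0.1, 0.9, 0.3, 0.8, 0.7, 0.2]

keep = select_frames(scores, k=4)
grid = build_grid([video[i] for i in keep], rows=2, cols=2)
```

In this toy setup the four highest-scoring frames are kept in temporal order and tiled left to right, top to bottom, so a single image carries information from several time steps. A real system would replace the hand-written scores with a learned or pose-derived selection.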