Proceedings ArticleDOI

Event recognition in egocentric videos using a novel trajectory based feature

TLDR
It is shown that the dense trajectory features based on the proposed GF-STIP descriptors enhance the efficacy of the event recognition system in egocentric videos.
Abstract
This paper proposes an approach for event recognition in egocentric videos using dense trajectories over the Gradient Flow Space Time Interest Point (GF-STIP) feature. We focus on recognizing events of diverse categories (including indoor and outdoor activities, sports, social activities, and adventures) in egocentric videos. We introduce a dataset with diverse egocentric events, as all existing egocentric activity recognition datasets consist of indoor videos only. The dataset introduced in this paper contains 102 videos with 9 different events (containing indoor and outdoor videos with varying lighting conditions). We extract Space Time Interest Points (STIP) from each frame of the video. The interest points are taken as the lead pixels, and the Gradient-Weighted Optical Flow (GWOF) feature is calculated at each lead pixel by multiplying the optical flow magnitude and the gradient magnitude at that pixel, to obtain the GF-STIP feature. We construct pose descriptors with the GF-STIP feature. We use the GF-STIP descriptors for recognizing events in egocentric videos with three different approaches: following a Bag of Words (BoW) model, implementing Fisher Vectors, and obtaining dense trajectories for the videos. We show that the dense trajectory features based on the proposed GF-STIP descriptors enhance the efficacy of the event recognition system in egocentric videos.
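The two core computations in the abstract can be sketched briefly: the GWOF value at a lead pixel (optical flow magnitude multiplied by image-gradient magnitude) and the subsequent Bag of Words histogram over descriptors. This is a minimal illustrative sketch, not the authors' implementation; the function names, array layouts, and the nearest-centroid assignment are assumptions.

```python
import numpy as np

def gwof_at_points(flow, gray, points):
    """Gradient-Weighted Optical Flow (GWOF) at STIP 'lead pixels':
    product of optical-flow magnitude and image-gradient magnitude.
    flow: H x W x 2 array (dx, dy per pixel); gray: H x W grayscale frame;
    points: list of (row, col) interest-point coordinates."""
    # Per-pixel optical-flow magnitude.
    flow_mag = np.sqrt(flow[..., 0] ** 2 + flow[..., 1] ** 2)
    # Per-pixel image-gradient magnitude via finite differences.
    gy, gx = np.gradient(gray.astype(float))
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)
    # GWOF value at each lead pixel: elementwise product of the two magnitudes.
    return np.array([flow_mag[r, c] * grad_mag[r, c] for (r, c) in points])

def bow_histogram(descriptors, vocabulary):
    """Bag of Words encoding: assign each descriptor to its nearest
    visual word (vocabulary centroid) and return a normalized histogram."""
    # Squared distances from every descriptor to every centroid.
    d2 = ((descriptors[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()  # L1-normalized histogram
```

In practice the dense optical flow would come from a standard estimator (e.g. Farneback flow) and the vocabulary from k-means over training descriptors; the sketch above only shows the per-pixel weighting and the histogram encoding.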

Citations
Posted Content

An Information-rich Sampling Technique over Spatio-Temporal CNN for Classification of Human Actions in Videos

TL;DR: A 3-dimensional deep CNN is proposed to extract spatio-temporal features, followed by a Long Short-Term Memory (LSTM) network to recognize human actions, and is shown to outperform state-of-the-art deep learning based techniques.
Journal ArticleDOI

Human action and event recognition using a novel descriptor based on improved dense trajectories

TL;DR: The proposed unified descriptor is a 168-dimensional vector obtained from each video sequence by statistically analyzing the motion patterns of the 3D joint locations of the human body, and shows its efficacy compared to state-of-the-art techniques.
Book ChapterDOI

Recognizing Human Activities in Videos Using Improved Dense Trajectories over LSTM

TL;DR: This work proposes a deep learning based technique to classify actions based on Long Short Term Memory networks, and extends the proposed framework with an efficient motion feature, to enable handling significant camera motion.
Proceedings ArticleDOI

Activity Recognition in Egocentric Videos Using Bag of Key Action Units

TL;DR: It is argued that, for activity recognition in egocentric videos, the proposed approach performs better than any deep learning based method.
Journal ArticleDOI

An information-rich sampling technique over spatio-temporal CNN for classification of human actions in videos

TL;DR: A novel video sampling scheme for human action recognition in videos is proposed, using a Gaussian weighting function to reduce the volume of the input data and mitigate overfitting, thereby enhancing the performance of the 3D CNN model.
References
Proceedings ArticleDOI

Histograms of oriented gradients for human detection

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Journal ArticleDOI

Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography

TL;DR: New results are derived on the minimum number of landmarks needed to obtain a solution, and algorithms are presented for computing these minimum-landmark solutions in closed form, providing the basis for an automatic system that can solve the Location Determination Problem under difficult viewing conditions.
Proceedings ArticleDOI

A Combined Corner and Edge Detector

TL;DR: The problem addressed in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Book ChapterDOI

SURF: speeded up robust features

TL;DR: A novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
Proceedings ArticleDOI

Learning Spatiotemporal Features with 3D Convolutional Networks

TL;DR: The learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks.