Energy-Based Global Ternary Image for Action Recognition Using Sole Depth Sequences

doi:10.1109/3DV.2016.14

Proceedings Article

- pp 47-55

TLDR

The hierarchical E-GTI outperforms existing methods in 3D action recognition. It is also tested on an extended MSRAction3D dataset to investigate and verify its robustness against partial occlusions, noise and speed.

Abstract:

In order to efficiently recognize actions from depth sequences, we propose a novel feature, called Global Ternary Image (GTI), which implicitly encodes both motion regions and motion directions between consecutive depth frames by recording the changes of depth pixels. Each pixel in a GTI takes one of three possible states, namely positive, negative and neutral, which represent increased, decreased and unchanged depth values, respectively. Since GTI is sensitive to the subject's speed, we obtain the energy-based GTI (E-GTI) by extracting GTI from pairs of depth frames with equal motion energy. To capture temporal information among depth frames, we extract E-GTIs using multiple settings of motion energy. Noise is effectively suppressed by describing E-GTIs using the Radon Transform (RT). The 3D action representation is formed by feeding the hierarchical combination of RTs to the Bag of Visual Words (BoVW) model. Extensive experiments on four benchmark datasets, namely MSRAction3D, DHA, MSRGesture3D and SKIG, show that the hierarchical E-GTI outperforms existing methods in 3D action recognition. We also tested the proposed approach on an extended MSRAction3D dataset to further investigate and verify its robustness against partial occlusions, noise and speed.
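The ternary encoding and the equal-energy frame pairing described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `threshold` parameter, and the definition of motion energy as the number of changed pixels between consecutive frames are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def global_ternary_image(prev_depth, next_depth, threshold=0):
    """Return +1 where depth increased, -1 where it decreased, 0 otherwise.

    `threshold` is a hypothetical tolerance for sensor noise; the abstract
    does not say how small depth changes are handled.
    """
    # Cast to a signed type so unsigned depth maps do not wrap around.
    diff = next_depth.astype(np.int64) - prev_depth.astype(np.int64)
    gti = np.zeros(diff.shape, dtype=np.int8)
    gti[diff > threshold] = 1    # positive state: depth increased
    gti[diff < -threshold] = -1  # negative state: depth decreased
    return gti


def equal_energy_frame_indices(frames, num_segments, threshold=0):
    """Pick frame indices that split a sequence into equal-energy segments.

    Motion energy between consecutive frames is assumed here to be the
    count of non-neutral GTI pixels; the paper may define it differently.
    """
    energy = np.array(
        [
            np.count_nonzero(global_ternary_image(a, b, threshold))
            for a, b in zip(frames[:-1], frames[1:])
        ],
        dtype=np.float64,
    )
    # cumulative[i] is the energy accumulated up to frame i.
    cumulative = np.concatenate(([0.0], np.cumsum(energy)))
    targets = np.linspace(0.0, cumulative[-1], num_segments + 1)
    # First frame whose accumulated energy reaches each target level.
    indices = np.searchsorted(cumulative, targets, side="left")
    return np.clip(indices, 0, len(frames) - 1)
```

Under these assumptions, an E-GTI for one energy setting would be `global_ternary_image(frames[i], frames[j])` for each consecutive pair `(i, j)` of the returned indices, and varying `num_segments` would give the multiple energy settings mentioned in the abstract. Because frames are selected by accumulated motion rather than by time, a slow and a fast performance of the same action should yield similar frame pairs.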

Citations

Journal Article

Action Recognition From Arbitrary Views Using Transferable Dictionary Learning

Jingtian Zhang, +3 more

- 15 May 2018 -

IEEE Transactions on Image Processing

TL;DR: This paper proposes a novel end-to-end framework to jointly learn a view-invariance transfer dictionary and a classifier to recognize actions with an arbitrary view, and introduces a new feature set called the 3D dense trajectories to effectively encode extracted trajectory information on 3D videos.

Proceedings Article

Speaker independent diarization for child language environment analysis using deep neural networks

Maryam Najafian, +1 more

TL;DR: This study uses complex Hidden Markov Models with multiple states to model the temporal dependencies between different sources of acoustic variability, estimates the HMM state output probabilities using deep neural networks as a discriminative modeling approach, and confirms that this approach outperforms state-of-the-art Gaussian mixture model based diarization without the need for bottom-up clustering.

Journal Article

Exploring 3D Human Action Recognition Using STACOG on Multi-View Depth Motion Maps Sequences.

Mohammad Farhad Bulbul, +5 more

- 24 May 2021 -

Sensors

TL;DR: This paper proposes an action recognition framework for depth map sequences using the 3D Space-Time Auto-Correlation of Gradients (STACOG) algorithm. Each depth map sequence is split into two sets of sub-sequences of two different frame lengths, and a number of depth motion map (DMM) sequences generated from each set are fed into STACOG to obtain an auto-correlation feature vector.

Proceedings Article

Connectivity patterns of interictal epileptiform discharges using coherence analysis

Panuwat Janwattanapong, +6 more

TL;DR: In this article, functional connectivity maps are used to delineate distinctive patterns for classifying different types of epileptiform discharges (interictal spike, spike and slow wave complex) by quantifying the functional connections present in the regions of interest within different frequency bands.

Proceedings Article

Environment aware speaker diarization for moving targets using parallel DNN-based recognizers

Maryam Najafian, +1 more

TL;DR: This paper presents an acoustic-environment-aware child-adult diarization method applied to audio recorded by a single microphone attached to moving targets under realistic high-noise conditions. It outperforms state-of-the-art diarization without the need for prior clustering or front-end speech activity detection.

Related Papers (5)

Motion clustering-based action recognition technique using optical flow

Upal Mahbub, +2 more

Large Disparity Motion Layer Extraction via Topological Clustering

Yongtao Wang, +5 more

- 01 Jan 2011 -

IEEE Transactions on Image Processing

IEEE Transactions on Multimedia

Gesture Segmentation from a Video Sequence Using Greedy Similarity Measure

A 3D face recognition algorithm using histogram-based features

Correspondence Matching of Multi-View Video Sequences Using Mutual Information Based Similarity Measure