Proceedings Article

Energy-Based Global Ternary Image for Action Recognition Using Sole Depth Sequences

TLDR
The hierarchical E-GTI outperforms existing methods in 3D action recognition. It is also tested on an extended MSRAction3D dataset to verify its robustness against partial occlusions, noise and variations in speed.
Abstract
To recognize actions efficiently from depth sequences, we propose a novel feature, the Global Ternary Image (GTI), which implicitly encodes both motion regions and motion directions between consecutive depth frames by recording the changes in depth pixels. Each pixel in a GTI takes one of three states, positive, negative or neutral, representing an increased, decreased or unchanged depth value, respectively. Since GTI is sensitive to the subject's speed, we obtain the energy-based GTI (E-GTI) by extracting GTI from pairs of depth frames separated by equal motion energy. To capture temporal information across depth frames, we extract E-GTI using multiple settings of motion energy. Noise is effectively suppressed by describing the E-GTIs with the Radon Transform (RT). The 3D action representation is formed by feeding the hierarchical combination of RTs to the Bag of Visual Words (BoVW) model. Extensive experiments on four benchmark datasets, namely MSRAction3D, DHA, MSRGesture3D and SKIG, show that the hierarchical E-GTI outperforms existing methods in 3D action recognition. We also tested the proposed approach on an extended MSRAction3D dataset to further investigate and verify its robustness against partial occlusions, noise and speed.
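The ternary encoding and the equal-energy frame pairing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the `tol` threshold, and above all the definition of motion energy (the number of non-neutral pixels accumulated between consecutive frames) are assumptions made for illustration; the abstract does not define motion energy. Frames are plain nested lists to keep the example dependency-free; a real pipeline would more likely use an array library.

```python
def global_ternary_image(prev, curr, tol=0):
    """Ternary map between two depth frames.

    +1: depth increased, -1: depth decreased, 0: unchanged (within tol).
    `tol` is a hypothetical noise threshold, not taken from the paper.
    """
    return [
        [(c - p > tol) - (c - p < -tol) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def motion_energy(gti):
    """ASSUMED energy measure: count of non-neutral pixels in a GTI."""
    return sum(abs(v) for row in gti for v in row)


def energy_based_pairs(frames, energy_step, tol=0):
    """Pick frame index pairs (start, end) that each span at least
    `energy_step` of accumulated motion energy, so that fast and slow
    performances of an action yield comparable pairs."""
    pairs, start, acc = [], 0, 0
    for t in range(1, len(frames)):
        acc += motion_energy(global_ternary_image(frames[t - 1], frames[t], tol))
        if acc >= energy_step:
            pairs.append((start, t))
            start, acc = t, 0
    return pairs


def energy_based_gtis(frames, energy_step, tol=0):
    """E-GTI sketch: one GTI per equal-energy frame pair."""
    return [
        global_ternary_image(frames[a], frames[b], tol)
        for a, b in energy_based_pairs(frames, energy_step, tol)
    ]
```

Calling `energy_based_gtis` with several values of `energy_step` would give the "multiple settings of motion energy" mentioned in the abstract. The Radon Transform and BoVW stages are left out of this sketch.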

