Journal Article

Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition

TLDR
A latency-aware learning formulation is used to train a logistic regression-based classifier that automatically determines distinctive canonical poses from data and uses these to robustly recognize actions in the presence of ambiguous poses.
Abstract
An important aspect in designing interactive, action-based interfaces is reliably recognizing actions with minimal latency. High latency causes the system's feedback to lag behind user actions and thus significantly degrades the interactivity of the user experience. This paper presents algorithms for reducing latency when recognizing actions. We use a latency-aware learning formulation to train a logistic regression-based classifier that automatically determines distinctive canonical poses from data and uses these to robustly recognize actions in the presence of ambiguous poses. We introduce a novel (publicly released) dataset for the purpose of our experiments. Comparisons of our method against both a Bag of Words and a Conditional Random Field (CRF) classifier show improved recognition performance for both pre-segmented and online classification tasks. Additionally, we employ GentleBoost to reduce our feature set and further improve our results. We then present experiments that explore the accuracy/latency trade-off over a varying number of actions. Finally, we evaluate our algorithm on two existing datasets.


Citations
Proceedings Article

Human action recognition using a temporal hierarchy of covariance descriptors on 3D joint locations

TL;DR: A novel approach to human action recognition from 3D skeleton sequences extracted from depth data that uses the covariance matrix for skeleton joint locations over time as a discriminative descriptor for a sequence to encode the relationship between joint movement and time.
Proceedings Article

The Moving Pose: An Efficient 3D Kinematics Descriptor for Low-Latency Action Recognition and Detection

TL;DR: A fast, simple, yet powerful non-parametric Moving Pose (MP) framework that enables low-latency recognition, one-shot learning, and action detection in difficult unsegmented sequences and is real-time, scalable, and outperforms more sophisticated approaches on challenging benchmarks.
Journal Article

3D skeleton-based human action classification

TL;DR: This survey highlights motivations and challenges of this very recent research area by presenting technologies and approaches for 3D skeleton-based action classification, and introduces a categorization of the most recent works according to the adopted feature representation.
Journal Article

Max-Margin Early Event Detectors

TL;DR: This paper proposes a maximum-margin framework for training temporal event detectors to recognize partial events, enabling early detection. The framework is based on Structured Output SVM but extends it to accommodate sequential data.
Journal Article

Effective 3D action recognition using EigenJoints

TL;DR: An effective method to recognize human actions using 3D skeleton joints recovered from the 3D depth data of RGBD cameras is proposed, and a new action feature descriptor, EigenJoints, is designed, which combines action information including static posture, motion property, and overall dynamics.
References
Proceedings Article

Rapid object detection using a boosted cascade of simple features

TL;DR: A machine learning approach for visual object detection that is capable of processing images extremely rapidly and achieving high detection rates. It introduces a new image representation called the "integral image", which allows the features used by the detector to be computed very quickly.
Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
Journal Article

Robust Real-Time Face Detection

TL;DR: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
Proceedings Article

Robust real-time face detection

TL;DR: A new image representation called the "Integral Image" is introduced, which allows the features used by the detector to be computed very quickly, along with a method for combining classifiers in a "cascade", which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.