Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition

doi:10.1007/S11263-012-0550-7

Journal Article

Chris Ellis, +4 more

- 01 Feb 2013 -

International Journal of Computer Vision

- Vol. 101, Iss: 3, pp 420-436

TLDR

A latency-aware learning formulation is used to train a logistic regression-based classifier that automatically determines distinctive canonical poses from data and uses these to robustly recognize actions in the presence of ambiguous poses.

Abstract:

An important aspect in designing interactive, action-based interfaces is reliably recognizing actions with minimal latency. High latency causes the system's feedback to lag behind user actions and thus significantly degrades the interactivity of the user experience. This paper presents algorithms for reducing latency when recognizing actions. We use a latency-aware learning formulation to train a logistic regression-based classifier that automatically determines distinctive canonical poses from data and uses these to robustly recognize actions in the presence of ambiguous poses. We introduce a novel (publicly released) dataset for the purpose of our experiments. Comparisons of our method against both a Bag of Words and a Conditional Random Field (CRF) classifier show improved recognition performance for both pre-segmented and online classification tasks. Additionally, we employ GentleBoost to reduce our feature set and further improve our results. We then present experiments that explore the accuracy/latency trade-off over a varying number of actions. Finally, we evaluate our algorithm on two existing datasets.
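
The accuracy/latency trade-off mentioned in the abstract can be illustrated with a small toy sketch. This is not the paper's latency-aware learning formulation: it assumes some classifier already yields per-frame class scores, and it uses a simple confidence threshold as the decision rule. The names `decide_online`, `evaluate`, and `TOY_SEQUENCES`, the score format, and the numbers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Frame = Dict[str, float]             # per-frame class scores (assumed format)
Sequence = Tuple[str, List[Frame]]   # (true label, frames in temporal order)


def decide_online(frames: List[Frame], threshold: float) -> Tuple[str, int]:
    """Return (predicted label, number of frames observed before deciding).

    The recognizer commits to a label at the first frame whose best score
    reaches the threshold. This rule is an illustrative assumption.
    """
    for t, scores in enumerate(frames, start=1):
        label = max(scores, key=scores.get)
        if scores[label] >= threshold:
            return label, t
    # Never confident enough: fall back to the best label on the last frame.
    last = frames[-1]
    return max(last, key=last.get), len(frames)


def evaluate(sequences: List[Sequence], threshold: float) -> Tuple[float, float]:
    """Return (accuracy, mean observational latency in frames)."""
    correct = 0
    total_latency = 0
    for truth, frames in sequences:
        label, latency = decide_online(frames, threshold)
        correct += int(label == truth)
        total_latency += latency
    n = len(sequences)
    return correct / n, total_latency / n


# Made-up scores: the first frame of each sequence is an ambiguous pose.
TOY_SEQUENCES: List[Sequence] = [
    ("punch", [{"punch": 0.4, "kick": 0.6},
               {"punch": 0.7, "kick": 0.3},
               {"punch": 0.9, "kick": 0.1}]),
    ("kick", [{"punch": 0.55, "kick": 0.45},
              {"punch": 0.3, "kick": 0.7},
              {"punch": 0.05, "kick": 0.95}]),
]

if __name__ == "__main__":
    for threshold in (0.5, 0.65, 0.9):
        accuracy, latency = evaluate(TOY_SEQUENCES, threshold)
        print(f"threshold={threshold}: accuracy={accuracy}, mean latency={latency}")
```

On this toy data, a low threshold commits on the ambiguous first frame and misclassifies both sequences, while a higher threshold waits for a more distinctive pose and pays for the improved accuracy with additional observed frames.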

Citations

Proceedings Article

Human action recognition using a temporal hierarchy of covariance descriptors on 3D joint locations

Mohamed E. Hussein, +3 more

TL;DR: A novel approach to human action recognition from 3D skeleton sequences extracted from depth data. The covariance matrix of skeleton joint locations over time serves as a discriminative descriptor for a sequence, and a temporal hierarchy of such descriptors encodes the relationship between joint movement and time.

Proceedings Article

The Moving Pose: An Efficient 3D Kinematics Descriptor for Low-Latency Action Recognition and Detection

Mihai Zanfir, +2 more

TL;DR: A fast, simple, yet powerful non-parametric Moving Pose (MP) framework that enables low-latency recognition, one-shot learning, and action detection in difficult unsegmented sequences and is real-time, scalable, and outperforms more sophisticated approaches on challenging benchmarks.

Journal Article

3D skeleton-based human action classification

Liliana Lo Presti, +1 more

- 01 May 2016 -

Pattern Recognition

TL;DR: This survey highlights motivations and challenges of this very recent research area by presenting technologies and approaches for 3D skeleton-based action classification, and introduces a categorization of the most recent works according to the adopted feature representation.

Journal Article

Max-Margin Early Event Detectors

Minh Hoai, +1 more

- 01 Apr 2014 -

International Journal of Computer Vision

TL;DR: This paper proposes a maximum-margin framework for training temporal event detectors to recognize partial events, enabling early detection. The framework is based on Structured Output SVM but extends it to accommodate sequential data.

Journal Article

Effective 3D action recognition using EigenJoints

Xiaodong Yang, +1 more

- 01 Jan 2014 -

Journal of Visual Communication and Imag...

TL;DR: An effective method is proposed for recognizing human actions using 3D skeleton joints recovered from the 3D depth data of RGBD cameras. A new action feature descriptor, EigenJoints, is designed that combines action information including static posture, motion property, and overall dynamics.

References

Proceedings Article

Rapid object detection using a boosted cascade of simple features

Paul A. Viola, +1 more

TL;DR: A machine learning approach for visual object detection that processes images extremely rapidly while achieving high detection rates. It introduces a new image representation called the "integral image", which allows the features used by the detector to be computed very quickly.

Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, +2 more

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

Journal Article

Robust Real-Time Face Detection

Paul A. Viola, +1 more

- 01 May 2004 -

International Journal of Computer Vision

TL;DR: A face detection framework that processes images extremely rapidly while achieving high detection rates. The implementation reported in the paper runs at about 15 frames per second.

Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, +3 more

Proceedings Article

Robust real-time face detection

Paul A. Viola, +1 more

TL;DR: A new image representation called the “Integral Image” is introduced, which allows the features used by the detector to be computed very quickly, along with a method for combining classifiers in a “cascade”, which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.

Related Papers (5)

Action recognition based on a bag of 3D points

Mining actionlet ensemble for action recognition with depth cameras

View invariant human action recognition using histograms of 3D joints

Real-time human pose recognition in parts from single depth images

HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences