Journal ArticleDOI
Exploring the Trade-off Between Accuracy and Observational Latency in Action Recognition
TLDR
A latency-aware learning formulation is used to train a logistic regression-based classifier that automatically determines distinctive canonical poses from data and uses these to robustly recognize actions in the presence of ambiguous poses.
Abstract:
An important aspect in designing interactive, action-based interfaces is reliably recognizing actions with minimal latency. High latency causes the system's feedback to lag behind user actions and thus significantly degrades the interactivity of the user experience. This paper presents algorithms for reducing latency when recognizing actions. We use a latency-aware learning formulation to train a logistic regression-based classifier that automatically determines distinctive canonical poses from data and uses these to robustly recognize actions in the presence of ambiguous poses. We introduce a novel (publicly released) dataset for the purpose of our experiments. Comparisons of our method against both a Bag of Words and a Conditional Random Field (CRF) classifier show improved recognition performance for both pre-segmented and online classification tasks. Additionally, we employ GentleBoost to reduce our feature set and further improve our results. We then present experiments that explore the accuracy/latency trade-off over a varying number of actions. Finally, we evaluate our algorithm on two existing datasets.
Citations
Proceedings Article
Human action recognition using a temporal hierarchy of covariance descriptors on 3D joint locations
TL;DR: A novel approach to human action recognition from 3D skeleton sequences extracted from depth data that uses the covariance matrix for skeleton joint locations over time as a discriminative descriptor for a sequence to encode the relationship between joint movement and time.
Proceedings ArticleDOI
The Moving Pose: An Efficient 3D Kinematics Descriptor for Low-Latency Action Recognition and Detection
TL;DR: A fast, simple, yet powerful non-parametric Moving Pose (MP) framework that enables low-latency recognition, one-shot learning, and action detection in difficult unsegmented sequences and is real-time, scalable, and outperforms more sophisticated approaches on challenging benchmarks.
Journal ArticleDOI
3D skeleton-based human action classification
TL;DR: This survey highlights motivations and challenges of this very recent research area by presenting technologies and approaches for 3D skeleton-based action classification, and introduces a categorization of the most recent works according to the adopted feature representation.
Journal ArticleDOI
Max-Margin Early Event Detectors
Minh Hoai, Fernando De la Torre
TL;DR: This paper proposes a maximum-margin framework for training temporal event detectors to recognize partial events, enabling early detection, based on Structured Output SVM, but extends it to accommodate sequential data.
Journal ArticleDOI
Effective 3D action recognition using EigenJoints
Xiaodong Yang, Yingli Tian
TL;DR: An effective method to recognize human actions using 3D skeleton joints recovered from 3D depth data of RGBD cameras is proposed, and a new action feature descriptor, EigenJoints, is designed which combines action information including static posture, motion property, and overall dynamics.
References
Proceedings ArticleDOI
Rapid object detection using a boosted cascade of simple features
Paul A. Viola, Michael Jones
TL;DR: A machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates and the introduction of a new image representation called the "integral image" which allows the features used by the detector to be computed very quickly.
Proceedings Article
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
Journal ArticleDOI
Robust Real-Time Face Detection
Paul A. Viola, Michael Jones
TL;DR: A face detection framework that is capable of processing images extremely rapidly while achieving high detection rates is described. The detector is reported to run at about 15 frames per second.
Proceedings ArticleDOI
Robust real-time face detection
Paul A. Viola, Michael Jones
TL;DR: A new image representation called the “Integral Image” is introduced which allows the features used by the detector to be computed very quickly and a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.