Open Access Book Chapter

Challenges in Multi-modal Gesture Recognition

TLDR
The state of the art in multimodal gesture recognition is surveyed, the JMLR special topic on gesture recognition (2011–2015) is introduced, and several recorded datasets, comprising tens of thousands of videos, are made available for further research.
Abstract
This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011–2015. We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands of videos, which are available to conduct further research. We also overview recent state of the art works on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research.


Citations
Journal Article

Vision-based human activity recognition: a survey

TL;DR: Most computer vision applications, such as human-computer interaction, virtual reality, security, video surveillance and home monitoring, are highly correlated with HAR tasks, establishing new trends and milestones in the development cycle of HAR systems.
Proceedings Article

Learning Spatiotemporal Features Using 3DCNN and Convolutional LSTM for Gesture Recognition

TL;DR: Experiments on the ChaLearn LAP large-scale isolated gesture dataset (IsoGD) and the Sheffield Kinect Gesture (SKIG) dataset demonstrate the superiority of the proposed deep architecture.
Journal Article

Sign Language Recognition - A Deep Survey.

TL;DR: A taxonomy to categorize the proposed models for isolated and continuous sign language recognition is presented, discussing applications, datasets, hybrid models, complexity, and future lines of research in the field.
Journal Article

MultiD-CNN: A multi-dimensional feature learning approach based on deep convolutional networks for gesture recognition in RGB-D image sequences

TL;DR: An effective multi-dimensional feature learning approach, termed MultiD-CNN, is presented for human gesture recognition in RGB-D videos; it outperforms prior art in both accuracy and efficiency.
Posted Content

Survey on Emotional Body Gesture Recognition

TL;DR: It is shown that, for emotion recognition, labelled data is scarce, there is no agreement on clearly defined output spaces, and the representations are shallow and largely based on naive geometrical features.
References
Proceedings Article

Rapid object detection using a boosted cascade of simple features

TL;DR: A machine learning approach for visual object detection is described that processes images extremely rapidly while achieving high detection rates. It introduces a new image representation called the "integral image", which allows the features used by the detector to be computed very quickly.
Journal Article

The Pascal Visual Object Classes (VOC) Challenge

TL;DR: The state of the art in evaluated methods for both classification and detection is reviewed, examining whether the methods are statistically different, what they are learning from the images, and what they find easy or confuse.
Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
Journal Article

CONDENSATION—Conditional Density Propagation for Visual Tracking

TL;DR: The Condensation algorithm uses “factored sampling”, previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set.