Challenges in Multi-modal Gesture Recognition

This paper surveys the state of the art in multimodal gesture recognition, introduces the JMLR special topic on gesture recognition 2011–2015, and makes available several recorded datasets, including tens of thousands of videos, for further research.

Abstract:

This paper surveys the state of the art in multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011–2015. We began right at the start of the Kinect™ revolution, when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of the uses of machine learning and computer vision with multimodal data in this application area. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands of videos, for further research. We also review recent state-of-the-art work on gesture recognition using a proposed taxonomy, and discuss challenges and future lines of research.

Citations

Journal Article

Vision-based human activity recognition: a survey

Djamila Romaissa Beddiar, Brahim Nini, Mohammad Sabokrou, Abdenour Hadid (+3 more), University of Oulu

01 Nov 2020

Multimedia Tools and Applications

TL;DR: Most computer vision applications, such as human-computer interaction, virtual reality, security, video surveillance, and home monitoring, are highly correlated with human activity recognition (HAR) tasks, which establishes new trends and milestones in the development cycle of HAR systems.

Proceedings Article

Learning Spatiotemporal Features Using 3DCNN and Convolutional LSTM for Gesture Recognition

Liang Zhang, Guangming Zhu, Peiyi Shen, Juan Song (+3 more), Xidian University

TL;DR: Experiments on the ChaLearn LAP large-scale isolated gesture dataset (IsoGD) and the Sheffield Kinect Gesture (SKIG) dataset demonstrate the superiority of the proposed deep architecture.

Journal Article

Sign Language Recognition - A Deep Survey.

Razieh Rastgoo, Kourosh Kiani, Sergio Escalera (+2 more), Semnan University, University of Barcelona

01 Feb 2021

Expert Systems With Applications

TL;DR: A taxonomy to categorize the proposed models for isolated and continuous sign language recognition is presented, discussing applications, datasets, hybrid models, complexity, and future lines of research in the field.

Journal Article

MultiD-CNN: A multi-dimensional feature learning approach based on deep convolutional networks for gesture recognition in RGB-D image sequences

Abdessamad Elboushaki, Rachida Hannane, Karim Afdel, Lahcen Koutti (+3 more)

01 Jan 2020

Expert Systems With Applications

TL;DR: An effective multi-dimensional feature learning approach, termed MultiD-CNN, for human gesture recognition in RGB-D videos is presented and shown to outperform prior art in both accuracy and efficiency.

Posted Content

Survey on Emotional Body Gesture Recognition

Fatemeh Noroozi, Ciprian A. Corneanu, Dorota Kamińska, Tomasz Sapiński, Sergio Escalera, Gholamreza Anbarjafari (+5 more), University of Tartu, University of Barcelona, Lodz University of Technology

23 Jan 2018

arXiv: Computer Vision and Pattern Recog...

TL;DR: It is shown that, for emotion recognition, labelled data is scarce, there is no agreement on clearly defined output spaces, and the representations are shallow and largely based on naive geometrical representations.

References

Proceedings Article

Rapid object detection using a boosted cascade of simple features

Paul A. Viola, Michael Jones (+1 more), Mitsubishi

TL;DR: A machine learning approach for visual object detection that processes images extremely rapidly while achieving high detection rates; it introduces a new image representation, the "integral image", which allows the features used by the detector to be computed very quickly.

Journal Article

The Pascal Visual Object Classes (VOC) Challenge

Mark Everingham, Luc Van Gool, Christopher Williams, John Winn, Andrew Zisserman (+4 more), University of Leeds, Katholieke Universiteit Leuven, University of Edinburgh, Microsoft, University of Oxford

01 Jun 2010

International Journal of Computer Vision

TL;DR: The state of the art in evaluated methods for both classification and detection is reviewed, including whether the methods are statistically different, what they learn from the images, and what they find easy or confuse.

Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, Andrew McCallum, Fernando Pereira (+2 more), Carnegie Mellon University

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, Andrew McCallum, Fernando Pereira, Kevin Duh (+3 more)

Journal Article

CONDENSATION - Conditional Density Propagation for Visual Tracking

Michael Isard, Andrew Blake (+1 more), University of Oxford

01 Aug 1998

International Journal of Computer Vision

TL;DR: The Condensation algorithm uses “factored sampling”, previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set.

Related Papers (5)

ChaLearn Looking at People Challenge 2014: Dataset and Results

Very Deep Convolutional Networks for Large-Scale Image Recognition

Deep Residual Learning for Image Recognition

ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition

ModDrop: Adaptive Multi-Modal Gesture Recognition