Isolated Sign Language Recognition with Grassmann Covariance Matrices

doi:10.1145/2897735

Journal Article

Hanjie Wang, +4 more

- 07 May 2016 -

ACM Transactions on Accessible Computing

- Vol. 8, Iss: 4, pp 14

TLDR

This article proposes a covariance matrix-based representation that fuses information from multimodal sources and exploits long-term dynamics over an isolated sign sequence, and reports that the proposed method outperforms state-of-the-art methods in both accuracy and computational cost.

Abstract:

In this article, to utilize long-term dynamics over an isolated sign sequence, we propose a covariance matrix-based representation to naturally fuse information from multimodal sources. To tackle the drawback induced by the commonly used Riemannian metric, the proximity of covariance matrices is measured on the Grassmann manifold. However, the inherent Grassmann metric cannot be directly applied to the covariance matrix. We solve this problem by evaluating and selecting the most significant singular vectors of covariance matrices of sign sequences. The resulting compact representation is called the Grassmann covariance matrix. Finally, the Grassmann metric is used as a kernel for the support vector machine, which enables learning of the signs in a discriminative manner. To validate the proposed method, we collect three challenging sign language datasets, on which comprehensive evaluations show that the proposed method outperforms the state-of-the-art methods in both accuracy and computational cost.
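
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that "most significant" means the top `k` singular vectors by singular value, and it uses the projection kernel, a commonly used Grassmann kernel, as a stand-in for the metric the article actually uses. The function names, `k`, and the feature layout (one fused feature vector per frame) are all hypothetical.

```python
import numpy as np


def grassmann_covariance(features, k):
    """Map a sign sequence to an orthonormal basis of shape (d, k).

    features: array of shape (T, d), one fused multimodal feature vector
    per frame (hypothetical layout; assumes T >= 2 and d >= 2).
    k: number of leading singular vectors to keep (assumed selection rule).
    """
    features = np.asarray(features, dtype=float)
    # The covariance is d x d regardless of the sequence length T.
    cov = np.cov(features, rowvar=False)
    # np.linalg.svd returns singular values in descending order.
    u, _, _ = np.linalg.svd(cov)
    return u[:, :k]


def projection_kernel(y1, y2):
    """Projection kernel ||Y1^T Y2||_F^2 between two orthonormal bases.

    Assumption: chosen for illustration only; the article's exact
    Grassmann metric is not specified in the abstract.
    """
    return float(np.linalg.norm(y1.T @ y2, ord="fro") ** 2)


def gram_matrix(bases):
    """Symmetric kernel matrix over a list of (d, k) bases."""
    n = len(bases)
    gram = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            gram[i, j] = gram[j, i] = projection_kernel(bases[i], bases[j])
    return gram
```

Because the covariance matrix has a fixed size, sequences of different lengths map to bases of the same shape. A Gram matrix built this way could then be handed to any SVM implementation that accepts a precomputed kernel.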

Citations

Proceedings Article

SubUNets: End-to-End Hand Shape and Continuous Sign Language Recognition

Necati Cihan Camgoz, +3 more

TL;DR: A novel deep learning approach to solve simultaneous alignment and recognition problems (referred to as “Sequence-to-sequence” learning) is proposed, which decomposes the problem into a series of specialised expert systems referred to as SubUNets, and serves to significantly improve the performance of the overarching recognition system.

Posted Content

Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation

Necati Cihan Camgoz, +3 more

- 30 Mar 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: A novel transformer based architecture that jointly learns Continuous Sign Language Recognition and Translation while being trainable in an end-to-end manner is introduced by using a Connectionist Temporal Classification (CTC) loss to bind the recognition and translation problems into a single unified architecture.

Proceedings Article

Hierarchical LSTM for Sign Language Translation

Dan Guo, +3 more

TL;DR: A hierarchical-LSTM (HLSTM) encoder-decoder model with visual content and word embedding for SLT exhibits promising performance on signer-independent tests with seen sentences and also outperforms the comparison algorithms on unseen sentences.

Proceedings Article

Sign Language Transformers: Joint End-to-End Sign Language Recognition and Translation

Necati Cihan Camgoz, +3 more

TL;DR: Sign Language Transformers use a Connectionist Temporal Classification (CTC) loss to bind the recognition and translation problems into a single unified architecture, which leads to significant performance gains.

Journal Article

Dynamic Sign Language Recognition Based on Video Sequence With BLSTM-3D Residual Networks

Liao Yanqiu, +4 more

- 14 Mar 2019 -

IEEE Access

TL;DR: A multimodal dynamic sign language recognition method based on a deep 3-dimensional residual ConvNet and bi-directional LSTM networks, named the BLSTM-3D residual network (B3D ResNet), is proposed and reported to obtain state-of-the-art recognition accuracy.

References

Journal Article

LIBSVM: A library for support vector machines

Chih-Chung Chang, +1 more

- 06 May 2011 -

ACM Transactions on Intelligent Systems ...

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

Book

Matrix computations

Gene H. Golub

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Journal Article

Real-time human pose recognition in parts from single depth images

Jamie Shotton, +7 more

- 01 Jan 2013 -

Communications of The ACM

TL;DR: This work takes an object recognition approach, designing an intermediate body parts representation that maps the difficult pose estimation problem into a simpler per-pixel classification problem, and generates confidence-scored 3D proposals of several body joints by reprojecting the classification result and finding local modes.

Journal Article

On Space-Time Interest Points

Ivan Laptev

TL;DR: This paper builds on the idea of the Harris and Förstner interest point operators and detects local structures in space-time where the image values have significant local variations in both space and time. It illustrates how a video representation in terms of local space-time features allows for detection of walking people in scenes with occlusions and dynamic cluttered backgrounds.

Related Papers (5)

Neural Sign Language Translation

Learning Spatiotemporal Features with 3D Convolutional Networks

Adam: A Method for Stochastic Optimization

Effective Approaches to Attention-based Neural Machine Translation

ImageNet: A large-scale hierarchical image database