Re-Sign: Re-Aligned End-to-End Sequence Modelling with Deep Recurrent CNN-HMMs

doi:10.1109/CVPR.2017.364

Proceedings Article

Oscar Koller, +2 more

- pp 3416-3424

TLDR

This work proposes an algorithm that treats the provided training labels as weak labels and refines the label-to-image alignment on the fly in a weakly supervised fashion. Embedded into an HMM, the resulting deep model continuously improves its performance over several re-alignments.

Abstract:

This work presents an iterative re-alignment approach applicable to visual sequence labelling tasks such as gesture recognition, activity recognition and continuous sign language recognition. Previous methods dealing with video data usually rely on given frame labels to train their classifiers. However, in recent data sets these labels often tend to be noisy, a fact that is commonly overlooked. We propose an algorithm that treats the provided training labels as weak labels and refines the label-to-image alignment on the fly in a weakly supervised fashion. Given a series of frames and sequence-level labels, a deep recurrent CNN-BLSTM network is trained end-to-end. Embedded into an HMM, the resulting deep model corrects the frame labels and continuously improves its performance over several re-alignments. We evaluate on two challenging publicly available sign recognition benchmark data sets featuring over 1000 classes. We outperform the state of the art by up to 10% absolute and 30% relative.
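The re-alignment step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single HMM state per label, no transition penalties, and hand-written frame scores in place of network outputs. The function name `forced_align` and the toy numbers are hypothetical. Given per-frame class scores and the ordered sequence-level labels, a Viterbi-style search finds the monotonic frame labelling with the highest total score.

```python
import math


def forced_align(log_probs, labels):
    """Return one label per frame, maximising the summed frame scores.

    log_probs[t][c] is the log score of class c at frame t (assumed to come
    from a frame classifier). labels is the ordered sequence-level
    annotation. Labels must appear in order and each must cover at least
    one frame (a simplified left-to-right topology, one state per label).
    """
    T, N = len(log_probs), len(labels)
    if N == 0 or T < N:
        raise ValueError("need at least one frame per label")
    NEG = -math.inf
    # best[t][n]: best score of frames 0..t with frame t assigned to labels[n]
    best = [[NEG] * N for _ in range(T)]
    # stay[t][n]: True if frame t-1 was assigned to labels[n] as well
    stay = [[True] * N for _ in range(T)]
    best[0][0] = log_probs[0][labels[0]]
    for t in range(1, T):
        for n in range(min(t + 1, N)):
            keep = best[t - 1][n]                       # remain in the same label
            move = best[t - 1][n - 1] if n > 0 else NEG  # advance to the next label
            stay[t][n] = keep >= move
            best[t][n] = max(keep, move) + log_probs[t][labels[n]]
    # Backtrack from the last label at the last frame.
    n = N - 1
    path = [0] * T
    for t in range(T - 1, -1, -1):
        path[t] = labels[n]
        if t > 0 and not stay[t][n]:
            n -= 1
    return path


# Toy example: 4 frames, 2 classes, hypothetical classifier probabilities.
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
log_probs = [[math.log(p) for p in frame] for frame in probs]

noisy_frame_labels = [0, 1, 1, 1]          # provided (weak) frame labels
realigned = forced_align(log_probs, [0, 1])
print(realigned)                           # → [0, 0, 1, 1]
```

In this toy case the provided frame labels place the boundary one frame too early, and the alignment moves it to where the scores support it. In an iterative scheme of the kind the abstract describes, the classifier would then be retrained on the corrected frame labels and the alignment repeated.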

Citations

Proceedings Article

Neural Sign Language Translation

Necati Cihan Camgoz, +4 more

TL;DR: This work formalizes SLT in the framework of Neural Machine Translation (NMT) for both end-to-end and pretrained settings (using expert knowledge). This allows the spatial representations, the underlying language model, and the mapping between sign and spoken language to be learned jointly.

Proceedings Article

Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective

Danielle Bragg, +11 more

TL;DR: The results of an interdisciplinary workshop are presented, providing key background that is often overlooked by computer scientists, a review of the state-of-the-art, a set of pressing challenges, and a call to action for the research community.

Journal Article

A Deep Neural Framework for Continuous Sign Language Recognition by Iterative Training

Runpeng Cui, +2 more

- 01 Jan 2019 -

IEEE Transactions on Multimedia

TL;DR: This work develops a continuous sign language (SL) recognition framework with deep neural networks, which directly transcribes videos of SL sentences to sequences of ordered gloss labels. The proposed architecture adopts deep convolutional neural networks with stacked temporal fusion layers as the feature extraction module.

Journal Article

Weakly Supervised Learning with Multi-Stream CNN-LSTM-HMMs to Discover Sequential Parallelism in Sign Language Videos

Oscar Koller, +3 more

- 01 Sep 2020 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work applies the approach to the domain of sign language recognition, exploiting sequential parallelism to learn sign language, mouth shape and hand shape classifiers. It clearly outperforms the state of the art on all data sets and shows significantly faster convergence with the parallel alignment approach.

Posted Content

Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation

Necati Cihan Camgoz, +3 more

- 30 Mar 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work introduces a novel transformer-based architecture that jointly learns Continuous Sign Language Recognition and Translation while being trainable end-to-end. It uses a Connectionist Temporal Classification (CTC) loss to bind the recognition and translation problems into a single unified architecture.

References

Proceedings Article

LSTM Neural Networks for Language Modeling.

Martin Sundermeyer, +2 more

TL;DR: This work analyzes the Long Short-Term Memory neural network architecture on an English and a large French language modeling task and gains considerable improvements in WER on top of a state-of-the-art speech recognition system.

Journal Article

A Novel Connectionist System for Unconstrained Handwriting Recognition

Alex Graves, +5 more

- 01 May 2009 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies, significantly outperforming a state-of-the-art HMM-based system.

Proceedings Article

Hierarchical recurrent neural network for skeleton based action recognition

Yong Du, +2 more

TL;DR: This paper proposes an end-to-end hierarchical RNN for skeleton-based action recognition and demonstrates that the model achieves state-of-the-art performance with high computational efficiency.

Book

Connectionist Speech Recognition: A Hybrid Approach

Hervé Bourlard, +1 more

TL;DR: Connectionist Speech Recognition: A Hybrid Approach describes the theory and implementation of a method to incorporate neural network approaches into state-of-the-art continuous speech recognition systems based on Hidden Markov Models (HMMs) to improve their performance.

Proceedings Article

Describing Videos by Exploiting Temporal Structure

Li Yao, +6 more

TL;DR: This paper uses a spatio-temporal 3-D convolutional neural network (3-D CNN) representation of short temporal dynamics for video description. The 3-D CNN is trained on video action recognition tasks, so as to produce a representation that is tuned to human motion and behavior.

Related Papers (5)

Continuous Sign Language Recognition: Towards Large Vocabulary Statistical Recognition Systems Handling Multiple Signers

Oscar Koller, +2 more

- 01 Dec 2015 -

Computer Vision and Image Understanding

Recurrent Convolutional Neural Networks for Continuous Sign Language Recognition by Staged Optimization

Neural Sign Language Translation

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset