Re-Sign: Re-Aligned End-to-End Sequence Modelling with Deep Recurrent CNN-HMMs

doi:10.1109/CVPR.2017.364

Proceedings Article

Oscar Koller, +2 more

- pp 3416-3424

TLDR

This work proposes an algorithm that treats the provided training labels as weak labels and refines the label-to-image alignment on the fly in a weakly supervised fashion. Embedded into an HMM, the resulting deep model continuously improves its performance over several re-alignments.

Abstract:

This work presents an iterative re-alignment approach applicable to visual sequence labelling tasks such as gesture recognition, activity recognition and continuous sign language recognition. Previous methods dealing with video data usually rely on given frame labels to train their classifiers. However, in recent data sets these labels often tend to be noisy, which is commonly overlooked. We propose an algorithm that treats the provided training labels as weak labels and refines the label-to-image alignment on the fly in a weakly supervised fashion. Given a series of frames and sequence-level labels, a deep recurrent CNN-BLSTM network is trained end-to-end. Embedded into an HMM, the resulting deep model corrects the frame labels and continuously improves its performance over several re-alignments. We evaluate on two challenging publicly available sign recognition benchmark data sets featuring over 1000 classes. We outperform the state of the art by up to 10% absolute and 30% relative.
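The re-alignment step mentioned in the abstract can be illustrated with a small forced-alignment sketch. This is not the authors' implementation: it assumes one state per label, no transition penalties and no prior scaling, and the names `force_align`, `log_probs` and `labels` are hypothetical. Given per-frame log-scores from some classifier and the ordered sequence-level labels, a Viterbi pass picks the most likely frame-to-label assignment.

```python
import math


def force_align(log_probs, labels):
    """Viterbi forced alignment (simplified sketch, not the paper's model).

    log_probs[t][c]: hypothetical classifier log-score of class c at frame t.
    labels: ordered sequence-level labels; each must cover at least one frame.
    Returns one label per frame, respecting the order given in `labels`.
    """
    T, N = len(log_probs), len(labels)
    if N == 0 or T < N:
        raise ValueError("need at least one frame per label")
    NEG = -math.inf
    # score[t][n]: best log-score of frames 0..t with frame t assigned to labels[n]
    score = [[NEG] * N for _ in range(T)]
    # back[t][n]: label position used at frame t-1 on that best path
    back = [[0] * N for _ in range(T)]
    score[0][0] = log_probs[0][labels[0]]
    for t in range(1, T):
        for n in range(min(t + 1, N)):
            stay = score[t - 1][n]                      # remain in the same label
            move = score[t - 1][n - 1] if n > 0 else NEG  # advance to the next label
            if move > stay:
                best, back[t][n] = move, n - 1
            else:
                best, back[t][n] = stay, n
            if best > NEG:
                score[t][n] = best + log_probs[t][labels[n]]
    # Backtrace from the last label at the last frame.
    n = N - 1
    path = [0] * T
    for t in range(T - 1, -1, -1):
        path[t] = labels[n]
        n = back[t][n]
    return path


# Toy example: 4 frames, 2 classes, sequence-level labels [0, 1].
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
toy_log_probs = [[math.log(p) for p in row] for row in probs]
alignment = force_align(toy_log_probs, [0, 1])
print(alignment)
```

In an iterative scheme of the kind the abstract describes, the frame labels returned by such an alignment would serve as training targets for the next round of classifier training, after which the alignment is recomputed with the updated scores.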

Citations

Journal Article

Graph-Based Multimodal Sequential Embedding for Sign Language Translation

Shengeng Tang, +3 more

- 01 Jan 2021 -

IEEE Transactions on Multimedia

TL;DR: Proposes a graph-based multimodal sequential embedding network (MSeqGraph), in which multiple sequential modalities are densely correlated; a connectionist temporal decoding strategy is adopted to explore the entire video's temporal transition and translate the sentence.

Journal Article

Continuous Sign Language Recognition through a Context-Aware Generative Adversarial Network.

Ilias Papastratis, +2 more

- 01 Apr 2021 -

Sensors

TL;DR: In this article, a generative adversarial network (GAN) is proposed for context-aware continuous sign language recognition using a generator that recognizes sign language glosses by extracting spatial and temporal features from video sequences, as well as a discriminator that evaluates the quality of the generator's predictions by modeling text information at the sentence and gloss levels.

Journal Article

An optimized Generative Adversarial Network based continuous sign language classification

R. Elakkiya, +3 more

- 15 Nov 2021 -

Expert Systems With Applications

TL;DR: Novel hyperparameter-based optimized Generative Adversarial Networks (H-GANs) are introduced to classify sign gestures, and the experimental results reveal that the H-GANs improved accuracy and recognition rate compared with state-of-the-art classification methods, with reduced complexity.

Journal Article

Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks

Ben Saunders, +2 more

- 11 Mar 2021 -

International Journal of Computer Vision

TL;DR: In this paper, a Progressive Transformer Network (PTN) is proposed to translate from spoken language sentences to continuous 3D multi-channel sign pose sequences in an end-to-end manner.

Journal Article

A Comprehensive Study on Deep Learning-Based Methods for Sign Language Recognition

- 01 Jan 2022 -

IEEE Transactions on Multimedia

TL;DR: In this article, a comparative experimental assessment of computer vision-based methods for sign language recognition is conducted by implementing the most recent deep neural network methods in this field, and a thorough evaluation on multiple publicly available datasets is performed.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network achieves state-of-the-art ImageNet classification performance; it consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Journal Article

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Journal Article

Maximum likelihood from incomplete data via the EM algorithm

Arthur P. Dempster, +2 more

- 01 Sep 1977 -

Journal of the royal statistical society...

Related Papers (5)

Continuous Sign Language Recognition: Towards Large Vocabulary Statistical Recognition Systems Handling Multiple Signers

Oscar Koller, +2 more

- 01 Dec 2015 -

Computer Vision and Image Understanding

Recurrent Convolutional Neural Networks for Continuous Sign Language Recognition by Staged Optimization

Neural Sign Language Translation

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset