Text recognition using deep BLSTM networks

doi:10.1109/ICAPR.2015.7050699

Proceedings Article

- pp 1-6

TL;DR

The paper proposes a deep bidirectional Long Short-Term Memory (BLSTM) recurrent neural network for text recognition, trained with Connectionist Temporal Classification (CTC) so that it can learn the labels of an unsegmented sequence with unknown alignment.

Abstract:

This paper presents a deep bidirectional Long Short-Term Memory (BLSTM) recurrent neural network architecture for text recognition. The architecture is trained with Connectionist Temporal Classification (CTC), which lets it learn the labels of an unsegmented sequence with unknown alignment. The work is motivated by the results of deep neural networks for isolated numeral recognition and by improved speech recognition using deep BLSTM based approaches. The deep BLSTM architecture is chosen for its ability to access long-range context, learn sequence alignment, and work without segmented data. Because CTC and the forward-backward algorithm handle the alignment of output labels, there are no Unicode re-ordering issues, and thus no need for a lexicon or post-processing schemes. The approach is script independent and segmentation free. The system has been implemented for the recognition of unsegmented words of printed Oriya text, on which it achieves a 4.18% character-level error rate and a 12.11% word error rate.
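To make two ideas from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) how a per-frame CTC label path is commonly collapsed into an output sequence by merging repeated labels and dropping blanks, and (b) how a character-level error rate is commonly computed from edit distance. The function names, the blank index 0, and the use of Levenshtein distance as the error measure are assumptions made for illustration; the listing above does not state how the paper computes its error rates.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_best_path(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Collapse a per-frame label path: merge repeats, then drop blanks."""
    decoded: List[int] = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr_row = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr_row.append(min(prev_row[j] + 1,          # deletion
                                curr_row[j - 1] + 1,      # insertion
                                prev_row[j - 1] + cost))  # substitution
        prev_row = curr_row
    return prev_row[-1]


def char_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Total edit distance over total reference length, as a percentage."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total
```

A blank between two identical labels keeps them distinct, so the path `[0, 1, 1, 0, 1, 2, 2, 0]` collapses to `[1, 1, 2]`. This is why CTC can emit doubled characters without needing segmented input.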

Citations

Proceedings Article

Botnet Detection in the Internet of Things using Deep Learning Approaches

Christopher D. McDermott, +2 more

TL;DR: The paper demonstrates that although the bidirectional approach adds overhead to each epoch and increases processing time, it proves to be a better progressive model over time.

Journal Article

Document Analysis and Recognition

Takahiro Watanabe

- 25 Mar 1999 -

IEICE Transactions on Information and Sy...

TL;DR: This paper addresses current topics about document image understanding from a technical point of view as a survey and proposes methods/approaches for recognition of various kinds of documents.

Journal Article

Parallel Architecture of Convolutional Bi-Directional LSTM Neural Networks for Network-Wide Metro Ridership Prediction

Xiaolei Ma, +4 more

- 01 Jun 2019 -

IEEE Transactions on Intelligent Transpo...

TL;DR: A parallel architecture comprising convolutional neural network (CNN) and bi-directional long short-term memory network (BLSTM) to extract spatial and temporal features, respectively, suitable for ridership prediction in large-scale metro networks is proposed.

Journal Article

A multi-objective approach towards cost effective isolated handwritten Bangla character and digit recognition

Ritesh Sarkhel, +3 more

- 01 Oct 2016 -

Pattern Recognition

TL;DR: A multi-objective region sampling methodology for isolated handwritten Bangla character and digit recognition is proposed, and an AFS theory based fuzzy logic is used to develop a model for combining the Pareto-optimal solutions from two multi-objective heuristic algorithms.

Journal Article

A multi-scale deep quad tree based feature extraction method for the recognition of isolated handwritten characters of popular indic scripts

Ritesh Sarkhel, +4 more

- 01 Nov 2017 -

Pattern Recognition

TL;DR: A non-explicit feature based approach, specifically a multi-column multi-scale convolutional neural network (MMCNN) based architecture, is proposed for this purpose, and a deep quad-tree based staggered prediction model is proposed for faster character recognition.

References

Proceedings Article

Framewise phoneme classification with bidirectional LSTM and other neural network architectures

Alex Graves, +1 more

TL;DR: A modified, full-gradient version of the LSTM learning algorithm is applied to framewise phoneme classification on the TIMIT database. The results support the view that contextual information is crucial to speech processing and suggest that bidirectional networks outperform unidirectional ones.

Journal Article

Online and off-line handwriting recognition: a comprehensive survey

Réjean Plamondon, +1 more

- 01 Jan 2000 -

IEEE Transactions on Pattern Analysis an...

TL;DR: The nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms are described.

Journal Article

2005 Special Issue: Framewise phoneme classification with bidirectional LSTM and other neural network architectures

Alex Graves, +1 more

- 01 Jun 2005 -

Neural Networks

TL;DR: A modified, full-gradient version of the LSTM learning algorithm is applied to framewise phoneme classification on the TIMIT database. The results support the view that contextual information is crucial to speech processing and suggest that bidirectional networks outperform unidirectional ones.

Book

Supervised Sequence Labelling with Recurrent Neural Networks

Alex Graves

TL;DR: A new type of output layer that allows recurrent networks to be trained directly for sequence labelling tasks where the alignment between the inputs and the labels is unknown, and an extension of the long short-term memory network architecture to multidimensional data, such as images and video sequences.

Journal Article

A Novel Connectionist System for Unconstrained Handwriting Recognition

Alex Graves, +5 more

- 01 May 2009 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies, significantly outperforming a state-of-the-art HMM-based system.

Related Papers (5)

Offline Printed Urdu Nastaleeq Script Recognition with Bidirectional LSTM Networks

Adnan Ul-Hasan, +4 more

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

Unconstrained On-line Handwriting Recognition with Recurrent Neural Networks

Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks