Text recognition using deep BLSTM networks

doi:10.1109/ICAPR.2015.7050699

Proceedings Article•DOI•

Anupama Ray¹, Sai Rajeswar¹, Santanu Chaudhury¹•Institutions (1)

Indian Institute of Technology Delhi¹

01 Jan 2015, pp. 1-6

TL;DR: A Deep Bidirectional Long Short Term Memory (LSTM) based Recurrent Neural Network architecture for text recognition that uses Connectionist Temporal Classification (CTC) for training to learn the labels of an unsegmented sequence with unknown alignment.

Abstract: This paper presents a Deep Bidirectional Long Short Term Memory (LSTM) based Recurrent Neural Network architecture for text recognition. This architecture uses Connectionist Temporal Classification (CTC) for training to learn the labels of an unsegmented sequence with unknown alignment. This work is motivated by the results of Deep Neural Networks for isolated numeral recognition and improved speech recognition using Deep BLSTM based approaches. The Deep BLSTM architecture is chosen for its ability to access long-range context, learn sequence alignment, and work without segmented data. Because CTC and the forward-backward algorithm are used to align the output labels, there are no Unicode re-ordering issues, and thus no need for a lexicon or postprocessing schemes. This is a script-independent and segmentation-free approach. The system has been implemented for the recognition of unsegmented words of printed Oriya text, where it achieves a 4.18% character-level error rate and a 12.11% word error rate.
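
To illustrate how CTC lets a network label an unsegmented sequence, the following is a minimal Python sketch of greedy (best-path) CTC decoding: repeated per-frame labels are merged, then the blank label is dropped. The blank index of 0 and the function name ctc_greedy_decode are assumptions made for this example; the abstract does not state which decoding scheme the authors used.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (not stated in the paper)


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse a per-frame label path into an output label sequence."""
    # Step 1: merge each run of identical consecutive labels into one label.
    merged = [label for label, _ in groupby(frame_labels)]
    # Step 2: drop blanks; a blank between two equal labels keeps them distinct.
    return [label for label in merged if label != blank]


# Hypothetical per-frame argmax output for one word image.
path = [0, 3, 3, 0, 3, 5, 5, 0]
print(ctc_greedy_decode(path))  # [3, 3, 5]
```

Because the blank separates the two runs of label 3, the decoded sequence keeps a genuine double character without any explicit character segmentation of the input.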

Citations

Proceedings Article•DOI•

Botnet Detection in the Internet of Things using Deep Learning Approaches

Christopher D. McDermott¹, Farzan Majdani¹, Andrei Petrovski¹•Institutions (1)

Robert Gordon University¹

08 Jul 2018

TL;DR: The paper demonstrates that although the bidirectional approach adds overhead to each epoch and increases processing time, it proves to be a better progressive model over time.

Abstract: The recent growth of the Internet of Things (IoT) has resulted in a rise in IoT based DDoS attacks. This paper presents a solution to the detection of botnet activity within consumer IoT devices and networks. A novel application of Deep Learning is used to develop a detection model based on a Bidirectional Long Short Term Memory based Recurrent Neural Network (BLSTM-RNN). Word Embedding is used for text recognition and conversion of attack packets into tokenised integer format. The developed BLSTM-RNN detection model is compared to a LSTM-RNN for detecting four attack vectors used by the mirai botnet, and evaluated for accuracy and loss. The paper demonstrates that although the bidirectional approach adds overhead to each epoch and increases processing time, it proves to be a better progressive model over time. A labelled dataset was generated as part of this research, and is available upon request.

230 citations

Cites methods from "Text recognition using deep BLSTM n..."

...However [18] demonstrated that a Deep Bidirectional Long Short Term Memory based RNN (BLSTM-RNN) can be used which provides promising results for text recognition....
[...]

Journal Article•

Document Analysis and Recognition

Takahiro Watanabe

25 Mar 1999-IEICE Transactions on Information and Systems

TL;DR: This paper addresses current topics about document image understanding from a technical point of view as a survey and proposes methods/approaches for recognition of various kinds of documents.

Abstract: The subject about document image understanding is to extract and classify individual data meaningfully from paper-based documents. Until today, many methods/approaches have been proposed with regard to recognition of various kinds of documents, various technical problems for extensions of OCR, and requirements for practical usages. Of course, though the technical research issues in the early stage are looked upon as complementary attacks for the traditional OCR which is dependent on character recognition techniques, the application ranges or related issues are widely investigated or should be established progressively. This paper addresses current topics about document image understanding from a technical point of view as a survey.

Key words: document model, top-down, bottom-up, layout structure, logical structure, document types, layout recognition

222 citations

Cites methods from "Text recognition using deep BLSTM n..."

...[26,27], etc, that have achieved higher accuracies than the presently proposed method....
[...]

Journal Article•DOI•

Parallel Architecture of Convolutional Bi-Directional LSTM Neural Networks for Network-Wide Metro Ridership Prediction

Xiaolei Ma¹, Jiyu Zhang¹, Bowen Du¹, Chuan Ding¹, Leilei Sun²•Institutions (2)

Beihang University¹, Tsinghua University²

01 Jun 2019-IEEE Transactions on Intelligent Transportation Systems

TL;DR: A parallel architecture comprising convolutional neural network (CNN) and bi-directional long short-term memory network (BLSTM) to extract spatial and temporal features, respectively, suitable for ridership prediction in large-scale metro networks is proposed.

Abstract: Accurate metro ridership prediction can guide passengers in efficiently selecting their departure time and transferring from station to station. An increasing number of deep learning algorithms are being utilized to forecast metro ridership due to the development of computational intelligence. However, limited efforts have been exerted to consider spatiotemporal features, which are important in forecasting ridership through deep learning methods, in large-scale metro networks. To fill this gap, this paper proposes a parallel architecture comprising convolutional neural network (CNN) and bi-directional long short-term memory network (BLSTM) to extract spatial and temporal features, respectively. Metro ridership data are transformed into ridership images and time series. Spatial features can be learned from ridership image data by using CNN, which demonstrates favorable performance in video detection. Time series data are input into the BLSTM which considers the historical and future impacts of ridership in temporal feature extraction. The two networks are concatenated in parallel and prevented from interfering with each other. Joint spatiotemporal features are fed into a fully connected network for metro ridership prediction. The Beijing metro network is used to demonstrate the efficiency of the proposed algorithm. The proposed model outperforms traditional statistical models, deep learning architectures, and sequential structures, and is suitable for ridership prediction in large-scale metro networks. Metro authorities can thus effectively allocate limited resources to overcrowded areas for service improvement.

117 citations

Cites methods from "Text recognition using deep BLSTM n..."

...This model consists of a forward and backward LSTM to extract temporal features in two directions; hence, it can effectively capture the periodicity and regularity of ridership data [33]....
[...]

Journal Article•DOI•

A multi-objective approach towards cost effective isolated handwritten Bangla character and digit recognition

Ritesh Sarkhel¹, Nibaran Das¹, Amit Saha¹, Mita Nasipuri¹•Institutions (1)

Jadavpur University¹

01 Oct 2016-Pattern Recognition

TL;DR: A multi-objective region sampling methodology for isolated handwritten Bangla character and digit recognition is proposed, and an AFS theory based fuzzy logic is utilized to develop a model for combining the Pareto-optimal solutions from two multi-objective heuristic algorithms.

99 citations

Journal Article•DOI•

A multi-scale deep quad tree based feature extraction method for the recognition of isolated handwritten characters of popular indic scripts

Ritesh Sarkhel¹, Nibaran Das¹, Aritra Das¹, Mahantapas Kundu¹, Mita Nasipuri¹•Institutions (1)

Jadavpur University¹

01 Nov 2017-Pattern Recognition

TL;DR: In the present work, a non-explicit feature based approach, more specifically, a multi-column multi-scale convolutional neural network (MMCNN) based architecture has been proposed for this purpose and a deep quad-tree based staggered prediction model has been proposed for faster character recognition.

88 citations

References

Proceedings Article•DOI•

Automatic recognition of printed Oriya script

Bidyut B. Chaudhuri¹, Umapada Pal¹, Mandar Mitra¹•Institutions (1)

Indian Statistical Institute¹

10 Sep 2001

TL;DR: The paper deals with an optical character recognition system for printed Oriya, a popular Indian script, that achieves 96.3% character level accuracy on average.

Abstract: The paper deals with an optical character recognition system for printed Oriya, a popular Indian script. The development of OCR for this script is difficult because a large number of characters have to be recognized. In the proposed system, the digitized document image is first passed through preprocessing modules like skew correction, line segmentation, zone detection, word and character segmentation, etc. These modules have been developed by combining some conventional techniques with some newly proposed ones. Next, individual characters are recognized using a combination of stroke and run-number based features, along with features obtained from the concept of a water reservoir. The feature detection methods are simple and robust. A prototype of the system has been tested on a variety of printed Oriya material, and currently achieves 96.3% character level accuracy on average.

105 citations

"Text recognition using deep BLSTM n..." refers methods in this paper

...In this work we used a bidirectional hierarchically sub-sampled RNN with Long Short Term Memory (LSTM) architecture for the recognition of printed Oriya text....
[...]
...Recognition of Oriya characters is very challenging due to the presence of large number of classes and highly similar shapes of basic characters....
[...]
...Oriya script is used due to the unavailability of OCR for this script and due to the challenges involved such as the huge number of classes and shape complexities of the script....
[...]
...Chaudhuri and Pal reported a character recognition accuracy of 96.3% on printed Oriya dataset of basic characters[32] using a segmentation based, script dependent approach....
[...]
...This approach has been tested on word images of printed Oriya Script....
[...]

Proceedings Article•DOI•

HMM-Based Online Handwriting Recognition System for Telugu Symbols

V.J. Babu¹, L. Prasanth¹, R. Raghunath Sharma¹, A. Bharath²•Institutions (2)

Sri Sathya Sai University¹, Hewlett-Packard²

23 Sep 2007

TL;DR: An online handwritten symbol recognition system for Telugu, a widely spoken language in India, is presented based on hidden Markov models (HMM) and uses a combination of time-domain and frequency-domain features.

Abstract: In this paper we present an online handwritten symbol recognition system for Telugu, a widely spoken language in India. The system is based on hidden Markov models (HMM) and uses a combination of time-domain and frequency-domain features. The system gives top-1 accuracy of 91.6% and top-5 accuracy of 98.7% on a dataset containing 29,158 train samples and 9,235 test samples. We also introduce a cost-effective and natural data collection procedure based on ACECAD® Digimemo® and describe its usage in building a Telugu handwriting dataset.

96 citations

Book Chapter•DOI•

Biologically Plausible Speech Recognition with LSTM Neural Nets

Alex Graves, Douglas Eck, Nicole Beringer, Jürgen Schmidhuber

29 Jan 2004-Lecture Notes in Computer Science

TL;DR: It is concluded that LSTM should be further investigated as a biologically plausible basis for a bottom-up, neural net-based approach to speech recognition.

Abstract: Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) are local in space and time and closely related to a biological model of memory in the prefrontal cortex. Not only are they more biologically plausible than previous artificial RNNs, they also outperformed them on many artificially generated sequential processing tasks. This encouraged us to apply LSTM to more realistic problems, such as the recognition of spoken digits. Without any modification of the underlying algorithm, we achieved results comparable to state-of-the-art Hidden Markov Model (HMM) based recognisers on both the TIDIGITS and TI46 speech corpora. We conclude that LSTM should be further investigated as a biologically plausible basis for a bottom-up, neural net-based approach to speech recognition.

93 citations

"Text recognition using deep BLSTM n..." refers background in this paper

...Long Short Term Memory based Recurrent Neural network architecture has been widely used for speech recognition [7], [8], text recognition [9], social signal prediction [10], emotion recognition [11] and time series prediction problems since it has the ability of sequence learning....
[...]

Book•DOI•

Guide to OCR for Indic Scripts

Venu Govindaraju, Srirangaraj Setlur

01 Jan 2010

TL;DR: This book provides an overview of the current state-of-the-art in the OCR of the different Indic scripts as well as other issues in the creation of accessible digital libraries for Indic scripts.

Abstract: Research on OCR of Indian scripts is gaining momentum in recent times. Many projects funded by government and industry are currently underway to scan hundreds of thousands of Indic-script documents and manuscripts to create large digital library archives to preserve these treasures for posterity. OCR is a key enabling technology for making these archives practically accessible to researchers and lay users alike by creating searchable indexes and machine-readable text repositories of these documents. This book provides an overview of the current state-of-the-art in the OCR of the different Indic scripts as well as other issues in the creation of accessible digital libraries for Indic scripts. It provides a good technical overview of the latest research in the field.

68 citations

"Text recognition using deep BLSTM n..." refers methods in this paper

...Most of these algorithms used are segmentation based and script dependent [19]....
[...]

Proceedings Article•

Recognition of printed Devanagari text using BLSTM Neural Network

Naveen Sankaran¹, C. V. Jawahar¹•Institutions (1)

International Institute of Information Technology, Hyderabad¹

01 Nov 2012

TL;DR: This paper proposes a recognition scheme for the Indian script of Devanagari using a Recurrent Neural Network known as Bidirectional Long Short Term Memory (BLSTM) and reports a reduction of more than 20% in word error rate and over 9% reduction in character error rate while comparing with the best available OCR system.

Abstract: In this paper, we propose a recognition scheme for the Indian script of Devanagari. Recognition accuracy of Devanagari script is not yet comparable to its Roman counterparts. This is mainly due to the complexity of the script, writing style, etc. Our solution uses a Recurrent Neural Network known as Bidirectional Long Short Term Memory (BLSTM). Our approach does not require word-to-character segmentation, which is one of the most common reasons for high word error rate. We report a reduction of more than 20% in word error rate and over 9% reduction in character error rate while comparing with the best available OCR system.
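
Character and word error rates like those quoted above are commonly computed from the Levenshtein edit distance between the recognised text and the ground truth. The Python sketch below shows one such computation. The helper names edit_distance and error_rate are hypothetical, and normalising by total reference length is an assumption; the abstract does not spell out the exact definition the authors used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps):
    """Total edits divided by total reference length (assumed definition)."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return edits / sum(len(r) for r in refs)


# Character error rate: compare strings character by character.
cer = error_rate(["kitten"], ["sitting"])
# Word error rate: compare lists of word tokens instead.
wer = error_rate([["a", "b"]], [["a", "c"]])
print(cer, wer)  # 0.5 0.5
```

Passing strings gives a character-level rate, while passing token lists gives a word-level rate from the same routine.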

64 citations

"Text recognition using deep BLSTM n..." refers methods in this paper

...Naveen et al presented a direct implementation of single layer LSTM network for the recognition of Devanagiri scripts [25], [26] and further experimented on more Indic scripts [27]....
[...]