Improving Isolated Bangla Compound Character Recognition Through Feature-map Alignment

doi:10.1109/ICAPR.2017.8593008

Proceedings Article•DOI•

Pinaki Ranjan Sarkar, Deepak Mishra, Gorthi R.K.S.S. Manyam

01 Dec 2017-pp 205-209

TL;DR: This work shows that the performance of deep-learning-based optical character classifiers can be improved through feature-map alignment, using a spatial transformer network to align the feature maps of the convolutional neural network model proposed for the classification problem.

Abstract: Due to high variability in the writing styles of different individuals, non-centered and non-uniformly scaled optical characters are very difficult to recognize. Several techniques have been proposed to solve this recognition problem. In this work, we show that the performance of deep-learning-based optical character classifiers can be improved through feature-map alignment. We use a spatial transformer network to align the feature maps of the convolutional neural network model proposed for the classification problem. We demonstrate that, with the proposed framework, not only can the slightly transformed characters usually found in conventional datasets be classified with high accuracy, but characters that are highly non-uniform in scale can also be recognized with fairly high accuracy. We evaluate the proposed model on the CMATERdb 3.1.3 database, which consists of isolated Bangla handwritten compound characters; the model obtained 97.86% recognition accuracy on the original database and 96.34% when variously rotated data were used in training and testing.
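
To make "feature-map alignment" concrete, here is a minimal pure-Python sketch of the two steps a spatial transformer applies to a feature map: building a sampling grid from a 2x3 affine matrix, then resampling the map with bilinear interpolation. It is an illustration under stated assumptions, not the authors' implementation. The function names `affine_grid` and `bilinear_sample`, the 5x5 toy map, and the hand-picked matrix `shift` are all hypothetical; in a real spatial transformer the matrix would be predicted by a small localization network rather than fixed by hand.

```python
import math


def affine_grid(theta, height, width):
    """For every output pixel, compute the source location to sample from.

    Coordinates are normalized to [-1, 1]; theta is a 2x3 affine matrix.
    """
    grid = []
    for i in range(height):
        y_t = -1.0 + 2.0 * i / (height - 1)
        row = []
        for j in range(width):
            x_t = -1.0 + 2.0 * j / (width - 1)
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            row.append((x_s, y_s))
        grid.append(row)
    return grid


def bilinear_sample(fmap, grid):
    """Resample a 2D feature map at the grid locations (zeros outside the map)."""
    height, width = len(fmap), len(fmap[0])
    out = []
    for row in grid:
        out_row = []
        for x_s, y_s in row:
            # Convert normalized coordinates back to pixel coordinates.
            x = (x_s + 1.0) * (width - 1) / 2.0
            y = (y_s + 1.0) * (height - 1) / 2.0
            x0, y0 = math.floor(x), math.floor(y)
            value = 0.0
            # Blend the four neighbouring pixels by their distance to (x, y).
            for yy in (y0, y0 + 1):
                for xx in (x0, x0 + 1):
                    if 0 <= yy < height and 0 <= xx < width:
                        weight = (1.0 - abs(x - xx)) * (1.0 - abs(y - yy))
                        value += weight * fmap[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


# Toy feature map with one off-centre activation at row 2, column 3.
fmap = [[0.0] * 5 for _ in range(5)]
fmap[2][3] = 1.0

# Hypothetical, hand-picked transform: sample one pixel to the right,
# which moves the activation to the centre column of the output.
shift = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0]]
aligned = bilinear_sample(fmap, affine_grid(shift, 5, 5))
```

Because both steps are built from multiplications and additions, the resampling is differentiable with respect to the matrix entries, which is what lets such a module be trained together with the classifier.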

Citations

Journal Article•DOI•

A squeeze and excitation ResNeXt-based deep learning model for Bangla handwritten compound character recognition

Mohammad Meraj Khan¹, Mohammad Shorif Uddin², Mohammad Zavid Parvez¹, Lutfur Nahar²•Institutions (2)

BRAC University¹, Jahangirnagar University²

16 Feb 2021-Journal of King Saud University - Computer and Information Sciences

TL;DR: A deep CNN (Convolutional Neural Network) model based on SE-ResNeXt, which achieves an average accuracy of 99.82% in recognizing Bangla handwritten compound characters and outperforms the state-of-the-art models.

16 citations

Proceedings Article•DOI•

A Complete Bangla Optical Character Recognition System: An Effective Approach

Tasnim Ahmed¹, Md. Nishat Raihan¹, Rafsanjany Kushol², Sirajus Salekin³•Institutions (3)

Islamic University of Technology¹, University of Alberta², University of South Florida³

01 Dec 2019

TL;DR: A Convolutional Neural Network is used to build a model that identifies individual Bangla characters; the authors report a very impressive accuracy rate in image-to-text conversion and by far the highest accuracy rate in scripted character detection.

Abstract: Bangla character recognition is a significant field of research because Bangla is the most widely used language in the Indian subcontinent and the eighth most used language for written documents. Research on Bangla character recognition began in the mid-1980s. This paper describes a complete Optical Character Recognition (OCR) system for scripted Bangla characters and proposes a new recognition approach for them. We used a Convolutional Neural Network to build the model that identifies individual Bangla characters. Lines, words, and characters are segmented from an image based on empty space. We also used binarization, noise reduction and other techniques to improve accuracy. Finally, we achieved a very impressive accuracy rate in image-to-text conversion and by far the highest accuracy rate in scripted character detection.

8 citations

Additional excerpts

...Since then a few approaches has been made based on CNN for handwritten character recognition [12]–[14]....
[...]

Proceedings Article•DOI•

Bangla Compound Character Recognition by Combining Deep Convolutional Neural Network with Bidirectional Long Short-Term Memory

Md. Jahid Hasan¹, Md. Ferdous Wahid¹, Md. Shahin Alom¹•Institutions (1)

Hajee Mohammad Danesh Science & Technology University¹

01 Dec 2019

TL;DR: A novel approach to recognize Bangla handwritten compound characters by combining deep convolutional neural network with Bidirectional long short-term memory (CNN-BiLSTM) is proposed.

Abstract: Recognition of Bangla handwritten compound characters plays a significant role in developing a complete Bangla OCR. It is a challenging task owing to the high variation in individual writing styles and the structural resemblance between characters. This paper proposes a novel approach to recognize Bangla handwritten compound characters by combining a deep convolutional neural network with bidirectional long short-term memory (CNN-BiLSTM). The efficacy of the proposed model is evaluated on the test set of the compound character dataset CMATERdb 3.1.3.3, which consists of 171 distinct character classes. The model achieved 98.50% recognition accuracy, which is significantly better than current state-of-the-art techniques on this dataset.

6 citations

Cites background from "Improving Isolated Bangla Compound ..."

...[9] proposed a deep learning model with spatial transformer network (STN) for classification of handwritten Bangla compound character....
[...]

References

Journal Article•DOI•

Gradient-based learning applied to document recognition

Yann LeCun¹, Léon Bottou²,³, Yoshua Bengio³,⁴,⁵, Patrick Haffner³•Institutions (5)

Bell Labs¹, École Normale Supérieure², AT&T³, École Polytechnique de Montréal⁴, Alcatel-Lucent⁵

01 Jan 1998

TL;DR: This paper reviews gradient-based learning methods for handwritten character recognition, shows that convolutional neural networks outperform the other techniques on a standard handwritten digit recognition task, and introduces graph transformer networks (GTN), which allow multimodule document recognition systems to be trained globally.

Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations

Proceedings Article•

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe¹, Christian Szegedy¹•Institutions (1)

Google¹

06 Jul 2015

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
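
As a rough illustration of the per-mini-batch normalization this abstract describes, the sketch below normalizes each feature over a mini-batch and then applies a learned scale and shift. The names `batch_norm`, `gamma`, `beta` and `eps`, and the toy batch, are illustrative assumptions rather than anything taken from the text above. Only the training-time computation is shown; at inference time, implementations typically substitute running averages for the batch statistics.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.

    batch: list of samples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learned in a real network).
    """
    n = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(n)]
    for k in range(num_features):
        column = [sample[k] for sample in batch]
        mean = sum(column) / n
        # Biased (divide-by-n) variance of this feature over the mini-batch.
        var = sum((v - mean) ** 2 for v in column) / n
        inv_std = 1.0 / math.sqrt(var + eps)
        for i in range(n):
            out[i][k] = gamma[k] * (column[i] - mean) * inv_std + beta[k]
    return out


# Two features on very different scales end up on a comparable scale.
toy_batch = [[1.0, 10.0], [3.0, 30.0]]
normalized = batch_norm(toy_batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

The small constant `eps` only guards against division by zero when a feature has no variance within the mini-batch.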

30,843 citations

"Improving Isolated Bangla Compound ..." refers methods in this paper

...Overfitting during training was avoided using Batch-normalization [9] and Dropout layers....
[...]

Posted Content•

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe¹, Christian Szegedy¹•Institutions (1)

Google¹

11 Feb 2015-arXiv: Learning

TL;DR: Batch Normalization normalizes layer inputs for each training mini-batch to reduce internal covariate shift in deep neural networks, and an ensemble of batch-normalized networks improves upon the best published result on ImageNet classification.

Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization. It also acts as a regularizer, in some cases eliminating the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.9% top-5 validation error (and 4.8% test error), exceeding the accuracy of human raters.

17,184 citations

Proceedings Article•

Spatial transformer networks

Max Jaderberg¹, Karen Simonyan¹, Andrew Zisserman¹, Koray Kavukcuoglu¹•Institutions (1)

Google¹

07 Dec 2015

TL;DR: This work introduces a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network, and can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps.

Abstract: Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally and parameter efficient manner. In this work we introduce a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network. This differentiable module can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps, conditional on the feature map itself, without any extra training supervision or modification to the optimisation process. We show that the use of spatial transformers results in models which learn invariance to translation, scale, rotation and more generic warping, resulting in state-of-the-art performance on several benchmarks, and for a number of classes of transformations.
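
For the common affine case, the spatial manipulation mentioned in this abstract can be written as two equations. The notation here is an assumption added for illustration and does not appear in the text above: $(x_i^t, y_i^t)$ is a target location in the output feature map $V$, $(x_i^s, y_i^s)$ is the matching source location in the input feature map $U$ of height $H$ and width $W$, $c$ indexes channels, and $A_\theta$ is the affine matrix produced by the localization network.

```latex
% Grid generator: each target location is mapped to a source location.
\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix}
  = A_\theta \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix}
  = \begin{pmatrix}
      \theta_{11} & \theta_{12} & \theta_{13} \\
      \theta_{21} & \theta_{22} & \theta_{23}
    \end{pmatrix}
    \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix}

% Bilinear sampler: the output is a distance-weighted blend of input pixels.
V_i^c = \sum_{n=1}^{H} \sum_{m=1}^{W} U_{nm}^c \,
        \max\bigl(0,\, 1 - \lvert x_i^s - m \rvert\bigr)\,
        \max\bigl(0,\, 1 - \lvert y_i^s - n \rvert\bigr)
```

Both expressions are (sub-)differentiable in $\theta$ and in $U$, which is why the module can be trained by ordinary backpropagation without extra supervision.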

6,150 citations

"Improving Isolated Bangla Compound ..." refers background or methods in this paper

...In literature, the best of way of learning optimal feature-map alignment is using spatial transformer network (STN) [2]....
[...]
...With the intuition that spatial transformer network [2] can optimally align the features in feature-maps, we used the STN in our present work....
[...]
...To solve this problem Google’s DeepMind team came up with a breakthrough solution, Spatial Transformer Network (STN) [2]....
[...]

Journal Article•DOI•

Handwritten isolated Bangla compound character recognition: A new benchmark using a novel deep learning approach

Saikat Roy¹, Nibaran Das¹, Mahantapas Kundu¹, Mita Nasipuri¹•Institutions (1)

Jadavpur University¹

15 Apr 2017-Pattern Recognition Letters

TL;DR: A novel deep learning technique for the recognition of handwritten isolated Bangla compound characters is presented, and a new benchmark of recognition accuracy on the CMATERdb 3.1.3.3 dataset is reported.

113 citations

"Improving Isolated Bangla Compound ..." refers background or methods in this paper

...[6] Greedy layer based CNN original 90....
[...]
...Layer wise training to deep convolutional networks in a supervised fashion is done in [6] and the reported recognition accuracy in CMATERdb 3....
[...]
...Only [6], [7] have used high-level features extracted from CNN to solve this problem....
[...]