scispace - formally typeset
Search or ask a question
Proceedings ArticleDOI

Improving Isolated Bangla Compound Character Recognition Through Feature-map Alignment

TL;DR: This work highlights that the performance of optical character classifiers which are based on the deep learning framework can be improved through feature-map alignment through spatial transformer network to align the feature maps of a convolutional neural network model which is proposed for the classification problem.
Abstract: Due to high variability in writing style of different individuals, non-centered and non-uniformly scaled optical characters are very difficult to recognize. Several techniques are proposed in-order to solve the recognition problem. In this work, we highlight that the performance of optical character classifiers which are based on the deep learning framework can be improved through feature-map alignment. Here, we have used spatial transformer network to align the feature maps of a convolutional neural network model which is proposed for the classification problem. We demonstrate that with the proposed framework not only the slight transformed versions which are usually considered in the conventional datasets can be classified with high accuracy, but also highly non-uniform in scale characters can also be fairly recognized with quite higher accuracy. We evaluate our proposed model on CMATERdb 3.1.3 database which consists of isolated Bangla handwritten compound characters and our model obtained 97.86 % recognition accuracy in the original database and 96.34 % on various rotated data in training and testing.
Citations
More filters
Journal ArticleDOI
TL;DR: A deep CNN (Convolutional Neural Network) model using the SE-ResNeXt, which exhibits an average accuracy of 99.82% in recognizing Bangla handwritten compound characters and outperforms the state-of-the-art models by demonstrating higher results.

16 citations

Proceedings ArticleDOI
01 Dec 2019
TL;DR: A Convolutional Neural Network is used to build the model which identifies individual Bangla characters, and this model has been able to achieve a very impressive accuracy rate in the image to text conversion and by far the highest accuracy rates in scripted character detection.
Abstract: Bangla character recognition is a significant field of research because Bangla is the most widely used language in the Indian subcontinent and eighth most used language for the written document. Research on Bangla character recognition has been started since the mid-1980s. This report describes a complete Optical Character Recognition (OCR) system for scripted Bangla characters. This paper proposes a new recognition system for scripted Bangla characters. We used a Convolutional Neural Network to build the model which identifies individual Bangla characters. From an image, the lines, words, and characters are segmented based on empty space. We also used binarization, noise reduction and other techniques to improve accuracy. Finally, we have been able to achieve a very impressive accuracy rate in the image to text conversion and by far the highest accuracy rate in scripted character detection.

8 citations


Additional excerpts

  • ...Since then a few approaches has been made based on CNN for handwritten character recognition [12]–[14]....

    [...]

Proceedings ArticleDOI
01 Dec 2019
TL;DR: A novel approach to recognize Bangla handwritten compound characters by combining deep convolutional neural network with Bidirectional long short-term memory (CNN-BiLSTM) is proposed.
Abstract: Recognition of Bangla handwritten compound characters has a significant role in Bangla language in order to develop a complete Bangla OCR. It is a challenging task owing to its high variation in individual writing style and structural resemblance between characters. This paper proposed a novel approach to recognize Bangla handwritten compound characters by combining deep convolutional neural network with Bidirectional long short-term memory (CNN-BiLSTM). The efficacy of the proposed model is evaluated on test set of compound character dataset CMATERdb 3.1.3.3, which consists of 171 distinct character classes. The model has gained 98.50% recognition accuracy which is significantly better than current state-of-the-art techniques on this dataset.

6 citations


Cites background from "Improving Isolated Bangla Compound ..."

  • ...[9] proposed a deep learning model with spatial transformer network (STN) for classification of handwritten Bangla compound character....

    [...]

References
More filters
Journal ArticleDOI
01 Jan 1998
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations

Proceedings Article
Sergey Ioffe1, Christian Szegedy1
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.

30,843 citations


"Improving Isolated Bangla Compound ..." refers methods in this paper

  • ...Overfitting during training was avoided using Batch-normalization [9] and Dropout layers....

    [...]

Posted Content
Sergey Ioffe1, Christian Szegedy1
TL;DR: Batch Normalization as mentioned in this paper normalizes layer inputs for each training mini-batch to reduce the internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization. It also acts as a regularizer, in some cases eliminating the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.9% top-5 validation error (and 4.8% test error), exceeding the accuracy of human raters.

17,184 citations

Proceedings Article
07 Dec 2015
TL;DR: This work introduces a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network, and can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps.
Abstract: Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally and parameter efficient manner. In this work we introduce a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network. This differentiable module can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps, conditional on the feature map itself, without any extra training supervision or modification to the optimisation process. We show that the use of spatial transformers results in models which learn invariance to translation, scale, rotation and more generic warping, resulting in state-of-the-art performance on several benchmarks, and for a number of classes of transformations.

6,150 citations


"Improving Isolated Bangla Compound ..." refers background or methods in this paper

  • ...In literature, the best of way of learning optimal feature-map alignment is using spatial transformer network (STN) [2]....

    [...]

  • ...With the intuition that spatial transformer network [2] can optimally align the features in feature-maps, we used the STN in our present work....

    [...]

  • ...To solve this problem Google’s DeepMind team came up with a breakthrough solution, Spatial Transformer Network (STN) [2]....

    [...]

Journal ArticleDOI
TL;DR: A novel deep learning technique for the recognition of handwritten Bangla isolated compound character is presented and a new benchmark of recognition accuracy on the CMATERdb 3.3.1.3 dataset is reported.

113 citations


"Improving Isolated Bangla Compound ..." refers background or methods in this paper

  • ...[6] Greedy layer based CNN original 90....

    [...]

  • ...Layer wise training to deep convolutional networks in a supervised fashion is done in [6] and the reported recognition accuracy in CMATERdb 3....

    [...]

  • ...Only [6], [7] have used high-level features extracted from CNN to solve this problem....

    [...]

Trending Questions (1)
How can optical character recognition be improved for use with gender-diverse datasets?

The provided paper does not mention anything about improving optical character recognition for use with gender-diverse datasets.