Journal Article

Handwritten Numeral Databases of Indian Scripts and Multistage Recognition of Mixed Numerals

01 Mar 2009-IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society)-Vol. 31, Iss: 3, pp 444-457
TL;DR: Pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and its application to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English.
Abstract: This article primarily concerns the problem of isolated handwritten numeral recognition of major Indian scripts. The principal contributions presented here are (a) pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, (b) a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and (c) application of (b) to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English. The present databases include, respectively, 22,556 and 23,392 handwritten isolated numeral samples of Devanagari and Bangla collected from real-life situations, and these can be made available free of cost to researchers of other academic institutions. In the proposed scheme, a numeral is subjected to three multilayer perceptron classifiers corresponding to three coarse-to-fine resolution levels in a cascaded manner. If rejection occurs even at the highest resolution, another multilayer perceptron is used as the final attempt to recognize the input numeral by combining the outputs of the three classifiers of the previous stages. This scheme has been extended to the situation when the script of a document is not known a priori or the numerals written on a document belong to different scripts. Handwritten numerals in mixed scripts are frequently found in Indian postal mail and table-form documents.
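The cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the rejection rule (top score below a fixed threshold), the threshold value, the function names, and the toy stand-in classifiers are all assumptions made for illustration, and the final stage is assumed to always return a label.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Scores = List[float]
Classifier = Callable[[Sequence[float]], Scores]


def decide(scores: Scores, threshold: float) -> Optional[int]:
    """Return the top-scoring class, or None (reject) if its score is below threshold."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None


def cascade_classify(
    views: Sequence[Sequence[float]],
    stage_classifiers: Sequence[Classifier],
    combiner: Classifier,
    threshold: float = 0.9,  # hypothetical rejection threshold
) -> Tuple[int, int]:
    """Classify one numeral given its coarse-to-fine feature views.

    Returns (label, stage), where stage is the 1-based stage that decided.
    """
    all_scores: List[Scores] = []
    for view, classifier in zip(views, stage_classifiers):
        scores = classifier(view)
        all_scores.append(scores)
        label = decide(scores, threshold)
        if label is not None:
            return label, len(all_scores)
    # Rejected at every resolution: a final classifier sees all stage outputs.
    combined = [s for scores in all_scores for s in scores]
    final = combiner(combined)
    return max(range(len(final)), key=final.__getitem__), len(all_scores) + 1


# Toy stand-ins for trained networks (hypothetical, two classes only).
coarse = lambda view: [0.55, 0.45]   # ambiguous, so rejected
medium = lambda view: [0.05, 0.95]   # confident, so accepted
fine = lambda view: [0.50, 0.50]     # ambiguous, so rejected
combine = lambda s: [sum(s[0::2]), sum(s[1::2])]  # sums per-class scores

views = [[0.0], [0.0, 0.0], [0.0, 0.0, 0.0]]
accepted = cascade_classify(views, [coarse, medium, fine], combine)
fallback = cascade_classify(views, [coarse, coarse, fine], combine)
```

In this toy run, `accepted` is decided at the second stage, while `fallback` passes through all three stages and is labelled by the combining stage. The early exit is what makes the cascade cheap for easy samples: the finer representations are only consulted when the coarser ones are not confident.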
Citations
Proceedings Article
01 Dec 2011
TL;DR: This paper proposes an unconstrained handwritten Kannada character recognition method based on the ridgelet transform, a powerful instrument for catching and representing mono-dimensional singularities in bidimensional space.
Abstract: Handwritten character recognition is a difficult problem due to the great variation in writing styles and the different sizes and orientation angles of the characters. In this paper, we propose an unconstrained handwritten Kannada character recognition method based on the ridgelet transform. Ridgelets are a powerful instrument for catching and representing mono-dimensional singularities in bidimensional space [7]. The ridgelet transform is used to extract the low-pass energy of the character image, which is then fed to PCA for feature extraction. We conducted experiments on a very large database of handwritten Kannada characters. The size of the class was 200 and encouraging results were obtained.

5 citations


Cites background from "Handwritten Numeral Databases of In..."

  • ...These features are used to train the SVM. Bhattacharya et al [16], proposed handwritten numeral databases of Indian scripts and multistage recognition of mixed numerals....

    [...]

  • ...However, some preliminary work has also been done on Indian scripts [4, 16, 11]....

    [...]

Proceedings Article
16 Dec 2012
TL;DR: A review of the progress in the field of handwritten character recognition applied to the Indian languages, with a special emphasis on palm leaf character recognition (PLCR) techniques, is given in this paper.
Abstract: This paper briefly reviews the progress in the field of handwritten character recognition (HWCR) applied to the Indian languages with a special emphasis on the palm leaf character recognition (PLCR) techniques. The various methodologies and techniques for character recognition (CR) have been discussed in the paper. HWCR applied to historical documents like palm leaves and old handwritten manuscripts is much more challenging due to the limited progress in this area. These documents containing texts and treaties on a host of subjects are of both national and historical importance. Characters on the palm leaf have additional properties like depth, an added feature which can be gainfully exploited during Palm Leaf Character Recognition (PLCR). The unique method of data collection initiated with isolated Telugu characters from palm leaf manuscripts, and the building of the palm leaf character database, are described in this paper. A comparative analysis of the results for PLCR obtained by various techniques is also presented.

5 citations

Proceedings Article
01 Dec 2016
TL;DR: The proposed SAEPT method encodes handwritten numerals into printed form during pre-training and then initializes a multi-layer perceptron (MLP) with the pre-trained weights; it is shown to outperform other prominent existing methods, achieving satisfactory recognition accuracy.
Abstract: Recognition of handwritten numerals has gained much interest in recent years due to its various application potentials. Bangla is a major language in the Indian subcontinent and is the first language of Bangladesh; but unfortunately, studies on handwritten Bangla numeral recognition (HBNR) are very few compared with other major languages such as English, Roman etc. Some noteworthy research works have been conducted on the recognition of Bangla handwritten numerals using artificial neural networks (ANN), as ANN and its various updated models are found to be efficient for classification tasks. The aim of this study is to develop a better HBNR system, and hence it investigates a deep architecture, the stacked auto encoder (SAE) incorporating printed text (SAEPT) method. SAE is a variant of neural networks (NNs) and is applied efficiently for hierarchical feature extraction from its input. The proposed SAEPT contains the encoding of handwritten numerals into printed form in the course of pre-training and finally initializes a multi-layer perceptron (MLP) using these pre-trained weights. Unlike other methods, it does not employ any feature extraction technique. A benchmark dataset with 22000 handwritten numerals with different shapes, sizes and variations is used in this study. The proposed method is shown to outperform other prominent existing methods, achieving satisfactory recognition accuracy.

5 citations


Cites methods from "Handwritten Numeral Databases of In..."

  • ...Basu et al. [3] used Dempster-Shafer (DS) technique where they combined the classification decisions of two MLP based classifiers for handwritten Bangla numeral using two different feature sets....

    [...]

01 Jan 2012
TL;DR: The unique method of data collection initiated with isolated Telugu characters from palm leaf manuscripts, and the building of the palm leaf character database is described in this paper.
Abstract: This paper briefly reviews the progress in the field of handwritten character recognition (HWCR) applied to the Indian languages with a special emphasis on the palm leaf character recognition (PLCR) techniques. The various methodologies and techniques for character recognition (CR) have been discussed in the paper. HWCR applied to historical documents like palm leaves and old handwritten manuscripts is much more challenging due to the limited progress in this area. These documents containing texts and treaties on a host of subjects are of both national and historical importance. Characters on the palm leaf have additional properties like depth, an added feature which can be gainfully exploited during Palm Leaf Character Recognition (PLCR). The unique method of data collection initiated with isolated Telugu characters from palm leaf manuscripts, and the building of the palm leaf character database, are described in this paper. A comparative analysis of the results for PLCR obtained by various techniques is also presented.

5 citations


Cites background or result from "Handwritten Numeral Databases of In..."

  • ...search lead to the development of many standard databases such as NIST, MNIST, CEDAR, and CENPARMI available for Latin numerals [6] and English....

    [...]

  • ...Small databases collected in laboratory environments have been reported in previous studies [6]....

    [...]

  • ...U. Pal and B.B. Chaudhuri [22] have reported that OCR systems available in market are for Roman, Chinese, Arabic and Japanese characters....

    [...]

  • ...Bhattacharya and Chaudhuri [6] observes that a major obstacle to research on handwritten character recognition of Indian Scripts is the nonexistence of standard benchmark databases....

    [...]

Proceedings Article
01 Sep 2017
TL;DR: The present work proposes a deep model and an image augmentation method for classifying handwritten Bangla numerals; the model achieved a testing accuracy of 99.42% on its 74th iteration and was also used for cross validation on two different and independently collected Bangla benchmark datasets.
Abstract: In recent times, Convolutional Neural Networks (CNN) have proven their ability to classify a complex dataset like that of handwritten numerals. Though CNNs do not require any handcrafted feature extraction method to learn useful features, they may need a large number of sample images to train accurately. Typically, different augmentation methods are used to enlarge the training datasets in order to learn more useful features as well as to stop overtraining. The present work proposes a deep model and an image augmentation method for classifying handwritten Bangla numerals. Moreover, the proposed deep model was trained with an augmented and enlarged training set and then tested with a collection of 3996 image samples. The proposed model with the proposed augmentation method achieved a testing accuracy of 99.42% on its 74th iteration. In addition, the best weights from that trained model were also used for cross validation on two different and independently collected Bangla benchmark datasets. The cross validation results on those particular datasets are 99.53% and 95.56% respectively.

5 citations


Cites background from "Handwritten Numeral Databases of In..."

  • ...Work and Year Method/ Network Type Reported Result (%) [9], 2009 Multilayer Perceptron 98....

    [...]

  • ...In the past decade, some notable works have been done on Bangla numeral recognition, where different groups of researchers came up with different ideas[9], [5], [6], [7]....

    [...]

  • ...93% on the same dataset used by the [9]....

    [...]

References
Journal Article
01 Jan 1998
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations

Book
16 Jul 1998
TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.
Abstract: From the Publisher: This book represents the most comprehensive treatment available of neural networks from an engineering perspective. Thorough, well-organized, and completely up to date, it examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks. Written in a concise and fluid manner, by a foremost engineering textbook author, to make the material more accessible, this book is ideal for professional engineers and graduate students entering this exciting field. Computer experiments, problems, worked examples, a bibliography, photographs, and illustrations reinforce key concepts.

29,130 citations

Journal Article
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(R^n)$, the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(R^n)$, the vector space of measurable, square-integrable n-dimensional functions. In $L^2(R)$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.
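As a rough illustration of the pyramidal algorithm mentioned in this abstract, the following Python sketch decomposes a one-dimensional signal into a coarse approximation plus per-level details. It assumes the Haar wavelet, chosen only because it is the simplest orthonormal case (the abstract does not prescribe a particular wavelet), and the function names are hypothetical. The signal length is assumed to be divisible by 2 at every level.

```python
import math
from typing import List, Sequence, Tuple

SCALE = 1.0 / math.sqrt(2.0)  # normalisation that keeps the Haar basis orthonormal


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One pyramid level: half-resolution approximation and the detail lost by it."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_pyramid(signal: Sequence[float], levels: int) -> Tuple[List[float], List[List[float]]]:
    """Apply haar_step repeatedly; details are ordered from finest to coarsest."""
    approx = list(signal)
    details: List[List[float]] = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx: Sequence[float], details: Sequence[Sequence[float]]) -> List[float]:
    """Invert haar_pyramid by adding details back, coarsest level first."""
    current = list(approx)
    for detail in reversed(details):
        restored: List[float] = []
        for a, d in zip(current, detail):
            restored.extend([(a + d) * SCALE, (a - d) * SCALE])
        current = restored
    return current


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coarse, details = haar_pyramid(signal, levels=3)
rebuilt = haar_reconstruct(coarse, details)
```

Each level halves the resolution, and the detail coefficients hold exactly the information that separates one resolution from the next, so `rebuilt` matches `signal` up to floating-point error. Because the basis is orthonormal, the sum of squares of all coefficients equals that of the original signal.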

20,028 citations

Book Chapter
01 Jan 1988
TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion

17,604 citations

