Journal Article

Handwritten Numeral Databases of Indian Scripts and Multistage Recognition of Mixed Numerals

01 Mar 2009-IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society)-Vol. 31, Iss: 3, pp 444-457
TL;DR: Pioneering development of two databases of handwritten numerals for the two most popular Indian scripts, a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and the application of this scheme to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English.
Abstract: This article primarily concerns the problem of isolated handwritten numeral recognition of major Indian scripts. The principal contributions presented here are (a) pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, (b) a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and (c) application of (b) to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English. The present databases include, respectively, 22,556 and 23,392 handwritten isolated numeral samples of Devanagari and Bangla collected from real-life situations, and these can be made available free of cost to researchers of other academic institutions. In the proposed scheme, a numeral is subjected to three multilayer perceptron classifiers corresponding to three coarse-to-fine resolution levels in a cascaded manner. If rejection occurs even at the highest resolution, another multilayer perceptron is used as the final attempt to recognize the input numeral by combining the outputs of the three classifiers of the previous stages. This scheme has been extended to the situation when the script of a document is not known a priori or the numerals written on a document belong to different scripts. Handwritten numerals in mixed scripts are frequently found in Indian postal mails and table-form documents.
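
The cascade described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the name `cascade_predict`, the score-threshold rejection rule, and the value 0.9 are all assumptions, since the abstract does not state how rejection is decided. Each classifier is modeled as a plain callable that returns one score per class.

```python
from typing import Callable, List, Sequence

Scores = List[float]
Classifier = Callable[[Sequence[float]], Scores]


def cascade_predict(
    features_by_level: Sequence[Sequence[float]],
    stage_classifiers: Sequence[Classifier],
    final_classifier: Classifier,
    reject_threshold: float = 0.9,  # assumed rejection rule, not from the paper
) -> int:
    """Return a class index using coarse-to-fine stages, then a combiner."""
    all_scores: Scores = []
    # Visit the stages from the coarsest to the finest resolution level.
    for features, classify in zip(features_by_level, stage_classifiers):
        scores = classify(features)
        all_scores.extend(scores)
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= reject_threshold:
            return best  # accepted at this resolution level
    # Every stage rejected: the final classifier sees the combined outputs.
    final_scores = final_classifier(all_scores)
    return max(range(len(final_scores)), key=final_scores.__getitem__)
```

In this sketch a confident coarse-level decision ends the cascade early, so the finer (and larger) feature vectors are only evaluated for inputs that the earlier stages reject.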
Citations
Proceedings Article
01 Oct 2014
TL;DR: Zone-based feature extraction is used in the proposed method, and with this zoning method the recognition accuracy is found to be 78%.
Abstract: Character recognition is one of the oldest applications of pattern recognition. Recognizing handwritten characters (HWC) is an effortless task for humans, but a difficult one for a computer. Research in character recognition is very popular because of its potential applications in banks, post offices, defense organizations, reading aids for the blind, library automation, language processing and multimedia design. Optical Character Recognition (OCR) uses an optical mechanism in which a machine automatically recognizes scanned and digitized characters. Automatic recognition of handwritten text can be done either offline or online. Offline handwriting recognition is the task of recognizing the image of a handwritten text, in contrast to online recognition, where the dynamic characteristics of the writing are available and recorded while the writer is writing on a special screen with a pen or stylus made for this application. Zone-based feature extraction is used in the proposed method. The character image is divided into a predefined number of zones and a statistical feature is computed from each of these zones. Usually, this feature is based on the pixels contained in that zone. The gray values of the pixels in the selected zone are summed to form a feature for that zone in that image. The features of all the zones in the image form a feature vector, which is used for handwritten character recognition. In this work, using this zoning method, the recognition accuracy is found to be 78%.
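
The zoning step in this abstract (divide the image into zones, sum the gray values in each zone, concatenate the sums) can be sketched as below. The function name `zone_features`, the square grid of equally sized zones, and the row-major ordering of the feature vector are assumptions made for illustration; the abstract does not give the zone layout.

```python
from typing import List, Sequence


def zone_features(image: Sequence[Sequence[int]], zones_per_side: int) -> List[int]:
    """Sum the gray values inside each zone of a zones_per_side x zones_per_side grid."""
    height = len(image)
    width = len(image[0])
    if height % zones_per_side or width % zones_per_side:
        raise ValueError("image size must be divisible by zones_per_side")
    zone_h = height // zones_per_side
    zone_w = width // zones_per_side
    features: List[int] = []
    # Walk the zones row by row so the vector has a fixed, repeatable order.
    for zone_row in range(zones_per_side):
        for zone_col in range(zones_per_side):
            total = sum(
                image[r][c]
                for r in range(zone_row * zone_h, (zone_row + 1) * zone_h)
                for c in range(zone_col * zone_w, (zone_col + 1) * zone_w)
            )
            features.append(total)
    return features
```

For a 4x4 image split into a 2x2 grid, the result is a four-element vector, one sum per quadrant, which would then be passed to whatever classifier is used.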

26 citations

Journal Article
TL;DR: The proposed character recognition system using a multilayer feedforward neural network will aid applications for postal/parcel address recognition and conversion of any handwritten document into structural text form.
Abstract: A handwritten character recognition system using a multilayer feedforward neural network is proposed in this paper. The character data set suitable for recognizing postal addresses contains 38 elements, which include 26 letters, 10 numerals and 2 symbols. Fifteen different handwritten data sets were used for training the neural network for classification and recognition of the characters. Three different orientations, namely horizontal, vertical and diagonal, are used for extracting 54 features from each character. The trained neural recognition system is tested for various inputs and found to perform well. The diagonal orientation for feature extraction is identified to be the most suitable method as it yields higher recognition accuracy. The proposed system will aid applications for postal/parcel address recognition and conversion of any handwritten document into structural text form.

26 citations


Cites methods from "Handwritten Numeral Databases of In..."

  • ...However, in the off-line systems, the neural networks have been successfully used to yield comparably high recognition accuracy levels [4]....

Journal Article
TL;DR: The proposed CNN-based Bengali handwritten numeral recognition scheme showed satisfactory recognition accuracy on the benchmark data set and outperformed other prominent existing methods for both Bengali and Bengali-English mixed cases.
Abstract: Recognition of handwritten numerals has gained much interest in recent years due to its various potential applications. Bengali ranks fifth among the spoken languages of the world. However, due to the inherent difficulties of Bengali numeral recognition, very few studies on handwritten Bengali numeral recognition are found compared with other major languages. The existing Bengali numeral recognition methods used distinct feature extraction techniques and various classification tools. Recently, convolutional neural networks (CNNs) have been found efficient for image classification with their distinct features. In this paper, we have investigated a CNN-based Bengali handwritten numeral recognition scheme. Since English numerals are frequently used with Bengali numerals, handwritten Bengali-English mixed numerals are also investigated in this study. The proposed scheme uses a moderate pre-processing technique to generate patterns from images of handwritten numerals and then employs a CNN to classify individual numerals. Unlike other related works, it does not employ any feature extraction method. The proposed method showed satisfactory recognition accuracy on the benchmark data set and outperformed other prominent existing methods for both Bengali and Bengali-English mixed cases.

25 citations


Cites background or methods from "Handwritten Numeral Databases of In..."

  • ...In case of 20 class classifier, the test set accuracy of [12] is only 69....

  • ...20% for the works of [4] and [12], respectively....

  • ...The recognition scheme of [12] correctly recognizes 99....

  • ...Bhattacharya and Chaudhuri [12], 2009 Wavelet filter at different resolutions MLPs in two stages 16 CVPR, ISI and MNIST [18]; 86,000 and 14,000 98....

  • ...[12] U. Bhattacharya and B. B. Chaudhuri, Handwritten numeral databases of Indian scripts and multistage recognition of mixed numerals, IEEE Trans....

Journal Article
TL;DR: The two-dimensional discrete wavelet transform, two-dimensional fast Fourier transform and two-dimensional discrete cosine transform are used for feature extraction, and the 3D feature together with the proposed two-level transform-based technique helps to obtain better recognition accuracy.

24 citations

Journal Article
TL;DR: There is still scope to work on feature extraction techniques for the recognition of multilingual characters, as most of the existing CR methods work successfully for one or two fonts and use a combination of existing features to improve the accuracy.
Abstract: This paper presents a comprehensive review of the feature extraction techniques for character recognition (CR) which will be helpful for the new researchers to understand the insight into the devel...

24 citations


Cites background or methods from "Handwritten Numeral Databases of In..."

  • ...05% [59] Class conditional probabilities and chain code histogram English 10,000 characters 98....

  • ...Bhattacharya and Chaudhuri in the paper [59] proposed a multistage recognition scheme using MLP classifier....

  • ...9% [59] Class conditional probabilities and chain code histogram Bangla 4000 characters 98....

  • ...Only few hybrid CR methods from the literature such as [2,6,27,28,30,33,36,40,44,45,51,59,75,79] have claimed accuracies more than 98% for printed and hand written characters as shown in Table 3....

References
Journal Article
01 Jan 1998
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations

Book
16 Jul 1998
TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.
Abstract: From the Publisher: This book represents the most comprehensive treatment available of neural networks from an engineering perspective. Thorough, well-organized, and completely up to date, it examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks. Written in a concise and fluid manner, by a foremost engineering textbook author, to make the material more accessible, this book is ideal for professional engineers and graduate students entering this exciting field. Computer experiments, problems, worked examples, a bibliography, photographs, and illustrations reinforce key concepts.

29,130 citations

Journal Article
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^{2}(\mathbb{R}^{n})$, the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^{2}(\mathbb{R}^{n})$, the vector space of measurable, square-integrable n-dimensional functions. In $L^{2}(\mathbb{R})$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.
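
The pyramidal computation mentioned in this abstract can be illustrated with the Haar wavelet, the simplest orthonormal case. Haar is swapped in here purely for brevity; the abstract does not name a specific filter pair, and the function names `haar_step` and `haar_pyramid` and the labels of the three detail bands are assumptions for this sketch. Each step splits an image into a half-resolution approximation plus three detail images, and the pyramid repeats the step on the approximation.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]


def haar_step(image: Sequence[Sequence[float]]) -> Tuple[Image, Tuple[Image, Image, Image]]:
    """One level of an orthonormal 2D Haar decomposition on 2x2 blocks."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image sides must be even")
    approx: Image = []
    detail_rows: Image = []  # difference between the two rows of each block
    detail_cols: Image = []  # difference between the two columns of each block
    detail_diag: Image = []  # diagonal difference of each block
    for r in range(0, rows, 2):
        a_row, dr_row, dc_row, dd_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2)
            dr_row.append((p + q - s - t) / 2)
            dc_row.append((p - q + s - t) / 2)
            dd_row.append((p - q - s + t) / 2)
        approx.append(a_row)
        detail_rows.append(dr_row)
        detail_cols.append(dc_row)
        detail_diag.append(dd_row)
    return approx, (detail_rows, detail_cols, detail_diag)


def haar_pyramid(image: Sequence[Sequence[float]], levels: int) -> List[Image]:
    """Return the approximations from the finest to the coarsest level."""
    approximations: List[Image] = []
    current = image
    for _ in range(levels):
        current, _details = haar_step(current)
        approximations.append(current)
    return approximations
```

Because the transform is orthonormal, the sum of squared coefficients across the four bands equals the sum of squared pixel values of the input block, which is a convenient sanity check when experimenting with it.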

20,028 citations

Book Chapter
01 Jan 1988
TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion

17,604 citations

