Journal Article

Handwritten Numeral Databases of Indian Scripts and Multistage Recognition of Mixed Numerals

01 Mar 2009-IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society)-Vol. 31, Iss: 3, pp 444-457
TL;DR: This article presents the pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and the application of that scheme to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English.
Abstract: This article primarily concerns the problem of isolated handwritten numeral recognition of major Indian scripts. The principal contributions presented here are (a) pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, (b) a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers and (c) application of (b) for the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English. The present databases include 22,556 and 23,392 isolated handwritten numeral samples of Devanagari and Bangla, respectively, collected from real-life situations, and these can be made available free of cost to researchers of other academic institutions. In the proposed scheme, a numeral is subjected to three multilayer perceptron classifiers corresponding to three coarse-to-fine resolution levels in a cascaded manner. If rejection occurs even at the highest resolution, another multilayer perceptron is used as the final attempt to recognize the input numeral by combining the outputs of the three classifiers of the previous stages. This scheme has been extended to the situation where the script of a document is not known a priori or the numerals written on a document belong to different scripts. Handwritten numerals in mixed scripts are frequently found in Indian postal mail and table-form documents.
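The cascaded scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the rejection rule (a threshold on the highest output score), the threshold value, and the names `classify_or_reject` and `cascade_predict` are all assumptions, and the classifiers are passed in as plain callables standing in for trained multilayer perceptrons.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Scores = List[float]
# Hypothetical stand-in for a trained MLP: maps a feature vector to class scores.
Classifier = Callable[[Sequence[float]], Scores]


def classify_or_reject(scores: Scores, threshold: float) -> Optional[int]:
    """Return the top-scoring class, or None (reject) if its score is below threshold.

    Assumption: the abstract does not state the rejection criterion; a simple
    confidence threshold is used here for illustration only.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None


def cascade_predict(
    views: Sequence[Sequence[float]],
    stage_classifiers: Sequence[Classifier],
    combiner: Classifier,
    threshold: float = 0.9,
) -> Tuple[int, int]:
    """Try each stage from coarse to fine; use the combiner if every stage rejects.

    views[i] is the feature vector of the numeral at resolution level i.
    Returns (predicted_class, deciding_stage); deciding_stage equals
    len(stage_classifiers) when the final combining classifier decided.
    """
    all_scores: List[float] = []
    for stage, (view, clf) in enumerate(zip(views, stage_classifiers)):
        scores = clf(view)
        all_scores.extend(scores)  # kept for the final combining stage
        label = classify_or_reject(scores, threshold)
        if label is not None:
            return label, stage
    # All stages rejected: the last classifier sees the concatenated stage outputs.
    final_scores = combiner(all_scores)
    best = max(range(len(final_scores)), key=lambda i: final_scores[i])
    return best, len(stage_classifiers)
```

With three stage classifiers, a numeral accepted at the coarsest level never reaches the finer ones, so the more expensive high-resolution classifiers run only on inputs the earlier stages could not decide.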
Citations
Journal Article
TL;DR: This paper proposes an approach that exploits the robust graph representation and spectral graph embedding concept to characterise and effectively represent handwritten characters, taking into account writing styles, cursiveness and relationships.
Abstract: Interpreting different writing styles, unconstrained cursiveness and the relationships between different primitive parts is an essential and challenging task in the recognition of handwritten characters. Because feature representation is inadequate, appropriate interpretation and description of handwritten characters is difficult. Although existing research on handwritten characters is extensive, obtaining an effective representation of characters in feature space remains a challenge. In this paper, we attempt to circumvent these problems by proposing an approach that exploits robust graph representation and the spectral graph embedding concept to characterise and effectively represent handwritten characters, taking into account writing styles, cursiveness and relationships. To corroborate the efficacy of the proposed method, extensive experiments were carried out on the standard handwritten numeral dataset of the Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata. The experimental results demonstrate promising findings, which can be used in future studies.
Proceedings Article
01 Jan 2020
TL;DR: In this research, a method to identify the cause of misrecognition in offline handwritten character recognition using a convolutional neural network (CNN) is proposed and confirmed from character recognition experiments targeting 440 types of Japanese characters.
Abstract: In this research, we propose a method to identify the cause of misrecognition in offline handwritten character recognition using a convolutional neural network (CNN). In our method, the CNN learns not only character images augmented by applying an image processing method, but also those generated from character models with stroke structures. Using these character models, the proposed method can generate character images which lack one stroke. By learning the augmented character images lacking a stroke, the CNN can identify the presence of each stroke in the characters to be recognized. Subsequently, by adding dense layers to the final layer and learning the character images, obtaining the CNN for the offline handwritten character recognition becomes possible. The obtained CNN has nodes that can represent the presence of the strokes and can identify which strokes are the cause of misrecognition. The effectiveness of the proposed method is confirmed from character recognition experiments targeting 440 types of Japanese characters.

Cites methods from "Handwritten Numeral Databases of In..."

  • ...These methods have been mainly used for script recognition (Bhattacharya and Chaudhuri 2009; Saabni and El-Sana 2013), font generation (Miyazaki et al. 2017), and so on....


01 Jan 2015
TL;DR: An off-line handwritten character recognition system using a multilayer feed-forward neural network is described, and a new feature extraction method based on diagonal directions is introduced for extracting the features of the handwritten characters.
Abstract: An off-line handwritten character recognition system using a multilayer feed-forward neural network is described in the paper. A new feature extraction method based on diagonal directions is introduced for extracting the features of the handwritten characters. The proposed recognition system performs quite well, yielding higher recognition accuracy than systems employing the conventional horizontal and vertical methods of feature extraction. This system will be suitable for converting handwritten documents into structural text form and recognizing handwritten names.

Cites background from "Handwritten Numeral Databases of In..."

  • ...As a result, the off-line handwriting recognition continues to be an active area for research towards exploring the newer techniques that would improve recognition accuracy [6] [7]....


Book Chapter
20 Apr 2018
TL;DR: The paper presents the development of a new statistical method based on template matching and modified template matching for recognition of Marathi, a local language of the State of Maharashtra, that gives a good recognition rate and offers good CPU and memory efficiency.
Abstract: Optical Character Recognition (OCR) of local languages is an important research area, as the techniques developed for one language cannot be applied directly to other languages. The paper presents the development of a new statistical method based on template matching and modified template matching for recognition of Marathi, a local language of the State of Maharashtra. The proposed method not only gives a good recognition rate but also offers good CPU and memory efficiency. Along with system accuracy, average CPU consumption and memory utilization are also analysed and found to be acceptably low. The proposed algorithm for Marathi OCR is optimized for speed compared with the existing algorithm and hence can be ported to handheld devices with low processing power, such as mobile phones. The algorithm is robust with respect to character size and style of writing.
Book Chapter
01 Jan 2018
TL;DR: This article uses a feed-forward neural network model for recognition of various Indian handwritten numerals such as Punjabi, Hindi, Bengali, Telugu, and Marathi, and reports 98% recognition accuracy with respect to the training data.
Abstract: In recent years, extracting handwritten documents has been an extensively studied topic in image analysis and optical character recognition. Such extraction from document images finds applications in document analysis, content analysis, document retrieval, and much more. Many complex text extraction processes, such as maximization likelihood ratio (MLR), neural networks, edge point detection, and corner point edge detection, are generally employed for extraction of text from images. This article uses a feed-forward neural network model for recognition of various Indian handwritten numerals such as Punjabi, Hindi, Bengali, Telugu, and Marathi. Recognition is achieved by first acquiring the image, then preprocessing it, and then extracting features. Preprocessing is performed by binarizing the image and segmenting the preprocessed image by cropping it to its edges. Feature extraction involves normalizing the numeral matrix into a 12 × 10 matrix. Feature recognition applies an artificial neural network for detection of numerals. The network is constructed with 120 input nodes, 10 hidden-layer nodes, and 10 output nodes, that is, one input layer, one hidden layer, and one output layer. The numbers used for training are divided using a morphological method, and the network is trained for various Indian numerals. The proposed system has 98% recognition accuracy with respect to the training data.
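The network shape given in this abstract (a 12 × 10 normalized numeral flattened to 120 inputs, 10 hidden nodes, 10 output nodes) can be illustrated with a forward pass. This is only a sketch under assumptions: the abstract does not state the activation function, the weight initialisation, or the training procedure, so the sigmoid units, the random weights, and the function names below are hypothetical.

```python
import math
import random
from typing import List, Sequence

# Layer sizes as stated in the abstract: 12 x 10 = 120 inputs, 10 hidden, 10 outputs.
N_INPUT, N_HIDDEN, N_OUTPUT = 120, 10, 10


def sigmoid(x: float) -> float:
    # Assumed activation; the abstract does not name one.
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    """One weight row per output unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights: List[List[float]], inputs: Sequence[float]) -> List[float]:
    # zip stops after len(inputs) weights, so the bias (row[-1]) is added separately.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1]) for row in weights]


def forward(
    image: Sequence[Sequence[int]],
    w_hidden: List[List[float]],
    w_output: List[List[float]],
) -> List[float]:
    """Flatten a 12-row by 10-column binary image and return the 10 output activations."""
    flat = [float(pixel) for row in image for pixel in row]
    if len(flat) != N_INPUT:
        raise ValueError(f"expected {N_INPUT} pixels, got {len(flat)}")
    hidden = layer_forward(w_hidden, flat)
    return layer_forward(w_output, hidden)
```

The predicted numeral would be the index of the largest output activation; with the untrained random weights used here, that index carries no meaning.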
References
Journal Article
01 Jan 1998
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations

Book
16 Jul 1998
TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.
Abstract: From the Publisher: This book represents the most comprehensive treatment available of neural networks from an engineering perspective. Thorough, well-organized, and completely up to date, it examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks. Written in a concise and fluid manner, by a foremost engineering textbook author, to make the material more accessible, this book is ideal for professional engineers and graduate students entering this exciting field. Computer experiments, problems, worked examples, a bibliography, photographs, and illustrations reinforce key concepts.

29,130 citations

Journal Article
TL;DR: In this paper, it is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions.
Abstract: Multiresolution representations are effective for analyzing the information content of images. The properties of the operator which approximates a signal at a given resolution were studied. It is shown that the difference of information between the approximation of a signal at the resolutions $2^{j+1}$ and $2^{j}$ (where $j$ is an integer) can be extracted by decomposing this signal on a wavelet orthonormal basis of $L^2(\mathbb{R}^n)$, the vector space of measurable, square-integrable n-dimensional functions. In $L^2(\mathbb{R})$, a wavelet orthonormal basis is a family of functions which is built by dilating and translating a unique function $\psi(x)$. This decomposition defines an orthogonal multiresolution representation called a wavelet representation. It is computed with a pyramidal algorithm based on convolutions with quadrature mirror filters. Wavelet representation lies between the spatial and Fourier domains. For images, the wavelet representation differentiates several spatial orientations. The application of this representation to data compression in image coding, texture discrimination and fractal analysis is discussed.

20,028 citations
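The pyramidal decomposition mentioned in the abstract above can be illustrated in one dimension. The sketch below assumes the Haar wavelet, the simplest orthonormal case, which the abstract does not single out; the function names are hypothetical. Each step splits a signal into a half-length approximation (the coarser resolution) and a half-length detail signal (the information lost between the two resolutions).

```python
import math
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One pyramid level: low-pass and high-pass Haar filtering, each downsampled by 2."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_pyramid(signal: Sequence[float], levels: int) -> Tuple[List[float], List[List[float]]]:
    """Apply haar_step repeatedly to the approximation.

    Returns the coarsest approximation and the detail signals, finest level first.
    """
    approx = list(signal)
    details: List[List[float]] = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_inverse_step(approx: Sequence[float], detail: Sequence[float]) -> List[float]:
    """Rebuild the finer-resolution signal from one level's approximation and detail."""
    s = math.sqrt(2.0)
    out: List[float] = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out
```

Because the Haar basis is orthonormal, applying `haar_inverse_step` level by level, coarsest first, recovers the original signal up to floating-point error, and the sum of squared coefficients equals the sum of squared samples.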

Book Chapter
01 Jan 1988
TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
Abstract: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion

17,604 citations

