Journal ArticleDOI

A complete printed Bangla OCR system

01 Mar 1998-Pattern Recognition (Pergamon)-Vol. 31, Iss: 5, pp 531-549
TL;DR: A complete Optical Character Recognition (OCR) system for printed Bangla, the fourth most popular script in the world, is presented, and extension of the work to Devnagari, the third most popular script in the world, is discussed.
About: This article is published in Pattern Recognition. The article was published on 1998-03-01. It has received 381 citations to date. The article focuses on the topics: Optical character recognition & Document processing.
Citations
Journal ArticleDOI
TL;DR: A review of the OCR work done on Indian language scripts and the scope of future work and further steps needed for Indian script OCR development is presented.

592 citations

Journal ArticleDOI
TL;DR: Pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and application of the scheme to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English.
Abstract: This article primarily concerns the problem of isolated handwritten numeral recognition of major Indian scripts. The principal contributions presented here are (a) pioneering development of two databases for handwritten numerals of two most popular Indian scripts, (b) a multistage cascaded recognition scheme using wavelet based multiresolution representations and multilayer perceptron classifiers and (c) application of (b) for the recognition of mixed handwritten numerals of three Indian scripts Devanagari, Bangla and English. The present databases include respectively 22,556 and 23,392 handwritten isolated numeral samples of Devanagari and Bangla collected from real-life situations and these can be made available free of cost to researchers of other academic Institutions. In the proposed scheme, a numeral is subjected to three multilayer perceptron classifiers corresponding to three coarse-to-fine resolution levels in a cascaded manner. If rejection occurred even at the highest resolution, another multilayer perceptron is used as the final attempt to recognize the input numeral by combining the outputs of three classifiers of the previous stages. This scheme has been extended to the situation when the script of a document is not known a priori or the numerals written on a document belong to different scripts. Handwritten numerals in mixed scripts are frequently found in Indian postal mails and table-form documents.
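
The cascaded decision flow described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the classifiers are hypothetical callables that return one score per class, and the rejection rule (reject when the best score falls below `reject_threshold`) is an assumption, since the abstract does not state how rejection is decided.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a classifier maps a feature vector to per-class scores.
Classifier = Callable[[Sequence[float]], List[float]]


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def cascade_predict(
    features_by_level: Sequence[Sequence[float]],  # coarse-to-fine feature vectors
    classifiers: Sequence[Classifier],             # one classifier per resolution level
    combiner: Classifier,                          # final classifier over all stage outputs
    reject_threshold: float = 0.9,                 # assumed rejection rule
) -> int:
    """Try each resolution level in turn; fall back to the combiner if all reject."""
    stage_outputs: List[List[float]] = []
    for features, classify in zip(features_by_level, classifiers):
        scores = classify(features)
        stage_outputs.append(scores)
        best = argmax(scores)
        if scores[best] >= reject_threshold:
            return best  # accepted at this level, stop early
    # Every level rejected: combine the outputs of all previous stages.
    combined = [s for scores in stage_outputs for s in scores]
    return argmax(combiner(combined))
```

An input that is confidently classified at the coarsest level never reaches the finer levels, so the finer and more expensive representations are only used for harder samples.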

328 citations

01 Jan 2014
TL;DR: Investigation of the phonological length of utterance in native Kannada speaking children of 3 to 7 years age revealed an increase in PMLU score as the age increased, suggesting a developmental trend in PMLU acquisition.
Abstract: Phonological mean length of utterance (PMLU) is a whole word measure for measuring phonological proficiency. It measures the length of a child’s word and the number of correct consonants. The present study investigated the phonological length of utterance in native Kannada speaking children of 3 to 7 years age. A total of 400 subjects in the age range of 3-7 years participated in the study. Spontaneous speech samples were elicited from each child and analyzed for PMLU as per the rules suggested by Ingram. Mann-Whitney U test and Kruskal Wallis test were employed to compare the differences between the means of PMLU scores across the gender and the age respectively. The result revealed increase in PMLU score as the age increased suggesting a developmental trend in PMLU acquisition. No statistically significant differences were observed between the means of PMLU scores across the gender.

230 citations

Journal ArticleDOI
TL;DR: A multilayer feed-forward neural network is proposed for the identification and classification of handwritten Gujarati digits.

176 citations

Book ChapterDOI
13 Dec 2006
TL;DR: A quadratic classifier based scheme is proposed for the recognition of off-line Devnagari handwritten characters using chain code information of the contour points of the characters, with results computed by five-fold cross-validation.
Abstract: Recognition of handwritten characters is a challenging task because of the variability involved in the writing styles of different individuals. In this paper we propose a quadratic classifier based scheme for the recognition of off-line Devnagari handwritten characters. The features used in the classifier are obtained from the directional chain code information of the contour points of the characters. The bounding box of a character is segmented into blocks and the chain code histogram is computed in each of the blocks. Based on the chain code histogram, here we have used 64 dimensional features for recognition. These chain code features are fed to the quadratic classifier for recognition. From the proposed scheme we obtained 98.86% and 80.36% recognition accuracy on Devnagari numerals and characters, respectively. We used five-fold cross-validation technique for result computation.
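
The block-wise chain code histogram described in this abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' feature extractor: the abstract gives only the final dimensionality (64), so the 4 x 4 block grid and the merging of opposite Freeman directions into 4 bins are assumptions chosen to reach 64 features, and `chain_code_features` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]

# Freeman 8-direction codes for a unit step (dx, dy) between contour points.
FREEMAN = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code_features(contour: Sequence[Point], grid: int = 4, n_dirs: int = 4) -> List[int]:
    """Histogram of chain code directions in each block of the bounding box.

    `contour` is an ordered, closed list of (x, y) pixel coordinates.
    Returns grid * grid * n_dirs counts (64 with the assumed defaults).
    """
    points = list(contour)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    width = max(xs) - x0 + 1
    height = max(ys) - y0 + 1

    features = [0] * (grid * grid * n_dirs)
    # Pair each point with its successor, wrapping around to close the contour.
    for (x, y), (nx, ny) in zip(points, points[1:] + points[:1]):
        step = (nx - x, ny - y)
        if step not in FREEMAN:
            continue  # skip non-adjacent points
        direction = FREEMAN[step] % n_dirs  # merges opposite directions when n_dirs == 4
        col = (x - x0) * grid // width
        row = (y - y0) * grid // height
        features[(row * grid + col) * n_dirs + direction] += 1
    return features
```

The resulting vector would then be passed to a classifier; the abstract names a quadratic classifier for that step.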

157 citations


Cites background or methods from "A complete printed Bangla OCR syste..."

  • ...For removing noises from the images, we have used a method discussed in [3]....

    [...]

  • ...Many pieces of work have been done towards the recognition of Indian printed characters and at present OCR systems are commercially available for some of the printed Indian scripts [3]....

    [...]

References
Book
01 Jan 1976
TL;DR: The rapid rate at which the field of digital picture processing has grown in the past five years has necessitated extensive revisions and the introduction of topics not found in the original edition.
Abstract: The rapid rate at which the field of digital picture processing has grown in the past five years has necessitated extensive revisions and the introduction of topics not found in the original edition.

4,231 citations

Journal ArticleDOI
TL;DR: Two methods of entropic thresholding proposed by Pun (Signal Process. 2, 1980, 223–237; Comput. Graphics Image Process. 16, 1981, 210–239) have been carefully and critically examined and a new method with a sound theoretical foundation is proposed.
Abstract: Two methods of entropic thresholding proposed by Pun (Signal Process. 2, 1980, 223–237; Comput. Graphics Image Process. 16, 1981, 210–239) have been carefully and critically examined. A new method with a sound theoretical foundation is proposed. Examples are given on a number of real and artificially generated histograms.

3,551 citations

Journal ArticleDOI
TL;DR: Research aimed at correcting words in text has focused on three progressively more difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3) context-dependent word correction. The article surveys documented findings on spelling error patterns.
Abstract: Research aimed at correcting words in text has focused on three progressively more difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3) context-dependent word correction. In response to the first problem, efficient pattern-matching and n-gram analysis techniques have been developed for detecting strings that do not appear in a given word list. In response to the second problem, a variety of general and application-specific spelling correction techniques have been developed. Some of them were based on detailed studies of spelling error patterns. In response to the third problem, a few experiments using natural-language-processing tools or statistical-language models have been carried out. This article surveys documented findings on spelling error patterns, provides descriptions of various nonword detection and isolated-word error correction techniques, reviews the state of the art of context-dependent word correction techniques, and discusses research issues related to all three areas of automatic error correction in text.

1,417 citations


"A complete printed Bangla OCR syste..." refers to methods in this paper

  • ...Of the two classes of approaches of error detection and correction(27, 28) based on n-gram and dictionary search, we use the latter approach....

    [...]

Journal ArticleDOI
01 Jul 1992
TL;DR: Both template matching and structure analysis approaches to R&D are considered and it is noted that the two approaches are coming closer and tending to merge.
Abstract: Research and development of OCR systems are considered from a historical point of view. The historical development of commercial systems is included. Both template matching and structure analysis approaches to R&D are considered. It is noted that the two approaches are coming closer and tending to merge. Commercial products are divided into three generations, for each of which some representative OCR systems are chosen and described in some detail. Some comments are made on recent techniques applied to OCR, such as expert systems and neural networks, and some open problems are indicated. The authors' views and hopes regarding future trends are presented.

892 citations