Proceedings Article

Recognition of Bengali Handwritten Characters Using Skeletal Convexity and Dynamic Programming

19 Feb 2011-pp 265-268

TL;DR: This paper presents a novel handwritten character recognition method based on the structural shape of a character irrespective of the viewing direction on the 2D plane, and preliminary results demonstrate the efficacy of this approach.

Abstract: The main challenge in recognizing handwritten characters is to handle large-scale shape variations in the handwriting of different individuals. In this paper, we present a novel handwritten character recognition method based on the structural shape of a character irrespective of the viewing direction on the 2D plane. The structural shape of a character is described by the different skeletal convexities of its strokes. Such skeletal convexity acts as an invariant feature for character recognition. Longest common subsequence matching is used for recognition. We have tested our method on a benchmark dataset of handwritten Bengali character images. Preliminary results demonstrate the efficacy of our approach.
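The abstract names longest common subsequence (LCS) matching, a standard dynamic programming technique, as the recognition step. The following is a minimal sketch of how such matching could work, not the authors' implementation. It assumes each character has already been reduced to a sequence of convexity symbols; the symbol names (`"U"`, `"D"`, `"L"`, `"R"`), the class labels, the normalised score in `similarity`, and the nearest-template rule in `classify` are all hypothetical choices made for illustration.

```python
from typing import Dict, Hashable, Sequence


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Length of the longest common subsequence of a and b (classic DP)."""
    n = len(b)
    prev = [0] * (n + 1)  # DP row for the previous prefix of a
    for i in range(1, len(a) + 1):
        curr = [0] * (n + 1)
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                # Matching symbols extend the LCS of the two shorter prefixes.
                curr[j] = prev[j - 1] + 1
            else:
                # Otherwise drop one symbol from either sequence.
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[n]


def similarity(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Hypothetical score: LCS length normalised by the longer sequence."""
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


def classify(query: Sequence[Hashable],
             templates: Dict[str, Sequence[Hashable]]) -> str:
    """Return the label of the template most similar to the query."""
    return max(templates, key=lambda label: similarity(query, templates[label]))


# Hypothetical convexity-symbol sequences for two character classes.
templates = {"ka": ["U", "D", "L", "R"], "kha": ["L", "L", "U"]}
print(classify(["U", "D", "R"], templates))
```

Because LCS tolerates inserted or missing symbols, a stroke with an extra or absent convexity can still match its class template, which is one plausible reason to prefer it over exact sequence comparison for handwriting.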



Citations
Journal Article
TL;DR: This paper addresses current topics about document image understanding from a technical point of view as a survey and proposes methods/approaches for recognition of various kinds of documents.
Abstract: The subject about document image understanding is to extract and classify individual data meaningfully from paper-based documents. Until today, many methods/approaches have been proposed with regard to recognition of various kinds of documents, various technical problems for extensions of OCR, and requirements for practical usages. Of course, though the technical research issues in the early stage are looked upon as complementary attacks for the traditional OCR which is dependent on character recognition techniques, the application ranges or related issues are widely investigated or should be established progressively. This paper addresses current topics about document image understanding from a technical point of view as a survey.
Key words: document model, top-down, bottom-up, layout structure, logical structure, document types, layout recognition

221 citations


Cites background or methods from "Recognition of Bengali Handwritten ..."

  • ...The ground-truth generators in [9,17,19] can only allow rectangular bounding-boxes for annotation....

    [...]

  • ...In [8, 9], researchers have proposed line and table detection methods from document images....

    [...]

  • ...The authors in [9,15,20] have presented various layout based ground-truth generation tools....

    [...]

  • ...Kumar and Doermann [9] present a method for retrieval of document images with chosen layout characteristics....

    [...]

  • ...[9] have utilised Sparse Representation Classifier on image zone density for classification of Bangla numerals....

    [...]

Journal Article
TL;DR: A novel deep learning technique for the recognition of handwritten Bangla isolated compound characters is presented and a new benchmark of recognition accuracy on the CMATERdb 3.1.3.3 dataset is reported.
Abstract: In this work, a novel deep learning technique for the recognition of handwritten Bangla isolated compound characters is presented and a new benchmark of recognition accuracy on the CMATERdb 3.1.3.3 dataset is reported. Greedy layer-wise training of Deep Neural Networks has helped to make significant strides in various pattern recognition problems. We employ layer-wise training of Deep Convolutional Neural Networks (DCNN) in a supervised fashion and augment the training process with the RMSProp algorithm to achieve faster convergence. We compare results with those obtained from standard shallow learning methods with predefined features, as well as standard DCNNs. Supervised layer-wise trained DCNNs are found to outperform standard shallow learning models such as Support Vector Machines as well as regular DCNNs of similar architecture by achieving an error rate of 9.67%, thereby setting a new benchmark on the CMATERdb 3.1.3.3 with recognition accuracy of 90.33%, representing an improvement of nearly 10%.

91 citations


Cites methods from "Recognition of Bengali Handwritten ..."

  • ...In [2], a method to improve classification performance on Bangla Basic characters using topological features derived from the convex shapes of various strokes was proposed....

    [...]

Journal Article
TL;DR: Multilayer perceptrons (MLP) trained by backpropagation (BP) algorithm are used as classifiers in the present study and results of this study on recognition of handwritten Bangla basic characters will be reported.
Abstract: Recently, a few works on recognition of handwritten Bangla characters have been reported in the literature. However, there is scope for further research in this area. In the present article, results of our recent study on recognition of handwritten Bangla basic characters will be reported. This is a 50 class problem since the alphabet of Bangla has 50 basic characters. In this study, features are obtained by computing local chain code histograms of input character shape. Comparative recognition results are obtained between computation of the above feature based on the contour and one-pixel skeletal representations of the input character image. Also, the classification results are obtained after down sampling the histogram feature by applying Gaussian filter in both these cases. Multilayer perceptrons (MLP) trained by back propagation (BP) algorithm are used as classifiers in the present study. Near exhaustive studies are done for selection of its hidden layer size. An analysis of the misclassified samples shows an interesting error pattern and this has been used for further improvement in the recognition results. Final recognition accuracies on the training and the test sets are respectively 94.65% and 92.14%.
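The chain code histogram feature mentioned in this abstract can be illustrated with a short sketch. It is not that study's code: it assumes the contour or skeleton is already available as an ordered list of 8-connected `(row, col)` pixels, uses the common Freeman numbering (0 = east, counted counter-clockwise), and computes a single histogram for the whole path, whereas the abstract speaks of local histograms, which would presumably be computed per image block. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]  # (row, col); image rows grow downward

# Freeman 8-direction codes keyed by (row step, col step).
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def chain_code(points: Sequence[Point]) -> List[int]:
    """Convert an ordered 8-connected pixel path into Freeman codes."""
    codes = []
    for (r0, c0), (r1, c1) in zip(points, points[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in DIRECTIONS:
            raise ValueError(f"pixels {(r0, c0)} and {(r1, c1)} are not 8-neighbours")
        codes.append(DIRECTIONS[step])
    return codes


def chain_code_histogram(points: Sequence[Point]) -> List[float]:
    """Normalised 8-bin histogram of the chain code directions."""
    codes = chain_code(points)
    hist = [0] * 8
    for code in codes:
        hist[code] += 1
    total = len(codes)
    return [h / total for h in hist] if total else [0.0] * 8


# A short path: one step east, one north-east, one north.
path = [(2, 0), (2, 1), (1, 2), (0, 2)]
print(chain_code(path))
print(chain_code_histogram(path))
```

A feature vector of this kind would then be passed to a classifier such as the MLP the abstract describes.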

80 citations

Journal Article
TL;DR: A review of OCR work on Indian scripts, mainly on Bangla and Devanagari, the two most popular scripts in India, and the various methodologies and their reported results are presented.
Abstract: The past few decades have witnessed intensive research on optical character recognition (OCR) for Roman, Chinese, and Japanese scripts. A lot of work has also been reported on OCR efforts for various Indian scripts, like Devanagari, Bangla, Oriya, Tamil, Telugu, Malayalam, Kannada, Gurmukhi, Gujarati, etc. In this paper, we present a review of OCR work on Indian scripts, mainly on Bangla and Devanagari, the two most popular scripts in India. We have summarized most of the published papers on this topic and have also analysed the various methodologies and their reported results. Future directions of research in OCR for Indian scripts have also been given.

63 citations


Cites background or methods from "Recognition of Bengali Handwritten ..."

  • ...Bag et al (2011b) have proposed topological features (Bag et al 2012) to improve the recognition performance for printed and handwritten Bangla basic characters....

    [...]

  • ...To handle large-scale shape variations in the handwriting of different individuals, Bag et al (2011a) have proposed a method based on the structural shape of a character irrespective of the viewing direction on the 2D plane....

    [...]

  • ...Bag et al (2011b) have proposed topological features (Bag & Harit 2011) to improve the recognition performance for printed and handwritten Bangla basic characters....

    [...]

Journal Article
TL;DR: This paper proposes a novel shape decomposition-based segmentation technique to decompose the compound characters into prominent shape components, which reduces the classification complexity by leaving fewer classes to recognize, and at the same time improves the recognition accuracy.
Abstract: Proper recognition of complex-shaped handwritten compound characters is still a big challenge for Bangla OCR systems. In this paper, we propose a novel shape decomposition-based segmentation technique to decompose the compound characters into prominent shape components. This shape decomposition reduces the classification complexity by leaving fewer classes to recognize, and at the same time improves the recognition accuracy. The decomposition is done at the segmentation area where the two basic shapes are joined to form a compound character. We use a chain code histogram feature set with a multi-layer perceptron (MLP) based classifier with backpropagation learning for classification. On experimentation, the proposed method is observed to provide good recognition accuracy compared with other existing methods.

61 citations


References
Book
01 Jan 1990
TL;DR: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures and presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers.
Abstract: From the Publisher: The updated new edition of the classic Introduction to Algorithms is intended primarily for use in undergraduate or graduate courses in algorithms or data structures. Like the first edition, this text can also be used for self-study by technical professionals since it discusses engineering issues in algorithm design as well as the mathematical aspects. In its new edition, Introduction to Algorithms continues to provide a comprehensive introduction to the modern study of algorithms. The revision has been updated to reflect changes in the years since the book's original publication. New chapters on the role of algorithms in computing and on probabilistic analysis and randomized algorithms have been included. Sections throughout the book have been rewritten for increased clarity, and material has been added wherever a fuller explanation has seemed useful or new information warrants expanded coverage. As in the classic first edition, this new edition of Introduction to Algorithms presents a rich variety of algorithms and covers them in considerable depth while making their design and analysis accessible to all levels of readers. Further, the algorithms are presented in pseudocode to make the book easily accessible to students from all programming language backgrounds. Each chapter presents an algorithm, a design technique, an application area, or a related topic. The chapters are not dependent on one another, so the instructor can organize his or her use of the book in the way that best suits the course's needs. Additionally, the new edition offers a 25% increase over the first edition in the number of problems, giving the book 155 problems and over 900 exercises that reinforce the concepts the students are learning.

21,642 citations

01 Jan 2005

19,237 citations

Journal Article
TL;DR: This article illustrates how a programming language assists the programmer by using a language similar to Logo, in which programs are developed to draw geometric pictures.
Abstract: The primary purpose of a programming language is to assist the programmer in the practice of her art. Each language is either designed for a class of problems or supports a different style of programming. In other words, a programming language turns the computer into a ‘virtual machine’ whose features and capabilities are unlimited. In this article, we illustrate these aspects through a language similar to Logo. Programs are developed to draw geometric pictures using this language.

5,749 citations

Journal Article
TL;DR: A comprehensive survey of thinning methodologies, including iterative deletion of pixels and nonpixel-based methods, is presented and the relationships among them are explored.
Abstract: A comprehensive survey of thinning methodologies is presented. A wide range of thinning algorithms, including iterative deletion of pixels and nonpixel-based methods, is covered. Skeletonization algorithms based on medial axis and other distance transforms are not considered. An overview of the iterative thinning process and the pixel-deletion criteria needed to preserve the connectivity of the image pattern is given first. Thinning algorithms are then considered in terms of these criteria and their modes of operation. Nonpixel-based methods that usually produce a center line of the pattern directly in one pass without examining all the individual pixels are discussed. The algorithms are considered in great detail and scope, and the relationships among them are explored.

1,757 citations

Journal Article
TL;DR: A review of the OCR work done on Indian language scripts and the scope of future work and further steps needed for Indian script OCR development is presented.
Abstract: Intensive research has been done on optical character recognition (OCR) and a large number of articles have been published on this topic during the last few decades. Many commercial OCR systems are now available in the market. But most of these systems work for Roman, Chinese, Japanese and Arabic characters. There has not been a sufficient amount of work on Indian language character recognition, although there are 12 major scripts in India. In this paper, we present a review of the OCR work done on Indian language scripts. The review is organized into 5 sections. Sections 1 and 2 cover the introduction and properties of Indian scripts. In Section 3, we discuss different methodologies in OCR development as well as research work done on Indian script recognition. In Section 4, we discuss the scope of future work and further steps needed for Indian script OCR development. In Section 5 we conclude the paper.

565 citations


"Recognition of Bengali Handwritten ..." refers methods in this paper

  • ...To improve the recognition performance, many feature selection and extraction methods are reported for Indian languages [1]....

    [...]