Proceedings Article

A Novel Approach of Bangla Handwritten Text Recognition Using HMM

TLDR
In a preliminary experiment on a dataset of 10,120 Bangla handwritten words, the proposed approach outperforms conventional HMM-based recognition.
Abstract
This paper presents a novel approach for offline Bangla (Bengali) handwritten word recognition using Hidden Markov Models (HMMs). Because of complex features such as the headline, vowels and modifiers, character segmentation in Bangla script is difficult. The positions of vowels and compound characters further complicate the segmentation of words into characters. To address these problems, we propose a novel method that breaks words up zone-wise and then performs HMM-based recognition. In particular, the word image is segmented into three zones: upper, middle and lower. The components in the middle zone are modeled using HMMs. This zone segmentation reduces the number of distinct component classes compared to the total number of classes in the Bangla character set. Once the middle zone is recognized, HMM-based forced alignment is applied in this zone to mark the boundaries of individual components. The segmentation paths are later extended to the other zones. Next, the residue components, if any, in the upper and lower zones within their respective boundaries are combined to achieve the final word-level recognition. We performed a preliminary experiment on a dataset of 10,120 Bangla handwritten words and found that the proposed approach outperforms conventional HMM-based recognition.
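The forced-alignment step described in the abstract can be illustrated with a small dynamic program. This is only a simplified sketch, not the paper's implementation: it assumes each recognized middle-zone component is modeled by a single state (real systems typically use multi-state left-to-right HMMs), that the word image has already been converted into a sequence of sliding-window frames, and that per-frame log-likelihoods are available. The function name `forced_align` and the `loglik` layout are hypothetical.

```python
from math import inf


def forced_align(loglik):
    """Align T frames to K components in a fixed left-to-right order.

    loglik[t][k] is the log-likelihood of frame t under the k-th component
    of the already-recognized sequence (an assumed input format).
    Returns one (start, end) frame range per component, end exclusive.
    """
    T = len(loglik)
    K = len(loglik[0])
    if T < K:
        raise ValueError("need at least one frame per component")

    # score[t][k]: best total log-likelihood of frames 0..t ending in component k
    score = [[-inf] * K for _ in range(T)]
    # advanced[t][k]: True if the best path entered component k at frame t
    advanced = [[False] * K for _ in range(T)]
    score[0][0] = loglik[0][0]

    for t in range(1, T):
        # component k cannot be reached before frame k
        for k in range(min(t + 1, K)):
            stay = score[t - 1][k]
            move = score[t - 1][k - 1] if k > 0 else -inf
            if move > stay:
                score[t][k] = move + loglik[t][k]
                advanced[t][k] = True
            else:
                score[t][k] = stay + loglik[t][k]

    # Backtrace from the last frame in the last component.
    starts = [0] * K
    k = K - 1
    for t in range(T - 1, 0, -1):
        if advanced[t][k]:
            starts[k] = t
            k -= 1
    ends = starts[1:] + [T]
    return list(zip(starts, ends))


# Toy example: 6 frames, 3 components, two frames clearly favoring each one.
toy = [
    [-1, -5, -5],
    [-1, -5, -5],
    [-5, -1, -5],
    [-5, -1, -5],
    [-5, -5, -1],
    [-5, -5, -1],
]
boundaries = forced_align(toy)
```

If each frame corresponds to a column window of the word image, the returned frame ranges map to horizontal positions, which is how the component boundaries found in the middle zone could then be extended to the upper and lower zones.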


Citations
Journal Article

Lexicon directed algorithm for recognition of unconstrained handwritten words

TL;DR: Improvements made to a lexicon directed algorithm for recognition of unconstrained handwritten words (cursive, discrete, or mixed) such as those encountered in mail pieces, in order to achieve higher recognition accuracy and speed.
Journal Article

HMM-based Indic handwritten word recognition using zone segmentation

TL;DR: An efficient word recognition framework is proposed that segments handwritten word images horizontally into three zones (upper, middle and lower) and then recognizes the corresponding zones, reducing the number of distinct component classes compared to the total number of classes in Indic scripts.
Journal Article

Shape decomposition-based handwritten compound character recognition for Bangla OCR

TL;DR: This paper proposes a novel shape decomposition-based segmentation technique that decomposes compound characters into prominent shape components, which reduces classification complexity by lowering the number of classes to recognize and at the same time improves recognition accuracy.
Journal Article

An Improved Method for Handwritten Document Analysis Using Segmentation, Baseline Recognition and Writing Pressure Detection

TL;DR: This research proposes an off-line analysis method for cursive handwritten documents based on segmentation, skew recognition and writing pressure detection; it uses modified horizontal and vertical projections that can segment text lines and words even in the presence of overlapped and multi-skewed text lines.
Journal Article

Off-line Bangla handwritten word recognition: a holistic approach

TL;DR: A holistic handwritten word recognition method is developed using a feature descriptor, designed by combining different Elliptical, Tetragonal and Vertical pixel density histogram-based features, which performs comparatively better with SVM than MLP for the prepared dataset.
References
Book

The Nature of Statistical Learning Theory

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Journal Article

Online and off-line handwriting recognition: a comprehensive survey

TL;DR: The nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms are described.

The HTK book

TL;DR: The Fundamentals of HTK: General Principles of HMMs, Recognition and Viterbi Decoding, and Continuous Speech Recognition.
Journal Article

Indian script character recognition: a survey

TL;DR: A review of the OCR work done on Indian language scripts and the scope of future work and further steps needed for Indian script OCR development is presented.
Journal Article

Handwritten Numeral Databases of Indian Scripts and Multistage Recognition of Mixed Numerals

TL;DR: Pioneering development of two databases for handwritten numerals of the two most popular Indian scripts, a multistage cascaded recognition scheme using wavelet-based multiresolution representations and multilayer perceptron classifiers, and its application to the recognition of mixed handwritten numerals of three Indian scripts: Devanagari, Bangla and English.