scispace - formally typeset
Search or ask a question
Author

Umapada Pal

Other affiliations: University of Mysore
Bio: Umapada Pal is an academic researcher from Indian Statistical Institute. The author has contributed to research in topics: Feature extraction & Handwriting recognition. The author has an hindex of 47, co-authored 478 publications receiving 9925 citations. Previous affiliations of Umapada Pal include University of Mysore.


Papers
More filters
Book ChapterDOI
13 Dec 2006
TL;DR: A quadratic classifier based scheme for the recognition of off-line Devnagari handwritten characters using chain code information of the contour points of the characters and using five-fold cross-validation technique for result computation.
Abstract: Recognition of handwritten characters is a challenging task because of the variability involved in the writing styles of different individuals. In this paper we propose a quadratic classifier based scheme for the recognition of off-line Devnagari handwritten characters. The features used in the classifier are obtained from the directional chain code information of the contour points of the characters. The bounding box of a character is segmented into blocks and the chain code histogram is computed in each of the blocks. Based on the chain code histogram, here we have used 64 dimensional features for recognition. These chain code features are fed to the quadratic classifier for recognition. From the proposed scheme we obtained 98.86% and 80.36% recognition accuracy on Devnagari numerals and characters, respectively. We used five-fold cross-validation technique for result computation.

157 citations

Journal ArticleDOI
TL;DR: Skew angle detection of scanned documents containing most popular Indian scripts (Devnagari and Bangla) is considered and it is found that character segmentation and zone detection can be readily done from head line information, which is useful in optical character recognition approaches of these scripts.
Abstract: Skew angle detection of scanned documents containing most popular Indian scripts (Devnagari and Bangla) is considered. Most characters in these scripts have horizontal lines at the top, called head lines. The character head lines mostly join one another in a word and the word appears as a single component. In the proposed method the components are at first labeled. The upper envelope of a component is found by columnwise scanning from an imaginary line above the component. Portions of upper envelope satisfying the properties of digital straight line are detected. They are clustered as belonging to single text lines. Estimates from individual clusters are combined to get the skew angle. Apart from accuracy and efficiency, an advantage of the method is that character segmentation and zone detection can be readily done from head line information, which is useful in optical character recognition approaches of these scripts.

143 citations

Posted Content
TL;DR: This paper models an offline writer independent signature verification task with a convolutional Siamese network, named SigNet, and exceeds the state-of-the-art results on most of the benchmark signature datasets, which paves the way for further research in this direction.
Abstract: Offline signature verification is one of the most challenging tasks in biometrics and document forensics. Unlike other verification problems, it needs to model minute but critical details between genuine and forged signatures, because a skilled falsification might often resembles the real signature with small deformation. This verification task is even harder in writer independent scenarios which is undeniably fiscal for realistic cases. In this paper, we model an offline writer independent signature verification task with a convolutional Siamese network. Siamese networks are twin networks with shared weights, which can be trained to learn a feature space where similar observations are placed in proximity. This is achieved by exposing the network to a pair of similar and dissimilar observations and minimizing the Euclidean distance between similar pairs while simultaneously maximizing it between dissimilar pairs. Experiments conducted on cross-domain datasets emphasize the capability of our network to model forgery in different languages (scripts) and handwriting styles. Moreover, our designed Siamese network, named SigNet, exceeds the state-of-the-art results on most of the benchmark signature datasets, which paves the way for further research in this direction.

137 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: In this paper, an automatic technique of separating the text lines using script characteristics and shape based features is presented and has an overall accuracy of about 98.5%.
Abstract: In a multi-lingual country like India, a document page may contain more than one script form. Under the three-language formula, the document may be printed in English, Devnagari and one of the other official Indian languages. For OCR of such a document page, it is necessary to separate these three script forms before feeding them to the OCRs of individual scripts. In this paper, an automatic technique of separating the text lines using script characteristics and shape based features is presented. At present, the system has an overall accuracy of about 98.5%.

134 citations

Journal ArticleDOI
TL;DR: Various feature extraction and classification techniques associated with the offline handwriting recognition of the regional scripts are discussed in this survey, which will serve as a compendium not only for researchers in India, but also for policymakers and practitioners in India.
Abstract: Offline handwriting recognition in Indian regional scripts is an interesting area of research as almost 460 million people in India use regional scripts. The nine major Indian regional scripts are Bangla (for Bengali and Assamese languages), Gujarati, Kannada, Malayalam, Oriya, Gurumukhi (for Punjabi language), Tamil, Telugu, and Nastaliq (for Urdu language). A state-of-the-art survey about the techniques available in the area of offline handwriting recognition (OHR) in Indian regional scripts will be of a great aid to the researchers in the subcontinent and hence a sincere attempt is made in this article to discuss the advancements reported in this regard during the last few decades. The survey is organized into different sections. A brief introduction is given initially about automatic recognition of handwriting and official regional scripts in India. The nine regional scripts are then categorized into four subgroups based on their similarity and evolution information. The first group contains Bangla, Oriya, Gujarati and Gurumukhi scripts. The second group contains Kannada and Telugu scripts and the third group contains Tamil and Malayalam scripts. The fourth group contains only Nastaliq script (Perso-Arabic script for Urdu), which is not an Indo-Aryan script. Various feature extraction and classification techniques associated with the offline handwriting recognition of the regional scripts are discussed in this survey. As it is important to identify the script before the recognition step, a section is dedicated to handwritten script identification techniques. A benchmarking database is very important for any pattern recognition related research. The details of the datasets available in different Indian regional scripts are also mentioned in the article. A separate section is dedicated to the observations made, future scope, and existing difficulties related to handwriting recognition in Indian regional scripts. We hope that this survey will serve as a compendium not only for researchers in India, but also for policymakers and practitioners in India. It will also help to accomplish a target of bringing the researchers working on different Indian scripts together. Looking at the recent developments in OHR of Indian regional scripts, this article will provide a better platform for future research activities.

133 citations


Cited by
More filters
Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

01 Jan 1990
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article, where the authors present an overview of their work.
Abstract: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.

2,933 citations

Reference EntryDOI
15 Oct 2004

2,118 citations