Author

R. Jayadevan

Bio: R. Jayadevan is an academic researcher from Army Institute of Technology, Pune. The author has contributed to research in topics: Devanagari & Handwriting recognition. The author has an h-index of 8 and has co-authored 12 publications receiving 430 citations. Previous affiliations of R. Jayadevan include Pune Institute of Computer Technology.

Papers
Journal Article
01 Nov 2011
TL;DR: This paper discusses the state of the art, from the 1970s onward, in machine-printed and handwritten Devanagari optical character recognition (OCR).
Abstract: In India, more than 300 million people use Devanagari script for documentation. There has been a significant improvement in the research related to the recognition of printed as well as handwritten Devanagari text in the past few years. The state of the art from the 1970s of machine-printed and handwritten Devanagari optical character recognition (OCR) is discussed in this paper. All feature-extraction techniques as well as training, classification and matching techniques useful for the recognition are discussed in various sections of the paper. An attempt is made to address the most important results reported so far and to highlight the beneficial directions of the research to date. Moreover, the paper also contains a comprehensive bibliography of many selected papers that appeared in reputed journals and conference proceedings as an aid for the researchers working in the field of Devanagari OCR.

159 citations

Journal Article
TL;DR: Various feature extraction and classification techniques associated with the offline handwriting recognition of the regional scripts are discussed in this survey, which will serve as a compendium not only for researchers in India, but also for policymakers and practitioners in India.
Abstract: Offline handwriting recognition in Indian regional scripts is an interesting area of research as almost 460 million people in India use regional scripts. The nine major Indian regional scripts are Bangla (for Bengali and Assamese languages), Gujarati, Kannada, Malayalam, Oriya, Gurumukhi (for Punjabi language), Tamil, Telugu, and Nastaliq (for Urdu language). A state-of-the-art survey about the techniques available in the area of offline handwriting recognition (OHR) in Indian regional scripts will be of great aid to the researchers in the subcontinent and hence a sincere attempt is made in this article to discuss the advancements reported in this regard during the last few decades. The survey is organized into different sections. A brief introduction is given initially about automatic recognition of handwriting and official regional scripts in India. The nine regional scripts are then categorized into four subgroups based on their similarity and evolution information. The first group contains Bangla, Oriya, Gujarati and Gurumukhi scripts. The second group contains Kannada and Telugu scripts and the third group contains Tamil and Malayalam scripts. The fourth group contains only Nastaliq script (Perso-Arabic script for Urdu), which is not an Indo-Aryan script. Various feature extraction and classification techniques associated with the offline handwriting recognition of the regional scripts are discussed in this survey. As it is important to identify the script before the recognition step, a section is dedicated to handwritten script identification techniques. A benchmarking database is very important for any pattern recognition related research. The details of the datasets available in different Indian regional scripts are also mentioned in the article. A separate section is dedicated to the observations made, future scope, and existing difficulties related to handwriting recognition in Indian regional scripts. We hope that this survey will serve as a compendium not only for researchers in India, but also for policymakers and practitioners in India. It will also help to accomplish a target of bringing the researchers working on different Indian scripts together. Looking at the recent developments in OHR of Indian regional scripts, this article will provide a better platform for future research activities.

133 citations

Journal Article
TL;DR: This paper presents the state of the art in automatic processing of handwritten cheque images, discusses the important results reported so far in preprocessing, extraction, recognition and verification of handwritten fields on bank cheques, and highlights the positive directions of research to date.
Abstract: Bank cheques (checks) are still widely used all over the world for financial transactions. Huge volumes of handwritten bank cheques are processed manually every day in developing countries. In such a manual verification, user-written information including date, signature, legal and courtesy amounts present on each cheque has to be visually verified. As many countries use cheque truncation systems (CTS) nowadays, much time, effort and money can be saved if this entire process of recognition, verification and data entry is done automatically using images of cheques. An attempt is made in this paper to present the state of the art in automatic processing of handwritten cheque images. It discusses the important results reported so far in preprocessing, extraction, recognition and verification of handwritten fields on bank cheques and highlights the positive directions of research to date. The paper has a comprehensive bibliography of many references as a support for researchers working in the field of automatic bank cheque processing. The paper also contains some information about the products available in the market for automatic cheque processing. To the best of our knowledge, there is no survey in the area of automatic cheque processing, and there is a need for such a survey to know the state of the art.

63 citations

Journal Article
01 Jan 2009
TL;DR: A new approach to static handwritten signature verification based on Dynamic Time Warping (DTW), using only five genuine signatures for training, is proposed in this paper, and it is observed that the False Acceptance Rate (FAR) of the proposed system decreases as the number of genuine training samples increases.
Abstract: Static signature verification has a significant use in establishing the authenticity of bank checks, insurance and legal documents based on the signatures they carry. As an individual signs only a few times on the forms for opening an account with any bank or for insurance related purposes, the number of genuine signature templates available in banking and insurance applications is limited. A new approach to static handwritten signature verification based on Dynamic Time Warping (DTW), using only five genuine signatures for training, is therefore proposed in this paper. Initially the genuine and test signatures belonging to an individual are normalized after calculating the aspect ratios of the genuine signatures. The horizontal and vertical projection features of a signature are extracted using discrete Radon transform and the two vectors are combined to form a combined projection feature vector. The feature vectors of two signatures are matched using the DTW algorithm. The closed area formed by the matching path around the diagonal of the DTW-grid is computed and is multiplied with the difference cost between the feature vectors. A threshold is calculated for each genuine sample during the training. The test signature is compared with each genuine sample and a matching score is calculated. A decision to accept or reject is made on the average of such scores. All experiments were performed on a global signature database (GPDS-Signature Database) of 2106 signatures with 936 genuine signatures and 1170 skilled forgeries. To evaluate the performance, experiments were carried out with 4 to 5 genuine samples for training and with different ‘scores’. The proposed as well as the existing DTW-method were implemented and compared. It is observed that the proposed method is superior in terms of Equal Error Rate (EER) and Total Error Rate (TER) when 4 or 5 genuine signatures were used for training. It is also observed that the False Acceptance Rate (FAR) of the proposed system decreases as the number of genuine training samples increases.

43 citations
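
The matching step described in the abstract above can be sketched as follows. This is a minimal illustration of classic DTW only, not the paper's method: plain row and column sums stand in for the discrete Radon transform projections, the closed-area weighting is omitted, and the function names and the single `threshold` parameter are assumptions made for the example.

```python
def projection_features(image):
    """Concatenate row sums and column sums of a binary image (list of rows).

    Assumption: simple sums replace the discrete Radon transform used in the paper.
    """
    horizontal = [sum(row) for row in image]
    vertical = [sum(col) for col in zip(*image)]
    return horizontal + vertical


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences with |x - y| as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def accept(test_features, genuine_features, threshold):
    """Accept when the average DTW score against the genuine samples is small.

    Assumption: one global threshold, whereas the paper derives a threshold
    per genuine sample during training.
    """
    scores = [dtw_distance(test_features, g) for g in genuine_features]
    return sum(scores) / len(scores) <= threshold
```

A test signature would be converted with `projection_features` and passed to `accept` together with the feature vectors of the four or five genuine samples. Averaging the scores follows the decision rule stated in the abstract; how the threshold is chosen is left open here.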

Proceedings Article
18 Sep 2011
TL;DR: A dataset containing 26,720 handwritten legal amount words written in Hindi and Marathi languages (Devanagari script) is presented in this paper along with a training-free technique to recognize such handwritten legal amounts present on Indian bank cheques.
Abstract: A dataset containing 26,720 handwritten legal amount words written in Hindi and Marathi languages (Devanagari script) is presented in this paper along with a training-free technique to recognize such handwritten legal amounts present on Indian bank cheques. The recognition of handwritten legal amount words in Hindi and Marathi languages is challenging because of the similar size and shape of many words in the lexicon. Moreover, many words have the same suffixes or prefixes. The recognition technique proposed is a combination of two approaches. The first approach is based on gradient, structural and cavity (GSC) features along with a binary vector matching (BVM) technique. The second approach is based on the vertical projection profile (VPP) feature and dynamic time warping (DTW). A number of highly matched words in both the approaches are considered for the recognition step in the combined approach based on a ranking scheme. Syntactical knowledge related to the languages is also used to achieve higher reliability. To the best of our knowledge, this is the first work of its kind in recognizing handwritten legal amounts written in Hindi and Marathi. Researchers interested in the dataset can contact the authors to get it through a shared link.

35 citations
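
The idea of combining the highly matched words from two recognizers, as described in the abstract above, can be illustrated with a small sketch. The abstract does not specify the ranking scheme, so the rank-sum rule below is an assumption, as are the function name `combine_rankings`, the `top_k` parameter, and the transliterated example words.

```python
def combine_rankings(ranked_a, ranked_b, top_k=5):
    """Return words found in both top-k lists, lowest combined rank first.

    ranked_a, ranked_b: candidate words from two recognizers, best match first.
    Assumption: candidates are scored by the sum of their two rank positions;
    ties are broken alphabetically by the word itself.
    """
    top_a = ranked_a[:top_k]
    top_b = ranked_b[:top_k]
    rank_b = {word: position for position, word in enumerate(top_b)}
    scored = [
        (position + rank_b[word], word)
        for position, word in enumerate(top_a)
        if word in rank_b
    ]
    return [word for _, word in sorted(scored)]


# Illustrative candidate lists only; these are not results from the paper.
gsc_bvm_candidates = ["teen", "tees", "das"]
vpp_dtw_candidates = ["teen", "das", "ek"]
best_first = combine_rankings(gsc_bvm_candidates, vpp_dtw_candidates, top_k=3)
```

Keeping only words that appear in both lists reflects the abstract's requirement that a candidate be highly matched in both approaches. The syntactical checks mentioned there would be a further filter applied to the combined list.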


Cited by
Journal Article
TL;DR: Discusses improvements made to a lexicon-directed algorithm for recognition of unconstrained handwritten words (cursive, discrete, or mixed), such as those encountered in mail pieces, to achieve higher recognition accuracy and speed.
Abstract: Discusses improvements made to a lexicon-directed algorithm for recognition of unconstrained handwritten words (cursive, discrete, or mixed) such as those encountered in mail pieces. The procedure consists of binarization, pre-segmentation, intermediate feature extraction, segmentation recognition, and post-processing. The segmentation recognition and the post-processing are repeated for all lexicon words, while the steps from binarization through intermediate feature extraction are applied once for an input word. The result of performance evaluation using a large handwritten address block database is described, and algorithm improvements are described and discussed, in order to achieve higher recognition accuracy and speed. As a result, the performance for lexicons of size 10, 100, and 1000 is improved to 98.01%, 95.46%, and 91.49% respectively. The processing speed for each lexicon is improved to 2.0, 2.5, and 3.5 sec/word on a SUN SPARC station 2.

143 citations

Journal Article
TL;DR: This review article presents state-of-the-art results and techniques on OCR and also provides research directions by highlighting research gaps.
Abstract: Given the ubiquity of handwritten documents in human transactions, Optical Character Recognition (OCR) of documents has invaluable practical worth. Optical character recognition is a science that enables the translation of various types of documents or images into analyzable, editable and searchable data. During the last decade, researchers have used artificial intelligence/machine learning tools to automatically analyze handwritten and printed documents in order to convert them into electronic format. The objective of this review paper is to summarize research that has been conducted on character recognition of handwritten documents and to provide research directions. In this Systematic Literature Review (SLR) we collected, synthesized and analyzed research articles on the topic of handwritten OCR (and closely related topics) which were published between 2000 and 2019. We searched widely used electronic databases following a pre-defined review protocol. Articles were searched using keywords, forward reference searching and backward reference searching in order to find all the articles related to the topic. After carefully following the study selection process, 176 articles were selected for this SLR. This review article presents state-of-the-art results and techniques on OCR and also provides research directions by highlighting research gaps.

139 citations

Journal Article
01 Mar 2014
TL;DR: The Urdu, Pushto, and Sindhi languages are discussed, with the emphasis on the Nasta'liq and Naskh scripts, and the works are analyzed along the preprocessing, segmentation, feature extraction, classification, and recognition stages of a typical OCR pipeline.
Abstract: We survey the optical character recognition (OCR) literature with reference to the Urdu-like cursive scripts. In particular, the Urdu, Pushto, and Sindhi languages are discussed, with the emphasis being on the Nasta'liq and Naskh scripts. Before detailing the OCR works, the peculiarities of the Urdu-like scripts are outlined, followed by the presentation of the available text image databases. For the sake of clarity, the various attempts are grouped into three parts, namely: (a) printed, (b) handwritten, and (c) online character recognition. Within each part, the works are analyzed with respect to a typical OCR pipeline with an emphasis on the preprocessing, segmentation, feature extraction, classification, and recognition. Highlights: A literature review of the Nasta'liq and Naskh cursive script OCR. The peculiarities and challenges are described a priori. Printed, handwritten and online OCR efforts are being explored. Analyses based on the stages of a typical OCR pipeline.

121 citations

Journal Article
TL;DR: A novel deep learning technique for the recognition of handwritten Bangla isolated compound character is presented and a new benchmark of recognition accuracy on the CMATERdb 3.3.1.3 dataset is reported.

113 citations