Journal Article

Handwriting Recognition in Indian Regional Scripts: A Survey of Offline Techniques

TLDR
Various feature extraction and classification techniques associated with the offline handwriting recognition of Indian regional scripts are discussed in this survey, which is intended to serve as a compendium for researchers, policymakers, and practitioners in India.
Abstract
Offline handwriting recognition in Indian regional scripts is an interesting area of research, as almost 460 million people in India use regional scripts. The nine major Indian regional scripts are Bangla (for the Bengali and Assamese languages), Gujarati, Kannada, Malayalam, Oriya, Gurumukhi (for the Punjabi language), Tamil, Telugu, and Nastaliq (for the Urdu language). A state-of-the-art survey of the techniques available for offline handwriting recognition (OHR) in Indian regional scripts will be of great aid to researchers in the subcontinent, and this article therefore attempts to discuss the advancements reported in this regard during the last few decades. The survey is organized into different sections. A brief introduction is first given to automatic handwriting recognition and the official regional scripts of India. The nine regional scripts are then categorized into four subgroups based on their similarity and evolution. The first group contains the Bangla, Oriya, Gujarati, and Gurumukhi scripts. The second group contains the Kannada and Telugu scripts, and the third group contains the Tamil and Malayalam scripts. The fourth group contains only the Nastaliq script (the Perso-Arabic script for Urdu), which is not an Indo-Aryan script. Various feature extraction and classification techniques associated with the offline handwriting recognition of the regional scripts are discussed in this survey. As it is important to identify the script before the recognition step, a section is dedicated to handwritten script identification techniques. A benchmarking database is very important for any pattern-recognition research, so the details of the datasets available in the different Indian regional scripts are also given in the article. A separate section is dedicated to the observations made, the future scope, and the existing difficulties related to handwriting recognition in Indian regional scripts.
We hope that this survey will serve as a compendium not only for researchers but also for policymakers and practitioners in India. It should also help bring together researchers working on different Indian scripts. Given the recent developments in OHR of Indian regional scripts, this article is intended to provide a better platform for future research activities.
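
The four-way grouping of the nine scripts described in the abstract can be written down as a small lookup table. The following is only a minimal sketch in Python: the group membership comes from the abstract, while the names `SCRIPT_GROUPS` and `group_of` are hypothetical and are not taken from the article.

```python
# Group membership as stated in the abstract; identifiers are illustrative only.
SCRIPT_GROUPS = {
    1: ("Bangla", "Oriya", "Gujarati", "Gurumukhi"),
    2: ("Kannada", "Telugu"),
    3: ("Tamil", "Malayalam"),
    4: ("Nastaliq",),  # Perso-Arabic script used for Urdu
}


def group_of(script: str) -> int:
    """Return the survey group number (1-4) for a given script name."""
    for group, scripts in SCRIPT_GROUPS.items():
        if script in scripts:
            return group
    raise KeyError(f"{script!r} is not one of the nine regional scripts")


if __name__ == "__main__":
    # Look up the group of every script covered by the survey.
    for scripts in SCRIPT_GROUPS.values():
        for name in scripts:
            print(name, group_of(name))
```

A plain dictionary keeps the grouping easy to check against the abstract; it says nothing about the features or classifiers the survey discusses for each group.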


Citations
Book Chapter

A Semi-automatic Methodology for Recognition of Printed Kannada Character Primitives Useful in Character Construction.

TL;DR: This paper proposes a semi-automatic method for extracting features from primitives in order to recognize them and then construct Kannada characters, using a Euclidean distance measure to establish similarity between test input primitives and the primitives already present in the knowledge base.
Journal Article

Hand Written Devanagari Script Short Scale Character Recognition

TL;DR: In this paper, a sample of centered grayscale images is analyzed using K-nearest neighbor classification, extremely randomized decision forest classification, and random forest classification.
Posted Content

Spatial Domain Feature Extraction Methods for Unconstrained Handwritten Malayalam Character Recognition.

TL;DR: In this paper, the authors dealt with handwritten Malayalam, with a complete set of basic characters, vowel and consonant signs and compound characters that may be present in the script.
Proceedings Article

Analysis of Cursive Text Recognition Systems: A Systematic Literature Review

TL;DR: In this article, the authors present a systematic analysis that identifies gaps in the literature and suggests enhanced solutions accordingly, helping researchers survey existing character/text recognition approaches, their recognition capabilities, and their time consumption, and identify the areas that require significant attention in the near future.
Book Chapter

Data Annotation and Preprocessing

TL;DR: In this paper, supervised statistical learning is used to construct practical systems, and large-scale annotated data are the foundation and premise of this method. However, data obtained directly from the Internet or from other raw sources, such as medical records written by doctors, maintenance logbooks and job cards for airplanes, and chat records in WeChat or Twitter, often contain noise and ill-formed language that hinder models from fulfilling their tasks.