Showing papers on "Intelligent word recognition" published in 2015

Proceedings Article•DOI•

Deep learning based large scale handwritten Devanagari character recognition

Shailesh Acharya¹, Ashok Kumar Pant¹, Prashnna Kumar Gyawali¹•Institutions (1)

01 Dec 2015

TL;DR: This paper introduces a new public image dataset for Devanagari script, and proposes a deep learning architecture for recognition of those characters, with highest test accuracy of 98.47% on the dataset.

Abstract: In this paper, we introduce a new public image dataset for Devanagari script: Devanagari Handwritten Character Dataset (DHCD). Our dataset consists of 92 thousand images of 46 different classes of characters of Devanagari script segmented from handwritten documents. We also explore the challenges in recognition of Devanagari characters. Along with the dataset, we also propose a deep learning architecture for recognition of those characters. Deep Convolutional Neural Networks (CNNs) have shown superior results to traditional shallow networks in many recognition tasks. Departing from the regular approach of character recognition by Deep CNN, we focus on the use of Dropout and dataset increment approach to improve test accuracy. By implementing these techniques in Deep CNN, we were able to increase test accuracy by nearly 1 percent. The proposed architecture scored highest test accuracy of 98.47% on our dataset.

153 citations

Proceedings Article•DOI•

Beyond human recognition: A CNN-based framework for handwritten character recognition

Li Chen¹, Song Wang¹, Wei Fan¹, Jun Sun¹, Satoshi Naoi¹•Institutions (1)

Fujitsu¹

01 Nov 2015

TL;DR: In the experiments, the proposed CNN-based handwritten character recognition framework performed even better than human on handwritten digit (MNIST) and Chinese character (CASIA) recognition.

Abstract: Because of the varied appearance of handwriting (different writers, writing styles, noise, etc.), handwritten character recognition is one of the most challenging tasks in pattern recognition. Through decades of research, the traditional method has reached its limit, while the emergence of deep learning provides a new way to break this limit. In this paper, a CNN-based handwritten character recognition framework is proposed. In this framework, proper sample generation, training scheme and CNN network structure are employed according to the properties of handwritten characters. In the experiments, the proposed framework performed even better than humans on handwritten digit (MNIST) and Chinese character (CASIA) recognition. The advantage of this framework is demonstrated by these experimental results.

117 citations

Proceedings Article•DOI•

CNN based common approach to handwritten character recognition of multiple scripts

Durjoy Sen Maitra¹, Ujjwal Bhattacharya¹, Swapan K. Parui¹•Institutions (1)

Indian Statistical Institute¹

23 Aug 2015

TL;DR: A convolutional neural network trained on a larger-class recognition problem is used to extract features for several smaller-class numeral recognition problems in English, Devanagari, Bangla, Telugu and Oriya, each of which is an official Indian script, with a distinct SVM as the classifier in each case.

Abstract: There are many scripts in the world, several of which are used by hundreds of millions of people. Handwritten character recognition studies of several of these scripts are found in the literature. Different hand-crafted feature sets have been used in these recognition studies. However, convolutional neural network (CNN) has recently been used as an efficient unsupervised feature vector extractor. Although such a network can be used as a unified framework for both feature extraction and classification, it is more efficient as a feature extractor than as a classifier. In the present study, we performed certain amount of training of a 5-layer CNN for a moderately large class character recognition problem. We used this CNN trained for a larger class recognition problem towards feature extraction of samples of several smaller class recognition problems. In each case, a distinct Support Vector Machine (SVM) was used as the corresponding classifier. In particular, the CNN of the present study is trained using samples of a standard 50-class Bangla basic character database and features have been extracted for 5 different 10-class numeral recognition problems of English, Devanagari, Bangla, Telugu and Oriya each of which is an official Indian script. Recognition accuracies are comparable with the state-of-the-art.

117 citations

Journal Article•DOI•

Bangla Handwritten Character Recognition using Convolutional Neural Network

Md. Mahbubar Rahman, Md. Aminul Haque Akhand, Shahidul Islam, Pintu Chandra Shill, M.M. Hafizur Rahman

08 Jul 2015-International Journal of Image, Graphics and Signal Processing

TL;DR: The proposed method normalizes the written character images and then employs a CNN to classify individual characters; it shows satisfactory recognition accuracy and outperforms some other prominent existing methods.

Abstract: Handwritten character recognition complexity varies among different languages due to distinct shapes, strokes and number of characters. Numerous works in handwritten character recognition are available for English compared with other major languages such as Bangla. Existing methods use distinct feature extraction techniques and various classification tools in their recognition schemes. Recently, Convolutional Neural Network (CNN) is found efficient for English handwritten character recognition. In this paper, a CNN based Bangla handwritten character recognition is investigated. The proposed method normalizes the written character images and then employs CNN to classify individual characters. It does not employ any feature extraction method like other related works. 20000 handwritten characters with different shapes and variations are used in this study. The proposed method shows satisfactory recognition accuracy and outperforms some other prominent existing methods.

105 citations

Proceedings Article•DOI•

Handwritten Bangla numeral recognition using Local Binary Pattern

Tasnuva Hassan¹, Haider Adnan Khan¹•Institutions (1)

United International University¹

21 May 2015

TL;DR: The proposed OCR system was evaluated on the off-line handwritten Bangla numeral database CMATERdb 3.1.1, and achieved a character recognition rate of 96.7%.

Abstract: Local Binary Pattern (LBP) is a simple yet robust texture descriptor that has been widely used in many computer vision applications including face recognition. In this paper, we exploit LBP for handwritten Bangla numeral recognition. We classify Bangla digits from their LBP histograms using K Nearest Neighbors (KNN) classifier. The performance of three different variations of LBP - the basic LBP, the uniform LBP and the simplified LBP was investigated. The proposed OCR system was evaluated on the off-line handwritten Bangla numeral database CMATERdb 3.1.1, and achieved an excellent accuracy of 96.7% character recognition rate.

63 citations
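The pipeline this abstract describes (LBP histograms fed to a KNN classifier) can be illustrated with a minimal pure-Python sketch. It covers only the basic 8-neighbour LBP; the neighbour ordering, the squared Euclidean distance, and the names `lbp_code`, `lbp_histogram` and `knn_predict` are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

# Neighbour offsets (d_row, d_col), visited clockwise from the top-left pixel.
# The ordering is an assumed convention; any fixed ordering works.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of the interior pixel (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest (histogram, label) training pairs."""
    ranked = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In use, each training digit image would be reduced to `lbp_histogram(image)` and stored with its label, and a test image would be classified by passing its histogram to `knn_predict`.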

Proceedings Article•DOI•

Arabic handwritten characters recognition using Deep Belief Neural Networks

Mohamed Elleuch¹, Najiba Tagougui², Monji Kherallah³•Institutions (3)

Manouba University¹, University of Gabès², University of Sfax³

16 Mar 2015

TL;DR: The proposed DBNN structure for Arabic handwritten character/word recognition is not yet able to deal with high-dimensional data and thus has to be improved.

Abstract: In the handwriting recognition field, deep learning is becoming the new trend thanks to its ability to deal with unlabeled raw data, especially given the huge amount of raw data available nowadays. In this paper, we investigate Deep Belief Neural Network (DBNN) for Arabic handwritten character/word recognition. The proposed system takes the raw data as input and proceeds with a greedy layer-wise unsupervised learning algorithm. The approach was tested on two different databases. For the character-level one, the results were promising, with a classification error rate of 2.1% on the HACDB database. Unlike the character level, the evaluation on the ADAB database at the word level shows an error rate exceeding 40%. Hence, the proposed DBNN structure is not yet able to deal with high-dimensional data and thus has to be improved.

39 citations

Journal Article•DOI•

Word Segmentation Method for Handwritten Documents based on Structured Learning

Jewoong Ryu¹, Hyung Il Koo², Nam Ik Cho¹•Institutions (2)

Seoul National University¹, Ajou University²

08 Jan 2015-IEEE Signal Processing Letters

TL;DR: This work formulates the word segmentation problem as a binary quadratic assignment problem that considers pairwise correlations between the gaps as well as the likelihoods of individual gaps, and estimates all parameters based on the Structured SVM framework so that the proposed method works well regardless of writing styles and written languages without user-defined parameters.

Abstract: Segmentation of handwritten document images into text-lines and words is an essential task for optical character recognition. However, since the features of handwritten document are irregular and diverse depending on the person, it is considered a challenging problem. In order to address the problem, we formulate the word segmentation problem as a binary quadratic assignment problem that considers pairwise correlations between the gaps as well as the likelihoods of individual gaps. Even though many parameters are involved in our formulation, we estimate all parameters based on the Structured SVM framework so that the proposed method works well regardless of writing styles and written languages without user-defined parameters. Experimental results on ICDAR 2009/2013 handwriting segmentation databases show that proposed method achieves the state-of-the-art performance on Latin-based and Indian languages.

37 citations

Posted Content•

Boosting Optical Character Recognition: A Super-Resolution Approach

Chao Dong, Ximei Zhu, Yubin Deng, Chen Change Loy, Yu Qiao

07 Jun 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: It is reported that the winning entry of text image super-resolution framework has largely improved the OCR performance with low-resolution images used as input, reaching an OCR accuracy score of 77.19%, which is comparable with that of using the original high-resolution images.

Abstract: Text image super-resolution is a challenging yet open research problem in the computer vision community. In particular, low-resolution images hamper the performance of typical optical character recognition (OCR) systems. In this article, we summarize our entry to the ICDAR2015 Competition on Text Image Super-Resolution. Experiments are based on the provided ICDAR2015 TextSR dataset (3) and the released Tesseract-OCR 3.02 system (1). We report that our winning entry of text image super-resolution framework has largely improved the OCR performance with low-resolution images used as input, reaching an OCR accuracy score of 77.19%, which is comparable with that of using the original high-resolution images (78.80%). Index Terms: super resolution; optical character recognition.

33 citations

Proceedings Article•DOI•

Combined horizontal and vertical projection feature extraction technique for Gurmukhi handwritten character recognition

Manoj Kumar Mahto¹, Karamjit Bhatia¹, Rajendra Kumar Sharma²•Institutions (2)

Gurukul Kangri Vishwavidyalaya¹, Thapar University²

19 Mar 2015

TL;DR: This work proposes a combined horizontal and vertical projection feature extraction scheme for recognition of Gurmukhi characters, an Indic script commonly used in state of Punjab in India.

Abstract: Despite the advancements in Optical Character Recognition (OCR) technologies, the problem of Indic script character recognition remains challenging, and even more so for handwritten characters. In this work, we focus on off-line recognition of handwritten characters of Gurmukhi, an Indic script commonly used in the state of Punjab in India. As a part of this work, we collected a Gurmukhi character dataset of 3500 images from 10 writers. We propose a combined horizontal and vertical projection feature extraction scheme for recognition of Gurmukhi characters. We have tested our method on the collected dataset and achieved a high character recognition accuracy of 98.06%.

27 citations
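To make the idea of a combined horizontal and vertical projection feature concrete, here is a minimal sketch in Python. It assumes a binarised character image stored as a list of rows of 0/1 values and simply concatenates the row sums and column sums; the function name `projection_features` and this exact feature layout are illustrative assumptions, and the paper's actual scheme may differ (for example in normalisation or zoning).

```python
def projection_features(binary_img):
    """Concatenate horizontal and vertical projection profiles.

    binary_img: list of rows, each a list of 0/1 pixel values (1 = ink).
    Returns one feature vector: row sums followed by column sums.
    """
    # Horizontal projection: number of ink pixels in each row.
    horizontal = [sum(row) for row in binary_img]
    # Vertical projection: number of ink pixels in each column.
    vertical = [sum(col) for col in zip(*binary_img)]
    return horizontal + vertical


# Example: a 2x3 image gives 2 row counts followed by 3 column counts.
features = projection_features([[0, 1, 1], [0, 1, 0]])
```

Because the vector length depends on image size, character images would normally be resized to a fixed shape before the features are extracted.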

Proceedings Article•DOI•

An algorithm for handwritten digit recognition using projection histograms and SVM classifier

Eva Tuba¹, Nebojsa Bacanin¹•Institutions (1)

Megatrend University¹

01 Nov 2015

TL;DR: An algorithm for handwritten digit recognition based on projection histograms is described, with classification by 45 carefully tuned support vector machines (SVMs) using a one-against-one strategy.

Abstract: Higher level of image processing usually contains some kind of recognition. Digit recognition is common in applications and handwritten digit recognition is an important subfield. Handwritten digits are characterized by large variations so template matching, in general, is not very efficient. In this paper we describe an algorithm for handwritten digit recognition based on projection histograms. Classification is facilitated by 45 carefully tuned support vector machines (SVMs) using a one-against-one strategy. Our proposed algorithm was tested on standard benchmark images from MNIST database and it achieved remarkable global accuracy of 99.05%, with possibilities for further improvement.

26 citations
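The figure of 45 classifiers follows from the one-against-one strategy: with 10 digit classes there are C(10, 2) = 45 unordered class pairs, and each pair gets its own binary classifier. The sketch below shows the pairing and the majority vote only. The binary classifiers are passed in as plain callables, so no SVM is trained here, and the names `ovo_pairs` and `ovo_predict` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def ovo_pairs(classes):
    """All unordered class pairs; one binary classifier is needed per pair."""
    return list(combinations(classes, 2))


def ovo_predict(x, pairwise_classifiers):
    """One-against-one prediction by majority vote.

    pairwise_classifiers maps a pair (a, b) to a callable that returns
    either a or b for the sample x. Each callable stands in for a trained
    binary SVM (an assumption of this sketch).
    """
    votes = Counter(clf(x) for clf in pairwise_classifiers.values())
    return votes.most_common(1)[0][0]


# Ten digit classes give 45 pairwise problems.
digit_pairs = ovo_pairs(range(10))
```

A multi-class SVM library typically builds these pairwise classifiers internally; the sketch only makes the counting and voting explicit.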

Proceedings Article•DOI•

An application of SVM in character recognition with chain code

Dipti Singh¹, Mohd. Aamir Khan¹, Atul Bansal¹, Neha Bansal¹•Institutions (1)

GLA University¹

01 Nov 2015

TL;DR: The general architecture of a modern OCR system is discussed with details of each module; Moore neighborhood tracing is applied to extract character boundaries, followed by chain code for feature extraction.

Abstract: Artificial intelligence, pattern recognition and computer vision have significant importance in the field of electronics and image processing. Optical character recognition (OCR) is one of the main aspects of pattern recognition and has evolved greatly since its beginning. OCR is a system which recognizes the readable characters from optical data and converts them into digital form. Various methodologies have been developed for this purpose using different approaches. In this paper, the general architecture of a modern OCR system with details of each module is discussed. We applied Moore neighborhood tracing for extracting the boundary of characters and then chain code for feature extraction. In the classification stage for character recognition, an SVM is trained and applied to a suitable example.
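As a rough illustration of chain-code features, the sketch below converts an already-traced boundary (an ordered list of 8-connected pixel coordinates, such as Moore neighbourhood tracing would produce) into a Freeman chain code and a normalised direction histogram. The direction numbering (0 = east, increasing counter-clockwise, with row indices growing downward) and the names `chain_code` and `chain_histogram` are assumptions; the paper does not specify its convention, and the tracing step itself is not implemented here.

```python
# Map a step (d_row, d_col) between consecutive boundary pixels to a
# Freeman direction code. Assumed convention: 0 = east, counter-clockwise,
# with row indices increasing downward as in image coordinates.
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def chain_code(boundary):
    """Freeman chain code of an ordered, 8-connected list of (row, col) points."""
    return [
        DIRECTIONS[(r2 - r1, c2 - c1)]
        for (r1, c1), (r2, c2) in zip(boundary, boundary[1:])
    ]


def chain_histogram(code):
    """Normalised 8-bin histogram of direction codes, usable as a feature vector."""
    hist = [0] * 8
    for direction in code:
        hist[direction] += 1
    total = len(code) or 1
    return [h / total for h in hist]


# Closed boundary of a 2x2 pixel square, starting at the top-left pixel.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
square_code = chain_code(square)
```

The histogram gives a fixed-length vector regardless of boundary length, which is one common way to feed chain-code information to a classifier such as an SVM.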

Proceedings Article•DOI•

Bangla handwritten numeral recognition using convolutional neural network

M. A. H. Akhand¹, Md. Mahbubar Rahman¹, Pintu Chandra Shill¹, Shahidul Islam¹, M.M. Hafizur Rahman²•Institutions (2)

Khulna University of Engineering & Technology¹, International Islamic University Malaysia²

21 May 2015

TL;DR: The proposed BHNR-CNN normalizes the written numeral images and then employs a CNN to classify individual numerals; it shows satisfactory recognition accuracy and outperforms other prominent existing methods.

Abstract: Recognition of handwritten numerals has gained much interest in recent years due to its various application potentials. Although Bangla is a major language in the Indian subcontinent and is the first language of Bangladesh, studies of Bangla handwritten numeral recognition (BHNR) are very few compared with other major languages such as Roman. The existing BHNR methods use distinct feature extraction techniques and various classification tools in their recognition schemes. Recently, convolutional neural network (CNN) is found efficient for image classification with its distinct features. It also automatically provides some degree of translation invariance. In this paper, a CNN based BHNR is investigated. The proposed BHNR-CNN normalizes the written numeral images and then employs CNN to classify individual numerals. It does not employ any feature extraction method like other related works. 17000 handwritten numerals with different shapes, sizes and variations are used in this study. The proposed method shows satisfactory recognition accuracy and outperforms other prominent existing methods.

Book Chapter•DOI•

On the Modification of Binarization Algorithms to Retain Grayscale Information for Handwritten Text Recognition

Mauricio Villegas¹, Verónica Romero¹, Joan Andreu Sánchez¹•Institutions (1)

Polytechnic University of Valencia¹

17 Jun 2015

TL;DR: In this paper, the authors propose to take existing binarization techniques, in order to retain their advantages, and modify them in such a way that some of the original grayscale information is preserved and can be considered by the subsequent recognizer.

Abstract: The amount of digitized legacy documents has been rising over the last years due mainly to the increasing number of on-line digital libraries publishing this kind of documents. The vast majority of them remain waiting to be transcribed to provide historians and other researchers new ways of indexing, consulting and querying them. However, the performance accuracy of state-of-the-art Handwritten Text Recognition techniques decreases dramatically when they are applied to these historical documents. This is mainly due to the typical paper degradation problems. Therefore, robust pre-processing techniques are an important step for helping further recognition steps. This paper proposes to take existing binarization techniques, in order to retain their advantages, and modify them in such a way that some of the original grayscale information is preserved and can be considered by the subsequent recognizer. Results are reported with the publicly available ESPOSALLES database.

Proceedings Article•DOI•

Handwritten Kannada character recognition using wavelet transform and structural features

Saleem Pasha¹, M. C. Padma¹•Institutions (1)

P.E.S. College of Engineering¹

01 Dec 2015

TL;DR: This paper proposes efficient feature extraction and classification techniques for an OCR system for handwritten Kannada characters and numerals, which involves several phases such as preprocessing, feature extraction and classification.

Abstract: The frontier area of research in the field of pattern recognition and image processing is handwritten character recognition. This leads to a great demand for OCR system containing handwritten documents. In order to recognize the text present in a document, an Optical Character Recognition (OCR) system is developed. In this paper, OCR system for handwritten Kannada characters and numerals is developed which involves several phases such as preprocessing, feature extraction and classification. Preprocessing includes the techniques that are suitable to convert the input image into an acceptable form for feature extraction. The main aim of this paper is to propose an efficient feature extraction and classification techniques. Suitable features are extracted as structural features and wavelet transform is employed for extracting global features. Artificial neural network classifier is used for recognizing the handwritten Kannada characters and numerals. The proposed method is experimented on 4800 images of handwritten Kannada characters and obtained an average accuracy of 91.00%. Also, the proposed method is experimented on 1000 images of handwritten Kannada numerals and obtained an average accuracy of 97.60%.

Book Chapter•DOI•

Character Segmentation of Hindi Unconstrained Handwritten Words

Soumen Bag¹, Ankit Krishna²•Institutions (2)

Indian Institute of Technology Dhanbad¹, Indian Institutes of Information Technology²

24 Nov 2015

TL;DR: The proposed character segmentation technique can be used as a part of an OCR system for cursive handwritten Hindi language and can cope with high variations in writing style and skewed header lines as input.

Abstract: The proper character level segmentation of printed or handwritten text is an important preprocessing step for optical character recognition (OCR). It is noticed that languages with a cursive writing style make the segmentation problem much more complicated. Hindi is one of the well-known languages in India having this cursive nature in writing style. The main challenge in handwritten character segmentation is to handle the inherent variability in the writing style of different individuals. In this paper, we present an efficient character segmentation method for handwritten Hindi words. Segmentation is performed on the basis of some structural patterns observed in the writing style of this language. The proposed method can cope with high variations in writing style and skewed header lines as input. The method has been tested on our own database for both printed and handwritten words. The average success rate is 96.93%. The method yields fairly good results for this database compared with other existing methods. We foresee that the proposed character segmentation technique can be used as a part of an OCR system for cursive handwritten Hindi language.

Proceedings Article•DOI•

A review of feature extraction techniques for handwritten Arabic text recognition

Bouchra El Qacimy, Ahmed Hammouch, Mounir Ait Kerroum

25 Mar 2015

TL;DR: This work provides a comprehensive review of these methods for off-line handwritten Arabic text recognition and presents recognition rates and descriptions of the databases used for the discussed approaches.

Abstract: Research in Arabic handwritten recognition has been of growing interest in the last few decades. This is mainly due to its broad spectrum of applications in different fields such as bank check processing, form data entry, postal mail sorting, automatic processing of old manuscripts, etc. In the literature, numerous techniques have been proposed for feature extraction and applied to various types of images. This work provides a comprehensive review of these methods for off-line handwritten Arabic text recognition. It also presents recognition rates and descriptions of the databases used for the discussed approaches. This paper includes background on the field, discussion of feature extraction methods, and future research directions.

Journal Article•DOI•

A holistic word recognition technique for handwritten Bangla words

Showmik Bhowmik¹, Sanjib Polley², Md. Galib Roushan¹, Samir Malakar², Ram Sarkar¹, Mita Nasipuri¹•Institutions (2)

Jadavpur University¹, MCKV Institute of Engineering²

25 May 2015

TL;DR: In this paper, concentric rectangles and convex hull-based features are designed in order to classify word images belonging to different classes and a neural network-based classifier is chosen on the basis of the performances of different classifiers and some statistical tests.

Abstract: Holistic word recognition is the current trend for handwritten word recognition. The holistic paradigm in handwritten word recognition considers a word as a single, indivisible entity and attempts to recognise words from their overall shape unlike recognising the individual characters comprising the word. In the present work, concentric rectangles and convex hull-based features are designed in order to classify word images belonging to different classes. For the evaluation of the current technique, 2,754 handwritten Bangla word samples are collected from different sources. A neural network-based classifier is chosen on the basis of the performances of different classifiers and some statistical tests. The recognition performance of the technique is evaluated using a three-fold cross-validation method. From the experimental results, it is observed that the proposed technique correctly recognises 84.74% word images in best case.

Journal Article•DOI•

Offline Tamil Handwritten Character Recognition Using Sub Line Direction and Bounding Box Techniques

S. M. Shyni¹, M. Antony Robert Raj², S. Abirami²•Institutions (2)

Sathyabama University¹, Anna University²

01 Apr 2015-Indian journal of science and technology

TL;DR: In order to achieve a better recognition rate, a learning algorithm, Support Vector Machine (SVM) has been implemented and these concepts are experimented on 30 Tamil character sets and achieved an accuracy rate of 88%.

Abstract: Character recognition plays an important role in the field of pattern recognition. Offline character recognition methodology mainly focuses on recognizing the characters irrespective of the difficulties that may arise due to the variations in writing style. This writing style becomes more complex when the characters are in curvy structure. The proposed recognition methodology was applied on one of the complex structures of south Indian language 'Tamil'. The novelty behind this process lies on the selection and extraction of the feature sets. Zoning and Chain Code procedures are employed here to select the features and Sub Line Direction and Bounding box algorithms are used for extracting the features. In order to achieve a better recognition rate, a learning algorithm, Support Vector Machine (SVM) has been implemented. These concepts are experimented on 30 Tamil character sets (Vowels and Consonants) and achieved an accuracy rate of 88%.

Proceedings Article•DOI•

Handwritten words recognition for legal amounts of bank cheques in English script

Sneha Singh¹, Tharun Kariveda¹, Jija Das Gupta², Kallol Bhattacharya²•Institutions (2)

Indian Institute of Technology Guwahati¹, University of Calcutta²

02 Mar 2015

TL;DR: A technique of text word recognition based on template matching technique using Correlation coefficient is proposed and overall 76.4 percent word recognition accuracy is achieved which is an encouraging result for off-line word recognition.

Abstract: The recognition of legal amount present on a bank cheque is a big challenge because of the structural complexity of characters and variability of writing styles in automatic bank cheque processing. This paper proposes a technique of text word recognition based on template matching technique using Correlation coefficient. We have developed a database of 61 words, combination of which can represent any legal amount written in words in Indian bank cheque. Proposed algorithm is tested on our database and overall 76.4 percent word recognition accuracy is achieved which is an encouraging result for off-line word recognition.
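Template matching with a correlation coefficient can be sketched as follows: the input word image is compared against every stored template, and the template with the highest Pearson correlation wins. This is a minimal pure-Python illustration that assumes the input and all templates have already been normalised to the same size; the names `correlation` and `best_match` and the toy 3x3 templates are hypothetical and not taken from the paper.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient between two equally sized images."""
    fa = [p for row in a for p in row]
    fb = [p for row in b for p in row]
    mean_a = sum(fa) / len(fa)
    mean_b = sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in fa) * sum((y - mean_b) ** 2 for y in fb)
    )
    # A constant image has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0


def best_match(image, templates):
    """Return the label of the template most correlated with the image."""
    return max(templates, key=lambda label: correlation(image, templates[label]))


# Toy 3x3 templates standing in for normalised word images.
templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
noisy_i = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
label = best_match(noisy_i, templates)
```

For a 61-word vocabulary such as the one described above, `templates` would hold one or more reference images per word, and a rejection threshold on the best correlation could be added to flag uncertain matches.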

Proceedings Article•DOI•

Optical Character Recognition of Arabic handwritten characters using Neural Network

Rana S. Hussien¹, Azza A. Elkhidir¹, Mohamed Elnourani¹•Institutions (1)

University of Khartoum¹

01 Sep 2015

TL;DR: An approach to design and implement an off-line OCR system that recognizes Arabic handwritten characters; in this approach Artificial Neural Networks (ANNs) were used as classifiers.

Abstract: Optical Character Recognition (OCR) is the mechanical or electronic conversion of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used as a form of data entry. This paper proposes an approach to design and implement an off-line OCR system that recognizes Arabic handwritten characters; in this approach Artificial Neural Networks (ANNs) were used as classifiers. The ANN was trained based on the Hopfield Algorithm which was designed using MATLAB. In our system, the image goes through a preprocessing stage, followed by a features extraction stage and a recognition stage. For the recognition to be accurate certain properties of each of the letters are calculated, these properties also called features are extracted from the image. Selection of a relevant feature extraction method is probably the single most important factor in achieving high recognition performance with much better accuracy in character recognition systems. A collection of such features (vectors) define the character uniquely by the means of an ANN. Experimental results showed that the system designed is able to recognize eight Arabic handwritten letters () with a successful recognition rate of (7725). The system designed can be further developed to include the rest of the Arabic Alphabets, and a segmentation stage so that it could recognize words.

Proceedings Article•DOI•

Page-level handwritten script identification using modified log-Gabor filter based features

Pawan Kumar Singh¹, Iman Chatterjee², Ram Sarkar¹•Institutions (2)

Jadavpur University¹, Netaji Subhash Engineering College²

09 Jul 2015

TL;DR: A page-level script identification technique for eight popular handwritten scripts namely, Bangla, Devanagari, Gurumukhi, Oriya, Tamil, Telugu, Urdu along with Roman has been proposed and it yields 95.57% accuracy in identifying the scripts of the documents.

Abstract: Automatic identification of scripts, an imperative research problem during the last few decades, has posed many challenges in any multi-script environment. As India is a multilingual country, therefore, text documents containing more than one language are very familiar phenomenon here. But to digitize these multi-lingual documents using any Optical Character Recognition (OCR) engine, first it is required to recognize the scripts used to write the same. In this paper, a page-level script identification technique for eight popular handwritten scripts namely, Bangla, Devanagari, Gurumukhi, Oriya, Tamil, Telugu, Urdu along with Roman has been proposed. To start with, Modified log-Gabor filters based texture features are designed from each of the document pages. Then the proposed model is evaluated using multiple classifiers and based on their identification accuracies, it is found that Simple Logistic performs the best. Outcome of the present experiment reveals the usefulness of the Modified log-Gabor filters based features in recognition of handwritten Indic scripts. A total of 240 document pages is used to carry out the present experiment and it yields 95.57% accuracy in identifying the scripts of the documents. Even if the proposed method is assessed on limited dataset, but considering the intricacies of the scripts, the outcome can be assumed reasonably acceptable.

Journal Article•DOI•

Handwritten Character Recognition in English: A Survey

Monica Patel, Shital P. Thakkar

28 Feb 2015-International Journal of Advanced Research in Computer and Communication Engineering

TL;DR: This paper presents a comprehensive review of Handwritten Character Recognition (HCR) in English language.

Abstract: This paper presents a comprehensive review of Handwritten Character Recognition (HCR) in English language. Handwritten character recognition has been applied in a variety of applications like banking sectors, health care industries and many such organizations where handwritten documents are dealt with. Handwritten Character Recognition is the process of conversion of handwritten text into machine readable form. Handwritten characters pose difficulties: they differ from one writer to another, and even when the same person writes the same character there are differences in its shape, size and position. Latest research in this area has used different types of methods, classifiers and features to reduce the complexity of recognizing handwritten text.

Proceedings Article•DOI•

Training an Arabic handwriting recognizer without a handwritten training data set

Irfan Ahmad¹, Gernot A. Fink²•Institutions (2)

King Fahd University of Petroleum and Minerals¹, Technical University of Dortmund²

23 Aug 2015

TL;DR: This paper investigates training a handwriting recognizer without handwritten training data, using computer-generated text in different typefaces as training data, unsupervised adaptation, and recognition hypotheses on the test sets as training data. Results from a handwritten Arabic word recognition task show that the approach is promising, with good recognition rates.

Abstract: Handwritten text recognition is an active research area in pattern recognition. One of the prerequisites for setting up a handwritten text recognizer is to train it, usually on large amounts of labeled training data. In the current paper we report our work on handwritten text recognition using no handwritten training set. We investigate different approaches, including computer-generated text in different typefaces as training data, unsupervised adaptation, and using recognition hypotheses on the test sets as training data. Results from a handwritten Arabic word recognition task show that the approach is promising, with good recognition rates.

Journal Article•DOI•

Optical character recognition menggunakan algoritma template matching correlation

Suryo Hartanto, Aris Sugiharto, Sukmawati Nur Endah

30 Apr 2015

TL;DR: In this paper, the template matching correlation method is used to recognize characters of different types, sizes and shapes, achieving an average recognition success rate of 92.90%.

Abstract: OCR (Optical Character Recognition) is an effective solution for converting printed documents into digital documents. The problem that arises in computer recognition of letters is how a recognition technique can identify different types of characters with different sizes and shapes. The recognition method used in this final project is the template matching correlation method. Prior to recognition, the input image, in *.bmp or *.jpg format, goes through preprocessing, which includes binarization, segmentation, and normalization of the image. The system achieves an average recognition success rate of 92.90%. The final results show that the template matching correlation method is effective enough to build an OCR system with good accuracy.
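
The matching step described in this abstract can be illustrated with a minimal sketch. It assumes the glyph has already been binarized, segmented and normalized to the template size; the 3x3 templates and the names `correlation`, `recognize` and `TEMPLATES` are hypothetical and are not taken from the paper.

```python
from math import sqrt

def correlation(a, b):
    """Pearson correlation between two equal-sized 2D grids of pixel values."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # a flat image correlates with nothing

def recognize(glyph, templates):
    """Return the label of the template that correlates best with the glyph."""
    return max(templates, key=lambda label: correlation(glyph, templates[label]))

# Hypothetical 3x3 binary templates (1 = ink, 0 = background).
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# An "L" with one missing pixel, standing in for a noisy scanned glyph.
noisy_l = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
best = recognize(noisy_l, TEMPLATES)
```

Because the score is normalized by the mean and variance of both images, it tolerates a few flipped pixels, which is one reason correlation is a common choice for template matching.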

Proceedings Article•DOI•

Generation of synthetic training data for handwritten Indic script recognition

Shivansh Gaur¹, Siddhant Sonkar¹, Partha Pratim Roy¹•Institutions (1)

Indian Institute of Technology Roorkee¹

23 Aug 2015

TL;DR: This paper presents a novel approach to creating a synthetic dataset for word recognition systems, with the aim of improving the performance of off-line handwritten text recognizers by providing them with additional synthetic training data.

Abstract: This paper presents a novel approach to creating a synthetic dataset for word recognition systems. Our purpose is to improve the performance of off-line handwritten text recognizers by providing them with additional synthetic training data. The lack of a proper dataset for many languages makes it hard to train recognition systems. To solve this problem, synthetic handwriting can be used to expand the existing training dataset. Any available digital data from online newspapers and similar sources can be used to generate this synthetic data. The digital data is distorted in such a way that the underlying pattern is preserved, so that the word can still be identified by both machine and human users. The resulting images can be used to train any classification system for handwriting recognition. This data can be used on its own to train the system or be combined with natural handwritten data to augment the original dataset and improve the accuracy of the results. We experimented using only synthetic data and obtained high recognition accuracy in both character and word recognition. The data was tested on three Indian scripts for numerals (Hindi, Bengali and Telugu) and one script (Hindi) for words; the results achieved are highly promising.

Proceedings Article•DOI•

Deep evolution of image representations for handwritten digit recognition

Alexandros Agapitos¹, Michael O'Neill¹, Miguel Nicolau¹, David Fagan¹, Ahmed Kattan¹, Anthony Brabazon¹, Kathleen M. Curran¹•Institutions (1)

University College Dublin¹

25 May 2015

TL;DR: Results on a popular handwritten digit recognition benchmark clearly demonstrate that two layers of feature transformations improve generalisation compared to a single layer, and the proposed system is shown to outperform several standard Genetic Programming systems.

Abstract: A training protocol for learning deep neural networks, called greedy layer-wise training, is applied to the evolution of a hierarchical, feed-forward Genetic Programming based system for feature construction and object recognition. Results on a popular handwritten digit recognition benchmark clearly demonstrate that two layers of feature transformations improve generalisation compared to a single layer. In addition, we show that the proposed system outperforms several standard Genetic Programming systems, which are based on hand-designed features and use different program representations and fitness functions.

Proceedings Article•DOI•

Recognizing Arabic Handwritten Script using Support Vector Machine classifier

Mohamed Elleuch¹, Houssem Lahiani², Monji Kherallah²•Institutions (2)

Manouba University¹, University of Sfax²

01 Dec 2015

TL;DR: This paper compares the performance of the proposed SVMs for AHS recognition with the character recognition rates of state-of-the-art Arabic OCR, with favorable results.

Abstract: Handwriting recognition ranks among the most successful applications in the pattern recognition domain. Although the field is well developed, many questions remain open and still pose a challenge, particularly for Arabic Handwritten Script (AHS). Recently, more attention has been given to the Support Vector Machine (SVM) classifier for script recognition. Nevertheless, it has not yet been applied to the handwritten Arabic field to the same extent as other methods such as ANN, CNN, RNN and HMM. This paper examines SVMs for AHS recognition. The suggested method takes handcrafted features as input and proceeds with a supervised learning algorithm. We chose a multi-class Support Vector Machine with an RBF kernel and tested it on the Handwritten Arabic Characters Database (HACDB). The simulation results show that the proposed method is effective. We compared the performance of this method with the character recognition rates of state-of-the-art Arabic OCR, with favorable results.

Proceedings Article•DOI•

Effective handwritten digit recognition based on multi-feature extraction and deep analysis

Caiyun Ma¹, Hong Zhang•Institutions (1)

Wuhan University of Science and Technology¹

01 Aug 2015

TL;DR: This paper normalizes images of various sizes and stroke thicknesses in preprocessing to eliminate negative information and keep relevant features, proposes specific feature definitions (structure, distribution and projection features), and fuses these features into deep neural networks for semantics recognition.

Abstract: Handwritten digit recognition is an important research topic in computer vision and pattern recognition. This paper proposes an effective handwritten digit recognition approach based on specific multi-feature extraction and deep analysis. First, we normalize images of various sizes and stroke thicknesses in preprocessing to eliminate negative information and keep relevant features. Second, considering that handwritten digit image recognition differs from traditional image semantics recognition, we propose specific feature definitions, including structure features, distribution features and projection features. Moreover, we fuse these multiple features into deep neural networks for semantics recognition. Experimental results on the MNIST benchmark database of handwritten digit images show that the performance of our algorithm is remarkable and demonstrate its superiority over several existing algorithms.

Proceedings Article•DOI•

On-line handwritten Gujarati character Recognition using low level stroke

Chhaya C Gohel¹, Mukesh M. Goswami¹, Vishal K Prajapati¹•Institutions (1)

Dharamsinh Desai University¹

21 Dec 2015

TL;DR: This paper presents a method based on low-level stroke features for recognition of online handwritten Gujarati characters and numerals, using a nearest neighbor (K-NN) classifier with k-fold cross validation on a dataset of 4500 samples from 45 different classes.

Abstract: This paper presents a method based on low-level stroke features for recognition of online handwritten Gujarati characters and numerals. A reasonably sized database of online handwritten Gujarati characters and numerals has been developed. This is the first such database of online handwritten symbols for Gujarati script. Hierarchical histograms of twelve different low-level stroke features and eight directional features were generated to capture the variation in strokes at different levels. Recognition is performed using a nearest neighbor (K-NN) classifier with k-fold cross validation on a dataset of 4500 samples from 45 different classes (37 characters and 8 numerals). The overall recognition rates achieved are 95%, 93% and 90% for the numerals dataset, the characters dataset and the combined dataset of numerals and characters, respectively.
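
The evaluation protocol named in this abstract, a K-NN classifier scored with k-fold cross validation, can be sketched as follows. This is only an illustration under stated assumptions: the two-dimensional toy vectors stand in for the paper's stroke-feature histograms, squared Euclidean distance is an assumed metric, and the fold count and the names `knn_predict` and `cross_validate` are hypothetical.

```python
from collections import Counter

def knn_predict(train, query, k=1):
    """Label a query by majority vote among its k nearest training samples.

    train is a list of (feature_vector, label) pairs; distance is squared
    Euclidean, which ranks neighbours the same way as Euclidean distance.
    """
    nearest = sorted(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_validate(samples, folds=5, k=1):
    """Return mean accuracy when each fold is held out once for testing."""
    correct = 0
    for f in range(folds):
        test = samples[f::folds]  # every folds-th sample, starting at f
        train = [s for i, s in enumerate(samples) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(samples)

# Toy data: two well-separated clusters standing in for two character classes.
samples = [((0.1 * i, 0.0), "a") for i in range(5)]
samples += [((5.0 + 0.1 * i, 5.0), "b") for i in range(5)]
accuracy = cross_validate(samples, folds=5, k=3)
```

Striding the folds (`samples[f::folds]`) keeps both classes present in every training split here; with real data, shuffling or stratifying the samples first would be the usual precaution.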

Proceedings Article•DOI•

Impact of zoning on Zernike moments for handwritten MODI character recognition

A. Kulkarni Sadanand¹, L. Borde Prashant¹, R Manza Ramesh¹, L. Yannawar Pravin¹•Institutions (1)

Dr. Babasaheb Ambedkar Marathwada University¹

01 Sep 2015

TL;DR: The work described in this paper demonstrates the efficiency of Zernike moments over Hu's seven moments with zoning for automatic recognition of handwritten 'MODI' characters.

Abstract: HOCR stands for Handwritten Optical Character Recognition. HOCR recognizes handwritten characters from a digital image of a document. Shape identification and feature extraction are very important parts of any OCR system. Feature extraction defines the shape of a character as precisely and as uniquely as possible. Zernike moments describe shape and, owing to their orthogonality, provide rotation-invariant descriptors. 'MODI' is an ancient script of India with a cursive and complex representation of characters. The work described in this paper demonstrates the efficiency of Zernike moments over Hu's seven moments with zoning for automatic recognition of handwritten 'MODI' characters. A recognition rate of 82.61% was achieved using the zone-based approach for Zernike moments.
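
Zoning, as mentioned in this abstract, splits a character image into a grid and computes a descriptor per zone, concatenating the results into one feature vector. The sketch below illustrates only the zoning step: it substitutes ink density for the Zernike moments used in the paper, and the grid size and the names `zone_features` and `density` are assumptions made for illustration.

```python
def zone_features(image, rows, cols, descriptor):
    """Split a 2D image into rows x cols zones and describe each zone.

    Zones are visited left to right, top to bottom, and the per-zone
    descriptor values are concatenated into a single feature vector.
    """
    h, w = len(image), len(image[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            zone = [
                row[c * w // cols:(c + 1) * w // cols]
                for row in image[r * h // rows:(r + 1) * h // rows]
            ]
            features.append(descriptor(zone))
    return features

def density(zone):
    """Fraction of ink pixels in a zone (a stand-in for Zernike moments)."""
    pixels = [p for row in zone for p in row]
    return sum(pixels) / len(pixels)

# Hypothetical 4x4 binary character image (1 = ink, 0 = background).
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
features = zone_features(image, rows=2, cols=2, descriptor=density)
```

Passing the descriptor as a function keeps the zoning logic independent of the moment computation, so a Zernike or Hu moment routine could be dropped in without changing `zone_features`.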
