
Showing papers on "Optical character recognition published in 2015"


Journal ArticleDOI
TL;DR: A novel system for the automatic detection and recognition of text in traffic signs, using maximally stable extremal regions (MSERs) and hue, saturation, and value (HSV) color thresholding to locate a large number of candidates, then interpreting the text contained within the detected candidate regions.
Abstract: We propose a novel system for the automatic detection and recognition of text in traffic signs. Scene structure is used to define search regions within the image, in which traffic sign candidates are then found. Maximally stable extremal regions (MSERs) and hue, saturation, and value color thresholding are used to locate a large number of candidates, which are then reduced by applying constraints based on temporal and structural information. A recognition stage interprets the text contained within detected candidate regions. Individual text characters are detected as MSERs and are grouped into lines, before being interpreted using optical character recognition (OCR). Recognition accuracy is vastly improved through the temporal fusion of text results across consecutive frames. The method is comparatively evaluated and achieves an overall $F_{\rm measure}$ of 0.87.

96 citations
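The temporal-fusion idea above can be sketched as a simple per-character majority vote over the OCR readings of the same sign across consecutive frames. This is an illustrative baseline, not the authors' exact fusion scheme:

```python
from collections import Counter

def fuse_readings(readings):
    """Fuse per-frame OCR readings of the same sign by majority vote.

    Readings whose length differs from the most common length are
    discarded, then each character position is voted on independently.
    """
    lengths = Counter(len(r) for r in readings)
    n = lengths.most_common(1)[0][0]
    aligned = [r for r in readings if len(r) == n]
    return "".join(
        Counter(chars).most_common(1)[0][0]
        for chars in zip(*aligned)
    )

print(fuse_readings(["SPEED", "SPEE0", "5PEED"]))  # SPEED
```

Even this crude vote recovers the correct string when each individual frame is misread in a different position, which is why the paper's temporal fusion "vastly" improves recognition accuracy.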


Proceedings ArticleDOI
22 Aug 2015
TL;DR: The proposed approach significantly outperforms standard binarization approaches in both F-measure and OCR accuracy when enough training samples are available.
Abstract: We propose to address the problem of Document Image Binarization (DIB) using Long Short-Term Memory (LSTM), which is specialized in processing very long sequences. The image is considered as a 2D sequence of pixels, and accordingly a 2D LSTM is employed to classify each pixel as text or background. The proposed approach processes the information using local context and then propagates it globally in order to achieve better visual coherence. The method is robust against most document artifacts. We show that, with a very simple network without any feature extraction and with a limited amount of data, the proposed approach works reasonably well on the DIBCO 2013 dataset. Furthermore, a synthetic dataset is used to measure the performance of the proposed approach against both binarization and OCR ground truth. The proposed approach significantly outperforms standard binarization approaches in both F-measure and OCR accuracy when enough training samples are available.

52 citations
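The paper's 2D LSTM learns a per-pixel text/background decision; a much simpler non-learned stand-in for that decision is local-mean adaptive thresholding, sketched below. The window size and bias factor are illustrative parameters, not values from the paper:

```python
def binarize(gray, win=3, bias=0.95):
    """Classify each pixel as text (1) or background (0) by comparing it
    to the mean of its (2*win+1) x (2*win+1) local window.  A simple
    non-learned baseline for per-pixel binarization."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            ys = range(max(0, r - win), min(h, r + win + 1))
            xs = range(max(0, c - win), min(w, c + win + 1))
            vals = [gray[y][x] for y in ys for x in xs]
            mean = sum(vals) / len(vals)
            out[r][c] = 1 if gray[r][c] < bias * mean else 0  # dark = text
    return out
```

The local window plays the role of the "local context" the paper mentions; what the LSTM adds is the global propagation of that evidence for better visual coherence.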


Journal ArticleDOI
TL;DR: A new optical character recognition (OCR) approach that allows real-time, automatic extraction and recognition of digits in images and videos, outperforming state-of-the-art methods.

50 citations


Journal ArticleDOI
TL;DR: A new system to detect and recognize Brazilian vehicle license plates at locations where registered users have permission to enter, achieving a 98.5% success rate on the tested cases.

45 citations


Proceedings ArticleDOI
08 Feb 2015
TL;DR: This paper shows that using a simple pre-processing step that normalizes the position and baseline of letters, an Arabic handwriting recognition method can make use of 1D LSTM, which is faster in learning and convergence, and yet achieve superior performance.
Abstract: In this paper, we present an Arabic handwriting recognition method based on recurrent neural networks. We use the Long Short-Term Memory (LSTM) architecture, which has proven successful in various printed and handwritten OCR tasks. Applications of LSTM to handwriting recognition typically employ a two-dimensional architecture to deal with variations along both the vertical and horizontal axes. However, we show that with a simple pre-processing step that normalizes the position and baseline of letters, we can make use of a 1D LSTM, which is faster in learning and convergence, and yet achieves superior performance. In a series of experiments on the IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained on manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can outperform even such systems.

44 citations


Proceedings ArticleDOI
01 Sep 2015
TL;DR: A new OCR correction strategy, customised for historical medical documents, which combines rule-based correction of regular errors with a medically-tuned spell-checking strategy, whose corrections are guided by information about subject-specific language usage from the publication period of the article to be corrected.
Abstract: Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, owing to large-scale digitisation efforts. Searchable access is typically provided by applying Optical Character Recognition (OCR) software to scanned page images. Often, however, the automatically recognised text contains a large number of errors, since OCR systems are typically optimised to deal with modern documents, and can struggle with historical document features, including variable print characteristics and archaic vocabulary usage. Low quality OCR text can reduce the efficiency of search systems over historical archives, particularly semantic systems that are based on the application of sophisticated text mining (TM) techniques. We report on a new OCR correction strategy, customised for historical medical documents. The method combines rule-based correction of regular errors with a medically-tuned spell-checking strategy, whose corrections are guided by information about subject-specific language usage from the publication period of the article to be corrected. The performance of our method compares favourably to other OCR post-correction strategies, in improving word-level accuracy of poor-quality documents by up to 16%.

37 citations
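The combination of rule-based correction of regular errors with lexicon-guided checking can be sketched as below. The confusion rules and the tiny lexicon are hypothetical stand-ins for the medically-tuned, period-specific resources the paper actually uses:

```python
# Hypothetical regular OCR confusions and a toy period lexicon.
RULES = [("rn", "m"), ("vv", "w"), ("1", "l")]
LEXICON = {"inflammation", "morbid", "fever", "the"}

def correct(token):
    """Apply confusion rules breadth-first, accepting the first rewrite
    found in the lexicon; otherwise leave the token unchanged."""
    if token in LEXICON:
        return token
    candidates = [token]
    for cand in candidates:          # list grows as rewrites are queued
        for src, dst in RULES:
            i = cand.find(src)
            while i != -1:
                new = cand[:i] + dst + cand[i + len(src):]
                if new in LEXICON:
                    return new
                candidates.append(new)
                i = cand.find(src, i + 1)
    return token                     # no confident correction
```

For example, `correct("inflarnrnation")` needs two applications of the `rn -> m` rule before the rewrite lands in the lexicon, which is the kind of regular error a pure spell-checker struggles with.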


Journal ArticleDOI
TL;DR: This work formulates the word segmentation problem as a binary quadratic assignment problem that considers pairwise correlations between gaps as well as the likelihoods of individual gaps, and estimates all parameters within the structured SVM framework so that the proposed method works well regardless of writing style and written language, without user-defined parameters.
Abstract: Segmentation of handwritten document images into text lines and words is an essential task for optical character recognition. However, since the features of handwritten documents are irregular and vary from writer to writer, it is considered a challenging problem. To address it, we formulate the word segmentation problem as a binary quadratic assignment problem that considers pairwise correlations between gaps as well as the likelihoods of individual gaps. Even though many parameters are involved in our formulation, we estimate them all within the structured SVM framework, so that the proposed method works well regardless of writing style and written language, without user-defined parameters. Experimental results on the ICDAR 2009/2013 handwriting segmentation databases show that the proposed method achieves state-of-the-art performance on Latin-based and Indian languages.

37 citations
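For contrast with the paper's joint formulation (binary quadratic assignment trained with a structured SVM), a naive gap-threshold baseline for grouping character boxes into words might look like this. The half-of-largest-gap cut is a crude heuristic that assumes at least one true inter-word gap exists:

```python
def split_words(boxes):
    """Group character bounding boxes (x0, x1) into words by cutting at
    horizontal gaps wider than half the largest gap.  Each gap is judged
    independently; the paper instead scores all gaps jointly."""
    boxes = sorted(boxes)
    gaps = [b[0] - a[1] for a, b in zip(boxes, boxes[1:])]
    if not gaps:
        return [boxes]
    cut = max(gaps) / 2
    words, cur = [], [boxes[0]]
    for gap, box in zip(gaps, boxes[1:]):
        if gap > cut:
            words.append(cur)
            cur = []
        cur.append(box)
    words.append(cur)
    return words
```

Such per-gap thresholds break down when inter-character spacing varies with writing style, which is exactly the failure mode the pairwise-correlation term in the paper's formulation is designed to fix.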


Book ChapterDOI
TL;DR: LeNet-5, a convolutional neural network trained with gradient-based learning and the backpropagation algorithm, is used for classification of Malayalam character images; results for the multi-class classifier show that CNN performance drops when the number of classes exceeds about 40.
Abstract: Optical Character Recognition (OCR) plays an important role in information retrieval, converting scanned documents into machine-editable and searchable text formats. This work focuses on the recognition part of OCR. LeNet-5, a Convolutional Neural Network (CNN) trained with gradient-based learning and the backpropagation algorithm, is used for classification of Malayalam character images. Results for the multi-class classifier show that CNN performance drops when the number of classes exceeds about 40. Accuracy is improved by grouping misclassified characters together. Without grouping, the CNN gives an average accuracy of 75%; after grouping, performance improves to 92%. Inner-level classification is done using a multi-class SVM, which gives an average accuracy in the range of 99-100%.

36 citations


Posted Content
TL;DR: The winning text image super-resolution framework largely improves OCR performance when low-resolution images are used as input, reaching an OCR accuracy score of 77.19%, comparable with that of the original high-resolution images.
Abstract: Text image super-resolution is a challenging yet open research problem in the computer vision community. In particular, low-resolution images hamper the performance of typical optical character recognition (OCR) systems. In this article, we summarize our entry to the ICDAR 2015 Competition on Text Image Super-Resolution. Experiments are based on the provided ICDAR 2015 TextSR dataset [3] and the released Tesseract-OCR 3.02 system [1]. We report that our winning text image super-resolution framework largely improves OCR performance when low-resolution images are used as input, reaching an OCR accuracy score of 77.19%, which is comparable with that of the original high-resolution images (78.80%). Index Terms: super-resolution; optical character recognition.

33 citations
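OCR accuracy scores like the ones above are typically derived from the edit distance between recognized text and ground truth. A minimal sketch of such a character-level accuracy measure (not necessarily the competition's exact scoring script):

```python
def levenshtein(a, b):
    """Edit distance by dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def ocr_accuracy(recognized, ground_truth):
    """Character accuracy: 1 - edit_distance / len(ground_truth)."""
    return 1.0 - levenshtein(recognized, ground_truth) / len(ground_truth)

print(ocr_accuracy("he1lo", "hello"))  # 0.8
```

Under this kind of measure, the reported gap between 77.19% (super-resolved input) and 78.80% (original high-resolution input) corresponds to fewer than two extra character errors per hundred ground-truth characters.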


Journal ArticleDOI
TL;DR: A segmentation-based method is proposed for developing Nastalique OCR, deriving principles and techniques for pre-processing and recognition; the work is extensible to other languages written in Nastalique.
Abstract: Much work on Arabic-script optical character recognition (OCR) has targeted the Naskh writing style. The Nastalique style, used for most languages written in the Arabic script across Southern Asia, is much more challenging to process due to its compactness, cursiveness, higher context sensitivity and diagonality. These traits make Nastalique writing more complex, with multiple letters horizontally overlapping each other. For these reasons, existing methods developed for Naskh do not work for Nastalique, and most work on Nastalique has therefore used non-segmentation methods. The current paper presents a new segmentation-based approach for the Nastalique style. It explains the complexity of Nastalique and why Naskh-based techniques cannot work for it, and proposes a segmentation-based method for developing Nastalique OCR, deriving principles and techniques for pre-processing and recognition. The OCR is developed for the Urdu language. The system is optimized using 79,093 instances of 5,249 main bodies derived from a corpus of 18 million words, giving a recognition accuracy of 97.11%. The system is then tested on document images of books, with 87.44% main-body recognition accuracy. The work is extensible to other languages written in Nastalique.

30 citations


Patent
26 May 2015
TL;DR: In this article, the authors present a multithreaded approach for processing an image to recognize and locate text in the image, and providing the recognized text to an application executing on the device for performing a function (e.g., calling a number, opening an internet browser, etc.) associated with the recognized text.
Abstract: Various embodiments enable a device to perform tasks such as processing an image to recognize and locate text in the image, and providing the recognized text to an application executing on the device for performing a function (e.g., calling a number, opening an internet browser, etc.) associated with the recognized text. In at least one embodiment, processing the image includes substantially simultaneously or concurrently processing the image with at least two recognition engines, such as at least two optical character recognition (OCR) engines, running in a multithreaded mode. In at least one embodiment, the recognition engines can be tuned so that their respective processing speeds are roughly the same. Utilizing multiple recognition engines enables processing latency to be close to that of using only one recognition engine.

Journal ArticleDOI
TL;DR: This paper introduces a novel text detection technique that uses blocks obtained by frame decomposition to extract text coordinates with the MapReduce programming model; the running speed can be more than twice that of the classic approach.

Proceedings ArticleDOI
23 Aug 2015
TL;DR: Evaluation results show that the character error rate obtained with LSTM varies from 5.51% to 14.68% and is better than two well-known OCR engines, namely, Tesseract and ABBYY FineReader.
Abstract: This paper reports on high-performance Optical Character Recognition (OCR) experiments using Long Short-Term Memory (LSTM) networks for Greek polytonic script. Even though there are many Greek polytonic manuscripts, the digitization of such documents has not been widely applied, and very limited work has been done on the recognition of these scripts. We have collected a large number of diverse document pages of Greek polytonic scripts in a novel database, called Polyton-DB, containing 15,689 text lines of synthetic and authentic printed scripts, and performed baseline experiments using LSTM networks. Evaluation results show that the character error rate obtained with LSTM varies from 5.51% to 14.68% (depending on the document) and is better than that of two well-known OCR engines, namely Tesseract and ABBYY FineReader.

Proceedings ArticleDOI
19 Mar 2015
TL;DR: This work proposes a combined horizontal and vertical projection feature extraction scheme for recognition of characters of Gurmukhi, an Indic script commonly used in the state of Punjab in India.
Abstract: Despite the advancements in Optical Character Recognition (OCR) technologies, the problem of Indic script character recognition remains challenging, and the challenges are even greater for handwritten characters. In this work, we focus on off-line recognition of handwritten characters of Gurmukhi, an Indic script commonly used in the state of Punjab in India. As part of this work, we collected a Gurmukhi character dataset of 3500 images from 10 writers. We propose a combined horizontal and vertical projection feature extraction scheme for recognition of Gurmukhi characters. We tested our method on the collected dataset and achieved a high character recognition accuracy of 98.06%.
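Combined horizontal and vertical projection features can be sketched for a binary character image as normalized row and column foreground counts. This is a generic illustration of projection profiles, not the authors' exact feature vector:

```python
def projection_features(img):
    """Horizontal and vertical projection profiles of a binary image:
    per-row and per-column foreground-pixel counts, normalized by image
    width/height so features are comparable across samples."""
    h, w = len(img), len(img[0])
    horiz = [sum(row) / w for row in img]                          # one value per row
    vert = [sum(img[r][c] for r in range(h)) / h for c in range(w)]  # one per column
    return horiz + vert
```

The concatenated profile captures where ink mass sits along each axis, which is often enough to separate characters with distinct stroke layouts.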

Proceedings ArticleDOI
19 Mar 2015
TL;DR: The objective of this paper is to recognize text from images, for the better understanding of the reader, by using a particular sequence of different processing modules.
Abstract: Text recognition in images is a research area which attempts to develop a computer system with the ability to automatically read text from images. These days there is a huge demand for storing the information available in paper documents in computer storage and later reusing this information through a searching process. One simple way to store information from these paper documents in a computer system is to first scan the documents and then store them as images. However, to reuse this information it is very difficult to read the individual contents and search the contents of these documents line-by-line and word-by-word. The challenges involved are the font characteristics of the characters in paper documents and the quality of the images. Due to these challenges, a computer is unable to recognize the characters while reading them. Thus there is a need for character recognition mechanisms to perform Document Image Analysis (DIA), which transforms documents in paper format to electronic format. In this paper we discuss a method for text recognition from images. The objective of this paper is to recognize text from images, for the better understanding of the reader, by using a particular sequence of different processing modules.

Proceedings ArticleDOI
01 Nov 2015
TL;DR: The general architecture of a modern OCR system, with details of each module, is discussed; Moore neighborhood tracing is applied to extract character boundaries, followed by chain codes for feature extraction.
Abstract: Artificial intelligence, pattern recognition and computer vision have significant importance in the field of electronics and image processing. Optical character recognition (OCR) is one of the main aspects of pattern recognition and has evolved greatly since its beginning. OCR is a system which recognizes readable characters from optical data and converts them into digital form. Various methodologies have been developed for this purpose using different approaches. In this paper, the general architecture of a modern OCR system, with details of each module, is discussed. We applied Moore neighborhood tracing to extract character boundaries and then chain codes for feature extraction. In the classification stage for character recognition, an SVM is trained and applied to suitable examples.
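Moore neighborhood boundary tracing followed by chain-code extraction, as described above, can be sketched as follows. This is a simplified version that traces only the first connected component found in raster order:

```python
# Clockwise Moore neighbourhood, starting due north.
DIRS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def moore_trace(img):
    """Trace the outer boundary of the first connected component found
    in raster order (img is a 2D list of 0/1)."""
    h, w = len(img), len(img[0])
    start = next((r, c) for r in range(h) for c in range(w) if img[r][c])
    boundary, cur = [start], start
    prev = (start[0], start[1] - 1)        # backtrack: the pixel just west
    while True:
        i = DIRS.index((prev[0] - cur[0], prev[1] - cur[1]))
        for k in range(1, 9):              # scan clockwise from the backtrack
            j = (i + k) % 8
            nr, nc = cur[0] + DIRS[j][0], cur[1] + DIRS[j][1]
            if 0 <= nr < h and 0 <= nc < w and img[nr][nc]:
                jp = (i + k - 1) % 8       # last background cell checked
                prev = (cur[0] + DIRS[jp][0], cur[1] + DIRS[jp][1])
                cur = (nr, nc)
                break
        else:
            break                          # isolated pixel, nothing to trace
        if cur == start:
            break
        boundary.append(cur)
    return boundary

def chain_code(boundary):
    """8-direction Freeman chain code of a closed boundary."""
    closed = boundary + [boundary[0]]
    return [DIRS.index((b[0] - a[0], b[1] - a[1]))
            for a, b in zip(closed, closed[1:])]
```

The chain code turns the traced boundary into a compact, translation-invariant direction sequence, which is the kind of feature the paper feeds to its SVM classifier.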

Proceedings ArticleDOI
10 Jan 2015
TL;DR: A method for the automatic detection and recognition of text and symbols painted on the road surface is presented and achieves F-measures of 0.85 for text characters and 0.91 for symbols.
Abstract: A method for the automatic detection and recognition of text and symbols painted on the road surface is presented. Candidate regions are detected as maximally stable extremal regions (MSER) in a frame which has been transformed into an inverse perspective mapping (IPM) image, showing the road surface with the effects of perspective distortion removed. Detected candidates are then sorted into words and symbols, before they are interpreted using separate recognition stages. Symbol-based road markings are recognised using histogram of oriented gradient (HOG) features and support vector machines (SVM). Text-based road signs are recognised using a third-party optical character recognition (OCR) package, after application of a perspective correction stage. Matching of regions between frames, and temporal fusion of results is used to improve performance. The proposed method is validated using a data-set of videos, and achieves F-measures of 0.85 for text characters and 0.91 for symbols.

Journal ArticleDOI
TL;DR: This paper attempts the problem of Optical Character Recognition for handwritten Gujarati alphabets using SVM and kNN, and compares the kNN results with those of SVM.
Abstract: The Gujarati language is used in the western Indian state of Gujarat. Because of its peculiarities, Optical Character Recognition for Gujarati is very difficult, and very little work has been done on it. In this paper, I have attempted the problem of Optical Character Recognition for handwritten Gujarati alphabets. For this work, forty handwritten alphabets were collected from about one hundred and ninety-nine writers. Aspect ratio, extent of the alphabet, and an image subdivision approach are used as the feature space, and a support vector machine (SVM) is used for classification, giving 86.66% accuracy. kNN is also used for classification, and its result is compared with that of SVM. The paper also describes the support vector machine.

Proceedings ArticleDOI
07 Apr 2015
TL;DR: A new and simple, but fast and efficient technique for automatic number plate recognition (ANPR) using SIFT (Scale Invariant Feature Transform) features, which is used to automatically locate and recognize, as a special case, the Jordanian license plates.
Abstract: This paper presents a new and simple, but fast and efficient, technique for automatic number plate recognition (ANPR) using SIFT (Scale-Invariant Feature Transform) features. The proposed system is used to automatically locate and recognize, as a special case, Jordanian license plates. At the core of our system, a SIFT-based template matching technique is used to locate special marks in the license plate. Upon successful detection of those marks, the license plate is segmented out from the original image and OCR (Optical Character Recognition) is used to recognize the characters and numbers on the plate. Due to the various invariance virtues of SIFT, our method can adaptively deal with various changes in the license plates, such as rotation, scaling, and illumination. Experimental results using real datasets are presented, which show that our system performs well.

Journal ArticleDOI
TL;DR: This paper presents an application of optical word recognition and fuzzy control to a smartphone automatic test system; the proposed control scheme allows the robot arm to perform the different assigned test functions successfully.
Abstract: This paper presents an application of optical word recognition and fuzzy control to a smartphone automatic test system. The system consists of a robot arm and two webcams. After the words from the control panel that represent commands are recognized by the robot system, the robot arm performs the corresponding actions to test the smartphone. One of the webcams is utilized to capture commands on the screen of the control panel, the other to recognize the words on the screen of the tested smartphone. The method of image processing is based on the Red-Green-Blue (RGB) and Hue-Saturation-Luminance (HSL) color spaces to reduce the influence of light. Fuzzy theory is used in the robot arm’s position control. The Optical Character Recognition (OCR) technique is applied to the word recognition, and the recognition results are then checked by a dictionary process to increase the recognition accuracy. The camera which is used to recognize the tested smartphone also provides object coordinates to the fuzzy controller, then the robot arm moves to the desired positions and presses the desired buttons. The proposed control scheme allows the robot arm to perform different assigned test functions successfully.
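The RGB-to-HSL conversion used above to reduce the influence of light can be illustrated with Python's standard `colorsys` module (note it uses the HLS argument order): hue stays largely stable under illumination changes, which mainly move lightness.

```python
import colorsys

def hue_of(rgb):
    """Hue (in degrees) of an 8-bit RGB pixel.  Varying illumination
    mostly shifts the L channel while leaving H largely stable, which is
    why hue is the robust channel for colour-based detection."""
    r, g, b = (v / 255.0 for v in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)   # note: HLS ordering, not HSL
    return h * 360.0

print(hue_of((255, 0, 0)))   # 0.0 (red)
print(hue_of((0, 255, 0)))   # approx. 120.0 (green)
```

Thresholding on hue (plus a minimum saturation to reject greys) is a common way to make colour segmentation tolerant of the lighting variation the paper mentions.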

Proceedings ArticleDOI
01 Dec 2015
TL;DR: The main aim of this paper is to propose efficient feature extraction and classification techniques for an OCR system for handwritten Kannada characters and numerals, involving several phases: preprocessing, feature extraction and classification.
Abstract: A frontier area of research in the field of pattern recognition and image processing is handwritten character recognition, which creates a great demand for OCR systems that handle handwritten documents. In order to recognize the text present in a document, an Optical Character Recognition (OCR) system is developed. In this paper, an OCR system for handwritten Kannada characters and numerals is developed, involving several phases: preprocessing, feature extraction and classification. Preprocessing includes techniques suitable for converting the input image into an acceptable form for feature extraction. The main aim of this paper is to propose efficient feature extraction and classification techniques. Suitable structural features are extracted, and the wavelet transform is employed for extracting global features. An artificial neural network classifier is used for recognizing the handwritten Kannada characters and numerals. The proposed method was evaluated on 4800 images of handwritten Kannada characters, obtaining an average accuracy of 91.00%, and on 1000 images of handwritten Kannada numerals, obtaining an average accuracy of 97.60%.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: The proposed approach covers the preprocessing, feature extraction and classification of Urdu-language text, using three feature extraction techniques, the Hu moments, Zernike moments and Principal Component Analysis (PCA), with a decision tree for classification.
Abstract: This article reports the development and experimental analysis of an Urdu Optical Character Recognition (OCR) system. The proposed approach covers the preprocessing, feature extraction and classification of Urdu-language text. Three different feature extraction techniques, the Hu moments, Zernike moments and Principal Component Analysis (PCA), are used. The decision tree algorithm J48 is used for classification. A medium-sized database of 441 characters is created, consisting of handwritten and machine-written Urdu characters. An overall best recognition accuracy of 92.06% is achieved using the Hu moments.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: In this study, an Android application is developed by integrating the Tesseract OCR engine, the Bing translator and the phone's built-in speech-output technology, helping travelers who visit a foreign country understand messages written in a different language.
Abstract: Smartphones are among the most commonly used electronic devices in daily life today. As the hardware embedded in smartphones can perform many more tasks than traditional phones, a smartphone is no longer just a communication device but also a powerful computing device able to capture images, record videos, surf the internet, and so on. With the advancement of technology, it is possible to apply techniques for text detection and translation. Therefore, an application that allows a smartphone to capture an image, extract the text from it, translate it into English and speak it out is no longer a dream. In this study, an Android application is developed by integrating the Tesseract OCR engine, the Bing translator and the phone's built-in speech-output technology. The final deliverable was tested by various types of target end users from different language backgrounds, and it was concluded that the application benefits many users. Using this app, travelers who visit a foreign country are able to understand messages written in a different language. Visually impaired users are also able to access important messages from printed text through the speech-output feature.

Proceedings ArticleDOI
23 Aug 2015
TL;DR: This paper proposes a word-level script identification technique for six handwritten Indic scripts, Bangla, Devanagari, Gurumukhi, Malayalam, Oriya and Telugu, plus the Roman script, using a combination of elliptical and polygonal approximation techniques.
Abstract: Automatic script identification from handwritten document images facilitates many important applications such as indexing, sorting and triage. A given Optical Character Recognition (OCR) system is typically trained on only a single script, but for documents or collections containing different scripts there must be some way to automatically identify the script prior to OCR. For Indic scripts, some results have been reported in the literature, but the task is far from solved. In this paper, we propose a word-level script identification technique for six handwritten Indic scripts, Bangla, Devanagari, Gurumukhi, Malayalam, Oriya and Telugu, plus the Roman script. A set of 82 features has been designed using a combination of elliptical and polygonal approximation techniques. Our approach has been evaluated on a dataset of 7000 handwritten text words using multiple classifiers. A Multi-Layer Perceptron (MLP) classifier was found to perform best, yielding 95.35% accuracy. The result is encouraging considering the complexities and shape variations of the Indic scripts.

Proceedings ArticleDOI
01 Oct 2015
TL;DR: The objective of this review paper is to summarize the well-known methods for text recognition from images, for the better understanding of the reader.
Abstract: Text recognition in images is an active research area which attempts to develop computer applications with the ability to automatically read text from images. Nowadays there is a huge demand for storing the information available in paper documents in a computer-readable form for later use. One simple way to store information from these paper documents in a computer system is to first scan the documents and then store them as images. However, to reuse this information it is very difficult to read the individual contents and search the contents of these documents line-by-line and word-by-word. The challenges involved are the font characteristics of the characters in paper documents and the quality of the images. Due to these challenges, a computer is unable to recognize the characters while reading them. Thus, there is a need for character recognition mechanisms to perform document image analysis, which transforms documents in paper format to electronic format. In this paper, we review and analyze different methods for text recognition from images. The objective of this review paper is to summarize the well-known methods for the better understanding of the reader.

Proceedings ArticleDOI
23 Aug 2015
TL;DR: This research evaluates the performance of sequence classifiers such as HMM and LSTM, compares them with a descriptor-based classifier using SIFT, and introduces a database of 480,000 images containing 1000 unique ligatures, or sub-words, of Pashto.
Abstract: Optical Character Recognition (OCR) of cursive scripts like Pashto and Urdu is difficult due to the presence of complex ligatures and connected writing styles. In this paper, we evaluate and compare different approaches for the recognition of such complex ligatures. The approaches include the Hidden Markov Model (HMM), the Long Short-Term Memory (LSTM) network and the Scale-Invariant Feature Transform (SIFT). The current state of the art in cursive script recognition assumes constant scale without any rotation, while real-world data contain rotation and scale variations. This research aims to evaluate the performance of sequence classifiers like HMM and LSTM and compare them with a descriptor-based classifier using SIFT. In addition, we assess the performance of these methods under the scale and rotation variations found in cursive-script ligatures. Moreover, we introduce a database of 480,000 images containing 1000 unique ligatures, or sub-words, of Pashto; each ligature has 40 scale and 12 rotation variations. The evaluation results show a significantly better performance of LSTM over HMM and over a traditional feature extraction technique such as SIFT.

Book ChapterDOI
24 Nov 2015
TL;DR: The proposed character segmentation technique can cope with high variation in writing style and skewed header lines, and can be used as part of an OCR system for the cursive handwritten Hindi language.
Abstract: Proper character-level segmentation of printed or handwritten text is an important preprocessing step for optical character recognition (OCR). Languages with a cursive writing style make the segmentation problem much more complicated, and Hindi is a well-known Indian language with such a cursive writing style. The main challenge in handwritten character segmentation is handling the inherent variability in the writing styles of different individuals. In this paper, we present an efficient character segmentation method for handwritten Hindi words. Segmentation is performed on the basis of structural patterns observed in the writing style of this language. The proposed method can cope with high variation in writing style and with skewed header lines as input. The method has been tested on our own database of both printed and handwritten words. The average success rate is 96.93%. The method yields fairly good results on this database compared with other existing methods. We foresee that the proposed character segmentation technique can be used as part of an OCR system for the cursive handwritten Hindi language.

Posted Content
TL;DR: An end-to-end framework that segments the text image, classifies the characters and extracts lines using a language model using a deep convolutional neural network to achieve acceptable error rates is presented.
Abstract: In this paper, we address the task of Optical Character Recognition(OCR) for the Telugu script. We present an end-to-end framework that segments the text image, classifies the characters and extracts lines using a language model. The segmentation is based on mathematical morphology. The classification module, which is the most challenging task of the three, is a deep convolutional neural network. The language is modelled as a third degree markov chain at the glyph level. Telugu script is a complex alphasyllabary and the language is agglutinative, making the problem hard. In this paper we apply the latest advances in neural networks to achieve state-of-the-art error rates. We also review convolutional neural networks in great detail and expound the statistical justification behind the many tricks needed to make Deep Learning work.
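The glyph-level third-order Markov chain described above can be sketched as next-glyph counts over length-3 glyph contexts. This is an illustrative toy trained on a character string, not the paper's Telugu glyph model:

```python
from collections import Counter, defaultdict

def train_lm(glyphs, order=3):
    """Count next-glyph frequencies for every length-`order` context."""
    counts = defaultdict(Counter)
    for i in range(len(glyphs) - order):
        counts[tuple(glyphs[i:i + order])][glyphs[i + order]] += 1
    return counts

def prob(counts, context, glyph):
    """Maximum-likelihood P(glyph | context); 0.0 for unseen contexts."""
    ctx = counts[tuple(context)]
    total = sum(ctx.values())
    return ctx[glyph] / total if total else 0.0
```

In an OCR pipeline, such context probabilities rescore the classifier's candidate glyphs so that visually ambiguous shapes are resolved toward sequences the language actually produces.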

Proceedings ArticleDOI
08 Feb 2015
TL;DR: This work proposes an NR-IQA method with OCR accuracy as the objective quality measure; it combines distortion-specific quality metrics, achieves results competitive with learning-based NR-IQA methods on standard datasets, and performs better on heterogeneous documents.
Abstract: No-reference image quality assessment (NR-IQA) aims at computing an image quality score that best correlates with either human-perceived image quality or an objective quality measure, without any prior knowledge of reference images. Although learning-based NR-IQA methods have achieved the best state-of-the-art results so far, those methods perform well only on the datasets on which they were trained. The datasets usually contain homogeneous documents, whereas in reality, document images come from different sources. It is unrealistic to collect training samples of images from every possible capturing device and every document type. Hence, we argue that a metric-based IQA method is more suitable for heterogeneous documents. We propose an NR-IQA method with OCR accuracy as the objective quality measure. The method combines distortion-specific quality metrics, and the final quality score is calculated taking into account the proportions of, and the dependency among, different distortions. Experimental results show that the method achieves results competitive with learning-based NR-IQA methods on standard datasets, and performs better on heterogeneous documents.

Journal ArticleDOI
TL;DR: Two original recognition methods are presented: the first is based on the application of mathematical fuzzy logic, and the second on representing an image by a fuzzy-valued function; both are compared with a simple neural network classifier and a few other common methods.