
Showing papers on "Optical character recognition published in 2013"


Proceedings ArticleDOI
01 Dec 2013
TL;DR: This work describes Photo OCR, a system for text extraction from images that is capable of recognizing text in a variety of challenging imaging conditions where traditional OCR systems fail, notably in the presence of substantial blur, low resolution, low contrast, high image noise and other distortions.
Abstract: We describe Photo OCR, a system for text extraction from images. Our particular focus is reliable text extraction from smartphone imagery, with the goal of text recognition as a user input modality similar to speech recognition. Commercially available OCR performs poorly on this task. Recent progress in machine learning has substantially improved isolated character classification; we build on this progress by demonstrating a complete OCR system using these techniques. We also incorporate modern data center-scale distributed language modelling. Our approach is capable of recognizing text in a variety of challenging imaging conditions where traditional OCR systems fail, notably in the presence of substantial blur, low resolution, low contrast, high image noise and other distortions. It also operates with low latency: mean processing time is 600 ms per image. We evaluate our system on public benchmark datasets for text extraction and outperform all previously reported results, more than halving the error rate on multiple benchmarks. The system is currently in use in many applications at Google, and is available as a user input modality in Google Translate for Android.

499 citations


Journal ArticleDOI
TL;DR: It is argued that the next step in the evolution of object recognition algorithms will require radical and bold steps forward in terms of the object representations, as well as the learning and inference algorithms used.

312 citations


Proceedings ArticleDOI
25 Aug 2013
TL;DR: An application of bidirectional LSTM networks to the problem of machine-printed Latin and Fraktur recognition; the reported recognition accuracies were achieved without any language modelling or other post-processing techniques.
Abstract: Long Short-Term Memory (LSTM) networks have yielded excellent results on handwriting recognition. This paper describes an application of bidirectional LSTM networks to the problem of machine-printed Latin and Fraktur recognition. Latin and Fraktur recognition differs significantly from handwriting recognition in both the statistical properties of the data and the required, much higher levels of accuracy. Applications of LSTM networks to handwriting recognition use two-dimensional recurrent networks, since the exact position and baseline of handwritten characters is variable. In contrast, for printed OCR, we used a one-dimensional recurrent network combined with a novel algorithm for baseline and x-height normalization. A number of databases were used for training and testing, including the UW3 database, artificially generated and degraded Fraktur text, and scanned pages from a book digitization project. The LSTM architecture achieved 0.6% character-level test-set error on English text. When the artificially degraded Fraktur data set is divided into training and test sets, the system achieves an error rate of 1.64%. On specific books printed in Fraktur (not part of the training set), the system achieves error rates of 0.15% (Fontane) and 1.47% (Ersch-Gruber). These recognition accuracies were achieved without using any language modelling or any other post-processing techniques.
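The abstract credits much of the gain to pairing a one-dimensional LSTM with baseline and x-height normalization of each text line. The paper's exact normalization algorithm is not given here, so the following is only a minimal sketch of the general idea, assuming a binarized line image with ink pixels equal to 1: crop the line to its ink rows via a horizontal projection and rescale it to a fixed height before feeding it to the network.

```python
# Minimal sketch of text-line height normalization for 1D-LSTM OCR.
# Illustrative stand-in only; the paper's baseline/x-height algorithm
# is more refined than this simple crop-and-rescale.
import numpy as np
from scipy.ndimage import zoom

def normalize_line(line: np.ndarray, target_height: int = 48) -> np.ndarray:
    """Crop a binary text line (ink = 1) to its ink rows, rescale to fixed height."""
    rows = line.sum(axis=1)                      # horizontal ink projection
    ink = np.flatnonzero(rows > 0)
    if ink.size == 0:                            # blank line: return empty canvas
        return np.zeros((target_height, line.shape[1]))
    cropped = line[ink[0]:ink[-1] + 1, :].astype(float)
    scale = target_height / cropped.shape[0]
    return zoom(cropped, (scale, scale), order=1)  # keep aspect ratio
```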

241 citations


Journal ArticleDOI
TL;DR: This paper offers researchers a link to a public image database for the algorithm assessment of text extraction from natural scene images, and draws attention to studies on the first two steps of the extraction process, since OCR is a well-studied area where powerful algorithms already exist.

149 citations


Journal ArticleDOI
TL;DR: A comprehensive survey of recent developments in Arabic handwriting recognition, including a summary of the characteristics of Arabic text, followed by a general model for an Arabic text recognition system.
Abstract: Research in offline Arabic handwriting recognition has increased considerably in the past few years. This is evident from the numerous research results published recently in major journals and conferences in the area of handwriting recognition. The features and classification techniques utilized in recent research work have diversified noticeably compared to the past. Moreover, in the last few years more effort has been directed toward constructing different databases for Arabic handwriting recognition. This article provides a comprehensive survey of recent developments in Arabic handwriting recognition. The article starts with a summary of the characteristics of Arabic text, followed by a general model for an Arabic text recognition system. The databases used for Arabic text recognition are then discussed. Research on the preprocessing phase, such as text representation, baseline detection, and line, word, character, and subcharacter segmentation algorithms, is presented. Different feature extraction techniques used in Arabic handwriting recognition are identified and discussed. Different classification approaches, such as HMM, ANN, SVM, k-NN, and syntactical methods, are discussed in the context of Arabic handwriting recognition. Work on Arabic lexicon construction and spell checking is presented for the postprocessing phase. Several summary tables of published research are provided, covering the Arabic text databases used and the results reported on Arabic character, word, numeral, and text recognition; these tables summarize the features, classifiers, data, and reported recognition accuracy for each technique. Finally, we discuss some future research directions in Arabic handwriting recognition.

135 citations


Journal ArticleDOI
TL;DR: The proposed COSFIRE filters are conceptually simple and easy to implement; they are versatile keypoint detectors and highly effective in practical computer vision applications.
Abstract: Background: Keypoint detection is important for many computer vision applications. Existing methods suffer from insufficient selectivity regarding the shape properties of features and are vulnerable to contrast variations and to the presence of noise or texture. Methods: We propose a trainable filter which we call Combination Of Shifted FIlter REsponses (COSFIRE) and use for keypoint detection and pattern recognition. It is automatically configured to be selective for a local contour pattern specified by an example. The configuration comprises selecting given channels of a bank of Gabor filters and determining certain blur and shift parameters. A COSFIRE filter response is computed as the weighted geometric mean of the blurred and shifted responses of the selected Gabor filters. It shares similar properties with some shape-selective neurons in visual cortex, which provided inspiration for this work. Results: We demonstrate the effectiveness of the proposed filters in three applications: the detection of retinal vascular bifurcations (DRIVE dataset: 98.50 percent recall, 96.09 percent precision), the recognition of handwritten digits (MNIST dataset: 99.48 percent correct classification), and the detection and recognition of traffic signs in complex scenes (100 percent recall and precision). Conclusions: The proposed COSFIRE filters are conceptually simple and easy to implement. They are versatile keypoint detectors and are highly effective in practical computer vision applications.
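The central operation described above, a COSFIRE response computed as the weighted geometric mean of blurred and shifted Gabor responses, can be sketched as follows. This is an illustrative reading of the abstract rather than the authors' code: the Gabor bank, the configuration stage that produces the (lambda, theta, rho, phi) tuples, and the exact weighting scheme are all assumptions here.

```python
# Sketch of combining Gabor responses into a COSFIRE response:
# weighted geometric mean of blurred, shifted response maps.
import numpy as np
from scipy.ndimage import gaussian_filter, shift

def cosfire_response(gabor_responses, tuples, sigma0=0.5, alpha=0.1):
    """gabor_responses: dict (lam, theta) -> 2-D response map.
    tuples: (lam, theta, rho, phi) from the configuration stage."""
    product, weight_sum = None, 0.0
    for lam, theta, rho, phi in tuples:
        r = gabor_responses[(lam, theta)]
        blurred = gaussian_filter(r, sigma=sigma0 + alpha * rho)  # blur grows with rho
        dy, dx = -rho * np.sin(phi), -rho * np.cos(phi)           # shift toward center
        shifted = shift(blurred, (dy, dx), order=1)
        w = np.exp(-rho**2 / (2 * (2 * sigma0)**2))               # assumed weighting
        term = np.maximum(shifted, 1e-12) ** w                    # avoid 0 ** w issues
        product = term if product is None else product * term
        weight_sum += w
    return product ** (1.0 / weight_sum)   # weighted geometric mean
```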

119 citations


Proceedings ArticleDOI
25 Aug 2013
TL;DR: This work presents the results of applying RNNs to printed Urdu text in Nastaleeq script, evaluating BLSTM networks for two cases: one ignoring the characters' shape variations and one considering them.
Abstract: Recurrent neural networks (RNN) have been successfully applied to the recognition of cursive handwritten documents in both English and Arabic scripts. The ability of RNNs to model context in sequence data like speech and text makes them suitable candidates for developing OCR systems for printed Nabataean scripts (including Nastaleeq, for which no OCR system is available to date). In this work, we present the results of applying RNNs to printed Urdu text in Nastaleeq script. A Bidirectional Long Short-Term Memory (BLSTM) architecture with a Connectionist Temporal Classification (CTC) output layer was employed to recognize printed Urdu text. We evaluated BLSTM networks for two cases: one ignoring the character's shape variations and the second considering them. The recognition error rate at the character level for the first case is 5.15% and for the second 13.6%. These results were obtained on the synthetically generated UPTI dataset, which contains artificially degraded images reflecting some real-world scanning artifacts along with clean images. A comparison with a shape-matching-based method is also presented.
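The CTC output layer mentioned above turns per-frame BLSTM activations into a label string. Below is a minimal sketch of standard best-path (greedy) CTC decoding; the alphabet argument is a placeholder standing in for the Urdu character or shape classes.

```python
# CTC best-path decoding: argmax label per frame, collapse repeats, drop blanks.
import numpy as np

def ctc_greedy_decode(logits: np.ndarray, alphabet: str, blank: int = 0) -> str:
    """logits: (T, C) per-frame scores; class index `blank` is the CTC blank."""
    best = logits.argmax(axis=1)
    out, prev = [], blank
    for label in best:
        if label != blank and label != prev:
            out.append(alphabet[label - 1])   # alphabet excludes the blank at 0
        prev = label
    return "".join(out)
```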

112 citations


Proceedings ArticleDOI
04 Feb 2013
TL;DR: A generic Optical Character Recognition system for Arabic script languages called Nabocr is presented; it is initially trained to recognize both Urdu Nastaleeq and Arabic Naskh fonts, but it can be trained by users for other Arabic script languages.
Abstract: In this paper, we present a generic Optical Character Recognition system for Arabic script languages called Nabocr. Nabocr uses OCR approaches specific to Arabic script recognition. Performing recognition on Arabic script text is relatively more difficult than on Latin text due to the nature of Arabic script, which is cursive and context sensitive. Moreover, Arabic script has different writing styles that vary in complexity. Nabocr is initially trained to recognize both Urdu Nastaleeq and Arabic Naskh fonts. However, it can be trained by users for other Arabic script languages. We have evaluated our system's performance for both Urdu and Arabic. In order to evaluate Urdu recognition, we have generated a dataset of Urdu text called UPTI (Urdu Printed Text Image Database), which measures different aspects of a recognition system. The performance of our system on clean Urdu text is 91%. On clean Arabic text, the performance is 86%. Moreover, we have compared the performance of our system against Tesseract's newly released Arabic recognition; the performance of both systems on clean images is almost the same.

101 citations


Journal ArticleDOI
01 Apr 2013
TL;DR: A proof-of-concept computer-vision-based wayfinding aid that incorporates object detection with text recognition, enabling blind people to independently access unfamiliar indoor environments and find different rooms and building amenities.
Abstract: Independent travel is a well-known challenge for blind and visually impaired persons. In this paper, we propose a proof-of-concept computer vision-based wayfinding aid for blind people to independently access unfamiliar indoor environments. In order to find different rooms (e.g. an office, a lab, or a bathroom) and other building amenities (e.g. an exit or an elevator), we incorporate object detection with text recognition. First, we develop a robust and efficient algorithm to detect doors, elevators, and cabinets based on their general geometric shape, by combining edges and corners. The algorithm is general enough to handle large intra-class variations of objects with different appearances among different indoor environments, as well as small inter-class differences between different objects such as doors and door-like cabinets. Next, in order to distinguish intra-class objects (e.g. an office door from a bathroom door), we extract and recognize text information associated with the detected objects. For text recognition, we first extract text regions from signs with multiple colors and possibly complex backgrounds, and then apply character localization and topological analysis to filter out background interference. The extracted text is recognized using off-the-shelf optical character recognition (OCR) software products. The object type, orientation, location, and text information are presented to the blind traveler as speech.
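The final recognition step hands each extracted sign region to off-the-shelf OCR. Below is a minimal sketch of that step using pytesseract as one possible engine; the paper does not name a specific product, and the bounding box and the text-to-speech call are placeholders.

```python
# Crop a detected text region and run off-the-shelf OCR on it.
from PIL import Image
import pytesseract

def read_sign(image_path: str, bbox: tuple) -> str:
    """bbox = (left, top, right, bottom) of a detected text region."""
    region = Image.open(image_path).crop(bbox)
    # --psm 7: treat the crop as a single text line (door label, room number)
    return pytesseract.image_to_string(region, config="--psm 7").strip()

# e.g. announce_via_tts(read_sign("hallway.jpg", (120, 40, 360, 90)))
# (announce_via_tts is a hypothetical text-to-speech hook)
```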

95 citations


Journal ArticleDOI
TL;DR: Describes the characteristics of Arabic script, gives an overview of OCR systems, and comprehensively reviews off-line printed Arabic character segmentation techniques.
Abstract: Arabic character segmentation is a necessary step in Arabic Optical Character Recognition (OCR). The cursive nature of Arabic script poses challenging problems for Arabic character recognition; in particular, incorrectly segmented characters cause misclassifications, which in turn lead to wrong recognition results. Off-line Arabic character segmentation is thus a difficult research problem, and relatively little progress has been made in this area in the past few decades, owing both to the cursive nature of Arabic writing in its printed and handwritten forms and to the scarcity of Arabic databases and dictionaries. Most of the methods used for Arabic character recognition are adapted from methods developed for handwritten Latin and Chinese characters; other methods, however, have been developed specifically for Arabic character segmentation. This survey describes the characteristics of Arabic script, gives an overview of OCR systems, and provides a comprehensive review focused on off-line printed Arabic character segmentation techniques.

87 citations


01 Jan 2013
TL;DR: A literature review of English OCR techniques, covering new methodologies developed to overcome the complexity of English writing styles and the need for a system that can handle all classes of English text.
Abstract: This paper presents a literature review of English OCR techniques. An English OCR system is essential for converting the numerous published English books into editable computer text files. Recent research in this area has developed new methodologies to overcome the complexity of English writing styles. Still, these algorithms have not been tested on the complete English alphabet. Hence, a system is required that can handle all classes of English text and identify characters among these classes.

Journal ArticleDOI
TL;DR: The main objective of this paper is to present a study of various existing binarization algorithms and compare their measurements, acting as a guide for newcomers starting work on binarization.
Abstract: Image binarization is an important step in OCR (Optical Character Recognition). Several methods have been used for image binarization recently, but there is no single best method that works for all images. The main objective of this paper is to present a study of various existing binarization algorithms and compare their measurements. This paper will act as a guide for newcomers starting their work on binarization.
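For concreteness, one classic global algorithm of the kind such surveys compare is Otsu's method, which picks the threshold maximizing the between-class variance of the gray-level histogram. Here is a self-contained sketch; locally adaptive methods such as Sauvola's instead compute a threshold per neighborhood.

```python
# Otsu's global thresholding from the gray-level histogram.
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """gray: 2-D uint8 image. Returns the threshold maximizing between-class variance."""
    prob = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob /= prob.sum()
    omega = np.cumsum(prob)                    # class-0 probability up to each level
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan                 # ignore degenerate splits
    sigma_b2 = (mu[-1] * omega - mu) ** 2 / denom
    return int(np.nanargmax(sigma_b2))

# binary = (gray > otsu_threshold(gray)).astype(np.uint8) * 255
```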

Journal ArticleDOI
TL;DR: Very promising results are achieved when binarization features and a multilayer feed-forward neural network classifier are used to recognize off-line cursive handwritten characters.

Book ChapterDOI
23 Aug 2013
TL;DR: A dataset of camera-captured document images containing varying levels of focal blur introduced manually during capture is presented, along with a case study of three recent methods for predicting the OCR quality of images on this dataset.
Abstract: With the proliferation of cameras on mobile devices, there is an increased desire to image document pages as an alternative to scanning. However, the quality of captured document images is often lower than that of their scanned equivalents due to hardware limitations and stability issues. In this context, automatic assessment of the quality of captured images is useful for many applications. Although there has been a lot of work on developing computational methods and creating standard datasets for natural scene image quality assessment, until recently quality estimation of camera-captured document images has not been given much attention. One traditional quality indicator for document images is the Optical Character Recognition (OCR) accuracy. In this work, we present a dataset of camera-captured document images containing varying levels of focal blur introduced manually during capture. For each image we obtained the character-level OCR accuracy. Our dataset can be used to evaluate methods for predicting the OCR quality of captured documents as well as enhancements. In order to make the dataset publicly and freely available, originals from two existing datasets, the University of Washington dataset and the Tobacco Database, were selected. We present a case study with three recent methods for predicting the OCR quality of images on our dataset.
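Character-level OCR accuracy of the kind reported for this dataset is commonly defined as one minus the Levenshtein distance between the OCR output and the ground truth, normalized by the ground-truth length. Below is a minimal sketch of that measure; the dataset's own scoring script may differ in detail.

```python
# Character accuracy = 1 - edit_distance(ocr, truth) / len(truth).
def char_accuracy(ocr: str, truth: str) -> float:
    m, n = len(ocr), len(truth)
    d = list(range(n + 1))                 # DP row for the empty OCR prefix
    for i in range(1, m + 1):
        prev, d[0] = d[0], i
        for j in range(1, n + 1):
            cur = min(d[j] + 1,            # delete from OCR output
                      d[j - 1] + 1,        # insert missing character
                      prev + (ocr[i - 1] != truth[j - 1]))  # substitute
            prev, d[j] = d[j], cur
    return 1.0 - d[n] / max(n, 1)
```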

Journal ArticleDOI
TL;DR: A new vertical segmentation algorithm is proposed in which the segmentation points are located after thinning the word image to obtain a single-pixel stroke width; high segmentation accuracy is achieved.

Proceedings ArticleDOI
25 Aug 2013
TL;DR: Experiments show that the Co-HOG based technique clearly outperforms state-of-the-art techniques that use HOG, Scale Invariant Feature Transform (SIFT), and Maximally Stable Extremal Regions (MSER).
Abstract: Scene text recognition is a fundamental step in end-to-end applications where traditional optical character recognition (OCR) systems often fail to produce satisfactory results. This paper proposes a technique that uses a co-occurrence histogram of oriented gradients (Co-HOG) to recognize text in scenes. Compared with the histogram of oriented gradients (HOG), Co-HOG is a more powerful tool that captures the spatial distribution of neighboring orientation pairs instead of just a single gradient orientation. At the same time, it is more efficient than HOG and therefore more suitable for real-time applications. The proposed scene text recognition technique is evaluated on the ICDAR2003 character dataset and the Street View Text (SVT) dataset. Experiments show that the Co-HOG based technique clearly outperforms state-of-the-art techniques that use HOG, Scale Invariant Feature Transform (SIFT), and Maximally Stable Extremal Regions (MSER).
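The difference from plain HOG can be made concrete: Co-HOG histograms co-occurring orientation pairs at fixed pixel offsets rather than single orientations. Below is a minimal sketch under assumed parameters (8 orientation bins, three offsets); the paper's block layout and normalization are omitted.

```python
# Co-occurrence histogram of quantized gradient orientation pairs.
import numpy as np

def cohog_features(gray: np.ndarray, bins: int = 8,
                   offsets=((0, 1), (1, 0), (1, 1))) -> np.ndarray:
    gy, gx = np.gradient(gray.astype(float))
    ori = np.floor((np.arctan2(gy, gx) + np.pi) / (2 * np.pi) * bins)
    ori = np.clip(ori, 0, bins - 1).astype(int)
    feats = []
    for dy, dx in offsets:
        a = ori[:ori.shape[0] - dy, :ori.shape[1] - dx]   # reference pixels
        b = ori[dy:, dx:]                                 # offset neighbors
        pair = a * bins + b                               # joint orientation index
        feats.append(np.bincount(pair.ravel(), minlength=bins * bins))
    return np.concatenate(feats).astype(float)
```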

Patent
06 Mar 2013
TL;DR: A processing system uses optical character recognition (OCR) to provide augmented reality (AR): it determines, based on video of a scene, whether the scene includes a predetermined AR target and, if so, retrieves an OCR zone definition associated with that target.
Abstract: A processing system uses optical character recognition (OCR) to provide augmented reality (AR). The processing system automatically determines, based on video of a scene, whether the scene includes a predetermined AR target. In response to determining that the scene includes the AR target, the processing system automatically retrieves an OCR zone definition associated with the AR target. The OCR zone definition identifies an OCR zone. The processing system automatically uses OCR to extract text from the OCR zone. The processing system uses results of the OCR to obtain AR content which corresponds to the text from the OCR zone. The processing system automatically causes that AR content to be presented in conjunction with the scene. Other embodiments are described and claimed.

Proceedings ArticleDOI
04 Feb 2013
TL;DR: This novel approach combines the OCR outputs from multiple thresholded images by aligning the text output and producing a lattice of word alternatives from which a lattice word error rate (LWER) is calculated.
Abstract: For noisy, historical documents, a high optical character recognition (OCR) word error rate (WER) can render the OCR text unusable. Since image binarization is often the method used to identify foreground pixels, a body of research seeks to improve image-wide binarization directly. Instead of relying on any one imperfect binarization technique, our method incorporates information from multiple simple thresholding binarizations of the same image to improve text output. Using a new corpus of 19th century newspaper grayscale images for which the text transcription is known, we observe WERs of 13.8% and higher using current binarization techniques and a state-of-the-art OCR engine. Our novel approach combines the OCR outputs from multiple thresholded images by aligning the text output and producing a lattice of word alternatives from which a lattice word error rate (LWER) is calculated. Our results show an LWER of 7.6% when aligning two threshold images and an LWER of 6.8% when aligning five. From the word lattice we commit to one hypothesis by applying the methods of Lund et al. (2011), achieving an improvement over the original OCR output and an 8.41% WER result on this data set.
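The front end of this approach, running OCR over several simple global thresholdings of the same grayscale scan, can be sketched as below. The lattice alignment and the Lund et al. hypothesis selection are omitted, and pytesseract merely stands in for the OCR engine used in the paper.

```python
# OCR the same grayscale page at several global thresholds and
# collect the alternative outputs for later alignment into a lattice.
import numpy as np
from PIL import Image
import pytesseract

def ocr_thresholds(gray: np.ndarray, thresholds=(96, 128, 160, 192, 224)):
    outputs = []
    for t in thresholds:
        binary = ((gray > t) * 255).astype(np.uint8)
        text = pytesseract.image_to_string(Image.fromarray(binary))
        outputs.append((t, text))
    return outputs   # word alternatives to be aligned into a lattice
```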

Journal ArticleDOI
TL;DR: Three feature extraction techniques are used with an Artificial Neural Network (ANN) to improve the recognition rate of an OCR system for printed Hindi text in Devanagari script.
Abstract: Hindi is the most widely spoken language in India, with more than 300 million speakers. As there is no separation between the characters of texts written in Hindi as there is in English, the Optical Character Recognition (OCR) systems developed for the Hindi language achieve a very poor recognition rate. In this paper we propose an OCR for printed Hindi text in Devanagari script, using an Artificial Neural Network (ANN), which improves the recognition rate. One of the major reasons for the poor recognition rate is error in character segmentation. The presence of touching characters in scanned documents further complicates the segmentation process, creating a major problem when designing an effective character segmentation technique. Preprocessing, character segmentation, feature extraction, and finally classification and recognition are the major steps followed by a general OCR. The preprocessing tasks considered in the paper are conversion of grayscale images to binary images, image rectification, and segmentation of the document's textual contents into paragraphs, lines, words, and then basic symbols. The basic symbols, obtained as the fundamental units of the segmentation process, are recognized by the neural classifier. In this work, three feature extraction techniques (histogram of projection based on mean distance, histogram of projection based on pixel value, and vertical zero crossing) have been used to improve the rate of recognition. These feature extraction techniques are powerful enough to extract features of even distorted characters/symbols. For the neural classifier, a back-propagation neural network with two hidden layers is used. The classifier is trained and tested on printed Hindi texts. A correct recognition rate of approximately 90% is achieved.
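Two of the feature families named above can be sketched for a binary symbol image (ink = 1): projection histograms and vertical zero crossings, i.e. ink/background transitions per column. The paper's exact variants (mean-distance versus pixel-value projections) are paraphrased rather than reproduced.

```python
# Projection-histogram and zero-crossing features for one binary symbol.
import numpy as np

def symbol_features(sym: np.ndarray) -> np.ndarray:
    sym = sym.astype(int)                                    # avoid uint8 wraparound
    h_proj = sym.sum(axis=1) / max(sym.shape[1], 1)          # row ink density
    v_proj = sym.sum(axis=0) / max(sym.shape[0], 1)          # column ink density
    crossings = np.abs(np.diff(sym, axis=0)).sum(axis=0)     # transitions per column
    return np.concatenate([h_proj, v_proj, crossings]).astype(float)
```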

Proceedings ArticleDOI
25 Aug 2013
TL;DR: This paper proposes a new algorithm for printed script identification based on texture analysis, using the histogram of local patterns as a description of the distribution of script stroke directions, which is characteristic of every script.
Abstract: Script identification is an important step in multi-script document analysis. As the different textures present in the text portion of a script are its main distinguishing features, in this paper we propose a new algorithm for printed script identification based on texture analysis. Since local patterns are a unifying concept for traditional statistical and structural approaches to texture analysis, the basic idea here is to use the histogram of local patterns as a description of the distribution of script stroke directions, which is characteristic of every script. As local patterns, the basic version of Local Binary Patterns (LBP) and a modified version, the Orientation of the Local Binary Patterns (OLBP), are proposed. A Least Squares Support Vector Machine (LS-SVM) is used as the identifier. The scheme has been verified on two databases. The first (training) database contains 200 sheets of 10 different scripts, with fonts provided by Google Translate. The second (test) database was obtained by scanning different newspapers and books; it contains 5 of the 10 scripts in the first database. The experiments yielded encouraging results.
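Below is a minimal sketch of the basic 8-neighbor LBP histogram used here as the texture descriptor; the OLBP variant and the LS-SVM identifier from the paper are not reproduced.

```python
# Basic 8-neighbor Local Binary Pattern histogram of a grayscale image.
import numpy as np

def lbp_histogram(gray: np.ndarray) -> np.ndarray:
    g = gray.astype(int)
    center = g[1:-1, 1:-1]
    code = np.zeros_like(center)
    # Clockwise neighbor offsets, one bit per neighbor.
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        neighbor = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= (neighbor >= center).astype(int) << bit
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```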

Proceedings ArticleDOI
01 Aug 2013
TL;DR: Proposes a combination of image processing techniques and OCR to obtain accurate vehicle plate recognition for vehicles in Malaysia, together with a Graphical User Interface that helps users recognize the characters and numbers on vehicle or license plates.
Abstract: This paper presents the development of an automatic vehicle plate detection system using image processing techniques, commonly known as Automatic Number Plate Recognition (ANPR). Automatic vehicle plate detection is widely used in safety and security systems, especially in car parking areas. Beyond the safety aspect, such systems are applied to monitor road traffic, such as vehicle speed and identification of a vehicle's owner, and to assist the authorities in identifying stolen vehicles, both cars and motorcycles. Optical Character Recognition (OCR) is the prominent technique employed by researchers to analyse vehicle plate images, but on its own it may fail to convert the text accurately. Moreover, the characters, background, and size of vehicle plates vary from country to country. Hence, this project proposes a combination of image processing techniques and OCR to obtain accurate vehicle plate recognition for vehicles in Malaysia. The outcome of this study is a system capable of accurately detecting the characters and numbers of vehicle plates on different backgrounds (black and white). This study also involves the development of a Graphical User Interface (GUI) to help users recognize the characters and numbers on vehicle or license plates.

01 Jan 2013
TL;DR: Reviews the noise that may appear in scanned document images, which reduces the accuracy of subsequent OCR (Optical Character Recognition) tasks, and discusses some noise removal methods.
Abstract: Document images may be contaminated with noise during transmission, scanning, or conversion to digital form. We can categorize noises by identifying their features and can search for similar patterns in a document image to choose appropriate methods for their removal. After a brief introduction, this paper reviews the noises that might appear in scanned document images and discusses some noise removal methods. Nowadays, with the increase in computer use in everybody's lives, the ability to convert documents to digital and readable formats has become a necessity. Scanning documents is a way of changing printed documents into digital format. A common problem encountered when scanning documents is noise, which can occur in an image because of paper quality or the typing machine used, or can be created by scanners during the scanning process. Noise removal is one of the steps in preprocessing. Among other things, noise reduces the accuracy of subsequent tasks of OCR (Optical Character Recognition) systems. It can appear in the foreground or background of an image and can be generated before or after scanning. Examples of noise in scanned document images are as follows. The page rule line is a source of noise which interferes with text objects. Marginal noise usually appears as a large dark region around the document image and can be textual or non-textual. Some forms of clutter noise appear in an image because of document skew while scanning or from holes punched in the document. Background noise includes uneven contrast, show-through effects, interfering strokes, and background spots. Each of these types is then discussed in detail.
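As one concrete example of the removal methods discussed, the salt-and-pepper-like background spots mentioned above are commonly suppressed with a small median filter applied before binarization. A minimal sketch of this single, standard technique:

```python
# Median filtering: a standard remedy for isolated speckle noise.
import numpy as np
from scipy.ndimage import median_filter

def despeckle(gray: np.ndarray, size: int = 3) -> np.ndarray:
    """Replace each pixel with the median of its size x size neighborhood."""
    return median_filter(gray, size=size)
```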

Patent
15 Mar 2013
TL;DR: An electronic device and method capture multiple images of a real-world scene containing text of one or more sizes at several zoom levels, extract one or more text regions from each image, and analyze an attribute relevant to OCR.
Abstract: An electronic device and method capture multiple images of a real-world scene at several zoom levels, the scene containing text of one or more sizes. The electronic device and method then extract one or more text regions from each of the multiple images, followed by analyzing an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of the multiple images. When the attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region, that version of the first text region is provided as input to OCR.

Journal ArticleDOI
TL;DR: This work introduces a method that automatically computes a two-channel profile from an OCRed historical text and shows a strong correlation between the true distribution of spelling variation patterns and recognition errors in the OCRed text and the ranks and scores automatically estimated in the profiles.

Proceedings ArticleDOI
01 Dec 2013
TL;DR: This paper proposes a novel technique for carrying out simultaneous word and character segmentation in run-length compressed printed text documents by popping out column runs from each row in an intelligent sequence.
Abstract: Segmentation of a text document into lines, words, and characters, considered the crucial preprocessing stage in Optical Character Recognition (OCR), is traditionally carried out on uncompressed documents, although most real-life documents are available in compressed form for reasons such as transmission and storage efficiency. This implies that the compressed image must first be decompressed, which demands additional computing resources. This limitation has motivated us to take up research in document image analysis on compressed documents. In this paper, we propose a new way to carry out segmentation at the line, word, and character level directly in run-length compressed printed text documents. We extract the horizontal projection profile curve from the compressed file and perform line segmentation using its local minima points. However, tracing the vertical information that leads to tracking words and characters in a run-length compressed file is not straightforward. Therefore, we propose a novel technique for carrying out simultaneous word and character segmentation by popping out column runs from each row in an intelligent sequence. The proposed algorithms have been validated on 1101 text lines, 1409 words, and 7582 characters from a dataset of 35 noise- and skew-free compressed documents in Bengali, Kannada, and English scripts.
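The line-segmentation step can be made concrete: the horizontal projection profile is computed directly from the run-length data, without decompression, by summing the black runs of each row. Below is a minimal sketch assuming each row is encoded as alternating white/black run lengths starting with a white run; the column-run popping used for word and character segmentation is not shown.

```python
# Horizontal projection profile straight from run-length encoded rows.
import numpy as np

def projection_from_runs(rows: list) -> np.ndarray:
    """rows: one run-length list per image row, starting with a white run."""
    profile = np.zeros(len(rows))
    for i, runs in enumerate(rows):
        profile[i] = sum(runs[1::2])     # black runs sit at odd positions
    return profile

# Text-line boundaries then fall at local minima (near-zero stretches) of the profile.
```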

Patent
05 Sep 2013
TL;DR: A weighted finite state transducer is formed for each interpretation, with weights based on the predicted probability that each interpretation is accurate; the weighted transducers are combined into a document model that encodes the defined mathematical relationship.
Abstract: Optical character recognition systems and methods including the steps of: capturing an image of a document including a set of numbers having a defined mathematical relationship; analyzing the image to determine line segments; analyzing each line segment to determine one or more character segments; analyzing each character segment to determine possible interpretations, each interpretation having an associated predicted probability of being accurate; forming a weighted finite state transducer for each interpretation, wherein the weights are based on the predicted probabilities; combining the weighted finite state transducer for each interpretation into a document model weighted finite state transducer that encodes the defined mathematical relationship; searching the document model weighted finite state transducer for the lowest weight path, which is an interpretation of the document that is most likely to accurately represent the document; and outputting an optical character recognition version of the captured image.
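A toy stand-in for the search described above: each digit cell carries alternative readings with probabilities, and the chosen interpretation is the jointly most probable one that also satisfies the arithmetic constraint. The patent encodes this as a weighted finite state transducer and searches for the lowest-weight path; plain enumeration over a small "a + b = c" example suffices to illustrate the idea.

```python
# Pick the most probable joint reading of "a + b = c" that satisfies a + b == c.
from itertools import product
from math import log

def best_consistent(a_cells, b_cells, c_cells):
    """Each *_cells: list of per-digit alternatives [(digit_char, prob), ...]."""
    def readings(cells):
        for combo in product(*cells):
            value = int("".join(d for d, _ in combo))
            weight = -sum(log(p) for _, p in combo)   # lower weight = more probable
            yield value, weight
    best = None
    for a, wa in readings(a_cells):
        for b, wb in readings(b_cells):
            for c, wc in readings(c_cells):
                if a + b == c and (best is None or wa + wb + wc < best[1]):
                    best = ((a, b, c), wa + wb + wc)
    return best

# A "7 vs 1" confusion in the total is resolved by the arithmetic:
# best_consistent([[("4", .9)]], [[("3", .9)]], [[("7", .6), ("1", .4)]])
```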

Patent
21 Feb 2013
TL;DR: In this article, a system and method of electronically identifying a license plate and comparing the results to a predetermined database is presented, which runs on standard PC hardware and can be linked to other applications or databases.
Abstract: Provided is a system and method of electronically identifying a license plate and comparing the results to a predetermined database. The software aspect of the system runs on standard PC hardware and can be linked to other applications or databases. It first uses a series of image manipulation techniques to detect, normalize and enhance the image of the number plate. Optical character recognition (OCR) is used to extract the alpha-numeric characters of the license plate. The recognized characters are then compared to databases containing information about the vehicle and/or owner.

Patent
06 Jun 2013
TL;DR: In this article, text is extracted from a source image of a publication using an Optical Character Recognition (OCR) process and a document is generated containing text segments of the extracted text.
Abstract: Text is extracted from a source image of a publication using an Optical Character Recognition (OCR) process. A document is generated containing text segments of the extracted text. The document includes a control module that responds to user interactions with the displayed document. Responsive to a user selection of a displayed text segment, a corresponding image segment from the source image containing the text is retrieved and rendered in place of the selected text segment. The user can select again to toggle the display back to the text segment. Each text segment can be tagged with a garbage score indicating its quality. If the garbage score of a text segment exceeds a threshold value, the corresponding image segment can be automatically displayed instead.

Journal ArticleDOI
TL;DR: An artificial neural network-based OCR algorithm for ANPR applications and its efficient architecture are presented; the proposed architecture meets the real-time requirement of an ANPR system and can process a character image in 0.7 ms with a 97.3% successful character recognition rate.
Abstract: The last main stage in an automatic number plate recognition (ANPR) system is optical character recognition (OCR), where the number plate characters in the number plate image are converted into encoded text. In this study, an artificial neural network-based OCR algorithm for ANPR applications and its efficient architecture are presented. The proposed architecture has been successfully implemented and tested using the Mentor Graphics RC240 field programmable gate array (FPGA) development board equipped with a 4M-gate Xilinx Virtex-4 LX40. A database of 3570 UK binary character images was used for testing the performance of the proposed architecture. The results show that the proposed architecture can meet the real-time requirement of an ANPR system and can process a character image in 0.7 ms with a 97.3% successful character recognition rate, while consuming only 23% of the available area of the FPGA used.

Proceedings ArticleDOI
Yan Liu, Xiaoqing Lu, Yeyang Qin, Zhi Tang, Jianbo Xu
04 Feb 2013
TL;DR: This paper reviews the development of chart recognition techniques over the past decades and presents the focus of current research, which mainly includes three parts: chart segmentation, chart classification, and chart interpretation.
Abstract: As an effective way of transmitting information, charts are widely used to represent scientific and statistical data in books, research papers, newspapers, etc. Though textual information is still the major source of data, there has been an increasing trend of introducing graphs, pictures, and figures into the information pool. Text recognition for documents is accomplished using optical character recognition (OCR) software. Chart recognition, a necessary supplement to OCR for document images, remains an unsolved problem due to the great subjectivity and variety of chart styles. This paper reviews the development of chart recognition techniques over the past decades and presents the focus of current research. The whole process of chart recognition is presented systematically, comprising three main parts: chart segmentation, chart classification, and chart interpretation. For each part, the latest research work is introduced. Finally, the paper concludes with a summary and promising future research directions.