
Showing papers on "Optical character recognition published in 2002"


Journal ArticleDOI
Rainer Lienhart, A. Wernicke
TL;DR: This work proposes a novel method for localizing and segmenting text in complex images and videos that is not only able to locate and segment text occurrences into large binary images, but is also able to track each text line with sub-pixel accuracy over the entire occurrence in a video.
Abstract: Many images, especially those used for page design on Web pages, as well as videos contain visible text. If these text occurrences could be detected, segmented, and recognized automatically, they would be a valuable source of high-level semantics for indexing and retrieval. We propose a novel method for localizing and segmenting text in complex images and videos. Text lines are identified by using a complex-valued multilayer feed-forward network trained to detect text at a fixed scale and position. The network's output at all scales and positions is integrated into a single text-saliency map, serving as a starting point for candidate text lines. In the case of video, these candidate text lines are refined by exploiting the temporal redundancy of text in video. Localized text lines are then scaled to a fixed height of 100 pixels and segmented into a binary image with black characters on white background. For videos, temporal redundancy is exploited to improve segmentation performance. Input images and videos can be of any size due to a true multiresolution approach. Moreover, the system is not only able to locate and segment text occurrences into large binary images, but is also able to track each text line with sub-pixel accuracy over the entire occurrence in a video, so that one text bitmap is created for all instances of that text line. Therefore, our text segmentation results can also be used for object-based video encoding such as that enabled by MPEG-4.

478 citations


Proceedings ArticleDOI
10 Dec 2002
TL;DR: An algorithm to localize artificial text in images and videos using a measure of accumulated gradients and morphological post processing to detect the text is presented and the quality of the localized text is improved by robust multiple frame integration.
Abstract: The systems currently available for content based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge in the indexing process is to use the text included in the images and video sequences. It is rich in information yet easy to use, e.g. by keyword based queries. In this paper we present an algorithm to localize artificial text in images and videos using a measure of accumulated gradients and morphological post processing to detect the text. The quality of the localized text is improved by robust multiple frame integration. A new technique for the binarization of the text boxes is proposed. Finally, detection and OCR results for a commercial OCR are presented.
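The accumulated-gradient measure at the heart of this detector can be sketched roughly as follows. This is a numpy-only illustration of the general idea, not the paper's code; the window size and the mean-plus-std threshold heuristic are our own assumptions.

```python
import numpy as np

def accumulated_gradient_map(gray, win=16):
    """Sum absolute horizontal gradients over a sliding window.

    Text regions tend to contain dense vertical strokes, so the
    horizontal gradient accumulated over a small window is high there.
    """
    gx = np.abs(np.diff(gray.astype(float), axis=1))
    gx = np.pad(gx, ((0, 0), (0, 1)))  # restore original width
    # Integral image lets us sum any window in O(1).
    ii = np.cumsum(np.cumsum(gx, axis=0), axis=1)
    ii = np.pad(ii, ((1, 0), (1, 0)))
    h, w = gx.shape
    acc = np.zeros_like(gx)
    for y in range(h):
        for x in range(w):
            y0, x0 = max(0, y - win), max(0, x - win)
            y1, x1 = min(h, y + win + 1), min(w, x + win + 1)
            acc[y, x] = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
    return acc

def text_mask(gray, win=16, thresh=None):
    """Binary mask of likely text pixels (assumed thresholding rule)."""
    acc = accumulated_gradient_map(gray, win)
    if thresh is None:
        thresh = acc.mean() + acc.std()  # illustrative heuristic
    return acc > thresh
```

On a synthetic frame with a high-frequency stripe (standing in for text) on a flat background, the mask fires inside the stripe and stays off elsewhere.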

262 citations


Journal ArticleDOI
TL;DR: A modular system to recognize handwritten numerical strings using a segmentation-based recognition approach and a recognition and verification strategy that combines the outputs from different levels such as segmentation, recognition, and postprocessing in a probabilistic model is proposed.
Abstract: A modular system to recognize handwritten numerical strings is proposed. It uses a segmentation-based recognition approach and a recognition and verification strategy. The approach combines the outputs from different levels such as segmentation, recognition, and postprocessing in a probabilistic model. A new verification scheme which contains two verifiers to deal with the problems of oversegmentation and undersegmentation is presented. A new feature set is also introduced to feed the oversegmentation verifier. A postprocessor based on a deterministic automaton is used and the global decision module makes an accept/reject decision. Finally, experimental results on two databases are presented: numerical amounts on Brazilian bank checks and NIST SD19. The latter aims at validating the concept of modular system and showing the robustness of the system using a well-known database.

228 citations


Journal ArticleDOI
TL;DR: This review is organised into five major sections, covering a general overview, Arabic writing characteristics, Arabic text recognition system, Arabic OCR software and conclusions.
Abstract: Off-line recognition requires transferring the text under consideration into an image file. This represents the only available solution to bring the printed materials to the electronic media. However, the transferring process causes the system to lose the temporal information of that text. Other complexities that an off-line recognition system has to deal with are the lower resolution of the document and the poor binarisation, which can impair readability when essential features of the characters are deleted or obscured. Recognising Arabic script presents two additional challenges: orthography is cursive and letter shape is context sensitive. Certain character combinations form new ligature shapes, which are often font-dependent. Some ligatures involve vertical stacking of characters. Since not all letters connect, word boundary location becomes an interesting problem, as spacing may separate not only words, but also certain characters within a word. Various techniques have been implemented to achieve high recognition rates. These techniques have tackled different aspects of the recognition system. This review is organised into five major sections, covering a general overview, Arabic writing characteristics, Arabic text recognition system, Arabic OCR software and conclusions.

207 citations


Journal ArticleDOI
TL;DR: A new analytic scheme, which uses a sequence of image segmentation and recognition algorithms, is proposed for the off-line cursive handwriting recognition problem and indicates higher recognition rates compared to the available methods reported in the literature.
Abstract: A new analytic scheme, which uses a sequence of image segmentation and recognition algorithms, is proposed for the off-line cursive handwriting recognition problem. First, some global parameters, such as slant angle, baselines, stroke width and height, are estimated. Second, a segmentation method finds character segmentation paths by combining gray-scale and binary information. Third, a hidden Markov model (HMM) is employed for shape recognition to label and rank the character candidates. For this purpose, a string of codes is extracted from each segment to represent the character candidates. The estimation of feature space parameters is embedded in the HMM training stage together with the estimation of the HMM model parameters. Finally, information from a lexicon and from the HMM ranks is combined in a graph optimization problem for word-level recognition. This method corrects most of the errors produced by the segmentation and HMM ranking stages by maximizing an information measure in an efficient graph search algorithm. The experiments indicate higher recognition rates compared to the available methods reported in the literature.

184 citations


Journal ArticleDOI
TL;DR: A handwritten character string recognition system for Japanese mail address reading with a very large vocabulary; address phrases are recognized as a whole because there is no extra space between words, and a beam search strategy controls lexicon matching to achieve real-time recognition.
Abstract: This paper describes a handwritten character string recognition system for Japanese mail address reading on a very large vocabulary. The address phrases are recognized as a whole because there is no extra space between words. The lexicon contains 111,349 address phrases, which are stored in a trie structure. In recognition, the text line image is matched with the lexicon entries (phrases) to obtain reliable segmentation and retrieve valid address phrases. The paper first introduces some effective techniques for text line image preprocessing and presegmentation. In presegmentation, the text line image is separated into primitive segments by connected component analysis and touching pattern splitting based on contour shape analysis. In lexicon matching, consecutive segments are dynamically combined into candidate character patterns. An accurate character classifier is embedded in lexicon matching to select characters matched with a candidate pattern from a dynamic category set. A beam search strategy is used to control the lexicon matching so as to achieve real-time recognition. In experiments on 3,589 live mail images, the proposed method achieved a correct rate of 83.68 percent while the error rate is less than 1 percent.
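The trie-structured lexicon described above can be sketched as follows. This is a generic illustration of the data structure and of how it yields a dynamic category set during matching; class and method names are our own, not the paper's.

```python
class TrieNode:
    """One node of a character trie over lexicon phrases."""
    def __init__(self):
        self.children = {}
        self.is_phrase = False

class Lexicon:
    def __init__(self, phrases):
        self.root = TrieNode()
        for phrase in phrases:
            node = self.root
            for ch in phrase:
                node = node.children.setdefault(ch, TrieNode())
            node.is_phrase = True

    def valid_continuations(self, prefix):
        """Characters that can follow `prefix` in some lexicon phrase.

        During lexicon-driven matching, the classifier only needs to
        score a candidate pattern against this dynamic category set
        rather than against the full character inventory.
        """
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.children)

    def contains(self, phrase):
        node = self.root
        for ch in phrase:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_phrase
```

For example, with phrases "tokyo", "toyama" and "kyoto", the prefix "to" restricts the next-character candidates to just {"k", "y"}.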

178 citations


Proceedings ArticleDOI
10 Dec 2002
TL;DR: This paper proposes an efficient text detection approach, which is based on invariant features, such as edge strength, edge density, and horizontal distribution, and it applies edge detection and uses a low threshold to filter out definitely non-text edges.
Abstract: Text detection is fundamental to video information retrieval and indexing. Existing methods cannot handle well those texts with different contrast or embedded in a complex background. To handle these difficulties, this paper proposes an efficient text detection approach, which is based on invariant features, such as edge strength, edge density, and horizontal distribution. First, it applies edge detection and uses a low threshold to filter out definitely non-text edges. Then, a local threshold is selected to both keep low-contrast text and simplify complex background of high-contrast text. Next, two text-area enhancement operators are proposed to highlight those areas with either high edge strength or high edge density. Finally, coarse-to-fine detection locates text regions efficiently. Experimental results show that this approach is robust for contrast, font-size, font-color, language, and background complexity.

147 citations


Journal ArticleDOI
TL;DR: This paper briefly describes various components of a document analysis system and provides the background necessary to understand the detailed descriptions of specific techniques presented in other papers in this issue.
Abstract: Document image analysis refers to algorithms and techniques that are applied to images of documents to obtain a computer-readable description from pixel data. A well-known document image analysis product is the Optical Character Recognition (OCR) software that recognizes characters in a scanned document. OCR makes it possible for the user to edit or search the document’s contents. In this paper we briefly describe various components of a document analysis system. Many of these basic building blocks are found in most document analysis systems, irrespective of the particular domain or language to which they are applied. We hope that this paper will help the reader by providing the background necessary to understand the detailed descriptions of specific techniques presented in other papers in this issue.

143 citations


Journal ArticleDOI
TL;DR: A video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier that improves the Chinese video-caption recognition accuracy from 13% to 86% on a set of news video captions.
Abstract: We present a video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier. Using a novel caption-transition detection scheme we locate both spatial and temporal positions of video captions with high precision and efficiency. Then employing several new character segmentation and binarization techniques, we improve the Chinese video-caption recognition accuracy from 13% to 86% on a set of news video captions. As the first attempt on Chinese video-caption recognition, our experiment results are very encouraging.

131 citations


Journal ArticleDOI
01 Nov 2002
TL;DR: A new technique is presented for identification and segmentation of touching characters based on fuzzy multifactorial analysis and a predictive algorithm is developed for effectively selecting possible cut columns for segmenting the touching characters.
Abstract: One of the important reasons for poor recognition rate in optical character recognition (OCR) system is the error in character segmentation. Existence of touching characters in the scanned documents is a major problem to design an effective character segmentation procedure. In this paper, a new technique is presented for identification and segmentation of touching characters. The technique is based on fuzzy multifactorial analysis. A predictive algorithm is developed for effectively selecting possible cut columns for segmenting the touching characters. The proposed method has been applied to printed documents in Devnagari and Bangla: the two most popular scripts of the Indian sub-continent. The results obtained from a test-set of considerable size show that a reasonable improvement in recognition rate can be achieved with a modest increase in computations.
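A much simpler stand-in for the paper's fuzzy multifactorial analysis is the classic vertical projection profile, which illustrates how candidate cut columns for touching characters can be selected; the function name, the minimum-separation rule and the border handling are our own illustrative assumptions.

```python
import numpy as np

def candidate_cut_columns(binary, max_cuts=3, min_sep=3):
    """Pick candidate columns for splitting touching characters.

    Columns where few foreground pixels are stacked are likely
    ligature points, so the lowest interior columns of the vertical
    projection profile are proposed as cuts, kept at least `min_sep`
    columns apart.
    """
    profile = binary.sum(axis=0)
    cols = np.arange(1, len(profile) - 1)          # skip border columns
    order = cols[np.argsort(profile[1:-1], kind="stable")]
    cuts = []
    for col in order:
        if all(abs(col - c) >= min_sep for c in cuts):
            cuts.append(int(col))
        if len(cuts) == max_cuts:
            break
    return sorted(cuts)
```

On a toy image of two blobs joined by a one-pixel bridge, the single best cut lands on the bridge.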

126 citations


Journal ArticleDOI
TL;DR: Intuitive, easy-to-implement evaluation schemes for the related problems of table detection and table structure recognition are introduced, and a new paradigm, “graph probing,” is described for comparing the results returned by the recognition system and the representation created during ground-truthing.
Abstract: While techniques for evaluating the performance of lower-level document analysis tasks such as optical character recognition have gained acceptance in the literature, attempts to formalize the problem for higher-level algorithms, while receiving a fair amount of attention in terms of theory, have generally been less successful in practice, perhaps owing to their complexity. In this paper, we introduce intuitive, easy-to-implement evaluation schemes for the related problems of table detection and table structure recognition. We also present the results of several small experiments, demonstrating how well the methodologies work and the useful sorts of feedback they provide. We first consider the table detection problem. Here algorithms can yield various classes of errors, including non-table regions improperly labeled as tables (insertion errors), tables missed completely (deletion errors), larger tables broken into a number of smaller ones (splitting errors), and groups of smaller tables combined to form larger ones (merging errors). This leads naturally to the use of an edit distance approach for assessing the results of table detection. Next we address the problem of evaluating table structure recognition. Our model is based on a directed acyclic attribute graph, or table DAG. We describe a new paradigm, “graph probing,” for comparing the results returned by the recognition system and the representation created during ground-truthing. Probing is in fact a general concept that could be applied to other document recognition tasks as well.

Proceedings ArticleDOI
06 Aug 2002
TL;DR: A new character segmentation algorithm (ACSA) for Arabic scripts is presented, which segments isolated handwritten words into perfectly separated characters based on morphological rules constructed at the feature extraction phase.
Abstract: Character segmentation is a necessary preprocessing step for character recognition in many OCR systems. It is an important step because incorrectly segmented characters are unlikely to be recognized correctly. The most difficult case in character segmentation is cursive script. The cursive nature of written Arabic poses high challenges for automatic character segmentation and recognition. In this paper, a new character segmentation algorithm (ACSA) for Arabic scripts is presented. The developed segmentation algorithm segments isolated handwritten words into perfectly separated characters. It is based on morphological rules, which are constructed at the feature extraction phase. Finally, ACSA is combined with an existing handwritten Arabic character recognition system (RECAM).

Journal ArticleDOI
TL;DR: A neural network-based script identification system which can be used in the machine reading of documents written in English, Hindi and Kannada language scripts and results are very encouraging and prove the effectiveness of the approach.
Abstract: The paper describes a neural network-based script identification system which can be used in the machine reading of documents written in English, Hindi and Kannada language scripts. Script identification is a basic requirement in the automation of document processing in multi-script, multi-lingual environments. The system developed includes a feature extractor and a modular neural network. The feature extractor consists of two stages. In the first stage the document image is dilated using 3×3 masks in the horizontal, vertical, right-diagonal, and left-diagonal directions. In the next stage, the average pixel distribution is found in these resulting images. The modular network is a combination of separately trained feedforward neural network classifiers for each script. The system recognizes 64×64 pixel document images. At the next level, the system is modified to operate on single-word document images in the same three scripts. The modified system includes a pre-processor, a modified feature extractor and a probabilistic neural network classifier. The pre-processor segments the multi-script multi-lingual document into individual words. The feature extractor receives these word-document images of variable size and still produces the discriminative features employed by the probabilistic neural classifier. Experiments are conducted on a manually developed database of document images of size 64×64 pixels and on a database of individual words in the three scripts. The results are very encouraging and prove the effectiveness of the approach.

Patent
05 Dec 2002
TL;DR: In this article, a cheque scanning module scans cheques and matches the encoded Magnetic Ink Character Recognition (MICR) data (i.e. serial number, Customer Account Number and amount) from the scanned digital electronic images with items in an issuance database which contains client provided cheque particulars.
Abstract: A system and method for detecting cheque fraud includes a cheque scanning module and a detection module. The cheque scanning module scans cheques and matches the encoded Magnetic Ink Character Recognition (MICR) data (i.e. serial number, Customer Account Number and amount) from the scanned digital electronic images with items in an issuance database which contains client provided cheque particulars. The detection module passes the cheque images through an optical character recognition (OCR) process to read what is written on the cheque and to match results against the issuance database. If the written information on the face of a cheque is unreadable or there is no match with the information in the issuance database, the detection module passes the cheque through a series of slower more precise OCR processes. Any cheques that are not successfully read and matched are highlighted as an "exception" and immediately forwarded to the client for further action.

Journal ArticleDOI
TL;DR: A prototype of the OCR system for printed Oriya script achieves 96.3% character level accuracy on average, and the feature detection methods are simple and robust, and do not require preprocessing steps like thinning and pruning.
Abstract: This paper deals with an Optical Character Recognition (OCR) system for printed Oriya script. The development of OCR for this script is difficult because a large number of character shapes in the script have to be recognized. In the proposed system, the document image is first captured using a flat-bed scanner and then passed through different preprocessing modules like skew correction, line segmentation, zone detection, word and character segmentation etc. These modules have been developed by combining some conventional techniques with some newly proposed ones. Next, individual characters are recognized using a combination of stroke and run-number based features, along with features obtained from the concept of water overflow from a reservoir. The feature detection methods are simple and robust, and do not require preprocessing steps like thinning and pruning. A prototype of the system has been tested on a variety of printed Oriya material, and currently achieves 96.3% character level accuracy on average.

Proceedings ArticleDOI
10 Dec 2002
TL;DR: This work uses multiple frame verification to reduce text detection false alarms and applies a block-based adaptive thresholding procedure to form a clearer "man-made" frame that is sent to an OCR engine for recognition.
Abstract: Text superimposed on video frames provides supplemental but important information for video indexing and retrieval. Many efforts have been made at videotext detection and recognition (video OCR). The main difficulties of video OCR are the low resolution and the background complexity. We present efficient schemes to deal with the second difficulty by fully exploiting the multiple frames that contain the same text to get every clear word from these frames. First, we use multiple frame verification to reduce text detection false alarms. We then choose those frames where the text is most likely to be clear, and thus more likely to be correctly recognized. We detect and join every clear text block from those frames to form a clearer "man-made" frame. Next we apply a block-based adaptive thresholding procedure to these "man-made" frames. Finally, the binarized frames are sent to an OCR engine for recognition. Experiments show that the word recognition rate has been increased by over 28% by these methods.
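Block-based adaptive thresholding of the kind mentioned here can be sketched as follows. The tile size and the mean-minus-offset rule are illustrative assumptions, not the paper's parameters; the point is only that a per-tile threshold tolerates uneven backgrounds better than one global value.

```python
import numpy as np

def block_adaptive_binarize(gray, block=16, c=5.0):
    """Binarize an image with one threshold per block x block tile.

    Each tile is thresholded at (tile mean - c), so a brightness
    gradient across the frame does not swallow dark text strokes.
    """
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = gray[y:y+block, x:x+block].astype(float)
            t = tile.mean() - c
            out[y:y+block, x:x+block] = (tile > t).astype(np.uint8) * 255
    return out
```

On a frame whose background brightness ramps from 80 to 200 with dark strokes drawn on top, each stroke pixel falls below its local threshold even though no single global threshold would separate both strokes from both backgrounds.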

Journal ArticleDOI
TL;DR: In this paper, an automatic technique for the identification of printed Roman, Chinese, Arabic, Devnagari and Bangla text lines from a single document has been presented, using shape-based features, statistical features and some features obtained from the concept of water overflow from the reservoir.

Book ChapterDOI
19 Aug 2002
TL;DR: Techniques to identify the script of a word using Gabor filters with suitable frequencies and orientations are discussed and results obtained are quite encouraging.
Abstract: Identification of script in multi-lingual documents is essential for many language dependent applications such as machine translation and optical character recognition. Techniques for script identification generally require large areas for operation so that sufficient information is available. Such an assumption does not hold in the Indian context, as words of two different scripts are interspersed in most documents. In this paper, techniques to identify the script of a word are discussed. Two different approaches have been proposed and tested. The first method structures words into 3 distinct spatial zones and utilizes the information on the spatial spread of a word in the upper and lower zones, together with the character density, in order to identify the script. The second technique analyzes the directional energy distribution of a word using Gabor filters with suitable frequencies and orientations. Words with various font styles and sizes have been used for the testing of the proposed algorithms and the results obtained are quite encouraging.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: A complete system able to classify Arabic handwritten words of one hundred different writers is proposed and discussed, and successful recognition results are reported.
Abstract: Hidden Markov models (HMM) have been used with some success in recognizing printed Arabic words. In this paper, a complete scheme for totally unconstrained Arabic handwritten word recognition based on a model discriminant HMM is presented. A complete system able to classify Arabic handwritten words from one hundred different writers is proposed and discussed. The system first attempts to remove some of the variation in the images that does not affect the identity of the handwritten word. Next, the system codes the skeleton and edge of the word so that feature information about the lines in the skeleton is extracted. Then a classification process based on the HMM approach is used. The output is a word in the dictionary. A detailed experiment is carried out and successful recognition results are reported.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: A multiscale texture-based method using local energy analysis is proposed for hybrid Chinese/English text detection in images and video frames, achieving a low miss rate and false alarm rate on the tested dataset.
Abstract: We propose a multiscale texture-based method using local energy analysis for hybrid Chinese/English text detection in images and video frames. Local energy analysis has been shown to work well in text detection: high local energy variations of pixels correspond to text regions or the boundaries of other objects, while lower local energy variations correspond to background or the interior of non-text objects. Local energy variation is calculated in a local region based on the wavelet transform coefficients of the image. Hybrid Chinese/English text in images and video frames can be detected whether it is aligned horizontally or vertically, and the font size of the text to be detected may vary over a wide range. The proposed method has been tested on 321 frame images obtained from local TV programs and achieves a low miss rate and false alarm rate.

Journal ArticleDOI
TL;DR: A modified Topology Adaptive Self-Organizing Neural Network is proposed to extract a vector skeleton from a binary numeral image, with simple heuristics to prune artifacts, if any, in the skeletal shape.
Abstract: This paper proposes a novel approach to automatic recognition of handprinted Bangla (an Indian script) numerals. A modified Topology Adaptive Self-Organizing Neural Network is proposed to extract a vector skeleton from a binary numeral image. Simple heuristics are considered to prune artifacts, if any, in such a skeletal shape. Certain topological and structural features like loops, junctions, positions of terminal nodes, etc. are used along with a hierarchical tree classifier to classify handwritten numerals into smaller subgroups. Multilayer perceptron (MLP) networks are then employed to uniquely classify the numerals belonging to each subgroup. The system is trained using a sample data set of 1800 numerals and we have obtained 93.26% correct recognition rate and 1.71% rejection on a separate test set of another 7760 samples. In addition, a validation set consisting of 1440 samples has been used to determine the termination of the training algorithm of the MLP networks. The proposed scheme is sufficiently robust with respect to considerable object noise.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: Alternative choices of indexing terms are explored using both an existing electronic text collection and a newly developed collection built from images of actual printed Arabic documents, and character n-grams or lightly stemmed words were found to typically yield near-optimal retrieval effectiveness.
Abstract: Since many Arabic documents are available only in print, automating retrieval from collections of scanned Arabic document images using Optical Character Recognition (OCR) is an interesting problem. Arabic combines rich morphology with a writing system that presents unique challenges to OCR systems. These factors must be considered when selecting terms for automatic indexing. In this paper, alternative choices of indexing terms are explored using both an existing electronic text collection and a newly developed collection built from images of actual printed Arabic documents. Character n-grams or lightly stemmed words were found to typically yield near-optimal retrieval effectiveness, and combining both types of terms resulted in robust performance across a broad range of conditions.
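Character n-gram indexing of the kind evaluated here can be sketched as follows. The underscore padding convention and the default n are our own assumptions, and the light stemming applied to Arabic words is language-specific and omitted; the sketch only shows how word terms and n-gram terms are combined.

```python
def char_ngrams(word, n=3):
    """Overlapping character n-grams, with '_' marking word boundaries."""
    padded = "_" + word + "_"
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def index_terms(text, n=3):
    """Combine whole-word terms with character n-grams.

    Indexing both term types at once mirrors the combined condition,
    which the paper found robust across a broad range of conditions.
    """
    terms = []
    for word in text.lower().split():
        terms.append(word)
        terms.extend(char_ngrams(word, n))
    return terms
```

N-grams make retrieval tolerant of isolated OCR character errors: a corrupted word still shares most of its n-grams with the correct form, so it still matches partially.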

Patent
Eric T. Eaton
12 Nov 2002
TL;DR: A method of limiting visual information that can be stored or transmitted: a visual image is captured, optical symbol recognition is performed on it, and the symbols generated by the recognition routine are compared to one or more predetermined symbols.
Abstract: A method carried out at a device ( 700 ), of limiting visual information that can be stored or transmitted captures a visual image ( 408 ). Whenever a control signal is detected ( 412 ) at the device ( 700 ), an optical symbol recognition of the image is performed. The method further includes comparing symbols generated by the optical symbol recognition routine to one or more predetermined symbols. When a symbol match to one or more predetermined symbols is generated, at least a portion of the visual image ( 408 ) is obscured.

Proceedings ArticleDOI
14 Oct 2002
TL;DR: An interface to textual information for the visually impaired that uses video, image processing, optical-character-recognition (OCR) and text-to-speech (TTS) is described.
Abstract: We describe the development of an interface to textual information for the visually impaired that uses video, image processing, optical character recognition (OCR) and text-to-speech (TTS). The video provides a sequence of low resolution images in which text must be detected, rectified and converted into high resolution rectangular blocks that are capable of being analyzed via off-the-shelf OCR. To achieve this, various problems related to feature detection, mosaicing, auto-focus, zoom, and systems integration were solved in the development of the system, and these are described.

Book ChapterDOI
TL;DR: In this paper, a process of expansion of the training set by synthetic generation of handwritten uppercase letters via deformations of natural images is tested in combination with an approximate k-Nearest Neighbor (k-NN) classifier.
Abstract: In this paper, a process of expansion of the training set by synthetic generation of handwritten uppercase letters via deformations of natural images is tested in combination with an approximate k-Nearest Neighbor (k-NN) classifier. It has been previously shown [11] [10] that approximate nearest neighbors search in large databases can be successfully used in an OCR task, and that significant performance improvements can be consistently obtained by simply increasing the size of the training set. In this work, extensive experiments adding distorted characters to the training set are performed, and the results are compared to directly adding new natural samples to the set of prototypes.
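Synthetic expansion of a training set by small deformations can be sketched as follows. The shear-plus-shift model, its parameter ranges and all names are our own illustrative assumptions, not the distortions used in the paper.

```python
import numpy as np

def random_affine_distort(img, rng, max_shift=2, max_shear=0.2):
    """Return a distorted copy of a character image.

    Applies a small random shear about the vertical center plus a
    random translation, with nearest-neighbor sampling.
    """
    h, w = img.shape
    shear = rng.uniform(-max_shear, max_shear)
    dy = int(rng.integers(-max_shift, max_shift + 1))
    dx = int(rng.integers(-max_shift, max_shift + 1))
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            sx = int(round(x - shear * (y - h / 2))) - dx
            sy = y - dy
            if 0 <= sy < h and 0 <= sx < w:
                out[y, x] = img[sy, sx]
    return out

def expand_training_set(images, labels, copies, seed=0):
    """Append `copies` distorted variants of each sample, keeping labels."""
    rng = np.random.default_rng(seed)
    new_imgs, new_labels = list(images), list(labels)
    for img, label in zip(images, labels):
        for _ in range(copies):
            new_imgs.append(random_affine_distort(img, rng))
            new_labels.append(label)
    return new_imgs, new_labels
```

The expanded set is then fed to the classifier exactly as if the distorted samples were new natural prototypes, which is the comparison the paper carries out.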

Proceedings ArticleDOI
14 Oct 2002
TL;DR: An effective approach for a PDA-based sign system that efficiently embeds multi-resolution, adaptive search in a hierarchical framework with different emphases at each layer, and introduces an intensity-based OCR method to recognize characters in various fonts and lighting conditions.
Abstract: In this paper, we propose an effective approach for a PDA-based sign system that presents the user with a sign translator. Its main functions comprise three parts: detection, recognition and translation. Automatic detection and recognition of text in natural scenes is a prerequisite for automatic sign translation. In order to make the system robust for text detection in various natural scenes, the detection approach efficiently embeds multi-resolution, adaptive search in a hierarchical framework with different emphases at each layer. We also introduce an intensity-based OCR method to recognize characters in various fonts and lighting conditions, where we employ the Gabor transform to obtain local features, and LDA for feature selection and classification. The recognition rate is 92.4% on a test set captured from natural signs. Sign text differs from ordinary sentences: it is brief, with many abbreviations and place nouns. Here we only briefly introduce a rule-based place-name translation. We have integrated all these functions on a PDA, which can capture a sign image, automatically segment and recognize the Chinese sign, and translate it into English.

Book ChapterDOI
19 Aug 2002
TL;DR: Recognition of Indian language characters has been a topic of interest for quite some time and the need for efficient and robust algorithms and systems for recognition is being felt in India, especially in the post and telegraph department where OCR can assist the staff in sorting mail.
Abstract: Document image processing and Optical Character Recognition (OCR) have been a frontline research area in the field of human-machine interfaces for the last few decades. Recognition of Indian language characters has been a topic of interest for quite some time. Earlier contributions were reported in [1] and [2]; more recent work is reported in [3] and [9]. The need for efficient and robust algorithms and systems for recognition is being felt in India, especially in the post and telegraph department, where OCR can assist staff in sorting mail. Character recognition can also form a part of applications like intelligent scanning machines, text-to-speech converters, and automatic language-to-language translators.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: The feature extraction method for Chinese character recognition is refined to improve the discriminability of histogram features, and the non-linear function used in previous research is modified to regulate the outputs of Gabor filters adaptively.
Abstract: This paper proposes a new feature extraction method for Chinese character recognition using optimized Gabor filters. Based on the theory of Gabor filters and the statistical information of Chinese character images, a simple but effective method to design Gabor filters is developed. Moreover, to improve performance on low quality images, we modify the non-linear function used in previous research to regulate the outputs of Gabor filters adaptively. The paper also refines the feature extraction method to improve the discriminability of histogram features. Experiments show that our method performs excellently for images with noise, complex backgrounds or stroke distortions, and can be applied to printed or handwritten character recognition tasks in low quality greyscale or binary images.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: A combination of two confidence measures defined for a k-nearest neighbors (NN) classifier is proposed and experiments are presented comparing the performance of the same system with and without the new rejection rules.
Abstract: In handwritten character recognition, the rejection of extraneous patterns, like image noise, strokes or corrections, can significantly improve the practical usefulness of a system. In this paper a combination of two confidence measures defined for a k-nearest neighbors (NN) classifier is proposed. Experiments are presented comparing the performance of the same system with and without the new rejection rules.
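A k-NN classifier combining two confidence measures with a rejection option can be sketched as follows. Both measures (neighbor-vote agreement and nearest-neighbor distance) and their thresholds are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def knn_with_reject(train_x, train_y, query, k=5,
                    min_vote=0.8, max_dist=1.0):
    """Classify with k-NN, rejecting unreliable inputs.

    Two confidence measures are combined: the fraction of the k
    neighbors agreeing on the winning label, and the distance to the
    nearest neighbor. An input failing either check is rejected,
    which is how noise, stray strokes or corrections get filtered.
    """
    dists = np.linalg.norm(train_x - query, axis=1)
    idx = np.argsort(dists)[:k]
    labels = [train_y[i] for i in idx]
    winner = max(set(labels), key=labels.count)
    vote_conf = labels.count(winner) / k
    if vote_conf < min_vote or dists[idx[0]] > max_dist:
        return None  # reject: pattern too ambiguous or too far away
    return winner
```

A query deep inside one cluster is accepted; a query stranded between clusters fails the distance check and is rejected rather than guessed.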

Proceedings ArticleDOI
01 Jan 2002
TL;DR: A very general, theoretically optimal model is applied to the problem of OCR word correction, practical methods for parameter estimation are introduced, and performance on real data is evaluated.
Abstract: In this paper, we take a pattern recognition approach to correcting errors in text generated from printed documents using optical character recognition (OCR). We apply a very general, theoretically optimal model to the problem of OCR word correction, introduce practical methods for parameter estimation, and evaluate performance on real data.