
Showing papers on "Optical character recognition published in 2002"


Journal ArticleDOI
Rainer Lienhart, A. Wernicke
TL;DR: This work proposes a novel method for localizing and segmenting text in complex images and videos that is not only able to locate and segment text occurrences into large binary images, but is also able to track each text line with sub-pixel accuracy over the entire occurrence in a video.
Abstract: Many images, especially those used for page design on Web pages, as well as videos contain visible text. If these text occurrences could be detected, segmented, and recognized automatically, they would be a valuable source of high-level semantics for indexing and retrieval. We propose a novel method for localizing and segmenting text in complex images and videos. Text lines are identified by using a complex-valued multilayer feed-forward network trained to detect text at a fixed scale and position. The network's output at all scales and positions is integrated into a single text-saliency map, serving as a starting point for candidate text lines. In the case of video, these candidate text lines are refined by exploiting the temporal redundancy of text in video. Localized text lines are then scaled to a fixed height of 100 pixels and segmented into a binary image with black characters on white background. For videos, temporal redundancy is exploited to improve segmentation performance. Input images and videos can be of any size due to a true multiresolution approach. Moreover, the system is not only able to locate and segment text occurrences into large binary images, but is also able to track each text line with sub-pixel accuracy over the entire occurrence in a video, so that one text bitmap is created for all instances of that text line. Therefore, our text segmentation results can also be used for object-based video encoding such as that enabled by MPEG-4.

478 citations


Proceedings ArticleDOI
10 Dec 2002
TL;DR: An algorithm to localize artificial text in images and videos using a measure of accumulated gradients and morphological post processing to detect the text is presented and the quality of the localized text is improved by robust multiple frame integration.
Abstract: The systems currently available for content based image and video retrieval work without semantic knowledge, i.e. they use image processing methods to extract low level features of the data. The similarity obtained by these approaches does not always correspond to the similarity a human user would expect. A way to include more semantic knowledge in the indexing process is to use the text included in the images and video sequences. It is rich in information yet easy to use, e.g. by keyword based queries. In this paper we present an algorithm to localize artificial text in images and videos using a measure of accumulated gradients and morphological post processing to detect the text. The quality of the localized text is improved by robust multiple frame integration. A new technique for the binarization of the text boxes is proposed. Finally, detection and OCR results for a commercial OCR are presented.
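The accumulated-gradient measure at the heart of this detector can be sketched roughly as follows. This is a numpy-only illustration of the general idea, not the paper's code; the window size and the mean-plus-std threshold heuristic are our own assumptions.

```python
import numpy as np

def accumulated_gradient_map(gray, win=16):
    """Sum absolute horizontal gradients over a sliding window.

    Text regions tend to contain dense vertical strokes, so the
    horizontal gradient accumulated over a small window is high there.
    """
    gx = np.abs(np.diff(gray.astype(float), axis=1))
    gx = np.pad(gx, ((0, 0), (0, 1)))  # restore original width
    # Integral image lets us sum any window in O(1).
    ii = np.cumsum(np.cumsum(gx, axis=0), axis=1)
    ii = np.pad(ii, ((1, 0), (1, 0)))
    h, w = gx.shape
    acc = np.zeros_like(gx)
    for y in range(h):
        for x in range(w):
            y0, x0 = max(0, y - win), max(0, x - win)
            y1, x1 = min(h, y + win + 1), min(w, x + win + 1)
            acc[y, x] = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
    return acc

def text_mask(gray, win=16, thresh=None):
    """Binary mask of likely text pixels (assumed thresholding rule)."""
    acc = accumulated_gradient_map(gray, win)
    if thresh is None:
        thresh = acc.mean() + acc.std()  # illustrative heuristic
    return acc > thresh
```

On a synthetic frame with a high-frequency stripe (standing in for text) on a flat background, the mask fires inside the stripe and stays off elsewhere.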

262 citations


Journal ArticleDOI
TL;DR: A modular system to recognize handwritten numerical strings using a segmentation-based recognition approach and a recognition and verification strategy that combines the outputs from different levels such as segmentation, recognition, and postprocessing in a probabilistic model is proposed.
Abstract: A modular system to recognize handwritten numerical strings is proposed. It uses a segmentation-based recognition approach and a recognition and verification strategy. The approach combines the outputs from different levels such as segmentation, recognition, and postprocessing in a probabilistic model. A new verification scheme which contains two verifiers to deal with the problems of oversegmentation and undersegmentation is presented. A new feature set is also introduced to feed the oversegmentation verifier. A postprocessor based on a deterministic automaton is used and the global decision module makes an accept/reject decision. Finally, experimental results on two databases are presented: numerical amounts on Brazilian bank checks and NIST SD19. The latter aims at validating the concept of modular system and showing the robustness of the system using a well-known database.

228 citations


Journal ArticleDOI
TL;DR: This review is organised into five major sections, covering a general overview, Arabic writing characteristics, Arabic text recognition system, Arabic OCR software and conclusions.
Abstract: Off-line recognition requires transferring the text under consideration into an image file. This represents the only available solution to bring the printed materials to the electronic media. However, the transferring process causes the system to lose the temporal information of that text. Other complexities that an off-line recognition system has to deal with are the lower resolution of the document and the poor binarisation, which can impair readability when essential features of the characters are deleted or obscured. Recognising Arabic script presents two additional challenges: orthography is cursive and letter shape is context sensitive. Certain character combinations form new ligature shapes, which are often font-dependent. Some ligatures involve vertical stacking of characters. Since not all letters connect, word boundary location becomes an interesting problem, as spacing may separate not only words, but also certain characters within a word. Various techniques have been implemented to achieve high recognition rates. These techniques have tackled different aspects of the recognition system. This review is organised into five major sections, covering a general overview, Arabic writing characteristics, Arabic text recognition system, Arabic OCR software and conclusions.

207 citations


Journal ArticleDOI
TL;DR: A new analytic scheme, which uses a sequence of image segmentation and recognition algorithms, is proposed for the off-line cursive handwriting recognition problem and indicates higher recognition rates compared to the available methods reported in the literature.
Abstract: A new analytic scheme, which uses a sequence of image segmentation and recognition algorithms, is proposed for the off-line cursive handwriting recognition problem. First, some global parameters, such as slant angle, baselines, stroke width and height, are estimated. Second, a segmentation method finds character segmentation paths by combining gray-scale and binary information. Third, a hidden Markov model (HMM) is employed for shape recognition to label and rank the character candidates. For this purpose, a string of codes is extracted from each segment to represent the character candidates. The estimation of feature space parameters is embedded in the HMM training stage together with the estimation of the HMM model parameters. Finally, information from a lexicon and from the HMM ranks is combined in a graph optimization problem for word-level recognition. This method corrects most of the errors produced by the segmentation and HMM ranking stages by maximizing an information measure in an efficient graph search algorithm. The experiments indicate higher recognition rates compared to the available methods reported in the literature.

184 citations


Journal ArticleDOI
TL;DR: A handwritten character string recognition system for Japanese mail address reading with a very large vocabulary; address phrases are recognized as a whole because there is no extra space between words, and a beam search strategy controls lexicon matching to achieve real-time recognition.
Abstract: This paper describes a handwritten character string recognition system for Japanese mail address reading on a very large vocabulary. The address phrases are recognized as a whole because there is no extra space between words. The lexicon contains 111,349 address phrases, which are stored in a trie structure. In recognition, the text line image is matched with the lexicon entries (phrases) to obtain reliable segmentation and retrieve valid address phrases. The paper first introduces some effective techniques for text line image preprocessing and presegmentation. In presegmentation, the text line image is separated into primitive segments by connected component analysis and touching pattern splitting based on contour shape analysis. In lexicon matching, consecutive segments are dynamically combined into candidate character patterns. An accurate character classifier is embedded in lexicon matching to select characters matched with a candidate pattern from a dynamic category set. A beam search strategy is used to control the lexicon matching so as to achieve real-time recognition. In experiments on 3,589 live mail images, the proposed method achieved a correct rate of 83.68 percent while the error rate is less than 1 percent.
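The trie-structured lexicon described above can be sketched as follows. This is a generic illustration of the data structure and of how it yields a dynamic category set during matching; class and method names are our own, not the paper's.

```python
class TrieNode:
    """One node of a character trie over lexicon phrases."""
    def __init__(self):
        self.children = {}
        self.is_phrase = False

class Lexicon:
    def __init__(self, phrases):
        self.root = TrieNode()
        for phrase in phrases:
            node = self.root
            for ch in phrase:
                node = node.children.setdefault(ch, TrieNode())
            node.is_phrase = True

    def valid_continuations(self, prefix):
        """Characters that can follow `prefix` in some lexicon phrase.

        During lexicon-driven matching, the classifier only needs to
        score a candidate pattern against this dynamic category set
        rather than against the full character inventory.
        """
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.children)

    def contains(self, phrase):
        node = self.root
        for ch in phrase:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_phrase
```

For example, with phrases "tokyo", "toyama" and "kyoto", the prefix "to" restricts the next-character candidates to just {"k", "y"}.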

178 citations


Proceedings ArticleDOI
10 Dec 2002
TL;DR: This paper proposes an efficient text detection approach, which is based on invariant features, such as edge strength, edge density, and horizontal distribution, and it applies edge detection and uses a low threshold to filter out definitely non-text edges.
Abstract: Text detection is fundamental to video information retrieval and indexing. Existing methods cannot handle well those texts with different contrast or embedded in a complex background. To handle these difficulties, this paper proposes an efficient text detection approach, which is based on invariant features, such as edge strength, edge density, and horizontal distribution. First, it applies edge detection and uses a low threshold to filter out definitely non-text edges. Then, a local threshold is selected to both keep low-contrast text and simplify complex background of high-contrast text. Next, two text-area enhancement operators are proposed to highlight those areas with either high edge strength or high edge density. Finally, coarse-to-fine detection locates text regions efficiently. Experimental results show that this approach is robust for contrast, font-size, font-color, language, and background complexity.

147 citations


Journal ArticleDOI
TL;DR: This paper briefly describes various components of a document analysis system and provides the background necessary to understand the detailed descriptions of specific techniques presented in other papers in this issue.
Abstract: Document image analysis refers to algorithms and techniques that are applied to images of documents to obtain a computer-readable description from pixel data. A well-known document image analysis product is the Optical Character Recognition (OCR) software that recognizes characters in a scanned document. OCR makes it possible for the user to edit or search the document’s contents. In this paper we briefly describe various components of a document analysis system. Many of these basic building blocks are found in most document analysis systems, irrespective of the particular domain or language to which they are applied. We hope that this paper will help the reader by providing the background necessary to understand the detailed descriptions of specific techniques presented in other papers in this issue.

143 citations


Journal ArticleDOI
TL;DR: A video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier that improves the Chinese video-caption recognition accuracy from 13% to 86% on a set of news video captions.
Abstract: We present a video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier. Using a novel caption-transition detection scheme we locate both spatial and temporal positions of video captions with high precision and efficiency. Then employing several new character segmentation and binarization techniques, we improve the Chinese video-caption recognition accuracy from 13% to 86% on a set of news video captions. As the first attempt on Chinese video-caption recognition, our experiment results are very encouraging.

131 citations


Journal ArticleDOI
01 Nov 2002
TL;DR: A new technique is presented for identification and segmentation of touching characters based on fuzzy multifactorial analysis and a predictive algorithm is developed for effectively selecting possible cut columns for segmenting the touching characters.
Abstract: One of the important reasons for poor recognition rate in optical character recognition (OCR) system is the error in character segmentation. Existence of touching characters in the scanned documents is a major problem to design an effective character segmentation procedure. In this paper, a new technique is presented for identification and segmentation of touching characters. The technique is based on fuzzy multifactorial analysis. A predictive algorithm is developed for effectively selecting possible cut columns for segmenting the touching characters. The proposed method has been applied to printed documents in Devnagari and Bangla: the two most popular scripts of the Indian sub-continent. The results obtained from a test-set of considerable size show that a reasonable improvement in recognition rate can be achieved with a modest increase in computations.
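A much simpler stand-in for the paper's fuzzy multifactorial analysis is the classic vertical projection profile, which illustrates how candidate cut columns for touching characters can be selected; the function name, the minimum-separation rule and the border handling are our own illustrative assumptions.

```python
import numpy as np

def candidate_cut_columns(binary, max_cuts=3, min_sep=3):
    """Pick candidate columns for splitting touching characters.

    Columns where few foreground pixels are stacked are likely
    ligature points, so the lowest interior columns of the vertical
    projection profile are proposed as cuts, kept at least `min_sep`
    columns apart.
    """
    profile = binary.sum(axis=0)
    cols = np.arange(1, len(profile) - 1)          # skip border columns
    order = cols[np.argsort(profile[1:-1], kind="stable")]
    cuts = []
    for col in order:
        if all(abs(col - c) >= min_sep for c in cuts):
            cuts.append(int(col))
        if len(cuts) == max_cuts:
            break
    return sorted(cuts)
```

On a toy image of two blobs joined by a one-pixel bridge, the single best cut lands on the bridge.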

126 citations


Journal ArticleDOI
TL;DR: Intuitive, easy-to-implement evaluation schemes for the related problems of table detection and table structure recognition are introduced, and a new paradigm, “graph probing,” is described for comparing the results returned by the recognition system and the representation created during ground-truthing.
Abstract: While techniques for evaluating the performance of lower-level document analysis tasks such as optical character recognition have gained acceptance in the literature, attempts to formalize the problem for higher-level algorithms, while receiving a fair amount of attention in terms of theory, have generally been less successful in practice, perhaps owing to their complexity. In this paper, we introduce intuitive, easy-to-implement evaluation schemes for the related problems of table detection and table structure recognition. We also present the results of several small experiments, demonstrating how well the methodologies work and the useful sorts of feedback they provide. We first consider the table detection problem. Here algorithms can yield various classes of errors, including non-table regions improperly labeled as tables (insertion errors), tables missed completely (deletion errors), larger tables broken into a number of smaller ones (splitting errors), and groups of smaller tables combined to form larger ones (merging errors). This leads naturally to the use of an edit distance approach for assessing the results of table detection. Next we address the problem of evaluating table structure recognition. Our model is based on a directed acyclic attribute graph, or table DAG. We describe a new paradigm, “graph probing,” for comparing the results returned by the recognition system and the representation created during ground-truthing. Probing is in fact a general concept that could be applied to other document recognition tasks as well.

Proceedings ArticleDOI
06 Aug 2002
TL;DR: A new character segmentation algorithm (ACSA) for Arabic scripts is presented, which segments isolated handwritten words into perfectly separated characters based on morphological rules constructed at the feature extraction phase.
Abstract: Character segmentation is a necessary preprocessing step for character recognition in many OCR systems. It is an important step because incorrectly segmented characters are unlikely to be recognized correctly. The most difficult case in character segmentation is cursive script. The cursive nature of written Arabic poses high challenges for automatic character segmentation and recognition. In this paper, a new character segmentation algorithm (ACSA) for Arabic scripts is presented. The developed segmentation algorithm segments isolated handwritten words into perfectly separated characters. It is based on morphological rules, which are constructed at the feature extraction phase. Finally, ACSA is combined with an existing handwritten Arabic character recognition system (RECAM).

Journal ArticleDOI
TL;DR: A neural network-based script identification system which can be used in the machine reading of documents written in English, Hindi and Kannada language scripts and results are very encouraging and prove the effectiveness of the approach.
Abstract: The paper describes a neural network-based script identification system which can be used in the machine reading of documents written in English, Hindi and Kannada language scripts. Script identification is a basic requirement in the automation of document processing in multi-script, multi-lingual environments. The system developed includes a feature extractor and a modular neural network. The feature extractor consists of two stages. In the first stage the document image is dilated using 3×3 masks in the horizontal, vertical, right-diagonal, and left-diagonal directions. In the next stage, the average pixel distribution is found in these resulting images. The modular network is a combination of separately trained feedforward neural network classifiers for each script. The system recognizes 64×64 pixel document images. At the next level, the system is modified to operate on single-word document images in the same three scripts. The modified system includes a pre-processor, a modified feature extractor and a probabilistic neural network classifier. The pre-processor segments the multi-script multi-lingual document into individual words. The feature extractor receives these word-document images of variable size and still produces the discriminative features employed by the probabilistic neural classifier. Experiments are conducted on a manually developed database of document images of size 64×64 pixels and on a database of individual words in the three scripts. The results are very encouraging and prove the effectiveness of the approach.

Patent
05 Dec 2002
TL;DR: In this article, a cheque scanning module scans cheques and matches the encoded Magnetic Ink Character Recognition (MICR) data (i.e. serial number, Customer Account Number and amount) from the scanned digital electronic images with items in an issuance database which contains client provided cheque particulars.
Abstract: A system and method for detecting cheque fraud includes a cheque scanning module and a detection module. The cheque scanning module scans cheques and matches the encoded Magnetic Ink Character Recognition (MICR) data (i.e. serial number, Customer Account Number and amount) from the scanned digital electronic images with items in an issuance database which contains client provided cheque particulars. The detection module passes the cheque images through an optical character recognition (OCR) process to read what is written on the cheque and to match results against the issuance database. If the written information on the face of a cheque is unreadable or there is no match with the information in the issuance database, the detection module passes the cheque through a series of slower more precise OCR processes. Any cheques that are not successfully read and matched are highlighted as an "exception" and immediately forwarded to the client for further action.

Journal ArticleDOI
TL;DR: A prototype of the OCR system for printed Oriya script achieves 96.3% character level accuracy on average, and the feature detection methods are simple and robust, and do not require preprocessing steps like thinning and pruning.
Abstract: This paper deals with an Optical Character Recognition (OCR) system for printed Oriya script. The development of OCR for this script is difficult because a large number of character shapes in the script have to be recognized. In the proposed system, the document image is first captured using a flat-bed scanner and then passed through different preprocessing modules like skew correction, line segmentation, zone detection, word and character segmentation etc. These modules have been developed by combining some conventional techniques with some newly proposed ones. Next, individual characters are recognized using a combination of stroke and run-number based features, along with features obtained from the concept of water overflow from a reservoir. The feature detection methods are simple and robust, and do not require preprocessing steps like thinning and pruning. A prototype of the system has been tested on a variety of printed Oriya material, and currently achieves 96.3% character level accuracy on average.

Proceedings ArticleDOI
10 Dec 2002
TL;DR: This work uses multiple frame verification to reduce text detection false alarms and applies a block-based adaptive thresholding procedure to form a clearer "man-made" frame that is sent to an OCR engine for recognition.
Abstract: Text superimposed on video frames provides supplemental but important information for video indexing and retrieval. Many efforts have been made at videotext detection and recognition (video OCR). The main difficulties of video OCR are the low resolution and the background complexity. We present efficient schemes to deal with the second difficulty by fully exploiting the multiple frames that contain the same text to get every clear word from these frames. First, we use multiple frame verification to reduce text detection false alarms. We then choose those frames where the text is most likely to be clear, and thus more likely to be correctly recognized. We detect and join every clear text block from those frames to form a clearer "man-made" frame. Next we apply a block-based adaptive thresholding procedure to these "man-made" frames. Finally, the binarized frames are sent to an OCR engine for recognition. Experiments show that the word recognition rate has been increased by over 28% by these methods.
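Block-based adaptive thresholding of the kind mentioned here can be sketched as follows. The tile size and the mean-minus-offset rule are illustrative assumptions, not the paper's parameters; the point is only that a per-tile threshold tolerates uneven backgrounds better than one global value.

```python
import numpy as np

def block_adaptive_binarize(gray, block=16, c=5.0):
    """Binarize an image with one threshold per block x block tile.

    Each tile is thresholded at (tile mean - c), so a brightness
    gradient across the frame does not swallow dark text strokes.
    """
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = gray[y:y+block, x:x+block].astype(float)
            t = tile.mean() - c
            out[y:y+block, x:x+block] = (tile > t).astype(np.uint8) * 255
    return out
```

On a frame whose background brightness ramps from 80 to 200 with dark strokes drawn on top, each stroke pixel falls below its local threshold even though no single global threshold would separate both strokes from both backgrounds.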

Journal ArticleDOI
TL;DR: In this paper, an automatic technique for the identification of printed Roman, Chinese, Arabic, Devnagari and Bangla text lines from a single document has been presented, using shape-based features, statistical features and some features obtained from the concept of water overflow from the reservoir.

Book ChapterDOI
19 Aug 2002
TL;DR: Techniques to identify the script of a word using Gabor filters with suitable frequencies and orientations are discussed and results obtained are quite encouraging.
Abstract: Identification of script in multi-lingual documents is essential for many language dependent applications such as machine translation and optical character recognition. Techniques for script identification generally require large areas for operation so that sufficient information is available. Such an assumption does not hold in the Indian context, as words of two different scripts are interspersed in most documents. In this paper, techniques to identify the script of a word are discussed. Two different approaches have been proposed and tested. The first method structures words into 3 distinct spatial zones and utilizes the information on the spatial spread of a word in the upper and lower zones, together with the character density, in order to identify the script. The second technique analyzes the directional energy distribution of a word using Gabor filters with suitable frequencies and orientations. Words with various font styles and sizes have been used for the testing of the proposed algorithms and the results obtained are quite encouraging.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: A complete system able to classify Arabic handwritten words of one hundred different writers is proposed and discussed, and successful recognition results are reported.
Abstract: Hidden Markov models (HMM) have been used with some success in recognizing printed Arabic words. In this paper, a complete scheme for totally unconstrained Arabic handwritten word recognition based on a model discriminant HMM is presented. A complete system able to classify Arabic handwritten words from one hundred different writers is proposed and discussed. The system first attempts to remove some of the variation in the images that does not affect the identity of the handwritten word. Next, the system codes the skeleton and edge of the word so that feature information about the lines in the skeleton is extracted. Then a classification process based on the HMM approach is used. The output is a word in the dictionary. A detailed experiment is carried out and successful recognition results are reported.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: A multiscale texture-based method using local energy analysis is proposed for hybrid Chinese/English text detection in images and video frames, achieving a low miss rate and false alarm rate on the tested dataset.
Abstract: We propose a multiscale texture-based method using local energy analysis for hybrid Chinese/English text detection in images and video frames. Local energy analysis has been shown to work well in text detection: high local energy variations of pixels correspond to text regions or the boundaries of other objects, while lower local energy variations correspond to background or the interior of non-text objects. Local energy variation is calculated in a local region based on the wavelet transform coefficients of the image. Hybrid Chinese/English text in images and video frames can be detected whether it is aligned horizontally or vertically, and the font size of the text to be detected may vary over a wide range. The proposed method has been tested on 321 frame images obtained from local TV programs and achieves a low miss rate and false alarm rate.

Journal ArticleDOI
TL;DR: A modified Topology Adaptive Self-Organizing Neural Network is proposed to extract a vector skeleton from a binary numeral image, with simple heuristics to prune artifacts, if any, in the skeletal shape.
Abstract: This paper proposes a novel approach to automatic recognition of handprinted Bangla (an Indian script) numerals. A modified Topology Adaptive Self-Organizing Neural Network is proposed to extract a vector skeleton from a binary numeral image. Simple heuristics are considered to prune artifacts, if any, in such a skeletal shape. Certain topological and structural features like loops, junctions, positions of terminal nodes, etc. are used along with a hierarchical tree classifier to classify handwritten numerals into smaller subgroups. Multilayer perceptron (MLP) networks are then employed to uniquely classify the numerals belonging to each subgroup. The system is trained using a sample data set of 1800 numerals and we have obtained 93.26% correct recognition rate and 1.71% rejection on a separate test set of another 7760 samples. In addition, a validation set consisting of 1440 samples has been used to determine the termination of the training algorithm of the MLP networks. The proposed scheme is sufficiently robust with respect to considerable object noise.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: Alternative choices of indexing terms are explored using both an existing electronic text collection and a newly developed collection built from images of actual printed Arabic documents, and character n-grams or lightly stemmed words were found to typically yield near-optimal retrieval effectiveness.
Abstract: Since many Arabic documents are available only in print, automating retrieval from collections of scanned Arabic document images using Optical Character Recognition (OCR) is an interesting problem. Arabic combines rich morphology with a writing system that presents unique challenges to OCR systems. These factors must be considered when selecting terms for automatic indexing. In this paper, alternative choices of indexing terms are explored using both an existing electronic text collection and a newly developed collection built from images of actual printed Arabic documents. Character n-grams or lightly stemmed words were found to typically yield near-optimal retrieval effectiveness, and combining both types of terms resulted in robust performance across a broad range of conditions.
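Character n-gram indexing of the kind evaluated here can be sketched as follows. The underscore padding convention and the default n are our own assumptions, and the light stemming applied to Arabic words is language-specific and omitted; the sketch only shows how word terms and n-gram terms are combined.

```python
def char_ngrams(word, n=3):
    """Overlapping character n-grams, with '_' marking word boundaries."""
    padded = "_" + word + "_"
    if len(padded) < n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def index_terms(text, n=3):
    """Combine whole-word terms with character n-grams.

    Indexing both term types at once mirrors the combined condition,
    which the paper found robust across a broad range of conditions.
    """
    terms = []
    for word in text.lower().split():
        terms.append(word)
        terms.extend(char_ngrams(word, n))
    return terms
```

N-grams make retrieval tolerant of isolated OCR character errors: a corrupted word still shares most of its n-grams with the correct form, so it still matches partially.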

Patent
Eric T. Eaton
12 Nov 2002
TL;DR: A method of limiting visual information that can be stored or transmitted: a visual image is captured, optical symbol recognition is performed on it, and the symbols generated by the recognition routine are compared to one or more predetermined symbols.
Abstract: A method carried out at a device ( 700 ), of limiting visual information that can be stored or transmitted captures a visual image ( 408 ). Whenever a control signal is detected ( 412 ) at the device ( 700 ), an optical symbol recognition of the image is performed. The method further includes comparing symbols generated by the optical symbol recognition routine to one or more predetermined symbols. When a symbol match to one or more predetermined symbols is generated, at least a portion of the visual image ( 408 ) is obscured.

Proceedings ArticleDOI
14 Oct 2002
TL;DR: An interface to textual information for the visually impaired that uses video, image processing, optical-character-recognition (OCR) and text-to-speech (TTS) is described.
Abstract: We describe the development of an interface to textual information for the visually impaired that uses video, image processing, optical character recognition (OCR) and text-to-speech (TTS). The video provides a sequence of low resolution images in which text must be detected, rectified and converted into high resolution rectangular blocks that are capable of being analyzed via off-the-shelf OCR. To achieve this, various problems related to feature detection, mosaicing, auto-focus, zoom, and systems integration were solved in the development of the system, and these are described.

Book ChapterDOI
TL;DR: In this paper, a process of expansion of the training set by synthetic generation of handwritten uppercase letters via deformations of natural images is tested in combination with an approximate k-Nearest Neighbor (k-NN) classifier.
Abstract: In this paper, a process of expansion of the training set by synthetic generation of handwritten uppercase letters via deformations of natural images is tested in combination with an approximate k-Nearest Neighbor (k-NN) classifier. It has been previously shown [11] [10] that approximate nearest neighbors search in large databases can be successfully used in an OCR task, and that significant performance improvements can be consistently obtained by simply increasing the size of the training set. In this work, extensive experiments adding distorted characters to the training set are performed, and the results are compared to directly adding new natural samples to the set of prototypes.
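Synthetic expansion of a training set by small deformations can be sketched as follows. The shear-plus-shift model, its parameter ranges and all names are our own illustrative assumptions, not the distortions used in the paper.

```python
import numpy as np

def random_affine_distort(img, rng, max_shift=2, max_shear=0.2):
    """Return a distorted copy of a character image.

    Applies a small random shear about the vertical center plus a
    random translation, with nearest-neighbor sampling.
    """
    h, w = img.shape
    shear = rng.uniform(-max_shear, max_shear)
    dy = int(rng.integers(-max_shift, max_shift + 1))
    dx = int(rng.integers(-max_shift, max_shift + 1))
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            sx = int(round(x - shear * (y - h / 2))) - dx
            sy = y - dy
            if 0 <= sy < h and 0 <= sx < w:
                out[y, x] = img[sy, sx]
    return out

def expand_training_set(images, labels, copies, seed=0):
    """Append `copies` distorted variants of each sample, keeping labels."""
    rng = np.random.default_rng(seed)
    new_imgs, new_labels = list(images), list(labels)
    for img, label in zip(images, labels):
        for _ in range(copies):
            new_imgs.append(random_affine_distort(img, rng))
            new_labels.append(label)
    return new_imgs, new_labels
```

The expanded set is then fed to the classifier exactly as if the distorted samples were new natural prototypes, which is the comparison the paper carries out.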

Proceedings ArticleDOI
14 Oct 2002
TL;DR: An effective approach for a PDA-based sign system that efficiently embeds multi-resolution, adaptive search in a hierarchical framework with different emphases at each layer, and introduces an intensity-based OCR method to recognize characters in various fonts and lighting conditions.
Abstract: In this paper, we propose an effective approach for a PDA-based sign system that presents the user with a sign translator. Its main functions comprise three parts: detection, recognition and translation. Automatic detection and recognition of text in natural scenes is a prerequisite for automatic sign translation. In order to make the system robust for text detection in various natural scenes, the detection approach efficiently embeds multi-resolution, adaptive search in a hierarchical framework with different emphases at each layer. We also introduce an intensity-based OCR method to recognize characters in various fonts and lighting conditions, where we employ the Gabor transform to obtain local features, and LDA for feature selection and classification. The recognition rate is 92.4% on a test set captured from natural signs. Sign text differs from ordinary sentences: it is brief, with many abbreviations and place nouns. Here we only briefly introduce a rule-based place-name translation. We have integrated all these functions on a PDA, which can capture a sign image, automatically segment and recognize the Chinese sign, and translate it into English.

Book ChapterDOI
19 Aug 2002
TL;DR: Recognition of Indian language characters has been a topic of interest for quite some time and the need for efficient and robust algorithms and systems for recognition is being felt in India, especially in the post and telegraph department where OCR can assist the staff in sorting mail.
Abstract: Document image processing and Optical Character Recognition (OCR) have been a frontline research area in the field of human-machine interfaces for the last few decades. Recognition of Indian language characters has been a topic of interest for quite some time. Earlier contributions were reported in [1] and [2]; more recent work is reported in [3] and [9]. The need for efficient and robust algorithms and systems for recognition is being felt in India, especially in the post and telegraph department, where OCR can assist staff in sorting mail. Character recognition can also form a part of applications like intelligent scanning machines, text-to-speech converters, and automatic language-to-language translators.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: The feature extraction method for Chinese character recognition is refined to improve the discriminability of histogram features, and the non-linear function used in previous research is modified to regulate the outputs of Gabor filters adaptively.
Abstract: This paper proposes a new feature extraction method for Chinese character recognition using optimized Gabor filters. Based on the theory of Gabor filters and the statistical information of Chinese character images, a simple but effective method to design Gabor filters is developed. Moreover, to improve performance on low quality images, we modify the non-linear function used in previous research to regulate the outputs of Gabor filters adaptively. The paper also refines the feature extraction method to improve the discriminability of histogram features. Experiments show that our method performs excellently for images with noise, complex backgrounds or stroke distortions, and can be applied to printed or handwritten character recognition tasks in low quality greyscale or binary images.

Proceedings ArticleDOI
11 Aug 2002
TL;DR: A combination of two confidence measures defined for a k-nearest neighbors (NN) classifier is proposed and experiments are presented comparing the performance of the same system with and without the new rejection rules.
Abstract: In handwritten character recognition, the rejection of extraneous patterns, like image noise, strokes or corrections, can significantly improve the practical usefulness of a system. In this paper a combination of two confidence measures defined for a k-nearest neighbors (NN) classifier is proposed. Experiments are presented comparing the performance of the same system with and without the new rejection rules.
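A k-NN classifier combining two confidence measures with a rejection option can be sketched as follows. Both measures (neighbor-vote agreement and nearest-neighbor distance) and their thresholds are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def knn_with_reject(train_x, train_y, query, k=5,
                    min_vote=0.8, max_dist=1.0):
    """Classify with k-NN, rejecting unreliable inputs.

    Two confidence measures are combined: the fraction of the k
    neighbors agreeing on the winning label, and the distance to the
    nearest neighbor. An input failing either check is rejected,
    which is how noise, stray strokes or corrections get filtered.
    """
    dists = np.linalg.norm(train_x - query, axis=1)
    idx = np.argsort(dists)[:k]
    labels = [train_y[i] for i in idx]
    winner = max(set(labels), key=labels.count)
    vote_conf = labels.count(winner) / k
    if vote_conf < min_vote or dists[idx[0]] > max_dist:
        return None  # reject: pattern too ambiguous or too far away
    return winner
```

A query deep inside one cluster is accepted; a query stranded between clusters fails the distance check and is rejected rather than guessed.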

Proceedings ArticleDOI
01 Jan 2002
TL;DR: A very general, theoretically optimal model is applied to the problem of OCR word correction, practical methods for parameter estimation are introduced, and performance on real data is evaluated.
Abstract: In this paper, we take a pattern recognition approach to correcting errors in text generated from printed documents using optical character recognition (OCR). We apply a very general, theoretically optimal model to the problem of OCR word correction, introduce practical methods for parameter estimation, and evaluate performance on real data.