
Showing papers on "Optical character recognition published in 1993"


Patent
23 Mar 1993
TL;DR: In this article, a method and system are disclosed for capturing and processing visually perceptible data, such as addresses or telephone numbers, within a broadcast video signal, in which selected video frames are captured, digitized, and stored in Tagged Image File Format (TIFF).
Abstract: A method and system are disclosed for capturing and processing visually perceptible data, such as address or telephone numbers, within a broadcast video signal. Optically recognizable text, numbers, or visual barcodes representative of text or numbers are transmitted within various frames of broadcast video signals. The broadcast video signal is then received and selected video frames are captured, digitized and stored as Tagged Image File Format (TIFF) in response to a user initiated command. Barcode readers or optical character recognition processes are then utilized to extract textual or numeric data from the captured video frames and that data is stored for future utilization. In one depicted embodiment a user defined template may be utilized to assist the optical character recognition process. Thereafter, an associated communication device, such as a modem, is utilized to automatically "dial" a captured telephone number by generating a series of DTMF tones associated with the captured telephone number, automatically establishing communication between the data processing system and an external location.

341 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present two new extraction techniques: a logical level technique and a mask-based subtraction technique for binary character/graphics image extraction from gray-scale document images.

189 citations


Proceedings ArticleDOI
Henry S. Baird1
20 Oct 1993
TL;DR: The author reviews the recent literature on explicit, quantitative, parameterized models of the image defects that occur during printing and scanning, and reports preliminary results in the estimation of the intrinsic error of precisely-specified text recognition problems.
Abstract: The accuracy of today's document recognition algorithms falls abruptly when image quality degrades even slightly. In an effort to surmount this barrier, researchers have in recent years intensified their study of explicit, quantitative, parameterized models of the image defects that occur during printing and scanning. The author reviews the recent literature and discusses the form these models might take. A preview of a large public-domain database of character images, labeled with ground-truth including all defect model parameters, is given. The use of massive pseudo-randomly generated training sets for the construction of high-performance decision trees for preclassification is described. In a more theoretical vein, the author reports preliminary results in the estimation of the intrinsic error of precisely-specified text recognition problems. Finally, the author calls attention to some open problems.
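Defect models of the kind surveyed here typically parameterize blur, sensor noise, and binarization. A minimal sketch, assuming a separable Gaussian point-spread function and a hard scanner threshold (the function name, parameters, and defaults are illustrative, not taken from any specific model in the review):

```python
import numpy as np

def degrade(glyph, blur_sigma=1.0, threshold=0.5, jitter=0.3, rng=None):
    """Apply a parameterized print/scan defect model to a binary glyph.

    glyph: 2D float array in [0, 1] (1 = ink).
    blur_sigma: point-spread width of the simulated optics, in pixels.
    threshold: binarization level of the simulated bilevel scanner.
    jitter: per-pixel additive sensor noise amplitude.
    """
    rng = rng or np.random.default_rng(0)
    # Separable Gaussian blur approximates the scanner's point-spread function.
    size = int(3 * blur_sigma) * 2 + 1
    x = np.arange(size) - size // 2
    kernel = np.exp(-x**2 / (2 * blur_sigma**2))
    kernel /= kernel.sum()
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 0, glyph)
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, blurred)
    # Additive sensor noise, then hard thresholding as in a bilevel scanner.
    noisy = blurred + rng.normal(0.0, jitter, glyph.shape)
    return (noisy > threshold).astype(np.uint8)
```

Sweeping such parameters over clean glyphs is how the pseudo-randomly generated training sets mentioned above can be produced.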

150 citations


Patent
Dan Shmuel Chevion1, Ittai Gilat1, Andre Heilper1, Oren Kagan1, Amir Kolsky1, Yoav Medan1, Eugene Walach1 
03 Aug 1993
TL;DR: In this article, a system is described comprising optical character recognition logic for generating, from the document image or images, character data specifying one of a plurality of possible character values for corresponding segments of the document images.
Abstract: A data entry system generates an electronically stored coded representation of a character sequence from one or more electronically stored document images. The system comprises optical character recognition logic for generating, from the document image or images, character data specifying one of a plurality of possible character values for corresponding segments of the document images. The system also has an interactive display means for generating and sequentially displaying one or more types of composite image, each composite image comprising segments of the document image or images arranged according to the character data, and a correction mechanism responsive to a user input operation to enable the operator to correct the character data associated with displayed segments.

131 citations


Proceedings ArticleDOI
TL;DR: Content-based retrieval is founded on neural networks; this technology allows automatic filing of images and a wide range of possible queries of the resulting database, in contrast to methods such as entering SQL keys manually for each image as it is filed and later correctly re-entering those keys to retrieve the same image.
Abstract: Content-based retrieval is founded on neural networks; this technology allows automatic filing of images and a wide range of possible queries of the resulting database. This is in contrast to methods such as entering SQL keys manually for each image as it is filed and later correctly re-entering those keys to retrieve the same image. An SQL-based approach does not take into account information that is hard to describe with text, such as sounds and images. Neural networks can be trained to translate `noisy' or chaotic image data into simpler, more reliable feature sets. By converting the images into the level of abstraction necessary for symbolic processing, standard database indexing methods can then be applied, or the features can be used directly in layers of associative database neural networks.

98 citations


Journal ArticleDOI
TL;DR: A method is introduced to combine and jointly optimize recognition and image normalization in optical character recognition algorithms based on pseudo two-dimensional (2D) hidden Markov models (HMMs); it provides a maximum likelihood estimate of the transformation parameters that can be used by higher-level modules in an intelligent document recognition system as an aid in the recognition process.

94 citations


Proceedings ArticleDOI
20 Oct 1993
TL;DR: The purpose of the current PE92 database project is to provide a comprehensive set of character image data to a developer of a recognition system so that the developer can concentrate on developing an algorithm.
Abstract: The purpose of the current PE92 database project is twofold. One is to provide a comprehensive set of character image data to a developer of a recognition system so that the developer can concentrate on developing an algorithm. The other is to offer a means by which an evaluator can compare various algorithms objectively. The authors collected 100 sets of KS 2350 handwritten Korean character images. They tried to collect as many writing styles as possible. The first 70 sets were generated by more than 500 different writers, and each of the remaining 30 sets was written by the same person. Writers wrote down the characters in prespecified boxes, and the database was created by scanning the data sheets with an image scanner. Each image is 100×100 pixels with 256 gray levels. Finally, the authors analyze the quality of the database created and calculate various statistics of the database PE92.

74 citations


Journal ArticleDOI
TL;DR: A general mechanism is presented for designing and training multi-modular architectures, integrating various neural networks into a unique pattern recognition system that is globally trained; it makes it possible to realize, within the system, feature extraction and recognition in successive, cooperatively trained modules.
Abstract: In practical applications, recognition accuracy is sometimes not the only criterion; capability to reject erroneous patterns might also be needed. We show that there is a trade-off between these two properties. An efficient solution to this trade-off is brought about by the use of different algorithms implemented in various modules, i.e. multi-modular architectures. We present a general mechanism for designing and training multi-modular architectures, integrating various neural networks into a unique pattern recognition system, which is globally trained. It is possible to realize, within the system, feature extraction and recognition in successive modules which are cooperatively trained. We discuss various rejection criteria for neural networks and multi-modular architectures. We then give two examples of such systems, study their rejection capabilities and show how to use them for segmentation. In handwritten optical character recognition, our system achieves performances at state-of-the-art level, but is eight times faster. In human face recognition, our system is intended to work in the real world.

71 citations


Journal ArticleDOI
TL;DR: The hybrid contextual algorithm for reading real-life documents printed in varying fonts of any size is presented; word-level hypotheses are generated using hybrid contextual text processing.
Abstract: The hybrid contextual algorithm for reading real-life documents printed in varying fonts of any size is presented. Text is recognized progressively in three passes. The first pass is used to generate character hypotheses, the second to generate word hypotheses, and the third to verify the word hypotheses. During the first pass, isolated characters are recognized using a dynamic contour warping classifier. Transient statistical information is collected to accelerate the recognition process and to verify hypotheses in later processing. A transient dictionary consisting of high-confidence nondictionary words is constructed in this pass. During the second pass, word-level hypotheses are generated using hybrid contextual text processing. Nondictionary words are recognized using a modified Viterbi algorithm, a string matching algorithm utilizing n-grams, special handlers for touching characters, and pragmatic handlers for numerals, punctuation, hyphens, apostrophes, and a prefix/suffix handler. This processing usually generates several word hypotheses. During the third pass, word-level verification occurs.
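A Viterbi step of the kind described, searching for the best character string over per-position OCR alternatives scored by a character bigram model, can be sketched as follows; the data layout, probability floor, and function name are assumptions for illustration, not the paper's implementation:

```python
import math

def viterbi_word(candidates, bigram, start="^"):
    """Best character string given per-position OCR alternatives.

    candidates: list of dicts {char: P(char | image segment)}, one per position.
    bigram: dict {(prev_char, char): P(char | prev_char)}; unseen pairs
    get a small floor probability.
    """
    floor = 1e-6
    # trellis maps each candidate character at the current position to
    # (best log-probability so far, best string ending in that character).
    trellis = {start: (0.0, "")}
    for alts in candidates:
        nxt = {}
        for ch, p_obs in alts.items():
            nxt[ch] = max(
                (lp + math.log(bigram.get((prev, ch), floor))
                     + math.log(p_obs), s + ch)
                for prev, (lp, s) in trellis.items()
            )
        trellis = nxt
    return max(trellis.values())[1]
```

With ambiguous middle letters, the bigram context tips the decision toward the linguistically likelier string, which is the role contextual processing plays in the second pass described above.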

67 citations


Proceedings ArticleDOI
27 Apr 1993
TL;DR: An algorithm for connected text recognition using enhanced planar hidden Markov models (PHMMs) is presented, which automatically segments text into characters as an integral part of the recognition process, thus jointly optimizing segmentation and recognition.
Abstract: An algorithm for connected text recognition using enhanced planar hidden Markov models (PHMMs) is presented. The algorithm automatically segments text into characters (even if they are highly blurred and touching) as an integral part of the recognition process, thus jointly optimizing segmentation and recognition. Performance is enhanced by the use of state length models, transition probabilities among characters (bigrams), and grammars. Experiments are presented using: (1) a simulated database of over 24000 highly degraded images of city names and (2) a database of 6000 images rejected by a high-performance commercial OCR (optical character recognition) machine with 99.5% accuracy. Measured performance on the first database is 99.65% for the most degraded images when a grammar is used, and 98.76% on the second database. Traditional OCR algorithms would fail drastically on these images.

61 citations


Proceedings ArticleDOI
20 Oct 1993
TL;DR: A new discrimination function for segmenting touching characters based on both pixel projection and profile projection is presented, and a dynamic recursive segmentation algorithm is developed for effectively segmenting touching characters.
Abstract: A new discrimination function for segmenting touching characters based on both pixel projection and profile projection is presented. A dynamic recursive segmentation algorithm is developed for effectively segmenting touching characters. Contextual information and a spelling checker are used to correct errors caused by incorrect recognition and segmentation. A top recognition accuracy of 99.85% has been achieved, with a minimum accuracy of 99.4%, based on 12 real documents.
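A much-simplified stand-in for a projection-based cut criterion: place the cut at the interior minimum of the vertical pixel projection, where touching strokes tend to be thinnest. The paper's actual discrimination function also uses profile projection and recurses dynamically; the function below, including its name and `min_width` guard, is an illustrative sketch only:

```python
import numpy as np

def split_touching(bitmap, min_width=3):
    """Propose a cut column for a pair of touching characters.

    bitmap: 2D array, nonzero = ink.
    min_width: minimum character width in columns, so the cut cannot
    fall at the very edge of the image.
    """
    # Vertical pixel projection: ink count per column.
    proj = (bitmap > 0).sum(axis=0)
    # Cut at the first interior minimum of the projection.
    interior = proj[min_width:-min_width]
    cut = min_width + int(np.argmin(interior))
    return bitmap[:, :cut], bitmap[:, cut:]
```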

Proceedings ArticleDOI
20 Oct 1993
TL;DR: The methodology uses diverse pattern recognition techniques, image processing algorithms (thresholding, underline removal, separation of lines, location and recognition of address components), and access to United States Postal Service databases to determine the DPC.
Abstract: Determining the delivery location for mail pieces based on handwritten addresses is a problem that trained humans can normally solve. As a problem in machine reading and interpretation, it presents many challenges. A method for determining the delivery point codes (DPCs) for handwritten addresses by computer is described. Solution to the task requires locating and recognizing address components (e.g., ZIP Code, street number, PO box number) and using multiple information sources to assign the DPC to an address. The methodology uses diverse pattern recognition techniques, image processing algorithms (thresholding, underline removal, separation of lines, location and recognition of address components), and access to United States Postal Service (USPS) databases to determine the DPC.

Proceedings ArticleDOI
08 Oct 1993
TL;DR: A general-purpose approach for enhancing the accuracy of optical character recognition by taking the view that the printed page is a data transmission channel and raising the possibility of error detecting/correcting codes designed specifically for the OCR process.
Abstract: A general-purpose approach for enhancing the accuracy of optical character recognition is described. By taking the view that the printed page is a data transmission channel, the authors raise the possibility of error detecting/correcting codes designed specifically for the OCR process. They present experimental results that demonstrate the feasibility of fully automated, 100% accurate OCR for computer typeset documents.

Proceedings ArticleDOI
27 Apr 1993
TL;DR: The problem of the automatic recognition of handwritten text is addressed using a left-to-right hidden Markov model (HMM) for each character that models the dynamics of the written script.
Abstract: The problem of the automatic recognition of handwritten text is addressed. The text to be recognized is captured online and the temporal sequence of the data is presented. The approach is based on a left-to-right hidden Markov model (HMM) for each character that models the dynamics of the written script. A mixture of Gaussian distributions is used to represent the output probabilities at each arc of the HMM. Several strategies for reestimating the model parameters are discussed. Experiments show that this approach results in significant decreases in error rate for the recognition of discretely written characters compared with elastic matching techniques. The HMM outperforms the elastic matching technique for both writer-dependent and writer-independent recognition tasks.

Patent
02 Sep 1993
TL;DR: In this paper, an optical character recognition system is presented which can extract information from documents into machine-readable form for selected inclusion into a database, using human classification through translucent ink pens of colors that correlate to field designations.
Abstract: An optical character recognition system which can extract information from documents into machine-readable form for selected inclusion into a database uses human classification through the use of translucent ink pens of colors which correlate to field designations. The ink pens, commonly known as highlighters, are used to mark the selected text. An optical scanner reads the marked document and converts it to electronic data which is stored into database fields according to the color-marked regions.

01 Oct 1993
TL;DR: The authors carried out evaluations using simulated OCR output on a variety of databases and found that high-quality OCR devices have little effect on the accuracy of retrieval, but low-quality devices used with databases of short documents can result in significant degradation.
Abstract: Optical Character Recognition (OCR) is a critical part of many text-based applications. Although some commercial systems use the output from OCR devices to index documents without editing, there is very little quantitative data on the impact of OCR errors on the accuracy of a text retrieval system. Because of the difficulty of constructing test collections to obtain this data, we have carried out evaluations using simulated OCR output on a variety of databases. The results show that high quality OCR devices have little effect on the accuracy of retrieval, but low quality devices used with databases of short documents can result in significant degradation.
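The simulation approach described can be approximated with a simple per-character substitution model. The function name, the default accuracy, and the uniform-letter substitution are illustrative assumptions, not the paper's noise model:

```python
import random
import string

def simulate_ocr(text, char_accuracy=0.95, rng=None):
    """Corrupt clean text at a given per-character accuracy.

    Each letter independently survives with probability char_accuracy;
    otherwise it is replaced by a random lowercase letter, mimicking an
    OCR substitution error. Non-letters are left untouched.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducible trials
    out = []
    for ch in text:
        if ch.isalpha() and rng.random() > char_accuracy:
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)
    return "".join(out)
```

Indexing both the clean and the corrupted versions of a test collection and comparing retrieval results is the kind of experiment the evaluation above rests on.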

Proceedings ArticleDOI
20 Oct 1993
TL;DR: The authors have designed a writer-adaptable character recognition system for online characters entered on a touch terminal that is based on a Time Delay Neural Network that is pre-trained on examples from many writers to recognize digits and uppercase letters.
Abstract: The authors have designed a writer-adaptable character recognition system for online characters entered on a touch terminal. It is based on a Time Delay Neural Network (TDNN) that is pre-trained on examples from many writers to recognize digits and uppercase letters. The TDNN without its last layer serves as a preprocessor for an optimal hyperplane classifier that can be easily retrained to peculiar writing styles. This combination allows for fast writer-dependent learning of new letters and symbols. The system is memory and speed efficient.

Proceedings ArticleDOI
Yi Lu1
20 Oct 1993
TL;DR: The problem of segmenting touching characters in various fonts and sizes in machine-printed text is addressed, and different methods for detecting multiple character segments and for segmenting touching characters in these categories are developed.
Abstract: In many OCR systems, character segmentation is a necessary preprocessing step for character recognition. It is a critical step because incorrectly segmented characters are not likely to be correctly recognized. The most difficult cases in character segmentation are broken characters and touching characters. The problem of segmenting touching characters in various fonts and sizes in machine-printed text is addressed. The author classifies the touching characters into five categories: touching characters in fixed-pitch fonts, proportional and serif fonts, ambiguous touching characters, and strings with broken and touching characters. Different methods for detecting multiple character segments and for segmenting touching characters in these categories are developed. The methods use features of characters and fonts and profile models.

Journal ArticleDOI
TL;DR: A new set of efficient two-dimensional moments is introduced, which are invariant under rotation, translation and scale of the image, less sensitive to noise and appear to have better classification performance over the existing sets of moments.

Proceedings ArticleDOI
S.-s. Kuo1, O.E. Agazzi1
27 Apr 1993
TL;DR: An algorithm for robust machine recognition of keywords embedded in a poorly printed document is presented, where two statistical models, called pseudo-2D hidden Markov models (P2-DHMMs), are created for representing the actual keyword and all the other extraneous words, respectively.
Abstract: An algorithm for robust machine recognition of keywords embedded in a poorly printed document is presented. For each keyword, two statistical models, called pseudo-2D hidden Markov models (P2-DHMMs), are created for representing the actual keyword and all the other extraneous words, respectively. Dynamic programming is then used for matching an unknown input word with the two models and making a maximum likelihood decision. Although the models are pseudo 2-D in the sense that they are not fully connected 2-D networks, they are shown to be general enough to characterize printed words efficiently. These models facilitate a nice 'elastic matching' property in both horizontal and vertical directions, which makes the recognizer not only independent of size and slant but also tolerant of highly deformed and noisy words. The system is evaluated on a synthetically created database which contains about 26000 words. A recognition accuracy of 99% is achieved when words in testing and training sets are in the same font size. An accuracy of 96% is achieved when they are in different sizes. In the latter case, the conventional 1-D HMM approach achieves only a 70% accuracy rate.

Proceedings ArticleDOI
M. Hamanaka1, Keiji Yamada, J. Tsukumo
20 Oct 1993
TL;DR: It is shown that an offline character recognition method is effective for use in online Japanese character recognition, and it has been improved with developments in nonlinear shape normalization, nonlinear pattern matching, and the normalization-cooperated feature extraction method.
Abstract: It is shown that an offline character recognition method is effective for use in online Japanese character recognition. Major conventional online recognition methods have restricted the number and the order of strokes. The offline method removes these restrictions, based on pattern matching of orientation feature patterns. It has been improved with developments in nonlinear shape normalization, nonlinear pattern matching, and the normalization-cooperated feature extraction method. It was used to examine 52,944 online Kanji characters in 1,064 categories. The recognition rate reached 95.1%, and the cumulative recognition rate within the best five candidates was 99.3%.

Patent
Jasinski Leon1
19 Jul 1993
TL;DR: In this paper, a method is disclosed for transmitting and receiving encoded data generated from input data comprising readable text characters (102) in a communication system having an optical character recognition element (206) and a graphic encoding element (208), the communication system also having a transmitter and a receiver (116).
Abstract: A method of transmitting and receiving encoded data generated from input data comprising readable text characters (102) in a communication system having an optical character recognition element (206) and a graphic encoding element (208), the communication system also having a transmitter (114) and a receiver (116), comprises accepting (602) the input data comprising the readable text characters (102) by a facsimile input (202). The method further comprises encoding (620, 622) as character code format data the readable text characters (102) received that are recognizable by the optical character recognition element (206), and encoding (610, 612, 614) as graphic code format data the readable text characters (102) received that are not recognizable by the optical character recognition element (206). The method further comprises assembling (632) the character code format data and the graphic code format data into an output data stream, the output data stream including information that describes original sizes and positions relative to one another of the readable text characters (102).

Proceedings ArticleDOI
20 Oct 1993
TL;DR: Requirements for the objective evaluation of automated data-entry systems are presented; different measures of accuracy (error metrics) are appropriate for different applications, and at the character, word, text-line, text-block, and document levels.
Abstract: Requirements for the objective evaluation of automated data-entry systems are presented. Because the cost of correcting errors dominates the document conversion process, the most important characteristic of an OCR device is accuracy. However, different measures of accuracy (error metrics) are appropriate for different applications, and at the character, word, text-line, text-block, and document levels. For wholly objective assessment, OCR devices must be tested under programmed, rather than interactive, control.
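As an example of one such error metric, character-level accuracy is commonly derived from edit distance between the ground truth and the OCR output. This sketch shows one common formulation, not necessarily the paper's definition:

```python
def char_accuracy(truth, ocr):
    """Character-level accuracy: 1 - edit_distance(truth, ocr) / len(truth).

    Uses the classic dynamic-programming Levenshtein distance, so
    insertions, deletions, and substitutions each count as one error.
    """
    m, n = len(truth), len(ocr)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i          # deleting all of truth[:i]
    for j in range(n + 1):
        d[0][j] = j          # inserting all of ocr[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if truth[i - 1] == ocr[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return 1.0 - d[m][n] / max(m, 1)
```

Word-, line-, and block-level metrics are defined analogously over larger units, which is why a single accuracy figure rarely suffices across applications.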

Journal ArticleDOI
TL;DR: An omnifont classifier produced using the SMR modeling procedure outperforms a state-of-the-art OCR system.

Proceedings ArticleDOI
Richard G. Casey1, Stephen K. Boyer1, P. Healey1, Alex Miller1, B. Oudot1, K. Zilles1 
20 Oct 1993
TL;DR: A prototype system for encoding chemical structure diagrams from scanned printed documents is described, and the final coded output interfaces to conventional chemistry software for database storage and retrieval, publishing, and modeling.
Abstract: A prototype system for encoding chemical structure diagrams from scanned printed documents is described. The system distinguishes a structure diagram from other printed material on a page image using size and spacing characteristics. It distinguishes line graphics from symbols in an intermediate vectorization stage. Line information is mapped into a connection diagram that represents atomic bonds. Atomic symbols are identified by means of chemical drawing conventions and optical character recognition. The final coded output interfaces to conventional chemistry software for database storage and retrieval, publishing, and modeling.

Proceedings ArticleDOI
20 Oct 1993
TL;DR: A novel approach that performs OCR without the segmentation step was developed, and it is shown that even if some of the features are occluded or lost due to degradation, the remaining features can successfully identify the character.
Abstract: Segmentation is a key step in current OCR systems. It has been estimated that half the errors in character recognition are due to segmentation. A novel approach that performs OCR without the segmentation step was developed. The approach starts by extracting significant geometric features from the input document image of the page. Each feature then votes for the character that could have generated that feature. Thus, even if some of the features are occluded or lost due to degradation, the remaining features can successfully identify the character. In extreme cases, the degradation may be severe enough to prevent recognition of some of the characters in a word. In such cases, a lexicon-based word recognition technique is used to resolve ambiguity. Inexact matching and probabilistic evaluation used in the technique make it possible to identify the correct word, by detecting a partial set of characters. The authors first present an overview of their segmentation-free OCR system and then focus on the word recognition technique. Preliminary experimental results show that this is a very promising approach.

Proceedings ArticleDOI
20 Oct 1993
TL;DR: A cheque processing system currently under development, based on a psychological model of the reading process for a fast reader, and the module for extracting graphical clues, implemented with the techniques of mathematical morphology, is discussed.
Abstract: A cheque processing system currently under development is described. More precisely, the cursive script recognition module for the legal amount is discussed. Commonly, systems perform recognition either on a character by character basis, or on a word level. The authors investigate the recognition at a higher level of abstraction, at the sentence level. Knowledge of context, orthography, syntax and semantics is used to supplement the information from the graphical input. The system is based on a psychological model of the reading process for a fast reader. The module for extracting graphical clues, implemented with the techniques of mathematical morphology, is discussed.

Proceedings ArticleDOI
20 Oct 1993
TL;DR: Preliminary results on a new approach to document image binarization, an algorithm based on gray scale histogram and run-length histogram analysis, show that over 99% of such address blocks can be correctly binarized.
Abstract: Document image binarization is not a completely solved problem for unconstrained document images. Binarization algorithms, whether global or local, can easily fail on images with noisy or complex background, or poor contrast. The authors report preliminary results on a new approach to document image binarization, an algorithm based on gray scale histogram and run-length histogram analysis. Experimental results on unconstrained machine printed address blocks from the US letter mail stream show that over 99% of such address blocks can be correctly binarized.
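A classical global, gray-level-histogram-based binarizer (Otsu's method) illustrates the histogram-analysis half of such an approach; the paper's algorithm additionally analyzes run-length histograms, which this sketch does not reproduce:

```python
import numpy as np

def otsu_threshold(image):
    """Global threshold maximizing between-class variance (Otsu's method).

    image: 2D uint8 array of gray levels in [0, 255].
    Returns the threshold t; pixels > t are foreground (or background,
    depending on polarity).
    """
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w0, sum0 = 0.0, 0.0
    for t in range(256):
        w0 += hist[t]                      # weight of the low class
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        m0 = sum0 / w0                     # mean of the low class
        m1 = (sum_all - sum0) / (total - w0)  # mean of the high class
        var_between = w0 * (total - w0) * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Such a global method fails exactly in the cases the abstract names (noisy or complex backgrounds, poor contrast), which motivates the combined histogram analysis described above.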

Proceedings ArticleDOI
B. Plessis1, A. Sicsu1, Laurent Heutte1, E. Menu1, Eric Lecolinet1, O. Debon1, J.V. Moreau1 
20 Oct 1993
TL;DR: A recognition scheme for reading handwritten cursive words using three word recognition techniques is described, with the focus on the implementation used to combine the three techniques based on a comparative study of different strategies.
Abstract: A recognition scheme for reading handwritten cursive words using three word recognition techniques is described. The focus is on the implementation used to combine the three techniques based on a comparative study of different strategies. The first, holistic recognition technique derives a global encoding of the word. The other techniques both rely on the segmentation of the word into letters, but differ in the character classifier they use. The former runs a statistical linear classifier, and the latter runs a neural network with a different representation of the input data. The testing, comparison, and combination studies have been performed on word images from mail provided by the USPS. The top-choice recognition rates achieved so far are 88%, 76%, and 65% for lexicon sizes of 10, 100, and 1000 words.

Patent
07 Apr 1993
TL;DR: In this paper, a neural network based optical character recognition technique is presented for identifying characters in a moving web, which is particularly useful for reading dot-matrix-type characters on a noisy, semi-transparent background.
Abstract: A neural network based optical character recognition technique is presented for identifying characters in a moving web. Image acquisition means defines an imaging window through which the moving web passes such that the characters printed thereon can be imaged. Classification data is extracted and accumulated for each printed web character passing through the imaging window. A light source provides transmissive illumination of the web as it is being imaged. A neural network accelerator is coupled to the image acquisition means for intelligent processing of the accumulated classification data to produce therefrom printed character classification information indicative of each corresponding character imaged. A processor is coupled to the accelerator for converting the classification information into the appropriate ASCII character code. The technique is particularly useful for reading dot-matrix-type characters on a noisy, semi-transparent background at fast real-time rates. A neural network algorithm based recognition method is also described.