
Showing papers on "Optical character recognition published in 1995"


Patent
Gregory J. Wolff, David G. Stork
01 Nov 1995
TL;DR: A pen-like instrument with a writing point for making written entries upon a physical document, while sensing the three-dimensional forces exerted on the writing tip and the motion associated with the act of writing, is described.
Abstract: A manual entry interactive paper and electronic document handling and process system uses a pen-like instrument (PI) with a writing point for making written entries upon a physical document and sensing the three-dimensional forces exerted on the writing tip as well as the motion associated with the act of writing. The PI is also equipped with a CCD array for reading pre-printed bar codes used for identifying document pages and other application defined areas on the page, as well as for providing optical character recognition data. A communication link between the PI and an associated base unit transfers the transducer data from the PI. The base unit includes a programmable processor, a display, and a communication link receiver. The processor includes programs for written character and word recognition, memory for storage of an electronic version of the physical document and any hand-written additions to the document. The display unit displays the corresponding electronic version of the physical document on a CRT or LCD as a means of feedback to the user and for use by authorized electronic agents.

1,024 citations


Journal ArticleDOI
TL;DR: Two methods for automatically locating text in complex color images are presented, one of which computes the local spatial variation in the gray-scale image and locates text in regions with high variance.

362 citations


Journal ArticleDOI
TL;DR: This paper introduces the general topic of optical character recognition (OCR), presents a five-stage model for AOTR systems, classifies research work according to this model, and gives a historical review of Arabic text recognition systems.

260 citations


Journal ArticleDOI
Yi Lu
TL;DR: An overview of the character segmentation techniques in machine-printed documents is presented, covering techniques for segmenting uniform or proportional fonts and broken or touching characters, techniques based on text image features, and techniques based on recognition results.

206 citations


Book
01 Jan 1995
TL;DR: This paper presents two new extraction techniques, a logical level technique and a mask-based subtraction technique; evaluation shows one of the new techniques to be superior to the rest, suggesting its suitability for high-speed, low-cost applications.
Abstract: The extraction of binary character/graphics images from gray-scale document images with background pictures, shadows, highlight, smear, and smudge is a common critical image processing operation, particularly for document image analysis, optical character recognition, check image processing, image transmission, and videoconferencing. After a brief review of previous work with emphasis on five published extraction techniques, viz., a global thresholding technique, YDH technique, a nonlinear adaptive technique, an integrated function technique, and a local contrast technique, this paper presents two new extraction techniques: a logical level technique and a mask-based subtraction technique. With experiments on images of a typical check and a poor-quality text document, this paper systematically evaluates and analyses both new and published techniques with respect to six aspects, viz., speed, memory requirement, stroke width restriction, parameter number, parameter setting, and human subjective evaluation of result images. Experiments and evaluations have shown that one new technique is superior to the rest, suggesting its suitability for high-speed low-cost applications.

204 citations
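The two extraction techniques introduced in this paper are not reproduced here, but the family of local methods it evaluates can be illustrated briefly. Below is a minimal sketch of local adaptive binarization using a Niblack-style rule as a stand-in; the window size and the weight k are illustrative assumptions, not parameters from the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_binarize(gray, window=25, k=-0.2):
    """Niblack-style local thresholding: T = local mean + k * local std.

    A generic stand-in for the local extraction techniques surveyed above;
    it is NOT the logical level or mask-based subtraction technique itself.
    """
    gray = gray.astype(np.float64)
    mean = uniform_filter(gray, size=window)
    sq_mean = uniform_filter(gray * gray, size=window)
    std = np.sqrt(np.maximum(sq_mean - mean * mean, 0.0))
    threshold = mean + k * std
    # Dark ink on a light background: foreground where the pixel falls below the threshold.
    return (gray < threshold).astype(np.uint8)
```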


Proceedings ArticleDOI
01 Dec 1995
TL;DR: Algorithms are developed for automatic character segmentation in motion pictures that reliably extract the text in pre-title sequences, credit titles, and closing sequences with title and credits.
Abstract: We have developed algorithms for automatic character segmentation in motion pictures which automatically and reliably extract the text in pre-title sequences, credit titles, and closing sequences with title and credits. The algorithms we propose make use of typical characteristics of text in videos in order to enhance segmentation and, consequently, recognition performance. As a result, we get segmented characters from video pictures. These can be parsed by any OCR software. The recognition results for multiple instances of the same character throughout subsequent frames are combined to improve recognition and to compute the final output. We have tested our segmentation algorithms in a series of experiments with video clips recorded from television and achieved good segmentation results.

184 citations
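The frame-combination step mentioned in the abstract can be illustrated with a minimal sketch. The per-position majority vote below is a plausible stand-in for the paper's combination rule, which may be more elaborate (e.g., confidence-weighted).

```python
from collections import Counter

def combine_frame_results(per_frame_strings):
    """Combine OCR outputs for the same text line observed in several frames.

    A simple per-position majority vote; the paper's actual combination rule
    may differ.
    """
    if not per_frame_strings:
        return ""
    length = max(len(s) for s in per_frame_strings)
    combined = []
    for i in range(length):
        votes = Counter(s[i] for s in per_frame_strings if i < len(s))
        combined.append(votes.most_common(1)[0][0])
    return "".join(combined)

print(combine_frame_results(["T1tle", "Title", "Titlc"]))  # -> "Title"
```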


Journal ArticleDOI
TL;DR: A method for the off-line recognition of cursive handwriting based on hidden Markov models (HMMs) is described, which achieves an average correct recognition rate of over 98% at the word level in experiments with cooperative writers using two dictionaries of 150 words each.

183 citations


Proceedings ArticleDOI
27 Nov 1995
TL;DR: First experiments along highways in the Netherlands show that the CLPR-system has an error rate of 0.02% at a recognition rate of 98.51%.
Abstract: A car license plate recognition system (CLPR-system) has been developed to identify vehicles by the contents of their license plate for speed-limit enforcement. This type of application puts high demands on the reliability of the CLPR-system. A combination of neural and fuzzy techniques is used to guarantee a very low error rate at an acceptable recognition rate. First experiments along highways in the Netherlands show that the system has an error rate of 0.02% at a recognition rate of 98.51%. These results are also compared with other published CLPR-systems.

180 citations
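The reported trade-off between error rate and recognition rate is governed by how aggressively low-confidence reads are rejected. A minimal sketch of computing that trade-off from confidence-scored results follows; the data and threshold values are purely illustrative.

```python
def plate_reader_tradeoff(results, threshold):
    """Compute recognition and error rates for a given confidence threshold.

    `results` is a list of (confidence, is_correct) pairs, one per plate.
    Reads below the threshold are rejected (neither recognized nor errors).
    """
    total = len(results)
    accepted = [(c, ok) for c, ok in results if c >= threshold]
    recognized = sum(1 for _, ok in accepted if ok)
    errors = len(accepted) - recognized
    return recognized / total, errors / total

# Illustrative data: raising the threshold trades recognition rate for fewer errors.
sample = [(0.99, True)] * 97 + [(0.80, True)] * 2 + [(0.85, False)]
print(plate_reader_tradeoff(sample, 0.90))  # high threshold: fewer errors
print(plate_reader_tradeoff(sample, 0.50))  # low threshold: more errors
```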


Patent
23 Mar 1995
TL;DR: In this article, the authors present a system for coding medical data, where the input data is text describing a medical diagnosis and operation which would be dictated or recorded by a surgeon subsequent to an operation being performed on a patient.
Abstract: The present invention relates to a system for coding data. An example implementation is disclosed whereby the coding system is a computer program especially suited to analysing text input to the computer by, for example, a keyboard, optical character recognition, or voice recognition. The data to be coded may, for example, comprise information relating to an event, item or operation. In the preferred form of the invention, the input data is text describing a medical diagnosis and operation, dictated or recorded by a surgeon subsequent to an operation being performed on a patient. The coding system of the present invention analyses each word or term of the medical information in conjunction with specialised and generalised dictionaries of words and terms, along with the relationships between individual words or terms. In this way, in addition to producing a compressed symbolic representation of the original information which may later be interrogated or used for statistical analysis, the present invention is also capable of correcting or supplementing the original information.

162 citations


Proceedings ArticleDOI
14 Aug 1995
TL;DR: The proposed algorithm has been used to locate text in compact disc and book cover images, as well as in the images of traffic scenes captured by a video camera, and initial results suggest that these algorithms can be used in image retrieval applications.
Abstract: There is a substantial interest in retrieving images from a large database using the textual information contained in the images. An algorithm which will automatically locate the textual regions in the input image will facilitate this task; the optical character recognizer can then be applied to only those regions of the image which contain text. We present a method for automatically locating text in complex color images. The algorithm first finds the approximate locations of text lines using horizontal spatial variance, and then extracts text components in these boxes using color segmentation. The proposed method has been used to locate text in compact disc (CD) and book cover images, as well as in the images of traffic scenes captured by a video camera. Initial results are encouraging and suggest that these algorithms can be used in image retrieval applications.

154 citations
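A minimal sketch of the first step described in the abstract, locating candidate text bands from horizontal spatial variance in the gray-scale image, is given below. The variance threshold is an illustrative assumption, and the subsequent color-segmentation step is omitted.

```python
import numpy as np

def candidate_text_rows(gray, var_threshold=300.0):
    """Flag image rows whose horizontal intensity variation is high.

    Text rows alternate rapidly between ink and background, so the variance
    of horizontal differences is large there. The threshold is an
    illustrative assumption.
    """
    gray = gray.astype(np.float64)
    horiz_diff = np.diff(gray, axis=1)      # horizontal intensity changes
    row_variance = horiz_diff.var(axis=1)   # one value per row
    return row_variance > var_threshold     # boolean mask of candidate rows

def rows_to_bands(mask):
    """Group consecutive flagged rows into (top, bottom) text bands."""
    bands, start = [], None
    for i, flagged in enumerate(mask):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            bands.append((start, i - 1))
            start = None
    if start is not None:
        bands.append((start, len(mask) - 1))
    return bands
```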


Journal ArticleDOI
TL;DR: The performance of 10 parallel thinning algorithms is reported from this perspective by gathering statistics from their performance on large sets of data and examining the effects of the different thinning algorithms on an OCR system.
Abstract: Skeletonization algorithms have played an important role in the preprocessing phase of OCR systems. In this paper we report on the performance of 10 parallel thinning algorithms from this perspective by gathering statistics from their performance on large sets of data and examining the effects of the different thinning algorithms on an OCR system.
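The kind of skeleton statistics such an evaluation gathers can be sketched briefly. The example below uses scikit-image's skeletonize as a stand-in for the ten parallel thinning algorithms compared in the paper and records two simple measures (skeleton size and endpoint count).

```python
import numpy as np
from skimage.morphology import skeletonize

def skeleton_stats(binary_char):
    """Thin a binary character image and gather simple skeleton statistics."""
    skel = skeletonize(binary_char.astype(bool))
    # Count 8-connected neighbours of every skeleton pixel.
    padded = np.pad(skel, 1).astype(int)
    neighbours = sum(
        np.roll(np.roll(padded, dy, 0), dx, 1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)
    )[1:-1, 1:-1]
    endpoints = int(np.sum(skel & (neighbours == 1)))
    return {"skeleton_pixels": int(skel.sum()), "endpoints": endpoints}
```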

Journal ArticleDOI
TL;DR: A neural network approach is introduced to perform high accuracy recognition on multi-size and multi-font characters; a novel centroid-dithering training process with a low noise-sensitivity normalization procedure is used to achieve high accuracy results.
Abstract: Optical character recognition (OCR) refers to a process whereby printed documents are transformed into ASCII files for the purpose of compact storage, editing, fast retrieval, and other file manipulations through the use of a computer. The recognition stage of an OCR process is made difficult by added noise, image distortion, and the various character typefaces, sizes, and fonts that a document may have. In this study a neural network approach is introduced to perform high accuracy recognition on multi-size and multi-font characters; a novel centroid-dithering training process with a low noise-sensitivity normalization procedure is used to achieve high accuracy results. The study consists of two parts. The first part focuses on single size and single font characters, and a two-layered neural network is trained to recognize the full set of 94 ASCII character images in 12-pt Courier font. The second part trades accuracy for additional font and size capability, and a larger two-layered neural network is trained to recognize the full set of 94 ASCII character images for all point sizes from 8 to 32 and for 12 commonly used fonts. The performance of these two networks is evaluated based on a database of more than one million character images from the testing data set.
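The centroid-dithering idea can be illustrated with a short sketch: center each character image on its centroid, then generate small random shifts as additional training samples. The offsets and output size below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def center_on_centroid(char_img, out_size=32):
    """Place a binary character image so its centroid sits at the output center."""
    ys, xs = np.nonzero(char_img)
    cy, cx = ys.mean(), xs.mean()
    out = np.zeros((out_size, out_size), dtype=char_img.dtype)
    dy = int(round(out_size / 2 - cy))
    dx = int(round(out_size / 2 - cx))
    for y, x in zip(ys, xs):
        ny, nx = y + dy, x + dx
        if 0 <= ny < out_size and 0 <= nx < out_size:
            out[ny, nx] = char_img[y, x]
    return out

def dithered_samples(char_img, n_samples=10, max_shift=2, seed=0):
    """Generate training samples by shifting the centered character slightly.

    An illustrative take on centroid dithering: each extra sample is the
    centered image rolled by a small random offset.
    """
    rng = np.random.default_rng(seed)
    centered = center_on_centroid(char_img)
    samples = [centered]
    for _ in range(n_samples - 1):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        samples.append(np.roll(np.roll(centered, dy, axis=0), dx, axis=1))
    return samples
```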

Journal ArticleDOI
TL;DR: A methodology for automatically assessing the accuracy of the page decompositions produced by optical character recognition systems is presented, and its use in evaluating six OCR systems is demonstrated.
Abstract: Many current optical character recognition (OCR) systems attempt to decompose printed pages into a set of zones, each containing a single column of text, before converting the characters into coded form. The authors present a methodology for automatically assessing the accuracy of such decompositions, and demonstrate its use in evaluating six OCR systems.

Proceedings ArticleDOI
23 Oct 1995
TL;DR: This paper presents efficient algorithms for determining the language classification of machine-generated documents without requiring the identification of individual characters, using methods that are less computationally intensive than explicit segmentation.
Abstract: This paper presents efficient algorithms for determining the language classification of machine generated documents without requiring the identification of individual characters. Such algorithms may be useful for sorting and routing of facsimile documents as they arrive so that appropriate routing and secondary analysis, which may include OCR, is selected for each document. It may also prove useful as a component of a content addressable document access system. There have been numerous reported efforts which attempt to segment printed documents into homogeneous regions using Hough transforms, hidden Markov models, morphological filtering, and neural networks. However, language identification can be accomplished without explicit segmentation using the less computationally intensive methods described.

Proceedings ArticleDOI
14 Aug 1995
TL;DR: In this paper, a new methodology for character segmentation and recognition which makes the best use of the characteristics of gray-scale images is proposed.
Abstract: Generally speaking, through the binarization of gray-scale images, useful information for the segmentation of touching or overlapping characters may be lost. If we analyze gray-scale images, however, specific topographic features and the variation of intensity can be observed in the character boundaries. We believe that such kinds of clues obtained from gray-scale images should be useful for efficient character segmentation. In this paper, we propose a new methodology for character segmentation and recognition which makes the best use of the characteristics of gray-scale images. In the proposed methodology, the character segmentation regions are determined by using projection profiles and topographic features extracted from gray-scale images. Then the nonlinear character segmentation path in each character segmentation region is found by using a multistage graph search algorithm. Finally, in order to confirm the character segmentation paths and recognition results, a recognition-based segmentation method is adopted.
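A minimal sketch of the projection-profile part of this approach is shown below: columns with little ink are grouped into candidate segmentation regions. The topographic features and the multistage graph search are omitted, and the thresholds are illustrative assumptions.

```python
import numpy as np

def candidate_cut_regions(gray, ink_threshold=128, profile_threshold=2):
    """Find candidate character segmentation regions from a vertical projection.

    Columns whose ink projection falls below `profile_threshold` are grouped
    into regions where a segmentation path may be placed; the paper then finds
    a nonlinear path inside each region, which is not reproduced here.
    """
    ink = (gray < ink_threshold).astype(int)   # dark pixels count as ink
    profile = ink.sum(axis=0)                  # ink pixels per column
    low = profile <= profile_threshold
    regions, start = [], None
    for x, is_low in enumerate(low):
        if is_low and start is None:
            start = x
        elif not is_low and start is not None:
            regions.append((start, x - 1))
            start = None
    if start is not None:
        regions.append((start, len(low) - 1))
    return regions
```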

Proceedings ArticleDOI
14 Aug 1995
TL;DR: A system is described that automatically identifies the script used in documents stored electronically in image form by comparing a subset of symbols from the document to each script's templates, screening out rare or unreliable templates, and choosing the script whose templates provide the best match.
Abstract: We describe a system that automatically identifies the script used in documents stored electronically in image form. The system can learn to distinguish any number of scripts. It develops a set of representative symbols (templates) for each script by clustering textual symbols from a set of training documents and representing each cluster by its centroid. "Textual symbols" include discrete characters in scripts such as Cyrillic, as well as adjoined characters, character fragments, and whole words in connected scripts such as Arabic. To identify a new document's script, the system compares a subset of symbols from the document to each script's templates, screening out rare or unreliable templates, and choosing the script whose templates provide the best match. Our current system, trained on thirteen scripts, correctly identifies all test documents except those printed in fonts that differ markedly from fonts in the training set.

Patent
11 Apr 1995
TL;DR: In this article, a document to be processed is scanned into a machine readable image, and the image is segmented into a plurality of fields. Predetermined characteristics are measured for each field and the set of characteristics is correlated with a predetermined set of attributes derived from a reference image.
Abstract: A document to be processed is scanned into a machine readable image. The image is segmented into a plurality of fields. Predetermined characteristics are measured for each field and the set of characteristics is correlated with a predetermined set of characteristics derived from a reference image. The fields with the highest degree of correlation to the characteristics from the reference document are selected for further processing, e.g., optical character recognition.

Journal ArticleDOI
TL;DR: This paper proposes a new method for the direct extraction of topographic features from gray-scale character images that computes the directions of principal curvature efficiently and prevents the extraction of unnecessary features.
Abstract: Optical character recognition (OCR) traditionally applies to binary-valued imagery although text is always scanned and stored in gray scale. However, binarization of a multivalued image may remove important topological information from characters and introduce noise to the character background. In order to avoid this problem, it is indispensable to develop a method which can minimize the information loss due to binarization by extracting features directly from gray scale character images. In this paper, we propose a new method for the direct extraction of topographic features from gray scale character images. By comparing the proposed method with Wang and Pavlidis' method, we realized that the proposed method enhanced the performance of topographic feature extraction by computing the directions of principal curvature efficiently and prevented the extraction of unnecessary features. We also show that the proposed method is very effective for gray scale skeletonization compared to Levi and Montanari's method.
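The principal curvatures that drive such topographic features are the eigenvalues of the image Hessian. A minimal sketch follows; the Gaussian smoothing and the stroke-detection threshold are illustrative assumptions rather than the authors' exact formulation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def principal_curvatures(gray, sigma=1.5):
    """Eigenvalues of the Hessian at every pixel of a smoothed gray-scale image."""
    g = gaussian_filter(gray.astype(np.float64), sigma)
    gy, gx = np.gradient(g)
    gyy, _ = np.gradient(gy)
    gxy, gxx = np.gradient(gx)
    # Closed-form eigenvalues of the symmetric 2x2 Hessian [[gxx, gxy], [gxy, gyy]].
    trace = gxx + gyy
    diff = gxx - gyy
    root = np.sqrt(diff * diff / 4.0 + gxy * gxy)
    return trace / 2.0 + root, trace / 2.0 - root

def stroke_mask(gray, sigma=1.5, curvature_threshold=1.0):
    """Dark strokes on a light background are valleys of the intensity surface,
    so the larger principal curvature is strongly positive there. The threshold
    is an illustrative assumption."""
    k1, k2 = principal_curvatures(gray, sigma)
    return np.maximum(k1, k2) > curvature_threshold
```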

04 Apr 1995
TL;DR: This study examines the effects of the well-known cosine normalization method in the presence of OCR errors and proposes a new, more robust normalization method that yields significant improvements in retrieval effectiveness over cosine normalization.
Abstract: Optical character recognition (OCR) is the most commonly used technique to convert printed material into electronic form. Using OCR, large repositories of machine readable text can be created in a short time. An information retrieval system can then be used to search through large information bases thus created. Many information retrieval systems use sophisticated term weighting functions to improve the effectiveness of a search. Term weighting schemes can be highly sensitive to the errors in the input text, introduced by the OCR process. This study examines the effects of the well known cosine normalization method in the presence of OCR errors and proposes a new, more robust, normalization method. Experiments show that the new scheme is less sensitive to OCR errors and facilitates use of more diverse basic weighting schemes. It also yields significant improvements in retrieval effectiveness over cosine normalization.
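For reference, a minimal sketch of the cosine-normalized tf-idf baseline examined in the study is given below; OCR errors introduce spurious rare terms that inflate a document's vector norm, which is the sensitivity the paper addresses. The proposed robust normalization itself is not reproduced here.

```python
import math
from collections import Counter

def tfidf_cosine(documents):
    """Cosine-normalized tf-idf vectors for a small corpus of token lists.

    This is the baseline weighting the study examines; spurious terms from
    OCR errors inflate a document's vector norm and so depress the weights
    of its genuine terms under cosine normalization.
    """
    n_docs = len(documents)
    df = Counter(term for doc in documents for term in set(doc))
    idf = {t: math.log(n_docs / df_t) for t, df_t in df.items()}
    vectors = []
    for doc in documents:
        tf = Counter(doc)
        weights = {t: tf_t * idf[t] for t, tf_t in tf.items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors

docs = [["optical", "character", "recognition"],
        ["character", "recognitlon", "errors"]]   # "recognitlon": a simulated OCR error
print(tfidf_cosine(docs)[1])
```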

Proceedings ArticleDOI
14 Aug 1995
TL;DR: In this work a classification system is presented which reads a raster image of a character and outputs two confidence values, one for the "machine-written" and one for the "hand-written" character class.
Abstract: In applications of character recognition where machine-printed and hand-written characters are involved, it is important to know whether the character image, or the whole word, is machine- or hand-written. This is due to the accuracy difference between algorithms and systems oriented to machine-printed or handwritten characters, so this knowledge improves overall system quality. In this work a classification system is presented which reads a raster image of a character and outputs two confidence values, one for the "machine-written" and one for the "hand-written" character class. The proposed system features a preprocessing step, which transforms a general uncentered character image into a normalized form; a feature extraction phase, which extracts relevant information from the image; and a standard classifier based on a feedforward neural network, which produces the final response. Results on a proprietary image database are reported.

Journal ArticleDOI
TL;DR: A segmentation-free approach to OCR is presented as part of a knowledge-based word interpretation model; it is based on the recognition of subgraphs homeomorphic to previously defined character prototypes, using a variant of the notion of relative neighborhood from computational perception to identify gaps as potential parts of characters.
Abstract: A segmentation-free approach to OCR is presented as part of a knowledge-based word interpretation model. It is based on the recognition of subgraphs homeomorphic to previously defined prototypes of characters. Gaps are identified as potential parts of characters by implementing a variant of the notion of relative neighborhood used in computational perception. Each subgraph of strokes that matches a previously defined character prototype is recognized anywhere in the word, even if it corresponds to a broken character or to a character touching another one. The characters are detected in the order defined by the matching quality. Each subgraph that is recognized is introduced as a node in a directed net that compiles different alternatives of interpretation of the features in the feature graph. A path in the net represents a consistent succession of characters. A final search for the optimal path under certain criteria gives the best interpretation of the word features. Broken characters are recognized by looking for gaps between features that may be interpreted as part of a character. Touching characters are recognized because the matching allows nonmatched adjacent strokes. The recognition results for over 24,000 printed numeral characters belonging to a USPS database and for some hand-printed words confirmed the method's high robustness level.
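The final search for an optimal path over recognized character hypotheses can be sketched as a simple dynamic program over non-overlapping feature spans; this stands in for the directed-net search described above, and the candidate tuples are assumed inputs.

```python
def best_interpretation(candidates, word_length):
    """Pick a non-overlapping sequence of character hypotheses of maximum score.

    `candidates` is a list of (start, end, char, score) tuples, where
    [start, end) is the span of word features the hypothesis explains.
    """
    best = [(0.0, "")] * (word_length + 1)   # best (score, string) covering features [0, i)
    for i in range(1, word_length + 1):
        best[i] = best[i - 1]                # option: leave feature i-1 unexplained
        for start, end, char, score in candidates:
            if end == i:
                prev_score, prev_str = best[start]
                if prev_score + score > best[i][0]:
                    best[i] = (prev_score + score, prev_str + char)
    return best[word_length]

# Illustrative example: overlapping hypotheses for a 4-feature word image.
hyps = [(0, 2, "c", 0.9), (0, 2, "e", 0.6), (2, 4, "d", 0.8), (0, 4, "d", 1.0)]
print(best_interpretation(hyps, 4))  # -> roughly (1.7, 'cd')
```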

Proceedings ArticleDOI
14 Aug 1995
TL;DR: A classifier for predicting the character accuracy achieved by any Optical Character Recognition (OCR) system on a given page is presented, based on measuring the amount of white speckle, the amount of character fragments, and overall size information in the page.
Abstract: A classifier for predicting the character accuracy achieved by any Optical Character Recognition (OCR) system on a given page is presented. This classifier is based on measuring the amount of white speckle, the amount of character fragments, and overall size information in the page. No output from the OCR system is used. The given page is classified as either "good" quality (i.e. high OCR accuracy expected) or "poor" (i.e. low OCR accuracy expected). Results of processing 639 pages show a recognition rate of approximately 85%. This performance compares favorably with the ideal-case performance of a prediction method based upon the number of reject-markers in OCR generated text.
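A minimal sketch of the page-quality measurements described (white speckle and character fragments from connected components) feeding a good/poor decision is shown below; all thresholds are illustrative assumptions, not the paper's trained classifier.

```python
import numpy as np
from scipy.ndimage import label

def component_sizes(mask):
    """Sizes of connected components in a boolean mask."""
    labels, n = label(mask)
    return np.bincount(labels.ravel())[1:] if n else np.array([], dtype=int)

def predict_page_quality(ink_mask, speckle_area=3, fragment_area=20,
                         speckle_limit=500, fragment_ratio=0.3):
    """Classify a binarized page (True pixels are ink) as 'good' or 'poor' for OCR.

    White speckle is approximated by tiny components of the background, and
    character fragments by small ink components.
    """
    ink_sizes = component_sizes(ink_mask)
    white_sizes = component_sizes(~ink_mask)
    white_speckle = int(np.sum(white_sizes <= speckle_area))
    fragments = int(np.sum(ink_sizes <= fragment_area))
    fragment_share = fragments / max(len(ink_sizes), 1)
    if white_speckle > speckle_limit or fragment_share > fragment_ratio:
        return "poor"
    return "good"
```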

Proceedings ArticleDOI
E. Lethelier, M. Leroux, M. Gilloux
14 Aug 1995
TL;DR: An automatic recognition system for handwritten numeral check amounts is presented, based on a segmentation-by-recognition probabilistic model in which an explicit segmentation algorithm determines cut regions on digit links and provides a multiple spatial representation.
Abstract: We present an automatic recognition system applied to handwritten numeral check amounts. This system is based on a segmentation-by-recognition probabilistic model. The application is described from the amount field localization to the hypothesis generation of amounts. An explicit segmentation algorithm determines cut regions on digit links and provides a multiple spatial representation. The best segmentation path is determined by combining the recognition scores, segmentation weights, and the outputs of a probabilistic parser. Training is done by a bootstrapping technique, which significantly improves the performance of the different algorithms. It also allows the use of a reject class at the recognition step. The system was evaluated on 10,000 database images to show its robustness.

Patent
22 May 1995
TL;DR: In this article, a method for identifying, correcting, modifying and reporting imperfections and features in pixel images that prevent or hinder proper OCR (Optical Character Recognition) and other document imaging processes is provided.
Abstract: A method is provided for identifying, correcting, modifying and reporting imperfections and features in pixel images that prevent or hinder proper OCR (Optical Character Recognition) and other document imaging processes. One embodiment of this invention provides that run length compressed images can be analyzed and corrected directly for improved performance. Major steps included in this invention for the enhancement of images for OCR and document imaging are: The detection of undesired printed matter and the deletion of undesired printed matter. Detection of undesired printed matter includes successive generation of entries into an array listing of undesired printed matter to be deleted. Deletion of undesired printed matter follows in a separate step for all types of undesired printed matter.

Proceedings ArticleDOI
14 Aug 1995
TL;DR: A new algorithm is presented for determining the skew angle of lines of text in an image of a document, with the advantage that it performs only one iteration to determine the skew angle.
Abstract: This paper presents an analysis of the connected components extracted from the binary image of a document page. Such an analysis provides a great deal of useful information and is used to perform skew correction, segmentation, and classification of the document. We present a new algorithm for determining the skew angle of lines of text in an image of a document, with the advantage that it performs only one iteration to determine the skew angle. Experiments on over 30 pages show that the method works well on a wide variety of layouts, including sparse textual regions, mixed fonts, multiple columns, and even documents with a high graphical content.
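One single-pass way to estimate skew from connected components, in the spirit of this approach though not necessarily the paper's exact algorithm, is to take the median angle between each component centroid and its nearest neighbour to the right:

```python
import math

def estimate_skew_degrees(centroids, max_neighbour_distance=80.0):
    """Estimate document skew from connected-component centroids (x, y).

    For each component the nearest neighbour to its right is found, the angle
    of the joining segment is recorded, and the median angle is returned.
    The distance cap is an illustrative assumption.
    """
    angles = []
    for x0, y0 in centroids:
        best = None
        for x1, y1 in centroids:
            dx, dy = x1 - x0, y1 - y0
            if dx <= 0:
                continue                       # only look to the right
            dist = math.hypot(dx, dy)
            if dist <= max_neighbour_distance and (best is None or dist < best[0]):
                best = (dist, math.degrees(math.atan2(dy, dx)))
        if best is not None:
            angles.append(best[1])
    if not angles:
        return 0.0
    angles.sort()
    return angles[len(angles) // 2]            # median skew angle in degrees

# Components along a line tilted by about 2 degrees:
pts = [(x, 100 + x * math.tan(math.radians(2.0))) for x in range(0, 400, 20)]
print(round(estimate_skew_degrees(pts), 1))    # -> 2.0
```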

Patent
20 Nov 1995
TL;DR: In this paper, a multi-stage multi-network character recognition system decomposes the estimation of a posteriori probabilities into coarse-to-fine stages, which is especially suitable for the tasks that involve a large number of categories.
Abstract: A multi-stage multi-network character recognition system decomposes the estimation of a posteriori probabilities into coarse-to-fine stages. Classification is then based on the estimated a posteriori probabilities. This classification process is especially suitable for tasks that involve a large number of categories. The multi-network system is implemented in two stages: a soft pre-classifier and a bank of multiple specialized networks. The pre-classifier performs coarse evaluation of the input character, developing different probabilities that the input character falls into different predefined character groups. The bank of specialized networks, each corresponding to a single group of characters, performs fine evaluation of the input character, where each develops different probabilities that the input character represents each character in that specialized network's respective predefined character group. A network selector is employed to increase the system's efficiency by selectively invoking certain specialized networks, selected using a combination of prior external information and outputs of the pre-classifier. Relative to known single network or one-stage multiple network recognition systems, the invention provides improved recognition accuracy, confidence measure, speed, and flexibility.
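The coarse-to-fine decomposition can be sketched compactly: a pre-classifier yields group probabilities, only promising groups' specialized networks are invoked, and character probabilities are combined as P(group) x P(character | group). The callables and the invocation threshold below are illustrative assumptions.

```python
def two_stage_classify(x, pre_classifier, specialists, invoke_threshold=0.05):
    """Coarse-to-fine a posteriori estimation with selective network invocation.

    `pre_classifier(x)` returns {group: P(group | x)}; each entry of
    `specialists` maps a group name to a callable returning
    {character: P(character | x, group)}. Only groups whose probability
    exceeds the threshold are evaluated, mirroring the selector described above.
    """
    group_probs = pre_classifier(x)
    char_probs = {}
    for group, p_group in group_probs.items():
        if p_group < invoke_threshold:
            continue                                  # skip unlikely groups
        for char, p_char in specialists[group](x).items():
            char_probs[char] = char_probs.get(char, 0.0) + p_group * p_char
    return max(char_probs, key=char_probs.get), char_probs

# Illustrative toy classifiers:
pre = lambda x: {"digits": 0.9, "letters": 0.1}
nets = {"digits": lambda x: {"0": 0.2, "8": 0.8},
        "letters": lambda x: {"B": 0.7, "O": 0.3}}
print(two_stage_classify(None, pre, nets))   # -> ('8', {...})
```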

Patent
12 Apr 1995
TL;DR: In this paper, a method for exploiting correlated mail streams using optical character recognition is provided in which a static database is used to store data based on training, and a decision threshold is determined which is based on the real-time statistics of the mail stream.
Abstract: A method for exploiting correlated mail streams using optical character recognition is provided in which a static database is used to store data based on training. Real-time data for the parameters of interest, such as address block location, zip code, city, state, and font size or type, is collected from the mail processing equipment in order to generate a statistical information database. The dynamic database can include probability density functions, correlation statistics, mean, variance, and high order moments. The statistical parameters are tracked using recursive least squares schemes with various windowing options, as well as moving average linear filters. Based on cost models which indicate the cost of making various types of errors in the OCR process, a decision threshold is determined which is based on the real-time statistics of the mail stream. The decision threshold determines the confidence level required by the adaptive process in order to assign previously rejected mail pieces based solely on correlation statistics. The decision threshold will adapt to the statistics of the mail stream and is not a constant value. Previously unassigned characters are assigned according to the decision threshold determination and assignment processes.
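A minimal sketch of the adaptive idea, tracking running statistics of the mail stream and deriving the decision threshold from a cost model rather than using a constant, is given below. The exponentially weighted window and the cost rule are illustrative stand-ins for the patent's recursive least squares and moving-average schemes.

```python
class AdaptiveThreshold:
    """Track running mean/variance of OCR confidence on a mail stream and set
    an acceptance threshold from a simple cost model (illustrative assumption).
    """

    def __init__(self, cost_wrong=10.0, cost_reject=1.0, alpha=0.05):
        self.cost_wrong = cost_wrong      # cost of assigning an incorrect address
        self.cost_reject = cost_reject    # cost of sending a piece to manual sorting
        self.alpha = alpha                # forgetting factor of the running window
        self.mean = 0.5
        self.var = 0.01

    def update(self, confidence):
        """Fold one observed confidence value into the running statistics."""
        delta = confidence - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)

    def threshold(self):
        """Require more confidence when errors are costlier or the stream is noisy."""
        penalty = self.cost_wrong / (self.cost_wrong + self.cost_reject)
        return min(0.99, self.mean + penalty * self.var ** 0.5)

    def accept(self, confidence):
        self.update(confidence)
        return confidence >= self.threshold()
```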

Proceedings ArticleDOI
23 Oct 1995
TL;DR: A system for recognition of segmented handwritten Persian/Arabic numerals irrespective of size and translation is developed; classification is performed by a modified version of a four-layer probabilistic neural network (PNN) called the edited PNN (EPNN).
Abstract: A system for recognition of segmented handwritten Persian/Arabic numerals irrespective of size and translation is developed. The image is represented by invariant features obtained from a new shadow coding scheme designed for the considered shapes. Classification is performed by a modified version of a four-layer probabilistic neural network (PNN) called the edited PNN (EPNN). Due to an editing and condensation procedure on the training samples, the EPNN has better performance and the network size is smaller. The performance of the system is evaluated on a database consisting of 2600 digits written by 10 different people. The obtained recognition accuracy is 97.8 percent. The developed system can process approximately two digits per second on an Intel 486-based PC with a 66 MHz clock.
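A basic probabilistic neural network reduces to a Parzen-window classifier over feature vectors, sketched below. The paper's shadow-coding features and the editing/condensation step that yields the EPNN are not reproduced, and the smoothing parameter is an illustrative assumption.

```python
import numpy as np

def pnn_classify(x, train_features, train_labels, sigma=0.5):
    """Classify a feature vector with a basic probabilistic neural network.

    Each training sample contributes a Gaussian kernel; class scores are the
    averaged kernel responses.
    """
    x = np.asarray(x, dtype=float)
    scores = {}
    for label in set(train_labels):
        members = np.array([f for f, l in zip(train_features, train_labels) if l == label],
                           dtype=float)
        dist_sq = np.sum((members - x) ** 2, axis=1)
        scores[label] = float(np.mean(np.exp(-dist_sq / (2.0 * sigma ** 2))))
    return max(scores, key=scores.get), scores

# Illustrative 2-D features for digits '1' and '7':
feats = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.2], [0.8, 0.1]]
labels = ["1", "1", "7", "7"]
print(pnn_classify([0.15, 0.85], feats, labels))  # -> ('1', {...})
```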

Proceedings ArticleDOI
30 Mar 1995
TL;DR: This paper describes a system developed for the detection of isolated words, word portions, and multi-word phrases in images of documents; it provides for the automated training of desired keywords and the creation of indexing filters to speed matching.
Abstract: With the advent of on-line access to very large collections of document images, electronic classification into areas of interest has become possible. A first approach to classification might be the use of OCR on each document followed by analysis of the resulting ASCII text. But if the quality of a document is poor, the format unconstrained, or time is critical, complete OCR of each image is not appropriate. An alternative approach is the use of word shape recognition (as opposed to individual character recognition) and the subsequent classification of documents by the presence or absence of selected keywords. Use of word shape recognition not only provides a more robust collection of features but also eliminates the need for character segmentation (a leading cause of error in OCR). In this paper we describe a system we have developed for the detection of isolated words, word portions, and multi-word phrases in images of documents. It is designed to be used with large, changeable keyword sets and very large document sets. The system provides for automated training of desired keywords and creation of indexing filters to speed matching.

Patent
31 Mar 1995
TL;DR: A communication system is described that is composed of an OCR-FAX apparatus, which reads the contents of an order written on an optical character recognition (OCR) document sheet and performs optical character recognition on those contents, and an OCR center apparatus, which receives the character-recognized data obtained by the OCR-FAX apparatus and transmits format information for the OCR document sheet to the OCR-FAX apparatus.
Abstract: A communication system is composed of an OCR-FAX apparatus for reading the contents of an order written on an optical character recognition (OCR) document sheet and performing optical character recognition on the contents, and an OCR center apparatus for receiving pieces of character-recognized data obtained in the OCR-FAX apparatus and transmitting pieces of format information of the OCR document sheet to the OCR-FAX apparatus. A basic program of an OCR recognition program is stored in advance in a ROM region of an IC card. A subordinate program of the OCR recognition program and a piece of OCR document sheet identifying information are temporarily stored in an SRAM region of the IC card and are transferred to an EEPROM region of the IC card. The format information is temporarily stored in the SRAM region and is transferred to a format information storing unit. The stored format information is renewed by transmitting pieces of updated format information from the OCR center apparatus, and an updated OCR document sheet is printed out by the OCR-FAX apparatus according to the format information.