Showing papers on "Optical character recognition" published in 1989

Open Access

Journal Article•DOI•

Handwritten digit recognition: applications of neural network chips and automatic learning

Y. Le Cun¹, Lawrence D. Jackel¹, Bernhard E. Boser¹, John S. Denker¹, Hans Peter Graf¹, Isabelle Guyon¹, D. Henderson¹, Richard Howard¹, W. Hubbard¹•Institutions (1)

Bell Labs¹

01 Nov 1989-IEEE Communications Magazine

TL;DR: Two novel methods for achieving handwritten digit recognition are described, based on a neural network chip that performs line thinning and feature extraction using local template matching and on a digital signal processor that makes extensive use of constrained automatic learning.

Abstract: Two novel methods for achieving handwritten digit recognition are described. The first method is based on a neural network chip that performs line thinning and feature extraction using local template matching. The second method is implemented on a digital signal processor and makes extensive use of constrained automatic learning. Experimental results obtained using isolated handwritten digits taken from postal zip codes, a rather difficult data set, are reported and discussed.

430 citations

Journal Article•DOI•

A neural network approach to character recognition

A. Rajavelu¹, Mohamad Musavi¹, Mukul Shirvaikar¹•Institutions (1)

University of Maine at Augusta¹

01 Jul 1989-Neural Networks

TL;DR: The sensitivity of the network is such that small variations in the input do not affect the output and this results in an improvement in the recognition rate of characters with slight variations in structure, linearity, and orientation.

128 citations

Journal Article•DOI•

Machine recognition and correction of printed Arabic text

Adnan Amin¹, J.F. Mari•Institutions (1)

Kuwait University¹

01 Sep 1989

TL;DR: A method for automatic recognition of multifont Arabic text entered from a 300 dpi scanner is presented; character recognition was achieved despite several impeding properties of the Arabic script, especially the connectivity of characters.

Abstract: A method for automatic recognition of a multifont Arabic text entered from a scanner of 300 dpi density is presented. The system is based on two components, one for character recognition and one for word recognition. Character recognition is further divided into three phases: the digitization process, segmentation of words into characters, and identification of characters. The word recognition component is based on the Viterbi algorithm and can handle some identification errors. Character recognition was achieved despite several impeding properties of the Arabic script, especially the connectivity of characters. The processing speed is close to three characters per second with a 90% recognition rate. All algorithms were written in Pascal and run on an IBM PC/AT.

105 citations
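
The abstract above says the word recognition component uses the Viterbi algorithm and tolerates some character identification errors. A minimal sketch of that idea follows, assuming a hidden Markov formulation in which the true characters are hidden states and the classifier's outputs are observations. The three-letter Latin alphabet and every probability in `START_P`, `TRANS_P`, and `EMIT_P` are invented for illustration and do not come from the paper.

```python
import math

# Hypothetical toy model (not from the paper): a three-letter alphabet.
STATES = "act"
START_P = {"a": 0.2, "c": 0.6, "t": 0.2}
# TRANS_P[p][s]: probability that character s follows character p in a word.
TRANS_P = {
    "a": {"a": 0.1, "c": 0.2, "t": 0.7},
    "c": {"a": 0.8, "c": 0.1, "t": 0.1},
    "t": {"a": 0.4, "c": 0.3, "t": 0.3},
}
# EMIT_P[s][o]: probability that the classifier reports o when the truth is s.
EMIT_P = {
    "a": {"a": 0.70, "c": 0.25, "t": 0.05},
    "c": {"a": 0.20, "c": 0.75, "t": 0.05},
    "t": {"a": 0.05, "c": 0.05, "t": 0.90},
}


def viterbi(observed, states=STATES, start_p=START_P, trans_p=TRANS_P, emit_p=EMIT_P):
    """Return the most probable true character sequence for the observed one."""
    # best[t][s] = (log-probability of the best path ending in s at step t, predecessor)
    best = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][observed[0]]), None)
        for s in states
    }]
    for t in range(1, len(observed)):
        column = {}
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + math.log(emit_p[s][observed[t]]), prev)
        best.append(column)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(observed) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    return "".join(reversed(path))
```

With these toy tables, a classifier output of `"cct"` is decoded as `"cat"`, because the transition model makes "ca" far more likely than "cc" and the confusion model allows "a" to be misread as "c".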

Patent•

Method for identifying unrecognizable characters in optical character recognition machines

Peter Rudak¹•Institutions (1)

Eastman Kodak Company¹

02 Jun 1989

TL;DR: A bit-map video image of the unrecognized character(s) is inserted in the ASCII data line of neighboring characters to create an impression of the original line of text from the document.

Abstract: A method for identifying a character which cannot be machine read so that the operator may observe and hopefully recognize the character in question. A bit-map video image of the unrecognized character(s) is inserted in the ASCII data line of neighboring characters to create an impression of the original line of text from the document. A data entry operator uses this information to enter the required correct character(s) via the keyboard or other means. This reject/reentry method allows for quick operator response, and minimizes data storage and transmission of video information.

85 citations

Patent•

Apparatus and method for use in image processing

Smith Raymond W¹, Christopher John Robson¹•Institutions (1)

Hewlett-Packard¹

03 Mar 1989

TL;DR: A novel edge extractor, a novel page segmentation facility, and a novel feature extraction facility are disclosed for an optical character recognition system that has a scanner (10) for scanning a document, an edge extractor (11) for identifying edges in the image produced by the scanner to produce an outline of each object identified in the image, a segmentation facility (15) for grouping the object outlines into blocks, means (14) for identifying features of the outlines, and a final classification stage (16) for providing data in an appropriate format representative of the characters in the image.

Abstract: Optical character recognition is achieved by a system which has a scanner (10) for scanning a document, an edge extractor (11) for identifying edges in the image produced by the scanner to produce an outline of each object identified in the image, a segmentation facility (15) for grouping the object outlines into blocks, means (14) for identifying features of the outlines, and a final classification stage (16) for providing data in an appropriate format representative of the characters in the image. Also disclosed are a novel edge extractor, a novel page segmentation facility and a novel feature extraction facility.

81 citations

Patent•

Document recognition and automatic indexing for optical character recognition.

Lori L. Barski¹, Roger S. Gaborski¹•Institutions (1)

Eastman Kodak Company¹

15 May 1989

TL;DR: A library of templates defining the spacings between pre-printed lines and the corresponding line lengths for a plurality of different business forms is compared with the image data of an unknown document to determine the known business form (template) to which the document corresponds.

Abstract: A library of templates defining the spacings between pre-printed lines and the corresponding line lengths for a plurality of different business forms is compared with the image data of an unknown document to determine the known business form (template) to which the document corresponds. Once the form of the document is determined, the optical character recognition system may intelligently associate the text characters in certain locations on the document with information fields defined by the pre-printed lines. The pre-printed lines in the image data are determined from the corresponding template and removed from the image data prior to optical character recognition processing.

79 citations

Patent•

Method and apparatus for extracting information from forms

Kent D. Vincent¹, Rueiming Jamp¹•Institutions (1)

Hewlett-Packard¹

25 Aug 1989

TL;DR: A system is described for extracting handwritten or typed information from forms that have been printed in colors other than the color of the handwritten or typed information. It includes a detector for detecting color values for scanned pixel locations on a printed form; a comparator for comparing the color values with reference color values; an identifier for identifying those scanned pixel locations whose color values correspond to the reference color values; and an optical character recognition engine for receiving data regarding the identified locations.

Abstract: A system for extracting handwritten or typed information from forms that have been printed in colors other than the color of the handwritten or typed information. The information extraction system includes a detector for detecting color values for scanned pixel locations on a printed form; a comparator for comparing the color values with reference color values; an identifier for identifying ones of the scanned pixel locations that have color values that correspond to the reference color values; and an optical character recognition engine for receiving data regarding the identified locations.

59 citations

Patent•

Apparatus for identifying and correcting unrecognizable characters in optical character recognition machines

Peter Rudak¹•Institutions (1)

Eastman Kodak Company¹

02 Jun 1989

Abstract: An apparatus for identifying a character which cannot be machine read so that the operator may observe and hopefully recognize the character in question. A bit-map video image of the unrecognized character(s) is inserted in the ASCII data line of neighboring characters to create an impression of the original line of text from the document. A data entry operator uses this information to enter the required correct character(s) via the keyboard or other means. This reject/reentry method allows for quick operator response, and minimizes data storage and transmission of video information.

39 citations

Patent•

Processing means for use in an optical character recognition system

David J. Ross

21 Jul 1989

TL;DR: Means is provided for use in an optical character recognition system to narrow the possible characters associated with a given unknown input character, primarily based upon subline information.

Abstract: Means is provided for use in an optical character recognition system to narrow the possible characters associated with a given unknown input character, primarily based upon subline information. This means also serves to add to the possibility set additional possible characters, and to determine point sizes for each character. In the event that the subline information provided is erroneous, the subline information is corrected.

24 citations

Journal Article•DOI•

Modified rapid transform.

Ming Fang¹, Gerd Häusler¹•Institutions (1)

University of Erlangen-Nuremberg¹

15 Mar 1989-Applied Optics

TL;DR: The modified rapid transform (MRT) combines the well-known rapid transform with preprocessing steps and can be usefully applied as a preprocessing step in automatic inspection and pattern recognition, where shift invariance, uniqueness, and low computing time are required.

Abstract: We describe a new fast, shift-invariant transform, the modified rapid transform (MRT). The MRT combines the well-known rapid transform with preprocessing steps. Computer simulations show that for 1-D binary patterns the MRT with a sufficient number of preprocessing steps may perform shift-invariant one-to-one mapping. The modification is also efficient for 2-D patterns. The MRT can be usefully applied as a preprocessing step in automatic inspection and pattern recognition, where shift invariance, uniqueness, and low computing time are required. As an example, the use of MRT in optical character recognition is discussed.

14 citations
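
The shift invariance mentioned in the abstract above can be illustrated with the unmodified rapid transform, as it is commonly described: a butterfly of pairwise sums and absolute differences applied recursively to each half. The sketch below covers only that base transform for 1-D patterns whose length is a power of two. The MRT's preprocessing steps are not specified in the abstract and are not reproduced here; the function name is an assumption.

```python
def rapid_transform(x):
    """Unmodified rapid transform of a 1-D sequence (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    half = n // 2
    # Butterfly stage: sums in the first half, absolute differences in the second.
    sums = [x[i] + x[i + half] for i in range(half)]
    diffs = [abs(x[i] - x[i + half]) for i in range(half)]
    # Recurse on each half independently.
    return rapid_transform(sums) + rapid_transform(diffs)


def cyclic_shifts(x):
    """All cyclic shifts of x, used here only to demonstrate the invariance."""
    return [x[k:] + x[:k] for k in range(len(x))]
```

Every cyclic shift of a pattern maps to the same output, because shifting the input only cyclically shifts the sums and the absolute differences at each stage. The base transform is not one-to-one, which is the uniqueness issue the abstract says the MRT's preprocessing is meant to address.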

Proceedings Article•DOI•

Alphanumeric character recognition using a connectionist model with the pocket algorithm

Hayashi¹, Sakata, Nakao, Ohno, Ohhashi•Institutions (1)

Ibaraki University¹

01 Jan 1989

TL;DR: An activation criterion of output cells for character recognition and a useful technique are proposed to distinguish between characters that closely resemble each other by using the structure information of characters.

Abstract: Summary form only given. New results are presented which were obtained by applying a connectionist model with the pocket algorithm proposed by Gallant to problems in alphanumeric character recognition. An evaluation is made of the recognition (classification) capability of the connectionist model for 62 and 93 alphanumeric characters of a single font having different kinds of typeface quality and 76 alphanumeric characters of multiple fonts having the same typeface quality. A useful technique is proposed to distinguish between characters that closely resemble each other by using the structure information of characters. An activation criterion of output cells for character recognition is also proposed. In the recognition of characters having different typeface qualities, a markedly high degree of accuracy (99.96% maximum, 99.74% on average) of the individual fonts was attained. In the recognition of the 76 alphanumeric characters having multiple fonts, the degree of accuracy achieved was 99.64% maximum.
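
For readers unfamiliar with the pocket algorithm named in the abstract above, here is a minimal sketch for a single two-class output cell. It is a simplified variant and not necessarily the authors' setup: Gallant's formulation is usually described as drawing training examples at random and keeping the weights with the longest run of correct classifications, whereas this sketch cycles through the examples in order and keeps ("pockets") the weights with the fewest training errors seen so far. All function names and the toy data are assumptions.

```python
def predict(weights, x):
    """Threshold unit: weights[0] is the bias, the rest pair with the features."""
    score = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1 if score > 0 else -1


def count_errors(weights, samples):
    """Number of (features, label) pairs that the weights misclassify."""
    return sum(predict(weights, x) != y for x, y in samples)


def pocket_train(samples, epochs=100):
    """Perceptron updates, remembering the best weights seen (simplified pocket)."""
    weights = [0.0] * (len(samples[0][0]) + 1)
    pocket = list(weights)
    pocket_errors = count_errors(pocket, samples)
    for _ in range(epochs):
        for x, y in samples:
            if predict(weights, x) != y:
                # Standard perceptron correction toward the misclassified example.
                weights[0] += y
                for i, xi in enumerate(x):
                    weights[i + 1] += y * xi
                errors = count_errors(weights, samples)
                if errors < pocket_errors:
                    pocket, pocket_errors = list(weights), errors
        if pocket_errors == 0:
            break
    return pocket, pocket_errors


# Toy data with labels in {-1, +1}: AND is linearly separable, XOR is not.
AND_SAMPLES = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]
```

On separable data such as `AND_SAMPLES` the pocket ends with zero errors. On non-separable data such as `XOR_SAMPLES`, plain perceptron weights keep oscillating, while the pocket still returns the best weights encountered.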

Proceedings Article•DOI•

Devanagari character recognition using structure analysis

K. Jayanthi¹, Akihiro Suzuki¹, Hiroshi Kanai¹, Yoshiyuki Kawazoe¹, M. Kimura¹, Ken'iti Kido¹•Institutions (1)

Tohoku University¹

22 Nov 1989

TL;DR: A method of character recognition using prior knowledge of the script Devanagari, a script widely used in India at present and found in Buddhist texts of the past, is proposed.

Abstract: A method of character recognition using prior knowledge of the script is proposed. Devanagari, a script widely used in India at present and found in Buddhist texts of the past, is used for this purpose. This study is confined to recognizing a particular font used in a printed Buddhist text: Saddharmapundarika.

Proceedings Article•

Optical character recognition using artificial neural networks

E. Alpaydin¹•Institutions (1)

École Polytechnique Fédérale de Lausanne¹

16 Oct 1989

TL;DR: Optical character recognition is examined to find a general framework by which it can be realized and a hierarchical 'cone' with feature extraction layers of increasing sophistication is described.

Abstract: Optical character recognition is examined to find a general framework by which it can be realized. A hierarchical 'cone' with feature extraction layers of increasing sophistication is described. The system, unlike the artificial neural net examples in the literature, does not use one network only. Allowing recognition to take place in parallel over different representations of the same symbol introduces redundancy, facilitates learning and thus improves performance. The resource requirements of the system, which parallel operation inevitably increases, can be decreased by limiting the size of the image that can be 'seen' at one time. There is an 'eye' that can be moved around and fixed on any part of the scene which returns detailed information about a small part of the scene. The integration of successive eye fixations is a temporal process and the operation of the system also turns into that of relaxation in time where temporal expectations and selective attention should be taken into account. One possibility for representing spatial relations by introducing sequential scanning of the image is shown. Synapses with an internal delay, together with a temporal summation mechanism, are proposed by which this order can be checked. Work is currently going on to apply this mechanism to more realistic objects, feature sets, and scanning orders.

Proceedings Article•DOI•

Human-based character string image retrieval from textual images

K. Yokosawa

14 Nov 1989

TL;DR: In an experiment in which human subjects were to identify characters shown on a CRT screen, character-string search performance was better than single-character search performance in Japanese sentence contexts, suggesting that character strings contain more information than characters alone.

Abstract: A direct character-string retrieval method based on human search characteristics is proposed for Japanese textual image processing. In an experiment in which human subjects were to identify characters shown on a CRT screen, character-string search performance was better than single-character search performance in Japanese sentence contexts. This result suggests that character strings contain more information than characters alone. The analysis of reaction times shows that there are two stages in visual search. A character-string image retrieval system which, human-like, has two stages is effective in searching target images in ninety Japanese textual images. Moreover, human-like performance is obtained from the system: for example, it more easily identifies character-string images than single-character images.

Proceedings Article•DOI•

A comparison of feedforward and self-organizing approaches to the font orientation problems

Morris¹, Rubin¹, Tirri•Institutions (1)

Bell Labs¹

01 Jan 1989

TL;DR: The problem of determining the orientation of printed text is considered, and a feedforward network with structure and parameters derived using optimal detection theory and the learning vector quantization self-organizing networks of T. Kohonen are described.

Abstract: The problem of determining the orientation of printed text is considered. The problem differs considerably from traditional optical character recognition, and its application to automatic inspection requires efficient processing and highly accurate results. Two methods are described. The first is a feedforward network, with structure and parameters derived using optimal detection theory. The second method makes use of the learning vector quantization self-organizing networks of T. Kohonen (Self-organization and Associative Memory, Springer-Verlag, 1988). Experimental results and a complete implementation are described. Both techniques are found to be successful and their relative advantages are discussed.
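
The second method in the abstract above relies on Kohonen's learning vector quantization. A minimal sketch of the basic LVQ1 update rule follows: the codebook vector nearest to a training sample is pulled toward the sample when their class labels agree and pushed away when they differ. The two-dimensional features, the class names "upright" and "rotated", the learning rate, and all function names are illustrative assumptions; the abstract does not describe the paper's features or its specific LVQ variant.

```python
def nearest(prototypes, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((p - xi) ** 2 for p, xi in zip(prototypes[i][0], x)),
    )


def lvq1_step(prototypes, x, label, alpha):
    """One LVQ1 update, applied in place to the nearest codebook vector."""
    i = nearest(prototypes, x)
    vector, proto_label = prototypes[i]
    # Attract on a label match, repel on a mismatch.
    sign = 1.0 if proto_label == label else -1.0
    prototypes[i] = (
        [p + sign * alpha * (xi - p) for p, xi in zip(vector, x)],
        proto_label,
    )


def lvq1_train(prototypes, samples, alpha=0.1, epochs=20):
    """Return a trained copy of the codebook; the input list is left untouched."""
    trained = [(list(vector), label) for vector, label in prototypes]
    for _ in range(epochs):
        for x, label in samples:
            lvq1_step(trained, x, label, alpha)
    return trained


def classify(prototypes, x):
    """Label of the nearest codebook vector."""
    return prototypes[nearest(prototypes, x)][1]


# Hypothetical two-feature orientation data, invented for illustration.
SAMPLES = [
    ([0.0, 0.0], "upright"), ([0.0, 1.0], "upright"), ([1.0, 0.0], "upright"),
    ([5.0, 5.0], "rotated"), ([5.0, 6.0], "rotated"), ([6.0, 5.0], "rotated"),
]
INITIAL = [([1.0, 1.0], "upright"), ([4.0, 4.0], "rotated")]
```

Calling `classify(lvq1_train(INITIAL, SAMPLES), [0.5, 0.5])` returns `"upright"` for this toy data. A practical system would typically use several codebook vectors per class and a decaying learning rate.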

Proceedings Article•DOI•

Extracting text from real-world scenes

J. Patrick Bixler¹, David P. Miller•Institutions (1)

Virginia Tech¹

05 Jan 1989

TL;DR: The feasibility of extracting text from an arbitrary scene and using that information to guide the navigation of a mobile robot is discussed.

Abstract: Many scenes contain significant textual information that can be extremely helpful for understanding and/or navigation. For example, text-based information can frequently be the primary cue used for navigating inside buildings. A subject might first read a marquee, then look for an appropriate hallway and walk along reading door signs and nameplates until the destination is found. Optical character recognition has been studied extensively in recent years, but has been applied almost exclusively to printed documents. As these techniques improve it becomes reasonable to ask whether they can be applied to an arbitrary scene in an attempt to extract text-based information. Before an automated system can be expected to navigate by reading signs, however, the text must first be segmented from the rest of the scene. This paper discusses the feasibility of extracting text from an arbitrary scene and using that information to guide the navigation of a mobile robot. Considered are some simple techniques for first locating text components and then tracking the individual characters to form words and phrases. Results for some sample images are also presented.

Proceedings Article•DOI•

A hierarchical system for character recognition

J.A. Vlontzos¹, S.Y. Kung•Institutions (1)

University of Southern California¹

08 May 1989

TL;DR: A hierarchical system for character recognition with hidden-Markov model knowledge sources that solves both the context-sensitive problem and the character-instantiation problem is presented, thus permitting real-time multifont and multisize printed character recognition as well as handwriting recognition.

Abstract: A hierarchical system for character recognition with hidden-Markov model knowledge sources that solves both the context-sensitive problem and the character-instantiation problem is presented. The algorithms and the structure of the system are described, and its operation is discussed. The system achieves 97 to 99% accuracy using a two-level architecture and has been implemented using a systolic array, thus permitting real-time (1 ms per character) multifont and multisize printed character recognition as well as handwriting recognition.

Journal Article•DOI•

Optical Chinese Character Recognition Using Accumulated Stroke Features

Bor-Shenn Jeng¹•Institutions (1)

National Central University¹

01 Jul 1989-Optical Engineering

TL;DR: An intelligent optical Chinese character recognition system using accumulated stroke features has been developed to solve the input problem of Chinese characters and results show that 99% of printed characters and 90% of constrained handwritten characters can be correctly recognized.

Abstract: An intelligent optical Chinese character recognition system using accumulated stroke features has been developed to solve the input problem of Chinese characters. The hardware architecture of the system is built on an IBM PC-AT with three extension boards: the preprocessor board, the feature extraction board, and the matching recognition board. The system can recognize, at the same time in the same program, either printed or handwritten Chinese characters of different styles and sizes. At present, a total of 5401 commonly used Chinese characters can be recognized. Results show that 99% of printed characters and 90% of constrained handwritten characters can be correctly recognized, at a speed of about 300 characters per minute.

Patent•

Optical character recognition apparatus and method using masks operation

Wayne Wang¹, Alan Lin¹•Institutions (1)

Ricoh¹

28 Mar 1989

TL;DR: An optical character recognition apparatus and method are described that use a mask operation in conjunction with a decision tree process to provide fast recognition of multi-font alphanumeric characters on a document.

Abstract: An optical character recognition apparatus and method use a mask operation in conjunction with a decision tree process to provide fast recognition of multi-font alphanumeric characters on a document.

Isolating individual handwritten characters

C.G. Leedham¹, P.D. Friday•Institutions (1)

University of Essex¹

02 Oct 1989

TL;DR: The output from the pre-segmenter is passed into the character recognition stage of the document reader, which attempts classification of the objects between each pair of character boundaries within a joined group of characters.

Abstract: Describes a document reading system incorporating a pre-segmenter. The output from the pre-segmenter is passed into the character recognition stage of the document reader, which attempts classification of the objects between each pair of character boundaries within a joined group of characters. The combination of boundaries which achieves the highest overall confidence level for the whole group of joined characters is chosen to be the correct one. In addition, post-processing may be used to choose between the most likely interpretations of the words. This may simply take the form of a spelling checker correction system, or may be fed back into the earlier stages of the document reading system to request or suggest new segmentation positions or alternative character classifications.
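
The boundary selection step described in the abstract above can be sketched as a small dynamic program. The sketch assumes that the pre-segmenter yields an ordered list of candidate cut positions, that a classifier returns a label and a confidence for the span between any two cuts, and that "overall confidence" is the product of the per-segment confidences. All three are assumptions made for illustration: the abstract does not say how confidences are combined, and `classify_segment`, `max_span`, and the toy table are hypothetical.

```python
import math


def best_segmentation(n_cuts, classify_segment, max_span=3):
    """Choose the cuts whose segments have the highest product of confidences.

    Cuts are numbered 0 .. n_cuts - 1. classify_segment(i, j) must return
    (label, confidence) for the image span between cut i and cut j.
    """
    # best[j] = (summed log-confidence, segments) for the best split of cuts 0..j
    best = {0: (0.0, [])}
    for j in range(1, n_cuts):
        candidates = []
        for i in range(max(0, j - max_span), j):
            if i not in best:
                continue
            label, confidence = classify_segment(i, j)
            if confidence <= 0:
                continue  # the classifier rejected this span outright
            score = best[i][0] + math.log(confidence)
            candidates.append((score, best[i][1] + [(i, j, label)]))
        if candidates:
            best[j] = max(candidates, key=lambda c: c[0])
    return best.get(n_cuts - 1, (float("-inf"), []))[1]


# Hypothetical classifier output for a joined group with four candidate cuts.
TOY_SCORES = {
    (0, 1): ("c", 0.6), (1, 2): ("l", 0.5), (2, 3): ("o", 0.8),
    (0, 2): ("d", 0.9), (1, 3): ("b", 0.3), (0, 3): ("w", 0.1),
}


def toy_classifier(i, j):
    return TOY_SCORES[(i, j)]
```

With the toy table, the reading "d" + "o" (0.9 × 0.8 = 0.72) beats "c" + "l" + "o" (0.24), "c" + "b" (0.18), and "w" (0.1). A product favours readings with fewer segments, so an averaged or length-normalised score is a reasonable alternative.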

Proceedings Article•DOI•

IOCR: an intelligent optical character reader

Kwong-Sak Leung¹, K.H. Lee¹•Institutions (1)

The Chinese University of Hong Kong¹

22 Nov 1989

TL;DR: An intelligent optical character reader for reading printed English text with alphanumeric symbols is presented. The spelling check and the feature analysis are based on some intelligent statistical rules, and together they enhance the recognition rate.

Abstract: An intelligent optical character reader (IOCR) for reading printed English text with alphanumeric symbols is presented. There are seven major functional units in the system: a self trainer, a graphic editor, a database manager, a token extractor, a pattern matcher, a spelling checker, and a guess-by-feature analyzer. The first three units provide facilities for reference pattern manipulation, while the last four units are for character recognition. The spelling check and the feature analysis are based on some intelligent statistical rules, and together they enhance the recognition rate.

Enhanced Good-Turing and Cat-Cal: Two New Methods for Estimating Probabilities of English Bigrams (abbreviated version)

Kenneth Church, William A. Gale

01 Jan 1989

TL;DR: This research is directed to 'backing-off' methods, that is, methods that build an (n+1)-gram model from an n-gram model.

Abstract: For many pattern recognition applications, including speech recognition and optical character recognition, prior models of language are used to disambiguate otherwise equally probable outputs. It is common practice to use tables of probabilities of single words, pairs of words, and triples of words (n-grams) as a prior model. Our research is directed to 'backing-off' methods, that is, methods that build an (n+1)-gram model from an n-gram model.

Proceedings Article•DOI•

Application Of Mathematical Morphology To Handwritten ZIP Code Recognition

Andrew M. Gillies¹, Paul D. Gader¹, Michael P. Whalen¹, Brian T. Mitchell¹•Institutions (1)

Environmental Research Institute of Michigan¹

01 Nov 1989

TL;DR: The morphological techniques used for preprocessing address block images, locating address block lines, splitting touching characters, and identifying handwritten numerals combine mathematical morphology, hierarchical matching of object models to symbolic image representations, and a strategy of propagating multiple hypotheses.

Abstract: This paper describes applications of mathematical morphology to a system for recognizing handwritten ZIP Codes. It discusses morphological techniques used for preprocessing address block images, locating address block lines, splitting touching characters, and identifying handwritten numerals. These techniques combine mathematical morphology, hierarchical matching of object models to symbolic image representations, and a strategy of propagating multiple hypotheses. The various submodules of the system have been trained on over two thousand real address block images and tested on one thousand representative images. On the one thousand test images, the system correctly located 82.5 percent, correctly identified 45.6 percent, and incorrectly classified only 0.8% of the ZIP Codes. This system performance level could lead to a significant cost savings in mail piece sorting.

Proceedings Article•DOI•

Optical machine recognition of Greek characters of any size

N. Alvertos¹, Ivan D'Cunha¹•Institutions (1)

Old Dominion University¹

09 Apr 1989

TL;DR: An algorithm for recognizing printed Greek letters is presented that consists of a preprocessor for thinning and noise removal and a classifier which differentiates each of the characters based on features such as existence of closed curve, number of intersections, number of free ends, horizontal and vertical symmetry, and existence of diagonal neighbors.

Abstract: The authors have developed and implemented an algorithm for recognizing printed Greek letters. The scheme consists of a preprocessor for thinning and noise removal and a classifier which differentiates each of the characters based on features such as existence of closed curve, number of intersections, number of free ends, horizontal and vertical symmetry, and existence of diagonal neighbors. An algorithm based on mathematical modeling of the characters to classify the handwritten Greek letters is also investigated. Experimental results indicate that the percentage of successful recognition is almost 100% depending on how well the characters were thinned.

Journal Article•DOI•

Coping with some really rotten problems in automatic music recognition

A.T. Clarke¹, B. M. Brown¹, M.P. Thorne•Institutions (1)

University of Wales¹

01 Aug 1989-Microprocessing and Microprogramming

TL;DR: Some of the problems encountered, and some of the techniques that have been used and implemented, during the development of an Optical Character Recognition system for printed music, are described.

Proceedings Article•DOI•

A Class Of Iterative Thresholding Algorithms For Real-Time Image Segmentation

Mohammad H. Hassan¹•Institutions (1)

Lawrence Technological University¹

27 Mar 1989

TL;DR: A real-time region growing algorithm, which locates the objects in the image while thresholding, is developed and implemented. The algorithms work in a raster-scan format, making them attractive for real-time image segmentation in situations requiring fast data throughput such as robot vision and character recognition.

Abstract: Thresholding algorithms are developed for segmenting gray-level images under nonuniform illumination. The algorithms are based on learning models generated from recursive digital filters which yield continuously varying threshold tracking functions. A real-time region growing algorithm, which locates the objects in the image while thresholding, is developed and implemented. The algorithms work in a raster-scan format, thus making them attractive for real-time image segmentation in situations requiring fast data throughput such as robot vision and character recognition.
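
To make the idea of a continuously varying threshold concrete, here is a minimal sketch that uses a first-order recursive filter (an exponential moving average) to track the local background level along one raster line. It illustrates the general technique only and is not the paper's algorithm. It assumes dark marks on a lighter background and that the first pixel of each row is background; `alpha`, `offset`, and the function name are hypothetical.

```python
def track_threshold_row(row, alpha=0.5, offset=30.0):
    """Binarise one raster line with a recursively tracked threshold.

    Returns a list of 0/1 flags, where 1 marks a pixel judged to be foreground.
    """
    if not row:
        return []
    background = float(row[0])  # assumption: the row starts on background
    mask = []
    for pixel in row:
        # Compare against the current tracked level before updating it.
        is_foreground = pixel < background - offset
        mask.append(1 if is_foreground else 0)
        if not is_foreground:
            # First-order recursive update: y[n] = (1 - alpha) * y[n-1] + alpha * x[n]
            background = (1.0 - alpha) * background + alpha * pixel
    return mask
```

For the row `[60, 60, 25, 60, 120, 180, 180, 100, 180]`, whose illumination rises from left to right, the tracker flags both the value 25 in the dim part and the value 100 in the bright part. No single global threshold can do that, because the dim background value 60 lies between them.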

Proceedings Article•DOI•

Stroke-Order Independent On-Line Recognition Of Handwritten Chinese Characters

Chang-Keng Lin¹, Bor-Shenn Jeng¹, Chun-Jen Lee²•Institutions (2)

National Central University¹, Ministry of Communications²

01 Nov 1989

TL;DR: This paper proposes an on-line handwritten Chinese character recognition system based on stroke-sequence feature extraction, using the finite state matching mechanism to extract primitive strokes, represented as stroke string, from the input character.

Abstract: This paper proposes an on-line handwritten Chinese character recognition system based on stroke-sequence feature extraction. The character to be recognized may be written with any stroke order and stroke number, with combined strokes, and at any size, provided it stays within the constraints of normal handwriting. First, the recognizer uses a finite state matching mechanism to extract primitive strokes, represented as a stroke string, from the input character. Second, the recognizer uses a modified dynamic programming matching method to perform recognition on the stroke-string features. Reference patterns have been generated for 2500 Chinese characters with stroke numbers ranging from 1 to 29. The recognition results are based on 1800 handwritten characters written by 10 people. The recognition rate obtained is 94.5%, and the cumulative classification rate over the four most similar characters is up to 98.7%. In the last part, a secondary recognition mechanism is used to further distinguish among the candidates involved. The final recognition rate may be raised to 99%.
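
The abstract above does not detail its modified dynamic programming matching, so the sketch below substitutes plain Levenshtein edit distance between stroke strings and returns the closest reference characters as a ranked candidate list. Unlike the method described, plain edit distance is sensitive to stroke order. The one-letter stroke codes (`h` horizontal, `v` vertical, `d` left-falling, `n` right-falling), the tiny reference table, and the function names are all illustrative assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two stroke strings, by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, stroke_a in enumerate(a, start=1):
        curr = [i]
        for j, stroke_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                           # delete a stroke from a
                curr[j - 1] + 1,                       # insert a stroke into a
                prev[j - 1] + (stroke_a != stroke_b),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def rank_candidates(strokes, references, k=4):
    """Return up to k reference characters, closest stroke string first."""
    ranked = sorted(references.items(), key=lambda item: edit_distance(strokes, item[1]))
    return [char for char, _ in ranked[:k]]


# Hypothetical reference table mapping a character to its stroke string.
REFERENCES = {"十": "hv", "大": "hdn", "木": "hvdn"}
```

For the input stroke string `"hvdnn"` (one spurious extra stroke), the ranking is 木 (distance 1), 大 (distance 2), 十 (distance 3). Returning a short candidate list rather than a single answer lets a secondary stage choose among the candidates.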

Proceedings Article•DOI•

Cartographic Character Recognition

Howard B. Rafal, Matthew O. Ward¹•Institutions (1)

Worcester Polytechnic Institute¹

01 Nov 1989

TL;DR: This work details a methodology for recognizing text elements on cartographic documents using blobbing, stringing, and recognition, which helps make decisions about string paths.

Abstract: This work details a methodology for recognizing text elements on cartographic documents. Cartographic Character Recognition differs from traditional OCR in that many fonts may occur on the same page, text may have any orientation, text may follow a curved path, and text may be interfered with by graphics. The technique presented reduces the process to three steps: blobbing, stringing, and recognition. Blobbing uses image processing techniques to turn the gray level image into a binary image and then separates the image into probable graphic elements and probable text elements. Stringing relates the text elements into words. This is done by using proximity information of the letters to create string contours. These contours also help to retrieve orientation information of the text element. Recognition takes the strings and associates a letter with each blob. The letters are first approximated using feature descriptions, resulting in a set of possible letters. Orientation information is then used to refine the guesses. Final recognition is performed using elastic matching. Feedback is employed at all phases of execution to refine the processing. Stringing and recognition give information that is useful in finding hidden blobs. Recognition helps make decisions about string paths. Results of this work are shown.

Perception of multi-author handprinted text

R.R. Malyan, S. Sunthankar, H. Teranchi, A. Yeghiazarian

02 Oct 1989

Journal Article•DOI•

Keyless entry: building a text database using OCR technology

C. W. Grotophorst¹•Institutions (1)

George Mason University¹

03 Jan 1989-Library Hi Tech

TL;DR: A prototypical “local” project—the creation of a full‐text database of dissertations done at George Mason University—has been undertaken by the Fenwick Library at that institution.

Abstract: Optical character recognition (OCR) technology can be employed to produce an ASCII‐text database for mounting on computer systems. Current technologies and principles of scanning and OCR are discussed. A prototypical “local” project—the creation of a full‐text database of dissertations done at George Mason University—has been undertaken by the Fenwick Library at that institution. Problems encountered with current scanning and OCR technologies are illustrated and discussed, as well as techniques and “filter” programs developed to streamline the scanning and OCR conversion process.
