Showing papers on "Optical character recognition" published in 1983

Journal Article•DOI•

Image thresholding for optical character recognition and other applications requiring character image extraction

J. M. White¹, G. D. Rohrer¹

01 Jul 1983-IBM Journal of Research and Development

TL;DR: Two new, cost-effective thresholding algorithms for use in extracting binary images of characters from machine- or hand-printed documents are described, with a more aggressive approach directed toward specialized, high-volume applications which justify extra complexity.

Abstract: Two new, cost-effective thresholding algorithms for use in extracting binary images of characters from machine- or hand-printed documents are described. The creation of a binary representation from an analog image requires such algorithms to determine whether a point is converted into a binary one because it falls within a character stroke or a binary zero because it does not. This thresholding is a critical step in Optical Character Recognition (OCR). It is also essential for other Character Image Extraction (CIE) applications, such as the processing of machine-printed or handwritten characters from carbon copy forms or bank checks, where smudges and scenic backgrounds, for example, may have to be suppressed. The first algorithm, a nonlinear, adaptive procedure, is implemented with a minimum of hardware and is intended for many CIE applications. The second is a more aggressive approach directed toward specialized, high-volume applications which justify extra complexity.

283 citations
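
The adaptive thresholding idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give; it is a generic local-mean threshold, and the function name `adaptive_threshold` and the parameters `window` and `bias` are assumptions made for illustration. A point becomes a binary one when it is darker than its neighborhood average by a margin, which suppresses slowly varying backgrounds that a single global threshold would misclassify.

```python
def adaptive_threshold(image, window=3, bias=10):
    """Binarize a grayscale image given as a list of rows (0 = black, 255 = white).

    Hypothetical helper: a pixel becomes 1 (character stroke) when it is darker
    than the mean of its local window by more than `bias`, otherwise 0.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    half = window // 2
    binary = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighborhood, clipped at the image border.
            neighbors = [
                image[ny][nx]
                for ny in range(max(0, y - half), min(height, y + half + 1))
                for nx in range(max(0, x - half), min(width, x + half + 1))
            ]
            local_mean = sum(neighbors) / len(neighbors)
            row.append(1 if image[y][x] < local_mean - bias else 0)
        binary.append(row)
    return binary
```

Because the comparison is relative to the local mean, a dark stroke on a gray background (for example a value of 20 among values of 100) is still separated from that background, whereas a fixed global threshold of 128 would mark every pixel as stroke.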

Journal Article•DOI•

Integrating diverse knowledge sources in text recognition

Sargur N. Srihari¹, Jonathan J. Hull¹, Ramesh Choudhari¹

University at Buffalo¹

01 Jan 1983-ACM Transactions on Information Systems

TL;DR: An algorithm for text recognition/correction that effectively merges a bottom-up refinement process that is based on the utilization of transitional probabilities and letter confusion probabilities, known as the Viterbi algorithm [VA], together with a top-down process based on searching a trie structure representation of a lexicon.

Abstract: The capabilities of present commercial machines for producing correct text by recognizing words in print, handwriting and speech are very limited. For example, most optical character recognition [OCR] machines are limited to a few fonts of machine print, or text that is handprinted under certain constraints; any deviation from these constraints will produce highly garbled text. This paper describes an algorithm for text recognition/correction that effectively merges a bottom-up refinement process that is based on the utilization of transitional probabilities and letter confusion probabilities, known as the Viterbi algorithm [VA], together with a top-down process based on searching a trie structure representation of a lexicon. The algorithm is applicable to text containing an arbitrary number of character substitution errors such as that produced by OCR machines.

109 citations
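
The combination described in the abstract above, probabilistic scoring constrained by a trie-structured lexicon, can be sketched roughly as follows. This is a simplified illustration, not the paper's algorithm: it walks the trie exhaustively instead of running a full Viterbi recursion, it handles substitution errors only, and the names `build_trie`, `correct_word`, the start symbol `"^"`, the end marker `"$"`, and the `floor` fallback probability are all assumptions.

```python
import math


def build_trie(words):
    """Store the lexicon as nested dicts; the key "$" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node["$"] = True
    return root


def correct_word(observed, trie, transition, confusion, floor=1e-6):
    """Return the lexicon word of len(observed) with the highest log score.

    transition[(prev, cur)]: P(cur | prev), with "^" as the start symbol.
    confusion[(true, seen)]: P(seen | true).
    Missing table entries fall back to `floor` (an assumption of this sketch).
    """
    best_word, best_score = None, -math.inf
    # Each stack entry: (trie node, letters chosen so far, accumulated log score).
    stack = [(trie, "", 0.0)]
    while stack:
        node, prefix, score = stack.pop()
        position = len(prefix)
        if position == len(observed):
            if "$" in node and score > best_score:
                best_word, best_score = prefix, score
            continue
        previous = prefix[-1] if prefix else "^"
        for letter, child in node.items():
            if letter == "$":
                continue
            step = math.log(transition.get((previous, letter), floor))
            step += math.log(confusion.get((letter, observed[position]), floor))
            stack.append((child, prefix + letter, score + step))
    return best_word
```

Only paths that exist in the trie are ever scored, so the output is always a lexicon word; if no word of the observed length exists, the function returns `None`.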

Patent•

Method of optical character recognition

Warner C. Scott¹

Texas Instruments¹

01 Aug 1983

TL;DR: A method is presented for recognizing a character and providing a corresponding output, in which the character is received by an imager, digitized, and transmitted to a memory.

Abstract: A method for recognizing and providing an output corresponding to a character in which the character is received by an imager, digitized, and transmitted to a memory. Data in the memory is read in a sequence which circumnavigates the test character. Only data representative of the periphery of the character are read. During the circumnavigation, character parameters, such as height, width, perimeter, area and waveform are determined. The character parameters are compared with reference character parameters and the ASCII code for the reference character which matches the character is provided as an output.

52 citations
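
The parameter-matching step in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: it scans the whole bitmap rather than circumnavigating the character's periphery, it omits the waveform parameter, and the names `shape_parameters` and `classify`, the edge-count definition of perimeter, and the sum-of-absolute-differences comparison are all hypothetical choices.

```python
def shape_parameters(bitmap):
    """Return (height, width, perimeter, area) of the 1-pixels in a binary bitmap."""
    pixels = [(y, x) for y, row in enumerate(bitmap)
              for x, value in enumerate(row) if value]
    if not pixels:
        return (0, 0, 0, 0)
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    filled = set(pixels)
    # Perimeter: count pixel edges that face background or the image border.
    perimeter = sum(
        (y + dy, x + dx) not in filled
        for y, x in pixels
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )
    return (max(ys) - min(ys) + 1, max(xs) - min(xs) + 1, perimeter, len(pixels))


def classify(bitmap, references):
    """Return the label whose reference parameters are closest to the bitmap's."""
    params = shape_parameters(bitmap)
    return min(
        references,
        key=lambda label: sum(abs(a - b) for a, b in zip(params, references[label])),
    )
```

Here `references` maps each label to a parameter tuple measured from a known character, so a slightly taller or shorter test character still matches the nearest reference.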

Journal Article•DOI•

A processor-based OCR system

Richard G. Casey¹, C. R. Jih¹

IBM¹

01 Jul 1983-IBM Journal of Research and Development

TL;DR: A previously developed classification technique, based on decision trees, has been extended in order to improve reading accuracy in an environment of considerable character variation, including the possibility that documents in the same font style may be produced using quite different print technologies.

Abstract: A low-cost optical character recognition (OCR) system can be realized by means of a document scanner connected to a CPU through an interface. The interface performs elementary image processing functions, such as noise filtering and thresholding of the video image from the scanner. The processor receives a binary image of the document, formats the image into individual character patterns, and classifies the patterns one-by-one. A CPU implementation is highly flexible and avoids much of the development and manufacturing costs for special-purpose, parallel circuitry typically used in commercial OCR. A processor-based recognition system has been investigated for reading documents printed in fixed-pitch conventional type fonts, such as occur in routine office typing. Novel, efficient methods for tracking a print line, resolving it into individual character patterns, detecting underscores, and eliminating noise have been devised. A previously developed classification technique, based on decision trees, has been extended in order to improve reading accuracy in an environment of considerable character variation, including the possibility that documents in the same font style may be produced using quite different print technologies. The system has been tested on typical office documents, and also on artificial stress documents, obtained from a variety of typewriters.

27 citations

Journal Article•DOI•

Revising Documents with Text Editors, Handwriting Recognition, and Speech Recognition Systems

John D. Gould

01 Oct 1983

TL;DR: Human factors research has not been directed at understanding this process.

Abstract: A main task of secretaries and typists is to retype documents after they have been edited in pencil by principals. Increasingly, they use word processing systems to do this. In addition, some principals type their own revisions after first making them in pencil. We have informally observed that people using text editors spend much of their time in (a) visual search (looking back and forth between the manuscript and the screen); (b) decision making (deciding how to locate the right place in the computer file, deciding how to make the revision); and (c) rereading. Actual time spent executing a command seems small in comparison. Human factors research has not been directed at understanding this process.

9 citations

DOI•

A digital image preprocessor for optical character recognition

Wolfram H.H.J. Lunscher

01 Jan 1983

1 citation

Optical Character Recognition for Automated Cartography: The Advanced Development Handprinted Symbol Recognition System.

Robert M Brown, C F Cheng

01 Mar 1983

TL;DR: The DMA Subtask objectives are provided and the general structure of the Handprinted Symbol Recognition System is outlined, which considers the key issues of information content, problems in the thinning or vectorization of a character, shape measurement and feature extraction, and finally character recognition or labeling.

Abstract: This NORDA Technical Note is composed of five chapters. The first chapter presents an overview of optical character recognition (OCR) and its relation to the automated cartography environment. It provides the DMA Subtask objectives and discusses them in the light of symbol digitizing and information transformations. The division of a total OCR system into data acquisition/document management and isolated character recognition is considered along with NORDA's recent tasking (FY-82) prototype for DMA production centers. Chapter Two presents a discussion of the different ways in which recognition systems are constructed. In particular, it considers the differences in approach necessary for constrained and free-form OCR. Chapter Three describes the DMA environment in which a handprinted OCR system must operate and discusses performance requirements. The general structure of the Handprinted Symbol Recognition System is outlined in Chapter Four. This material considers the key issues of information content, problems in the thinning or vectorization of a character, shape measurement and feature extraction, and finally character recognition or labeling. The interaction between each of these elements is emphasized. Chapter Five provides a brief summary of the current Subtask accomplishments and status along with areas where work is in progress toward developing other handprinted OCR capabilities for DMA.

1 citation

Proceedings Article•DOI•

Development of a Hand-Held Camera for Inkprint to Text Translation

Sally L. Wood

22 Jun 1983

TL;DR: In this paper, a handheld camera which translates inkprint into computer readable text has a wide variety of potential applications, such as the input device for a voice output reading machine for the visually impaired.

Abstract: A handheld camera which translates inkprint into computer readable text has a wide variety of potential applications, such as the input device for a voice output reading machine for the visually impaired. Commercial optical character recognition systems typically operate on controlled input and read with high speed and accuracy. As an input device for a voice output reading machine, much slower speeds of 200 words per minute are acceptable, but the text input is much less controlled in terms of quality, type size, style and format. In order to acquire text from a complex format which may include multiple columns, pictures, and graphs, operator control of the scanning is important. Automatic control of threshold and magnification combined with user control of the scanning sequence based on direct feedback from the camera image offers a potentially efficient input structure. Automatic thresholding and magnification control algorithms which work well on newsprint and good quality type are presented. Based on these results, spatial resolution and quantization requirements can be established for a system which will read text with an order of magnitude variation in size. Direct conversion auditory and tactile feedback for user control of scanning are considered.
