Showing papers on "Intelligent word recognition" published in 1995

Comparison of learning algorithms for handwritten digit recognition

Yann LeCun, Lawrence D. Jackel, Léon Bottou¹,², A. Brunot, Corinna Cortes²,³, John S. Denker²,⁴, Harris Drucker⁴,⁵, Isabelle Guyon⁴, Urs A. Muller, E. Sackinger⁴, Patrice Y. Simard²,⁶, Vladimir Vapnik•Institutions (6)

École Normale Supérieure¹, AT&T², Google³, Alcatel-Lucent⁴, Monmouth College⁵, Microsoft⁶

01 Jan 1995

TL;DR: This comparison of several learning algorithms for handwritten digits considers not only raw accuracy, but also rejection, training time, recognition time, and memory requirements.

Abstract: This paper compares the performance of several classifier algorithms on a standard database of handwritten digits. We consider not only raw accuracy, but also rejection, training time, recognition time, and memory requirements.

633 citations

Learning algorithms for classification: A comparison on handwritten digit recognition

Yann LeCun, Lawrence D. Jackel, Léon Bottou¹,², Corinna Cortes¹,³, John S. Denker¹,⁴, Harris Drucker⁴,⁵, Isabelle Guyon⁴, Urs A. Muller, E. Sackinger⁴, Patrice Y. Simard¹,⁶, Vladimir Vapnik•Institutions (6)

AT&T¹, École Normale Supérieure², Google³, Alcatel-Lucent⁴, Monmouth College⁵, Microsoft⁶

01 Jan 1995

TL;DR: This paper compares the performance of several classifier algorithms on a standard database of handwritten digits by considering not only raw accuracy, but also training time, recognition time, and memory requirements.

Abstract: This paper compares the performance of several classifier algorithms on a standard database of handwritten digits. We consider not only raw accuracy, but also training time, recognition time, and memory requirements. When available, we report measurements of the fraction of patterns that must be rejected so that the remaining patterns have misclassification rates less than a given threshold.
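
The rejection measurement mentioned in this abstract can be illustrated with a minimal sketch. It assumes each pattern comes with a classifier confidence score and that the least confident patterns are rejected first; the function name `rejection_fraction` and the toy data are hypothetical and not taken from the paper.

```python
def rejection_fraction(confidences, correct, max_error):
    """Smallest fraction of patterns to reject (lowest confidence first)
    so that the error rate on the accepted patterns is <= max_error."""
    # Rank patterns from most to least confident.
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    best_kept = 0
    errors = 0
    for kept, i in enumerate(order, start=1):
        if not correct[i]:
            errors += 1
        # Remember the largest accepted set that still meets the target.
        if errors / kept <= max_error:
            best_kept = kept
    return 1.0 - best_kept / len(order)


# Toy example: five patterns, two of them misclassified.
conf = [0.9, 0.8, 0.7, 0.6, 0.5]
ok = [True, True, False, True, False]
frac = rejection_fraction(conf, ok, max_error=0.25)
```

Ties in confidence are broken arbitrarily here; a real evaluation would reject all patterns below a single threshold.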

451 citations

Journal Article•DOI•

Competition and segmentation in spoken word recognition

Dennis Norris, James M. McQueen¹, Anne Cutler¹•Institutions (1)

Max Planck Society¹

01 Sep 1995-Journal of Experimental Psychology: Learning, Memory and Cognition

TL;DR: This paper showed that competition between simultaneously active word candidates can modulate the size of prosodic effects, which suggests that spoken-word recognition must be sensitive both to prosodic structure and to the effects of competition.

Abstract: Spoken utterances contain few reliable cues to word boundaries, but listeners nonetheless experience little difficulty identifying words in continuous speech. The authors present data and simulations that suggest that this ability is best accounted for by a model of spoken-word recognition combining competition between alternative lexical candidates and sensitivity to prosodic structure. In a word-spotting experiment, stress pattern effects emerged most clearly when there were many competing lexical candidates for part of the input. Thus, competition between simultaneously active word candidates can modulate the size of prosodic effects, which suggests that spoken-word recognition must be sensitive both to prosodic structure and to the effects of competition. A version of the Shortlist model (D. G. Norris, 1994b) incorporating the Metrical Segmentation Strategy (A. Cutler & D. Norris, 1988) accurately simulates the results using a lexicon of more than 25,000 words.

267 citations

Journal Article•DOI•

Machine printed character segmentation - An overview

Yi Lu¹•Institutions (1)

University of Michigan¹

01 Jan 1995-Pattern Recognition

TL;DR: An overview of character segmentation techniques for machine-printed documents is presented, covering techniques for segmenting uniformed or proportional fonts and broken and touching characters, as well as techniques based on text image features and techniques based on recognition results.

206 citations

Journal Article•DOI•

Models of continuous speech recognition and the contents of the vocabulary

James M. McQueen¹, Anne Cutler¹, Ted Briscoe², Dennis Norris³•Institutions (3)

Max Planck Society¹, University of Cambridge², Medical Research Council³

01 Aug 1995-Language and Cognitive Processes

TL;DR: This paper showed that an overwhelming majority (84%) of polysyllables have shorter words embedded within them and that these embeddings are most common at the onsets of the longer word.

Abstract: Several models of spoken word recognition postulate that recognition is achieved via a process of competition between lexical hypotheses. Competition not only provides a mechanism for isolated word recognition, it also assists in continuous speech recognition, since it offers a means of segmenting continuous input into individual words. We present statistics on the pattern of occurrence of words embedded in the polysyllabic words of the English vocabulary, showing that an overwhelming majority (84%) of polysyllables have shorter words embedded within them. Positional analyses show that these embeddings are most common at the onsets of the longer word. Although both phonological and syntactic constraints could rule out some embedded words, they do not remove the problem. Lexical competition provides a means of dealing with lexical embedding. It is also supported by a growing body of experimental evidence. We present results which indicate that competition operates both between word candidates that...

94 citations

Proceedings Article•DOI•

Real-time on-line unconstrained handwriting recognition using statistical methods

Krishna Sundaram Nathan¹, Homayoon S. M. Beigi¹, Jayashree Subrahmonia¹, G.J. Clary¹, H. Maruyama¹•Institutions (1)

IBM¹

09 May 1995

TL;DR: A general recognition system for large vocabulary, writer independent, unconstrained handwritten text, that performs recognition in real-time on 486 class PC platforms without the large amounts of memory required for traditional HMM based systems.

Abstract: We address the problem of automatic recognition of unconstrained handwritten text. Statistical methods, such as hidden Markov models (HMMs) have been used successfully for speech recognition and they have been applied to the problem of handwriting recognition as well. We discuss a general recognition system for large vocabulary, writer independent, unconstrained handwritten text. "Unconstrained" implies that the user may write in any style e.g. printed, cursive or in any combination of styles. This is more representative of typical handwritten text where one seldom encounters purely printed or purely cursive forms. Furthermore, a key characteristic of the system is that it performs recognition in real-time on 486 class PC platforms without the large amounts of memory required for traditional HMM based systems. We focus mainly on the writer independent task. Some initial writer dependent results are also reported. An error rate of 18.9% is achieved for a writer-independent 21,000 word vocabulary task in the absence of any language models.

86 citations

Journal Article•DOI•

Character recognition without segmentation

Jairo Rocha, Theodosios Pavlidis¹•Institutions (1)

State University of New York System¹

01 Sep 1995-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A segmentation-free approach to OCR is presented as part of a knowledge-based word interpretation model; it recognizes subgraphs homeomorphic to previously defined character prototypes and identifies gaps as potential parts of characters using a variant of the notion of relative neighborhood used in computational perception.

Abstract: A segmentation-free approach to OCR is presented as part of a knowledge-based word interpretation model. It is based on the recognition of subgraphs homeomorphic to previously defined prototypes of characters. Gaps are identified as potential parts of characters by implementing a variant of the notion of relative neighborhood used in computational perception. Each subgraph of strokes that matches a previously defined character prototype is recognized anywhere in the word even if it corresponds to a broken character or to a character touching another one. The characters are detected in the order defined by the matching quality. Each subgraph that is recognized is introduced as a node in a directed net that compiles different alternatives of interpretation of the features in the feature graph. A path in the net represents a consistent succession of characters. A final search for the optimal path under certain criteria gives the best interpretation of the word features. Broken characters are recognized by looking for gaps between features that may be interpreted as part of a character. Touching characters are recognized because the matching allows nonmatched adjacent strokes. The recognition results for over 24,000 printed numeral characters belonging to a USPS database and on some hand-printed words confirmed the method's high robustness level.

62 citations

Proceedings Article•DOI•

Segmentation-free word recognition with application to Arabic

B. Al-Badr¹, Robert M. Haralick¹•Institutions (1)

University of Washington¹

14 Aug 1995

TL;DR: A system that recognizes machine-printed Arabic words without prior segmentation based on describing symbols in terms of shape primitives is described, which shows a recognition rate of 99.4% for noise-free text and 73% for scanned text.

Abstract: This paper describes the design and implementation of a system that recognizes machine-printed Arabic words without prior segmentation. The technique is based on describing symbols in terms of shape primitives. At recognition time, the primitives are detected on a word image using mathematical morphology operations. The system then matches the detected primitives with symbol models. This leads to a spatial arrangement of matched symbol models. The system conducts a search in the space of spatial arrangements of models and outputs the arrangement with the highest posterior probability as the recognition of the word. The advantage of using this whole word approach versus a segmentation approach is that the result of recognition is optimized with regard to the whole word. Results of preliminary experiments using a lexicon of 42,000 words show a recognition rate of 99.4% for noise-free text and 73% for scanned text.

61 citations

Book Chapter•DOI•

Visual Word Recognition: An Overview

Mark S. Seidenberg

01 Jan 1995

TL;DR: This chapter provides an overview of current research in this area, focusing on issues concerning the types of processing mechanisms and knowledge representations that are involved.

Abstract: Visual word recognition is the aspect of reading for which the use of the visual perceptual channel has the largest impact on the comprehension process. This chapter provides an overview of current research in this area, focusing on issues concerning the types of processing mechanisms and knowledge representations that are involved. Whereas recognizing and pronouncing letter patterns are specific to this medium, most of the other capacities that are utilized in understanding a text are also used in comprehending spoken language. Thus, it is within the domain of word recognition that most of what is specific to reading comprehension rather than a reflection of more general linguistic factors, including the bases of reading-specific deficits and individual differences, is found. The chapter notes that visual word recognition exists only because of the invention of writing systems, a relatively recent development in the span of human history. Reading exploits perceptual and cognitive capacities that did not evolve specifically for this function and provides an interesting domain in which to explore them.

43 citations

Proceedings Article•DOI•

The handwritten trie: indexing electronic ink

Walid G. Aref¹, Daniel Barbará¹, Padmavathi Vallabhaneni¹•Institutions (1)

Princeton University¹

22 May 1995

TL;DR: An indexing technique based on Hidden Markov Models that dramatically improves the search time in a database of handwritten words and provides means for controlling the matching quality of the search process via a time-based budget is proposed.

Abstract: The emergence of the pen as the main interface device for personal digital assistants and pen-computers has made handwritten text, and more generally ink, a first-class object. As for any other type of data, the need of retrieval is a prevailing one. Retrieval of handwritten text is more difficult than that of conventional data since it is necessary to identify a handwritten word given slightly different variations in its shape. The current way of addressing this is by using handwriting recognition, which is prone to errors and limits the expressiveness of ink. Alternatively, one can retrieve from the database handwritten words that are similar to a query handwritten word using techniques borrowed from pattern and speech recognition. In particular, Hidden Markov Models (HMM) can be used as representatives of the handwritten words in the database. However, using HMM techniques to match the input against every item in the database (sequential searching) is unacceptably slow and does not scale up for large ink databases. In this paper, an indexing technique based on HMMs is proposed. The new index is a variation of the trie data structure that uses HMMs and a new search algorithm to provide approximate matching. Each node in the tree contains handwritten letters, where each letter is represented by an HMM. Branching in the trie is based on the ranking of matches given by the HMMs. The new search algorithm is parametrized so that it provides means for controlling the matching quality of the search process via a time-based budget. The index dramatically improves the search time in a database of handwritten words. Due to the variety of platforms for which this work is aimed, ranging from personal digital assistants to desktop computers, we implemented both main-memory and disk-based systems. The implementations are reported in this paper, along with performance results that show the practicality of the technique under a variety of conditions.

40 citations

Proceedings Article•DOI•

Handwritten word recognition for real-time applications

Gyeonghwan Kim¹, Venu Govindaraju•Institutions (1)

State University of New York System¹

14 Aug 1995

TL;DR: A fast handwritten word recognition system for real time applications is presented and dynamic matching between each character of a lexicon entry and segment(s) of input word image is used for ranking words in the lexicon.

Abstract: A fast handwritten word recognition system for real time applications is presented. Preprocessing, segmentation and feature extraction are implemented using chain code representation. Dynamic matching between each character of a lexicon entry and segment(s) of input word image is used for ranking words in the lexicon. Speed of the entire recognition process is about 200 msec on a single SPARC-10 platform for lexicon size of 10. A top choice performance of 96% is achieved on a database of postal words captured at 212 dpi.

Proceedings Article•DOI•

Graph-based handwritten digit string recognition

A. Filatov, A. Gitis, I. Kil

14 Aug 1995

TL;DR: A set of acceptable graph transformations corresponding to typical variations of the handwritten symbols allows us to solve the problems of structure recognition methods caused by a high variability of handwritten symbol topology.

Abstract: The article presents a handwritten digit string recognition algorithm based on matching input subgraphs with prototype symbol graphs. The article defines a set of acceptable graph transformations corresponding to typical variations of the handwritten symbols. The search for a match between the input subgraph and prototype graph is conducted using this set of transformations. This approach allows us to solve the problems of structure recognition methods caused by a high variability of handwritten symbol topology. The article presents experimental results of the handwritten digit string recognition system.

Proceedings Article•DOI•

Keyword spotting via word shape recognition

Jeff L. DeCurtins¹, Edward C. Chen¹•Institutions (1)

SRI International¹

30 Mar 1995

TL;DR: This paper describes a system developed for the detection of isolated words, word portions, as well as multi-word phrases in images of documents and provides for automated training of desired keywords and creation of indexing filters to speed matching.

Abstract: With the advent of on-line access to very large collections of document images, electronic classification into areas of interest has become possible. A first approach to classification might be the use of OCR on each document followed by analysis of the resulting ASCII text. But if the quality of a document is poor, the format unconstrained, or time is critical, complete OCR of each image is not appropriate. An alternative approach is the use of word shape recognition (as opposed to individual character recognition) and the subsequent classification of documents by the presence or absence of selected keywords. Use of word shape recognition not only provides a more robust collection of features but also eliminates the need for character segmentation (a leading cause of error in OCR). In this paper we describe a system we have developed for the detection of isolated words, word portions, as well as multi-word phrases in images of documents. It is designed to be used with large, changeable, keyword sets and very large document sets. The system provides for automated training of desired keywords and creation of indexing filters to speed matching.

Patent•

Method of registering a character pattern into a user dictionary and a character recognition apparatus having the user dictionary

Takashi Harada¹, Katsuhiko Sakaguchi¹, Shigeki Mori¹, Kazuhiro Matsubayashi¹, Tsunekazu Arai¹, Eiji Takasu¹, Hiroto Yoshii¹•Institutions (1)

Canon Inc.¹

09 Feb 1995

TL;DR: An input handwritten character pattern is subjected to character recognition processing, and the recognition reliability of the character as a standard characteristic feature pattern is determined from the recognition result.

Abstract: In the present invention, an input handwritten character pattern is subjected to character recognition processing, and a recognition reliability of the character as a standard characteristic feature pattern is determined from the recognition result. If the recognition reliability is low, a warning is issued. In response to the warning, a user or operator can decide whether the character pattern should be registered in the user dictionary (106). If it is decided that the character pattern should be registered in the user dictionary, the character pattern is stored in the user dictionary together with information indicating that the character pattern has low recognition reliability. When character patterns registered in the user dictionary are displayed on a screen, they are displayed in such a manner that characters having low recognition reliability can be distinguished from characters having high recognition reliability. There is also provided a user name index file (5309) for storing information regarding characteristic features of a handwritten character pattern peculiar to a specific user. Furthermore, there is also provided a password input-and-decision part (5103) for deciding whether or not to allow access to the user dictionary based on the information of the handwritten character pattern input by a specific user.

Proceedings Article•DOI•

The analysis of error in an on-line recognition system of Arabic handwritten characters

Adel M. Alimi¹, O.A. Ghorbel•Institutions (1)

École Normale Supérieure¹

14 Aug 1995

TL;DR: The effects of some parameters of an on-line Arabic handwritten character recognition system on its performance and recognition errors are presented, showing a way to select optimum values for key parameters so as to minimize recognition error rates.

Abstract: In this paper we describe a system that recognizes on-line Arabic handwritten characters. In this system, a dynamic programming algorithm is implemented. We present here the effects of some parameters of the system on its performance and on its recognition errors. This study shows a way to select the optimum values for some key parameters of the system to obtain minimum recognition error rates.

Journal Article•DOI•

Handprinted word recognition on a NIST data set

Paul D. Gader¹, Michael P. Whalen, Margaret J. Ganzberger², Dan Hepp²•Institutions (2)

University of Missouri¹, Environmental Research Institute of Michigan²

08 Jan 1995

TL;DR: An approach to handprinted word recognition is described, based on generating multiple possible segmentations of a word image into characters and matching these segmentations to a lexicon of candidate strings.

Abstract: An approach to handprinted word recognition is described. The approach is based on generating multiple possible segmentations of a word image into characters and matching these segmentations to a lexicon of candidate strings. The segmentation process uses a combination of connected component analysis and distance transform-based splitting of connected characters. Neural networks are used to assign character confidence values to potential characters within word images. Experimental results are provided for both character and word recognition modules on data extracted from the NIST handprinted character database.

Book Chapter•DOI•

Fuzzy Full-Text Searches in OCR Databases

Andreas Myka¹, Ulrich Güntzer¹•Institutions (1)

University of Tübingen¹

15 May 1995

TL;DR: This chapter illustrates some of the possible methods that cope with the uncertainty of the database entries and add fuzziness to precisely formulated queries in order to increase their recall.

Abstract: Though the quality of optical character recognition software is steadily improving, it is still far from being perfect. As a result, full-text databases that are filled by means of OCR software contain many errors. These errors have to be taken into consideration if such databases are examined by means of full-text searches. In this chapter, we will illustrate some of the possible methods that, to a certain extent, cope with the uncertainty of the database entries. These methods add fuzziness to precisely formulated queries in order to increase their recall. In addition, the described methods are compared to the method of matching query terms exactly: the preliminary results of tests that show their effects on recall and precision are given.
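
As a rough illustration of adding fuzziness to a query, the sketch below matches a query term against OCR output while tolerating a bounded number of character errors. Edit distance is used here as one common choice; the abstract does not say which methods the chapter actually covers, and the names `edit_distance` and `fuzzy_search` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def fuzzy_search(query, words, max_errors=1):
    """Return the words within max_errors edits of the query."""
    return [w for w in words if edit_distance(query, w) <= max_errors]


# Toy OCR output containing typical confusions (i/l, g/q, i/1).
ocr_words = ["digit", "dlgit", "diqit", "dig1t", "night"]
hits = fuzzy_search("digit", ocr_words, max_errors=1)
```

Raising `max_errors` increases recall at the cost of precision, which is the trade-off the abstract says the chapter evaluates.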

Proceedings Article•DOI•

Visual inter-word relations and their use in OCR postprocessing

T. Hong¹, J.J. Hull¹•Institutions (1)

State University of New York System¹

14 Aug 1995

TL;DR: A technique is presented that uses visual relationships between word images in a document to improve the recognition of the text it contains and the resulting clusters are integrated with the recognition results provided by an OCR system.

Abstract: A technique is presented that uses visual relationships between word images in a document to improve the recognition of the text it contains. This technique takes advantage of the visual relationships between word images that are usually lost in most conventional optical character recognition (OCR) techniques. The visual relations are defined to be the equivalence that exists between images of the same word or portions of word images. An algorithm is presented that calculates these relationships in a document. The resulting clusters are integrated with the recognition results provided by an OCR system. Inconsistencies in OCR results between equivalent images are identified and used to improve recognition performance. Experimental results are presented in which the input is provided directly from a commercial OCR system.
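
A minimal sketch of how clusters of visually equivalent word images might be combined with OCR output follows. It assumes the clusters are already given as lists of word positions and resolves inconsistencies by majority vote; the voting rule and the name `correct_by_cluster` are illustrative assumptions, not the authors' stated procedure.

```python
from collections import Counter


def correct_by_cluster(ocr_words, clusters):
    """Replace OCR results inside each cluster of equivalent word images
    with the cluster's majority reading; leave tied clusters untouched."""
    corrected = list(ocr_words)
    for cluster in clusters:
        votes = Counter(ocr_words[i] for i in cluster)
        ranked = votes.most_common(2)
        # A tie gives no evidence for either reading, so skip it.
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            continue
        for i in cluster:
            corrected[i] = ranked[0][0]
    return corrected


# Positions 0-2 show the same word image; so do positions 3-4.
ocr = ["the", "tbe", "the", "cat", "cot"]
fixed = correct_by_cluster(ocr, [[0, 1, 2], [3, 4]])
```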

Proceedings Article•DOI•

Segmentation and recognition of handwritten characters using subspace method

Yasuo Ariki¹, Y. Motegi•Institutions (1)

Ryukoku University¹

14 Aug 1995

TL;DR: This paper proposes a method for segmenting characters freely written on paper that performs character recognition during the segmentation process, based on a subspace method.

Abstract: Segmentation of characters freely written on paper is a difficult problem for a computer system. Conventionally, this problem has been dealt with by image processing such as horizontal or vertical projection. But this approach sometimes splits and merges the character images and fails to segment them correctly, because it lacks character recognition ability in the segmentation process. We propose in this paper a method to solve this problem by performing character recognition in the segmentation process based on a subspace method. First, a binary image on which characters are written is scanned by a window of fixed scale. At every scanning location, 196 (7 × 7 × 4) features are obtained and projected onto each character subspace. Character recognition using the subspace method is carried out, and the character name (or group name) and its confidence are obtained. Since this character segmentation based on the subspace method performs character recognition simultaneously, it can be applied to both isolated and cursively written characters.

Journal Article•DOI•

Recognition of chain-coded handwritten character images with scanning n-tuple method

Simon M. Lucas¹, A. Amiri¹•Institutions (1)

University of Essex¹

23 Nov 1995-Electronics Letters

TL;DR: A method of applying n-tuple recognition techniques to handwritten OCR, which involves scanning an n-tuple classifier over a chain-code of the image, is described, offering superior recognition accuracy, as demonstrated by results on three widely used data sets.

Abstract: A method of applying n-tuple recognition techniques to handwritten OCR, which involves scanning an n-tuple classifier over a chain-code of the image, is described. The traditional advantages of n-tuple recognition, i.e. training and recognition speed, are retained, while offering superior recognition accuracy, as demonstrated by results on three widely used data sets.
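
The idea of scanning an n-tuple over a chain code can be sketched as follows. This is a simplified illustration under stated assumptions: chain codes are strings over eight direction symbols, one tuple size and gap are used, and classes are scored by smoothed log-probabilities of the observed tuples. The helper names (`tuples`, `train`, `classify`) and all parameter values are hypothetical and not taken from the letter.

```python
import math
from collections import Counter, defaultdict


def tuples(chain, n=3, gap=1):
    """All n-tuples sampled every `gap` symbols, scanned along the chain."""
    span = (n - 1) * gap
    return [tuple(chain[i + k * gap] for k in range(n))
            for i in range(len(chain) - span)]


def train(samples, n=3, gap=1):
    """Count n-tuple occurrences per class from (chain, label) pairs."""
    counts = defaultdict(Counter)
    for chain, label in samples:
        counts[label].update(tuples(chain, n, gap))
    return counts


def classify(chain, counts, n=3, gap=1, alpha=1.0, n_codes=8):
    """Pick the class with the highest smoothed log-likelihood."""
    def score(label):
        c = counts[label]
        total = sum(c.values()) + alpha * n_codes ** n
        return sum(math.log((c[t] + alpha) / total)
                   for t in tuples(chain, n, gap))
    return max(counts, key=score)


# Toy chain codes: class "A" has long straight runs, class "B" zigzags.
model = train([("00002222", "A"), ("01010101", "B")])
label_a = classify("00022222", model)
label_b = classify("10101010", model)
```

Training and recognition both reduce to table lookups and additions, which is consistent with the speed advantage the abstract attributes to n-tuple methods.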

Journal Article•DOI•

Computer recognition of printed Bangla script

Umapada Pal¹, Bidyut B. Chaudhuri¹•Institutions (1)

Indian Statistical Institute¹

01 Nov 1995-International Journal of Systems Science

TL;DR: A complete OCR system for Bangla, the second most popular script in the Indian subcontinent, is described, where more than three hundred character shapes are recognized by a combination of template and feature-matching approach.

Abstract: This paper considers optical character recognition (OCR) of Bangla, the second most popular script in the Indian subcontinent. A complete OCR system is described for documents of single Bangla font, where more than three hundred character shapes are recognized by a combination of template and feature-matching approach. Here the document image captured by a flatbed scanner is subject to tilt correction, line, word and character segmentation, simple and compound character separation, feature extraction and finally character recognition. Some character occurrence statistics have been computed to aid the recognition process. The simple character recognition is done by a feature-based tree classifier, and the compound character recognition involves a template matching approach preceded by a feature-based grouping. At present, recognition accuracy of about 96% is obtained by the system.

Book Chapter•DOI•

On-Line Handwritten Alphanumeric Character Recognition Using Feature Sequences

Xiaolin Li¹, Dit-Yan Yeung¹•Institutions (1)

Hong Kong University of Science and Technology¹

11 Dec 1995

TL;DR: This paper presents an approach in which an on-line handwritten character is characterized by a sequence of dominant points in strokes and a sequence of writing directions between consecutive dominant points.

Abstract: In this paper we present an approach in which an on-line handwritten character is characterized by a sequence of dominant points in strokes and a sequence of writing directions between consecutive dominant points. The directional information is used for character preclassification and the positional information is used for fine classification. Both preclassification and fine classification are based on dynamic programming matching. A recognition experiment has been conducted with 62 character classes of different writing styles and 21 people as data contributors. The recognition rate of this experiment is 91%, with 7.9% substitution rate and 1.1% rejection rate. The average processing time is 0.35 second per character on a 486 50MHz personal computer.
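
The directional part of this approach can be sketched in a few lines. The example assumes dominant points are already extracted, quantizes the direction between consecutive points into eight codes, and compares two code sequences with a dynamic programming alignment. The eight-way quantization, the cost values, and the names `direction_codes` and `dp_distance` are illustrative assumptions; the paper's own feature definitions and costs are not given in the abstract.

```python
import math


def direction_codes(points, n_dirs=8):
    """Quantize the direction between consecutive dominant points."""
    step = 2 * math.pi / n_dirs
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)
        codes.append(round(angle / step) % n_dirs)
    return codes


def dp_distance(a, b, n_dirs=8):
    """Dynamic programming alignment cost between two code sequences."""
    def cost(p, q):
        d = abs(p - q)
        return min(d, n_dirs - d)  # directions wrap around

    gap = n_dirs // 2  # assumed penalty for skipping a code
    prev = [j * gap for j in range(len(b) + 1)]
    for i, p in enumerate(a, start=1):
        cur = [i * gap]
        for j, q in enumerate(b, start=1):
            cur.append(min(prev[j] + gap,
                           cur[j - 1] + gap,
                           prev[j - 1] + cost(p, q)))
        prev = cur
    return prev[-1]


# A unit square traced counter-clockwise from the origin.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
codes = direction_codes(square)
```

Preclassification would then keep only the templates whose `dp_distance` to the input is small, leaving positional matching for the fine classification stage.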

Proceedings Article•DOI•

An HMM-based legal amount field OCR system for checks

András Kornai¹, K. M. Mohiuddin¹, S.D. Connell•Institutions (1)

IBM¹

22 Oct 1995

TL;DR: It is argued that the most significant source of error in handwriting recognition is the segmentation process, and the HMM system described in this paper avoids taking segmentation decisions early in the recognition process.

Abstract: The system described in this paper applies hidden Markov technology to the task of recognizing the handwritten legal amount on personal checks. We argue that the most significant source of error in handwriting recognition is the segmentation process. In traditional handwriting OCR systems, recognition is performed at the character level, using the output of an independent segmentation step. Using a fixed stepsize series of vertical slices from the image, the HMM system described in this paper avoids taking segmentation decisions early in the recognition process.

Proceedings Article•

On the Use of Pronunciation Rules for Improved Word Recognition

Nick Cremelie, Jean-Pierre Martens

01 Jan 1995

Proceedings Article•DOI•

Lexicon-driven word recognition

Chien-Huei Chen

14 Aug 1995

TL;DR: A new approach to word recognition that uses a lexicon to "drive" the recognition process and performs recognition by verifying character hypotheses, as opposed to the classification method used in most conventional optical character recognition systems.

Abstract: Most conventional document understanding systems use lexicons only in a postprocessing step to verify or correct character recognition results. The authors present a new approach to word recognition that uses a lexicon to "drive" the recognition process. Lexicon words are encoded in trie data structures, and recognition of a word image is done by searching a lexicon trie for a path whose node characters yield the best match to the word image. This approach has two important advantages. First, it is segmentation-free; there is no need to presegment the text image into isolated characters. Second, it performs recognition by verifying character hypotheses, as opposed to the classification method used in most conventional optical character recognition (OCR) systems. Hence, the recognition process is more efficient and the results are more accurate. They demonstrated the feasibility and the advantage of this approach with a lexicon size of more than 50000 words, on severely degraded images.
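
A minimal sketch of the trie-driven search described above, under loud assumptions: the lexicon is stored in a nested-dict trie, and a hypothetical `verify(ch, image, pos)` callback stands in for character verification, returning possible `(next_pos, cost)` pairs for the hypothesis that `ch` starts at `pos`. Here the "image" is just a noisy string and `toy_verify` always consumes one position; a real segmentation-free verifier would return several candidate widths. None of these names come from the paper.

```python
def build_trie(words):
    """Encode lexicon words in a nested-dict trie; '$' marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = word
    return root


def toy_verify(ch, image, pos):
    """Toy verifier: cost 0 if the symbol at pos equals ch, otherwise 1."""
    if pos >= len(image):
        return []
    return [(pos + 1, 0.0 if image[pos] == ch else 1.0)]


def best_word(trie, image, verify, pos=0, cost=0.0, best=None):
    """Depth-first search for the trie path with the lowest total cost."""
    if best is None:
        best = (float("inf"), None)
    if cost >= best[0]:
        return best  # prune: this path cannot beat the best word found so far
    for ch, child in trie.items():
        if ch == "$":
            # Accept a word only if it explains the whole image.
            if pos == len(image) and cost < best[0]:
                best = (cost, child)
            continue
        for next_pos, step_cost in verify(ch, image, pos):
            best = best_word(child, image, verify, next_pos, cost + step_cost, best)
    return best
```

Because words that share a prefix share a trie path, each character hypothesis for that prefix is verified once per path rather than once per word.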

Journal Article•DOI•

An approach to integration of off-line and on-line recognition of handwriting

Hirobumi Nishida¹•Institutions (1)

University of Aizu¹

01 Nov 1995-Pattern Recognition Letters

TL;DR: An approach to the integration of off-line and on-line recognition of unconstrained handwritten characters is presented, in which an on-line recognition algorithm is adapted to off-line recognition on the basis of high-quality thinning algorithms.

Patent•

System for indexing document images

Hirotaka Shiiyama¹, O Canon K.K. Masaki•Institutions (1)

Canon Inc.¹

15 Jun 1995

TL;DR: An OCR processor recognizes stored image information and outputs a recognition result, switching the number of candidate characters output according to their degree of likelihood; a document searcher forms character strings for search from the recognition result and registers them as a search file.

Abstract: When texts recognized by OCR are registered and later searched by a search word, a recognition error made by the OCR can make the search fail; the invention eliminates this state. It is an object of the invention to do so without placing a burden on an operator or an apparatus. There are provided an OCR processor for recognizing stored image information and outputting a recognition result while switching the number of candidate characters to be output as a recognition result in accordance with a degree of likelihood; and a document searcher for forming character strings for search from the recognition result and for registering them as a search file.
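
One way to picture the candidate switching is sketched below. It assumes the OCR returns, for each character position, a list of `(character, likelihood)` pairs; a confident position keeps only its top candidate, while an uncertain one keeps several, and every combination is registered as a searchable string. The threshold, the cap on alternatives, and the function names are assumptions, not values from the patent.

```python
from itertools import product


def select_candidates(char_candidates, threshold=0.9, max_alternatives=3):
    """Keep one candidate where the OCR is confident, several where it is not."""
    selected = []
    for candidates in char_candidates:
        ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
        keep = 1 if ranked[0][1] >= threshold else max_alternatives
        selected.append([ch for ch, _ in ranked[:keep]])
    return selected


def search_strings(selected):
    """Form every character string that the kept candidates can spell."""
    return {"".join(chars) for chars in product(*selected)}
```

With this scheme a query for the correct word can still hit a document whose top OCR reading was wrong, at the price of a larger search file.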

Patent•

Device and method for handwriting recognition with adaptive weighting of recognition data

Farzad Ehsani, Liyang Zhou, John Lorne Campbell Seybold, Elton B. Sherwin, Kenneth J. Guzik

28 Dec 1995

TL;DR: An adaptive weighting handwriting recognition device and method compares information representing handwritten input with stored recognition data, at least some of which has an associated weighting value, to provide candidate recognition information that can be further processed with user editing instructions to modify and correct it.

Abstract: An adaptive weighting handwriting recognition device and method compares information representing handwritten input with stored recognition data, at least some of which has a weighting value associated therewith. The weighting values remain fixed during comparison of the handwritten input and stored recognition data, which provides candidate recognition information that can be further processed with user editing instructions to modify and correct the candidate recognition information. During user editing, the weighting values of the stored recognition data associated with the corrected candidate recognition information are varied to enhance the likelihood of correct future handwriting recognition.
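
The two phases described above (weights fixed while comparing, weights varied during user editing) can be sketched as follows. This is an illustration only: the feature tuples, the scoring formula, the additive update, and the class name `AdaptiveRecognizer` are assumptions and are not taken from the patent.

```python
class AdaptiveRecognizer:
    """Toy recognizer with one weight per stored template."""

    def __init__(self, templates, step=0.1):
        # templates: label -> feature tuple; every weight starts at 1.0
        self.templates = dict(templates)
        self.weights = {label: 1.0 for label in self.templates}
        self.step = step

    def candidates(self, features):
        """Rank labels by weighted similarity; weights are read, never changed."""
        def score(label):
            ref = self.templates[label]
            distance = sum((a - b) ** 2 for a, b in zip(features, ref))
            return self.weights[label] / (1.0 + distance)
        return sorted(self.templates, key=score, reverse=True)

    def correct(self, chosen_label):
        """User edit: raise the weight of the label the user selected."""
        self.weights[chosen_label] += self.step
```

After a correction, an ambiguous input that previously tied between two labels is ranked in favor of the label the user chose.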

Proceedings Article•DOI•

Realization of a high-performance bilingual Chinese-English OCR system

Hong Guo¹, Xiaoqing Ding¹, Zhong Zhang¹, Fanxia Guo¹, Youshou Wu¹•Institutions (1)

Tsinghua University¹

14 Aug 1995

TL;DR: The Twice-Segment Algorithm is used for segmentation of documents with Chinese and English characters mixed and the comprehensive recognition method is employed to improve the robustness of Chinese character recognition.

Abstract: This paper focuses on the realization of a bilingual Chinese-English OCR system. First, the Twice-Segment Algorithm is used for segmentation of documents with Chinese and English characters mixed. Then the comprehensive recognition method is employed to improve the robustness of Chinese character recognition. A new measurement of robustness of OCR recognition performance is also put forward here. Finally, exciting experimental results are given.

Proceedings Article•

New n-best based rejection techniques for improving a real-time telephonic connected word recognition system.

F. Javier Caminero-Gil, Celinda de la Torre-Munilla, Luis A. Hernández Gómez, Cesar Martín del Alamo

01 Jan 1995