Topic
Intelligent word recognition
About: Intelligent word recognition is a research topic. Over its lifetime, 2480 publications have been published on this topic, receiving 45813 citations.
Papers
••
TL;DR: A feature extraction approach based on elastic meshing and directional decomposition is proposed for handwritten Chinese character recognition (HCCR), together with three decomposition methods.
Abstract: A new feature extraction approach based on elastic meshing and directional decomposition techniques for handwritten Chinese character recognition (HCCR) is proposed in this letter. It is found that decomposing a Chinese character into horizontal, vertical, left-slant, and right-slant directional sub-patterns is very helpful for feature extraction and recognition. Three kinds of decomposition methods are proposed. A minimum distance classifier is trained on 3755 categories of characters using the new features. Testing on a total of 37,550 untrained handwritten samples yields a recognition rate of 92.36%, showing the effectiveness of the proposed approach.
42 citations
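The abstract above mentions a minimum distance classifier without giving its details. The following is a minimal sketch of one common form of that idea (nearest class mean under Euclidean distance), not the authors' implementation. The function names, the Euclidean metric, and the two-dimensional toy features are assumptions made for illustration; the elastic-meshing and directional features themselves are not reproduced here.

```python
from collections import defaultdict
from math import dist


def train_class_means(samples):
    """Average the feature vectors of each class.

    samples: iterable of (label, feature_vector) pairs.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums = {}
    counts = defaultdict(int)
    for label, vec in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(means, vec):
    """Return the label whose class mean is closest to vec (Euclidean, assumed)."""
    return min(means, key=lambda label: dist(means[label], vec))


# Toy data: two classes with two-dimensional features (purely illustrative).
TRAINING = [
    ("A", [0.0, 0.0]),
    ("A", [2.0, 0.0]),
    ("B", [10.0, 10.0]),
    ("B", [12.0, 10.0]),
]
MEANS = train_class_means(TRAINING)
```

With this toy data, `classify(MEANS, [1.0, 1.0])` picks class `"A"` and `classify(MEANS, [9.0, 9.0])` picks class `"B"`. A classifier of this kind stores one prototype per class, which keeps it cheap when the number of categories is in the thousands.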
••
20 Aug 2006 TL;DR: This paper presents C-Cube (Cursive Character Challenge), a new public-domain database of 57293 cursive characters manually extracted from cursive handwritten words, including both upper and lower case versions of each letter.
Abstract: Cursive character recognition is a challenging task due to the high variability and intrinsic ambiguity of cursive letters. This paper presents C-Cube (Cursive Character Challenge), a new public-domain cursive character database. C-Cube contains 57293 cursive characters manually extracted from cursive handwritten words, including both upper and lower case versions of each letter. The database can be downloaded from the Web, and it provides predefined experimental protocols so that the results obtained by different researchers can be compared rigorously.
42 citations
••
14 Nov 1988 TL;DR: A methodology for recognizing ZIP codes in handwritten addresses is presented that uses many diverse pattern recognition and image processing algorithms and takes the form of a blackboard architecture that opportunistically invokes routines as needed.
Abstract: A methodology for recognizing ZIP codes (US postal codes) in handwritten addresses is presented that uses many diverse pattern recognition and image processing algorithms. Given a high-resolution image of a handwritten address block, the solution invokes routines capable of hypothesizing the location of the ZIP code, segmenting and recognizing ZIP code digits, locating and recognizing city and state names, and looking up the results in a dictionary. The control structure is not strictly sequential, but rather in the form of a blackboard architecture that opportunistically invokes routines as needed. An implementation of the methodology is described as well as results with a database of grey-level images of handwritten addresses (taken from live mail in a US Postal Service mail processing facility). Future extensions of the approach are discussed.
42 citations
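The blackboard control structure described in the abstract above can be illustrated with a small sketch. This is not the paper's system: the routine names, the text-only "address", the readiness conditions, and the one-entry dictionary are all hypothetical stand-ins. The point is only that routines fire when the shared state makes them applicable, not in a fixed order.

```python
def run_blackboard(blackboard, sources, max_steps=20):
    """Fire whichever routine is applicable until none is; return firing order."""
    fired = []
    for _ in range(max_steps):
        ready = [s for s in sources if s["ready"](blackboard)]
        if not ready:
            break
        source = ready[0]
        source["run"](blackboard)
        fired.append(source["name"])
    return fired


# Toy dictionary used only for this illustration.
TOY_DICTIONARY = {"14260": "BUFFALO NY"}


def locate_zip(bb):
    # Hypothesize that the ZIP code is the last token of the address text.
    bb["zip_candidate"] = bb["address_text"].split()[-1]


def recognize_digits(bb):
    # Stand-in for digit segmentation and recognition.
    bb["digits"] = "".join(ch for ch in bb["zip_candidate"] if ch.isdigit())


def look_up(bb):
    # Check the recognized digits against the dictionary.
    bb["city_state"] = TOY_DICTIONARY.get(bb["digits"])


# Deliberately listed out of pipeline order: the control loop, not the list
# order, decides what runs next.
SOURCES = [
    {"name": "look_up",
     "ready": lambda bb: "digits" in bb and "city_state" not in bb,
     "run": look_up},
    {"name": "recognize_digits",
     "ready": lambda bb: "zip_candidate" in bb and "digits" not in bb,
     "run": recognize_digits},
    {"name": "locate_zip",
     "ready": lambda bb: "address_text" in bb and "zip_candidate" not in bb,
     "run": locate_zip},
]

board = {"address_text": "12 Example Rd BUFFALO NY 14260"}
order = run_blackboard(board, SOURCES)
```

Here `order` ends up as `locate_zip`, `recognize_digits`, `look_up`, even though the routines are registered in the reverse order, because each one becomes ready only once the data it needs is on the blackboard.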
••
26 Oct 2004 TL;DR: A framework for grouping and recognition of characters and symbols in online free-form ink expressions that can achieve 94% grouping/recognition accuracy on a test dataset containing symbols from 25 writers held out from the training process.
Abstract: We present a framework for grouping and recognition of characters and symbols in online free-form ink expressions. The approach is completely spatial; it does not require any ordering on the strokes. It also does not place any constraints on the layout of the symbols. Initially each of the strokes on the page is linked in a proximity graph. A discriminative recognizer is used to classify connected subgraphs as either making up one of the known symbols or perhaps as an invalid combination of strokes (e.g. including strokes from two different symbols). This recognizer operates on the rendered image of the strokes plus stroke features such as curvature and endpoints. A small subset of very efficient image features is selected, yielding an extremely fast recognizer. Dynamic programming over connected subsets of the proximity graph is used to simultaneously find the optimal grouping and recognition of all the strokes on the page. Experiments demonstrate that the system can achieve 94% grouping/recognition accuracy on a test dataset containing symbols from 25 writers held out from the training process.
42 citations
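The grouping step described in the abstract above (a proximity graph over strokes, then dynamic programming over connected subsets) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: strokes are reduced to center points, `radius` and `max_size` are made-up parameters, and `toy_cost` is a hypothetical stand-in for the discriminative recognizer (lower cost means a more plausible symbol).

```python
from functools import lru_cache
from itertools import combinations
from math import dist


def proximity_graph(centers, radius):
    """Link strokes whose center points lie within radius of each other."""
    adj = [set() for _ in centers]
    for i, j in combinations(range(len(centers)), 2):
        if dist(centers[i], centers[j]) <= radius:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def is_connected(group, adj):
    """True if the strokes in group form a connected subgraph."""
    group = set(group)
    start = next(iter(group))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbor in adj[node] & group:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == group


def best_grouping(n, adj, cost, max_size=3):
    """Partition strokes 0..n-1 into connected groups with minimum total cost."""

    @lru_cache(maxsize=None)
    def solve(remaining):
        if not remaining:
            return 0.0, ()
        first = min(remaining)
        others = sorted(remaining - {first})
        best = None
        # The lowest-numbered stroke must belong to exactly one group.
        for extra_count in range(max_size):
            for extra in combinations(others, extra_count):
                group = (first,) + extra
                if not is_connected(group, adj):
                    continue
                rest_cost, rest_groups = solve(remaining - set(group))
                total = cost(group) + rest_cost
                if best is None or total < best[0]:
                    best = (total, (group,) + rest_groups)
        return best

    return solve(frozenset(range(n)))


def toy_cost(group):
    # Hypothetical recognizer: pretends every symbol has two strokes.
    return {1: 2.0, 2: 1.0}.get(len(group), 5.0)


CENTERS = [(0, 0), (1, 0), (10, 0), (11, 0)]
ADJ = proximity_graph(CENTERS, radius=2)
TOTAL, GROUPS = best_grouping(len(CENTERS), ADJ, toy_cost)
```

On this toy input the search pairs strokes 0 and 1 and strokes 2 and 3. Memoizing on the set of remaining strokes is what makes this dynamic programming: each subset is solved once, and restricting candidates to connected groups of bounded size keeps the number of recognizer calls manageable.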
•
IBM
TL;DR: A document-specific database containing an exhaustive listing of the words in a document is created from an OCR scan of that document; images of each word, taken from all the fonts encountered, are entered into the database and mapped to a corresponding textual representation.
Abstract: Disclosed embodiments of the invention provide automated global optimization methods and systems of OCR, tailored to each document being digitized. A document-specific database is created from an OCR scan of a document of interest, which contains an exhaustive listing of words in the document. Images of each word, taken from all the fonts encountered, are entered into the database and mapped to a corresponding textual representation. After entry of a first instance of an image of a word written in a particular font, each new occurrence of the word in that font can be quickly recognized by image processing techniques. The disclosed methods and systems may be used in conjunction with adaptive character recognition training and word recognition training of the OCR engines.
41 citations
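The document-specific database described in the abstract above can be pictured as a cache from word images to text. The sketch below is an illustrative simplification, not the disclosed method: the class name, the `ocr_engine` callable, and the use of an exact SHA-256 hash of the image bytes as the lookup key are all assumptions. A real system would need tolerant image matching, since two scans of the same word in the same font are rarely byte-identical.

```python
import hashlib


class DocumentWordCache:
    """Maps word images seen in one document to their recognized text."""

    def __init__(self, ocr_engine):
        # ocr_engine: hypothetical callable taking image bytes, returning text.
        self._ocr = ocr_engine
        self._text_by_image = {}
        self.ocr_calls = 0

    def recognize(self, image_bytes):
        # Exact-match key; a stand-in for image-based matching (assumption).
        key = hashlib.sha256(image_bytes).hexdigest()
        if key not in self._text_by_image:
            # First occurrence of this word image: run full OCR and store it.
            self.ocr_calls += 1
            self._text_by_image[key] = self._ocr(image_bytes)
        return self._text_by_image[key]


def fake_ocr(image_bytes):
    # Toy engine for the example: pretends the bytes are the word itself.
    return image_bytes.decode("ascii")


cache = DocumentWordCache(fake_ocr)
words = [cache.recognize(img) for img in (b"word", b"image", b"word")]
```

After the three lookups, `cache.ocr_calls` is 2: the repeated image is served from the database without invoking the engine again.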