Topic
Intelligent word recognition
About: Intelligent word recognition is a research topic. Over its lifetime, 2480 publications have been published on this topic, receiving 45813 citations.
Papers
••
23 Sep 2007 TL;DR: This article uses topic models to detect and represent an article's semantic context, reducing error by 7% over a global word distribution in a simulated OCR correction task, an important step towards improving OCR.
Abstract: Modern optical character recognition software relies on human interaction to correct misrecognized characters. Even though the software often reliably identifies low-confidence output, the simple language and vocabulary models employed are insufficient to automatically correct mistakes. This paper demonstrates that topic models, which automatically detect and represent an article's semantic context, reduce error by 7% over a global word distribution in a simulated OCR correction task. Detecting and leveraging context in this manner is an important step towards improving OCR.
34 citations
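The idea in the abstract above, scoring a correction candidate against the document's topics rather than a single global word distribution, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the toy topics, and all probabilities are assumptions made up for the example.

```python
from typing import Dict, List

def topic_word_prob(word: str,
                    doc_topics: Dict[str, float],
                    topic_words: Dict[str, Dict[str, float]],
                    floor: float = 1e-9) -> float:
    """P(word | doc) = sum over topics k of P(k | doc) * P(word | k).

    `floor` is an assumed smoothing value for words unseen in a topic.
    """
    return sum(p_topic * topic_words[topic].get(word, floor)
               for topic, p_topic in doc_topics.items())

def pick_correction(candidates: List[str],
                    doc_topics: Dict[str, float],
                    topic_words: Dict[str, Dict[str, float]]) -> str:
    """Return the candidate word that is most probable in this document's context."""
    return max(candidates,
               key=lambda w: topic_word_prob(w, doc_topics, topic_words))

# Toy per-topic word distributions (hypothetical numbers).
topic_words = {
    "finance": {"bank": 0.05, "bark": 0.0001},
    "nature": {"bank": 0.005, "bark": 0.04},
}

# The same ambiguous OCR output resolves differently depending on context.
nature_doc = {"finance": 0.1, "nature": 0.9}
finance_doc = {"finance": 0.9, "nature": 0.1}
print(pick_correction(["bank", "bark"], nature_doc, topic_words))
print(pick_correction(["bank", "bark"], finance_doc, topic_words))
```

A global word distribution would always pick the same candidate; mixing per-topic distributions by the document's topic weights lets the context decide.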
••
20 Oct 1993 TL;DR: A recognition scheme for reading handwritten cursive words using three word recognition techniques is described, with the focus on the implementation used to combine the three techniques based on a comparative study of different strategies.
Abstract: A recognition scheme for reading handwritten cursive words using three word recognition techniques is described. The focus is on the implementation used to combine the three techniques based on a comparative study of different strategies. The first technique is holistic and derives a global encoding of the word. The other two techniques both rely on the segmentation of the word into letters, but differ in the character classifier they use. The former runs a statistical linear classifier, and the latter runs a neural network with a different representation of the input data. The testing, comparison, and combination studies have been performed on word images from mail provided by the USPS. The top-choice recognition rates achieved so far are 88%, 76%, and 65% for lexicon sizes of 10, 100, and 1000 words, respectively.
34 citations
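The abstract above does not say which combination strategy was chosen. As one common way to merge ranked outputs from several recognizers, a Borda count can be sketched as below. The choice of Borda count, the function name, and the toy lexicon are all assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List

def borda_combine(rankings: List[List[str]], lexicon: List[str]) -> List[str]:
    """Merge ranked word lists: each recognizer awards n points to its top
    choice, n - 1 to the next, and so on, where n is the lexicon size.
    Words a recognizer did not rank receive no points from it."""
    n = len(lexicon)
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        for position, word in enumerate(ranking):
            scores[word] += n - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(lexicon, key=lambda w: (-scores[w], w))

lexicon = ["austin", "boston", "houston"]
holistic = ["austin", "boston", "houston"]    # hypothetical holistic output
linear = ["boston", "austin", "houston"]      # hypothetical linear classifier output
neural = ["boston", "houston", "austin"]      # hypothetical neural network output

combined = borda_combine([holistic, linear, neural], lexicon)
print(combined)  # ['boston', 'austin', 'houston']
```

Rank-based schemes like this one need no calibration between recognizers, which is convenient when their confidence scores are not comparable.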
••
27 Jan 2008 TL;DR: This paper describes a gap-metrics-based machine learning approach to separating a line of unconstrained handwritten text into words, and proposes a combined distance measure computed using three different methods to overcome the disadvantages of the individual distance computation methods.
Abstract: Word segmentation is the most critical pre-processing step for any handwritten document recognition and/or retrieval system. When the writing style is unconstrained (written in a natural manner), recognition of individual components may be unreliable, so they must be grouped together into word hypotheses before recognition algorithms can be used. This paper describes a gap-metrics-based machine learning approach to separating a line of unconstrained handwritten text into words. Our approach uses a set of both local and global features, which is motivated by the ways in which human beings perform this kind of task. In addition, in order to overcome the disadvantages of the individual distance computation methods, we propose a combined distance measure computed using three different methods. The classification is done using a three-layer neural network. The algorithm is evaluated using an unconstrained handwriting database that contains 50 pages (1026 lines, 7562 word images) of handwritten documents. The overall accuracy is 90.8%, which is better than that of a previous method.
34 citations
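To make the notion of a gap metric concrete, the sketch below groups connected components of a text line into words using only the horizontal distance between bounding boxes and a fixed threshold. This is a deliberate simplification and an assumption: the paper combines three distance measures and classifies gaps with a neural network, neither of which is reproduced here. The function name and the box coordinates are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int]  # (x_min, x_max) of one connected component

def split_into_words(boxes: List[Box], gap_threshold: int) -> List[List[Box]]:
    """Group components left to right; start a new word whenever the
    horizontal gap to everything seen so far exceeds `gap_threshold`."""
    if not boxes:
        return []
    ordered = sorted(boxes)
    words = [[ordered[0]]]
    right_edge = ordered[0][1]
    for box in ordered[1:]:
        gap = box[0] - right_edge  # negative when components overlap
        if gap > gap_threshold:
            words.append([box])    # inter-word gap
        else:
            words[-1].append(box)  # intra-word gap
        right_edge = max(right_edge, box[1])
    return words

line = [(0, 10), (12, 20), (35, 45), (47, 60)]
print(split_into_words(line, gap_threshold=8))
# [[(0, 10), (12, 20)], [(35, 45), (47, 60)]]
```

A fixed threshold fails when spacing varies between writers, which is the kind of weakness a learned classifier over several gap features is meant to address.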
•
IBM1
TL;DR: A predetermined characteristic amount is extracted for each stroke; a characteristic amount word is created having a binary value of 1 only in the bit positions corresponding to selected values of the characteristic amount; a bit-by-bit AND operation is performed between this word and the reference word of the corresponding stroke of the character of interest; and it is determined whether all the bits of the result are zero.
Abstract: An online handwritten character recognition system that narrows the candidates for handwritten character recognition quickly and very accurately using a small number of simple operations. A predetermined characteristic amount is extracted for each stroke, and a characteristic amount word is created having a binary value of 1 only in the one or more bit positions corresponding to selected values of the characteristic amount. A bit-by-bit AND operation is performed between this word and the reference word of the corresponding stroke of the character of interest, and it is determined whether all the bits of the result are zero. If the number of binary values of the results of the zero-determining operation for all the strokes of the character of interest exceeds a threshold, the character is judged to be a candidate.
33 citations
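The bitwise screening described above can be sketched in a few lines. Several details are assumptions, since the abstract does not fix them: the feature is taken to be a quantized value per stroke (for example a direction code from 0 to 7), the input word is one-hot, and a character is kept as a candidate when the number of strokes with a non-zero AND result reaches a threshold. The abstract's wording of the threshold test is ambiguous, so that last rule is one possible reading only. All names are hypothetical.

```python
from typing import List

def feature_word(value: int) -> int:
    """Encode a quantized stroke feature as a word with a single 1 bit."""
    return 1 << value

def is_candidate(input_values: List[int],
                 reference_words: List[int],
                 min_matching_strokes: int) -> bool:
    """Keep a reference character when enough strokes are compatible.

    A stroke is compatible when the AND of its feature word and the
    reference word for that stroke is not all zeros (assumed reading).
    """
    if len(input_values) != len(reference_words):
        return False  # assumed: stroke counts must agree
    matches = sum(1 for value, ref in zip(input_values, reference_words)
                  if (feature_word(value) & ref) != 0)
    return matches >= min_matching_strokes

# Reference words for a three-stroke character; set bits mark accepted values.
reference = [0b00000110, 0b00011000, 0b10000001]

print(is_candidate([1, 4, 7], reference, min_matching_strokes=2))  # True
print(is_candidate([0, 4, 3], reference, min_matching_strokes=2))  # False
```

Each stroke costs one shift, one AND, and one comparison, which is why this kind of test suits a fast first pass before a more expensive classifier.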
••
01 Dec 2013 TL;DR: A framework for investigating and comparing the recognition ability of two classifiers: Deep-Learning Feedforward-Backpropagation Neural Network (DFBNN) and Extreme Learning Machine (ELM).
Abstract: Feature extraction plays an essential role in handwritten character recognition because of its effect on the capability of classifiers. This paper presents a framework for investigating and comparing the recognition ability of two classifiers: Deep-Learning Feedforward-Backpropagation Neural Network (DFBNN) and Extreme Learning Machine (ELM). Three data sets were studied: Thai handwritten characters, Bangla handwritten numerals, and Devanagari handwritten numerals. Each data set was divided into two categories: non-extracted features and features extracted by Histograms of Oriented Gradients (HOG). The experimental results showed that using HOG to extract features can improve the recognition rates of both DFBNN and ELM. Furthermore, DFBNN provides slightly higher recognition rates than ELM.
33 citations
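For readers unfamiliar with HOG, the core step is a histogram of gradient orientations weighted by gradient magnitude. The sketch below computes that histogram for a single cell in plain Python. It is a simplified illustration under stated assumptions: it uses one cell covering the whole image, central differences, unsigned orientations over 0 to 180 degrees, and a single L2 normalization, and it omits the overlapping-block normalization of full HOG. It is not the pipeline used in the paper.

```python
import math
from typing import List

def orientation_histogram(image: List[List[float]], n_bins: int = 9) -> List[float]:
    """Magnitude-weighted histogram of unsigned gradient orientations
    over the interior pixels of a grayscale image (one HOG-style cell)."""
    height, width = len(image), len(image[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # horizontal gradient
            gy = image[y + 1][x] - image[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle // bin_width), n_bins - 1)
            hist[index] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# A vertical edge: intensity changes only along x, so all of the
# gradient energy falls into the first orientation bin.
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
print(orientation_histogram(vertical_edge))
```

The resulting vector, rather than raw pixels, would then be fed to a classifier such as the DFBNN or ELM compared in the abstract.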