Topic

Intelligent word recognition

About: Intelligent word recognition is a research topic. Over its lifetime, 2,480 publications have been published within this topic, receiving 45,813 citations.


Papers
Patent
Tanveer Syeda-Mahmood
29 Sep 1997
TL;DR: This patent presents a method and system for recognizing handwritten words in scanned documents, in which features for word localization are extracted from the handwritten words through basis points taken from a single curve of text lines.
Abstract: A method and system for recognizing handwritten words in scanned documents. A document containing handwriting is processed, and features for word localization are extracted from the handwritten words in that document through basis points taken from a single curve of text lines. The method is independent of page orientation, does not assume that the individual lines of handwritten text are parallel, and does not require that word regions be aligned with the text line orientation. Intra-word statistics are derived from sample pages rather than from a fixed threshold. The method has applications in digital libraries, handwriting tokenization, document management and OCR systems.

78 citations

Journal Article
TL;DR: A forward-backward lattice pruning algorithm is proposed to reduce the computation in training when trigram language models are used, and beam search techniques are investigated to accelerate the decoding speed.
Abstract: This paper proposes a method for handwritten Chinese/Japanese text (character string) recognition based on semi-Markov conditional random fields (semi-CRFs). The high-order semi-CRF model is defined on a lattice containing all possible segmentation-recognition hypotheses of a string to elegantly fuse the scores of candidate character recognition and the compatibilities of geometric and linguistic contexts by representing them in the feature functions. Based on given models of character recognition and compatibilities, the fusion parameters are optimized by minimizing the negative log-likelihood loss with a margin term on a training string sample set. A forward-backward lattice pruning algorithm is proposed to reduce the computation in training when trigram language models are used, and beam search techniques are investigated to accelerate the decoding speed. We evaluate the performance of the proposed method on unconstrained online handwritten text lines of three databases. On the test sets of databases CASIA-OLHWDB (Chinese) and TUAT Kondate (Japanese), the character level correct rates are 95.20 and 95.44 percent, and the accurate rates are 94.54 and 94.55 percent, respectively. On the test set (online handwritten texts) of ICDAR 2011 Chinese handwriting recognition competition, the proposed method outperforms the best system in competition.

78 citations
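
The beam-search decoding mentioned in the abstract above can be illustrated with a small sketch. This is not the paper's semi-CRF system: the lattice, the edge scores, the bigram table and the function name `beam_search` are all hypothetical, and there is no training or forward-backward pruning here. It only shows how a segmentation-recognition lattice can be decoded while keeping a limited number of partial hypotheses per segmentation point.

```python
from collections import defaultdict


def beam_search(edges, n_nodes, bigram, beam_width=3):
    """Return (score, text) for the best surviving path from node 0 to node n_nodes - 1.

    edges:  list of (start, end, char, score) with start < end; each edge is one
            candidate character covering the segments between two cut points.
    bigram: dict mapping (prev_char, char) -> log-score; missing pairs score 0.
    """
    out_edges = defaultdict(list)
    for start, end, char, score in edges:
        out_edges[start].append((end, char, score))

    # beams[node] holds partial hypotheses (score, text) that end at that node.
    beams = defaultdict(list)
    beams[0] = [(0.0, "")]
    # Because start < end, visiting nodes in increasing order guarantees that
    # every hypothesis reaching a node has been collected before it is expanded.
    for node in range(n_nodes - 1):
        # Pruning step: keep only the best `beam_width` hypotheses at this node.
        survivors = sorted(beams[node], reverse=True)[:beam_width]
        for score, text in survivors:
            prev = text[-1] if text else "<s>"
            for end, char, edge_score in out_edges[node]:
                lm_score = bigram.get((prev, char), 0.0)
                beams[end].append((score + edge_score + lm_score, text + char))

    final = beams[n_nodes - 1]
    return max(final) if final else None


# Toy lattice (made-up log-scores): the first two segments are either "c" + "l"
# or a single "d", followed by "o".
EDGES = [
    (0, 1, "c", -0.4),
    (1, 2, "l", -0.5),
    (0, 2, "d", -1.0),
    (2, 3, "o", -0.2),
]
BIGRAM = {("d", "o"): 0.5}  # hypothetical linguistic-context bonus

wide = beam_search(EDGES, 4, BIGRAM, beam_width=2)
narrow = beam_search(EDGES, 4, BIGRAM, beam_width=1)
```

With a beam width of 2 both segmentations survive to the last character and the language score tips the result to "do"; with a beam width of 1 the "d" hypothesis is pruned before the bigram can help, which shows the speed-versus-accuracy trade-off that beam search introduces.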

Proceedings Article
03 Aug 2003
TL;DR: The approach is to create a visual challenge that is easy for humans but difficult for a computer: recognizing a string of random distorted characters. The challenge presents hard segmentation problems that humans are particularly apt at solving.
Abstract: How do you tell a computer from a human? The situation arises often on the Internet, when online polls are conducted, accounts are requested, undesired email is received, and chat-rooms are spammed. The approach we use is to create a visual challenge that is easy for humans but difficult for a computer. More specifically, our challenge is to recognize a string of random distorted characters. To pass the challenge, the subject must type in the correct corresponding ASCII string. From an OCR point of view, this problem is interesting because our goal is to use the vast amount of accumulated knowledge to defeat the state of the art OCR algorithms. This is a role reversal from traditional OCR research. Unlike many other systems, our algorithm is based on the assumption that segmentation is much more difficult than recognition. Our image challenges present hard segmentation problems that humans are particularly apt at solving. The technology is currently being used in MSN's Hotmail registration system, where it has significantly reduced daily registration rate with minimal Consumer Support impact.

77 citations

Journal Article
TL;DR: An operational system is presented for the recognition of handwritten words written on line on a special transducer. The system represents a new approach to handwriting recognition, that of searching for the invariants of the patterns by consideration of the intrinsic movements that execute the handwriting.
Abstract: An operational system is presented for the recognition of handwritten words when written on line on a special transducer. The system represents a new approach to handwriting recognition, that of searching for the invariants of the patterns by consideration of the intrinsic movements that execute the handwriting. The handwritten words are analyzed by segmentation into strokes, recognition of strokes by the statistical likelihood of their belonging to preselected classes, and the use of constraints inherent in the script and word representations to limit the output sequences generated. Experiments carried out by computer simulation of the recognition system reveal that the system is capable of recognizing well-formed, legible handwritten words with a reliability that depends on the correspondence between the script of the test samples and that of the ensemble on which the machine's representation of handwriting is based. For an ensemble of 100 samples written by four subjects with a vocabulary chosen so that adjacent letters provided little contextual information, 91% of the samples were correctly recognized if the machine had been previously exposed to the same 100 samples. Lower recognition rates are obtained in situations in which differences exist between the teaching and test ensembles.

76 citations
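
The three steps named in the abstract above (segmentation into strokes, likelihood-based stroke classification, and constraints that limit the output sequences) can be sketched roughly as follows. Everything concrete here is an assumption made for illustration and does not come from the paper: the stroke classes, the single numeric feature per stroke, the Gaussian likelihood model, the lexicon, and the function names.

```python
import math

# Hypothetical stroke classes, each modelled as a 1-D Gaussian (mean, std)
# over one made-up stroke feature.
STROKE_CLASSES = {"hook": (0.0, 0.3), "loop": (1.0, 0.3), "bar": (2.0, 0.3)}

# Hypothetical lexicon: each allowed output is a fixed sequence of stroke classes.
# This plays the role of the constraint that limits the output sequences.
LEXICON = {
    "ab": ["hook", "loop"],
    "ba": ["loop", "hook"],
    "abc": ["hook", "loop", "bar"],
}


def log_likelihood(feature, stroke_class):
    """Gaussian log-likelihood of one stroke feature under one stroke class."""
    mean, std = STROKE_CLASSES[stroke_class]
    z = (feature - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def recognize(features):
    """Return the lexicon entry whose stroke sequence best explains the features.

    `features` is assumed to be already segmented: one value per stroke.
    Returns None when no lexicon entry has the same number of strokes.
    """
    best_word, best_score = None, -math.inf
    for word, strokes in LEXICON.items():
        if len(strokes) != len(features):
            continue  # constraint: stroke count must match the word model
        score = sum(log_likelihood(f, s) for f, s in zip(features, strokes))
        if score > best_score:
            best_word, best_score = word, score
    return best_word


guess = recognize([0.1, 0.9])
```

Scoring whole lexicon entries rather than picking the best class for each stroke independently is what lets the word-level constraint override an ambiguous individual stroke.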


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Feature (computer vision): 128.2K papers, 1.7M citations, 85% related
Image segmentation: 79.6K papers, 1.8M citations, 84% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Object detection: 46.1K papers, 1.3M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  14
2022  41
2020  1
2019  2
2018  9
2017  51