Topic

Intelligent word recognition

About: Intelligent word recognition is a research topic. Over the lifetime, 2480 publications have been published within this topic receiving 45813 citations.


Papers
Proceedings ArticleDOI
26 Oct 2004
TL;DR: Presents an application of human interactive proofs (HIP), a relatively new research area focused on defending online services against abusive attacks; HIP uses security protocols based on automatic tests that humans can pass but state-of-the-art computer programs cannot.
Abstract: Handwritten text offers challenges that are rarely encountered in machine-printed text. In addition, most problems faced in reading machine-printed text (e.g., character recognition, word segmentation, letter segmentation, etc.) are more severe in handwritten text. In this paper we present an application of human interactive proofs (HIP), a relatively new research area whose primary focus is defending online services against abusive attacks. HIP uses a set of security protocols based on automatic tests that humans can pass but state-of-the-art computer programs cannot. This is accomplished by exploiting the differential in proficiency between humans and computers at reading handwritten word images.

83 citations

Journal ArticleDOI
TL;DR: A writer independent system for large vocabulary recognition of on-line handwritten cursive words that reached a 97.9% and 82.4% top-5 word recognition rate on a writer-dependent and writer-independent test, respectively.
Abstract: This paper presents a writer independent system for large vocabulary recognition of on-line handwritten cursive words. The system first uses a filtering module, based on simple letter features, to quickly reduce a large reference dictionary (lexicon) to a more manageable size; the reduced lexicon is subsequently fed to a recognition module. The recognition module uses a temporal representation of the input, instead of a static two-dimensional image, thereby preserving the sequential nature of the data and enabling the use of a Time-Delay Neural Network (TDNN); such networks have been previously successful in the continuous speech recognition domain. Explicit segmentation of the input words into characters is avoided by sequentially presenting the input word representation to the neural network-based recognizer. The outputs of the recognition module are collected and converted into a string of characters that is matched against the reduced lexicon using an extended Damerau-Levenshtein function. Trained on 2,443 unconstrained word images (11k characters) from 55 writers and using a 21k lexicon, we reached a 97.9% and 82.4% top-5 word recognition rate on a writer-dependent and writer-independent test, respectively.

81 citations
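The lexicon-matching step above relies on an extended Damerau-Levenshtein function. As an illustration only (the paper's extended variant is not reproduced here), the following sketch implements the standard restricted Damerau-Levenshtein distance and a hypothetical top-k ranking helper:

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein distance: edit distance that also
    allows transposition of two adjacent characters."""
    m, n = len(a), len(b)
    # d[i][j] = distance between prefixes a[:i] and b[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            # adjacent transposition (e.g. "wrod" -> "word")
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + cost)
    return d[m][n]

def best_matches(recognized: str, lexicon: list[str], k: int = 5) -> list[str]:
    """Rank lexicon words by edit distance to the recognizer's output string."""
    return sorted(lexicon, key=lambda w: damerau_levenshtein(recognized, w))[:k]
```

A top-5 list like the one produced by `best_matches` is what the paper's top-5 word recognition rates are measured against.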

Journal ArticleDOI
TL;DR: A prototype of the OCR system for printed Oriya script achieves 96.3% character level accuracy on average, and the feature detection methods are simple and robust, and do not require preprocessing steps like thinning and pruning.
Abstract: This paper deals with an Optical Character Recognition (OCR) system for printed Oriya script. The development of OCR for this script is difficult because a large number of character shapes in the script have to be recognized. In the proposed system, the document image is first captured using a flat-bed scanner and then passed through different preprocessing modules like skew correction, line segmentation, zone detection, word and character segmentation, etc. These modules have been developed by combining some conventional techniques with some newly proposed ones. Next, individual characters are recognized using a combination of stroke and run-number based features, along with features obtained from the concept of water overflow from a reservoir. The feature detection methods are simple and robust, and do not require preprocessing steps like thinning and pruning. A prototype of the system has been tested on a variety of printed Oriya material, and currently achieves 96.3% character level accuracy on average.

81 citations
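The run-number features mentioned above count, per scan line, how many separate runs of ink pixels cross a binary character image. A minimal sketch of that idea (illustrative only; the paper's exact feature set, including the water-reservoir features, is not reproduced here):

```python
def run_numbers(image_rows: list[list[int]]) -> list[int]:
    """Count black-pixel runs in each row of a binary character image.

    image_rows: list of rows, each a list of 0/1 pixels (1 = ink).
    Returns one run count per row; a new run starts wherever an ink
    pixel follows a background pixel.
    """
    counts = []
    for row in image_rows:
        runs, prev = 0, 0
        for px in row:
            if px == 1 and prev == 0:
                runs += 1
            prev = px
        counts.append(runs)
    return counts
```

Such counts are cheap to compute directly on the binary image, which is consistent with the paper's point that no thinning or pruning is needed first.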

Proceedings ArticleDOI
01 Oct 2016
TL;DR: A deep convolutional feature representation is proposed that achieves superior performance for word spotting and recognition for handwritten images and enables query-by-string by learning a common subspace for image and text using the embedded attribute framework.
Abstract: We propose a deep convolutional feature representation that achieves superior performance for word spotting and recognition for handwritten images. We focus on: (i) enhancing the discriminative ability of the convolutional features using a reduced feature representation that can scale to large datasets, and (ii) enabling query-by-string by learning a common subspace for image and text using the embedded attribute framework. We present our results on popular datasets such as the IAM corpus and historical document collections from the Bentham and George Washington pages. On the challenging IAM dataset, we achieve a state of the art mAP of 91.58% on word spotting using textual queries and a mean word error rate of 6.69% for the word recognition task.

80 citations
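Once image and text are embedded in a common subspace, query-by-string reduces to nearest-neighbor retrieval: embed the query string, then rank word-image embeddings by similarity. The sketch below shows only that retrieval step with plain cosine similarity; the embedding functions themselves (the paper's attribute framework) are assumed to exist and are not shown:

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def spot(query_vec: list[float],
         word_image_vecs: list[list[float]],
         k: int = 5) -> list[int]:
    """Return indices of the k word images most similar to the query
    embedding, best match first."""
    ranked = sorted(range(len(word_image_vecs)),
                    key=lambda i: cosine(query_vec, word_image_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The mAP figure reported above is computed over exactly such ranked lists, averaged across queries.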

Patent
13 Jan 1997
TL;DR: In an optical character recognition (OCR) system, an improved method and apparatus for recognizing a character and producing an indication of the confidence with which the character has been recognized.
Abstract: In an optical character recognition (OCR) system, an improved method and apparatus for recognizing a character and producing an indication of the confidence with which the character has been recognized. The system employs a plurality of different OCR devices, each of which outputs an indicated (or recognized) character along with the individual device's own determination of how confident it is in the indication. The OCR system uses the data output from each of the different OCR devices, along with other attributes of the indicated character such as the relative accuracy of the particular OCR device indicating the character, to choose the character recognized by the system and to produce a combined indication of how confident the system is in its recognition.

80 citations
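The combination scheme described in the patent abstract can be sketched as confidence-weighted voting, where each device's reported confidence is weighted by its historical accuracy. This is a hypothetical illustration of the general idea; the function name, the weighting rule, and the accuracy table are assumptions, not the patented method:

```python
def combine_ocr_votes(outputs: list[tuple[str, str, float]],
                      device_accuracy: dict[str, float]) -> tuple[str, float]:
    """Pick one character from several OCR devices' outputs.

    outputs: (device_id, character, confidence in [0, 1]) per device.
    device_accuracy: historical accuracy in [0, 1] per device id.
    Each vote is weighted by the device's track record; the winning
    character and a normalized combined confidence are returned.
    """
    scores: dict[str, float] = {}
    for dev, ch, conf in outputs:
        scores[ch] = scores.get(ch, 0.0) + conf * device_accuracy.get(dev, 0.5)
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    return best, scores[best] / total
```

For example, two devices voting "O" with high weighted confidence will outvote one device voting "0", and the returned ratio serves as the system-level confidence indication.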


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
86% related
Feature (computer vision)
128.2K papers, 1.7M citations
85% related
Image segmentation
79.6K papers, 1.8M citations
84% related
Convolutional neural network
74.7K papers, 2M citations
84% related
Object detection
46.1K papers, 1.3M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    14
2022    41
2020    1
2019    2
2018    9
2017    51