Topic

Intelligent word recognition

About: Intelligent word recognition is a research topic. Over its lifetime, 2480 publications have been published within this topic, receiving 45813 citations.


Papers
Journal Article
Christopher John Burges, Jan Ben, John S. Denker, Yann LeCun, Craig R. Nohl
TL;DR: Describes “Shortest Path Segmentation” (SPS), a method that combines dynamic programming with a neural net recognizer to segment and recognize character strings, and applies it to handwritten ZIP Codes and handwritten words.
Abstract: We describe a method, “Shortest Path Segmentation” (SPS), which combines dynamic programming and a neural net recognizer for segmenting and recognizing character strings. We describe the application of this method to two problems: recognition of handwritten ZIP Codes, and recognition of handwritten words. For the ZIP Codes, we also used the method to automatically segment the images during training: the dynamic programming stage both performs the segmentation and provides inputs and desired outputs to the neural network. Results are reported for a test set of 2642 unsegmented handwritten 212 dpi binary ZIP Code (5- and 9-digit) images. For handwritten word recognition, we combined SPS with a “Space Displacement Neural Network” approach, in which a single-character-recognition network is extended over the entire word image, and in which SPS techniques are then used to rank order a given lexicon. We report results on a test set of 3000 300 ppi gray scale word images, extracted from images of live mail pieces, for lexicons of size 10, 100, and 1000. Representing the problem as a graph as proposed in this paper has advantages beyond the efficient finding of the final optimal segmentation, or the automatic segmentation of images during training. We can also easily extend the technique to generate K “runner up” answers (for example, by finding the K shortest paths). This paper will also describe applications of some of these ideas.

28 citations
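
The dynamic programming step described in the abstract above can be illustrated with a minimal sketch. It assumes the image has already been given a set of candidate cut points and that some recognizer has scored each candidate segment. The function name, the `segment_scores` dictionary, and the cost values are hypothetical and are not taken from the paper. The sketch finds only the single best path, not the K runner-up answers the abstract mentions.

```python
import math


def shortest_path_segmentation(n_cuts, segment_scores):
    """Find the cheapest left-to-right path through candidate cut points.

    n_cuts: number of candidate cut points, indexed 0 .. n_cuts - 1
        (0 is the left edge of the image, n_cuts - 1 the right edge).
    segment_scores: dict mapping (i, j) with i < j to (label, cost).
        The cost is assumed to come from a recognizer, for example a
        negative log-probability (hypothetical input, not the paper's).
    Returns (total_cost, labels), or (math.inf, []) if no path exists.
    """
    best = [math.inf] * n_cuts   # best[j]: cheapest cost to reach cut j
    back = [None] * n_cuts       # back[j]: (previous cut, label) on that path
    best[0] = 0.0

    # Cuts are ordered left to right, so the graph is acyclic and one
    # forward pass is enough.
    for j in range(1, n_cuts):
        for i in range(j):
            if (i, j) in segment_scores and best[i] < math.inf:
                label, cost = segment_scores[(i, j)]
                if best[i] + cost < best[j]:
                    best[j] = best[i] + cost
                    back[j] = (i, label)

    if best[-1] == math.inf:
        return math.inf, []

    # Walk the back-pointers from the right edge to recover the labels.
    labels = []
    j = n_cuts - 1
    while j != 0:
        i, label = back[j]
        labels.append(label)
        j = i
    return best[-1], labels[::-1]


# Toy example: either "1" then "0" (two narrow segments) or one wide "8".
scores = {
    (0, 1): ("1", 0.25),
    (1, 2): ("0", 0.25),
    (0, 2): ("8", 1.0),
    (2, 3): ("7", 0.5),
}
total_cost, labels = shortest_path_segmentation(4, scores)
```

In this toy example the path through the two narrow segments costs 1.0 and the path through the wide segment costs 1.5, so the sketch returns the labels "1", "0", "7".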

Book Chapter
11 Apr 2008
TL;DR: Reports research on Urdu Nastaliq OCR, discusses its challenges, and suggests a new solution for its implementation.
Abstract: Character recognition in cursive scripts or handwritten Latin script has attracted researchers’ attention recently and some research has been done in this area. Optical character recognition is the translation of optically-scanned bitmaps of printed or written text into digitally editable data files. OCRs developed for many world languages are already in use but none exists for Urdu Nastaliq – a calligraphic adaptation of the Arabic script, just as Jawi is for Malay. Urdu Nastaliq has 39 characters against Arabic's 28. Each character then has 2-4 different shapes according to its position in the word: initial, medial, final and isolated. In Nastaliq, inter-word and intra-word overlapping makes optical recognition more complex. Character recognition of the Latin script is relatively easier. This paper reports research on Urdu Nastaliq OCR, discusses challenges and suggests a new solution for its implementation.

27 citations

Patent
Yuji Izumi
04 Nov 2002
TL;DR: A handwritten character recognition apparatus recognizes a handwritten input pattern as one pictorial symbol formed of a plurality of characters that are similar in shape to the input pattern, and inputs the corresponding character codes.
Abstract: A handwritten character recognition apparatus performs a recognition process for a handwritten input pattern to input character codes. The handwritten character recognition apparatus recognizes a handwritten input pattern as one pictorial symbol formed of a plurality of characters. The plurality of characters are similar in shape to the handwritten input pattern.

27 citations

Proceedings Article
23 Aug 2004
TL;DR: Finite-state models are used to implement a handwritten text recognition and classification system for a real application entailing casual, spontaneous writing with large vocabulary.
Abstract: Finite-state models are used to implement a handwritten text recognition and classification system for a real application entailing casual, spontaneous writing with large vocabulary. Handwritten short paragraphs are to be classified into a small number of predefined classes. The paragraphs involve a wide variety of writing styles and contain many non-textual artifacts. HMMs and n-grams are used for text recognition and n-grams are also used for text classification. Experimental results are reported which, given the extreme difficulty of the task, are encouraging.

27 citations
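
The n-gram classification step mentioned in the abstract above can be illustrated with a small sketch: one bigram model per class, with the paragraph assigned to the class whose model gives it the highest likelihood. This assumes the text has already been recognized. The HMM recognition stage is not covered, and the class name, add-one smoothing, and toy training data are all assumptions for illustration rather than details from the paper.

```python
import math
from collections import Counter


class BigramClassifier:
    """One add-one-smoothed bigram model per class (illustrative only)."""

    def __init__(self):
        self.models = {}    # label -> (bigram counts, context counts)
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            words = ["<s>"] + text.lower().split()
            bigrams, contexts = self.models.setdefault(
                label, (Counter(), Counter())
            )
            bigrams.update(zip(words, words[1:]))
            contexts.update(words[:-1])
            self.vocab.update(words)
        return self

    def log_likelihood(self, text, label):
        bigrams, contexts = self.models[label]
        words = ["<s>"] + text.lower().split()
        v = len(self.vocab) + 1  # one extra slot for unseen words
        return sum(
            math.log((bigrams[(a, b)] + 1) / (contexts[a] + v))
            for a, b in zip(words, words[1:])
        )

    def predict(self, text):
        # Pick the class whose bigram model scores the text highest.
        return max(self.models, key=lambda lab: self.log_likelihood(text, lab))


# Hypothetical training paragraphs and class names.
clf = BigramClassifier().fit(
    [
        "please cancel my account",
        "cancel the subscription now",
        "my phone line is broken",
        "the line has no signal",
    ],
    ["cancel", "cancel", "fault", "fault"],
)
prediction = clf.predict("cancel my subscription")
```

A real system would have to cope with recognition errors in the input text. This sketch ignores them and treats the input as clean words.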

Book Chapter
27 Sep 2006
TL;DR: This paper proposes a quadratic classifier based scheme for the recognition of off-line handwritten characters of three popular south Indian scripts: Kannada, Telugu, and Tamil, using 64-dimensional features for high speed recognition and 400-dimensional features for high accuracy recognition.
Abstract: India is a multi-lingual, multi-script country. Considerably less work has been done towards handwritten character recognition of Indian languages than for other languages. In this paper we propose a quadratic classifier based scheme for the recognition of off-line handwritten characters of three popular south Indian scripts: Kannada, Telugu, and Tamil. The features used here are mainly obtained from the directional information. For feature computation, the bounding box of a character is segmented into blocks, and the directional features are computed in each block. These blocks are then down-sampled by a Gaussian filter, and the features obtained from the down-sampled blocks are fed to a modified quadratic classifier for recognition. Here, we used two sets of features. We used 64-dimensional features for high speed recognition and 400-dimensional features for high accuracy recognition. A five-fold cross validation technique was used for result computation, and we obtained 90.34%, 90.90%, and 96.73% accuracy rates from Kannada, Telugu, and Tamil characters, respectively, from 400 dimensional features.

27 citations
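
The block-wise directional features and quadratic classifier described in the abstract above can be sketched roughly as follows. This is not the paper's method: the abstract gives no block layout, so the 4 x 4 grid with 4 direction bins is an assumption chosen only because it yields 64 values. The Gaussian down-sampling step is omitted, and a plain regularized quadratic discriminant stands in for the modified quadratic classifier. All names are hypothetical.

```python
import numpy as np


def directional_features(img, grid=4, n_dirs=4):
    """Block-wise histogram of gradient directions for one character image."""
    img = np.asarray(img, dtype=float)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    # Fold angles into [0, pi) so opposite gradients share a direction bin.
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.minimum((ang / np.pi * n_dirs).astype(int), n_dirs - 1)

    h, w = img.shape
    rows = np.linspace(0, h, grid + 1).astype(int)
    cols = np.linspace(0, w, grid + 1).astype(int)
    feats = np.zeros((grid, grid, n_dirs))
    for r in range(grid):
        for c in range(grid):
            b = bins[rows[r]:rows[r + 1], cols[c]:cols[c + 1]].ravel()
            m = mag[rows[r]:rows[r + 1], cols[c]:cols[c + 1]].ravel()
            # Sum gradient magnitude per direction bin inside this block.
            feats[r, c] = np.bincount(b, weights=m, minlength=n_dirs)
    return feats.ravel()


class QuadraticClassifier:
    """Plain Gaussian quadratic discriminant with a regularized covariance."""

    def fit(self, X, y, reg=1e-3):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mean = Xc.mean(axis=0)
            cov = np.cov(Xc, rowvar=False) + reg * np.eye(X.shape[1])
            logdet = np.linalg.slogdet(cov)[1]
            self.params_.append((mean, np.linalg.inv(cov), logdet))
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = []
        for mean, inv, logdet in self.params_:
            d = X - mean
            # Mahalanobis distance plus log-determinant; smaller is better.
            scores.append(np.einsum("ij,jk,ik->i", d, inv, d) + logdet)
        return self.classes_[np.argmin(scores, axis=0)]


# Feature vector length for a toy 16 x 16 image: 4 * 4 * 4 = 64.
vec = directional_features(np.eye(16))

# Toy 2-D data standing in for feature vectors of two character classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(10, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
pred = QuadraticClassifier().fit(X, y).predict([[0, 0], [10, 10]])
```

In practice the classifier would be fitted on feature vectors from `directional_features` rather than on the toy 2-D points used here.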


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
86% related
Feature (computer vision)
128.2K papers, 1.7M citations
85% related
Image segmentation
79.6K papers, 1.8M citations
84% related
Convolutional neural network
74.7K papers, 2M citations
84% related
Object detection
46.1K papers, 1.3M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  14
2022  41
2020  1
2019  2
2018  9
2017  51