Topic

Intelligent word recognition

About: Intelligent word recognition is a research topic. Over its lifetime, 2,480 publications have been published within this topic, receiving 45,813 citations.


Papers
Proceedings Article
10 Aug 1998
TL;DR: A novel OCR error correction method for languages that have no word delimiters and a large character set, such as Japanese and Chinese. By using a statistical OCR model and character shape similarity, it outperforms the previously published method.
Abstract: We present a novel OCR error correction method for languages without word delimiters that have a large character set, such as Japanese and Chinese. It consists of a statistical OCR model, an approximate word matching method using character shape similarity, and a word segmentation algorithm using a statistical language model. By using a statistical OCR model and character shape similarity, the proposed error corrector outperforms the previously published method. When the baseline character recognition accuracy is 90%, it achieves 97.4% character recognition accuracy.

24 citations
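
The combination of a statistical OCR model with a language model described in the abstract above can be illustrated as a noisy-channel scorer. This is a minimal sketch under stated assumptions, not the paper's method: the lexicon, the probabilities, the confusion table, and the names `channel_prob`, `score`, and `correct` are all hypothetical, and the word segmentation step is omitted by assuming the input is already a single word of the same length as its candidates.

```python
import math

# Hypothetical toy language model: unigram word probabilities.
WORD_PROB = {"cat": 0.6, "cot": 0.3, "cut": 0.1}

# Hypothetical OCR model: P(observed char | true char) for visually
# similar pairs, keyed as (true, observed).
CONFUSION = {("a", "o"): 0.15, ("o", "a"): 0.15, ("o", "0"): 0.10, ("u", "o"): 0.05}
P_CORRECT = 0.8    # assumed probability that a character is read correctly
P_OTHER = 0.001    # assumed probability of any other, dissimilar confusion


def channel_prob(observed: str, true: str) -> float:
    """Return the assumed P(observed | true) for one character."""
    if observed == true:
        return P_CORRECT
    return CONFUSION.get((true, observed), P_OTHER)


def score(observed: str, candidate: str) -> float:
    """Log of P(candidate) * P(observed | candidate), same-length words only."""
    if len(observed) != len(candidate):
        return float("-inf")
    total = math.log(WORD_PROB[candidate])
    for obs_ch, true_ch in zip(observed, candidate):
        total += math.log(channel_prob(obs_ch, true_ch))
    return total


def correct(observed: str) -> str:
    """Pick the lexicon word with the highest noisy-channel score."""
    return max(WORD_PROB, key=lambda word: score(observed, word))


print(correct("c0t"))
```

Here the shape-similar confusion of "o" with "0" lets the less frequent word win over the most frequent one, which is the role character shape similarity plays in the scoring.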

Proceedings Article
01 Jan 2009
TL;DR: A novel technique is presented for recognition of handwritten compound characters of the Bangla alphabet, which advocates incrementally expanding the number of learned character classes from the more frequently occurring to the less frequently occurring ones.
Abstract: A novel technique is presented here for recognition of handwritten compound characters of the Bangla alphabet. It advocates incrementally expanding the number of learned character classes from the more frequently occurring to the less frequently occurring ones. The work is preceded by a survey of the frequencies of occurrence of all Bangla characters in the standard literature. One important finding of the survey is that, on average, only 4.27 percent of the characters in a standard piece of text are compound characters. Out of the 160 compound character classes, characters of 55 classes constitute, on average, 90 percent of the compound characters occurring in a standard piece of text. For the time being, only handwritten characters from these classes are considered here. The average recognition rate observed in this work is 84.67 percent after 3-fold cross validation. It is roughly comparable with the performance reported in another related work [3]. The work presented here can be considered an important step towards the development of OCR for handwritten Bangla characters, including complex-shaped compound characters.

24 citations
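
The frequency-based selection of classes described in the abstract above (a small set of classes covering 90 percent of occurrences) can be sketched as a cumulative-coverage cut. This is only an illustration under assumptions: the function name `select_classes` and the toy counts are hypothetical and are not taken from the paper's survey.

```python
def select_classes(counts: dict[str, int], coverage: float = 0.9) -> list[str]:
    """Return the most frequent classes whose counts together reach
    at least `coverage` of all occurrences."""
    total = sum(counts.values())
    selected: list[str] = []
    running = 0
    # Visit classes from most to least frequent.
    for label, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        if running >= coverage * total:
            break
        selected.append(label)
        running += count
    return selected


# Hypothetical occurrence counts for four character classes.
toy_counts = {"A": 50, "B": 30, "C": 15, "D": 5}
print(select_classes(toy_counts, coverage=0.9))
```

Training can then start with the selected classes and add the remaining ones later, in line with the incremental expansion the abstract advocates.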

Book Chapter
09 Dec 2008
TL;DR: A novel skeletonization algorithm called MFITS (morphology-fused index table skeletonization) is proposed, together with a skeleton-based method for recognizing Chinese calligraphic characters.
Abstract: The large number of digitized Chinese calligraphic works in existence is a valuable part of the Chinese cultural heritage. However, they can hardly be recognized by optical character recognition (OCR), which performs well on machine-printed characters against a clean background, because calligraphic characters come in many different styles and have complex shapes. Approaches to automatic Chinese calligraphic character recognition are therefore becoming more and more important. A novel skeletonization algorithm called MFITS (morphology-fused index table skeletonization) is proposed, together with a skeleton-based Chinese calligraphic character recognition method. The experiments show that MFITS can extract skeletons with only a few deformations and that the skeleton-based recognition method performs well.

24 citations

Journal Article
TL;DR: This study proposes a novel solution for character recognition in Tamil that uses octal graph conversion to recognize off-line handwritten Tamil characters and improves slant correction; the results indicate that the approach can be used for character recognition in other Indic scripts as well.
Abstract: Problem Statement: Handwriting recognition has attracted voluminous research in recent times. The segmentation and recognition of characters from handwritten scripts incurs considerable overhead. Almost all existing handwritten character recognition techniques use a neural network approach, which requires a lot of preprocessing, so solving these problems with neural networks is a tedious task. Approach: In this study we propose a novel solution for performing character recognition in Tamil, the official language of the south Indian state of Tamil Nadu. Following the preprocessing steps of segmentation, normalization, and feature extraction, the approach uses octal graph conversion for recognizing off-line handwritten Tamil characters, which improves slant correction. The graph tries to represent the basic form of a letter independently of the style of writing. Using the weights of the graphs and appropriate feature matching against the predefined characters, the written characters are recognized. Results: The performance evaluation of off-line handwritten Tamil character recognition using octal graph conversion, with metrics based on the ranks of the letters, shows good recognition efficiency. Conclusion: We show that, in practice, the proposed approach produces near-optimal results and outperforms the other existing methodologies. Results indicate that the approach can be used for character recognition in other Indic scripts as well.

24 citations

Journal Article
TL;DR: A powerful segmentation-free letter detection method based on joint boosting with histograms of gradients as features, combined with efficient inference on an ensemble of hidden Markov models to recognize complete words in ambiguous handwritten text.

24 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Feature (computer vision): 128.2K papers, 1.7M citations, 85% related
Image segmentation: 79.6K papers, 1.8M citations, 84% related
Convolutional neural network: 74.7K papers, 2M citations, 84% related
Object detection: 46.1K papers, 1.3M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    14
2022    41
2020    1
2019    2
2018    9
2017    51