Topic
Intelligent word recognition
About: Intelligent word recognition is a research topic. Over the lifetime, 2480 publications have been published within this topic receiving 45813 citations.
Papers published on a yearly basis
Papers
More filters
••
31 Aug 2005TL;DR: IAM-OnDB is a new large online handwritten sentences database that consists of text acquired via an electronic interface from a whiteboard and a recognizer for unconstrained English text that was trained and tested using this database.
Abstract: In this paper we present IAM-OnDB - a new large online handwritten sentences database. It is publicly available and consists of text acquired via an electronic interface from a whiteboard. The database contains about 86 K word instances from an 11 K dictionary written by more than 200 writers. We also describe a recognizer for unconstrained English text that was trained and tested using this database. This recognizer is based on hidden Markov models (HMMs). In our experiments we show that by using larger training sets we can significantly increase the word recognition rate. This recognizer may serve as a benchmark reference for future research.
208 citations
••
TL;DR: An overview of the character segmentation techniques in machine-printed documents is presented, which will cover techniques for segmenting uniformed or proportional fonts, broken and touching characters; techniques based on text image features and techniquesbased on recognition results.
206 citations
••
TL;DR: A lexicon-based, handwritten word recognition system combining segmentation-free and segmentations-based techniques is described that uses dynamic programming to match word images and strings.
Abstract: A lexicon-based, handwritten word recognition system combining segmentation-free and segmentation-based techniques is described. The segmentation-free technique constructs a continuous density hidden Markov model for each lexicon string. The segmentation-based technique uses dynamic programming to match word images and strings. The combination module uses differences in classifier capabilities to achieve significantly better performance.
193 citations
•
17 Feb 2005TL;DR: In this paper, a domain-based speech recognition method and apparatus is proposed, which performs speech recognition by using a first language model and generating a first recognition result including a plurality of first recognition sentences.
Abstract: A domain-based speech recognition method and apparatus, the method including: performing speech recognition by using a first language model and generating a first recognition result including a plurality of first recognition sentences; selecting a plurality of candidate domains, by using a word included in each of the first recognition sentences and having a confidence score equal to or higher than a predetermined threshold, as a domain keyword; performing speech recognition with the first recognition result, by using an acoustic model specific to each of the candidate domains and a second language model and generating a plurality of second recognition sentences; and selecting at least one or more final recognition sentence from the first recognition sentences and the second recognition sentences. According to this method and apparatus, the effect of a domain extraction error by misrecognition of a word on selection of a final recognition result can be minimized.
187 citations
••
03 Aug 2003TL;DR: A robust scheme to segment unconstrained handwritten Banglatexts into lines, words and characters based on water reservoir principle is proposed to take care of variability involved in the writing style of different individuals.
Abstract: To take care of variability involved in the writing style ofdifferent individuals in this paper we propose a robustscheme to segment unconstrained handwritten Banglatexts into lines, words and characters. For linesegmentation, at first, we divide the text into verticalstripes. Stripe width of a document is computed bystatistical analysis of the text height in the document.Next we determine horizontal histogram of these stripesand the relationship of the minimal values of thehistograms is used to segment text lines. Based onvertical projection profile lines are segmented intowords. Segmentation of characters from handwrittenword is very tricky as the characters are seldomvertically separable. We use a concept based on waterreservoir principle for the purpose. Here we, at first,identify isolated and connected (touching) characters ina word. Next touching characters of the word aresegmented based on the reservoir base area points andstructural feature of the component.
177 citations