A hidden Markov model-based approach to recognizing off-line unconstrained handwritten words for large vocabularies, shown experimentally to be successful for handwritten word recognition.
Abstract:
Describes a hidden Markov model-based approach designed to recognize off-line unconstrained handwritten words for large vocabularies. After preprocessing, a word image is segmented into letters or pseudoletters and represented by two feature sequences of equal length, each consisting of an alternating sequence of shape-symbols and segmentation-symbols, which are both explicitly modeled. The word model is made up of the concatenation of appropriate letter models consisting of elementary HMMs, and an HMM-based interpolation technique is used to optimally combine the two feature sets. Two rejection mechanisms are considered depending on whether or not the word image is guaranteed to belong to the lexicon. Experiments carried out on real-life data show that the proposed approach can be successfully used for handwritten word recognition.
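As an illustration of building a word model from letter models (a minimal sketch, not the authors' exact topology or parameterization), each letter can be a small left-to-right HMM whose exit transition feeds the next letter's entry state, so a word HMM is just a block-diagonal chain of letter transition matrices:

```python
import numpy as np

def letter_hmm(n_states=2, p_stay=0.5):
    """Toy left-to-right letter model.
    Row i holds [self-loop, advance] probabilities; the extra
    final column is the probability of exiting the letter."""
    A = np.zeros((n_states, n_states + 1))
    for i in range(n_states):
        A[i, i] = p_stay          # stay in the current state
        A[i, i + 1] = 1.0 - p_stay  # move right (last move = exit)
    return A

def concatenate(models):
    """Chain letter models so that exiting one letter enters the
    next, yielding a single word-level transition matrix."""
    sizes = [m.shape[0] for m in models]
    n = sum(sizes)
    A = np.zeros((n, n + 1))  # last column = exit of the whole word
    off = 0
    for m, s in zip(models, sizes):
        # each block's exit column overlaps the next block's entry state
        A[off:off + s, off:off + s + 1] = m
        off += s
    return A

# A 3-letter word built from 2-state letter models -> 6 states.
word = concatenate([letter_hmm() for _ in "cat"])
```

Each row of the resulting matrix remains a proper probability distribution over the next state (including the word-exit event), which is what lets standard HMM decoding run unchanged on the concatenated model.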
TL;DR: This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies, significantly outperforming a state-of-the-art HMM-based system.
TL;DR: A review of the OCR work done on Indian language scripts and the scope of future work and further steps needed for Indian script OCR development is presented.
TL;DR: The use of hybrid Hidden Markov Model (HMM)/Artificial Neural Network (ANN) models for recognizing unconstrained offline handwritten texts and new techniques to remove slope and slant from handwritten text and to normalize the size of text images with supervised learning methods are presented.
TL;DR: Novel, general methods for detecting landmine signatures in ground penetrating radar (GPR) using hidden Markov models (HMMs) are proposed and evaluated and successfully tested at two different locations.
TL;DR: This survey is divided into two parts, the first one dealing with the general aspects of Cursive Word Recognition, the second one focusing on the applications presented in the literature.
TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
TL;DR: This paper gives a tutorial exposition of the Viterbi algorithm and of how it is implemented and analyzed, and increasing use of the algorithm in a widening variety of areas is foreseen.
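For reference, the Viterbi algorithm described in that tutorial can be written in a few lines: it keeps, for each state, the log-probability of the best path ending there, plus back-pointers to recover the path. This is a generic textbook sketch (state and observation indices are arbitrary, not tied to the paper's models):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path for a discrete-output HMM.

    obs : sequence of observation indices
    pi  : (N,) initial state probabilities
    A   : (N, N) transitions, A[i, j] = P(state j | state i)
    B   : (N, M) emissions,   B[i, k] = P(symbol k | state i)
    """
    N, T = len(pi), len(obs)
    logA, logB = np.log(A), np.log(B)  # log space avoids underflow
    delta = np.log(pi) + logB[:, obs[0]]  # best log-prob per end state
    psi = np.zeros((T, N), dtype=int)     # back-pointers
    for t in range(1, T):
        scores = delta[:, None] + logA    # scores[i, j]: i -> j
        psi[t] = np.argmax(scores, axis=0)
        delta = scores[psi[t], np.arange(N)] + logB[:, obs[t]]
    # Trace the best path backwards through the pointers.
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1], float(np.max(delta))
```

In word recognition the same recursion scores a feature sequence against each lexicon entry's word HMM, and the highest-scoring word wins.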
TL;DR: During the past few years several design algorithms have been developed for a variety of vector quantizers and the performance of these codes has been studied for speech waveforms, speech linear predictive parameter vectors, images, and several simulated random processes.
TL;DR: The EM (expectation-maximization) algorithm is ideally suited to problems of parameter estimation, in that it produces maximum-likelihood (ML) estimates of parameters when there is a many-to-one mapping from an underlying distribution to the distribution governing the observation.
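The many-to-one mapping the summary mentions is easy to see in the classic two-coin problem: we observe head counts but not which coin produced each trial. A minimal EM sketch (illustrative, not from the paper) alternates posterior responsibilities (E-step) with weighted re-estimation (M-step):

```python
import numpy as np

def em_two_coins(heads, n, iters=50, theta=(0.4, 0.6)):
    """EM for a mixture of two biased coins with hidden coin identity.

    heads : head count for each trial of n tosses
    theta : initial guesses for the two coins' head probabilities
    """
    heads = np.asarray(heads, dtype=float)
    theta = np.array(theta, dtype=float)
    for _ in range(iters):
        # E-step: responsibility of each coin for each trial.
        # Binomial likelihoods; the n-choose-k factor cancels in the ratio.
        like = theta ** heads[:, None] * (1 - theta) ** (n - heads[:, None])
        resp = like / like.sum(axis=1, keepdims=True)
        # M-step: each coin's bias = its weighted fraction of heads.
        theta = (resp * heads[:, None]).sum(axis=0) / (resp * n).sum(axis=0)
    return theta
```

Baum-Welch training of HMMs is the same idea with the hidden variable being the state sequence instead of the coin identity.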
TL;DR: This paper describes a number of statistical models for use in speech recognition, with special attention to determining the parameters for such models from sparse data, and describes two decoding methods appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks.
Q1. What contributions have the authors mentioned in the paper "An hmm-based approach for off-line unconstrained handwritten word modeling and recognition" ?
This paper describes a hidden Markov model-based approach designed to recognize off-line unconstrained handwritten words for large vocabularies.
Q2. What is the goal of the feature extraction phase?
The goal of the feature extraction phase is to extract, in an ordered way (suitable to Markovian modeling), a set of relevant features that reduce redundancy in the word image while preserving the discriminative information for recognition.
Q3. What is the usual solution to overcome this problem?
The usual solution to overcome this problem is to first make structural assumptions and then use parameter estimation to improve the probability of generating the training data by the models.
Q4. What is the way to model the vocabulary of words?
Since the entire vocabulary of words is large, it is more realistic to model basic units, such as letters, rather than whole words.
Q5. What is the main strength of the proposed system?
The main strength of the proposed system lies in its training phase, which does not require any manual segmentation of the data to train the character models.
Q6. What is the a priori selector of the writing style?
The ratio of the number of filtered maxima over the total number of maxima is used as an a priori selector of the writing style: cursive or uppercase (in which case no normalization is done).
Q7. What is the way to handle the two sequences?
An obvious solution to handle the two sequences is to use two independent word recognition engines and combine their outputs in a subsequent stage.
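That late-fusion baseline can be sketched as a weighted combination of the per-word log-likelihoods produced by the two engines (an illustrative sketch of the "two independent engines" alternative, not the paper's HMM-based interpolation; the weight `lam` is a hypothetical tuning parameter):

```python
def combine_scores(scores_a, scores_b, lam=0.5):
    """Late fusion of two recognizers' per-word log-likelihoods:
    score(w) = lam * logP_a(w) + (1 - lam) * logP_b(w)."""
    return {w: lam * scores_a[w] + (1 - lam) * scores_b[w]
            for w in scores_a}

# Engine A slightly prefers "chat"; engine B strongly prefers "char".
a = {"chat": -12.0, "char": -15.0}
b = {"chat": -20.0, "char": -11.0}
fused = combine_scores(a, b)
best = max(fused, key=fused.get)
```

Combining the features inside a single HMM, as the paper does, instead lets the model weigh the two streams frame by frame rather than only at the word level.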
Q8. What are the limitations of hidden Markov models?
Although HMMs have some limitations, such as the assumption that observations are conditionally independent given the state sequence, these limitations are counterbalanced by the well-defined theoretical foundations of HMMs and by the existence of powerful algorithms for decoding and training.
Q9. What is the definition of character skew?
Character skew is estimated as the average slant of elementary segments obtained by sampling the word image contour, without taking into account horizontal and pseudohorizontal segments.
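A minimal sketch of that estimate (hypothetical segment representation and tolerance threshold, not the authors' exact procedure): orient every contour segment upward, measure its angle from the vertical, discard near-horizontal segments, and average the rest:

```python
import math

def estimate_slant(segments, horiz_tol_deg=70.0):
    """Average slant in degrees from the vertical over contour
    segments, skipping horizontal and pseudohorizontal ones
    (those within horiz_tol_deg of the horizontal)."""
    angles = []
    for (x0, y0), (x1, y1) in segments:
        dx, dy = x1 - x0, y1 - y0
        if dy < 0:                # orient every segment to point "up"
            dx, dy = -dx, -dy
        # 0 deg = vertical stroke, +/-90 deg = horizontal stroke
        ang = math.degrees(math.atan2(dx, dy))
        if abs(ang) >= horiz_tol_deg:
            continue              # drop (pseudo)horizontal segments
        angles.append(ang)
    return sum(angles) / len(angles) if angles else 0.0
```

The resulting average angle can then drive a shear transform that removes the slant before feature extraction.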
Q10. What is the way to solve the problem of ambiguity between dynamic lexicons?
An elegant method could be a hierarchical data representation which provides more details as the feature sequence length gets smaller and the ambiguity between the dynamic lexicon candidates (to be computed dynamically) gets higher.