Word image matching using dynamic time warping
Citations
Exact indexing of dynamic time warping
The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances
Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques
Computing and Visualizing Dynamic Time Warping Alignments in R: The dtw Package
A global averaging method for dynamic time warping, with applications to clustering
References
Shape matching and object recognition using shape contexts
Dynamic programming algorithm optimization for spoken word recognition
Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison
Time Warps, String Edits, and Macromolecules – The Theory and Practice of Sequence Comparison . David Sankoff and Joseph Kruskal. ISBN 1-57586-217-4. Price £13.95 (US$22·95).
An algorithm for associating the features of two images
Related Papers (5)
Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system
Frequently Asked Questions (13)
Q2. What future work do the authors describe in "Word image matching using dynamic time warping"?
Their future work will focus on improving the accuracy as well as the speed of the techniques used here. Accuracy can be improved by using better pruning techniques as well as using a larger feature set which discriminates words better from each other. Speed can be improved by optimizing their implementation of the dynamic time warping algorithm, as well as looking at related computational techniques to minimize the number of possible matches.
Q3. How do the authors compensate for variations in inter-character and intra-character spacing?
DTW offers a more flexible way to compensate for these variations than linear scaling: in the matching algorithm that the authors propose, image columns are aligned and compared using DTW.
Q4. How is the identification of ink pixels realized?
The identification of ink pixels is currently realized using a thresholding technique which the authors have found to be sufficient for their purposes.
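The source does not say which thresholding technique is used, so the following is only a minimal sketch of one simple option: a fixed global threshold on a grayscale image. The function names, the threshold value of 128, and the convention that 0 is black and 255 is white are all assumptions made for illustration.

```python
def ink_mask(gray, threshold=128):
    """Mark a pixel as ink (True) when it is darker than `threshold`.

    `gray` is a list of rows of grayscale values, with 0 = black and
    255 = white (an assumed convention). The default threshold is a
    hypothetical value, not one taken from the paper.
    """
    return [[pixel < threshold for pixel in row] for row in gray]


def columns_with_ink(mask):
    """For each image column, report whether it contains any ink pixel."""
    width = len(mask[0])
    return [any(row[c] for row in mask) for c in range(width)]


# Tiny 2x3 example: only the middle column is dark enough to count as ink.
gray = [
    [255, 40, 250],
    [255, 30, 245],
]
mask = ink_mask(gray)
has_ink = columns_with_ink(mask)
```

A fixed threshold is the simplest choice; on scans with uneven contrast an adaptive threshold may be more robust, but nothing in the source indicates which variant the authors use.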
Q5. What preprocessing is applied to the word images before matching with DTW and shape context?
For the matching based on DTW and the shape context run (see below), the authors normalized the slant and skew of the word images and cleaned the images to remove noise in the background and parts of other words that reach into the bounding box.
Q6. What is the method for determining the likelihood of a pair of words matching?
Previous research [3] indicates that good matching performance can be achieved by a technique that skews, resizes and aligns two candidate word images with respect to each other and then compares them pixel-by-pixel.
Q7. How much do slant, skew, and character spacing vary in a person's writing?
While the slant and skew angle at which a person writes is usually constant for single words, the inter-character and intra-character spacing is subject to larger variations.
Q8. Why do some image columns not contain ink pixels?
Due to a number of factors, such as pressure on the writing instrument and fading ink, some image columns may not contain ink pixels.
Q9. What is the effect of the pruning method on the smaller set A?
The authors attribute this effect to the pruning method, which works much better on the smaller set A: while the pruning preserves about 91% of the relevant documents for data set A, it only produces 71% recall on data set B.
Q10. What constraint is used to ensure that the path stays close to the diagonal of the matrix?
The authors use the Sakoe-Chiba band constraint [7] to ensure this path stays close to the diagonal of the matrix which contains the D(i, j) (see Figure 3(b)).
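As a rough illustration of how a band constraint restricts the warping path, here is a minimal sketch of DTW with a Sakoe-Chiba band. It uses the standard recurrence D(i, j) = d(x_i, y_j) + min(D(i-1, j), D(i, j-1), D(i-1, j-1)) and only fills cells with |i - j| <= radius. The function name, the `radius` parameter, the scalar per-column features, and the squared-difference local distance are assumptions for this example and are not taken from the authors' implementation.

```python
import math


def dtw_band(x, y, radius, dist=lambda a, b: (a - b) ** 2):
    """DTW cost between sequences x and y under a Sakoe-Chiba band.

    Only cells (i, j) with |i - j| <= radius are evaluated, which keeps
    the warping path close to the diagonal of the cost matrix.
    Returns math.inf when the band is too narrow to reach the end cell,
    for example when the lengths differ by more than `radius`.
    """
    n, m = len(x), len(y)
    # D has an extra row and column so that D[0][0] can seed the recurrence.
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        j_lo = max(1, i - radius)
        j_hi = min(m, i + radius)
        for j in range(j_lo, j_hi + 1):
            cost = dist(x[i - 1], y[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# The second sequence repeats one value; a band of radius 1 is wide
# enough to absorb that stretch.
stretched = dtw_band([1, 2, 3], [1, 2, 2, 3], radius=1)
# With radius 0 the path must stay exactly on the diagonal.
rigid = dtw_band([1, 2, 3], [2, 2, 2], radius=0)
```

Besides keeping the path near the diagonal, the band reduces the number of matrix cells that have to be computed, since cells outside it are never visited.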
Q11. What is the way to index a collection of handwritten documents?
A human can tag the n most interesting clusters for indexing with the appropriate ASCII equivalent, which could be used to build a partial index for the analyzed collection.
Q12. How accurate is the word alignment algorithm?
The word alignment accuracy of just about 83% (on a single page) shows how challenging the task of word spotting for historical documents is, even in the presence of a perfect (manually generated) transcript.
Q13. Why do different projection profiles not vary in the same range?
Due to the variations in quality (e.g. contrast, faded ink) of the scanned images, different projection profiles do not generally vary in the same range.
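One common way to make such profiles comparable is to rescale each one to the range [0, 1]. The sketch below assumes that a projection profile is the per-column sum of grayscale values and uses min-max scaling; both the definition and the choice of normalization are assumptions for illustration, since the excerpt above does not spell them out.

```python
def projection_profile(gray):
    """Sum the grayscale values in each column of `gray` (a list of rows)."""
    width = len(gray[0])
    return [sum(row[c] for row in gray) for c in range(width)]


def min_max_normalize(profile):
    """Rescale a profile to [0, 1]; a constant profile maps to all zeros."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return [0.0] * len(profile)
    return [(v - lo) / (hi - lo) for v in profile]


# Two images with the same shape but different contrast give profiles in
# different ranges; after scaling they can be compared directly.
faint = projection_profile([[10, 20, 30], [10, 40, 50]])
strong = projection_profile([[20, 40, 60], [20, 80, 100]])
faint_scaled = min_max_normalize(faint)
strong_scaled = min_max_normalize(strong)
```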