Enhancing Word Image Retrieval in Presence of Font Variations
Citations
A survey of document image word spotting techniques
A brief review of document image retrieval methods: Recent advances
Texture Feature-based Document Image Retrieval
References
Separating Style and Content with Bilinear Models
Word image matching using dynamic time warping
Writer adaptation for online handwriting recognition
Personalized handwriting recognition via biased regularization
Matching word images for content-based retrieval from printed document images
Frequently Asked Questions (13)
Q2. What are the future works mentioned in the paper "Enhancing word image retrieval in presence of font variations" ?
Their future work will be to learn the font/style independent features from a large collection of document images.
Q3. What are the parameters used to perform retrieval on the test set?
Optimal values for the kernel parameters and the regularization factors β and λ are found by performing retrieval on the validation set; these optimal parameters are then used when performing retrieval on the test set.
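The selection procedure described above can be sketched as a plain grid search. This is a minimal illustration, not the authors' code: the function `select_hyperparameters`, the parameter names `beta`, `lam`, and `gamma`, the grid values, and the toy scoring function are all assumptions standing in for "mAP measured on the validation set".

```python
from itertools import product

def select_hyperparameters(grid, validation_score):
    """Return the parameter setting with the highest validation score."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    # Try every combination of candidate values.
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_score(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_validation_map(params):
    # Hypothetical stand-in for retrieval mAP on the validation set;
    # it peaks at beta=0.1, lam=1.0, gamma=0.5.
    return (1.0
            - (params["beta"] - 0.1) ** 2
            - (params["lam"] - 1.0) ** 2
            - (params["gamma"] - 0.5) ** 2)

# Hypothetical candidate values for the regularizers and one kernel parameter.
grid = {"beta": [0.01, 0.1, 1.0], "lam": [0.1, 1.0, 10.0], "gamma": [0.5, 2.0]}
best_params, best_score = select_hyperparameters(grid, toy_validation_map)
print(best_params, best_score)
```

In practice `validation_score` would run the full retrieval pipeline on the validation set, and the returned `best_params` would then be frozen for the test set.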
Q4. What is the way to make a linear model more robust?
To make linear models more robust, it is a common practice to first map the feature vectors in the original space to a high dimensional space and then learn the linear model over the high dimensional space.
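A small sketch may help show why the mapping matters. The example below is an illustration under stated assumptions, not the paper's model: it uses the XOR problem, a hand-chosen explicit feature map `feature_map` that adds the product term, and a simple perceptron as the linear learner. In the original two-dimensional space no linear model separates the classes; after the mapping one does.

```python
def feature_map(x):
    """Map a 2-D point to 3-D by appending the product of its coordinates."""
    x1, x2 = x
    return (x1, x2, x1 * x2)

def train_perceptron(samples, labels, epochs=200):
    """Learn a linear separator (w, b) with the classic perceptron rule."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified or on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # every sample is classified correctly
            break
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]  # XOR: not linearly separable in 2-D

# Linear model learned directly in the original space.
w_raw, b_raw = train_perceptron(points, labels)
raw_preds = [predict(w_raw, b_raw, p) for p in points]

# Linear model learned after mapping to the higher-dimensional space.
mapped = [feature_map(p) for p in points]
w, b = train_perceptron(mapped, labels)
mapped_preds = [predict(w, b, x) for x in mapped]

print("original space:", raw_preds)
print("mapped space:  ", mapped_preds)
```

Kernel methods achieve the same effect without computing the mapping explicitly, which is what makes very high dimensional spaces practical.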
Q5. What is the hypothesis of style transfer?
Their hypothesis is that a style-transformed query would be closer to the correct matches and would lead to better performance of the nearest neighbor classifier.
Q6. What is the common approach to addressing font style variations in word image retrieval?
To address font style variations in word image retrieval, a common strategy is to use a font-independent feature representation.
Q7. What is the easiest method to do style transfer of a query?
A straightforward method to do style transfer of the query is to decompose it into style and content factors using a bilinear model [10].
Q8. How can the authors represent the ith word label using the asymmetric bilinear model?
The ith column of $Y^t$, corresponding to the mean vector of the ith word label, can be represented using the asymmetric bilinear model as $y_i^t$ =
Q9. How is the retrieval performed on target dataset?
Retrieval is then performed on the target dataset on the basis of the distance between the content vectors of the query word images and the content vectors of the target dataset word images.
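The distance-based ranking described here can be sketched as follows. This is a minimal illustration that assumes Euclidean distance and uses made-up two-dimensional content vectors; the helper names `euclidean` and `retrieve` are hypothetical, and the source does not state which distance the authors use.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def retrieve(query_content, target_contents, top_k=None):
    """Rank target indices by increasing distance to the query content vector."""
    ranked = sorted(
        range(len(target_contents)),
        key=lambda i: euclidean(query_content, target_contents[i]),
    )
    return ranked if top_k is None else ranked[:top_k]

# Toy content vectors for four target word images (illustrative values only).
targets = [(0.9, 0.1), (0.0, 1.0), (1.0, 0.0), (0.2, 0.8)]
query = (1.0, 0.05)

ranking = retrieve(query, targets)
print(ranking)
```

Because only content vectors are compared, two renderings of the same word in different fonts should land near each other, provided the style factor has been separated out well.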
Q10. How do you find fonts in documents?
The authors have also suggested a font-independent retrieval strategy that represents words from all the documents using the same set of high-dimensional basis vectors.
Q11. How do the authors obtain content vector representation for all of the target dataset word images?
Using this kernel bilinear model, the authors obtain content vector representation for all of the target dataset word images and use them to perform nearest neighbor based retrieval on the basis of their distance with the content vectors corresponding to query labels from the training dataset.
Q12. What is the way to transfer word images?
$A^s$ and the content vectors $B^c$ (each column is a content vector corresponding to a word label) can be obtained by solving the following optimization problem:

$$\min_{A^s, B^c} \| Y^s - A^s B^c \|_F^2 \qquad (2)$$

If the same number of word images is available for all the word labels, this problem can be solved with the help of the SVD of the matrix $Y^s$. Consider the task of rendering word images in a new font using the asymmetric bilinear model.
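The factorization in (2) and the rendering task can be sketched with NumPy. This is a sketch under stated assumptions rather than the authors' implementation: the data are synthetic and exactly low rank, the model dimension `rank` is chosen by hand, and the new font's style matrix is fitted by ordinary least squares on the labels seen in that font, which is one common way to do it. The helper names `fit_asymmetric_bilinear` and `fit_style_matrix` are hypothetical.

```python
import numpy as np

def fit_asymmetric_bilinear(Y, rank):
    """Best rank-`rank` factorization Y ~ A @ B in Frobenius norm via truncated SVD."""
    U, S, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # style matrix: singular values folded into A
    B = Vt[:rank, :]             # content vectors, one column per word label
    return A, B

def fit_style_matrix(Y_new, B_known):
    """Least-squares style matrix for a new font, given known content vectors."""
    return Y_new @ np.linalg.pinv(B_known)

rng = np.random.default_rng(0)
feature_dim, rank, num_labels = 12, 3, 8

# Synthetic source-font data: each column is the mean feature vector of a word label.
Y_s = rng.normal(size=(feature_dim, rank)) @ rng.normal(size=(rank, num_labels))
A_s, B_c = fit_asymmetric_bilinear(Y_s, rank)
recon_error = np.linalg.norm(Y_s - A_s @ B_c)

# Synthetic target font that shares the content vectors but has its own style.
Y_t = rng.normal(size=(feature_dim, rank)) @ B_c
seen = [0, 1, 2, 3, 4]        # labels observed in the new font
unseen = [5, 6, 7]            # labels to be rendered in the new font

A_t = fit_style_matrix(Y_t[:, seen], B_c[:, seen])
rendered = A_t @ B_c[:, unseen]
transfer_error = np.linalg.norm(rendered - Y_t[:, unseen])

print(recon_error, transfer_error)
```

On real word-image features the data are not exactly low rank, so both errors would be nonzero and `rank` would need to be tuned, for example on a validation set.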
Q13. What is the way to achieve font independence?
The kernelized version of the bilinear model is able to achieve font independence and improves mAP scores by up to 0.30 for word image retrieval.
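For readers unfamiliar with the metric, mean average precision (mAP) can be computed as sketched below. The example uses made-up relevance lists and assumes every relevant item appears somewhere in each ranked list; it does not reproduce any result from the paper.

```python
def average_precision(relevance):
    """Average precision of one ranked list of 0/1 relevance flags."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant position
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Two toy queries: 1 marks a correct match at that rank, 0 an incorrect one.
rankings = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
map_score = mean_average_precision(rankings)
print(round(map_score, 4))
```

An mAP of 1.0 means every correct match is ranked ahead of every incorrect one for all queries, so a gain of 0.30 is a large change on this scale.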