A study on word-level multi-script identification from video frames
Citations
Improving handwriting based gender classification using ensemble classifiers
ICDAR2015 Competition on Video Script Identification (CVSI 2015)
Script Identification of Multi-Script Documents: A Survey
Integrating Local CNN and Global CNN for Script Identification in Natural Scene Images
Script identification algorithms: a survey
References
LIBSVM: A library for support vector machines
The Nature of Statistical Learning Theory
Histograms of oriented gradients for human detection
A Tutorial on Support Vector Machines for Pattern Recognition
Multiresolution gray-scale and rotation invariant texture classification with local binary patterns
Frequently Asked Questions (19)
Q2. What future works have the authors mentioned in the paper "A study on word-level multi-script identification from video frames" ?
Future research plans include studying classifier-fusion and feature-fusion based techniques on more Indian scripts, in order to create a more robust system capable of handling multiple scripts accurately.
Q4. What is the basic idea behind the HoG descriptor?
The basic idea behind the HoG descriptor is that the shape and appearance of an object within an image can be described by the distribution of intensity gradients, i.e. the edge directions.
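The idea above can be illustrated with a minimal sketch: collect gradient orientations in a cell into a magnitude-weighted histogram. This is only the core voting step, not the full HoG pipeline described in [16] (no block normalisation or vote interpolation), and the function name and bin count are illustrative assumptions.

```python
import numpy as np

def hog_cell_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of gradient orientations for one cell.

    A simplified sketch of the HoG idea: local shape/appearance is
    described by the distribution of gradient (edge) directions.
    """
    gy, gx = np.gradient(cell.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in standard HoG.
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / n_bins
    idx = np.minimum((orientation // bin_width).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    for i, m in zip(idx.ravel(), magnitude.ravel()):
        hist[i] += m  # each pixel votes with its gradient magnitude
    return hist
```

For a cell containing a single vertical step edge, all votes fall into the 0-degree orientation bin, showing how edge direction dominates the descriptor.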
Q5. What is the reason behind the better performance using GLAC features?
The reason behind the better performance of the GLAC features is that they use both the gradients and the curvature of the image surface for feature description.
Q6. What was the use of the text lines from the video frames?
The words were segmented from the text lines using the authors' word segmentation technique [10] and were used as input for the experiments.
Q7. How many words were used in the dataset?
In total, the low-resolution and blurred image dataset was formed from 235 word images, comprising 71 English, 92 Bengali and 73 Hindi words.
Q8. What were the two features used in the previous study?
The two texture-based features, namely Zernike moments and the Gabor filter, used in the authors' previous study [4] also did not perform well compared to the gradient feature.
Q9. How many words were used in the combined dataset?
The better-resolution, sharp image dataset comprised 1035 words: 360 English, 318 Bengali and 357 Hindi.
Q10. What was the error rate for the Hindi script?
For the Hindi script, a 6.85% error occurred when low-resolution and blurred images were considered, whereas the error reduced to 2.52% for high-resolution images.
Q11. What was the error rate in the English script?
Considering the low-resolution and blurred images of the English script, a 5.71% error rate was observed, whereas a 2.5% error occurred with high-resolution images.
Q12. What is the main idea behind the HoG descriptor?
Histogram of Oriented Gradients (HoG) [16] is a robust feature descriptor commonly used in computer vision and image processing for object detection.
Q13. What were the main reasons for the misclassification of Bengali word images?
The Bengali word images in Figure 6 (c, d) were misclassified as Hindi because of the very low resolution, blur and the small number of characters.
Q14. What is the reason for the increase in script accuracy?
The increase in accuracy confirms the observation from the authors' previous study [4]: because long words contain more characters, more script-specific information is available, which raises the accuracy.
Q15. How many different labels can be obtained from the centre pixel?
As the neighbourhood of the centre pixel has 8 pixels, 2^8 = 256 different labels can be obtained and used as a texture descriptor.
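The 8-bit labelling can be sketched in a few lines: each neighbour is thresholded against the centre pixel and contributes one bit of the code. This is a minimal illustration of the basic LBP operator; the function name and the clockwise neighbour ordering are assumptions, as several orderings are used in practice.

```python
import numpy as np

def lbp_label(patch):
    """Basic LBP label of the centre pixel of a 3x3 patch.

    Each of the 8 neighbours is compared with the centre; the resulting
    bits form an 8-bit code, so 2**8 = 256 labels are possible. The
    histogram of these labels over an image gives the 256-dimensional
    LBP texture descriptor.
    """
    centre = patch[1, 1]
    # Clockwise neighbour order starting at the top-left (one common convention).
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum((1 << i) for i, n in enumerate(neighbours) if n >= centre)
```

A flat patch (all neighbours equal to the centre) yields label 255, while a patch whose centre exceeds every neighbour yields label 0, so the full range of 256 codes is reachable.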
Q16. What is the error rate of the complete dataset?
The error on the complete dataset, which is a mixture of both low-resolution and high-resolution images, was also computed to understand how much the low-resolution and blurred images contributed to the overall error.
Q17. What were the parameters used in the present study?
The parameter settings considered for each of the feature extraction techniques used in the present study are given below. 1) LBP feature: as mentioned earlier, the basic LBP was considered; with 8 neighbours, the dimension of the LBP feature vector was 256.
Q18. What were the main reasons for the misclassification of the Hindi word images?
The Hindi word images shown in Figure 6 (e, f) were misclassified as Bengali; low resolution and blur were the main reasons.
Q19. What is the confusion matrix in Table I?
The confusion matrix in Table I also reveals that the highest confusion, about 9.51% and 11.37%, was between Bengali and Hindi using the SVM and ANN classifiers, respectively.
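Such pairwise confusion percentages are read off a confusion matrix by normalising each row by its class total. The sketch below shows this computation on a made-up 2-class matrix; the function name and the example values are illustrative assumptions, not the paper's Table I.

```python
import numpy as np

def pairwise_confusion_pct(cm, labels):
    """Off-diagonal confusion percentages from a confusion matrix.

    Row i of `cm` counts how the samples of true class labels[i] were
    classified; dividing by the row total gives per-class percentages.
    """
    cm = np.asarray(cm, dtype=float)
    pct = 100.0 * cm / cm.sum(axis=1, keepdims=True)
    return {(labels[i], labels[j]): round(pct[i, j], 2)
            for i in range(len(labels))
            for j in range(len(labels)) if i != j}

# Hypothetical example (not the paper's data): 10% of class A's
# samples were labelled B, and 5% of B's were labelled A.
confusions = pairwise_confusion_pct([[90, 10], [5, 95]], ["A", "B"])
```

The largest value in this dictionary identifies the most-confused script pair, which is how a statement like "the highest confusion was between Bengali and Hindi" is obtained.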