A Simple and Effective Solution for Script Identification in the Wild
TL;DR: This work presents an approach for automatically identifying the script of text localized in scene images. It relies on an off-the-shelf classifier, is efficient, and requires very little labeled data.
Abstract: We present an approach for automatically identifying the script of text localized in scene images. Our approach is inspired by the advancements in mid-level features. We represent the text images using mid-level features which are pooled from densely computed local features. Once text images are represented using the proposed mid-level feature representation, we use an off-the-shelf classifier to identify the script of the text image. Our approach is efficient and requires very little labeled data. We evaluate the performance of our method on the recently introduced CVSI dataset, demonstrating that the proposed approach can correctly identify the script of 96.70% of the text images. In addition, we also introduce and benchmark a more challenging Indian Language Scene Text (ILST) dataset for evaluating the performance of our method.
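The pipeline outlined in the abstract (dense local features, pooling into a mid-level representation, then a standard classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the raw 2x2 patch descriptors, the hand-set `CODEBOOK`, the toy binary images, and the nearest-centroid classifier are all assumptions chosen to keep the example self-contained. In practice the codebook would likely be learned by clustering and the classifier could be something like an SVM.

```python
import math
from collections import Counter

def dense_patches(image, size=2, stride=1):
    """Slide a size x size window over a 2-D list and return flattened patches."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(size) for j in range(size)]
        for r in range(0, rows - size + 1, stride)
        for c in range(0, cols - size + 1, stride)
    ]

def nearest(vector, candidates):
    """Index of the candidate closest to vector (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda k: math.dist(vector, candidates[k]))

def pooled_histogram(image, codebook):
    """Assign every dense patch to its nearest codeword, then pool the
    assignments into an L1-normalised histogram (the image representation)."""
    patches = dense_patches(image)
    counts = Counter(nearest(p, codebook) for p in patches)
    return [counts[k] / len(patches) for k in range(len(codebook))]

def fit_centroids(histograms, labels):
    """Stand-in classifier (assumption): store the mean histogram per label."""
    centroids = {}
    for label in set(labels):
        rows = [h for h, lab in zip(histograms, labels) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(histogram, centroids):
    """Return the label whose centroid is closest to the histogram."""
    return min(centroids, key=lambda label: math.dist(histogram, centroids[label]))

# Hypothetical codebook: two horizontal-edge and two vertical-edge codewords.
CODEBOOK = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 1, 0], [0, 1, 0, 1]]

# Toy "word images": horizontal bars for one script, vertical bars for another.
HORIZONTAL = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
VERTICAL = [[1, 0, 1, 0]] * 4
QUERY = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]

train = [pooled_histogram(HORIZONTAL, CODEBOOK), pooled_histogram(VERTICAL, CODEBOOK)]
centroids = fit_centroids(train, ["script_a", "script_b"])
prediction = predict(pooled_histogram(QUERY, CODEBOOK), centroids)
```

Because the representation is a fixed-length histogram, any standard classifier can be swapped in for `fit_centroids` and `predict` without changing the feature code.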
Citations
216 citations
Cites background from "A Simple and Effective Solution for..."
...The previous editions of RRC competitions [1], [2] and other works [3], [4], [5], [6], [7], have provided useful datasets to help researchers tackle each of those problems in order to robustly read text in natural scene images....
...Despite the available datasets related to scene text detection or to script identification [2], [3], [4], [5], [6], [7], our dataset offers interesting novel aspects....
25 citations
Cites background from "A Simple and Effective Solution for..."
...[8] make use of mid-level features, which are pooled from densely computed local features, and an off-the-shelf classifier to identify the script of the text image....
References
13,021 citations
"A Simple and Effective Solution for..." refers methods in this paper
...We compare our methods with popular features used for script identification in document images, namely LBP [9] and Gabor features [7]....
...Texture based features such as Gabor filter [7], LBP [9] have been used for script identification....
...67% which is significantly better than methods used in document image script identification domain such as [7, 9]....
1,114 citations
"A Simple and Effective Solution for..." refers background or methods in this paper
...Mid-level features have achieved noticeable success in image classification and retrieval tasks [11, 12, 10]....
...Our method is inspired by recent advancements made in mid-level features [10, 11, 12]....
651 citations
"A Simple and Effective Solution for..." refers background in this paper
...Scene text understanding has gained huge attention in the last decade, and several benchmark datasets have been introduced [13, 14]....
474 citations
"A Simple and Effective Solution for..." refers background in this paper
Language     # scene images   # word images   Mode of collection
Hindi        76               514             Authors, Google Images
Malayalam    121              515             Authors, Google Images
Kannada      115              534             Char74K [16]
Tamil        59               563             Authors
Telugu       79               510             Authors
English      128              850             Authors
Total        578              3486            -