An MRF Model for Binarization of Natural Scene Text
Citations
Reading Digits in Natural Images with Unsupervised Feature Learning
Scene Text Recognition using Higher Order Language Priors
Text Detection and Recognition in Imagery: A Survey
Strokelets: A Learned Multi-scale Representation for Scene Text Recognition
Scene Text Detection and Recognition: The Deep Learning Era
References
Threshold selection based on a simple image statistic
An Evaluation Technique for Binarization Algorithms
Binarization of low quality text using a Markov random field model
Binarization of Color Characters in Scene Images Using k-means Clustering and Support Vector Machines
Color binarization for complex camera-based images
Frequently Asked Questions (12)
Q2. What is the way to make the binarization process more efficient?
Iterative graph cut based binarization is also more suitable for their application, as it refines the seeds and the binarization output at each iteration and thus produces a clean binarization result even when the foreground/background distributions are noisy.
Q3. What is the energy function in Equation (1)?
With the introduction of GMMs, the energy function in Equation (1) becomes $E(x, k, \theta, z) = E_i(x, k, \theta, z) + E_{ij}(x, z)$ (4), i.e. the data term of each pixel now depends on its assignment to a GMM component.
Q4. How long does it take to produce a binary image?
The proposed method takes 32 seconds on average to produce the final binary result for an image on a system with 2 GB RAM and a 2.93 GHz Intel® Core™ 2 Duo CPU.
Q5. What is the meaning of edginess difference?
(Note that by the edginess difference term the authors mean an energy function that uses the gradient magnitude difference in addition to the difference in RGB colour space.)
Q6. What is the common term used in the literature?
The smoothness term most commonly used in the literature is the Potts model: $E_{ij}(x, z) = \lambda \sum_{(i,j) \in N} \exp\left(-\frac{(z_i - z_j)^2}{2\beta^2}\right) \frac{[x_i \neq x_j]}{\mathrm{dist}(i, j)}$, where $\lambda$ determines the degree of smoothness and $\mathrm{dist}(i, j)$ is the Euclidean distance between neighbouring pixels i and j.
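As a rough illustration of how this smoothness term could be evaluated, the sketch below sums the contrast-weighted penalty over an 8-connected neighbourhood of a small grayscale image. The function name `potts_smoothness`, the choice of neighbourhood, and the default values of `lam` and `beta` are assumptions made for the example, not details from the paper. The indicator is read as firing when neighbouring labels differ, as is usual for a smoothness penalty.

```python
import math

def potts_smoothness(labels, image, lam=1.0, beta=10.0):
    """Contrast-sensitive Potts energy (hypothetical helper).

    labels: 2D list of 0/1 labels x_i.
    image:  2D list of grayscale intensities z_i.
    """
    h, w = len(image), len(image[0])
    # Visit each unordered neighbour pair once:
    # right, down, and the two downward diagonals.
    offsets = [(0, 1), (1, 0), (1, 1), (1, -1)]
    energy = 0.0
    for r in range(h):
        for c in range(w):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < h and 0 <= cc < w):
                    continue
                if labels[r][c] == labels[rr][cc]:
                    continue  # indicator [x_i != x_j] is zero
                diff = image[r][c] - image[rr][cc]
                dist = math.hypot(dr, dc)  # Euclidean pixel distance
                energy += math.exp(-(diff ** 2) / (2 * beta ** 2)) / dist
    return lam * energy
```

A label boundary that coincides with a strong intensity edge costs almost nothing, while a boundary through a flat region costs close to `lam` per neighbour pair, which is what pushes the cut towards real edges.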
Q7. What is the gradient orientation of the edge pixel?
For every such edge pixel p, the authors traverse the edge image in the direction of θ until they hit an edge pixel q whose gradient orientation is $(\pi - \theta) \pm \pi/36$ (i.e. approximately the opposite gradient direction).
Q8. How is the energy function made robust to low contrast colour images?
In order to make the energy function robust to low contrast colour images, the authors modify the smoothness term of the energy function by adding a new term which measures the “edginess” of the pixels, as follows: $E_{ij}(x, z) = \lambda_1 \sum_{(i,j) \in N} [x_i \neq x_j] \exp(-\beta \|z_i - z_j\|^2) + \lambda_2 \sum_{(i,j) \in N} [x_i \neq x_j] \exp(-\beta \|w_i - w_j\|^2)$.
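The two-part smoothness term can be sketched in a few lines. In this example `colours` holds RGB tuples $z_i$ and `edginess` holds a per-pixel scalar $w_i$, taken here to be the gradient magnitude. The function name, the 4-connected neighbourhood, and the default parameter values are assumptions for illustration only. As is usual for a smoothness penalty, a pair contributes only when its two labels differ.

```python
import math

def edginess_smoothness(labels, colours, edginess,
                        lam1=1.0, lam2=1.0, beta=0.01):
    """Colour plus edginess smoothness energy (hypothetical helper).

    labels:   2D list of 0/1 labels x_i.
    colours:  2D list of (R, G, B) tuples z_i.
    edginess: 2D list of scalar gradient magnitudes w_i (assumed).
    """
    h, w = len(labels), len(labels[0])
    energy = 0.0
    for r in range(h):
        for c in range(w):
            # 4-connected neighbourhood: right and down neighbours only,
            # so each unordered pair is counted once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr >= h or cc >= w:
                    continue
                if labels[r][c] == labels[rr][cc]:
                    continue  # no penalty when labels agree
                colour_d2 = sum((a - b) ** 2
                                for a, b in zip(colours[r][c], colours[rr][cc]))
                edge_d2 = (edginess[r][c] - edginess[rr][cc]) ** 2
                energy += (lam1 * math.exp(-beta * colour_d2)
                           + lam2 * math.exp(-beta * edge_d2))
    return energy
```

When two neighbouring pixels have nearly the same colour, the first term alone gives no reason to cut between them; a difference in edginess lowers the second term and makes the boundary cheaper there.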
Q9. What is the way to get the pixel colour from a GMMRF?
Iterative Graph Cut Based Binarization. In the GMMRF framework [4], each pixel colour is generated from one of the 2c Gaussian Mixture Models (GMMs) (c GMMs each for foreground and background) with mean μ and covariance Σ, i.e. each foreground colour pixel is generated from the following distribution: $p(z_i \mid x_i, \theta, k_i) = \mathcal{N}(z, \theta; \mu(x_i, k_i), \Sigma(x_i, k_i))$ (3), where $\mathcal{N}$ denotes a Gaussian distribution, $x_i \in \{0, 1\}$ and $k_i \in \{1, \ldots, c\}$.
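To make the component assignment concrete, the sketch below scores a pixel colour against a handful of Gaussian components and returns the most likely one. It makes two simplifying assumptions that are not from the paper: covariances are diagonal (so no matrix inversion is needed), and the data term is taken to be the negative log-likelihood of the chosen component. The names `log_gaussian_diag` and `assign_component` are hypothetical.

```python
import math

def log_gaussian_diag(z, mean, var):
    """Log density of a Gaussian with diagonal covariance at colour z."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(z, mean, var)
    )

def assign_component(z, components):
    """Pick the most likely component k_i for colour z.

    components: list of (mean, var) pairs, one per Gaussian component
    of a single class (foreground or background).
    Returns (k, data_term) with data_term = -log p(z | k), an assumed
    form of the per-pixel data cost.
    """
    scores = [log_gaussian_diag(z, mean, var) for mean, var in components]
    k = max(range(len(scores)), key=scores.__getitem__)
    return k, -scores[k]
```

For example, with one dark and one bright component, a bright pixel is assigned to the bright component and receives a lower data cost the closer it lies to that component's mean.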
Q10. What are the main problems of the previous binarization algorithms?
Although most of these previous algorithms perform satisfactorily in many cases, they suffer from problems such as: (1) manual tuning of parameters, (2) high sensitivity to the choice of parameters, and (3) difficulty handling images with uneven lighting, noisy backgrounds, or similar foreground and background colours.
Q11. How do the authors determine the background of the graph?
The authors then re-estimate the GMMs using an initial binarization result and iterate the graph cut over the new data and smoothness terms until convergence.
Q12. What is the main difference between the two methods?
But these methods lack a principled formulation of the binarization problem for complex colour documents, and hence cannot be generalized.