An MRF Model for Binarization of Natural Scene Text
Citations
Whole is Greater than Sum of Parts: Recognizing Scene Text Words
Toward Integrated Scene Text Reading
Strokelets: A Learned Multi-Scale Mid-Level Representation for Scene Text Recognition
Scene Text Detection and Segmentation Based on Cascaded Convolution Neural Networks
Image Binarization for End-to-End Text Understanding in Natural Images
References
"GrabCut": interactive foreground extraction using iterated graph cuts
An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision
Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images
Frequently Asked Questions (12)
Q2. What is the way to make the binarization process more efficient?
Iterative graph cut based binarization is also better suited to the authors' application, as it refines the seeds and the binarization output at each iteration and thus produces a clean binarization result even when the foreground/background distributions are noisy.
Q3. What is the energy function in Equation (1)?
Due to the introduction of GMMs, the energy function in Equation (1) now becomes

$$E(x, k, \theta, z) = E_i(x, k, \theta, z) + E_{ij}(x, z), \quad (4)$$

i.e. the data term depends on each pixel's assignment to a GMM component.
Q4. How long does it take to produce a binary image?
The proposed method takes 32 seconds on average to produce the final binary result for an image on a system with 2 GB of RAM and a 2.93 GHz Intel® Core™ 2 Duo CPU.
Q5. What is the meaning of edginess difference?
(Note that by the edginess difference term the authors mean an energy function that includes the gradient magnitude difference in addition to the difference in RGB colour space.)
Q6. What is the common term used in the literature?
The smoothness term most commonly used in the literature is the Potts model:

$$E_{ij}(x, z) = \lambda \sum_{(i,j) \in N} \exp\left(-\frac{(z_i - z_j)^2}{2\beta^2}\right) \frac{[x_i \neq x_j]}{\mathrm{dist}(i, j)},$$

where λ determines the degree of smoothness and dist(i, j) is the Euclidean distance between neighbouring pixels i and j.
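The Potts smoothness term can be evaluated directly on a small image. The following is a minimal sketch, not the authors' implementation: it assumes a grey-level image, an 8-connected neighbourhood for N, and an indicator that is nonzero only when two neighbouring labels differ (the usual Potts convention). The function name and the default values of `lam` and `beta` are illustrative.

```python
import math

def potts_smoothness(labels, image, lam=1.0, beta=10.0):
    """Contrast-sensitive Potts energy over an 8-connected grid.

    labels, image: equally sized 2-D lists (binary labels, grey intensities).
    lam and beta are illustrative defaults, not values from the paper.
    """
    rows, cols = len(labels), len(labels[0])
    # Visit each unordered neighbour pair once: right, down, and both diagonals.
    offsets = [(0, 1), (1, 0), (1, 1), (1, -1)]
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols):
                    continue
                if labels[r][c] == labels[rr][cc]:
                    continue  # indicator [x_i != x_j] is zero
                diff = image[r][c] - image[rr][cc]
                dist = math.hypot(dr, dc)  # Euclidean distance between the pixels
                energy += math.exp(-diff * diff / (2.0 * beta * beta)) / dist
    return lam * energy
```

A label boundary that coincides with a strong intensity change contributes almost nothing, whereas a boundary drawn through a flat region costs the full λ / dist(i, j).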
Q7. What is the gradient orientation of the edge pixel?
For every such edge pixel p, the authors traverse the edge image in the direction of θ until they hit an edge pixel q whose gradient orientation is $(\pi - \theta) \pm \pi/36$ (i.e. approximately the opposite gradient direction).
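The traversal can be sketched as a ray walk over the edge map. This is an illustration under several assumptions that the text does not state: the edge map and orientation map are 2-D lists indexed as `[row][column]`, the ray advances in unit steps rounded to the nearest pixel, edge pixels that fail the orientation test are skipped rather than ending the search, and the search stops after `max_steps`. All names are hypothetical.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def find_opposite_edge(edges, orientation, p, max_steps=50, tol=math.pi / 36):
    """Walk from edge pixel p along its gradient direction theta and return the
    first edge pixel q whose orientation is within tol of (pi - theta), or None."""
    rows, cols = len(edges), len(edges[0])
    r0, c0 = p
    theta = orientation[r0][c0]
    target = math.pi - theta
    for step in range(1, max_steps + 1):
        r = int(round(r0 + step * math.sin(theta)))
        c = int(round(c0 + step * math.cos(theta)))
        if not (0 <= r < rows and 0 <= c < cols):
            return None  # the ray left the image without finding a match
        if (r, c) == (r0, c0):
            continue  # rounding kept us on the starting pixel
        if edges[r][c] and angle_diff(orientation[r][c], target) <= tol:
            return (r, c)
    return None
```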
Q8. How do the authors make the energy function robust to low contrast colour images?
In order to make the energy function robust to low contrast colour images, the authors modify the smoothness term of the energy function by adding a new term which measures the “edginess” of the pixels, as follows:

$$E_{ij}(x, z) = \lambda_1 \sum_{(i,j) \in N} [x_i \neq x_j] \exp(-\beta \|z_i - z_j\|^2) + \lambda_2 \sum_{(i,j) \in N} [x_i \neq x_j] \exp(-\beta \|w_i - w_j\|^2).$$
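To see how the extra term behaves, the modified smoothness energy can be computed on a toy image. The sketch below is an assumption-laden illustration: it uses a 4-connected neighbourhood, treats z as an RGB tuple and w as a one-element tuple holding a gradient magnitude, counts a pair only when its two labels differ, and picks arbitrary values for `lam1`, `lam2` and `beta`.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def edginess_smoothness(labels, colours, edginess, lam1=1.0, lam2=1.0, beta=0.01):
    """Colour-contrast term plus edginess-contrast term over a 4-connected grid.

    colours[r][c] is the colour z_i; edginess[r][c] is the feature w_i
    (assumed here to be a gradient magnitude wrapped in a tuple).
    """
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            for rr, cc in ((r, c + 1), (r + 1, c)):  # right and down neighbours
                if rr >= rows or cc >= cols:
                    continue
                if labels[r][c] == labels[rr][cc]:
                    continue  # only label boundaries are penalised
                energy += lam1 * math.exp(-beta * sq_dist(colours[r][c], colours[rr][cc]))
                energy += lam2 * math.exp(-beta * sq_dist(edginess[r][c], edginess[rr][cc]))
    return energy
```

When two neighbouring pixels have nearly identical colours but very different gradient magnitudes, the second term drops towards zero, so a label boundary placed there is cheaper than one placed in a region that is flat in both features.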
Q9. What is the way to get the pixel colour from a GMMRF?
In the GMMRF framework [4], each pixel colour is generated from one of the 2c Gaussian Mixture Models (GMMs) (c GMMs each for foreground and background) with mean μ and covariance Σ, i.e. each foreground colour pixel is generated from the following distribution:

$$p(z_i \mid x_i, \theta, k_i) = \mathcal{N}(z, \theta; \mu(x_i, k_i), \Sigma(x_i, k_i)), \quad (3)$$

where $\mathcal{N}$ denotes a Gaussian distribution, $x_i \in \{0, 1\}$ and $k_i \in \{1, \ldots, c\}$.
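As an illustration of Equation (3), the sketch below scores a pixel colour against the Gaussian components of one label and picks the most likely component k_i. It simplifies the model by assuming diagonal covariances (the text uses a general covariance Σ), and the function names and data layout are hypothetical.

```python
import math

def log_gaussian_diag(z, mean, var):
    """Log density of a Gaussian with diagonal covariance (a simplifying assumption)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (zc - m) ** 2 / v)
        for zc, m, v in zip(z, mean, var)
    )

def best_component(z, components):
    """Return (k, log-likelihood) of the most likely component for colour z.

    components: list of (mean, var) tuples describing the Gaussians of one label.
    """
    scores = [log_gaussian_diag(z, m, v) for m, v in components]
    k = max(range(len(scores)), key=scores.__getitem__)
    return k, scores[k]
```

The negative of the returned log-likelihood is one common choice of data cost for assigning the pixel to that label; whether the authors add further terms (such as mixture weights) is not stated in this excerpt.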
Q10. What are the main problems of the previous binarization algorithms?
Although most of these previous algorithms perform satisfactorily in many cases, they suffer from problems such as: (1) manual tuning of parameters, (2) high sensitivity to the choice of parameters, and (3) difficulty handling images with uneven lighting, noisy backgrounds, or similar foreground and background colours.
Q11. How do the authors determine the background of the graph?
The authors then re-estimate the GMMs using an initial binarization result and iterate the graph cut over the new data and smoothness terms until convergence.
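The re-estimate-and-relabel loop has the following general shape. This sketch shows only the control flow: to stay self-contained it swaps in a per-label mean intensity for the GMM re-estimation and nearest-mean labelling for the graph cut, so it is not the authors' method. Convergence is assumed to mean that the labels stop changing, and all names are hypothetical.

```python
def iterate_until_stable(pixels, labels, fit, relabel, max_iters=20):
    """Alternate model fitting and relabelling until the labels stop changing.

    Returns the final labels and the number of iterations performed.
    """
    for iteration in range(1, max_iters + 1):
        models = fit(pixels, labels)          # re-estimate models from current labels
        new_labels = relabel(pixels, models)  # relabel pixels under the new models
        if new_labels == labels:
            return labels, iteration
        labels = new_labels
    return labels, max_iters

def fit_means(pixels, labels):
    """Stand-in for GMM re-estimation: one mean intensity per label."""
    means = {}
    for lab in (0, 1):
        vals = [p for p, l in zip(pixels, labels) if l == lab]
        means[lab] = sum(vals) / len(vals) if vals else None
    return means

def nearest_mean(pixels, means):
    """Stand-in for the graph cut: label each pixel by its closest class mean."""
    out = []
    for p in pixels:
        cands = [(abs(p - m), lab) for lab, m in means.items() if m is not None]
        out.append(min(cands)[1])
    return out
```

Calling `iterate_until_stable(pixels, initial_labels, fit_means, nearest_mean)` starts from a rough labelling and stops once an iteration leaves every label unchanged; in the real pipeline the two callables would be replaced by GMM fitting and a min-cut solver.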
Q12. What is the main difference between the two methods?
These methods, however, lack a principled formulation of the binarization problem for complex colour documents and hence cannot be generalized.