Journal Article

A new efficient binarization method: application to degraded historical document images

Abstract
Binarization is an important step in reading text documents automatically through optical character recognition. Old document images often suffer from degradations that make their binarization a challenging task. In this paper, a new binarization technique for degraded document images is presented. The proposed technique is based on active contours evolving according to intrinsic geometric measures of the document image. The image contrast that is defined by the local image maximum and minimum is used to automatically generate the initialization map of our active contour model; an average thresholding is also used to produce the final delineation and binarization. The proposed implementation benefits from the level set framework, which allows the simultaneous application of a large variety of forces at the stroke–background interface. Our binarization method involves the combination of those forces in a specific way. The efficiency of the proposed method is shown on both recent and historical document images of the Document Image Binarization Contest (DIBCO) datasets that include different types of degradations. The results are compared to a number of known techniques from the literature.
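The initialization step described in the abstract (a contrast map built from the local image maximum and minimum, followed by an average threshold) can be illustrated with a minimal sketch. The abstract does not give the exact contrast formula, so the normalized form (max - min) / (max + min + eps) used here is an assumption, as are the function names `local_contrast` and `mean_threshold`, the window radius, and the reading of "average thresholding" as a global mean. The level set evolution itself is not shown.

```python
def local_contrast(image, radius=1, eps=1e-6):
    """Contrast map over a square window (assumed normalized max/min form)."""
    height, width = len(image), len(image[0])
    contrast = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighborhood, clipping the window at the borders.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            hi, lo = max(window), min(window)
            contrast[y][x] = (hi - lo) / (hi + lo + eps)
    return contrast


def mean_threshold(values):
    """Mark entries strictly above the global mean of the map (assumed rule)."""
    flat = [v for row in values for v in row]
    mean = sum(flat) / len(flat)
    return [[v > mean for v in row] for row in values]


# Toy 5x5 "page": bright background (200) with a dark vertical stroke (20).
page = [[200, 200, 20, 200, 200] for _ in range(5)]
init_map = mean_threshold(local_contrast(page))
```

On this toy page the high-contrast band covers the stroke and its immediate neighbors, which is the kind of region an active contour could start from before being refined toward the stroke boundary.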

Citations
Journal Article

Degraded document image binarization using structural symmetry of strokes

TL;DR: Structural symmetric pixels (SSPs) are used to calculate local thresholds within a neighborhood; a vote over multiple thresholds determines whether a pixel belongs to the foreground, and an adaptive global threshold selection algorithm is also proposed.
Journal Article

Degraded Historical Document Binarization: A Review on Issues, Challenges, Techniques, and Future Directions.

TL;DR: A comprehensive review is conducted of the issues and challenges faced during the image binarization process, followed by insights into various methods used for image binarization.
Journal Article

Historical Document Image Binarization: A Review

TL;DR: A comprehensive view of the field of historical document image binarization with a focus on the contributions made in the last decade is provided in this paper, where the standard methods for image thresholding, preprocessing, and post-processing are reviewed.
Journal Article

Binarization of degraded document images with global-local U-Nets

TL;DR: A combined local-global approach to document binarization is proposed, composed of a global branch and a local branch that take global patches from the downsampled image and cropped local patches from the source image as their respective inputs.
Journal Article

An enhanced binarization framework for degraded historical document images

TL;DR: Li et al. adopt mathematical morphological operations to estimate and compensate for the document background, with a radius computed by the minimum entropy-based stroke width transform (SWT), and then perform Laplacian energy-based segmentation on the compensated document images.
References
Journal Article

Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulations

TL;DR: The PSC algorithm approximates the Hamilton-Jacobi equations with parabolic right-hand sides using techniques from hyperbolic conservation laws, and can also be used for more general surface motion problems.
Journal Article

Minimum error thresholding

TL;DR: A computationally efficient solution to the problem of minimum error thresholding is derived under the assumption of object and pixel grey level values being normally distributed and is applicable in multithreshold selection.
Journal Article

Adaptive document image binarization

TL;DR: A new method is presented for adaptive document image binarization in which the page is considered as a collection of subcomponents such as text, background and picture; the method adapts and performs well in each case, both qualitatively and quantitatively.