Journal Article

Adaptive degraded document image binarization

01 Mar 2006 - Pattern Recognition (Elsevier Science Inc.) - Vol. 39, Iss. 3, pp. 317-327
TL;DR: The proposed method does not require any parameter tuning by the user and can deal with degradations which occur due to shadows, non-uniform illumination, low contrast, large signal-dependent noise, smear and strain.
About: This article was published in Pattern Recognition on 2006-03-01. It has received 585 citations to date. The article focuses on the topics: Thresholding & Wiener filter.
Citations
Journal Article
TL;DR: A novel document image binarization technique that addresses issues of segmentation of text from badly degraded document images by using adaptive image contrast, a combination of the local image contrast and the local image gradient that is tolerant to text and background variation caused by different types of document degradations.
Abstract: Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra-variation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient that is tolerant to text and background variation caused by different types of document degradations. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny's edge map to identify the text stroke edge pixels. The document text is further segmented by a local threshold that is estimated based on the intensities of detected text stroke edge pixels within a local window. The proposed method is simple, robust, and involves minimum parameter tuning. It has been tested on three public datasets that are used in the recent document image binarization contest (DIBCO) 2009 & 2011 and handwritten-DIBCO 2010 and achieves accuracies of 93.5%, 87.8%, and 92.03%, respectively, that are significantly higher than or close to that of the best-performing methods reported in the three contests. Experiments on the Bickley diary dataset that consists of several challenging bad quality document images also show the superior performance of our proposed method, compared with other techniques.

255 citations


Cites methods from "Adaptive degraded document image bi..."

  • ...Some post-processing is further applied to improve the document binarization quality....

    [...]

Proceedings Article
09 Jun 2010
TL;DR: A new document image binarization technique that segments the text from badly degraded historical document images by using local thresholds that are estimated from the detected high contrast pixels within a local neighborhood window.
Abstract: This paper presents a new document image binarization technique that segments the text from badly degraded historical document images. The proposed technique makes use of the image contrast that is defined by the local image maximum and minimum. Compared with the image gradient, the image contrast evaluated by the local maximum and minimum has a nice property that it is more tolerant to the uneven illumination and other types of document degradation such as smear. Given a historical document image, the proposed technique first constructs a contrast image and then detects the high contrast image pixels which usually lie around the text stroke boundary. The document text is then segmented by using local thresholds that are estimated from the detected high contrast pixels within a local neighborhood window. The proposed technique has been tested over the dataset that is used in the recent Document Image Binarization Contest (DIBCO) 2009. Experiments show its superior performance.
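The contrast construction described in the abstract above can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the normalised form `(max - min) / (max + min + eps)` used here is an assumption, as are the function name `local_contrast`, the `radius` and `eps` parameters, and the toy patch. It is meant only to show why a max/min contrast responds at stroke boundaries, not to reproduce the authors' implementation.

```python
def local_contrast(image, radius=1, eps=1e-6):
    """Return a contrast map built from the local maximum and minimum.

    `image` is a list of rows of grayscale values. The normalisation by
    (hi + lo + eps) is an assumed form, not taken from the abstract.
    """
    height, width = len(image), len(image[0])
    contrast = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the square window around (x, y), clipped at the border.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            hi, lo = max(window), min(window)
            # Dividing by the local brightness damps uneven illumination.
            contrast[y][x] = (hi - lo) / (hi + lo + eps)
    return contrast


# Hypothetical 4x4 patch: a dark vertical stroke (20) on a bright page (200).
patch = [
    [200, 200, 20, 200],
    [200, 200, 20, 200],
    [200, 200, 20, 200],
    [200, 200, 20, 200],
]
contrast_map = local_contrast(patch)
```

In this toy patch the contrast is zero where the window sees only background and high wherever the window straddles the stroke, which matches the abstract's remark that high-contrast pixels usually lie around the text stroke boundary.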

239 citations


Cites background from "Adaptive degraded document image bi..."

  • ...Other approaches have also been reported to binarize historical document images through background subtraction [13, 8], texture analysis [12], decomposition method [4], and cross section sequence graph analysis [5], and so on....

    [...]

Journal Article
TL;DR: The proposed technique was submitted to the recent document image binarization contest (DIBCO) held under the framework of ICDAR 2009 and has achieved the top performance among 43 algorithms that are submitted from 35 international research groups.
Abstract: Document images often suffer from different types of degradation that renders the document image binarization a challenging task. This paper presents a document image binarization technique that segments the text from badly degraded document images accurately. The proposed technique is based on the observations that the text documents usually have a document background of the uniform color and texture and the document text within it has a different intensity level compared with the surrounding document background. Given a document image, the proposed technique first estimates a document background surface through an iterative polynomial smoothing procedure. Different types of document degradation are then compensated by using the estimated document background surface. The text stroke edge is further detected from the compensated document image by using L1-norm image gradient. Finally, the document text is segmented by a local threshold that is estimated based on the detected text stroke edges. The proposed technique was submitted to the recent document image binarization contest (DIBCO) held under the framework of ICDAR 2009 and has achieved the top performance among 43 algorithms that are submitted from 35 international research groups.

238 citations


Cites background or methods from "Adaptive degraded document image bi..."

  • ...Compared with the document background surface estimated in [13,14], the document background surface estimated through polynomial smoothing is smoother and closer to the real document background surface....

    [...]

  • ...[13] estimate the document background surface based on the binary document image generated by Sauvola’s thresholding method [12]....

    [...]

  • ...time, the document background surface estimated through polynomial smoothing is also much smoother compared with the ones in [13,14] and so more suitable for the document degradation compensation....

    [...]

Proceedings Article
19 Jan 2009
TL;DR: A new sliding window based local thresholding technique 'NICK', inspired from the Niblack's binarization method, which exhibits its robustness and effectiveness when evaluated on low quality ancient document images.
Abstract: In this paper, we present a new sliding window based local thresholding technique 'NICK' and give a detailed comparison of some existing sliding-window based thresholding algorithms with our method. The proposed method aims at achieving better binarization results, specifically, for ancient document images. NICK has been inspired from the Niblack's binarization method and exhibits its robustness and effectiveness when evaluated on low quality ancient document images.
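For readers unfamiliar with the sliding-window family that the abstract refers to, the following is a minimal sketch of the classical Niblack rule, `T = mean + k * std` computed per window. It is not the NICK method itself, whose threshold formula is not given in this listing. The function name, the `radius` and `k` defaults, the strict `<` comparison, and the toy patch are all assumptions made for illustration.

```python
import math


def niblack_binarize(image, radius=1, k=-0.2):
    """Binarize with Niblack's local threshold T = mean + k * std.

    Returns a grid where 1 marks text (dark) pixels and 0 marks background.
    """
    height, width = len(image), len(image[0])
    binary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Sliding window around (x, y), clipped at the image border.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            variance = sum((p - mean) ** 2 for p in window) / len(window)
            threshold = mean + k * math.sqrt(variance)
            # Strict comparison keeps perfectly flat regions as background.
            binary[y][x] = 1 if image[y][x] < threshold else 0
    return binary


# Hypothetical 4x4 patch: a dark vertical stroke (20) on a bright page (200).
patch = [
    [200, 200, 20, 200],
    [200, 200, 20, 200],
    [200, 200, 20, 200],
    [200, 200, 20, 200],
]
binary = niblack_binarize(patch)
```

A negative `k` pulls the threshold below the window mean, so only pixels clearly darker than their neighbourhood are labelled as text.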

217 citations


Cites background from "Adaptive degraded document image bi..."

  • ...In an OCR, one of the main processing stage is binarization of document images, i.e. separation of foreground from background [12,16]....

    [...]

Journal Article
TL;DR: This work presents an adaptive and parameterless generalization of Otsu's method, extended using a multiscale framework, and has been applied on various datasets, including the DIBCO'09 dataset, with promising results.
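As background for this entry, the classical global Otsu method that the work generalizes can be sketched as follows. This is the standard single-threshold version, which picks the gray level maximizing the between-class variance; it is not the adaptive multiscale extension the TL;DR describes. The function name, the `levels` parameter, and the sample pixel values are illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_t, best_between = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue  # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between > best_between:
            best_between, best_t = between, t
    return best_t


# Hypothetical sample: 4 dark text pixels and 12 bright background pixels.
pixels = [20] * 4 + [200] * 12
threshold = otsu_threshold(pixels)
text_mask = [1 if p <= threshold else 0 for p in pixels]
```

Because a single threshold is applied to every pixel, this global form is sensitive to shadows and uneven illumination, which is the limitation that adaptive variants aim to address.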

212 citations


Cites methods from "Adaptive degraded document image bi..."

  • ...Gatos’ [15] 7 O(s2n2 + s(2)bn 2) 42467....

    [...]

  • ...One of the state-of-the-art binarization methods is introduced in [15]....

    [...]

  • ...Seven parameters are considered for the Gatos method: three for its Sauvola component, one for background estimation, and three for the threshold formula [15]....

    [...]

References
Journal Article

37,017 citations


"Adaptive degraded document image bi..." refers methods in this paper

  • ...Usually, it distinguishes text areas from background areas, so it is used as a text locating technique....

    [...]

  • ...The proposed method has been extensively tested with a variety of degraded image documents and has demonstrated superior performance against four (4) well-known techniques....

    [...]

  • ...Keywords: Degraded document images; Local adaptive binarization...

    [...]

Book
03 Oct 1988
TL;DR: This book covers two-dimensional systems and mathematical preliminaries, image enhancement, image analysis and computer vision, and image reconstruction from projections.
Abstract: Introduction. 1. Two Dimensional Systems and Mathematical Preliminaries. 2. Image Perception. 3. Image Sampling and Quantization. 4. Image Transforms. 5. Image Representation by Stochastic Models. 6. Image Enhancement. 7. Image Filtering and Restoration. 8. Image Analysis and Computer Vision. 9. Image Reconstruction From Projections. 10. Image Data Compression.

8,504 citations

Journal Article
TL;DR: 40 selected thresholding methods from various categories are compared in the context of nondestructive testing applications as well as for document images, and the thresholding algorithms that perform uniformly better over nondestructive testing and document image applications are identified.
Abstract: We conduct an exhaustive survey of image thresholding methods, categorize them, express their formulas under a uniform notation, and finally carry out their performance comparison. The thresholding methods are categorized according to the information they are exploiting, such as histogram shape, measurement space clustering, entropy, object attributes, spatial correlation, and local gray-level surface. 40 selected thresholding methods from various categories are compared in the context of nondestructive testing applications as well as for document images. The comparison is based on the combined performance measures. We identify the thresholding algorithms that perform uniformly better over nondestructive testing and document image applications. © 2004 SPIE and IS&T. (DOI: 10.1117/1.1631316)

4,543 citations


"Adaptive degraded document image bi..." refers background in this paper

  • ...In a global approach, threshold selection leads to a single threshold value for the entire image....

    [...]