Proceedings ArticleDOI

Segmentation of images of stained papers based on distance perception

22 Nov 2010-pp 1636-1642
TL;DR: The paper proposes reducing the foreground information (the ink) by simulating what an observer perceives when moving away from the document image; the method is very effective on documents with several types of degradation, although it is not suited to removing small noise.
Abstract: A new approach to segment images of historical documents on stained paper is presented herein. Due to their characteristics, these images are very difficult to segment, especially in documents with high illumination variance along the image, non-uniform degradation and the presence of smudges or smears. We propose herein to reduce the foreground information (the ink) by simulating what we perceive as we move away from the document image. As we stand back, the text tends to disappear, leaving just the main colors of the background. The method is very effective on documents with several types of degradation, although it is not suited to removing small noise.
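The "stand back" idea above can be approximated as a heavy low-pass filter: blurring the page strongly enough makes thin ink strokes vanish while the background colors survive. A minimal sketch, where the Gaussian blur and its `sigma` are illustrative stand-ins, not the paper's exact model:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_background(gray, sigma=8.0):
    """Simulate viewing the page from afar: a strong Gaussian blur
    washes out thin ink strokes, leaving mostly the background."""
    return gaussian_filter(gray.astype(float), sigma=sigma)

# Toy page: light paper (value 200) with a thin dark ink stroke (value 30).
page = np.full((64, 64), 200.0)
page[30:33, 8:56] = 30.0  # a 3-pixel-tall "text" stroke

bg = estimate_background(page)
# The stroke region is pulled back toward paper intensity, while
# paper far from the text is essentially unchanged.
```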
Citations
Proceedings ArticleDOI
18 Sep 2012
TL;DR: This paper reports on the contest details including the evaluation measures used as well as the performance of the 24 submitted methods along with a short description of each method.
Abstract: H-DIBCO 2012 is the International Document Image Binarization Competition which is dedicated to handwritten document images organized in conjunction with ICFHR 2012 conference. The objective of the contest is to identify current advances in handwritten document image binarization using meaningful evaluation performance measures. This paper reports on the contest details including the evaluation measures used as well as the performance of the 24 submitted methods along with a short description of each method.
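As context for the evaluation measures: DIBCO-style contests typically score each submitted binarization against a ground-truth image with measures such as F-measure. A minimal sketch (the foreground-is-True convention and the toy arrays are assumptions for illustration):

```python
import numpy as np

def f_measure(pred, gt):
    """F-measure between a predicted binarization and its ground truth.
    Convention assumed here: True marks foreground (ink) pixels."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)    # ink correctly detected
    fp = np.sum(pred & ~gt)   # paper wrongly marked as ink
    fn = np.sum(~pred & gt)   # ink missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: ground truth has 4 ink pixels; the prediction misses one.
gt = np.zeros((4, 4), dtype=bool)
gt[1:3, 1:3] = True
pred = gt.copy()
pred[1, 1] = False
# precision = 1.0, recall = 0.75, so F = 2*0.75/1.75 = 6/7
```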

181 citations

Journal ArticleDOI
TL;DR: A new method to enhance and binarize document images with several kinds of degradation is proposed, based on the idea that the absolute difference between a document image and its background can effectively emphasize the text and attenuate degraded regions.
Abstract: In this work a new method to enhance and binarize document images with several kinds of degradation is proposed. The method is based on the idea that, by taking the absolute difference between a document image and its background, it is possible to effectively emphasize the text and attenuate degraded regions. To generate the background of a document, our work was inspired by the human visual system and the perception of objects by distance. Snellen's visual acuity notation was used to define how far an image must be from an observer so that the details of the characters are no longer perceived, leaving just the background. A scheme that combines the k-means clustering algorithm and Otsu's thresholding method is also used to perform binarization. The proposed method has been tested on two different datasets of document images, DIBCO 2011 and a real historical document image dataset, with very satisfactory results.
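The enhance-then-binarize pipeline described above can be roughly sketched: estimate a background, take the absolute difference, then threshold. Here a Gaussian blur stands in for the distance-based background model, and a plain Otsu threshold replaces the paper's k-means + Otsu combination; `sigma` and the toy image are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def otsu_threshold(gray):
    """Classic Otsu: pick the threshold maximizing between-class variance."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    w0 = np.cumsum(p)                   # class-0 (dark) probability
    mu = np.cumsum(p * np.arange(256))  # cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        var_b = (mu_t * w0 - mu) ** 2 / (w0 * (1 - w0))
    return int(np.argmax(np.nan_to_num(var_b)))

def binarize_by_background_difference(gray, sigma=10.0):
    """|image - estimated background| emphasizes ink; Otsu then splits
    text from paper."""
    bg = gaussian_filter(gray.astype(float), sigma=sigma)
    diff = np.abs(gray.astype(float) - bg)
    diff = (255 * diff / max(diff.max(), 1e-9)).astype(np.uint8)
    return diff > otsu_threshold(diff)  # True = ink

# Toy page: paper at 200 with a thin dark stroke at 30.
page = np.full((64, 64), 200.0)
page[30:33, 8:56] = 30.0
ink = binarize_by_background_difference(page)
```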

29 citations

Journal ArticleDOI
TL;DR: This work investigates the use of a racing procedure based on a statistical approach, named I/F-Race, to suggest the parameters for two binarization algorithms: one based on the perception of objects by distance and one combining it with a Laplacian energy-based technique.
Abstract: The use of I/F-Race to tune document image binarization methods is investigated. The method combines visual perception with the minimization of an energy function. Our experiments show that I/F-Race suggests promising parametric configurations. The binarization algorithm configured by I/F-Race outperforms other recent methods. Binarization of images of old documents is considered a challenging task due to the wide diversity of degradation effects that can be found. To deal with this, many algorithms whose performance depends on an appropriate choice of their parameters have been proposed. In this work, we investigate the application of a racing procedure based on a statistical approach, named I/F-Race, to suggest the parameters for two binarization algorithms based (i) on the perception of objects by distance (POD) and (ii) on the POD combined with a Laplacian energy-based technique. Our experiments show that both algorithms had their performance statistically improved, outperforming other recent binarization techniques. The second proposal presented herein ranked first in H-DIBCO (Handwritten Document Image Binarization Contest) 2014.

28 citations


Cites background or methods from "Segmentation of images of stained p..."

  • ...…especially in the case of old documents, because in this kind of images it is possible to find different issues, like uneven illumination, faded ink, smudges and smears, bleed-through interference and shadows (Mello, 2010b; Mesquita, Mello, & Almeida, 2014; Ntirogiannis, Gatos, & Pratikakis, 2013)....

  • ...In addition, images from a set of manuscripts authored by the Brazilian politician Joaquim Nabuco from the end of the 19th Century, from ProHist project (Mello, 2010a) are also used....

  • ...Binarization is an important step in the document image analysis pipeline (that usually includes digitization, binarization (Sezgin & Sankur, 2004), skew correction (Mascaro, Cavalcanti, & Mello, 2010), text-line and word segmentation (Sanchez, Mello, Suarez, & Lopes, 2011), character segmentation (Lacerda & Mello, 2013) and character recognition (Cheriet, Kharma, Liu, & Suen, 2007; de Mello et al., 2012)) since its result affects further stages of the recognition....

Journal ArticleDOI
TL;DR: The winning algorithm of H-DIBCO 2018 adopts mathematical morphological operations to estimate and compensate the document background, using a disk-shaped structuring element whose radius is computed by the minimum entropy-based stroke width transform (SWT), and then performs Laplacian energy-based segmentation on the compensated document images.
Abstract: Binarization plays an important role in document analysis and recognition (DAR) systems. In this paper, we present our winning algorithm in ICFHR 2018 competition on handwritten document image binarization (H-DIBCO 2018), which is based on background estimation and energy minimization. First, we adopt mathematical morphological operations to estimate and compensate the document background. It uses a disk-shaped structuring element, whose radius is computed by the minimum entropy-based stroke width transform (SWT). Second, we perform Laplacian energy-based segmentation on the compensated document images. Finally, we implement post-processing to preserve text stroke connectivity and eliminate isolated noise. Experimental results indicate that the proposed method outperforms other state-of-the-art techniques on several public available benchmark datasets.
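The background-estimation step can be sketched with a grey-level closing, where a disk larger than the stroke width fills in the (dark) text. The fixed radius below is an assumption; the paper computes it from a minimum entropy-based stroke width transform, and the division-based compensation is one plausible variant, not necessarily the authors' exact formula:

```python
import numpy as np
from scipy.ndimage import grey_closing

def disk(radius):
    """Disk-shaped structuring element (boolean footprint)."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return (x * x + y * y) <= radius * radius

def estimate_background_morph(gray, radius=5):
    """Grey-level closing with a disk larger than the stroke width
    fills in dark text, leaving an estimate of the page background."""
    return grey_closing(gray, footprint=disk(radius))

def compensate(gray, radius=5):
    """Divide out the estimated background to flatten uneven
    illumination (a plausible compensation, assumed here)."""
    bg = estimate_background_morph(gray, radius).astype(float)
    return np.clip(255.0 * gray / np.maximum(bg, 1.0), 0, 255)

# Toy page: paper at 200 with a 3-pixel-tall dark stroke at 30.
page = np.full((64, 64), 200.0)
page[30:33, 8:56] = 30.0
bg = estimate_background_morph(page)     # stroke filled in
comp = compensate(page)                  # text dark, paper flattened
```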

24 citations

Proceedings ArticleDOI
13 Dec 2012
TL;DR: A new algorithm for image segmentation of ancient maps and floor plans is introduced that aims to remove most of the non-textual elements, leaving just the text, which allows further automatic identification of the map or plan through automatic character recognition techniques.
Abstract: There are several kinds of information that can be obtained from ancient documents. In general, image processing research on this subject works with images of letters or documents. Topographic maps and floor plans are also an important source of information about history. In this paper, we introduce a new algorithm for image segmentation of ancient maps and floor plans. It aims to remove most of the non-textual elements, leaving just the text. This allows further automatic identification of the map or plan through automatic character recognition techniques. The proposed method uses a new edge detection algorithm, thresholding and connected component analysis. The results are analyzed both qualitatively and quantitatively by comparison with another technique.
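The connected-component analysis step can be sketched as size filtering: label the components of the thresholded image and keep only those in a text-like size range (the bounds below are illustrative, not the paper's values):

```python
import numpy as np
from scipy.ndimage import label

def keep_text_sized_components(binary, min_size=3, max_size=50):
    """Discard connected components whose pixel count falls outside
    a text-like size range: large components such as map lines or
    walls are removed, and tiny specks are dropped too."""
    labels, n = label(binary)
    sizes = np.bincount(labels.ravel())
    keep = (sizes >= min_size) & (sizes <= max_size)
    keep[0] = False                  # label 0 is the background
    return keep[labels]

# Toy image: a long "wall" line (64 px) and a small text-like blob (9 px).
img = np.zeros((32, 32), dtype=bool)
img[0:2, :] = True
img[10:13, 10:13] = True
out = keep_text_sized_components(img)   # blob kept, wall removed
```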

16 citations

References
Journal ArticleDOI
TL;DR: This final installment of the paper considers the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now.
Abstract: In this final installment of the paper we consider the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now. To a considerable extent the continuous case can be obtained through a limiting process from the discrete case by dividing the continuum of messages and signals into a large but finite number of small regions and calculating the various parameters involved on a discrete basis. As the size of the regions is decreased these parameters in general approach as limits the proper values for the continuous case. There are, however, a few new effects that appear and also a general change of emphasis in the direction of specialization of the general results to particular cases.

65,425 citations

Journal ArticleDOI
TL;DR: In this paper, a generalized form of entropy is proposed that recovers Boltzmann-Gibbs statistics in the q → 1 limit, and the main properties associated with this entropy are established, particularly those corresponding to the microcanonical and canonical ensembles.
Abstract: With the use of a quantity normally scaled in multifractals, a generalized form is postulated for entropy, namely S_q ≡ k[1 − Σ_{i=1}^{W} p_i^q]/(q − 1), where q ∈ ℝ characterizes the generalization and p_i are the probabilities associated with the W (microscopic) configurations (W ∈ ℕ). The main properties associated with this entropy are established, particularly those corresponding to the microcanonical and canonical ensembles. The Boltzmann-Gibbs statistics is recovered as the q → 1 limit.
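The q → 1 limit can be checked numerically; a small sketch of S_q and the Boltzmann-Gibbs (Shannon) entropy it generalizes, taking k = 1 for simplicity:

```python
import numpy as np

def tsallis_entropy(p, q, k=1.0):
    """S_q = k * (1 - sum_i p_i**q) / (q - 1), the generalized entropy."""
    p = np.asarray(p, dtype=float)
    return k * (1.0 - np.sum(p ** q)) / (q - 1.0)

def shannon_entropy(p, k=1.0):
    """Boltzmann-Gibbs/Shannon entropy, the q -> 1 limit of S_q."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -k * np.sum(p * np.log(p))

p = [0.5, 0.25, 0.25]
# For q close to 1, S_q is numerically close to the Shannon entropy,
# illustrating that Boltzmann-Gibbs statistics is recovered as q -> 1.
```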

8,239 citations


"Segmentation of images of stained p..." refers methods in this paper

  • ...A thresholding algorithm based on Tsallis entropy [15] is presented in [9]....

Journal ArticleDOI
TL;DR: 40 selected thresholding methods from various categories are compared in the context of nondestructive testing applications as well as for document images, and the thresholding algorithms that perform uniformly better over nondestructive testing and document image applications are identified.
Abstract: We conduct an exhaustive survey of image thresholding methods, categorize them, express their formulas under a uniform notation, and finally carry out their performance comparison. The thresholding methods are categorized according to the information they are exploiting, such as histogram shape, measurement space clustering, entropy, object attributes, spatial correlation, and local gray-level surface. 40 selected thresholding methods from various categories are compared in the context of nondestructive testing applications as well as for document images. The comparison is based on the combined performance measures. We identify the thresholding algorithms that perform uniformly better over nondestructive testing and document image applications. © 2004 SPIE and IS&T. (DOI: 10.1117/1.1631316)

4,543 citations


"Segmentation of images of stained p..." refers methods in this paper

  • ...However, classical thresholding algorithms [13] do not achieve satisfactory results due to the unique features of this kind of image, as can be seen in Fig. 2, where the documents of Fig. 1 were binarized using the classical Otsu algorithm [13]....

  • ...Most of the thresholding algorithms presented herein, and several others, can be found in more detail in [13]....

  • ...adopted in [13] defines six classes of algorithms: histogram-based, entropy-based, clustering-based, object attributes-based, spatial methods and local or adaptive methods....


Book
01 Jan 2011
TL;DR: In this paper, the acquisition and use of digital images in a wide variety of scientific fields is discussed, with a focus on high-dynamic-range imaging and imaging in more than two dimensions.
Abstract: "This guide clearly explains the acquisition and use of digital images in a wide variety of scientific fields. This sixth edition features new sections on selecting a camera with resolution appropriate for use on light microscopes, on the ability of current cameras to capture raw images with high dynamic range, and on imaging in more than two dimensions. It discusses Dmax for X-ray images and combining images with different exposure settings to further extend the dynamic range. This edition also includes a new chapter on shape measurements, a review of new developments in image file searching, and a wide range of new examples and diagrams"

3,017 citations

Book
07 May 1999
TL;DR: In this paper, the authors present a comprehensive overview of visual science, from early neural processing of image structure in the retina to high-level visual attention, memory, imagery, and awareness.
Abstract: This book revolutionizes how vision can be taught to undergraduate and graduate students in cognitive science, psychology, and optometry. It is the first comprehensive textbook on vision to reflect the integrated computational approach of modern research scientists. This new interdisciplinary approach, called "vision science," integrates psychological, computational, and neuroscientific perspectives. The book covers all major topics related to vision, from early neural processing of image structure in the retina to high-level visual attention, memory, imagery, and awareness. The presentation throughout is theoretically sophisticated yet requires minimal knowledge of mathematics. There is also an extensive glossary, as well as appendices on psychophysical methods, connectionist modeling, and color technology. The book will serve not only as a comprehensive textbook on vision, but also as a valuable reference for researchers in cognitive science, psychology, neuroscience, computer science, optometry, and philosophy.

1,774 citations


"Segmentation of images of stained p..." refers background or methods in this paper

  • ...The choice of a disk as a structuring element is because our visual system tends to lose the perception of corners as objects go far from us [10]....

  • ...In this paper, we present a new approach for this thresholding problem by first filtering the background using ideas of visual perception theory [5][10]....
