A color-based layout analysis to process censorship cards of film archives

31 Aug 2005-pp 1110-1114

TL;DR: A new method for image segmentation and layout analysis that takes full advantage of color information is proposed, implemented in the DIA system WISDOM++ and tested on a corpus of multi-format documents concerning historic film censorships.

Abstract: Processing censorship cards of the 20th century in order to support annotation and retrieval processes leads to a number of challenges for many DIA systems. Problems due to the low layout quality and standard of such material can be reduced by exploiting information conveyed by color. In this paper, taking into account lessons learned in the context of the IST project Collate, we propose a new method for image segmentation and layout analysis that takes full advantage of color information. The method has been implemented in the DIA system WISDOM++ and tested on a corpus of multi-format documents concerning historic film censorships.

Summary (1 min read)

1. Introduction

  • Many institutions which collect and preserve cultural heritage, as historical documents, have shown a great interest in the digitalization of their resources and in the exploitation of mechanisms to provide online access to digitalized products.
  • This paper presents layout analysis issues and problems addressed in the EU funded project COLLATE, whose main goal is to provide film archivists adequate access to historic film-related documents and their associated metadata [5].
  • Finally, conclusions are drawn in Section 4.

2. The approach

  • A naïve approach to color document image processing would be to separate different colors and to process the images corresponding to each color separately, as independent binary images.
  • Images are segmented again and the spatial merging is applied on intersecting blocks.
  • At each step, the dissimilarity between two clusters of colors (inter-cluster dissimilarity) is evaluated on the basis of two measures: a) the Euclidean distance between two colors taken from distinct clusters (nearest neighbor based dissimilarity); b) the Euclidean distance between the centroids of the two clusters (centroid-based dissimilarity).
  • A first step towards the reconstruction of layout structure consists of classifying the blocks according to their content type: text, horizontal line, vertical line, picture (i.e. halftone images) and graphics (e.g. line drawings).

3. Application

  • In this section the authors empirically evaluate the proposed approach in terms of the capability to isolate interesting blocks of different color for subsequent logical labeling.
  • Fig. 4 shows a document image of the NFA class, which is the most complex class to analyze because of its overall low quality.
  • The document contains manual annotations (no_prec_doc, top right-hand corner), blue stamps (register_office and dispatch_officer, bottom page), red stamps (rubber_stamp, top left-hand corner) and revenue stamps (stamp, in the middle of the page).
  • The color-based layout analysis is able to isolate them, while the b/w layout analysis returns a single layout block for the whole central part of the document image and two spurious blocks extracted from the bottom of the image.
  • Indeed, 205 components were labeled in the color setting against 140 in the b/w setting for the FAA class, and 64 against 12 for the NFA class.

4. Conclusions

  • A new color-based layout analysis method has been proposed in order to meet challenges coming from processing censorship cards of European film archives from the 1920s and 1930s.
  • A comparison of the method with the original b/w version has been provided.
  • Results show that the color-based approach isolates interesting blocks better than the previous version and provides a more accurate basis for understanding.

A color-based layout analysis to process censorship cards of film archives
Margherita Berardi Oronzo Altamura Michelangelo Ceci Donato Malerba
Dipartimento di Informatica Università degli Studi di Bari
via Orabona 4 - 70126 Bari
{berardi, altamura, ceci, malerba }@di.uniba.it
Abstract
Processing censorship cards of the 20th century in order
to support annotation and retrieval processes leads to a
number of challenges for many DIA systems. Problems
due to the low layout quality and standard of such a
material can be reduced by exploiting information
conveyed by color. In this paper, taking into account
lessons learned in the context of the IST project Collate,
we propose a new method for image segmentation and
layout analysis that takes full advantage of color
information. The method has been implemented in the
DIA system WISDOM++ and tested on a corpus of multi-
format documents concerning historic film censorships.
1. Introduction
Many institutions which collect and preserve cultural
heritage, such as historical documents, have shown a great
interest in the digitization of their resources and in the
exploitation of mechanisms to provide online access to
digitized products. Indeed, several research projects
have been recently promoted for the purposes of
preservation, storage, indexing, and online use.
Interesting examples are: the MASTER project, which has
developed a standard for computer-readable descriptions
of medieval manuscripts in European libraries with
retrieval objectives [10]; the MEMORIAL project, whose
goal is the establishment of a digital document workbench
enabling the creation of distributed virtual archives of
typewritten documents related to prisoners in World-War
II concentration camps [3]; the Bovary project, that
concerns the digitalization of 5,000 original manuscripts
handwritten by Gustave Flaubert [12]; the D-SCRIBE
project, that aims to develop an integrated system for
digitization and processing of Old Greek manuscripts [6].
This paper presents layout analysis issues and
problems addressed in the EU funded project COLLATE,
whose main goal is to provide film archivists adequate
access to historic film-related documents and their
associated metadata [5]. Such documents can be
censorship documents, newspaper articles, posters,
advertisement material, registration cards, and photos that
cannot be used for access, indexing and retrieval as they are.
In this framework, we applied the DIA system
WISDOM++ [1] to digitized documents available in three
national film archives, namely Deutsches Filminstitut,
Filmarchiv Austria and Národní Filmový Archiv (Czech
Republic). WISDOM++ was originally developed to fully
support the transformation of multi-page printed
documents into XML format. Since most of the information
on the documents is typewritten, the system appeared to
be useful for the conversion of scanned documents into a
format (i.e. XML) suitable for storage and retrieval.
Nevertheless, the low layout quality and standard of such
material introduce a considerable amount of noise in its
description. The layout quality is often negatively affected
by both the degradation of the documents and the
presence of frames, stamps, signatures, ink specks, and
manual annotations that overlap the layout
components involved in the understanding processes.
To effectively process these documents, it is necessary
to exploit information conveyed by color since signatures,
stamps and manual annotations are often characterized by
colors different from those present in the background and
typewritten texts. In this way, it is possible to identify
noise (e.g. stains, tears and irregular accumulation of dirt
due to repeated handling [2]) on the basis of color
homogeneity, as well as to isolate overlapping blocks of
interest (e.g. as in legal documents, where blue stamps or
revenue stamps often overlap signatures or typewritten
text), but also to isolate interesting blocks from
uninteresting ones (e.g. manual annotations that overlap
layout components involved in annotation processes).
WISDOM++, originally developed to process black-
and-white (binary) images, has been extended to take full
advantage of color information in image segmentation and
layout analysis steps. In particular, in the following
section the description of the new algorithm is provided.
In Section 3, results on censorship cards are reported and
discussed. Finally, conclusions are drawn in Section 4.
2. The approach
A naïve approach to color document image processing
would be to separate different colors and to process
Proceedings of the 2005 Eight International Conference on Document Analysis and Recognition (ICDAR’05)
1520-5263/05 $20.00 © 2005
IEEE
Authorized licensed use limited to: Donato Malerba. Downloaded on July 26,2010 at 21:21:31 UTC from IEEE Xplore. Restrictions apply.

images corresponding to each color separately, as
independent binary images. However, this approach is
based on the simplified assumption that a logical
component can be associated with a single color. In
practice, this assumption is rarely true since paper color
darkens with age, while written parts, whether handwritten or
typed, tend to fade [11]. Moreover, when the document is
written or typed on both sides, and the backside is visible
from the front side, further noise is introduced. For these
reasons, a more sophisticated approach is necessary.
The proposed color image segmentation algorithm
operates in three steps (see Fig. 1): color reduction and
background removal, colorimetric merging and spatial
merging. In the first phase, color quantization is performed
in order to identify the set of relevant colors, and the user
manually selects the background colors. Later on, the algorithm
works on a set (List) of binary images, where 0
corresponds to pixels of background colors while 1
corresponds to one of the foreground colors. There are as
many binary images as foreground colors.
The segmentation of each binary image is based on an
efficient variant of the Run Length Smoothing Algorithm
(RLSA) [15] and produces a list of rectangular blocks
(BBSet). Once the set of basic blocks has been extracted,
the colorimetric merging is performed. It aims to cluster
binary images on the basis of the associated colors.
Images belonging to the same cluster are removed from
List and replaced by the merging result. The second
merging step is performed on the updated List, taking
images in pairs. Images are segmented again and the
spatial merging is applied to intersecting blocks. The
result is an updated list of both binary and multicolor
images. Both the merging steps take into account only
pixels contained in the set of basic blocks. Pixels that do
not contribute to the identification of a basic block are
ignored as noise pixels. We observe that most color
image segmentation methods only operate in color space
and do not take any spatial information into account.
Thus, relations between color values and pixel positions in
the image plane are not used [13] and the color
homogeneity of spatially contiguous pixels is the only
criterion used. This severe limitation is overcome by some
methods [8, 9] that associate colors with layout components
on the basis of both color and spatial information.
Nevertheless, these approaches are based on the
assumption that a layout component is associated with a
single color. In our domain, by contrast, it is necessary to
give the system the capability to also identify
multicolor blocks such as pictures or revenue stamps.
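The run-length smoothing step mentioned above can be illustrated with a minimal sketch. It shows the classic horizontal smoothing idea only, not the efficient variant used in WISDOM++; the function names and the choice to fill only gaps bounded by foreground pixels on both sides are assumptions made for illustration.

```python
def rlsa_row(row, max_gap):
    """Fill runs of 0s no longer than max_gap that lie between two 1s.

    Assumption: background runs touching the row border are left untouched.
    """
    out = list(row)
    last_one = None  # index of the most recent foreground pixel
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_one is not None and 0 < i - last_one - 1 <= max_gap:
                for j in range(last_one + 1, i):
                    out[j] = 1
            last_one = i
    return out


def rlsa_horizontal(image, max_gap):
    """Apply the row smoothing to every row of a binary image (list of lists)."""
    return [rlsa_row(row, max_gap) for row in image]


if __name__ == "__main__":
    print(rlsa_row([1, 0, 0, 1, 0, 0, 0, 0, 1, 0], 2))
    # -> [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
```

Connected regions of the smoothed image would then be enclosed in rectangles to obtain basic blocks; that step is omitted here.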
In the following subsections, the three steps of color
image segmentation as well as the layout analysis
procedure are described in detail.
The Quantization process. The quantization process
follows the method proposed in [7], whose basic idea is to
build an octree containing a maximum of K different leaves
(a leaf corresponds to a color). The image is read twice. The
first time, colors are iteratively added to the tree keeping
at most K leaves. For our application domain, we set
K=16 since archivists normally do not “see” more than
nine colors in a censorship card. Quantization is
performed at the second reading of the image and the K-
color image is transformed into K different binary images,
each of which is related to a color. Among the K colors,
the user manually selects a subset of m background colors.
In our domain, m ranges between two and three due to
document degradation. The exploration of the automatic
selection of background colors is part of future work.
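As a rough illustration of the step from the K-color image to the K-m foreground binary images, the sketch below assumes the quantized image is stored as a 2D list of palette indices and that the user-selected background colors are given as a set of indices. The function name and data layout are hypothetical, and the octree quantization itself is not reproduced.

```python
def split_into_binary_images(quantized, background):
    """Return one binary image per foreground palette index.

    quantized:  2D list of palette indices (output of color quantization).
    background: iterable of palette indices chosen as background colors.
    In each binary image, 1 marks pixels of that foreground color, 0 the rest.
    """
    present = {index for row in quantized for index in row}
    foreground = sorted(present - set(background))
    return {
        color: [[1 if index == color else 0 for index in row] for row in quantized]
        for color in foreground
    }


if __name__ == "__main__":
    page = [[0, 1, 1],
            [2, 0, 1]]
    for color, image in split_into_binary_images(page, background={0}).items():
        print(color, image)
```

With K quantized colors and m background colors, the returned dictionary has at most K-m entries, matching the input of the colorimetric merging phase.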
Colorimetric Merging. The list of K-m images is the
merging phase input. The first merging process aims to
merge those binary images whose colors can be
considered as slight variations of one another. This is a necessary step,
since the value K fixed a priori in the previous step may
turn out to be too large.
The process is based on a hierarchical clustering
algorithm. At each step, the dissimilarity between two
clusters of colors (inter-cluster dissimilarity) is evaluated
on the basis of two measures: a) the Euclidean distance
between two colors taken from distinct clusters (nearest
neighbor based dissimilarity); b) the Euclidean distance
between the centroids of the two clusters (centroid-based
dissimilarity). Two clusters of colors are merged when
both computed measures are lower than a threshold.
Clusters whose nearest neighbor based dissimilarity is
lowest are considered first. The threshold is defined as
the standard deviation value computed by considering all
the distances between each color of one cluster and each
color of another cluster. At the end, for each remaining
cluster a new image representing the average color of
original images is generated. All distances are computed
in the CIELab space. CIELab space is obtained by a
nonlinear transformation of the original RGB space. We
used CIELab since it is considered "visually uniform"
because adjacent color samples represent equal intervals
of visual perception [4].
procedure segment(OriginalImg, th_Incl, th_Int, th_MinOcc, th_MaxOcc)
Output: list of images
Begin
  ReducedImg ← quantization(16, OriginalImg);
  List ← generateBinaryImg(ReducedImg, List);
  List ← removeBackground(List);
  Forall ForegroundImg ∈ List
    BBSets ← BBSets ∪ RLSASegmentation(ForegroundImg);
  List ← ColorimetricMerging(List, BBSets);
  Forall Foreground1 ∈ List,
  Forall Foreground2 ∈ List-{Foreground1} {
    BBSet1 ← ApplyRLSASegm(Foreground1);
    BBSet2 ← ApplyRLSASegm(Foreground2);
    IntBlocks ← ComputeIntersections(BBSet1, BBSet2);
    List ← SpatialMerging(List, IntBlocks, th_Incl, th_Int, th_MinOcc, th_MaxOcc); }
  return List
End;
Fig. 1 Top-level pseudocode of the segmentation algorithm.
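A minimal sketch of the colorimetric merging idea follows. It is not the WISDOM++ implementation. The paper does not say which RGB encoding or white point is used, so the conversion below assumes 8-bit sRGB and the D65 white point. The threshold is passed in as a plain parameter rather than derived from the standard deviation of inter-cluster distances, and all function names are hypothetical.

```python
import math
from itertools import combinations


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELab (assumes sRGB primaries, D65)."""
    def linearize(v):
        v = v / 255.0
        return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    r, g, b = (linearize(v) for v in rgb)
    x = (0.4124 * r + 0.3576 * g + 0.1805 * b) / 0.95047
    y = (0.2126 * r + 0.7152 * g + 0.0722 * b) / 1.0
    z = (0.0193 * r + 0.1192 * g + 0.9505 * b) / 1.08883
    return (116 * f(y) - 16, 500 * (f(x) - f(y)), 200 * (f(y) - f(z)))


def nearest_neighbor_dist(c1, c2):
    """Smallest Euclidean distance between colors of two distinct clusters."""
    return min(math.dist(p, q) for p in c1 for q in c2)


def centroid(cluster):
    """Average color of a cluster of Lab triples."""
    return tuple(sum(p[i] for p in cluster) / len(cluster) for i in range(3))


def merge_colors(lab_colors, threshold):
    """Agglomerative merging: a pair of clusters is merged only when both the
    nearest-neighbor and the centroid distance are below the threshold; pairs
    with the lowest nearest-neighbor distance are tried first."""
    clusters = [[c] for c in lab_colors]
    while len(clusters) > 1:
        pairs = sorted(
            combinations(range(len(clusters)), 2),
            key=lambda ij: nearest_neighbor_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        for i, j in pairs:
            a, b = clusters[i], clusters[j]
            if (nearest_neighbor_dist(a, b) < threshold
                    and math.dist(centroid(a), centroid(b)) < threshold):
                clusters[i] = a + b
                del clusters[j]  # j > i, so index i stays valid
                break
        else:
            break  # no pair satisfied both conditions
    return clusters


if __name__ == "__main__":
    colors = [(50.0, 0.0, 0.0), (52.0, 0.0, 0.0), (20.0, 40.0, 40.0)]
    print(merge_colors(colors, threshold=5.0))
```

The centroid of each remaining cluster plays the role of the average color assigned to the merged image.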
Spatial Merging. By considering spatial information
on the degree of overlapping of the layout extracted from
different color images, it is possible to group together
multicolor blocks and remove some useless low-density
blocks (with few pixels) that capture color shades of the
same layout component. Spatial merging operates on
RLSA results when it is applied to the possibly reduced
set of binary images determined by colorimetric merging.
For each pair of binary images, intersecting blocks are
merged following three perceptual criteria.
The first criterion is summarized as follows:
Given BBx ∈ BasicBlocks(Foreground1),
      BBy ∈ BasicBlocks(Foreground2)
1) if perc_of_intersection(BBx,BBy) > th_Int
   && th_MinOcc < perc_of_occupation(BBx) < th_MaxOcc
   && th_MinOcc < perc_of_occupation(BBy) < th_MaxOcc
   then Foreground1 ← removeArea(Foreground1, BBx);
        Foreground2 ← removeArea(Foreground2, BBy);
        NewForeground ← GenerateMulticolor(BBx, BBy);
        List ← addElement(List, NewForeground);
This rule identifies multicolor layout components. For
each pair of intersecting blocks, when the percentage of
intersection exceeds a threshold (th_Int) and the
percentage of occupation (i.e. the ratio between the area
of the block and the entire image area) for both candidate
blocks is in the interval [th_MinOcc, th_MaxOcc], then a
new multicolor image is generated. The new image is built
as the union of pixels of the original images enclosed in
the blocks. Original binary images are also “cleaned” by
removing pixels added to the multicolor image.
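The test behind the first criterion can be sketched as follows. Blocks are represented here as (x0, y0, x1, y1) rectangles with exclusive upper bounds, which is a hypothetical layout. The paper does not define perc_of_intersection precisely, so measuring the intersection area against the smaller of the two blocks is an assumption. The default thresholds are the values reported in Section 3.

```python
def area(block):
    """Area of an (x0, y0, x1, y1) rectangle; empty rectangles have area 0."""
    x0, y0, x1, y1 = block
    return max(0, x1 - x0) * max(0, y1 - y0)


def intersection(a, b):
    """Rectangle common to both blocks (may be empty)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def perc_of_intersection(a, b):
    # Assumption: intersection area relative to the smaller block, in percent.
    return 100.0 * area(intersection(a, b)) / min(area(a), area(b))


def perc_of_occupation(block, image_size):
    """Block area relative to the whole image area, in percent."""
    width, height = image_size
    return 100.0 * area(block) / (width * height)


def is_multicolor_pair(a, b, image_size,
                       th_int=70.0, th_min_occ=1.5, th_max_occ=4.5):
    """Condition of rule 1: strong overlap and both blocks of moderate size."""
    return (perc_of_intersection(a, b) > th_int
            and th_min_occ < perc_of_occupation(a, image_size) < th_max_occ
            and th_min_occ < perc_of_occupation(b, image_size) < th_max_occ)


if __name__ == "__main__":
    stamp_red = (100, 100, 300, 250)
    stamp_blue = (120, 110, 300, 260)
    print(is_multicolor_pair(stamp_red, stamp_blue, (1000, 1000)))  # -> True
```

When the condition holds, the pixels enclosed in both blocks would be moved into a new multicolor image; that bookkeeping is left out of the sketch.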
The second criterion is summarized as follows:
2) if perc_of_intersection(BBx,BBy) > th_Int
   && perc_Inclusion(BBx,BBy) + perc_Inclusion(BBy,BBx) ≥ th_Incl
   && Multicolor(Foreground1)
   then Foreground1 ← addArea(Foreground1, Foreground2, BBy∩BBx);
        removeArea(Foreground2, BBy∩BBx);
This criterion is based on the rationale that if a block
strongly overlaps a block of a multicolor image, the
intersecting part has to be considered as composing the
multicolor block. The pixels enclosed in the intersection
are removed from the binary image (Foreground2) and
added to the multicolor image (Foreground1).
A third criterion aiming at the extension of binary
images is summarized by the following rules:
3) if perc_of_intersection(BBx,BBy) > th_Int
   && perc_Inclusion(BBx,BBy) + perc_Inclusion(BBy,BBx) ≥ th_Incl
   && perc_of_occupation(BBx) < th_MinOcc
   then Foreground1 ← addArea(Foreground1, Foreground2, BBy);
        removeArea(Foreground2, BBy);
4) if perc_of_intersection(BBx,BBy) > th_Int
   && perc_Inclusion(BBx,BBy) + perc_Inclusion(BBy,BBx) ≥ th_Incl
   && density(BBx) < density(BBy)
   then Foreground2 ← addArea(Foreground2, Foreground1, BBy∩BBx);
        removeArea(Foreground1, BBy∩BBx);
Rule 3 states that if a small block has a high degree of
overlapping with a block of another image, it has to be
considered a spurious block to include in the image
associated with the “predominant” block (Foreground1).
Rule 4 states that if two blocks have a high degree of
overlapping, then the intersecting part of the block with
lower density (non-predominant block) has to be added to
the image of the predominant one. The density of a block
is defined as the ratio between the number of pixels
contained in a block and the area of the block.
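The density measure used by rule 4 is simple enough to sketch directly. The code assumes a binary image stored as a list of rows and a block given as (x0, y0, x1, y1) with exclusive upper bounds; both conventions, and the function name, are hypothetical.

```python
def density(binary_image, block):
    """Ratio between the foreground pixels inside a block and the block area."""
    x0, y0, x1, y1 = block
    pixels = sum(binary_image[y][x] for y in range(y0, y1) for x in range(x0, x1))
    return pixels / ((x1 - x0) * (y1 - y0))


if __name__ == "__main__":
    image = [[1, 1, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 1]]
    dense_block = (0, 0, 2, 2)   # 3 foreground pixels out of 4
    sparse_block = (2, 0, 4, 3)  # 1 foreground pixel out of 6
    print(density(image, dense_block), density(image, sparse_block))
```

Under rule 4, the block with the lower value would be treated as non-predominant and its intersecting part reassigned to the image of the denser block.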
Our algorithm allows the user to set four different
thresholds: th_Int and th_Incl that define the minimal
percentage of intersection and inclusion of merging
blocks, respectively; th_MinOcc and th_MaxOcc that
define the range of occupation for merging blocks. All the
values are dependent on the specific type of documents.
Although it is possible to find the optimal value of these
parameters on the basis of a training set of documents, this
aspect has not been explored in this work.
At the end of the spatial merging process, List
contains the final list of binary images. The RLSA
segmentation is applied to each image separately (if not
yet computed) and each RLSA execution returns a set of
rectangular blocks that are joined in a single set of blocks.
Layout Analysis. The segmentation algorithm returns
(possibly) overlapping blocks that may contain either
textual or graphical information and are either single color
or multicolor. A first step towards the reconstruction of
layout structure consists of classifying the blocks
according to their content type: text, horizontal line,
vertical line, picture (i.e. halftone images) and graphics
(e.g. line drawings). This classification is performed by
means of the decision tree learner ITI that builds a
decision tree from a set of training examples (blocks) of
the five classes. The layout structure is built by exploiting
not only the result of the classification of basic blocks and
their geometrical features but also the color information
obtained during the segmentation process.
Strategies for the extraction of layout structure have
been traditionally classified as top-down or bottom-up
[14]. WISDOM++ decomposes the document page in a
hybrid way, since it combines the image segmentation and
a bottom-up layout analysis method to assemble basic
blocks into larger frames. More precisely, the layout
structure is extracted in two steps:
1. A global analysis of the document image to
determine possible areas containing paragraphs, sections,
columns, figures and tables. This step is based on an
iterative process, in which the vertical and horizontal
histograms of text blocks are alternately analyzed in
order to detect columns and sections/paragraphs,
respectively. The levels of columns and sections are
alternated, which means that a column contains sections,
while a section contains columns.
2. A local analysis to group together blocks which
possibly fall within the same area. Four perceptual criteria
are considered in this step: proximity (e.g. adjacent
components belonging to the same column/area are
equally spaced), continuity (overlapping components),
similarity (e.g. components of the same type, with an
almost equal height) and color (i.e. components of the
same color). Pairs of layout components that satisfy some
of these criteria are grouped together. It is noteworthy that
grouping affects either pairs of layout components
extracted from the same binary image (i.e. with exactly the
same color) or pairs of layout components labelled as
multicolor. Therefore, the color associated with a new
layout component is uniquely determined by its
constituents. By contrast, it is possible to group together
layout components with different content types. In this
case, the associated type is set to mixed; otherwise it is set
to the inherited type. The layout structure extracted for
each document page is a hierarchy with five levels: basic
blocks, lines, set of lines, frame1 and frame2.
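The histogram analysis of the global step can be illustrated with a single pass along one axis. The sketch below projects text blocks onto the x axis and splits the page wherever an empty gap of at least min_gap pixels appears. The alternating, recursive procedure of WISDOM++ is not reproduced, and the function name, the block layout (x0, y0, x1, y1) and the min_gap parameter are assumptions.

```python
def split_columns(blocks, page_width, min_gap):
    """Return (start, end) x-ranges of candidate columns, end exclusive."""
    profile = [0] * page_width  # number of text blocks covering each x
    for x0, _, x1, _ in blocks:
        for x in range(max(0, x0), min(page_width, x1)):
            profile[x] += 1

    columns, start, gap = [], None, 0
    for x, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:  # wide enough empty run: close the column
                columns.append((start, x - gap + 1))
                start, gap = None, 0
    if start is not None:
        columns.append((start, page_width - gap))
    return columns


if __name__ == "__main__":
    text_blocks = [(10, 0, 40, 20), (12, 30, 38, 50), (60, 0, 90, 50)]
    print(split_columns(text_blocks, page_width=100, min_gap=10))
    # -> [(10, 40), (60, 90)]
```

Applying the same idea to the y axis inside each detected column, and then again to the x axis, would give the alternation of columns and sections described in step 1.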
3. Application
In this section we empirically evaluate the proposed
approach in terms of the capability to isolate interesting
blocks of different color for subsequent logical labeling.
To evaluate this aspect, we compared the output of the
new color-based layout analysis with the output of the
black and white (b/w) layout analysis implemented in the
original version of WISDOM++.
The corpus used in this study is composed of
document images provided by the three film archives
involved in the project COLLATE. Generally, documents
are multi-page, where each page is a 256-color image in
TIFF format representing rare historic film censorships
from the 1920s and 1930s. We applied WISDOM++ with both
the layout analysis methods to 108 document images in all
belonging to 3 distinct classes, one for each archive (see
Table 1). In the case of the color-based setting, the
following threshold values have been used: th_Int = 70%,
th_Incl = 75%, th_MinOcc = 1.5% and th_MaxOcc =
4.5%. Once the layout structures have been extracted,
the same domain-expert user (archivist) is asked to
manually label interesting components. The number of
relevant labels is 12 for FAA, 7 for DIF and 13 for NFA.
In Fig. 2, 3 and 4, examples of layout analysis outputs
are shown. It is noticeable that the color-based layout
analysis is able to isolate interesting blocks better than the
previous version. For example, in Fig. 2 the b/w layout
analysis returns very few blocks. In particular, labels such
as stamp, film_genre, film_length, adhesive_stamp have
not been separated and co-occur in the same frame2
block. On the contrary, color-based layout analysis is able
to isolate them. By closely looking at the image, we can
draw another consideration: the dep_signature (in violet
at the bottom) has not been represented at all in the b/w
image, which is due to the approximation performed by
the embedded binarization algorithm. Of course, this loss
of layout components does not occur in color-based layout
analysis. By looking at Fig. 3, we note that the color-
based layout analysis is able to identify overlapping
blocks, that is, cens_signature and stamp. On the contrary,
the b/w layout analysis identifies two blocks, and the
stamp has been split. Fig. 4 shows a document image of the
NFA class, which is the most complex class to analyze
because of its overall low quality. In this case,
the document contains manual annotations (no_prec_doc,
top right-hand corner), blue stamps (register_office and
dispatch_officer, bottom page), red stamps
(rubber_stamp, top left-hand corner) and revenue stamps
(stamp, in the middle of the page). The color-based layout
analysis is able to isolate them, while the b/w layout
analysis returns a single layout block for the whole central
part of the document image and two spurious blocks
extracted from the bottom of the image. This poor result is
due to the presence of both vertical and horizontal lines,
which affect the RLSA segmentation, especially when
colors are not differentiated.
Fig. 2 First page layouts of a FAA censorship card (block labels shown: dep_signature, stamp, adesive_stamp, film_length, film_genre).
The number of frame2 layout components that the user
is able to label has been recorded. In the case of DIF
cards, the color setting (i.e. 133 labels) is comparable with
the b/w (i.e. 149 labels). This can be explained by the
minor relevance of color in the case of DIF images. On
the contrary, in the case of both FAA and NFA, several
logical components are characterized by color
information. Indeed, for the FAA class, 205 components
have been labeled in the color setting against 140 in the
b/w, while 64 against 12 for the NFA class.
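To make the comparison easier to read, the counts reported above can be turned into relative changes. The short sketch below only restates the figures given in the text; the helper name is hypothetical.

```python
# Labeled frame2 components per archive: (color setting, b/w setting).
labeled = {"DIF": (133, 149), "FAA": (205, 140), "NFA": (64, 12)}


def relative_change(color, bw):
    """Percentage change of the color setting with respect to the b/w one."""
    return 100.0 * (color - bw) / bw


if __name__ == "__main__":
    for archive, (color, bw) in labeled.items():
        print(f"{archive}: {relative_change(color, bw):+.1f}%")
    # DIF: -10.7%
    # FAA: +46.4%
    # NFA: +433.3%
```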
4. Conclusions
In this paper, a new color-based layout analysis method
has been proposed in order to meet challenges coming
from processing censorship cards of European film
archives from the 1920s and 1930s. A
comparison of the method with the original b/w version
has been provided. Results show that the color-based
approach isolates interesting blocks better than
the previous version and provides a more accurate basis
for understanding. For future work, we plan to evaluate
the proposed approach in automatic/manual labeling.
Acknowledgements
The work presented in this paper is partial fulfillment of
the research objective set by the ATENEO-2005 project
on “Gestione dell'informazione non strutturata: modelli,
metodi e architetture”.
References
1. O. Altamura, F. Esposito, D. Malerba: Transforming
paper documents into XML format with WISDOM++.
IJDAR, 4(1), 2-17, 2001.
2. A. Antonacopoulos, D. Karatzas: The Lifecycle of a Digital
Historical Document: Structure and Content. ACM Symp.
on Document Engineering, 147-154, 2004.
3. A. Antonacopoulos, D. Karatzas: Document Image
Analysis for World War II Personal Records. Workshop
DIAL’04, 23-24, 336-341, 2004.
4. H.-D. Cheng, X.H. Jiang, Y. Sun, J.L. Wang: Color image
segmentation: advances and prospects. Pattern
Recognition, 34, 2259-81, 2001.
5. I. Frommholz, H. Brocks, U. Thiel, E. Neuhold, L.
Iannone, G. Semeraro, M. Berardi, M. Ceci: Document-
centered Collaboration for Scholars in the Humanities -
The COLLATE System. In Proc. ECDL’03, 2003.
6. B. Gatos, K. Ntzios, I. Pratikakis, S. Petridis, T.
Konidaris, S. J. Perantonis: A Segmentation-Free
Recognition Technique to Assist Old Greek Handwritten
Manuscript OCR. DAS2004, LNCS 3163, 63-74, 2004.
7. M. Gervautz, W. Purgathofer: A simple method for color
quantization: octree quantization. New Trends in Computer
Graphics, Proc. of Computer Graphic Int., 219-231, 1988.
8. H. Hase, M. Yoneda, S. Tokai, J. Kato, C.Y. Suen: Color
segmentation for text extraction. Int. Journal of Document
Analysis and Recognition (IJDAR), 6(4), 271-284, 2004.
9. J. He, A.C. Downton: Configurable Text Stamp
Identification Tool with Application of Fuzzy Logic. In
Proc. of DAS2004, LNCS 3163, 201-212, 2004.
10. F. Le Bourgeois, H. Kaileh: Automatic Metadata Retrieval
from Ancient Manuscripts. In Proc. of Document Analysis
Systems DAS2004, LNCS 3163, Italy, 75-89, 2004.
11. C.A.B. Mello, R.D. Lins: Image Segmentation of
Historical Documents. Visual 2000: 3rd Int. Conference on
Visual Computing, Mexico City, 2000.
12. S. Nicolas, T. Paquet, L. Heutte: Enriching Historical
Manuscripts: The Bovary Project. In Proc. of Document
Analysis Systems DAS2004, LNCS 3163, 135-146, 2004.
13. T. Perroud, K. Sobottka, H. Bunke: Text Extraction
from Color Documents Clustering Approaches in Three
and Four Dimensions. ICDAR’01, 2001.
14. S.N. Srihari, G.W. Zack: Document Image Analysis.
Proc. of Int. Conf. on Pattern Recognition, 434-436, 1986.
15. K.Y. Wong, R.G. Casey, F.M. Wahl: Document analysis
system. IBM Journal of Research and Development, 26(6), 1982.
Fig. 3 Second page layouts of a DIF censorship card (block labels shown: cens_signature, cert_signature, cert_signature, stamp).
Fig. 4 First page layouts of a NFA censorship card (block labels shown: undefined, no_prec_doc, dispatch_office, register_office, rubber_stamp, stamp).
Citations
More filters

Proceedings ArticleDOI
18 Sep 2011
TL;DR: A novel framework for segmentation of documents with complex layouts performed by combination of clustering and conditional random fields (CRF) based modeling and has been extensively tested on multi-colored document images with text overlapping graphics/image.
Abstract: In this paper, we propose a novel framework for segmentation of documents with complex layouts. The document segmentation is performed by combination of clustering and conditional random fields (CRF) based modeling. The bottom-up approach for segmentation assigns each pixel to a cluster plane based on color intensity. A CRF based discriminative model is learned to extract the local neighborhood information in different cluster/color planes. The final category assignment is done by a top-level CRF based on the semantic correlation learned across clusters. The proposed framework has been extensively tested on multi-colored document images with text overlapping graphics/image.

12 citations


Cites methods from "A color-based layout analysis to pr..."

  • ...Layout analysis using color information have been proposed in [9]–[11] to handle color document images with complex layouts such as forms, text overlaid on image, posters etc....

    [...]


References
More filters

Journal ArticleDOI
TL;DR: This survey provides a summary of color image segmentation techniques available now based on monochrome segmentation approaches operating in different color spaces and some novel approaches such as fuzzy method and physics-based method are investigated.
Abstract: Image segmentation is very essential and critical to image processing and pattern recognition. This survey provides a summary of color image segmentation techniques available now. Basically, color segmentation approaches are based on monochrome segmentation approaches operating in different color spaces. Therefore, we first discuss the major segmentation approaches for segmenting monochrome images: histogram thresholding, characteristic feature clustering, edge detection, region-based methods, fuzzy techniques, neural networks, etc.; then review some major color representation methods and their advantages/disadvantages; finally summarize the color image segmentation techniques using different color representations. The usage of color models for image segmentation is also discussed. Some novel approaches such as fuzzy method and physics-based method are investigated as well.

1,568 citations


"A color-based layout analysis to pr..." refers methods in this paper

  • ...We used CIELab since it is considered "visually uniform" because adjacent color samples represent equal intervals of visual perception [4]....

    [...]
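The excerpt above motivates CIELab by its approximate perceptual uniformity: Euclidean distances in that space roughly track perceived color differences. A minimal sketch of that idea follows. It assumes 8-bit sRGB input and a D65 white point, neither of which is stated in the source, and the names `rgb_to_lab` and `lab_distance` are hypothetical.

```python
import math

# Assumption: input is 8-bit sRGB with a D65 reference white.
# The source only says that CIELab was used, not how pixels were converted.
_SRGB_TO_XYZ = (
    (0.4124564, 0.3575761, 0.1804375),
    (0.2126729, 0.7151522, 0.0721750),
    (0.0193339, 0.1191920, 0.9503041),
)
_WHITE_D65 = (0.95047, 1.0, 1.08883)


def _linear(c8):
    """Undo the sRGB gamma curve for one 8-bit channel."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Nonlinear compression used in the CIELab definition."""
    d = 6 / 29
    return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29


def rgb_to_lab(rgb):
    """Convert an (R, G, B) triple in 0..255 to (L, a, b)."""
    r, g, b = (_linear(c) for c in rgb)
    xyz = (m[0] * r + m[1] * g + m[2] * b for m in _SRGB_TO_XYZ)
    fx, fy, fz = (_f(v / w) for v, w in zip(xyz, _WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def lab_distance(rgb1, rgb2):
    """Euclidean distance between two colors measured in CIELab."""
    return math.dist(rgb_to_lab(rgb1), rgb_to_lab(rgb2))
```

For example, `lab_distance((0, 0, 0), (255, 255, 255))` is close to 100, the full lightness range, since black and white differ almost only in L.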


Journal ArticleDOI
TL;DR: The requirements and components for a proposed Document Analysis System, which assists a user in encoding printed documents for computer processing, are outlined; several critical functions are investigated and their technical approaches discussed.
Abstract: This paper outlines the requirements and components for a proposed Document Analysis System, which assists a user in encoding printed documents for computer processing. Several critical functions have been investigated and the technical approaches are discussed. The first is the segmentation and classification of digitized printed documents into regions of text and images. A nonlinear, run-length smoothing algorithm has been used for this purpose. By using the regular features of text lines, a linear adaptive classification scheme discriminates text regions from others. The second technique studied is an adaptive approach to the recognition of the hundreds of font styles and sizes that can occur on printed documents. A preclassifier is constructed during the input process and used to speed up a well-known pattern-matching method for clustering characters from an arbitrary print source into a small sample of prototypes. Experimental results are included.

701 citations


Book
01 Apr 1996

346 citations


Book ChapterDOI
TL;DR: A new method for filling a color table is presented that produces pictures of similar quality as existing methods, but requires less memory and execution time.
Abstract: A new method for filling a color table is presented that produces pictures of similar quality as existing methods, but requires less memory and execution time. All colors of an image are inserted in an octree, and this octree is reduced from the leaves to the root in such a way that every pixel has a well defined maximum error. The algorithm is described in PASCAL notation.

328 citations


"A color-based layout analysis to pr..." refers methods in this paper

  • ...The quantization process follows the method proposed by [7] whose basic idea is to build an octree containing a maximum of K different leaves (a leaf corresponds to a color)....

    [...]
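The idea in the excerpt above, an octree reduced until it has at most K leaves, can be sketched as follows. This is a simplified illustration, not the algorithm of [7] as published: it builds the full tree first and then reduces it, whereas the original description reduces from the leaves toward the root with a bounded per-pixel error. The merge order (deepest node first, fewest pixels as tie-break) and all names are assumptions.

```python
class Node:
    """Octree node; a node without children is a leaf, i.e. one palette color."""

    def __init__(self):
        self.children = {}  # child index 0..7 -> Node
        self.count = 0      # pixels accumulated in this node
        self.r = self.g = self.b = 0  # channel sums of those pixels

    def is_leaf(self):
        return not self.children


def child_index(color, level):
    """Pick the child from bit (7 - level) of each 8-bit channel."""
    r, g, b = color
    s = 7 - level
    return (((r >> s) & 1) << 2) | (((g >> s) & 1) << 1) | ((b >> s) & 1)


def insert(root, color, depth=8):
    node = root
    for level in range(depth):
        node = node.children.setdefault(child_index(color, level), Node())
    node.count += 1
    node.r += color[0]
    node.g += color[1]
    node.b += color[2]


def leaves(node):
    if node.is_leaf():
        return [node]
    return [leaf for child in node.children.values() for leaf in leaves(child)]


def reducible(node, level=0):
    """Yield (level, node) for internal nodes whose children are all leaves."""
    if node.is_leaf():
        return
    if all(child.is_leaf() for child in node.children.values()):
        yield level, node
    else:
        for child in node.children.values():
            yield from reducible(child, level + 1)


def merge(node):
    """Fold all children into the node, turning it into a leaf."""
    for child in node.children.values():
        node.count += child.count
        node.r += child.r
        node.g += child.g
        node.b += child.b
    node.children = {}


def build_palette(pixels, k):
    """Return at most k average colors (k >= 1) for an iterable of RGB triples."""
    root = Node()
    for pixel in pixels:
        insert(root, pixel)
    while len(leaves(root)) > k:
        # Assumed policy: merge the deepest candidate, fewest pixels first.
        _, node = max(
            reducible(root),
            key=lambda t: (t[0], -sum(c.count for c in t[1].children.values())),
        )
        merge(node)
    return [
        (n.r // n.count, n.g // n.count, n.b // n.count)
        for n in leaves(root)
        if n.count
    ]
```

Each palette entry is the integer mean of the pixels that ended up in that leaf, so nearby colors that shared a parent are replaced by a single representative.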


Journal ArticleDOI
TL;DR: The innovative aspects described in the paper are: the preprocessing algorithm, the adaptive page segmentation, the acquisition of block classification rules using techniques from machine learning, the layout analysis based on general layout principles, and a method that uses document layout information for conversion to HTML/XML formats.
Abstract: The transformation of scanned paper documents to a form suitable for an Internet browser is a complex process that requires solutions to several problems. The application of an OCR to some parts of the document image is only one of the problems. In fact, the generation of documents in HTML format is easier when the layout structure of a page has been extracted by means of a document analysis process. The adoption of an XML format is even better, since it can facilitate the retrieval of documents in the Web. Nevertheless, an effective transformation of paper documents into this format requires further processing steps, namely document image classification and understanding. WISDOM++ is a document processing system that operates in five steps: document analysis, document classification, document understanding, text recognition with an OCR, and transformation into HTML/XML format. The innovative aspects described in the paper are: the preprocessing algorithm, the adaptive page segmentation, the acquisition of block classification rules using techniques from machine learning, the layout analysis based on general layout principles, and a method that uses document layout information for conversion to HTML/XML formats. A benchmarking of the system components implementing these innovative aspects is reported.

129 citations


"A color-based layout analysis to pr..." refers methods in this paper

  • ...WISDOM++ was originally developed to fully support the transformation of multi-page printed documents into XML format....

    [...]

  • ...We applied WISDOM++ with both the layout analysis methods to 108 document images in all belonging to 3 distinct classes, one for each archive (see Table 1)....

    [...]

  • ...WISDOM++ decomposes the document page in a hybrid way, since it combines the image segmentation and a bottom-up layout analysis method to assemble basic blocks into larger frames....

    [...]

  • ...WISDOM++, originally developed to process black-and-white (binary) images, has been extended to take full advantage of color information in image segmentation and layout analysis steps....

    [...]

  • ...In this framework, we applied the DIA system WISDOM++ [1] to digitized documents available in three national film archives, namely Deutsches Filminstitut, Filmarchiv Austria and Národní Filmový Archiv (Czech Republic)....

    [...]


Frequently Asked Questions (2)
Q1. What are the contributions in "A color-based layout analysis to process censorship cards of film archives"?

In this paper, taking into account lessons learned in the context of the IST project Collate, the authors propose a new method for image segmentation and layout analysis that takes full advantage of color information. 

For future work, the authors plan to evaluate the proposed approach in automatic/manual labeling.