Syntactic and Semantic Labeling of Hierarchically Organized Document Image Components of Indian Scripts
TL;DR: A document image analysis system that performs segmentation, content characterization, and semantic labeling of components, with promising results reported for semantic segmentation of over 30 categories of documents in Indian scripts.
Abstract: In this paper we describe our document image analysis system, which performs segmentation, content characterization, and semantic labeling of components. Segmentation is done using white spaces and yields the segmented components arranged in a hierarchy. Semantic labeling is done using domain knowledge, which is specified where possible in the form of a document model applicable to a class of documents. The novelty of the system lies in the suite of methods it employs, which are capable of handling documents in Indian scripts. We have obtained promising results for semantic segmentation of over 30 categories of documents in Indian scripts.
Citations
3 citations
Cites methods from "Syntactic and Semantic Labeling of ..."
...The method used for evaluating the performance of our algorithm is based on counting the number of matches between the pixels segmented by the algorithm and the pixels in the ground truth [11]....
[...]
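The pixel-counting evaluation mentioned in the excerpt above can be sketched as follows. This is a minimal illustration, not the citing paper's actual metric: the function name `pixel_match_rate`, the use of per-pixel integer labels, and the normalization by the total number of pixels are all assumptions.

```python
def pixel_match_rate(predicted, ground_truth):
    """Fraction of pixels whose predicted label equals the ground-truth label.

    Both arguments are 2D lists of labels with identical shape.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("label maps must have the same number of rows")
    matches = 0
    total = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        if len(pred_row) != len(gt_row):
            raise ValueError("label maps must have the same row lengths")
        for p, g in zip(pred_row, gt_row):
            matches += int(p == g)  # count one match per agreeing pixel
            total += 1
    return matches / total if total else 0.0
```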
References
701 citations
"Syntactic and Semantic Labeling of ..." refers methods in this paper
...Well known methods are XY-cut [10], the smearing algorithm [15], white space analysis [2], Docstrum [11], the Voronoi-diagram based approach [5], and other variants like [6], [8], [13]....
[...]
628 citations
"Syntactic and Semantic Labeling of ..." refers methods in this paper
...Well known methods are XY-cut [10], the smearing algorithm [15], white space analysis [2], Docstrum [11], the Voronoi-diagram based approach [5], and other variants like [6], [8], [13]....
[...]
624 citations
456 citations
"Syntactic and Semantic Labeling of ..." refers methods in this paper
...Our method differs from XY-cut since the decomposed blocks need not be rectangles but can be general polygons....
[...]
...Recursive XY-cut produces a hierarchical organization of document components....
[...]
...Well known methods are XY-cut [10], the smearing algorithm [15], white space analysis [2], Docstrum [11], the Voronoi-diagram based approach [5], and other variants like [6], [8], [13]....
[...]
...The output of the recursive procedure is a hierarchical arrangement of the segmented blocks, as can be seen in Fig. 2. It is worthwhile to compare our approach with the XY-cut algorithm, which also makes use of white spaces....
[...]
275 citations
"Syntactic and Semantic Labeling of ..." refers methods in this paper
...Well known methods are XY-cut [10], the smearing algorithm [15], white space analysis [2], Docstrum [11], the Voronoi-diagram based approach [5], and other variants like [6], [8], [13]....
[...]