A CRF Based Scheme for Overlapping Multi-colored Text Graphics Separation

doi:10.1109/ICDAR.2011.245

Proceedings Article

- pp 1215-1219

TLDR

A novel framework for segmenting documents with complex layouts that combines clustering with conditional random field (CRF) based modeling; it has been extensively tested on multi-colored document images in which text overlaps graphics or images.

Abstract:

In this paper, we propose a novel framework for segmentation of documents with complex layouts. The document segmentation is performed by a combination of clustering and conditional random field (CRF) based modeling. The bottom-up approach for segmentation assigns each pixel to a cluster plane based on color intensity. A CRF based discriminative model is learned to extract the local neighborhood information in different cluster/color planes. The final category assignment is done by a top-level CRF based on the semantic correlation learned across clusters. The proposed framework has been extensively tested on multi-colored document images with text overlapping graphics or images.
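The bottom-up step described in the abstract, assigning each pixel to a cluster plane by color, can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so plain k-means is used here purely as an assumption; the function name `color_planes` and its parameters are hypothetical, and the two CRF stages are not shown.

```python
import random

def color_planes(image, k, iters=20, seed=0):
    """Cluster pixels by color; return (labels, planes).

    image  : list of rows, each row a list of (r, g, b) tuples.
    labels : same shape as image, holding a cluster index per pixel.
    planes : k binary masks (lists of rows), one per cluster/color plane.

    Assumption: k-means stands in for the unspecified clustering step.
    """
    pixels = [p for row in image for p in row]
    # Seed the centers with k distinct colors so no two start identical.
    rng = random.Random(seed)
    centers = [tuple(map(float, c)) for c in rng.sample(sorted(set(pixels)), k)]

    def nearest(p):
        # Index of the center with the smallest squared color distance.
        return min(range(k),
                   key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))

    for _ in range(iters):
        assign = [nearest(p) for p in pixels]
        for j in range(k):
            members = [p for p, a in zip(pixels, assign) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))

    width = len(image[0])
    flat = [nearest(p) for p in pixels]  # assignment under the final centers
    labels = [flat[i:i + width] for i in range(0, len(flat), width)]
    planes = [[[int(v == j) for v in row] for row in labels] for j in range(k)]
    return labels, planes
```

Each returned plane is a binary mask over the page. In a pipeline like the one the abstract outlines, such planes would be the inputs on which the per-plane CRF models local neighborhood information.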

Citations

Journal Article

A comprehensive survey of mostly textual document segmentation algorithms since 2008

Sebastien Eskenazi, +2 more

- 01 Apr 2017 -

Pattern Recognition

TL;DR: This survey highlights the variety of the approaches that have been proposed for document image segmentation since 2008 and provides a clear typology of documents and of document images segmentation algorithms.

Proceedings Article

Research on the Text Detection and Extraction from Complex Images

Jian Zhang, +3 more

TL;DR: This paper seeks a new way to use existing methods to detect and extract text from born-digital images.

Book Chapter

Extraction of Doodles and Drawings from Manuscripts

Chandranath Adak, +1 more

TL;DR: An approach to separate the non-texts from texts of a manuscript, mainly in the form of doodles and drawings of some exceptional thinkers and writers, and a computational approach to recover the struck-out texts to reduce human effort.

Journal Article

Consensus-based clustering for document image segmentation

Soumyadeep Dey, +2 more

- 01 Dec 2016 -

International Journal on Document Analys...

TL;DR: A consensus-based clustering approach for document image segmentation that is used iteratively with a classifier to label each primitive block and shows that the dependency of classification performance on the training data is significantly reduced.

Journal Article

An intelligent character recognition method to filter spam images on cloud

Jun Chen, +5 more

TL;DR: STRHOG, an extended version of HOG, is helpful for filtering spam images on the cloud; for a fair comparison with other methods, a nearest neighbor classifier is used for the intelligent character recognition.

References

Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, +2 more

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, +3 more

Proceedings Article

Large Margin DAGs for Multiclass Classification

John Platt, +2 more

TL;DR: An algorithm, DAGSVM, is presented, which operates in a kernel-induced feature space and uses two-class maximal margin hyperplanes at each decision-node of the DDAG, which is substantially faster to train and evaluate than either the standard algorithm or Max Wins, while maintaining comparable accuracy to both of these algorithms.

Proceedings Article

A Visual Vocabulary for Flower Classification

M.-E. Nilsback, +1 more

TL;DR: It is demonstrated that by developing a visual vocabulary that explicitly represents the various aspects that distinguish one flower from another, it can overcome the ambiguities that exist between flower categories.

Journal Article

Comparison of texture features based on Gabor filters

S.E. Grigorescu, +2 more

- 01 Oct 2002 -

IEEE Transactions on Image Processing

TL;DR: The texture detection capabilities of the operators are compared; the grating cell operator is the only one that responds selectively to texture and does not give a false response to nontexture features such as object contours.

Related Papers (5)

Document Image Segmentation Using a 2D Conditional Random Field Model

Segmentation of color documents by line oriented clustering using spatial information

Twenty years of document image analysis in PAMI

Document segmentation using Relative Location Features

Page Segmentation for Historical Handwritten Document Images Using Conditional Random Fields