Proceedings Article

A CRF Based Scheme for Overlapping Multi-colored Text Graphics Separation

TLDR
A novel framework for segmenting documents with complex layouts that combines clustering with conditional random field (CRF) based modeling; it has been extensively tested on multi-colored document images in which text overlaps graphics or images.
Abstract
In this paper, we propose a novel framework for segmenting documents with complex layouts. Segmentation is performed by a combination of clustering and conditional random field (CRF) based modeling. The bottom-up stage assigns each pixel to a cluster plane based on its color intensity. A CRF-based discriminative model is then learned to extract local neighborhood information within the different cluster (color) planes. The final category assignment is made by a top-level CRF based on the semantic correlation learned across clusters. The proposed framework has been extensively tested on multi-colored document images in which text overlaps graphics or images.
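
The bottom-up stage described in the abstract, in which each pixel is assigned to a cluster plane by color, can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so plain k-means on RGB values is an assumption here, as are the function names `cluster_planes`, `nearest`, and `sq_dist` and the simple "first k distinct colors" initialization. The per-plane and top-level CRF stages are not covered.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two colours."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p: Sequence[float], centres: List[Color]) -> int:
    """Index of the cluster centre closest to pixel colour p."""
    return min(range(len(centres)), key=lambda c: sq_dist(p, centres[c]))


def cluster_planes(image, k: int, iters: int = 10):
    """Assign every pixel to one of k colour clusters (k-means, an assumption)
    and return the label map plus one binary plane per cluster."""
    pixels = [tuple(p) for row in image for p in row]

    # Hypothetical, deterministic initialisation: first k distinct colours.
    centres: List[Color] = []
    for p in pixels:
        if p not in centres:
            centres.append(p)
        if len(centres) == k:
            break
    if len(centres) < k:
        raise ValueError("image has fewer than k distinct colours")

    for _ in range(iters):
        # Assignment step: each pixel goes to its nearest centre.
        labels = [nearest(p, centres) for p in pixels]
        # Update step: each centre moves to the mean colour of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centres[c] = tuple(sum(ch) / len(members) for ch in zip(*members))

    # Final assignment against the last set of centres.
    labels = [nearest(p, centres) for p in pixels]
    width = len(image[0])
    label_map = [labels[r:r + width] for r in range(0, len(labels), width)]
    # Plane c holds 1 where the pixel belongs to cluster c and 0 elsewhere.
    planes = [[[1 if lab == c else 0 for lab in row] for row in label_map]
              for c in range(k)]
    return label_map, planes
```

Each pixel ends up in exactly one plane, so a later per-plane model can look at one color layer at a time, for example red text separated from the blue graphic it overlaps.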


Citations
Journal Article

A comprehensive survey of mostly textual document segmentation algorithms since 2008

TL;DR: This survey highlights the variety of approaches that have been proposed for document image segmentation since 2008 and provides a clear typology of documents and of document image segmentation algorithms.
Proceedings Article

Research on the Text Detection and Extraction from Complex Images

TL;DR: This paper seeks a new way to use existing methods to detect and extract text from born-digital images.
Book Chapter

Extraction of Doodles and Drawings from Manuscripts

TL;DR: An approach to separate the non-texts from texts of a manuscript, mainly in the form of doodles and drawings of some exceptional thinkers and writers, and a computational approach to recover the struck-out texts to reduce human effort.
Journal Article

Consensus-based clustering for document image segmentation

TL;DR: A consensus-based clustering approach for document image segmentation that is used iteratively with a classifier to label each primitive block and shows that the dependency of classification performance on the training data is significantly reduced.
Journal Article

An intelligent character recognition method to filter spam images on cloud

TL;DR: STRHOG, an extended version of HOG, is proposed to help filter spam images on the cloud; for a fair comparison with other methods, a nearest neighbor classifier is used for the intelligent character recognition.