Author

Sangeet Aggarwal

Bio: Sangeet Aggarwal is an academic researcher from the Indian Institute of Technology Delhi. The author has contributed to research in the topics Optimization problem and Graphics. The author has an h-index of 1 and has co-authored 1 publication, which has received 4 citations.

Papers
Proceedings Article
16 Dec 2012
TL;DR: This paper presents a novel framework that uses the EM algorithm to learn optimal parameters for binarization and text/graphics segmentation, depending on the nature of the document image content.
Abstract: Most document pre-processing techniques are parameter dependent. In this paper, we present a novel framework that learns optimal parameters for binarization and text/graphics segmentation, depending on the nature of the document image content. The learning problem is formulated as an optimization problem and solved with the EM algorithm, which adaptively learns the optimal parameters. Experimental results establish the effectiveness of our approach.

4 citations
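
The abstract above does not state the model behind its EM formulation, so the following is only a stand-in illustration, not the authors' method. It runs the classic EM algorithm for a two-component 1-D Gaussian mixture over pixel intensities and takes the midpoint of the two learned means as a binarization threshold. The function names, the midpoint rule, and the sample intensities are all assumptions made for this sketch.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(values, iterations=50, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture to `values` with EM."""
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = max(sum((v - overall_mean) ** 2 for v in values) / n, min_var)
    # Initialise one component at each extreme of the data.
    means = [float(min(values)), float(max(values))]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * gaussian_pdf(v, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append((0.5, 0.5) if total == 0.0 else (p[0] / total, p[1] / total))

        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component received no mass; keep its old parameters
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            variances[k] = max(
                sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk,
                min_var,
            )
    return weights, means, variances


def binarization_threshold(means):
    """Midpoint between the two component means (an assumed, simple rule)."""
    return (means[0] + means[1]) / 2.0


# Hypothetical grayscale samples: dark ink pixels and bright paper pixels.
pixels = [30, 35, 40, 45, 50, 200, 205, 210, 215, 220]
weights, means, variances = em_two_gaussians(pixels)
threshold = binarization_threshold(means)
binary = [1 if p < threshold else 0 for p in pixels]  # 1 marks ink
```

Because the threshold is derived from the data rather than fixed in advance, it adapts to each image's intensity distribution, which is the general idea of content-dependent parameters that the abstract describes.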


Cited by
Proceedings Article
07 Apr 2014
TL;DR: A novel learning-based framework that extracts articles from newspaper images using a Fixed-Point Model, which uses contextual information and the features of each block to learn the layout of newspaper images and attains a contraction mapping to assign a unique label to every block.
Abstract: This paper presents a novel learning-based framework to extract articles from newspaper images using a Fixed-Point Model. The input to the system comprises blocks of text and graphics, obtained using standard image processing techniques. The fixed-point model uses contextual information and the features of each block to learn the layout of newspaper images and attains a contraction mapping to assign a unique label to every block. We use a hierarchical model that works in two stages. In the first stage, a semantic label (heading, sub-heading, text block, image, or caption) is assigned to each segmented block. In the second stage, these labels are used as input to group related blocks into news articles. Experimental results show the applicability of our algorithm to newspaper labeling and article extraction.

17 citations
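
The abstract above relies on a contraction mapping to reach a unique labeling, but it does not give the learned function. As a minimal sketch of the underlying idea only, the code below iterates a hand-written update in which each block's score mixes its own feature score with the average score of its neighbours. With a mixing weight `alpha` below 1 the update is a contraction in the maximum norm, so the iteration converges to the same fixed point from any start. The update rule, `alpha`, the scores, and the neighbour graph are all hypothetical and are not taken from the paper.

```python
def fixed_point_scores(unary, neighbors, alpha=0.5, init=None, tol=1e-9, max_iter=1000):
    """Iterate s_i = (1 - alpha) * unary[i] + alpha * mean(s_j for j in neighbors[i]).

    For 0 <= alpha < 1 this update is a contraction, so it has a unique
    fixed point. Returns (scores, iterations_used).
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1) for the update to be a contraction")
    scores = list(unary) if init is None else list(init)
    for iteration in range(1, max_iter + 1):
        updated = []
        for i, u in enumerate(unary):
            nbrs = neighbors[i]
            # A block with no neighbours falls back to its own feature score.
            context = sum(scores[j] for j in nbrs) / len(nbrs) if nbrs else u
            updated.append((1.0 - alpha) * u + alpha * context)
        change = max(abs(a - b) for a, b in zip(updated, scores))
        scores = updated
        if change < tol:
            return scores, iteration
    return scores, max_iter


# Hypothetical example: three blocks in a column, scored for "heading-ness".
unary = [0.9, 0.4, 0.1]            # per-block feature scores (assumed)
neighbors = [[1], [0, 2], [1]]     # block adjacency (assumed)
scores, used = fixed_point_scores(unary, neighbors)
labels = ["heading" if s >= 0.5 else "text" for s in scores]
```

Starting the iteration from a different `init` yields the same scores up to the tolerance, which is the practical meaning of a unique label assignment under a contraction.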

Proceedings Article
12 Jun 2015
TL;DR: A novel method to rebuild broken characters: the characters are thinned, the endpoints of the lines are obtained, and the line segments are rebuilt so as to preserve the degraded character.
Abstract: Degraded character recognition is one of the most challenging topics in the field of Kannada character recognition. Degraded characters that are broken and deformed have missing features and are difficult for any recognition method to handle. Rebuilding the degraded character is therefore very important for better recognition. This paper proposes a novel method to rebuild broken characters. The characters are thinned and the endpoints of the lines are obtained. The line segments are then rebuilt so as to preserve the degraded character. Experimental results on this method are presented to establish its efficiency.

7 citations
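
The thin-then-find-endpoints step in the abstract above can be sketched as follows, assuming the character has already been thinned to a one-pixel-wide skeleton. An endpoint is taken to be a foreground pixel with exactly one foreground 8-neighbour, which is a common convention and not necessarily the authors' definition. The abstract does not say how endpoints are paired or how segments are rebuilt, so `bridge_gap` simply draws a straight line between two endpoints chosen by the caller; both function names are hypothetical.

```python
def find_endpoints(skeleton):
    """Return (row, col) of skeleton pixels with exactly one 8-connected neighbour."""
    rows, cols = len(skeleton), len(skeleton[0])
    endpoints = []
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            count = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skeleton[rr][cc]:
                        count += 1
            if count == 1:
                endpoints.append((r, c))
    return endpoints


def bridge_gap(skeleton, start, end):
    """Return a copy of the skeleton with a straight line drawn from start to end."""
    result = [row[:] for row in skeleton]
    (r0, c0), (r1, c1) = start, end
    steps = max(abs(r1 - r0), abs(c1 - c0), 1)
    for t in range(steps + 1):
        # Linear interpolation between the two endpoints, rounded to the grid.
        r = r0 + round((r1 - r0) * t / steps)
        c = c0 + round((c1 - c0) * t / steps)
        result[r][c] = 1
    return result


# Hypothetical broken horizontal stroke with a two-pixel gap.
broken = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
ends = find_endpoints(broken)
repaired = bridge_gap(broken, (1, 2), (1, 5))
```

Choosing which endpoints to join is the hard part in practice, since the nearest endpoint may belong to the same stroke; that pairing strategy is left open here because the abstract does not describe it.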

Proceedings Article
11 Apr 2016
TL;DR: A novel EM-based framework that automatically selects optimal parameters for a pre-processing algorithm by estimating the quality of the document image: the E-step estimates the expected recognition accuracy, and the M-step computes the parameters that maximize it.
Abstract: The performance of most recognition engines for document images is affected by the quality of the image being processed and by the selection of parameter values for the pre-processing algorithm. Such parameters are usually chosen empirically. In this paper, we propose a novel framework for automatic selection of optimal parameters for the pre-processing algorithm by estimating the quality of the document image. Recognition accuracy can be used as a metric for document quality assessment. We learn filters that capture the script properties and degradation in order to predict recognition accuracy. An EM-based framework is formulated to iteratively learn optimal parameters for document image pre-processing. In the E-step, we estimate the expected accuracy using the current set of parameters and filters. In the M-step, we compute the parameters that maximize the expected recognition accuracy found in the E-step. The experiments validate the efficacy of the proposed methodology for document image pre-processing applications.

6 citations

Proceedings Article
24 Aug 2013
TL;DR: A novel framework is proposed for learning optimal parameters for text/graphics separation in the presence of the complex layouts of Indian newspapers.
Abstract: Digitization of newspaper articles is important for registering historical events. Layout analysis of Indian newspapers is a challenging task due to the presence of different font sizes, font styles, and random placement of text and non-text regions. In this paper, we propose a novel framework for learning optimal parameters for text/graphics separation in the presence of complex layouts. The learning problem is formulated as an optimization problem and solved with the EM algorithm, which learns optimal parameters depending on the nature of the document content.

3 citations