
Author

Ritu Garg

Other affiliations: Indian Institutes of Technology
Bio: Ritu Garg is an academic researcher from Indian Institute of Technology Delhi. The author has contributed to research on the topics of graphics and document layout analysis, has an h-index of 5, and has co-authored 14 publications receiving 61 citations. Previous affiliations of Ritu Garg include Indian Institutes of Technology.
Papers

Proceedings Article
18 Sep 2011
TL;DR: A novel framework segments documents with complex layouts by combining clustering with conditional random field (CRF) based modeling; it has been extensively tested on multi-colored document images with text overlapping graphics/images.
Abstract: In this paper, we propose a novel framework for segmentation of documents with complex layouts. The document segmentation is performed by a combination of clustering and conditional random fields (CRF) based modeling. The bottom-up approach for segmentation assigns each pixel to a cluster plane based on color intensity. A CRF based discriminative model is learned to extract the local neighborhood information in different cluster/color planes. The final category assignment is done by a top-level CRF based on the semantic correlation learned across clusters. The proposed framework has been extensively tested on multi-colored document images with text overlapping graphics/images.

12 citations


Proceedings Article
17 Sep 2011
TL;DR: The proposed framework takes a top-down approach, performing page, block/paragraph and word level script identification in multiple stages and using texture and shape based information embedded in the documents at different levels for feature extraction.
Abstract: Script identification in a multi-lingual document environment has numerous applications in the field of document image analysis, such as indexing and retrieval or as an initial step towards optical character recognition. In this paper, we propose a novel hierarchical framework for script identification in bi-lingual documents. The framework presents a top-down approach by performing page, block/paragraph and word level script identification in multiple stages. We utilize texture and shape based information embedded in the documents at different levels for feature extraction. The prediction task at different levels of the hierarchy is performed by a Support Vector Machine (SVM) and a rejection based classifier defined using AdaBoost. Experimental evaluation of the proposed concept on document collections of Hindi/English and Bangla/English scripts has shown promising results.

11 citations


Proceedings Article
Gaurav Harit, Ritu Garg, Santanu Chaudhury (1 institution)
05 Mar 2007
TL;DR: An integrated scheme for document image compression is presented which preserves the layout structure, and still allows the display of textual portions to adapt to the user preferences and screen area, and derives an SVG representation of the complete document image.
Abstract: We present an integrated scheme for document image compression which preserves the layout structure, and still allows the display of textual portions to adapt to the user preferences and screen area. We encode the layout structure of the document images in an XML representation. The textual components and picture components are compressed separately into different representations. We derive an SVG (scalable vector graphics) representation of the complete document image. Compression is achieved since the word-images are encoded using specifications for geometric primitives that compose a word. A document rendered from its SVG representation can be adapted for display and interactive access through common browsers on desktop as well as mobile devices. We demonstrate the effectiveness of the proposed scheme for document access

5 citations


Proceedings Article
Ritu Garg, Santanu Chaudhury (1 institution)
11 Apr 2016
TL;DR: A novel EM based framework automatically selects optimal parameters for the pre-processing algorithm by estimating the quality of the document image: the E-step estimates the expected recognition accuracy, and the M-step computes parameters that maximize it.
Abstract: Performance of most of the recognition engines for document images is affected by the quality of the image being processed and the selection of parameter values for the pre-processing algorithm. Usually the choice of such parameters is made empirically. In this paper, we propose a novel framework for automatic selection of optimal parameters for the pre-processing algorithm by estimating the quality of the document image. Recognition accuracy can be used as a metric for document quality assessment. We learn filters that capture the script properties and degradation to predict recognition accuracy. An EM based framework has been formulated to iteratively learn optimal parameters for document image pre-processing. In the E-step, we estimate the expected accuracy using the current set of parameters and filters. In the M-step, we compute parameters to maximize the expected recognition accuracy found in the E-step. The experiments validate the efficacy of the proposed methodology for document image pre-processing applications.

5 citations


Proceedings Article
Ritu Garg, S. Indu, Santanu Chaudhury (1 institution)
15 Dec 2011
TL;DR: The paper formulates camera and light source placement as an optimization problem with multiple objective functions and uses a Multi-Objective Genetic Algorithm to maximize camera coverage with optimum illumination of the sensing space.
Abstract: Optimal placement of visual sensors along with good lighting conditions is indispensable for the successful execution of surveillance applications. Limited field-of-view, depth-of-field, and occlusion due to the presence of different objects in the scene form the major constraints for visual sensor placement. Over/under exposed objects, shadowing and light rays directly incident on the camera lens are some of the constraints for light source placement. Because of the nature of the constraints and the complexity of the problem, the placement problem is considered to be a multi-objective global optimization problem. The paper outlines the camera and light source location optimization problem with multiple objective functions. A Multi-Objective Genetic Algorithm is used to maximize the camera coverage with optimum illumination of the sensing space.

5 citations


Cited by

Journal Article
TL;DR: This work proposes a new active learning method for classification, which handles label noise without relying on multiple oracles (i.e., crowdsourcing), and proposes a strategy that selects (for labeling) instances with a high influence on the learned model.
Abstract: We propose a new active learning method for classification, which handles label noise without relying on multiple oracles (i.e., crowdsourcing). We propose a strategy that selects (for labeling) instances with a high influence on the learned model. An instance x is said to have a high influence on the model h, if training h on x (with label $y = h(x)$) would result in a model that greatly disagrees with h on labeling other instances. Then, we propose another strategy that selects (for labeling) instances that are highly influenced by changes in the learned model. An instance x is said to be highly influenced, if training h with a set of instances would result in a committee of models that agree on a common label for x but disagree with h(x). We compare the two strategies and we show, on different publicly available datasets, that selecting instances according to the first strategy while eliminating noisy labels according to the second strategy, greatly improves the accuracy compared to several benchmarking methods, even when a significant amount of instances are mislabeled.

78 citations


Journal Article
TL;DR: This survey highlights the variety of approaches proposed for document image segmentation since 2008 and provides a clear typology of documents and of document image segmentation algorithms.
Abstract: In document image analysis, segmentation is the task that identifies the regions of a document. The increasing number of applications of document analysis requires a good knowledge of the available technologies. This survey highlights the variety of the approaches that have been proposed for document image segmentation since 2008. It provides a clear typology of documents and of document image segmentation algorithms. We also discuss the technical limitations of these algorithms, the way they are evaluated and the general trends of the community.

71 citations


Journal Article
TL;DR: Various feature extraction and classification techniques associated with the OSI of the Indic scripts are discussed in this survey and it is hoped that this survey will serve as a compendium not only for researchers in India, but also for policymakers and practitioners in India.
Abstract: Offline Script Identification (OSI) facilitates many important applications such as automatic archiving of multilingual documents, searching online/offline archives of document images and for the selection of script specific Optical Character Recognition (OCR) in a multilingual environment. In a multilingual country like India, a document containing text words in more than one language is a common scenario. A state-of-the-art survey about the techniques available in the area of OSI for Indic scripts would be of a great aid to the researchers. Hence, a sincere attempt is made in this article to discuss the advancements reported in the literature during the last few decades. Various feature extraction and classification techniques associated with the OSI of the Indic scripts are discussed in this survey. We hope that this survey will serve as a compendium not only for researchers in India, but also for policymakers and practitioners in India. It will also help to accomplish a target of bringing the researchers working on different Indic scripts together. Taking the recent developments in OSI of Indian regional scripts into consideration, this article will provide a better platform for future research activities.

42 citations


Journal Article
TL;DR: A survey of past research on character based as well as keyword based approaches for retrieving information from document images, offering insights into the strengths and weaknesses of current techniques and guidance on areas that future work on document image retrieval could address.
Abstract: This paper attempts to provide a survey of past research on character based as well as keyword based approaches used for retrieving information from document images. This survey also provides insights into the strengths and weaknesses of current techniques, the relevancy between techniques, and guidance in choosing the area that future work on document image retrieval could address.

39 citations


Proceedings Article
Deepak Arya, C. V. Jawahar, Chakravorty Bhagvati, Tushar Patnaik, and 4 more authors (7 institutions)
17 Sep 2011
TL;DR: The project is an attempt to implement an integrated platform for OCR of different Indian languages; it is currently being enhanced to handle space and time constraints, achieve higher recognition accuracies and add new functionalities.
Abstract: This paper presents an integration and testing scheme for managing a large Multilingual OCR Project. The project is an attempt to implement an integrated platform for OCR of different Indian languages. Software engineering, workflow management and testing processes have been discussed in this paper. The OCR has now been experimentally deployed for some specific applications and is currently being enhanced for handling the space and time constraints, achieving higher recognition accuracies and adding new functionalities.

25 citations


Network Information
Related Authors (3)
Ehtesham Hassan: 45 papers, 363 citations, 92% related
Santanu Chaudhury: 380 papers, 3.6K citations, 77% related
Gaurav Harit: 73 papers, 523 citations, 55% related
Performance
Metrics

Author's H-index: 5
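
The h-index is the largest number h such that h of an author's papers have each received at least h citations. The sketch below illustrates the calculation using only the five citation counts listed under Papers on this page (12, 11, 5, 5, 5). The author's remaining papers are not shown here, so this is an illustration rather than a recomputation of the site's metric, and the function name `h_index` is a hypothetical choice for this example.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts of the five papers listed on this page only;
# the profile reports 14 publications in total, the others are not shown.
listed_citations = [12, 11, 5, 5, 5]
print(h_index(listed_citations))  # prints 5
```

For these five papers alone the result is 5, which agrees with the reported value.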

No. of papers from the Author in previous years
Year    Papers
2016    2
2015    1
2013    2
2012    1
2011    3
2009    4