Author

Santanu Chaudhury

Bio: Santanu Chaudhury is an academic researcher from the Indian Institute of Technology, Jodhpur. The author has contributed to research in topics: Ontology (information science) & Image segmentation. The author has an h-index of 28 and has co-authored 380 publications receiving 3,691 citations. Previous affiliations of Santanu Chaudhury include Central Electronics Engineering Research Institute & Indian Institute of Technology Delhi.


Papers
Proceedings ArticleDOI
11 Apr 2016
TL;DR: A novel EM-based framework that automatically selects optimal parameters for a pre-processing algorithm by estimating the quality of the document image; the M-step computes the parameters that maximize the expected recognition accuracy found in the E-step.
Abstract: The performance of most recognition engines for document images is affected by the quality of the image being processed and by the selection of parameter values for the pre-processing algorithm. Usually such parameters are chosen empirically. In this paper, we propose a novel framework for automatic selection of optimal parameters for a pre-processing algorithm by estimating the quality of the document image. Recognition accuracy can be used as a metric for document quality assessment. We learn filters that capture the script properties and degradation to predict recognition accuracy. An EM-based framework has been formulated to iteratively learn optimal parameters for document image pre-processing. In the E-step, we estimate the expected accuracy using the current set of parameters and filters. In the M-step, we compute parameters to maximize the expected recognition accuracy found in the E-step. The experiments validate the efficacy of the proposed methodology for document image pre-processing applications.

6 citations
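
The alternation between estimating expected accuracy and re-selecting parameters described in the entry above can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: `predicted_accuracy` is a hypothetical stand-in for the learned quality filters, each image is reduced to a single made-up degradation score, and the only parameter is a hypothetical binarization threshold searched over a fixed grid.

```python
from statistics import mean


def predicted_accuracy(degradation, threshold):
    """Hypothetical stand-in for the learned accuracy predictor.

    Assumes accuracy peaks when the threshold matches a
    degradation-dependent optimum (an invented relationship).
    """
    optimum = 0.4 + 0.3 * degradation
    return max(0.0, 1.0 - abs(threshold - optimum))


def e_step(degradations, threshold):
    """Estimate the expected accuracy under the current parameter."""
    return mean(predicted_accuracy(d, threshold) for d in degradations)


def m_step(degradations, candidates, index):
    """Move to the neighbouring grid value with the highest expected accuracy.

    The current index is listed first so that ties keep the current value.
    """
    neighbours = [i for i in (index, index - 1, index + 1)
                  if 0 <= i < len(candidates)]
    return max(neighbours, key=lambda i: e_step(degradations, candidates[i]))


def select_threshold(degradations, candidates, start_index=0, max_iters=100):
    """Alternate the two steps until the parameter stops changing."""
    index = start_index
    for _ in range(max_iters):
        new_index = m_step(degradations, candidates, index)
        if new_index == index:
            break
        index = new_index
    return candidates[index], e_step(degradations, candidates[index])


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    print(select_threshold([0.0, 0.5, 1.0], grid))
```

In a real system the predictor would be learned from recognition results, and the search space would cover every pre-processing parameter rather than a single threshold.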

Proceedings ArticleDOI
01 Nov 2017
TL;DR: This paper proposes a noise-resilient SR framework for text images and recognizes the text using a deep BLSTM network trained on high-resolution images; in tests, the OCR performance on the noise-resilient super-resolved images is on par with that on the original HR images.
Abstract: Recognizing text from noisy low-resolution (LR) images is extremely challenging and is an open problem for the computer vision community. Super-resolving a noisy LR text image results in a noisy high-resolution (HR) text image, because super-resolution (SR) introduces spatial correlation in the noise, which then cannot be removed successfully. Traditional noise-resilient text image super-resolution methods apply a denoising algorithm prior to text SR, but the denoising process loses some high-frequency details, so the output HR image has missing information (texture details and edges). This paper proposes a noise-resilient SR framework for text images and recognizes the text using a deep BLSTM network trained on high-resolution images. The proposed end-to-end deep learning based framework for noise-resilient text image SR simultaneously performs image denoising and super-resolution while preserving the details that would otherwise be lost. A stacked sparse denoising auto-encoder (SSDA) is learned for LR text image denoising, and our proposed coupled deep convolutional auto-encoder (CDCA) is learned for text image super-resolution. The pretrained weights of both networks serve as initial weights for the end-to-end framework during fine-tuning, and the network is jointly optimized for both tasks. We tested on several Indian language datasets, and the OCR performance on the noise-resilient super-resolved images is on par with that on the original HR images.

6 citations

Journal ArticleDOI
TL;DR: The prevalence of mental disability was found higher among males than among females and among individuals with low socioeconomic status, and there is scope of community-based rehabilitation of the mentally disabled.
Abstract: Background: In the present era, mental disability is a major public health problem in society. Many mental disabilities are correctable if detected early. Objectives: To assess the prevalence and pattern of mental disability. Materials and Methods: Community-based cross-sectional study. Patients of all age groups in the age range of 0-60 years were randomly selected from 10 blocks of 2 districts, viz., Ranchi and Hazaribagh. Thirty villages from each block were taken for the study. The study was conducted by making house-to-house visits, interviewing and examining all the individuals in the selected families using a pre-tested questionnaire. Statistical Analysis: The analysis was done using proportions. Results and Conclusion: The prevalence of mental disability was found to be higher among males (67.9%) than among females (32.1%). The prevalence rate was higher among the productive groups and among individuals with low socioeconomic status. There is scope for community-based rehabilitation of the mentally disabled.

6 citations

Proceedings ArticleDOI
05 Nov 2015
TL;DR: An assessment of the assumptions of the Multivariate Autoregressive (MAR) framework, which is employed for evaluating directionality among fMRI time-series recorded during a Sensory-Motor (SM) task, indicates the inadequacy of MAR models for finding directional interactions among different task-activated regions of the brain.
Abstract: Directionality analysis of time-series recorded from task-activated regions-of-interest (ROIs) during functional Magnetic Resonance Imaging (fMRI) has helped in gaining insights into complex human behavior and human brain functioning. The most widely used standard method for evaluating directionality, Granger Causality, employs linear regression modeling of temporal processes. Such a parameter-driven approach rests on various underlying assumptions about the data. Shortcomings can arise when misleading conclusions are reached after exploring data for which the assumptions are violated. In this study, we assess the assumptions of the Multivariate Autoregressive (MAR) framework, which is employed for evaluating directionality among fMRI time-series recorded during a Sensory-Motor (SM) task. The fMRI time-series here is an averaged time-series from a user-defined ROI of multiple voxels. The aim is to establish a step-by-step procedure, using statistical methods in conjunction with graphical methods, to check the validity of MAR models specifically in the context of directionality analysis of fMRI data, which to the best of our knowledge has not been done previously. In our case of an SM task (block design paradigm), the assumptions are violated, indicating the inadequacy of MAR models for finding directional interactions among different task-activated regions of the brain.

6 citations
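
One assumption behind autoregressive modeling, as discussed in the entry above, is that the residuals left after fitting are uncorrelated over time. The sketch below illustrates that single check in a deliberately simplified setting. It is an assumption-laden illustration, not the procedure from the paper: it uses a univariate AR(1) model instead of a multivariate one, simulated data instead of fMRI recordings, and only the lag-1 residual autocorrelation as a diagnostic. All function names are hypothetical.

```python
import random


def fit_ar1(series):
    """Least-squares AR(1) coefficient for a zero-mean series (no intercept)."""
    num = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den


def residuals(series, phi):
    """One-step prediction errors of the fitted AR(1) model."""
    return [series[t] - phi * series[t - 1] for t in range(1, len(series))]


def lag1_autocorrelation(values):
    """Sample autocorrelation at lag 1; near zero for white residuals."""
    m = sum(values) / len(values)
    centred = [v - m for v in values]
    num = sum(centred[t] * centred[t - 1] for t in range(1, len(centred)))
    den = sum(c * c for c in centred)
    return num / den


def simulate_ar(coeffs, n, seed):
    """Simulate an AR(p) process with unit-variance Gaussian noise."""
    rng = random.Random(seed)
    series = [0.0] * len(coeffs)  # zero start-up values, dropped on return
    for _ in range(n):
        value = sum(c * series[-k - 1] for k, c in enumerate(coeffs))
        series.append(value + rng.gauss(0.0, 1.0))
    return series[len(coeffs):]


if __name__ == "__main__":
    well_specified = simulate_ar([0.6], 5000, seed=1)       # true AR(1)
    misspecified = simulate_ar([0.5, -0.5], 5000, seed=1)   # true AR(2)
    for name, data in (("AR(1) data", well_specified),
                       ("AR(2) data", misspecified)):
        phi = fit_ar1(data)
        r1 = lag1_autocorrelation(residuals(data, phi))
        print(name, "phi =", round(phi, 3), "residual r1 =", round(r1, 3))
```

When the AR(1) model is fitted to data generated by an AR(2) process, the residual autocorrelation moves clearly away from zero, which is the kind of violation that makes conclusions drawn from the fitted model unreliable.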

Proceedings ArticleDOI
23 Aug 2015
TL;DR: A unified framework that combines a language model with multiple preprocessing hypotheses for word recognition from bilingual document images is presented; the language model verifies each alternative and chooses the best recognized sequence.
Abstract: Script-based features are highly discriminative for text segmentation and recognition, so they are widely used in Optical Character Recognition (OCR) problems. However, the use of script-dependent features prevents such architectures from being adapted directly to another script. With script-independent systems, this problem can be solved to a certain extent for monolingual documents. The problem is aggravated for multilingual documents, as it is very difficult for a single classifier to learn many scripts. Generally, a script identification module identifies text segments and the script-dependent classifier is selected accordingly. This paper presents a unified framework of language model and multiple preprocessing hypotheses for word recognition from bilingual document images. Prior to text recognition, preprocessing steps such as binarization and segmentation are required for ease of recognition, but these steps introduce a large combinatorial error that propagates to the final recognition accuracy. In this paper, we use multiple preprocessing routines as alternate hypotheses and use a language model to verify each alternative and choose the best recognized sequence. We test this architecture for word recognition on Kannada-English and Telugu-English bilingual documents and achieve better recognition rates than single methods using the same classifier.

6 citations
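
The idea of treating several preprocessing routines as competing hypotheses and letting a language model pick among them, as described in the entry above, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' system: the language model is a character bigram model with add-one smoothing trained on a handful of English words, the recognizer outputs are hard-coded, and the routine labels (`otsu`, `sauvola`, `adaptive`) are hypothetical names for alternative binarization settings.

```python
import math
from collections import Counter


class CharBigramModel:
    """Character bigram language model with add-one smoothing."""

    def __init__(self, words):
        self.pair_counts = Counter()
        self.context_counts = Counter()
        symbols = {"^", "$"}  # start and end markers
        for word in words:
            padded = "^" + word + "$"
            symbols.update(padded)
            for a, b in zip(padded, padded[1:]):
                self.pair_counts[(a, b)] += 1
                self.context_counts[a] += 1
        self.vocab_size = len(symbols)

    def log_prob(self, word):
        """Smoothed log-probability of a word, including boundary markers."""
        padded = "^" + word + "$"
        total = 0.0
        for a, b in zip(padded, padded[1:]):
            numerator = self.pair_counts[(a, b)] + 1
            denominator = self.context_counts[a] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def pick_best(hypotheses, model):
    """Return the (routine, text) pair whose text the model scores highest."""
    return max(hypotheses, key=lambda h: model.log_prob(h[1]))


if __name__ == "__main__":
    model = CharBigramModel(["hello", "help", "yellow", "jelly"])
    # Hypothetical recognizer outputs for the same word image.
    hypotheses = [("otsu", "hcllo"), ("sauvola", "hello"), ("adaptive", "he1lo")]
    print(pick_best(hypotheses, model))
```

The hypotheses here all have the same length, so raw log-probabilities are comparable; a fuller version would likely normalize by length and combine the language model score with the recognizer's own confidence.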


Cited by
Journal ArticleDOI
TL;DR: An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.
Abstract: Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.

14,054 citations

01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives a general view of the entire computer vision enterprise and also offers sufficient detail to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Journal ArticleDOI
TL;DR: The nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms are described.
Abstract: Handwriting has persisted as a means of communication and recording information in day-to-day life even with the introduction of new technologies. Given its ubiquity in human transactions, machine recognition of handwriting has practical significance, as in reading handwritten notes in a PDA, postal addresses on envelopes, amounts in bank checks, handwritten fields in forms, etc. This overview describes the nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms. Both the online case (which pertains to the availability of trajectory data during writing) and the off-line case (which pertains to scanned images) are considered. Algorithms for preprocessing, character and word recognition, and performance with practical systems are indicated. Other fields of application, like signature verification, writer authentication, and handwriting learning tools, are also considered.

2,653 citations

Reference EntryDOI
15 Oct 2004

2,118 citations