Author

Santanu Chaudhury

Other affiliations: Central Electronics Engineering Research Institute, Indian Institute of Technology Delhi, Indian Statistical Institute ...

Bio: Santanu Chaudhury is an academic researcher from Indian Institute of Technology, Jodhpur. The author has contributed to research in topics: Ontology (information science) & Image segmentation. The author has an h-index of 28 and has co-authored 380 publications that have received 3,691 citations. Previous affiliations of Santanu Chaudhury include Central Electronics Engineering Research Institute & Indian Institute of Technology Delhi.

Papers published on a yearly basis: 1988, 1990–2023

Papers

Journal Article•DOI•

Multi-Task Real-Time Heterogeneous Traffic Capacity Analysis in Traffic Videos Using Faster RCNN and MLD-SORT

Annu Mor, Mukesh Kumar, Santanu Chaudhury

01 Jan 2022-Social Science Research Network

Proceedings Article•DOI•

Guided Compositional Generative Adversarial Networks

Anurag Tripathi¹, Siddharth Srivastava¹, Brejesh Lall¹, Santanu Chaudhury²•Institutions (2)

Indian Institute of Technology Delhi¹, Indian Institute of Technology, Jodhpur²

01 Oct 2019

TL;DR: This paper shows that by training a Generative Adversarial Network with raw image pixels as input, it can generate scenes which constitute the objects as well as generate the surrounding environment suitable for the combination of the input objects.

Abstract: In this paper, we propose to synthesize natural images from a set of input objects. The proposed technique generates a scene which has high correlation with the provided set of input objects while also maintaining the natural placement of objects within the scene. The technique consists of a generative adversarial network trained on a large corpus of objects and natural scenes. This is in contrast with earlier works where the objective was to generate a natural scene from a noise vector or by conditioning the network on a variable. However, such methods have limitations in their ability to control the objects within the generated images. On the contrary, we show that by training a Generative Adversarial Network with raw image pixels as input, we can generate scenes which constitute the objects as well as generate the surrounding environment suitable for the combination of the input objects. We provide qualitative and quantitative results on the challenging MS-COCO dataset to show the effectiveness of the proposed technique.

Proceedings Article•DOI•

Code-Borrowedness of English words in Hindi Language

Ram Mohan, Muhammad Arif, Jobin Wilson, Santanu Chaudhury¹, Brejesh Lall¹•Institutions (1)

Indian Institute of Technology Delhi¹

09 Mar 2017

TL;DR: The ground truth metric, based on user preference towards usage of a Hindi word in its Hindi form as opposed to its English form in a Hindi sentence, is determined through a survey.

Abstract: 1.1 Ground Truth: The user preference towards usage of a Hindi word in its Hindi form, as opposed to its English form, in a Hindi sentence is determined through a survey. From the survey responses, the difference between the total number of instances wherein the word is preferred in its Hindi form and the instances wherein it is preferred in its English form is calculated to form the ground truth metric. The survey responses for 12 words are available from 58 participants, to measure the effectiveness of our proposed metric.

Posted Content•

Understanding Character Recognition using Visual Explanations Derived from the Human Visual System and Deep Networks

Chetan Ralekar, Shubham Choudhary, Tapan K. Gandhi, Santanu Chaudhury

10 Aug 2021-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the congruence of information gathering strategies between humans and deep neural networks has been examined in a character recognition task, where the authors use the visual fixation maps obtained from the eye-tracking experiment as a supervisory input to align the model's focus on relevant character regions.

Abstract: Human observers engage in selective information uptake when classifying visual patterns. The same is true of deep neural networks, which currently constitute the best performing artificial vision systems. Our goal is to examine the congruence, or lack thereof, in the information-gathering strategies of the two systems. We have operationalized our investigation as a character recognition task. We have used eye-tracking to assay the spatial distribution of information hotspots for humans via fixation maps and an activation mapping technique for obtaining analogous distributions for deep networks through visualization maps. Qualitative comparison between visualization maps and fixation maps reveals an interesting correlate of congruence. The deep learning model considered similar regions in character, which humans have fixated in the case of correctly classified characters. On the other hand, when the focused regions are different for humans and deep nets, the characters are typically misclassified by the latter. Hence, we propose to use the visual fixation maps obtained from the eye-tracking experiment as a supervisory input to align the model's focus on relevant character regions. We find that such supervision improves the model's performance significantly and does not require any additional parameters. This approach has the potential to find applications in diverse domains such as medical analysis and surveillance in which explainability helps to determine system fidelity.

Proceedings Article•DOI•

Lighter and Faster Two-Pathway CMRNet for Video Saliency Prediction

Sai Phani Kumar Malladi, Jayanta Mukhopadhyay, Mohamed-Chaker Larabi, Santanu Chaudhury

16 Oct 2022

TL;DR: The authors propose a two-pathway CMRNet (TP-CMRNet) with effective feature integration of spatial and temporal domains at multiple scales for video saliency prediction.

Abstract: Existing dynamic saliency prediction models face challenges like inefficient spatio-temporal feature integration, ineffective multi-scale feature extraction, and lacking domain adaptation because of huge pre-trained backbone networks. In this paper, we propose a two pathway architecture with effective feature integration of spatial and temporal domains at multiple scales for video saliency prediction. Frame and optical flow pathways extract features from video frame and optical flow maps, respectively using a series of cross-concatenated multi-scale residual (CMR) blocks. We name this network as two-pathway CMRNet (TP-CMRNet). Every CMR block follows a feature fusion and attention module for merging features from two pathways and guiding the network to weigh salient regions, respectively. A bi-directional LSTM module is used for learning the task by looking at previous and next video frames. We build a simple decoder for feature reconstruction into the final attention map. TP-CMRNet is comprehensively evaluated using three benchmark datasets: DHF1K, Hollywood-2, and UCF sports. We observe that our model performs at par with other deep dynamic models. In particular, we outperform all the other models with a lesser number of model parameters and lower inference time.

Cited by

Journal Article•DOI•

Data clustering: a review

Anil K. Jain¹, M. N. Murty², Patrick J. Flynn³•Institutions (3)

Michigan State University¹, Indian Institute of Science², Ohio State University³

01 Sep 1999-ACM Computing Surveys

TL;DR: An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.

Abstract: Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities has made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.

14,054 citations

Journal Article•

Learning report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Computer vision: a modern approach = 计算机视觉: 一种现代的方法

David Forsyth, Jean Ponce

01 Jan 2004

TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.

Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Journal Article•DOI•

Online and off-line handwriting recognition: a comprehensive survey

Réjean Plamondon¹, Sargur N. Srihari²•Institutions (2)

École Normale Supérieure¹, University at Buffalo²

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms are described.

Abstract: Handwriting has continued to persist as a means of communication and recording information in day-to-day life even with the introduction of new technologies. Given its ubiquity in human transactions, machine recognition of handwriting has practical significance, as in reading handwritten notes in a PDA, in postal addresses on envelopes, in amounts in bank checks, in handwritten fields in forms, etc. This overview describes the nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms. Both the online case (which pertains to the availability of trajectory data during writing) and the off-line case (which pertains to scanned images) are considered. Algorithms for preprocessing, character and word recognition, and performance with practical systems are indicated. Other fields of application, like signature verification, writer authentification, handwriting learning tools are also considered.

2,653 citations

Reference Entry•DOI•

IEEE Transactions on Pattern Analysis and Machine Intelligence

King-Sun Fu

15 Oct 2004

2,118 citations
