scispace - formally typeset
Author

Vinay P. Namboodiri

Bio: Vinay P. Namboodiri is an academic researcher at the University of Bath. His research focuses on computer science, particularly convolutional neural networks. He has an h-index of 22 and has co-authored 190 publications receiving 1,869 citations. His previous affiliations include Bell Labs and the Indian Institute of Technology Bombay.


Papers
Proceedings ArticleDOI
18 Jun 2018
TL;DR: MAD-GAN is a multi-agent GAN architecture incorporating multiple generators and one discriminator; the discriminator is designed so that, in addition to distinguishing real from fake samples, it must also identify which generator produced a given fake sample.
Abstract: We propose MAD-GAN, an intuitive generalization to the Generative Adversarial Networks (GANs) and its conditional variants to address the well known problem of mode collapse. First, MAD-GAN is a multi-agent GAN architecture incorporating multiple generators and one discriminator. Second, to enforce that different generators capture diverse high probability modes, the discriminator of MAD-GAN is designed such that along with finding the real and fake samples, it is also required to identify the generator that generated the given fake sample. Intuitively, to succeed in this task, the discriminator must learn to push different generators towards different identifiable modes. We perform extensive experiments on synthetic and real datasets and compare MAD-GAN with different variants of GAN. We show high quality diverse sample generations for challenging tasks such as image-to-image translation and face generation. In addition, we also show that MAD-GAN is able to disentangle different modalities when trained using highly challenging diverse-class dataset (e.g. dataset with images of forests, icebergs, and bedrooms). In the end, we show its efficacy on the unsupervised feature representation task.
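The discriminator objective described in the abstract — distinguish real from fake and identify the responsible generator — amounts to a (K+1)-way classification over K generator identities plus a "real" class. The following is a minimal numpy sketch of that idea; the function names and formulation are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def madgan_discriminator_loss(logits, labels):
    """Cross-entropy over K+1 classes: class k (0 <= k < K) means
    'fake, produced by generator k'; class K means 'real'."""
    probs = softmax(logits)
    n = logits.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

def generator_loss(logits, num_generators):
    """Each generator tries to make the discriminator assign its
    samples to the 'real' class (index K)."""
    probs = softmax(logits)
    real_class = num_generators
    return -np.log(probs[:, real_class] + 1e-12).mean()
```

To succeed at this loss, the discriminator must separate the generators' outputs, which in turn pushes the generators toward distinct modes.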

290 citations

Proceedings ArticleDOI
TL;DR: This work investigates the problem of lip-syncing a talking face video of an arbitrary identity to match a target speech segment, identifies the key reasons current approaches fail on unconstrained videos, and resolves them by learning from a powerful lip-sync discriminator.
Abstract: In this work, we investigate the problem of lip-syncing a talking face video of an arbitrary identity to match a target speech segment. Current works excel at producing accurate lip movements on a static image or videos of specific people seen during the training phase. However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio. We identify key reasons pertaining to this and hence resolve them by learning from a powerful lip-sync discriminator. Next, we propose new, rigorous evaluation benchmarks and metrics to accurately measure lip synchronization in unconstrained videos. Extensive quantitative evaluations on our challenging benchmarks show that the lip-sync accuracy of the videos generated by our Wav2Lip model is almost as good as real synced videos. We provide a demo video clearly showing the substantial impact of our Wav2Lip model and evaluation benchmarks on our website: \url{this http URL}. The code and models are released at this GitHub repository: \url{this http URL}. You can also try out the interactive demo at this link: \url{this http URL}.
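The "powerful lip-sync discriminator" acts as a frozen expert that scores how well a window of video frames matches the accompanying audio, typically via the similarity of learned embeddings. The sketch below illustrates that scoring and the resulting generator penalty; the embedding inputs and function names are hypothetical stand-ins, not the released Wav2Lip code.

```python
import numpy as np

def sync_probability(video_emb, audio_emb, eps=1e-8):
    """Cosine-similarity-based sync score mapped into [0, 1],
    in the style of SyncNet-like audio-visual experts."""
    v = video_emb / (np.linalg.norm(video_emb) + eps)
    a = audio_emb / (np.linalg.norm(audio_emb) + eps)
    cos = float(v @ a)
    return (cos + 1.0) / 2.0

def expert_sync_loss(video_embs, audio_embs):
    """Binary cross-entropy against a 'synced' label of 1: the
    generator is penalized whenever the frozen expert judges its
    output frames to be out of sync with the target audio."""
    probs = np.array([sync_probability(v, a)
                      for v, a in zip(video_embs, audio_embs)])
    return float(-np.log(probs + 1e-12).mean())
```

Because the expert is pre-trained and frozen, the generator cannot fool it by degrading it, and must instead produce genuinely synchronized lip movements.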

251 citations

Proceedings ArticleDOI
15 Jun 2019
TL;DR: Simply incorporating the probabilistic certainty of the discriminator while training the classifier yields state-of-the-art results on various datasets compared with recent methods.
Abstract: In this paper, we aim to solve for unsupervised domain adaptation of classifiers where we have access to label information for the source domain while these are not available for a target domain. While various methods have been proposed for solving these including adversarial discriminator based methods, most approaches have focused on the entire image based domain adaptation. In an image, there would be regions that can be adapted better, for instance, the foreground object may be similar in nature. To obtain such regions, we propose methods that consider the probabilistic certainty estimate of various regions and specific focus on these during classification for adaptation. We observe that just by incorporating the probabilistic certainty of the discriminator while training the classifier, we are able to obtain state of the art results on various datasets as compared against all the recent methods. We provide a thorough empirical analysis of the method by providing ablation analysis, statistical significance test, and visualization of the attention maps and t-SNE embeddings. These evaluations convincingly demonstrate the effectiveness of the proposed approach.
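One common way to turn a domain discriminator's output into a per-region transferability signal is via its entropy: regions where the discriminator cannot tell source from target (probability near 0.5) are domain-invariant and should be emphasized. The sketch below shows this weighting idea; it is an illustrative reading of "probabilistic certainty," not the paper's exact formulation.

```python
import numpy as np

def certainty_weights(domain_probs, eps=1e-12):
    """Per-region weights from a domain discriminator's P(source)
    outputs. Regions with p ~ 0.5 (discriminator uncertain) are
    domain-invariant, so entropy-based weighting favors them."""
    p = np.clip(domain_probs, eps, 1 - eps)
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    w = entropy / np.log(2.0)          # scale to [0, 1]
    return w / (w.sum() + eps)         # normalize over regions

def reweight_features(region_feats, domain_probs):
    """Pool region features weighted by discriminator uncertainty,
    so transferable regions dominate the classifier's input."""
    w = certainty_weights(domain_probs)
    return (region_feats * w[:, None]).sum(axis=0)
```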

114 citations

Proceedings Article
01 Jan 2018
TL;DR: A novel active learning method poses the layered architecture used in object detection as a 'query by committee' paradigm for choosing which images to annotate; the resulting methods outperform classical uncertainty-based active learning algorithms such as maximum entropy.
Abstract: Object detection methods like Single Shot Multibox Detector (SSD) provide highly accurate object detection that run in real-time. However, these approaches require a large number of annotated training images. Evidently, not all of these images are equally useful for training the algorithms. Moreover, obtaining annotations in terms of bounding boxes for each image is costly and tedious. In this paper, we aim to obtain a highly accurate object detector using only a fraction of the training images. We do this by adopting active learning that uses ‘human in the loop’ paradigm to select the set of images that would be useful if annotated. Towards this goal, we make the following contributions: 1. We develop a novel active learning method which poses the layered architecture used in object detection as a ‘query by committee’ paradigm to choose the set of images to be queried. 2. We introduce a framework to use the exploration/exploitation trade-off in our methods. 3. We analyze the results on standard object detection datasets which show that with only a third of the training data, we can obtain more than 95% of the localization accuracy of full supervision. Further our methods outperform classical uncertainty-based active learning algorithms like maximum entropy.
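In a 'query by committee' setup, each committee member (here, predictions drawn from different detection layers) votes on a candidate image, and images where the members disagree most are queried for annotation. A standard way to quantify disagreement is the mean KL divergence from each member's class distribution to the committee consensus, sketched below; this is a generic QBC formulation, not necessarily the paper's exact scoring rule.

```python
import numpy as np

def committee_disagreement(member_probs):
    """Mean KL divergence from each committee member's class
    distribution to the consensus (mean) distribution.
    member_probs: (num_members, num_classes) for one image."""
    p = np.asarray(member_probs, dtype=float)
    consensus = p.mean(axis=0)
    kl = (p * (np.log(p + 1e-12) - np.log(consensus + 1e-12))).sum(axis=1)
    return float(kl.mean())

def select_for_annotation(scores, budget):
    """Return the indices of the `budget` images the committee
    disagrees on most -- these are sent to the human annotator."""
    order = np.argsort(scores)[::-1]
    return order[:budget].tolist()
```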

107 citations

Posted Content
TL;DR: CovidAID (COVID-19 AI Detector) is a novel deep neural network model for triaging patients for appropriate testing; it significantly improves upon the results of Covid-Net on the same dataset.
Abstract: The exponential increase in COVID-19 patients is overwhelming healthcare systems across the world. With limited testing kits, it is impossible for every patient with respiratory illness to be tested using conventional techniques (RT-PCR). The tests also have long turn-around time, and limited sensitivity. Detecting possible COVID-19 infections on Chest X-Ray may help quarantine high risk patients while test results are awaited. X-Ray machines are already available in most healthcare systems, and with most modern X-Ray systems already digitized, there is no transportation time involved for the samples either. In this work we propose the use of chest X-Ray to prioritize the selection of patients for further RT-PCR testing. This may be useful in an inpatient setting where the present systems are struggling to decide whether to keep the patient in the ward along with other patients or isolate them in COVID-19 areas. It would also help in identifying patients with high likelihood of COVID with a false negative RT-PCR who would need repeat testing. Further, we propose the use of modern AI techniques to detect the COVID-19 patients using X-Ray images in an automated manner, particularly in settings where radiologists are not available, and help make the proposed testing technology scalable. We present CovidAID: COVID-19 AI Detector, a novel deep neural network based model to triage patients for appropriate testing. On the publicly available covid-chestxray-dataset [2], our model gives 90.5% accuracy with 100% sensitivity (recall) for the COVID-19 infection. We significantly improve upon the results of Covid-Net [10] on the same dataset.
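A screening tool like the one described must favor false positives over missed infections, which in practice means choosing the classifier's decision threshold to preserve sensitivity on a validation set. The sketch below shows one simple way to do that; the function names and thresholding rule are illustrative assumptions, not CovidAID's published code.

```python
import numpy as np

def threshold_for_sensitivity(probs, labels, target_sensitivity=1.0):
    """Pick the largest decision threshold that keeps recall on the
    positive (COVID) class at or above the target. Screening favors
    catching every infection over avoiding false alarms."""
    pos = np.sort(probs[labels == 1])
    # To flag at least a fraction `target_sensitivity` of positives
    # with the rule prob >= threshold, the threshold must sit at or
    # below the corresponding order statistic of positive scores.
    k = int(np.ceil(target_sensitivity * len(pos)))
    return float(pos[len(pos) - k])

def triage(probs, threshold):
    """1 = prioritize for RT-PCR testing / isolation, 0 = lower risk."""
    return (probs >= threshold).astype(int)
```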

95 citations


Cited by
Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented.
Abstract: We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low-resolution and still far from realistic. In this work, we generate 2048 × 1024 visually appealing results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing/adding objects and changing the object category. Second, we propose a method to generate diverse results given the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
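The multi-scale discriminator idea in this abstract is to judge the same image at several resolutions, so coarse scales assess global layout while fine scales assess texture. The toy sketch below illustrates the pattern with average-pool downsampling and a pluggable scoring function; both are stand-ins, not the paper's network.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a 2-D image by `factor` (a simple stand-in for
    the downsampling used between discriminator scales)."""
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    x = img[:h, :w]
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def multiscale_scores(img, discriminator, scales=(1, 2, 4)):
    """Apply the same discriminator function at several scales and
    average the scores: coarse scales see global structure, fine
    scales see local detail."""
    return float(np.mean([discriminator(downsample(img, s)) for s in scales]))
```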

3,457 citations

01 Jan 2006

3,012 citations