scispace - formally typeset
Author

Vinay P. Namboodiri

Bio: Vinay P. Namboodiri is an academic researcher at the University of Bath. His research focuses on computer science, particularly convolutional neural networks. He has an h-index of 22 and has co-authored 190 publications receiving 1,869 citations. His previous affiliations include Bell Labs and the Indian Institute of Technology Bombay.


Papers
Proceedings ArticleDOI
18 Jun 2018
TL;DR: MAD-GAN is a multi-agent GAN architecture incorporating multiple generators and one discriminator; the discriminator is designed so that, in addition to distinguishing real from fake samples, it must also identify which generator produced a given fake sample.
Abstract: We propose MAD-GAN, an intuitive generalization to the Generative Adversarial Networks (GANs) and its conditional variants to address the well known problem of mode collapse. First, MAD-GAN is a multi-agent GAN architecture incorporating multiple generators and one discriminator. Second, to enforce that different generators capture diverse high probability modes, the discriminator of MAD-GAN is designed such that along with finding the real and fake samples, it is also required to identify the generator that generated the given fake sample. Intuitively, to succeed in this task, the discriminator must learn to push different generators towards different identifiable modes. We perform extensive experiments on synthetic and real datasets and compare MAD-GAN with different variants of GAN. We show high quality diverse sample generations for challenging tasks such as image-to-image translation and face generation. In addition, we also show that MAD-GAN is able to disentangle different modalities when trained using highly challenging diverse-class dataset (e.g. dataset with images of forests, icebergs, and bedrooms). In the end, we show its efficacy on the unsupervised feature representation task.
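The discriminator objective described in the abstract — distinguish real from fake and identify the responsible generator — amounts to a (K+1)-way classification over K generator identities plus a "real" class. The following is a minimal numpy sketch of that idea; the function names and formulation are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def madgan_discriminator_loss(logits, labels):
    """Cross-entropy over K+1 classes: class k (0 <= k < K) means
    'fake, produced by generator k'; class K means 'real'."""
    probs = softmax(logits)
    n = logits.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

def generator_loss(logits, num_generators):
    """Each generator tries to make the discriminator assign its
    samples to the 'real' class (index K)."""
    probs = softmax(logits)
    real_class = num_generators
    return -np.log(probs[:, real_class] + 1e-12).mean()
```

To succeed at this loss, the discriminator must separate the generators' outputs, which in turn pushes the generators toward distinct modes.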

290 citations

Proceedings ArticleDOI
TL;DR: This work investigates the problem of lip-syncing a talking face video of an arbitrary identity to match a target speech segment, identifies the key reasons current approaches fail on unconstrained videos, and resolves them by learning from a powerful lip-sync discriminator.
Abstract: In this work, we investigate the problem of lip-syncing a talking face video of an arbitrary identity to match a target speech segment. Current works excel at producing accurate lip movements on a static image or videos of specific people seen during the training phase. However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio. We identify key reasons pertaining to this and hence resolve them by learning from a powerful lip-sync discriminator. Next, we propose new, rigorous evaluation benchmarks and metrics to accurately measure lip synchronization in unconstrained videos. Extensive quantitative evaluations on our challenging benchmarks show that the lip-sync accuracy of the videos generated by our Wav2Lip model is almost as good as real synced videos. We provide a demo video clearly showing the substantial impact of our Wav2Lip model and evaluation benchmarks on our website: \url{this http URL}. The code and models are released at this GitHub repository: \url{this http URL}. You can also try out the interactive demo at this link: \url{this http URL}.
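The "powerful lip-sync discriminator" acts as a frozen expert that scores how well a window of video frames matches the accompanying audio, typically via the similarity of learned embeddings. The sketch below illustrates that scoring and the resulting generator penalty; the embedding inputs and function names are hypothetical stand-ins, not the released Wav2Lip code.

```python
import numpy as np

def sync_probability(video_emb, audio_emb, eps=1e-8):
    """Cosine-similarity-based sync score mapped into [0, 1],
    in the style of SyncNet-like audio-visual experts."""
    v = video_emb / (np.linalg.norm(video_emb) + eps)
    a = audio_emb / (np.linalg.norm(audio_emb) + eps)
    cos = float(v @ a)
    return (cos + 1.0) / 2.0

def expert_sync_loss(video_embs, audio_embs):
    """Binary cross-entropy against a 'synced' label of 1: the
    generator is penalized whenever the frozen expert judges its
    output frames to be out of sync with the target audio."""
    probs = np.array([sync_probability(v, a)
                      for v, a in zip(video_embs, audio_embs)])
    return float(-np.log(probs + 1e-12).mean())
```

Because the expert is pre-trained and frozen, the generator cannot fool it by degrading it, and must instead produce genuinely synchronized lip movements.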

251 citations

Proceedings ArticleDOI
15 Jun 2019
TL;DR: Simply incorporating the probabilistic certainty of the discriminator while training the classifier yields state-of-the-art results on various datasets compared with recent methods.
Abstract: In this paper, we aim to solve for unsupervised domain adaptation of classifiers where we have access to label information for the source domain while these are not available for a target domain. While various methods have been proposed for solving these including adversarial discriminator based methods, most approaches have focused on the entire image based domain adaptation. In an image, there would be regions that can be adapted better, for instance, the foreground object may be similar in nature. To obtain such regions, we propose methods that consider the probabilistic certainty estimate of various regions and specific focus on these during classification for adaptation. We observe that just by incorporating the probabilistic certainty of the discriminator while training the classifier, we are able to obtain state of the art results on various datasets as compared against all the recent methods. We provide a thorough empirical analysis of the method by providing ablation analysis, statistical significance test, and visualization of the attention maps and t-SNE embeddings. These evaluations convincingly demonstrate the effectiveness of the proposed approach.
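One common way to turn a domain discriminator's output into a per-region transferability signal is via its entropy: regions where the discriminator cannot tell source from target (probability near 0.5) are domain-invariant and should be emphasized. The sketch below shows this weighting idea; it is an illustrative reading of "probabilistic certainty," not the paper's exact formulation.

```python
import numpy as np

def certainty_weights(domain_probs, eps=1e-12):
    """Per-region weights from a domain discriminator's P(source)
    outputs. Regions with p ~ 0.5 (discriminator uncertain) are
    domain-invariant, so entropy-based weighting favors them."""
    p = np.clip(domain_probs, eps, 1 - eps)
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    w = entropy / np.log(2.0)          # scale to [0, 1]
    return w / (w.sum() + eps)         # normalize over regions

def reweight_features(region_feats, domain_probs):
    """Pool region features weighted by discriminator uncertainty,
    so transferable regions dominate the classifier's input."""
    w = certainty_weights(domain_probs)
    return (region_feats * w[:, None]).sum(axis=0)
```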

114 citations

Proceedings Article
01 Jan 2018
TL;DR: A novel active learning method poses the layered architecture used in object detection as a 'query by committee' paradigm for choosing which images to annotate; the resulting methods outperform classical uncertainty-based active learning algorithms such as maximum entropy.
Abstract: Object detection methods like Single Shot Multibox Detector (SSD) provide highly accurate object detection that run in real-time. However, these approaches require a large number of annotated training images. Evidently, not all of these images are equally useful for training the algorithms. Moreover, obtaining annotations in terms of bounding boxes for each image is costly and tedious. In this paper, we aim to obtain a highly accurate object detector using only a fraction of the training images. We do this by adopting active learning that uses ‘human in the loop’ paradigm to select the set of images that would be useful if annotated. Towards this goal, we make the following contributions: 1. We develop a novel active learning method which poses the layered architecture used in object detection as a ‘query by committee’ paradigm to choose the set of images to be queried. 2. We introduce a framework to use the exploration/exploitation trade-off in our methods. 3. We analyze the results on standard object detection datasets which show that with only a third of the training data, we can obtain more than 95% of the localization accuracy of full supervision. Further our methods outperform classical uncertainty-based active learning algorithms like maximum entropy.
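In a 'query by committee' setup, each committee member (here, predictions drawn from different detection layers) votes on a candidate image, and images where the members disagree most are queried for annotation. A standard way to quantify disagreement is the mean KL divergence from each member's class distribution to the committee consensus, sketched below; this is a generic QBC formulation, not necessarily the paper's exact scoring rule.

```python
import numpy as np

def committee_disagreement(member_probs):
    """Mean KL divergence from each committee member's class
    distribution to the consensus (mean) distribution.
    member_probs: (num_members, num_classes) for one image."""
    p = np.asarray(member_probs, dtype=float)
    consensus = p.mean(axis=0)
    kl = (p * (np.log(p + 1e-12) - np.log(consensus + 1e-12))).sum(axis=1)
    return float(kl.mean())

def select_for_annotation(scores, budget):
    """Return the indices of the `budget` images the committee
    disagrees on most -- these are sent to the human annotator."""
    order = np.argsort(scores)[::-1]
    return order[:budget].tolist()
```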

107 citations

Posted Content
TL;DR: CovidAID (COVID-19 AI Detector) is a novel deep neural network model for triaging patients for appropriate testing; it significantly improves upon the results of Covid-Net on the same dataset.
Abstract: The exponential increase in COVID-19 patients is overwhelming healthcare systems across the world. With limited testing kits, it is impossible for every patient with respiratory illness to be tested using conventional techniques (RT-PCR). The tests also have long turn-around time, and limited sensitivity. Detecting possible COVID-19 infections on Chest X-Ray may help quarantine high risk patients while test results are awaited. X-Ray machines are already available in most healthcare systems, and with most modern X-Ray systems already digitized, there is no transportation time involved for the samples either. In this work we propose the use of chest X-Ray to prioritize the selection of patients for further RT-PCR testing. This may be useful in an inpatient setting where the present systems are struggling to decide whether to keep the patient in the ward along with other patients or isolate them in COVID-19 areas. It would also help in identifying patients with high likelihood of COVID with a false negative RT-PCR who would need repeat testing. Further, we propose the use of modern AI techniques to detect the COVID-19 patients using X-Ray images in an automated manner, particularly in settings where radiologists are not available, and help make the proposed testing technology scalable. We present CovidAID: COVID-19 AI Detector, a novel deep neural network based model to triage patients for appropriate testing. On the publicly available covid-chestxray-dataset [2], our model gives 90.5% accuracy with 100% sensitivity (recall) for the COVID-19 infection. We significantly improve upon the results of Covid-Net [10] on the same dataset.
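A screening tool like the one described must favor false positives over missed infections, which in practice means choosing the classifier's decision threshold to preserve sensitivity on a validation set. The sketch below shows one simple way to do that; the function names and thresholding rule are illustrative assumptions, not CovidAID's published code.

```python
import numpy as np

def threshold_for_sensitivity(probs, labels, target_sensitivity=1.0):
    """Pick the largest decision threshold that keeps recall on the
    positive (COVID) class at or above the target. Screening favors
    catching every infection over avoiding false alarms."""
    pos = np.sort(probs[labels == 1])
    # To flag at least a fraction `target_sensitivity` of positives
    # with the rule prob >= threshold, the threshold must sit at or
    # below the corresponding order statistic of positive scores.
    k = int(np.ceil(target_sensitivity * len(pos)))
    return float(pos[len(pos) - k])

def triage(probs, threshold):
    """1 = prioritize for RT-PCR testing / isolation, 0 = lower risk."""
    return (probs >= threshold).astype(int)
```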

95 citations


Cited by
Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented.
Abstract: We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low-resolution and still far from realistic. In this work, we generate 2048 × 1024 visually appealing results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing/adding objects and changing the object category. Second, we propose a method to generate diverse results given the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
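The multi-scale discriminator idea in this abstract is to judge the same image at several resolutions, so coarse scales assess global layout while fine scales assess texture. The toy sketch below illustrates the pattern with average-pool downsampling and a pluggable scoring function; both are stand-ins, not the paper's network.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a 2-D image by `factor` (a simple stand-in for
    the downsampling used between discriminator scales)."""
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    x = img[:h, :w]
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def multiscale_scores(img, discriminator, scales=(1, 2, 4)):
    """Apply the same discriminator function at several scales and
    average the scores: coarse scales see global structure, fine
    scales see local detail."""
    return float(np.mean([discriminator(downsample(img, s)) for s in scales]))
```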

3,457 citations

01 Jan 2006

3,012 citations