Showing papers by "Dumitru Erhan published in 2017"

PDF

Open Access

Proceedings Article•DOI•

Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks

[...]

Konstantinos Bousmalis¹, Nathan Silberman¹, David Dohan¹, Dumitru Erhan¹, Dilip Krishnan¹ - Show less +1 more•Institutions (1)

Google¹

01 Jul 2017

TL;DR: In this paper, a generative adversarial network (GAN)-based method adapts source-domain images to appear as if drawn from the target domain by learning in an unsupervised manner a transformation in the pixel space from one domain to another.

...read moreread less

Abstract: Collecting well-annotated image datasets to train modern machine learning algorithms is prohibitively expensive for many tasks. One appealing alternative is rendering synthetic data where ground-truth annotations are generated automatically. Unfortunately, models trained purely on rendered images fail to generalize to real images. To address this shortcoming, prior work introduced unsupervised domain adaptation algorithms that have tried to either map representations between the two domains, or learn to extract features that are domain-invariant. In this work, we approach the problem in a new light by learning in an unsupervised manner a transformation in the pixel space from one domain to the other. Our generative adversarial network (GAN)-based method adapts source-domain images to appear as if drawn from the target domain. Our approach not only produces plausible samples, but also outperforms the state-of-the-art on a number of unsupervised domain adaptation scenarios by large margins. Finally, we demonstrate that the adaptation process generalizes to object classes unseen during training.

...read moreread less

1,549 citations

Journal Article•DOI•

Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge

[...]

Oriol Vinyals¹, Alexander Toshev¹, Samy Bengio¹, Dumitru Erhan¹•Institutions (1)

Google¹

01 Apr 2017-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image is presented.

...read moreread less

Abstract: Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. Finally, given the recent surge of interest in this task, a competition was organized in 2015 using the newly released COCO dataset. We describe and analyze the various improvements we applied to our own baseline and show the resulting performance in the competition, which we won ex-aequo with a team from Microsoft Research.

...read moreread less

848 citations

Book Chapter•DOI•

The (Un)reliability of saliency methods

[...]

Pieter-Jan Kindermans¹, Sara Hooker¹, Julius Adebayo¹, Maximilian Alber², Kristof T. Schütt², Sven Dähne², Dumitru Erhan¹, Been Kim¹ - Show less +4 more•Institutions (2)

Google¹, Technical University of Berlin²

02 Nov 2017

TL;DR: This work uses a simple and common pre-processing step ---adding a constant shift to the input data--- to show that a transformation with no effect on the model can cause numerous methods to incorrectly attribute.

...read moreread less

Abstract: Saliency methods aim to explain the predictions of deep neural networks. These methods lack reliability when the explanation is sensitive to factors that do not contribute to the model prediction. We use a simple and common pre-processing step which can be compensated for easily—adding a constant shift to the input data—to show that a transformation with no effect on how the model makes the decision can cause numerous methods to attribute incorrectly. In order to guarantee reliability, we believe that the explanation should not change when we can guarantee that two networks process the images in identical manners. We show, through several examples, that saliency methods that do not satisfy this requirement result in misleading attribution. The approach can be seen as a type of unit test; we construct a narrow ground truth to measure one stated desirable property. As such, we hope the community will embrace the development of additional tests.

...read moreread less

490 citations

Posted Content•

Learning how to explain neural networks: PatternNet and PatternAttribution

[...]

Pieter-Jan Kindermans, Kristof T. Schütt¹, Maximilian Alber¹, Klaus-Robert Müller, Dumitru Erhan, Been Kim, Sven Dähne¹ - Show less +3 more•Institutions (1)

Technical University of Berlin¹

16 May 2017-arXiv: Machine Learning

TL;DR: In this article, the authors argue that explanation methods for neural networks should work reliably in the limit of simplicity, the linear models, and propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks.

...read moreread less

Abstract: DeConvNet, Guided BackProp, LRP, were invented to better understand deep neural networks. We show that these methods do not produce the theoretically correct explanation for a linear model. Yet they are used on multi-layer networks with millions of parameters. This is a cause for concern since linear models are simple neural networks. We argue that explanation methods for neural nets should work reliably in the limit of simplicity, the linear models. Based on our analysis of linear models we propose a generalization that yields two explanation techniques (PatternNet and PatternAttribution) that are theoretically sound for linear models and produce improved explanations for deep networks.

...read moreread less

205 citations

Posted Content•

Stochastic Variational Video Prediction

[...]

Mohammad Babaeizadeh¹, Chelsea Finn², Dumitru Erhan, Roy H. Campbell¹, Sergey Levine² - Show less +1 more•Institutions (2)

University of Illinois at Urbana–Champaign¹, University of California, Berkeley²

30 Oct 2017-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper develops a stochastic variational video prediction (SV2P) method that predicts a different possible future for each sample of its latent variables, and is the first to provide effective Stochastic multi-frame prediction for real-world video.

...read moreread less

Abstract: Predicting the future in real-world settings, particularly from raw sensory observations such as images, is exceptionally challenging. Real-world events can be stochastic and unpredictable, and the high dimensionality and complexity of natural images requires the predictive model to build an intricate understanding of the natural world. Many existing methods tackle this problem by making simplifying assumptions about the environment. One common assumption is that the outcome is deterministic and there is only one plausible future. This can lead to low-quality predictions in real-world settings with stochastic dynamics. In this paper, we develop a stochastic variational video prediction (SV2P) method that predicts a different possible future for each sample of its latent variables. To the best of our knowledge, our model is the first to provide effective stochastic multi-frame prediction for real-world video. We demonstrate the capability of the proposed method in predicting detailed future frames of videos on multiple real-world datasets, both action-free and action-conditioned. We find that our proposed method produces substantially improved video predictions when compared to the same model without stochasticity, and to other stochastic video prediction methods. Our SV2P implementation will be open sourced upon publication.

...read moreread less

189 citations

Posted Content•

The (Un)reliability of saliency methods

[...]

Pieter-Jan Kindermans¹, Sara Hooker¹, Julius Adebayo¹, Maximilian Alber², Kristof T. Schütt², Sven Dähne², Dumitru Erhan¹, Been Kim¹ - Show less +4 more•Institutions (2)

Google¹, Technical University of Berlin²

02 Nov 2017-arXiv: Machine Learning

TL;DR: In this article, a simple and common pre-processing step, adding a constant shift to the input data, is used to show that a transformation with no effect on the model can cause numerous methods to incorrectly attribute.

...read moreread less

Abstract: Saliency methods aim to explain the predictions of deep neural networks. These methods lack reliability when the explanation is sensitive to factors that do not contribute to the model prediction. We use a simple and common pre-processing step ---adding a constant shift to the input data--- to show that a transformation with no effect on the model can cause numerous methods to incorrectly attribute. In order to guarantee reliability, we posit that methods should fulfill input invariance, the requirement that a saliency method mirror the sensitivity of the model with respect to transformations of the input. We show, through several examples, that saliency methods that do not satisfy input invariance result in misleading attribution.

...read moreread less

150 citations