Top 6 papers published by Dumitru Erhan from Google in 2019

Proceedings Article

A Benchmark for Interpretability Methods in Deep Neural Networks

Sara Hooker¹, Dumitru Erhan¹, Pieter-Jan Kindermans², Been Kim¹

Google¹, Technical University of Berlin²

01 Jan 2019

TL;DR: This paper proposes an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks and shows that only certain ensemble-based approaches (VarGrad and SmoothGrad-Squared) outperform a random assignment of importance.

Abstract: We propose an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks. Our results across several large-scale image classification datasets show that many popular interpretability methods produce estimates of feature importance that are not better than a random designation of feature importance. Only certain ensemble-based approaches, VarGrad and SmoothGrad-Squared, outperform such a random assignment of importance. The manner of ensembling remains critical: we show that some approaches do no better than the underlying method but carry a far higher computational burden.
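
To illustrate the two ensemble estimators named in the abstract, here is a minimal sketch. It is not the authors' implementation. It assumes the commonly used definitions (SmoothGrad-Squared averages the squared gradients over noisy copies of the input; VarGrad takes the variance of those gradients) and a hypothetical toy model f(x) = sum(w * x**2) whose gradient is known in closed form. The names `toy_gradient`, `noisy_gradients`, `n_samples` and `sigma` are illustrative only.

```python
import numpy as np

def toy_gradient(x, w):
    # Closed-form gradient of the hypothetical model f(x) = sum(w * x**2).
    return 2.0 * w * x

def noisy_gradients(x, w, n_samples=50, sigma=0.1, seed=0):
    # Evaluate the gradient on n_samples copies of x perturbed by Gaussian noise.
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    return np.stack([toy_gradient(x + eps, w) for eps in noise])

def smoothgrad_squared(grads):
    # Mean of the element-wise squared gradients over the noisy samples.
    return np.mean(grads ** 2, axis=0)

def vargrad(grads):
    # Element-wise variance of the gradients over the noisy samples.
    return np.var(grads, axis=0)

x = np.array([0.5, -1.0, 2.0])
w = np.array([1.0, 0.3, -0.7])
grads = noisy_gradients(x, w)
print(smoothgrad_squared(grads))
print(vargrad(grads))
```

Both estimators reuse the same set of noisy gradients, so in this sketch they differ only in the final reduction step.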

361 citations

Posted Content

Model-Based Reinforcement Learning for Atari

Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

01 Mar 2019 - arXiv: Learning

TL;DR: SimPLe is a model-based deep RL algorithm built on video prediction models that solves Atari games with fewer interactions than model-free methods; in a low-data regime of 100k interactions it outperforms state-of-the-art model-free algorithms in most games, in some games by over an order of magnitude.

Abstract: Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction, substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in a low-data regime of 100k interactions between the agent and the environment, which corresponds to two hours of real-time play. In most games SimPLe outperforms state-of-the-art model-free algorithms, in some games by over an order of magnitude.
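
As a quick sanity check of the "two hours" figure, the arithmetic can be sketched as follows. The frame skip of 4 and the 60 frames per second are assumptions about the usual Atari setup; neither is stated in the abstract. The helper `real_time_hours` is purely illustrative.

```python
INTERACTIONS = 100_000   # from the abstract
FRAME_SKIP = 4           # assumption: the agent acts once every 4 frames
FRAMES_PER_SECOND = 60   # assumption: emulator runs at 60 frames per second

def real_time_hours(interactions, frame_skip=FRAME_SKIP, fps=FRAMES_PER_SECOND):
    # Convert agent-environment interactions into hours of real-time play.
    frames = interactions * frame_skip
    return frames / fps / 3600.0

print(f"{real_time_hours(INTERACTIONS):.2f} hours")  # prints "1.85 hours"
```

Under these assumptions, 100k interactions amount to 400k frames, or a little under two hours of play.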

315 citations

Proceedings Article

High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks

Ruben Villegas¹, Arkanath Pathak², Harini Kannan², Dumitru Erhan², Quoc V. Le², Honglak Lee³

Adobe Systems¹, Google², University of Michigan³

05 Nov 2019

TL;DR: This work questions whether handcrafted architectures are necessary for video prediction and instead seeks minimal inductive bias while maximizing network capacity; its large-scale empirical study demonstrates state-of-the-art performance by learning large models on three different datasets.

Abstract: Predicting future video frames is extremely challenging, as there are many factors of variation that make up the dynamics of how frames change through time. Previously proposed solutions require complex inductive biases inside network architectures with highly specialized computation, including segmentation masks, optical flow, and foreground and background separation. In this work, we question if such handcrafted architectures are necessary and instead propose a different approach: finding minimal inductive bias for video prediction while maximizing network capacity. We investigate this question by performing the first large-scale empirical study and demonstrate state-of-the-art performance by learning large models on three different datasets: one for modeling object interactions, one for modeling human motion, and one for modeling car driving.

99 citations

Posted Content

VideoFlow: A Flow-Based Generative Model for Video

Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma

04 Mar 2019

TL;DR: This work proposes a video prediction model based on normalizing flows, which allows direct optimization of the data likelihood and produces high-quality stochastic predictions. It describes an approach for modeling the latent space dynamics and demonstrates that flow-based generative models offer a viable and competitive approach to generative modeling of video.

Abstract: Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions. In particular, learning predictive models of videos offers an especially appealing mechanism to enable a rich understanding of the physical world: videos of real-world interactions are plentiful and readily available, and a model that can predict future video frames can not only capture useful representations of the world, but can be useful in its own right, for problems such as model-based robotic control. However, a central challenge in video prediction is that the future is highly uncertain: a sequence of past observations of events can imply many possible futures. Although a number of recent works have studied probabilistic models that can represent uncertain futures, such models are either extremely expensive computationally (as in the case of pixel-level autoregressive models), or do not directly optimize the likelihood of the data. In this work, we propose a model for video prediction based on normalizing flows, which allows for direct optimization of the data likelihood, and produces high-quality stochastic predictions. To our knowledge, our work is the first to propose multi-frame video prediction with normalizing flows. We describe an approach for modeling the latent space dynamics, and demonstrate that flow-based generative models offer a viable and competitive approach to generative modeling of video.
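
To make "direct optimization of the data likelihood" concrete, here is a minimal sketch of the change-of-variables rule that normalizing flows rely on: log p(x) = log p(z) + log |dz/dx|, with z = f(x). It uses a hypothetical one-dimensional affine flow, not the VideoFlow architecture. The function names and the `scale` and `shift` parameters are illustrative only.

```python
import math

def standard_normal_logpdf(z):
    # Log-density of the base distribution N(0, 1).
    return -0.5 * (z * z + math.log(2.0 * math.pi))

def affine_flow_logpdf(x, scale, shift):
    # Invertible map z = f(x) = (x - shift) / scale, so |dz/dx| = 1 / |scale|.
    z = (x - shift) / scale
    log_det = -math.log(abs(scale))
    # Change of variables: exact log-likelihood of x under the flow.
    return standard_normal_logpdf(z) + log_det

def gaussian_logpdf(x, mean, std):
    # Reference log-density of N(mean, std**2), used only for comparison.
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2.0 * math.pi)

# The flow with scale=2.0, shift=0.5 should reproduce N(0.5, 2.0**2).
print(affine_flow_logpdf(1.3, scale=2.0, shift=0.5))
print(gaussian_logpdf(1.3, mean=0.5, std=2.0))
```

Because the log-likelihood is available in closed form, it can be maximized directly with respect to the flow parameters, which is the property the abstract contrasts with models that do not directly optimize the likelihood.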

93 citations

Posted Content

VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

Manoj Kumar¹, Mohammad Babaeizadeh¹, Dumitru Erhan¹, Chelsea Finn¹, Sergey Levine¹, Laurent Dinh¹, Durk Kingma¹

Google¹

04 Mar 2019 - arXiv: Computer Vision and Pattern Recognition

TL;DR: This work proposes what is, to the authors' knowledge, the first multi-frame video prediction model based on normalizing flows, which allows direct optimization of the data likelihood and produces high-quality stochastic predictions.

Abstract: Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions. However, a central challenge in video prediction is that the future is highly uncertain: a sequence of past observations of events can imply many possible futures. Although a number of recent works have studied probabilistic models that can represent uncertain futures, such models are either extremely expensive computationally, as in the case of pixel-level autoregressive models, or do not directly optimize the likelihood of the data. To our knowledge, our work is the first to propose multi-frame video prediction with normalizing flows, which allows for direct optimization of the data likelihood, and produces high-quality stochastic predictions. We describe an approach for modeling the latent space dynamics, and demonstrate that flow-based generative models offer a viable and competitive approach to generative modeling of video.

81 citations

Posted Content

High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks

Ruben Villegas¹, Arkanath Pathak², Harini Kannan², Dumitru Erhan², Quoc V. Le², Honglak Lee³

Adobe Systems¹, Google², University of Michigan³

05 Nov 2019 - arXiv: Computer Vision and Pattern Recognition

TL;DR: In this article, the authors question whether the complex inductive biases of earlier video prediction architectures (segmentation masks, optical flow, and foreground and background separation) are necessary, and instead propose finding minimal inductive bias while maximizing network capacity to predict future video frames.

Abstract: Predicting future video frames is extremely challenging, as there are many factors of variation that make up the dynamics of how frames change through time. Previously proposed solutions require complex inductive biases inside network architectures with highly specialized computation, including segmentation masks, optical flow, and foreground and background separation. In this work, we question if such handcrafted architectures are necessary and instead propose a different approach: finding minimal inductive bias for video prediction while maximizing network capacity. We investigate this question by performing the first large-scale empirical study and demonstrate state-of-the-art performance by learning large models on three different datasets: one for modeling object interactions, one for modeling human motion, and one for modeling car driving.

31 citations
