Top 10 papers published by Jason Yosinski from Uber in 2016

Posted Content

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

Anh Nguyen¹, Alexey Dosovitskiy², Jason Yosinski³, Thomas Brox², Jeff Clune⁴

Vietnam Academy of Science and Technology¹, University of Freiburg², Cornell University³, University of Wyoming⁴

30 May 2016 - arXiv: Neural and Evolutionary Computing

TL;DR: A deep generator network (DGN) is used as a learned prior for activation maximization. The method generates synthetic images that look almost real, reveals the features learned by each neuron in an interpretable way, and generalizes well to new datasets and somewhat well to different network architectures without requiring the prior to be relearned.

Abstract: Deep neural networks (DNNs) have demonstrated state-of-the-art results on many pattern recognition tasks, especially vision classification problems. Understanding the inner workings of such computational brains is both fascinating basic science that is interesting in its own right - similar to why we study the human brain - and will enable researchers to further improve DNNs. One path to understanding how a neural network functions internally is to study what each of its neurons has learned to detect. One such method is called activation maximization (AM), which synthesizes an input (e.g. an image) that highly activates a neuron. Here we dramatically improve the qualitative state of the art of activation maximization by harnessing a powerful, learned prior: a deep generator network (DGN). The algorithm (1) generates qualitatively state-of-the-art synthetic images that look almost real, (2) reveals the features learned by each neuron in an interpretable way, (3) generalizes well to new datasets and somewhat well to different network architectures without requiring the prior to be relearned, and (4) can be considered as a high-quality generative method (in this case, by generating novel, creative, interesting, recognizable images).
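
As a rough illustration of activation maximization through a generator, the sketch below runs gradient ascent on a latent code so that the generated input drives a neuron's activation up. Everything in it is an assumption made for illustration: `generator`, `neuron`, and their weights are toy stand-ins rather than the paper's networks, and a finite-difference gradient replaces backpropagation.

```python
import math


def generator(z):
    """Toy stand-in for a generator network: 2-D code -> 3-'pixel' image in [-1, 1]."""
    return [math.tanh(z[0] + z[1]), math.tanh(z[0] - z[1]), math.tanh(0.5 * z[1])]


def neuron(x):
    """Toy stand-in for the neuron being visualized (a fixed linear unit)."""
    w = [1.0, -2.0, 0.5]  # hypothetical weights
    return sum(wi * xi for wi, xi in zip(w, x))


def objective(z):
    """Activation of the neuron on the image generated from code z."""
    return neuron(generator(z))


def numerical_grad(f, z, eps=1e-5):
    """Central-difference gradient of f at z (used here instead of backprop)."""
    grad = []
    for i in range(len(z)):
        hi, lo = list(z), list(z)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def activation_maximization(z0, steps=200, lr=0.1):
    """Gradient ascent in the latent space, not in pixel space."""
    z = list(z0)
    for _ in range(steps):
        g = numerical_grad(objective, z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z


z_start = [0.0, 0.0]
z_best = activation_maximization(z_start)
image = generator(z_best)  # the synthesized "preferred input"
print(objective(z_start), objective(z_best), image)
```

Because the search moves only through codes that the generator can decode, every candidate stays on the generator's output range, which is the role the learned prior plays in the abstract above.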

407 citations

Proceedings Article

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

Anh Nguyen¹, Alexey Dosovitskiy², Jason Yosinski³, Thomas Brox², Jeff Clune⁴

Vietnam Academy of Science and Technology¹, University of Freiburg², Cornell University³, University of Wyoming⁴

30 May 2016

TL;DR: A deep generator network is used as a learned prior for activation maximization. The method generates synthetic images that look almost real, reveals the features learned by each neuron in an interpretable way, generalizes well to new datasets and somewhat well to different network architectures without requiring the prior to be relearned, and can be considered a high-quality generative method.

Abstract: Deep neural networks (DNNs) have demonstrated state-of-the-art results on many pattern recognition tasks, especially vision classification problems. Understanding the inner workings of such computational brains is both fascinating basic science that is interesting in its own right - similar to why we study the human brain - and will enable researchers to further improve DNNs. One path to understanding how a neural network functions internally is to study what each of its neurons has learned to detect. One such method is called activation maximization, which synthesizes an input (e.g. an image) that highly activates a neuron. Here we dramatically improve the qualitative state of the art of activation maximization by harnessing a powerful, learned prior: a deep generator network. The algorithm (1) generates qualitatively state-of-the-art synthetic images that look almost real, (2) reveals the features learned by each neuron in an interpretable way, (3) generalizes well to new datasets and somewhat well to different network architectures without requiring the prior to be relearned, and (4) can be considered as a high-quality generative method (in this case, by generating novel, creative, interesting, recognizable images).

329 citations

Posted Content

Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks

Anh Nguyen, Jason Yosinski, Jeff Clune

11 Feb 2016 - arXiv: Neural and Evolutionary Computing

TL;DR: An algorithm is introduced that explicitly uncovers the multiple facets of each neuron by separately synthesizing a visualization for each type of image that activates it.

Abstract: We can better understand deep neural networks by identifying which features each of their neurons have learned to detect. To do so, researchers have created Deep Visualization techniques including activation maximization, which synthetically generates inputs (e.g. images) that maximally activate each neuron. A limitation of current techniques is that they assume each neuron detects only one type of feature, but we know that neurons can be multifaceted, in that they fire in response to many different types of features: for example, a grocery store class neuron must activate either for rows of produce or for a storefront. Previous activation maximization techniques constructed images without regard for the multiple different facets of a neuron, creating inappropriate mixes of colors, parts of objects, scales, orientations, etc. Here, we introduce an algorithm that explicitly uncovers the multiple facets of each neuron by producing a synthetic visualization of each of the types of images that activate a neuron. We also introduce regularization methods that produce state-of-the-art results in terms of the interpretability of images obtained by activation maximization. By separately synthesizing each type of image a neuron fires in response to, the visualizations have more appropriate colors and coherent global structure. Multifaceted feature visualization thus provides a clearer and more comprehensive description of the role of each neuron.

266 citations

Proceedings Article

Convergent Learning: Do different neural networks learn the same representations?

Yixuan Li¹, Jason Yosinski¹, Jeff Clune², Hod Lipson³, John E. Hopcroft¹

Cornell University¹, University of Wyoming², Columbia University³

07 Jan 2016

TL;DR: The authors investigate the extent to which neural networks exhibit convergent learning, which is when the representations learned by multiple nets converge to a set of features which are either individually similar between networks or where subsets of features span similar low-dimensional spaces.

Abstract: Recent success in training deep neural networks have prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by millions of parameters, but valuable because it increases our ability to understand current models and create improved versions of them. In this paper we investigate the extent to which neural networks exhibit what we call convergent learning, which is when the representations learned by multiple nets converge to a set of features which are either individually similar between networks or where subsets of features span similar low-dimensional spaces. We propose a specific method of probing representations: training multiple networks and then comparing and contrasting their individual, learned representations at the level of neurons or groups of neurons. We begin research into this question using three techniques to approximately align different neural networks on a feature level: a bipartite matching approach that makes one-to-one assignments between neurons, a sparse prediction approach that finds one-to-many mappings, and a spectral clustering approach that finds many-to-many mappings. This initial investigation reveals a few previously unknown properties of neural networks, and we argue that future research into the question of convergent learning will yield many more. The insights described here include (1) that some features are learned reliably in multiple networks, yet other features are not consistently learned; (2) that units learn to span low-dimensional subspaces and, while these subspaces are common to multiple networks, the specific basis vectors learned are not; (3) that the representation codes show evidence of being a mix between a local code and slightly, but not fully, distributed codes across multiple units.
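
The one-to-one alignment idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: units are compared by the Pearson correlation of their activations over the same inputs, the best assignment is found by exhaustive search over permutations (feasible only for a handful of units), and the activation values in `net1` and `net2` are made up.

```python
from itertools import permutations
from statistics import mean


def correlation(a, b):
    """Pearson correlation between two equal-length activation traces."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def match_units(net1, net2):
    """Find the one-to-one assignment (unit i of net1 -> unit p[i] of net2)
    that maximizes the summed correlation. Exhaustive, so O(n!) in the unit count."""
    n = len(net1)
    corr = [[correlation(u, v) for v in net2] for u in net1]
    best = max(
        permutations(range(n)),
        key=lambda p: sum(corr[i][p[i]] for i in range(n)),
    )
    return list(best), corr


# Hypothetical activations of three units per network on the same five inputs.
net1 = [
    [0.0, 1.0, 2.0, 3.0, 4.0],  # unit 0: increasing
    [1.0, 0.0, 1.0, 0.0, 1.0],  # unit 1: alternating
    [4.0, 1.0, 0.0, 1.0, 4.0],  # unit 2: U-shaped
]
net2 = [
    [8.2, 2.1, 0.1, 1.9, 7.9],  # U-shaped, different scale
    [0.1, 2.1, 3.9, 6.2, 8.0],  # increasing, different scale
    [2.0, 0.1, 2.1, 0.0, 1.9],  # alternating, different scale
]

assignment, corr = match_units(net1, net2)
print(assignment)
```

For realistic layer widths the exhaustive search would have to be replaced by a polynomial-time assignment solver, for example `scipy.optimize.linear_sum_assignment`.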

188 citations

Proceedings Article

Convergent Learning: Do different neural networks learn the same representations?

Yixuan Li¹, Jason Yosinski¹, Jeff Clune², Hod Lipson¹, John E. Hopcroft¹

Cornell University¹, University of Wyoming²

01 Jan 2016

TL;DR: The authors investigate the extent to which neural networks exhibit convergent learning, which is when the representations learned by multiple nets converge to a set of features which are either individually similar between networks or where subsets of features span similar low-dimensional spaces.

Abstract: Recent success in training deep neural networks have prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by millions of parameters, but valuable because it increases our ability to understand current models and create improved versions of them. In this paper we investigate the extent to which neural networks exhibit what we call convergent learning, which is when the representations learned by multiple nets converge to a set of features which are either individually similar between networks or where subsets of features span similar low-dimensional spaces. We propose a specific method of probing representations: training multiple networks and then comparing and contrasting their individual, learned representations at the level of neurons or groups of neurons. We begin research into this question using three techniques to approximately align different neural networks on a feature level: a bipartite matching approach that makes one-to-one assignments between neurons, a sparse prediction approach that finds one-to-many mappings, and a spectral clustering approach that finds many-to-many mappings. This initial investigation reveals a few previously unknown properties of neural networks, and we argue that future research into the question of convergent learning will yield many more. The insights described here include (1) that some features are learned reliably in multiple networks, yet other features are not consistently learned; (2) that units learn to span low-dimensional subspaces and, while these subspaces are common to multiple networks, the specific basis vectors learned are not; (3) that the representation codes show evidence of being a mix between a local code and slightly, but not fully, distributed codes across multiple units.

110 citations

Proceedings Article

Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation

Sina Honari¹, Jason Yosinski², Pascal Vincent¹, Chris Pal³

Université de Montréal¹, Cornell University², École Polytechnique³

27 Jun 2016

TL;DR: Recombinator Networks let coarse, abstract features inform finer features early in their formation, so that the finer features can use several layers of computation in deciding how to use the coarse features when producing robust pixel-level predictions.

Abstract: Deep neural networks with alternating convolutional, max-pooling and decimation layers are widely used in state of the art architectures for computer vision. Max-pooling purposefully discards precise spatial information in order to create features that are more robust, and typically organized as lower resolution spatial feature maps. On some tasks, such as whole-image classification, max-pooling derived features are well suited, however, for tasks requiring precise localization, such as pixel level prediction and segmentation, max-pooling destroys exactly the information required to perform well. Precise localization may be preserved by shallow convnets without pooling but at the expense of robustness. Can we have our max-pooled multilayered cake and eat it too? Several papers have proposed summation and concatenation based methods for combining upsampled coarse, abstract features with finer features to produce robust pixel level predictions. Here we introduce another model — dubbed Recombinator Networks — where coarse features inform finer features early in their formation such that finer features can make use of several layers of computation in deciding how to use coarse features. The model is trained once, end-to-end and performs better than summation-based architectures, reducing the error from the previous state of the art on two facial keypoint datasets, AFW and AFLW, by 30% and beating the current state-of-the-art on 300W without using extra data. We improve performance even further by adding a denoising prediction model based on a novel convnet formulation.

101 citations

Posted Content

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space

Anh Nguyen¹, Jeff Clune¹, Yoshua Bengio, Alexey Dosovitskiy², Jason Yosinski³

University of Wyoming¹, University of Freiburg², Uber³

30 Nov 2016 - arXiv: Computer Vision and Pattern Recognition

TL;DR: Plug and Play Generative Networks (PPGNs) extend activation maximization in the latent space of a generator network with an additional prior on the latent code, improving sample quality and diversity and producing high-resolution (227x227) images for all 1000 ImageNet categories.

Abstract: Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories. In addition, we provide a unified probabilistic interpretation of related activation maximization methods and call the general class of models "Plug and Play Generative Networks". PPGNs are composed of 1) a generator network G that is capable of drawing a wide range of image types and 2) a replaceable "condition" network C that tells the generator what to draw. We demonstrate the generation of images conditioned on a class (when C is an ImageNet or MIT Places classification network) and also conditioned on a caption (when C is an image captioning network). Our method also improves the state of the art of Multifaceted Feature Visualization, which generates the set of synthetic inputs that activate a neuron in order to better understand how deep neural networks operate. Finally, we show that our model performs reasonably well at the task of image inpainting. While image models are used in this paper, the approach is modality-agnostic and can be applied to many types of data.

79 citations

Journal Article

Understanding innovation engines: Automated creativity and improved stochastic optimization via deep learning

Anh Nguyen¹, Jason Yosinski², Jeff Clune¹

University of Wyoming¹, Cornell University²

01 Sep 2016 - Evolutionary Computation

TL;DR: The paper describes the long-term vision for the Innovation Engine algorithm, which involves many technical challenges that remain to be solved, and reports initial results in the image domain suggesting that Innovation Engines could ultimately automate the production of endless streams of interesting solutions in any domain.

Abstract: The Achilles Heel of stochastic optimization algorithms is getting trapped on local optima. Novelty Search mitigates this problem by encouraging exploration in all interesting directions by replacing the performance objective with a reward for novel behaviors. This reward for novel behaviors has traditionally required a human-crafted, behavioral distance function. While Novelty Search is a major conceptual breakthrough and outperforms traditional stochastic optimization on certain problems, it is not clear how to apply it to challenging, high-dimensional problems where specifying a useful behavioral distance function is difficult. For example, in the space of images, how do you encourage novelty to produce hawks and heroes instead of endless pixel static? Here we propose a new algorithm, the Innovation Engine, that builds on Novelty Search by replacing the human-crafted behavioral distance with a Deep Neural Network (DNN) that can recognize interesting differences between phenotypes. The key insight is that DNNs can recognize similarities and differences between phenotypes at an abstract level, wherein novelty means interesting novelty. For example, a DNN-based novelty search in the image space does not explore in the low-level pixel space, but instead creates a pressure to create new types of images (e.g., churches, mosques, obelisks, etc.). Here, we describe the long-term vision for the Innovation Engine algorithm, which involves many technical challenges that remain to be solved. We then implement a simplified version of the algorithm that enables us to explore some of the algorithm's key motivations. Our initial results, in the domain of images, suggest that Innovation Engines could ultimately automate the production of endless streams of interesting solutions in any domain: for example, producing intelligent software, robot controllers, optimized physical components, and art.
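
To make the role of the behavioral distance concrete, here is a small sketch of a novelty reward with a pluggable descriptor. It rests on assumptions not stated in the abstract: novelty is scored as the mean distance to the `k` nearest archive members (a common formulation), and `toy_descriptor` is a hypothetical hand-written feature extractor standing in for the abstract features a DNN would supply.

```python
import math


def novelty(candidate, archive, describe, k=2):
    """Mean distance from the candidate's descriptor to its k nearest
    neighbors in the archive; larger means more novel."""
    if not archive:
        return float("inf")  # nothing seen yet, so anything is novel
    d = describe(candidate)
    dists = sorted(math.dist(d, describe(other)) for other in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def toy_descriptor(image):
    """Hypothetical descriptor: (mean, range) of pixel values.
    An Innovation Engine would use DNN outputs here instead."""
    return (sum(image) / len(image), max(image) - min(image))


# Made-up 4-pixel "images" already in the archive.
archive = [
    [0.1, 0.1, 0.2, 0.1],
    [0.2, 0.1, 0.1, 0.2],
    [0.9, 0.8, 0.9, 0.9],
]
familiar = [0.1, 0.2, 0.1, 0.1]  # close to existing dark, flat images
unusual = [0.0, 1.0, 0.0, 1.0]   # high contrast, unlike anything archived

score_familiar = novelty(familiar, archive, toy_descriptor)
score_unusual = novelty(unusual, archive, toy_descriptor)
print(score_familiar, score_unusual)
```

Swapping `toy_descriptor` for a different feature extractor changes what counts as "different" without touching the search loop, which is the substitution the abstract describes.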

53 citations

Journal Article

GSNs: generative stochastic networks

Guillaume Alain¹, Yoshua Bengio¹, Li Yao¹, Jason Yosinski², Éric Thibodeau-Laufer¹, Saizheng Zhang¹, Pascal Vincent¹

Université de Montréal¹, Cornell University²

01 Jun 2016 - Information and Inference: A Journal of the IMA

TL;DR: A novel training principle for generative probabilistic models is introduced as an alternative to maximum likelihood. The accompanying theorems also provide a justification for dependency networks and generalized pseudolikelihood, and define an appropriate joint distribution and sampling mechanism even when the conditionals are not consistent.

Abstract: We introduce a novel training principle for generative probabilistic models that is an alternative to maximum likelihood. The proposed Generative Stochastic Networks (GSN) framework generalizes Denoising Auto-Encoders (DAE) and is based on learning the transition operator of a Markov chain whose stationary distribution estimates the data distribution. The transition distribution is a conditional distribution that generally involves a small move, so it has fewer dominant modes and is unimodal in the limit of small moves. This simplifies the learning problem, making it less like density estimation and more akin to supervised function approximation, with gradients that can be obtained by backprop. The theorems provided here provide a probabilistic interpretation for denoising autoencoders and generalize them; seen in the context of this framework, auto-encoders that learn with injected noise are a special case of GSNs and can be interpreted as generative models. The theorems also provide an interesting justification for dependency networks and generalized pseudolikelihood and define an appropriate joint distribution and sampling mechanism, even when the conditionals are not consistent. GSNs can be used with missing inputs and can be used to sample subsets of variables given the rest. Experiments validating these theoretical results are conducted on both synthetic datasets and image datasets. The experiments employ a particular architecture that mimics the Deep Boltzmann Machine Gibbs sampler but that allows training to proceed with backprop through a recurrent neural network with noise injected inside and without the need for layerwise pretraining.

47 citations

Journal Article

WebAL comes of age: A review of the first 21 years of artificial life on the web

Tim Taylor¹, Joshua E. Auerbach², Josh C. Bongard³, Jeff Clune⁴, Simon Hickinbotham¹, Charles Ofria⁵, Mizuki Oka⁶, Sebastian Risi⁷, Kenneth O. Stanley⁸, Jason Yosinski⁹

University of York¹, École Polytechnique Fédérale de Lausanne², University of Vermont³, University of Wyoming⁴, Michigan State University⁵, University of Tsukuba⁶, IT University of Copenhagen⁷, University of Central Florida⁸, Cornell University⁹

01 Aug 2016 - Artificial Life

TL;DR: A survey of the first 21 years of web-based artificial life (WebAL) research and applications, broadly construed to include the many different ways in which artificial life and web technologies might intersect.

Abstract: We present a survey of the first 21 years of web-based artificial life (WebAL) research and applications, broadly construed to include the many different ways in which artificial life and web technologies might intersect. Our survey covers the period from 1994, when the first WebAL work appeared, up to the present day, together with a brief discussion of relevant precursors. We examine recent projects, from 2010 to 2015, in greater detail in order to highlight the current state of the art. We follow the survey with a discussion of common themes and methodologies that can be observed in recent work and identify a number of likely directions for future work in this exciting area.

23 citations