scispace - formally typeset

International Conference on Learning Representations

About: International Conference on Learning Representations is an academic conference. The conference publishes majorly in the area(s): Artificial neural network & Reinforcement learning. Over the lifetime, 3367 publication(s) have been published by the conference receiving 458812 citation(s). more


Open accessProceedings Article
Diederik P. Kingma1, Jimmy Ba2Institutions (2)
01 Jan 2015-
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm. more

Topics: Stochastic optimization (63%), Convex optimization (54%), Rate of convergence (52%) more

78,539 Citations

Open accessProceedings Article
Karen Simonyan1, Andrew Zisserman1Institutions (1)
01 Jan 2015-
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision. more

49,857 Citations

Open accessProceedings Article
01 Jan 2015-
Abstract: Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network that can be jointly tuned to maximize the translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consists of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition. more

15,992 Citations

Open accessProceedings Article
Diederik P. Kingma1, Max Welling1Institutions (1)
01 Jan 2014-
Abstract: How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets? We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case. Our contributions is two-fold. First, we show that a reparameterization of the variational lower bound yields a lower bound estimator that can be straightforwardly optimized using standard stochastic gradient methods. Second, we show that for i.i.d. datasets with continuous latent variables per datapoint, posterior inference can be made especially efficient by fitting an approximate inference model (also called a recognition model) to the intractable posterior using the proposed lower bound estimator. Theoretical advantages are reflected in experimental results. more

Topics: Approximate inference (67%), Inference (55%), Estimator (53%) more

14,546 Citations

Open accessProceedings Article
20 Mar 2015-
Abstract: Several machine learning models, including neural networks, consistently misclassify adversarial examples---inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset. more

Topics: Adversarial machine learning (65%), Overfitting (54%), MNIST database (52%) more

7,946 Citations

No. of papers from the Conference in previous years

Top Attributes

Show by:

Conference's top 5 most impactful authors

Yoshua Bengio

73 papers, 27.4K citations

Sergey Levine

63 papers, 4.1K citations

Ruslan Salakhutdinov

18 papers, 1.4K citations

Yann LeCun

18 papers, 4.9K citations

Max Welling

16 papers, 18.5K citations

Network Information
Related Conferences (5)
Neural Information Processing Systems

12.9K papers, 1.2M citations

96% related
International Conference on Machine Learning

10.6K papers, 788.5K citations

93% related
European Conference on Computer Vision

6.3K papers, 596.2K citations

88% related
Computer Vision and Pattern Recognition

19.7K papers, 2.2M citations

88% related