Journal

arXiv: Machine Learning

About: arXiv: Machine Learning is an academic journal that publishes mainly in the areas of artificial neural networks and inference. Over its lifetime, it has published 12,404 papers, which have received 260,672 citations.

Topics: Artificial neural network, Inference, Estimator, Gaussian process, Computer science

Papers

Posted Content•

Distilling the Knowledge in a Neural Network

Geoffrey E. Hinton, Oriol Vinyals, Jeffrey Dean

09 Mar 2015-arXiv: Machine Learning

TL;DR: This work shows that the acoustic model of a heavily used commercial system can be significantly improved by distilling the knowledge in an ensemble of models into a single model. It also introduces a new type of ensemble composed of one or more full models and many specialist models, which learn to distinguish fine-grained classes that the full models confuse.

Abstract: A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets. Caruana and his collaborators have shown that it is possible to compress the knowledge in an ensemble into a single model which is much easier to deploy and we develop this approach further using a different compression technique. We achieve some surprising results on MNIST and we show that we can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model. We also introduce a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse. Unlike a mixture of experts, these specialist models can be trained rapidly and in parallel.

12,857 citations
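
The abstract above describes averaging the predictions of many models and compressing them into a single model. A minimal sketch of that idea follows, assuming the soft-target formulation with a softmax temperature; the temperature value, the logits, and all function names are hypothetical choices for illustration, not details taken from this listing.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_average(prob_lists):
    """Average the class probabilities predicted by several models."""
    n = len(prob_lists)
    return [sum(col) / n for col in zip(*prob_lists)]

def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

# Hypothetical logits from three ensemble members for one 3-class example.
teacher_logits = [[4.0, 1.0, 0.5], [3.5, 1.5, 0.2], [4.2, 0.8, 0.9]]
T = 2.0  # assumed temperature, chosen only for illustration

# Soft targets: the averaged, temperature-softened ensemble predictions.
soft_targets = ensemble_average([softmax(z, T) for z in teacher_logits])

# Loss a single (student) model would minimise against those soft targets.
student_logits = [2.0, 1.0, 0.0]
loss = cross_entropy(soft_targets, softmax(student_logits, T))
```

In a real training loop the student logits would come from a network and the loss would be minimised by gradient descent; here they are fixed numbers so the computation can be followed by hand.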

Posted Content•

Towards Deep Learning Models Resistant to Adversarial Attacks

Aleksander Madry¹, Aleksandar Makelov¹, Ludwig Schmidt¹, Dimitris Tsipras¹, Adrian Vladu¹

Massachusetts Institute of Technology¹

19 Jun 2017-arXiv: Machine Learning

TL;DR: This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.

Abstract: Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples---inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. Code and pre-trained models are available at this https URL and this https URL.

5,789 citations

Posted Content•

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

Leland McInnes, John Healy

09 Feb 2018-arXiv: Machine Learning

TL;DR: The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance.

Abstract: UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.

5,390 citations

Posted Content•

Explaining and Harnessing Adversarial Examples

Ian Goodfellow¹, Jonathon Shlens¹, Christian Szegedy¹

Google¹

20 Dec 2014-arXiv: Machine Learning

TL;DR: The authors argue that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, which is supported by new quantitative results while giving the first explanation of the most intriguing fact about adversarial examples: their generalization across architectures and training sets.

Abstract: Several machine learning models, including neural networks, consistently misclassify adversarial examples---inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset.

4,967 citations
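
The "simple and fast method of generating adversarial examples" mentioned in the abstract above is commonly known as the fast gradient sign method: perturb each input coordinate by a fixed step in the direction of the sign of the loss gradient. The sketch below applies it to a tiny logistic regression model rather than a neural network, so that the gradient can be written in closed form; the weights, input, and step size are hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of binary cross-entropy with respect to x: (p - y) * w."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, epsilon):
    """Move every coordinate of x by epsilon in the direction that increases the loss."""
    grad = loss_gradient_wrt_input(w, b, x, y)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input (true label y = 1).
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.3, -0.2, 0.4], 1

x_adv = fgsm(w, b, x, y, epsilon=0.5)
p_clean = predict(w, b, x)      # confidence in the true class before the attack
p_adv = predict(w, b, x_adv)    # confidence in the true class after the attack
```

Because the model is linear in its input, a small per-coordinate change adds up across coordinates, which is the intuition behind the linearity argument in the abstract.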

Posted Content•

Auto-Encoding Variational Bayes

Diederik P. Kingma¹, Max Welling¹

University of Amsterdam¹

20 Dec 2013-arXiv: Machine Learning

TL;DR: This paper proposes a stochastic variational inference and learning algorithm for directed probabilistic models with continuous latent variables and intractable posterior distributions; the algorithm scales to large datasets.

Abstract: How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets? We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case. Our contributions are two-fold. First, we show that a reparameterization of the variational lower bound yields a lower bound estimator that can be straightforwardly optimized using standard stochastic gradient methods. Second, we show that for i.i.d. datasets with continuous latent variables per datapoint, posterior inference can be made especially efficient by fitting an approximate inference model (also called a recognition model) to the intractable posterior using the proposed lower bound estimator. Theoretical advantages are reflected in experimental results.

4,883 citations
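
The reparameterization mentioned in the abstract above can be illustrated for the common case of a diagonal Gaussian approximate posterior: instead of sampling the latent variable directly, one samples standard normal noise and transforms it, so the sample becomes a differentiable function of the mean and variance. The sketch below assumes that Gaussian case and a standard normal prior; the function names and numeric values are hypothetical.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + math.exp(0.5 * lv) * e for m, lv, e in zip(mu, log_var, eps)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(sigma^2)) to N(0, I), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical recognition-model outputs for one datapoint, 2 latent dimensions.
mu = [1.5, -0.5]
log_var = [math.log(0.25), 0.0]  # standard deviations 0.5 and 1.0

z = sample_latent(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)

# Empirical check: the mean of many reparameterized samples approaches mu.
draws = [sample_latent(mu, log_var, rng) for _ in range(20000)]
mean_first_dim = sum(d[0] for d in draws) / len(draws)
```

All randomness enters through `eps`, so gradients of a loss with respect to `mu` and `log_var` can pass through the sampled `z` when this is written in an automatic differentiation framework.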

Network Information

Related Journals (5)

arXiv: Learning

45K papers, 837.1K citations

96% related

Journal of Machine Learning Research

3.2K papers, 591K citations

95% related

Annals of Statistics

5.6K papers, 653K citations

50K papers, 1.1M citations

7.6K papers, 1.6M citations

84% related

Performance

Metrics

12,404

Papers

362,022

Citations

No. of papers from the Journal in previous years
Year	Papers
2021	1,522
2020	1,840
2019	1,725
2018	1,916
2017	1,638
2016	1,174