Journal•

arXiv: Learning

About: arXiv: Learning is an academic journal. The journal publishes majorly in the area(s): Artificial neural network & Computer science. Over the lifetime, 45050 publications have been published receiving 837151 citations.

...read moreread less

Topics: Artificial neural network, Computer science, Reinforcement learning, Deep learning, Graph (abstract data type) ...read more

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Posted Content•

Scikit-learn: Machine Learning in Python

[...]

Fabian Pedregosa¹, Gaël Varoquaux¹, Alexandre Gramfort¹, Vincent Michel¹, Bertrand Thirion¹, Olivier Grisel, Mathieu Blondel, Andreas Müller², Joel Nothman, Gilles Louppe², Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay - Show less +15 more•Institutions (2)

French Institute for Research in Computer Science and Automation¹, University of Liège²

02 Jan 2012-arXiv: Learning

TL;DR: Scikit-learn as mentioned in this paper is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.

...read moreread less

Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from this http URL.

...read moreread less

28,898 citations

Posted Content•

Adam: A Method for Stochastic Optimization

[...]

Diederik P. Kingma¹, Jimmy Ba²•Institutions (2)

University of Amsterdam¹, University of Toronto²

22 Dec 2014-arXiv: Learning

TL;DR: In this article, the adaptive estimates of lower-order moments are used for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimate of lowerorder moments.

...read moreread less

Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.

...read moreread less

23,486 citations

Posted Content•

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

[...]

Sergey Ioffe¹, Christian Szegedy¹•Institutions (1)

Google¹

11 Feb 2015-arXiv: Learning

TL;DR: Batch Normalization as mentioned in this paper normalizes layer inputs for each training mini-batch to reduce the internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.

...read moreread less

Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization. It also acts as a regularizer, in some cases eliminating the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.9% top-5 validation error (and 4.8% test error), exceeding the accuracy of human raters.

...read moreread less

17,184 citations

Posted Content•

Semi-Supervised Classification with Graph Convolutional Networks

[...]

Thomas Kipf¹, Max Welling¹•Institutions (1)

University of Amsterdam¹

09 Sep 2016-arXiv: Learning

TL;DR: A scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs which outperforms related methods by a significant margin.

...read moreread less

Abstract: We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes. In a number of experiments on citation networks and on a knowledge graph dataset we demonstrate that our approach outperforms related methods by a significant margin.

...read moreread less

15,696 citations

Proceedings Article•DOI•

XGBoost: A Scalable Tree Boosting System

[...]

Tianqi Chen¹, Carlos Guestrin¹•Institutions (1)

University of Washington¹

09 Mar 2016-arXiv: Learning

TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.

...read moreread less

Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

...read moreread less

13,333 citations

Collapse

Network Information

Related Journals (5)

Journal of Machine Learning Research

3.2K papers, 591K citations

50K papers, 1.1M citations

90% related

arXiv: Computation and Language

24.8K papers, 481.5K citations

7.6K papers, 1.6M citations

86% related

IEEE Transactions on Neural Networks

6.7K papers, 630.8K citations

84% related

Performance

Metrics

45,050

Papers

1,319,128

Citations

No. of papers from the Journal in previous years
Year	Papers
2022	3
2021	12,564
2020	10,847
2019	8,886
2018	4,727
2017	2,309