Institution

University of Amsterdam

Education · Amsterdam, Noord-Holland, Netherlands
About: University of Amsterdam is an education organization based in Amsterdam, Noord-Holland, Netherlands. It is known for research contributions in the topics: Population & Randomized controlled trial. The organization has 59309 authors who have published 140894 publications receiving 5984137 citations. The organization is also known as: UvA & Universiteit van Amsterdam.


Papers
Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.

78,539 citations
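
The update rule described in the abstract above fits in a few lines. Below is a minimal NumPy sketch of one Adam step, assuming the default hyper-parameters suggested in the paper (alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8); the function name and the toy quadratic objective in the usage example are illustrative, not taken from the paper's code.

import numpy as np

def adam_update(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its element-wise square
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias correction (t counts from 1)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Illustrative usage: minimize ||theta||^2 from noisy gradients.
theta = np.ones(3)
m, v = np.zeros(3), np.zeros(3)
for t in range(1, 1001):
    grad = 2 * theta + 0.01 * np.random.randn(3)
    theta, m, v = adam_update(theta, grad, m, v, t)
print(theta)   # close to the zero vector

Because m_hat and sqrt(v_hat) both scale linearly under a per-coordinate rescaling of the gradient, their ratio (up to the eps term) is unchanged, which is the sense in which the method is invariant to diagonal rescaling of the gradients, as the abstract notes.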

Posted Content
TL;DR: In this article, Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions based on adaptive estimates of lower-order moments, is introduced.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.

23,369 citations

Proceedings Article
01 Jan 2014
TL;DR: A stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case is introduced.
Abstract: How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets? We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case. Our contribution is two-fold. First, we show that a reparameterization of the variational lower bound yields a lower bound estimator that can be straightforwardly optimized using standard stochastic gradient methods. Second, we show that for i.i.d. datasets with continuous latent variables per datapoint, posterior inference can be made especially efficient by fitting an approximate inference model (also called a recognition model) to the intractable posterior using the proposed lower bound estimator. Theoretical advantages are reflected in experimental results.

14,546 citations
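
The reparameterization mentioned in the abstract above is the key mechanical step: a sample from the approximate posterior is rewritten as a deterministic function of its parameters and an auxiliary noise variable, so the lower bound can be optimized with standard stochastic gradient methods. Below is a minimal NumPy sketch, assuming a diagonal-Gaussian posterior and a standard-normal prior; the numbers standing in for encoder outputs are made up for illustration.

import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    # z = mu + sigma * eps with eps ~ N(0, I): the randomness is isolated in eps,
    # so gradients can flow through mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def gaussian_kl(mu, log_var):
    # Closed-form KL(q(z|x) || N(0, I)) for a diagonal-Gaussian posterior,
    # the regularization term of the variational lower bound.
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))

# Hypothetical encoder (recognition model) output for one datapoint.
mu = np.array([0.3, -0.1])
log_var = np.array([-1.0, -0.5])
z = reparameterize(mu, log_var)   # z would be fed to the decoder for the reconstruction term
print(z, gaussian_kl(mu, log_var))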

Journal ArticleDOI
Georges Aad, T. Abajyan, Brad Abbott, Jalal Abdallah +2964 more · Institutions (200)
TL;DR: In this article, a search for the Standard Model Higgs boson in proton-proton collisions with the ATLAS detector at the LHC is presented; the observed signal has a significance of 5.9 standard deviations, corresponding to a background fluctuation probability of 1.7×10−9.
Abstract: A search for the Standard Model Higgs boson in proton–proton collisions with the ATLAS detector at the LHC is presented. The datasets used correspond to integrated luminosities of approximately 4.8 fb−1 collected at √s = 7 TeV in 2011 and 5.8 fb−1 at √s = 8 TeV in 2012. Individual searches in the channels H→ZZ(⁎)→4l, H→γγ and H→WW(⁎)→eνμν in the 8 TeV data are combined with previously published results of searches for H→ZZ(⁎), WW(⁎), bb̄ and τ+τ− in the 7 TeV data and results from improved analyses of the H→ZZ(⁎)→4l and H→γγ channels in the 7 TeV data. Clear evidence for the production of a neutral boson with a measured mass of 126.0 ± 0.4 (stat) ± 0.4 (sys) GeV is presented. This observation, which has a significance of 5.9 standard deviations, corresponding to a background fluctuation probability of 1.7×10−9, is compatible with the production and decay of the Standard Model Higgs boson.

8,774 citations
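
The 5.9 standard deviations and the 1.7×10−9 background fluctuation probability quoted above are two readings of the same result; assuming the usual one-sided Gaussian-tail convention for local significance, the conversion is a one-liner (SciPy is used here only for the normal tail function).

from scipy.stats import norm

print(norm.sf(5.9))       # ~1.8e-9: one-sided tail probability of a 5.9 sigma excess
print(norm.isf(1.7e-9))   # ~5.91: the quoted 1.7e-9 corresponds to the unrounded significance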

Posted Content
TL;DR: A scalable approach for semi-supervised learning on graph-structured data is presented, based on an efficient variant of convolutional neural networks that operates directly on graphs and outperforms related methods by a significant margin.
Abstract: We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes. In a number of experiments on citation networks and on a knowledge graph dataset we demonstrate that our approach outperforms related methods by a significant margin.

8,285 citations
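
The "efficient variant of convolutional neural networks which operate directly on graphs" in the abstract above comes down to a simple per-layer propagation rule: multiply the node features by a symmetrically normalized adjacency matrix with self-loops, then apply a learned linear map and a nonlinearity. Below is a minimal NumPy sketch of one such layer; the toy graph, feature sizes, and random weights are illustrative only.

import numpy as np

def gcn_layer(A, H, W):
    # A: adjacency matrix (n x n), H: node features (n x f_in), W: weights (f_in x f_out)
    A_hat = A + np.eye(A.shape[0])                               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))                # inverse sqrt of node degrees
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # D^-1/2 (A + I) D^-1/2
    return np.maximum(A_norm @ H @ W, 0.0)                       # linear transform + ReLU

# Illustrative 3-node path graph with 2 input features and 4 hidden units.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.random.randn(3, 2)
W = np.random.randn(2, 4)
print(gcn_layer(A, H, W).shape)   # (3, 4): a 4-dimensional hidden representation per node

The linear scaling in the number of edges claimed in the abstract comes from computing the A_norm @ H product as a sparse multiplication; it is written densely here only for clarity.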


Authors


Name                      H-index   Papers   Citations
Richard A. Flavell        231       1328     205119
Scott M. Grundy           187       841      231821
Stuart H. Orkin           186       715      112182
Kenneth C. Anderson       178       1138     126072
David A. Weitz            178       1038     114182
Dorret I. Boomsma         176       1507     136353
Brenda W.J.H. Penninx     170       1139     119082
Michael Kramer            167       1713     127224
Nicholas J. White         161       1352     104539
Lex M. Bouter             158       767      103034
Wolfgang Wagner           156       2342     123391
Jerome I. Rotter          156       1071     116296
David Cella               156       1258     106402
David Eisenberg           156       697      112460
Naveed Sattar             155       1326     116368
Network Information
Related Institutions (5)
University College London
210.6K papers, 9.8M citations

94% related

University of Edinburgh
151.6K papers, 6.6M citations

94% related

University of Pennsylvania
257.6K papers, 14.1M citations

94% related

Columbia University
224K papers, 12.8M citations

94% related

University of Pittsburgh
201K papers, 9.6M citations

94% related

Performance
Metrics
No. of papers from the Institution in previous years
Year    Papers
2022    91
2021    9,647
2020    8,534
2019    7,823
2018    6,407
2017    6,387