Categorical Foundations of Gradient-Based Learning

Open Access, Posted Content

G. S. H. Cruttwell, +4 more

- 02 Mar 2021 -

arXiv: Learning

TL;DR

In this article, a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories is proposed, which encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, shedding new light on their similarities and differences.

Abstract:

We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach to gradient-based learning has examples generalising beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realized in the discrete setting of boolean circuits. Finally, we demonstrate the practical significance of our framework with an implementation in Python.
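To make the lens idea in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the names `Lens`, `then`, and `gradient_step` are hypothetical, the maps are scalar functions chosen for illustration, and the parametrised-map construction from the paper is left out. The sketch only shows how a forward map paired with a reverse-derivative map composes, and how a plain gradient descent update can read off the backward pass.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Lens:
    """Hypothetical lens: a forward map paired with a reverse-derivative map."""
    forward: Callable[[float], float]          # a -> b
    backward: Callable[[float, float], float]  # (a, change in b) -> change in a

    def then(self, other: "Lens") -> "Lens":
        """Sequential composition: run self first, then other."""
        def fwd(a: float) -> float:
            return other.forward(self.forward(a))

        def bwd(a: float, dc: float) -> float:
            b = self.forward(a)           # recompute the intermediate value
            db = other.backward(b, dc)    # pull the change back through other
            return self.backward(a, db)   # then back through self (chain rule)

        return Lens(fwd, bwd)


# Two smooth maps, each paired with its reverse derivative.
square = Lens(lambda x: x * x, lambda x, dy: 2.0 * x * dy)
triple = Lens(lambda x: 3.0 * x, lambda x, dy: 3.0 * dy)

# Composite map x -> 3 * x**2; its backward pass is assembled automatically.
model = square.then(triple)


def gradient_step(lens: Lens, x: float, lr: float = 0.1) -> float:
    """One plain gradient descent update, seeding the backward pass with 1.0."""
    return x - lr * lens.backward(x, 1.0)
```

For example, `model.forward(2.0)` evaluates 3 * 2**2 and `model.backward(2.0, 1.0)` evaluates the derivative 6 * x at x = 2. In this sketch, optimisers such as those named in the abstract would replace `gradient_step` with a different update rule while the lens composition stays unchanged.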

Citations

Journal Article

Diagrammatic Differentiation for Quantum Machine Learning

Alexis Toumi, +2 more

- 14 Mar 2021 -

arXiv: Quantum Physics

TL;DR: In this article, the authors introduce diagrammatic differentiation for tensor calculus by generalising the dual number construction from rigs to monoidal categories, and apply this to ZX diagrams, showing how to calculate diagrammatically the gradient of a linear map with respect to a phase parameter.

Posted Content

Quantum Information Effects

Chris Heunen, +1 more

- 26 Jul 2021 -

arXiv: Quantum Physics

TL;DR: This work studies two dual quantum information effects that manipulate the amount of information in quantum computation, hiding and allocation, and provides universal categorical constructions that semantically interpret the resulting arrow metalanguage with choice.

Posted Content

Categorical composable cryptography.

Anne Broadbent, +1 more

- 12 May 2021 -

arXiv: Cryptography and Security

TL;DR: In this article, the authors formalize the simulation paradigm of cryptography in terms of category theory and show that protocols secure against abstract attacks form a symmetric monoidal category, thus giving an abstract model of composable security definitions in cryptography.

Posted Content

Category Theory in Machine Learning.

Dan Shiebler, +2 more

- 13 Jun 2021 -

arXiv: Learning

TL;DR: In this paper, the authors document the motivations, goals, and common themes across applications of category theory to machine learning, touching on gradient-based learning, probability, and equivariant learning.

Book Chapter

Categorical Artificial Intelligence: The Integration of Symbolic and Statistical AI for Verifiable, Ethical, and Trustworthy AI

Yoshihiro Maruyama

References

Proceedings Article

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

Journal Article

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.

Journal Article

Generative Adversarial Nets

Ian Goodfellow, +7 more

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

Journal Article

Learning representations by back-propagating errors

David E. Rumelhart, +2 more

- 01 Jan 1988 -

Nature

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.

Journal Article

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

John C. Duchi, +2 more

- 01 Feb 2011 -

Journal of Machine Learning Research

TL;DR: This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal functions that can be chosen in hindsight.
