Open Access · Proceedings Article
Meta-learning Symmetries by Reparameterization
TL;DR: This paper presents a method for learning and encoding equivariances into networks by learning the corresponding parameter-sharing patterns from data; the method can provably represent equivariance-inducing parameter sharing for any finite group of symmetry transformations.
Abstract:
Many successful deep learning architectures are equivariant to certain transformations in order to conserve parameters and improve generalization: most famously, convolution layers are equivariant to shifts of the input. This approach only works when practitioners know the symmetries of the task and can manually construct an architecture with the corresponding equivariances. Our goal is an approach for learning equivariances from data, without needing to design custom task-specific architectures. We present a method for learning and encoding equivariances into networks by learning corresponding parameter sharing patterns from data. Our method can provably represent equivariance-inducing parameter sharing for any finite group of symmetry transformations. Our experiments suggest that it can automatically learn to encode equivariances to common transformations used in image processing tasks.
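The link between parameter sharing and equivariance that the abstract describes can be illustrated with the simplest finite symmetry group, cyclic shifts. The sketch below is a hypothetical illustration, not the authors' implementation: a circulant weight matrix shares one parameter vector across all rows (each row a cyclic shift of the previous), and this sharing pattern makes the linear layer equivariant to shifts of its input — exactly the kind of structure the paper proposes to learn rather than hand-design.

```python
def circulant(v):
    # Build a weight matrix whose rows are cyclic shifts of one shared
    # parameter vector v: W[i][j] = v[(j - i) mod n].
    n = len(v)
    return [[v[(j - i) % n] for j in range(n)] for i in range(n)]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def shift(x, k=1):
    # Cyclic shift of a vector by k positions.
    return x[-k:] + x[:-k]

v = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5]   # 6 shared parameters instead of 36
W = circulant(v)
x = [1.0, 2.0, -1.0, 0.5, 0.0, 3.0]

# Equivariance check: applying the layer and then shifting gives the same
# result as shifting the input and then applying the layer.
lhs = shift(matvec(W, x))
rhs = matvec(W, shift(x))
print(all(abs(a - b) < 1e-9 for a, b in zip(lhs, rhs)))  # True
```

An unconstrained 6x6 layer would need 36 parameters and would not be shift-equivariant; the sharing pattern both conserves parameters and enforces the symmetry, mirroring the trade-off the abstract attributes to convolutions.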
Citations
Posted Content
Known Operator Learning and Hybrid Machine Learning in Medical Imaging - A Review of the Past, the Present, and the Future.
TL;DR: This article reviews the state of the art of hybrid machine learning in medical imaging, with a particular focus on known operator learning and on how hybrid approaches are gaining momentum across essentially all applications in medical imaging and medical image analysis.
Book ChapterDOI
Meta-learning of Pooling Layers for Character Recognition
TL;DR: In this paper, a meta-learning framework for pooling layers is proposed, in which the kernel shape and pooling operation are trainable using two parameters, thereby allowing flexible pooling of the input data.
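A pooling layer with trainable parameters, as summarized above, can be sketched in miniature. The example below is a hedged illustration and not that paper's actual method: a single scalar `beta` (a hypothetical parameter name) interpolates between average pooling (`beta = 0`) and max pooling (`beta` large) via a softmax weighting, showing how a pooling operation itself can become a learnable quantity.

```python
import math

def soft_pool(xs, beta):
    # Softmax-weighted pooling: beta = 0 recovers the mean of xs,
    # large beta approaches the max. Subtracting the largest logit
    # keeps the exponentials numerically stable.
    m = max(beta * x for x in xs)
    ws = [math.exp(beta * x - m) for x in xs]
    z = sum(ws)
    return sum(w * x for w, x in zip(ws, xs)) / z

xs = [1.0, 3.0, 2.0]
print(round(soft_pool(xs, 0.0), 3))   # 2.0 (average pooling)
print(round(soft_pool(xs, 50.0), 3))  # 3.0 (approaches max pooling)
```

Because `beta` is a continuous parameter, it can be optimized by gradient descent alongside the rest of the network, which is the general flavor of making pooling trainable.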
Posted Content
Neural Networks for Learning Counterfactual G-Invariances from Single Environments
S. Chandra Mouli, Bruno Ribeiro, et al.
TL;DR: The authors propose a learning framework counterfactually guided by the hypothesis that invariance to known transformation groups is mandatory even without supporting evidence, unless the learner deems it inconsistent with the training data.
Journal ArticleDOI
A Nested Bi-level Optimization Framework for Robust Few Shot Learning
TL;DR: NestedMAML as discussed by the authors is a novel robust meta-learning algorithm that learns to assign weights to training tasks or instances by treating the weights as hyper-parameters and iteratively optimizing them on a small set of validation tasks in a nested bi-level optimization approach.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
Journal ArticleDOI
Gradient-based learning applied to document recognition
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition; trained with gradient-based learning, it can synthesize a complex decision surface that classifies high-dimensional patterns such as handwritten characters.
Journal ArticleDOI
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei, et al.
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Posted Content
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby
TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.