Open Access Proceedings Article
Mutual Information Neural Estimation.
Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeshwar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, Devon Hjelm
- pp 531-540
TLDR
A Mutual Information Neural Estimator (MINE) is presented that is linearly scalable in dimensionality as well as in sample size, trainable through back-prop, and strongly consistent, and applied to improve adversarially trained generative models.
Abstract:
We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks. We present a Mutual Information Neural Estimator (MINE) that is linearly scalable in dimensionality as well as in sample size, trainable through back-prop, and strongly consistent. We present a handful of applications on which MINE can be used to minimize or maximize mutual information. We apply MINE to improve adversarially trained generative models. We also use MINE to implement the Information Bottleneck, applying it to supervised classification; our results demonstrate substantial improvement in flexibility and performance in these settings.
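The abstract does not spell out the objective, so the following is only a minimal sketch of the kind of quantity such an estimator maximizes. It assumes the Donsker-Varadhan lower bound, I(X;Z) >= E_joint[T] - log E_marginals[exp(T)], which the full MINE paper builds on. To stay dependency-free it does not train a neural network: the critic T is swapped for the analytic log density ratio of a toy bivariate Gaussian, for which the true mutual information is known in closed form. All function names here (`sample_joint`, `log_density_ratio`, `dv_lower_bound`, `true_gaussian_mi`) are hypothetical and are not taken from the authors' code.

```python
import math
import random


def true_gaussian_mi(rho):
    """Closed-form MI (in nats) of a standard bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho ** 2)


def sample_joint(n, rho, seed=0):
    """Draw n paired samples (x, z) from a standard bivariate Gaussian."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        z = rho * x + math.sqrt(1.0 - rho ** 2) * noise
        pairs.append((x, z))
    return pairs


def log_density_ratio(x, z, rho):
    """log p(x, z) / (p(x) p(z)); stands in for a trained critic network."""
    one_minus = 1.0 - rho ** 2
    quad = (x * x - 2.0 * rho * x * z + z * z) / (2.0 * one_minus)
    return -0.5 * math.log(one_minus) - quad + 0.5 * (x * x + z * z)


def dv_lower_bound(critic, pairs, seed=0):
    """Donsker-Varadhan bound: mean critic on joint samples minus
    log-mean-exp of the critic on (approximate) product-of-marginals samples."""
    rng = random.Random(seed)
    xs = [x for x, _ in pairs]
    zs = [z for _, z in pairs]
    rng.shuffle(zs)  # shuffling z breaks the pairing with x
    joint_term = sum(critic(x, z) for x, z in pairs) / len(pairs)
    mean_exp = sum(math.exp(critic(x, z)) for x, z in zip(xs, zs)) / len(xs)
    return joint_term - math.log(mean_exp)


if __name__ == "__main__":
    rho = 0.5
    pairs = sample_joint(20000, rho, seed=1)
    estimate = dv_lower_bound(lambda x, z: log_density_ratio(x, z, rho), pairs, seed=2)
    print("DV estimate:", estimate, "closed form:", true_gaussian_mi(rho))
```

In a MINE-style estimator the fixed critic above would be replaced by a neural network whose parameters are updated by gradient ascent on this same bound; a constant critic gives a bound of zero, which is why training is needed to tighten it.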
Citations
Journal Article
Study Report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”
Proceedings Article
On Variational Bounds of Mutual Information
TL;DR: In this article, a continuum of variational lower bounds for estimating and optimizing mutual information (MI) in high dimensions is presented, although the tradeoffs between these lower bounds remain unclear.
Journal Article
Deep Learning Enabled Semantic Communication Systems
TL;DR: In this paper, a deep learning based semantic communication system named DeepSC is proposed for text transmission. Based on the Transformer, it aims at maximizing the system capacity and minimizing the semantic errors by recovering the meaning of sentences, rather than the bit or symbol errors targeted in traditional communications.
Proceedings Article
Graph Representation Learning via Graphical Mutual Information Maximization
TL;DR: An unsupervised learning model trained by maximizing graphical mutual information (GMI) between the input and output of a graph neural encoder is developed, which outperforms state-of-the-art unsupervised counterparts and sometimes even exceeds the performance of supervised ones.
Proceedings Article
Multi-Task Self-Supervised Learning for Robust Speech Recognition
Mirco Ravanelli, Jianyuan Zhong, Santiago Pascual, Pawel Swietojanski, Joao Monteiro, Jan Trmal, Yoshua Bengio
TL;DR: PASE+ is proposed, an improved version of PASE that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks and learns transferable representations suitable for highly mismatched acoustic conditions.
References
Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Journal Article
Image quality assessment: from error visibility to structural similarity
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and it is compared against both subjective ratings and objective methods on a database of images compressed with JPEG and JPEG2000.
Journal Article
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
TL;DR: A new framework for estimating generative models via an adversarial process is proposed, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Journal Article
Multilayer feedforward networks are universal approximators
TL;DR: It is rigorously established that standard multilayer feedforward networks with as few as one hidden layer using arbitrary squashing functions are capable of approximating any Borel measurable function from one finite dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available.