
Minimizing Description Length in an Unsupervised Neural Network

TLDR
The recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution; this approximation corresponds to using a suboptimal encoding scheme and therefore gives an upper bound on the minimum description length.
Abstract
An autoencoder network uses a set of recognition weights to convert an input vector into a representation vector. It then uses a set of generative weights to convert the representation vector into an approximate reconstruction of the input vector. We derive an objective function for training autoencoders based on the Minimum Description Length (MDL) principle. The aim is to minimize the information required to describe both the representation vector and the reconstruction error. This information is minimized by choosing representation vectors stochastically according to a Boltzmann distribution. Unfortunately, if the representation vectors use distributed representations, it is exponentially expensive to compute this Boltzmann distribution because it involves all possible representation vectors. We show that the recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution. This approximation corresponds to using a suboptimal encoding scheme and therefore gives an upper bound on the minimal description length. Even when this bound is poor, it can be used as a Lyapunov function for learning both the generative and the recognition weights. We demonstrate that this approach can be used to learn distributed representations in which many different hidden causes combine to produce each observed data vector. Such representations can be exponentially more efficient in their use of hardware than standard vector quantization or mixture models.
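The argument can be made concrete with a small numerical sketch. The snippet below is purely illustrative and not the paper's code: the cost arrays, the helper description_length, and the recognition distribution q_recognition are all assumed values chosen for the example. It shows the quantity being minimized (expected cost of coding the representation plus the reconstruction error, minus the entropy of the stochastic choice), that the Boltzmann distribution over total costs attains the minimum, and that any other distribution gives an upper bound.

# Illustrative sketch only (not from the paper): expected description length,
# in nats, of one input given a small discrete set of candidate representation
# vectors. C_rep[i] is the assumed cost of communicating representation i,
# C_err[i] the assumed cost of communicating the reconstruction error when
# representation i is used. Because representations are chosen stochastically,
# the entropy of the choice can be reclaimed, so the quantity to minimize is
# expected cost minus entropy.

import numpy as np

def description_length(q, C_rep, C_err, eps=1e-12):
    """Expected (representation + error) cost minus the entropy of q."""
    expected_cost = np.sum(q * (C_rep + C_err))
    entropy = -np.sum(q * np.log(q + eps))
    return expected_cost - entropy

C_rep = np.array([1.0, 2.0, 2.0, 3.0])  # assumed representation costs
C_err = np.array([4.0, 1.5, 2.5, 0.5])  # assumed reconstruction-error costs

# The Boltzmann distribution over total costs minimizes the description length.
total = C_rep + C_err
p_boltzmann = np.exp(-total) / np.sum(np.exp(-total))

# Any other distribution, e.g. one computed by recognition weights, corresponds
# to a suboptimal encoding and therefore gives an upper bound.
q_recognition = np.array([0.1, 0.5, 0.3, 0.1])

print(description_length(p_boltzmann, C_rep, C_err))    # minimum
print(description_length(q_recognition, C_rep, C_err))  # upper bound (>= minimum)

In the paper's terms, the recognition weights parameterize the distribution q, the generative weights determine the coding costs, and the gap between the two printed values is the price paid for the suboptimal encoding; it is this bound that serves as the Lyapunov function for learning.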

Citations
Proceedings ArticleDOI

Autoencoder-based feature learning for cyber security applications

TL;DR: It is shown how well the AE is capable of automatically learning a reasonable notion of semantic similarity among input features, and how the scheme can reduce the dimensionality of the features, thereby significantly minimising the memory requirements.
Journal ArticleDOI

Application of deep learning to cybersecurity: A survey

TL;DR: This survey focuses on recent DL approaches that have been proposed in the area of cybersecurity, namely intrusion detection, malware detection, phishing/spam detection, and website defacement detection.
Journal ArticleDOI

Text summarization using unsupervised deep learning

TL;DR: Experiments show that the AE using local vocabularies clearly provides a more discriminative feature space and improves recall by 11.2% on average, and that the ENAE can make further improvements, particularly in selecting informative sentences.
Journal ArticleDOI

Nonlinear Information Bottleneck

TL;DR: In this paper, a non-parametric upper bound for mutual information is proposed to find the optimal bottleneck variable for arbitrarily distributed discrete and/or continuous random variables X and Y with a Gaussian joint distribution.
Posted Content

Nonlinear Information Bottleneck.

TL;DR: This work proposes a method for performing IB on arbitrarily distributed discrete and/or continuous X and Y while allowing for nonlinear encoding and decoding maps; it achieves better performance than the recently proposed “variational IB” method on several real-world datasets.
References

A minimum description length framework for unsupervised learning

TL;DR: This thesis presents a general framework for describing unsupervised learning procedures based on the Minimum Description Length (MDL) principle, and describes three new learning algorithms derived from that framework.