Open Access
Minimizing Description Length in an Unsupervised Neural Network
TLDR
The recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution; this approximation corresponds to using a suboptimal encoding scheme and therefore gives an upper bound on the minimal description length.
Abstract
An autoencoder network uses a set of recognition weights to convert an input vector into a representation vector. It then uses a set of generative weights to convert the representation vector into an approximate reconstruction of the input vector. We derive an objective function for training autoencoders based on the Minimum Description Length (MDL) principle. The aim is to minimize the information required to describe both the representation vector and the reconstruction error. This information is minimized by choosing representation vectors stochastically according to a Boltzmann distribution. Unfortunately, if the representation vectors use distributed representations, it is exponentially expensive to compute this Boltzmann distribution because it involves all possible representation vectors. We show that the recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution. This approximation corresponds to using a suboptimal encoding scheme and therefore gives an upper bound on the minimal description length. Even when this bound is poor, it can be used as a Lyapunov function for learning both the generative and the recognition weights. We demonstrate that this approach can be used to learn distributed representations in which many different hidden causes combine to produce each observed data vector. Such representations can be exponentially more efficient in their use of hardware than standard vector quantization or mixture models.
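The bound described in the abstract can be illustrated on a toy problem small enough to enumerate. In the sketch below (all names, sizes, and the per-unit code cost are illustrative assumptions, not taken from the paper), each binary code has an energy equal to its description length in nats; the Boltzmann distribution over all codes attains the minimal expected description length, while any factorial distribution of the kind a recognition network produces gives a free energy that upper-bounds it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: binary codes h of length K, generative weights G
# reconstruct x, and the description length of a code h (in nats) is
# E(h) = code_cost(h) + reconstruction_cost(x, h).
K, D = 4, 6
G = rng.normal(scale=0.5, size=(K, D))   # generative weights
x = rng.normal(size=D)                   # one observed data vector
sigma2 = 1.0                             # assumed reconstruction noise variance

# All 2^K possible binary representation vectors.
codes = np.array([[(i >> k) & 1 for k in range(K)] for i in range(2 ** K)])

def energy(h):
    # Illustrative costs: a fixed price per active unit plus squared
    # reconstruction error under a Gaussian noise model.
    recon = h @ G
    return 0.7 * h.sum() + 0.5 * np.sum((x - recon) ** 2) / sigma2

E = np.array([energy(h) for h in codes])

# Optimal stochastic coding: the Boltzmann distribution over all codes.
# Its free energy, -log Z, is the minimal expected description length.
Z = np.exp(-E).sum()
F_min = -np.log(Z)

# Factorial approximation, as a recognition net would produce: unit k is
# active independently with probability q[k]. Free energy F(q) = E_q[E] - H(q).
def free_energy(q):
    probs = np.prod(codes * q + (1 - codes) * (1 - q), axis=1)
    entropy = -np.sum(probs * np.log(probs + 1e-12))
    return probs @ E - entropy

q = np.full(K, 0.3)        # an arbitrary factorial recognition distribution
F_approx = free_energy(q)

print(F_approx >= F_min)   # the suboptimal encoding never beats the bound
```

Minimizing `F_approx` with respect to both the recognition probabilities and the generative weights is what lets the bound serve as a Lyapunov function for learning, even when the factorial family cannot represent the true Boltzmann distribution.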
Citations
Proceedings ArticleDOI
Autoencoder-based feature learning for cyber security applications
TL;DR: It is shown how well the AE is capable of automatically learning a reasonable notion of semantic similarity among input features, and how the scheme can reduce the dimensionality of the features, thereby significantly minimising the memory requirements.
Journal ArticleDOI
Application of deep learning to cybersecurity: A survey
TL;DR: This survey focuses on recent DL approaches that have been proposed in the area of cybersecurity, namely intrusion detection, malware detection, phishing/spam detection, and website defacement detection.
Journal ArticleDOI
Text summarization using unsupervised deep learning
Mahmood Yousefi-Azar, Len Hamey +1 more
TL;DR: Experiments show that the AE using local vocabularies clearly provides a more discriminative feature space and improves recall by 11.2% on average, and that the ENAE can make further improvements, particularly in selecting informative sentences.
Journal ArticleDOI
Nonlinear Information Bottleneck
TL;DR: In this paper, a non-parametric upper bound on mutual information is proposed to find the optimal bottleneck variable for arbitrarily distributed discrete and/or continuous random variables X and Y with a Gaussian joint distribution.
Posted Content
Nonlinear Information Bottleneck.
TL;DR: This work proposes a method for performing IB on arbitrarily-distributed discrete and/or continuous X and Y, while allowing for nonlinear encoding and decoding maps, that achieves better performance than the recently-proposed “variational IB” method on several real-world datasets.
References
Journal Article
A new view of the EM algorithm that justifies incremental and other variants
A minimum description length framework for unsupervised learning
TL;DR: This thesis presents a general framework for describing unsupervised learning procedures based on the Minimum Description Length (MDL) principle, and describes three new learning algorithms derived in this manner from the MDL framework.
Related Papers (5)
Evolutionary Learning Algorithm for Projection Neural Networks
Min Woong Hwang, Jin-Young Choi +1 more