
Showing papers on "Autoencoder" published in 2000


Book Chapter
TL;DR: This paper looks at a recognition-based approach whose accuracy in such environments is superior to that obtained via more conventional mechanisms and suggests a simple and more robust alternative to commonly used classification mechanisms.
Abstract: Though impressive classification accuracy is often obtained via discrimination-based learning techniques such as Multi-Layer Perceptrons (DMLP), these techniques often assume that the underlying training sets are optimally balanced (in terms of the number of positive and negative examples). Unfortunately, this is not always the case. In this paper, we look at a recognition-based approach whose accuracy in such environments is superior to that obtained via more conventional mechanisms. At the heart of the new technique is a modified autoencoder that allows for the incorporation of a recognition component into the conventional MLP mechanism. In short, rather than being associated with an output value of "1", positive examples are fully reconstructed at the network output layer, while negative examples, rather than being associated with an output value of "0", have their inverse derived at the output layer. The result is an autoencoder that recognizes positive examples while discriminating against negative ones, because negative cases generate larger reconstruction errors. A simple technique is employed to exaggerate the impact of training with these negative examples so that reconstruction errors can be established more reliably. Preliminary testing on both seismic and sonar data sets has demonstrated that the new method produces lower error rates than standard connectionist systems in imbalanced settings. Our approach thus suggests a simple and more robust alternative to commonly used classification mechanisms.
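A minimal numpy sketch of the idea, to make the training targets concrete: positives are reconstructed as-is, negatives are mapped to their inverse (taken here as 1 - x), and classification falls out of the reconstruction error. The layer sizes, the sigmoid/squared-error setup, and the `neg_weight` factor that exaggerates negative examples are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class InversionAutoencoder:
    """Sketch of the modified autoencoder described above: positives are
    trained to reconstruct themselves, negatives to produce their inverse,
    so negatives accumulate large reconstruction errors at test time."""

    def __init__(self, n_in, n_hidden, lr=0.1):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.lr = lr

    def forward(self, x):
        h = sigmoid(x @ self.W1)          # representation
        return h, sigmoid(h @ self.W2)    # reconstruction

    def train_step(self, x, positive, neg_weight=5.0):
        # Target is the example itself for positives, its inverse for
        # negatives; neg_weight stands in for the paper's trick of
        # exaggerating negative examples (the value 5.0 is assumed).
        target = x if positive else 1.0 - x
        w = 1.0 if positive else neg_weight
        h, y = self.forward(x)
        err = (y - target) * y * (1.0 - y)        # output delta, squared error
        dh = (err @ self.W2.T) * h * (1.0 - h)    # hidden delta
        self.W2 -= self.lr * w * np.outer(h, err)
        self.W1 -= self.lr * w * np.outer(x, dh)

    def score(self, x):
        # Reconstruction error against the input: low for positives,
        # high for negatives (which were pushed toward their inverse).
        _, y = self.forward(x)
        return np.mean((y - x) ** 2)
```

At test time, an example would be labeled positive when its reconstruction error falls below a threshold chosen on held-out data; negatives, having been trained toward their inverse, reconstruct poorly and score high.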

12 citations


01 Jan 2000
TL;DR: The recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution and this approximation corresponds to using a suboptimal encoding scheme and therefore gives an upper bound on the minimal description length.
Abstract: An autoencoder network uses a set of recognition weights to convert an input vector into a representation vector. It then uses a set of generative weights to convert the representation vector into an approximate reconstruction of the input vector. We derive an objective function for training autoencoders based on the Minimum Description Length (MDL) principle. The aim is to minimize the information required to describe both the representation vector and the reconstruction error. This information is minimized by choosing representation vectors stochastically according to a Boltzmann distribution. Unfortunately, if the representation vectors use distributed representations, it is exponentially expensive to compute this Boltzmann distribution because it involves all possible representation vectors. We show that the recognition weights of an autoencoder can be used to compute an approximation to the Boltzmann distribution. This approximation corresponds to using a suboptimal encoding scheme and therefore gives an upper bound on the minimal description length. Even when this bound is poor, it can be used as a Lyapunov function for learning both the generative and the recognition weights. We demonstrate that this approach can be used to learn distributed representations in which many different hidden causes combine to produce each observed data vector. Such representations can be exponentially more efficient in their use of hardware than standard vector quantization or mixture models.
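One compact way to state the bound (the notation here is assumed, not taken verbatim from the paper): write E(α) for the total description length incurred by choosing representation vector α, i.e. the cost of communicating α plus the resulting reconstruction error, and let Q be the distribution over representation vectors computed by the recognition weights. Then

```latex
% Free-energy form of the description-length bound (notation assumed).
\[
  F(Q) \;=\; \sum_{\alpha} Q(\alpha)\,E(\alpha)
        \;+\; \sum_{\alpha} Q(\alpha)\,\log Q(\alpha)
  \;\ge\; -\log \sum_{\alpha} e^{-E(\alpha)},
\]
% with equality iff Q is the Boltzmann distribution
% Q^{*}(\alpha) = e^{-E(\alpha)} \big/ \sum_{\beta} e^{-E(\beta)}.
```

Any tractable Q therefore yields an upper bound on the minimal description length, and because gradient steps on either the generative or the recognition weights can only lower F(Q), it can serve as the Lyapunov function mentioned in the abstract.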

10 citations


Proceedings Article
01 Jan 2000
TL;DR: A viewpoint-invariant face recognition method in which several viewpoint-dependent classifiers are combined by a gating network that can be self-organized such that one of the classifiers is selected depending on the viewpoint of a given input face image.
Abstract: This paper proposes a viewpoint-invariant face recognition method in which several viewpoint-dependent classifiers are combined by a gating network. The gating network is designed as an autoencoder with competitive hidden units. Viewpoint-dependent representations of faces can be obtained by this autoencoder from many faces with different views. A multinomial logit model is used for the viewpoint-dependent classifiers. By combining the classifiers with the gating network, the network can be self-organized such that one of the classifiers is selected depending on the viewpoint of a given input face image. Experimental results of view-invariant face recognition are shown using face images captured from different viewpoints.
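The combination rule is the standard mixture-of-experts form, sketched below. The plain softmax gate here is a stand-in for the autoencoder with competitive hidden units described in the abstract, and all shapes and counts (3 viewpoints, 5 identities, 20 features) are invented for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def combined_posterior(x, gate_W, expert_Ws):
    """p(class | x) = sum_k g_k(x) * p_k(class | x).

    gate_W:    (n_features, n_views) gating weights; a softmax layer
               standing in for the autoencoder's competitive hidden units.
    expert_Ws: list of (n_features, n_classes) weight matrices, one
               multinomial-logit classifier per viewpoint.
    """
    g = softmax(x @ gate_W)                        # per-view responsibilities
    experts = [softmax(x @ W) for W in expert_Ws]  # per-view class posteriors
    return sum(gk * pk for gk, pk in zip(g, experts))

# Usage: the gate softly selects the classifier matching the input's
# viewpoint; with a near-one-hot g, exactly one expert dominates.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
gate_W = rng.normal(size=(20, 3))                          # 3 viewpoints
expert_Ws = [rng.normal(size=(20, 5)) for _ in range(3)]   # 5 identities
print(combined_posterior(x, gate_W, expert_Ws))            # sums to 1
```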

7 citations