
Showing papers on "Deep belief network published in 2002"


Journal ArticleDOI
TL;DR: A product of experts (PoE) is an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary; because it is hard even to approximate the derivatives of the renormalization term in the combination rule, a PoE is trained with the "contrastive divergence" objective rather than by maximum likelihood.
Abstract: It is possible to combine multiple latent-variable models of the same data by multiplying their probability distributions together and then renormalizing. This way of combining individual "expert" models makes it hard to generate samples from the combined model but easy to infer the values of the latent variables of each expert, because the combination rule ensures that the latent variables of different experts are conditionally independent when given the data. A product of experts (PoE) is therefore an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary. Training a PoE by maximizing the likelihood of the data is difficult because it is hard even to approximate the derivatives of the renormalization term in the combination rule. Fortunately, a PoE can be trained using a different objective function called "contrastive divergence" whose derivatives with regard to the parameters can be approximated accurately and efficiently. Examples are presented of contrastive divergence learning using several types of expert on several types of data.

5,150 citations
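The abstract's "contrastive divergence" idea is easiest to see in its best-known special case, a binary restricted Boltzmann machine, where each hidden unit acts as one expert. Below is a minimal sketch of the CD-1 update; the array shapes, learning rate, and toy data are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM (biases omitted)."""
    # Positive phase: hidden-unit probabilities given the data.
    p_h0 = sigmoid(v0 @ W)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # One Gibbs step: reconstruct the visibles, then recompute hidden probabilities.
    p_v1 = sigmoid(h0 @ W.T)
    p_h1 = sigmoid(p_v1 @ W)
    # Approximate gradient: data correlations minus reconstruction correlations,
    # sidestepping the intractable renormalization (partition-function) term.
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / v0.shape[0]
    return W

# Toy data: 100 random 4-bit patterns, model with 6 hidden units.
V = (rng.random((100, 4)) < 0.5).astype(float)
W = 0.01 * rng.standard_normal((4, 6))
for _ in range(50):
    W = cd1_step(V, W)
```

The key point the abstract makes is visible in the update rule: both terms are cheap local correlations, so no sampling from the full combined model is ever required.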


Patent
15 Nov 2002
TL;DR: A plausible neural network (PLANN) as discussed by the authors is an artificial neural network whose connection weights are given by mutual information; it has the capability of learning, yet retains many characteristics of a biological neural network.
Abstract: A plausible neural network (PLANN) is an artificial neural network whose connection weights are given by mutual information; it has the capability of learning, yet retains many characteristics of a biological neural network. The learning algorithm (300, 301, 302, 304, 306, 308) is based on statistical estimation, which is faster than the gradient descent approach currently used. After training, the network becomes a fuzzy/belief network; inference and weights are exchangeable, and as a result knowledge extraction becomes simple. PLANN performs associative memory, supervised, semi-supervised, and unsupervised learning, as well as function/relation approximation, in a single network architecture. This architecture can easily be implemented in an analog VLSI circuit design.

32 citations
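The patent's central quantity, a connection weight given by mutual information, can be estimated from co-occurrence counts. The sketch below is a plain empirical estimator for two binary units; it is not the patent's (undisclosed) algorithm, just an illustration of how such a weight could be computed by statistical estimation rather than gradient descent.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two binary 0/1 arrays."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = np.mean((x == a) & (y == b))  # joint probability
            p_a = np.mean(x == a)                # marginals
            p_b = np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

rng = np.random.default_rng(1)
x = rng.integers(0, 2, 1000)
y = rng.integers(0, 2, 1000)
strong = mutual_information(x, x)  # a unit with itself: close to ln 2 for balanced bits
weak = mutual_information(x, y)   # independent units: close to 0
```

A single pass of counting over the data fixes the weight, which is why such estimation can be faster than iterative gradient-based training.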