Probability, random variables and stochastic processes
Citations
17,693 citations
9,157 citations
Cites background or methods from "Probability, random variables and stochastic processes"
...15-93): H(y1, y2) = H(y1) + H(y2) − I(y1; y2). (35) Maximising this joint entropy consists of maximising the individual entropies while minimising the mutual information, I(y1; y2), shared between the two....
[...]
...The algorithm presented in section 2 is a stochastic gradient ascent algorithm which maximises the joint entropy in (35)....
[...]
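The update being described is the classic logistic-infomax rule; below is a minimal sketch, assuming a logistic nonlinearity and the plain-gradient form ΔW ∝ W⁻ᵀ + (1 − 2y)xᵀ. The toy sources, mixing matrix and learning rate are placeholders, not taken from the cited text.

```python
import numpy as np

def infomax_step(W, x, lr):
    """One stochastic gradient ascent step on the joint entropy H(y1, y2)
    of y = g(Wx) with logistic g, using the infomax update
    dW ∝ W^{-T} + (1 - 2y) x^T (logistic nonlinearity assumed)."""
    y = 1.0 / (1.0 + np.exp(-(W @ x)))          # logistic squashing
    grad = np.linalg.inv(W.T) + np.outer(1.0 - 2.0 * y, x)
    return W + lr * grad

# Toy usage (all settings are placeholders): unmix two mixed sources.
rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 5000))                 # independent super-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])          # unknown mixing matrix
X = A @ s
W = np.eye(2)
for x in X.T:
    W = infomax_step(W, x, lr=0.001)
print(W @ A)  # approaches a scaled permutation when unmixing succeeds
```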
8,738 citations
Cites background from "Probability, random variables and stochastic processes"
...Although formal and quantitative explanations of this weight fall-off can be given [11], the intuition is that images typically vary slowly over space, so nearby pixels are likely to have similar values, and it is therefore appropriate to average them together....
[...]
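To make the weight fall-off concrete, here is a small sketch of distance-weighted pixel averaging; the Gaussian kernel shape, radius and σ are our illustrative choices, not taken from the cited paper.

```python
import numpy as np

def gaussian_kernel(radius, sigma):
    """Weights that fall off with distance from the centre pixel."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def local_average(img, radius=2, sigma=1.0):
    """Replace each pixel with a distance-weighted average of its neighbours."""
    k = gaussian_kernel(radius, sigma)
    padded = np.pad(img, radius, mode="edge")
    out = np.empty(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(k * padded[i:i + 2 * radius + 1,
                                          j:j + 2 * radius + 1])
    return out

smooth = local_average(np.random.default_rng(0).random((32, 32)))
```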
8,231 citations
Cites background or methods from "Probability, random variables and stochastic processes"
...The term log |det W| in the likelihood comes from the classic rule for (linearly) transforming random variables and their densities (Papoulis, 1991): In general, for any random vector x with density px and for any matrix W, the density py of y = Wx is given by py(y) = px(W⁻¹y)|det W⁻¹|....
[...]
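The transformation rule can be verified numerically; the sketch below picks an arbitrary invertible W and a standard Gaussian x, for which the density of y = Wx is also known in closed form (y ~ N(0, WWᵀ)).

```python
import numpy as np
from scipy.stats import multivariate_normal

# Check p_y(y) = p_x(W^{-1} y) |det W^{-1}| for y = W x.
W = np.array([[2.0, 0.5],
              [0.3, 1.5]])                        # arbitrary invertible matrix
px = multivariate_normal(np.zeros(2), np.eye(2))  # density of x ~ N(0, I)
py = multivariate_normal(np.zeros(2), W @ W.T)    # closed-form density of y = Wx

y0 = np.array([0.7, -1.2])                        # arbitrary test point
lhs = py.pdf(y0)
rhs = px.pdf(np.linalg.solve(W, y0)) / abs(np.linalg.det(W))
print(lhs, rhs)                                   # agree up to rounding error
```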
...A typical example is the uniform distribution in eq....
[...]
...The differential entropy H of a random vector y with density f(y) is defined as (Cover and Thomas, 1991; Papoulis, 1991): H(y) = −∫ f(y) log f(y) dy....
[...]
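As a quick check of this definition, the following sketch evaluates the integral on a grid for a 1-D Gaussian, whose differential entropy has the closed form ½ log(2πeσ²); the grid bounds and σ are arbitrary.

```python
import numpy as np

# Evaluate H(y) = -∫ f(y) log f(y) dy numerically for a 1-D Gaussian and
# compare with the closed form 0.5 * log(2 * pi * e * sigma^2).
sigma = 1.5
y = np.linspace(-12.0, 12.0, 200001)
dy = y[1] - y[0]
f = np.exp(-y ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))
H_numeric = -np.sum(f * np.log(f)) * dy
H_closed = 0.5 * np.log(2.0 * np.pi * np.e * sigma ** 2)
print(H_numeric, H_closed)  # agree to several decimal places
```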
...For a proof, see e.g. (Cover and Thomas, 1991; Papoulis, 1991)....
[...]
...To avoid the problems encountered with the preceding approximations of negentropy, new approximations were developed in (Hyvärinen, 1998b)....
[...]
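A representative member of this family of approximations is J(y) ≈ c[E{G(y)} − E{G(ν)}]², with a nonquadratic G such as G(u) = log cosh(au)/a and ν a standardised Gaussian variable. Below is a minimal sketch with the proportionality constant dropped; sample sizes and the parameter a are arbitrary choices.

```python
import numpy as np

def negentropy_approx(y, a=1.0, n_gauss=100_000, seed=0):
    """Negentropy approximation of the Hyvarinen (1998b) type:
    J(y) ≈ c * (E[G(y)] - E[G(nu)])^2, with G(u) = log cosh(a u) / a
    and nu standard Gaussian; y is assumed to have zero mean and unit
    variance, and the constant c is dropped."""
    G = lambda u: np.log(np.cosh(a * u)) / a
    nu = np.random.default_rng(seed).standard_normal(n_gauss)
    return (np.mean(G(y)) - np.mean(G(nu))) ** 2

rng = np.random.default_rng(1)
print(negentropy_approx(rng.standard_normal(100_000)))            # ≈ 0 for a Gaussian
print(negentropy_approx(rng.laplace(size=100_000) / np.sqrt(2)))  # > 0 for a super-Gaussian
```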
3,963 citations