Journal ArticleDOI

On the rationale of maximum-entropy methods

01 Sep 1982 - Vol. 70, Iss. 9, pp. 939-952
TL;DR: The relations between maximum-entropy (MAXENT) and other methods of spectral analysis such as the Schuster, Blackman-Tukey, maximum-likelihood, Bayesian, and Autoregressive models are discussed, emphasizing that they are not in conflict, but rather are appropriate in different problems.
Abstract: We discuss the relations between maximum-entropy (MAXENT) and other methods of spectral analysis such as the Schuster, Blackman-Tukey, maximum-likelihood, Bayesian, and Autoregressive (AR, ARMA, or ARIMA) models, emphasizing that they are not in conflict, but rather are appropriate in different problems. We conclude that: 1) "Orthodox" sampling theory methods are useful in problems where we have a known model (sampling distribution) for the properties of the noise, but no appreciable prior information about the quantities being estimated. 2) MAXENT is optimal in problems where we have prior information about multiplicities, but no noise. 3) The full Bayesian solution includes both of these as special cases and is needed in problems where we have both prior information and noise. 4) AR models are in one sense a special case of MAXENT, but in another sense they are ubiquitous in all spectral analysis problems with discrete time series. 5) Empirical methods such as Blackman-Tukey, which do not invoke even a likelihood function, are useful in the preliminary, exploratory phase of a problem where our knowledge is sufficient to permit intuitive judgments about how to organize a calculation (smoothing, decimation, windows, prewhitening, padding with zeroes, etc.) but insufficient to set up a quantitative model which would do the proper things for us automatically and optimally.
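To make conclusion 2) concrete, the following minimal sketch (not taken from the paper; the dice example, the numbers, and the SciPy root-finder are illustrative assumptions) shows the maximum-entropy principle in its simplest form: among all distributions over a six-sided die consistent with a prescribed mean, MAXENT selects the exponential-family member obtained by solving for a single Lagrange multiplier.

```python
# Maximum entropy on {1,...,6} given only a mean constraint (illustrative sketch;
# the target mean and solver choice are assumptions, not from the paper).
import numpy as np
from scipy.optimize import brentq

x = np.arange(1, 7)          # die faces
target_mean = 4.5            # the only testable information we assume

def mean_given_lambda(lam):
    """Mean of the exponential-family (maximum-entropy) distribution p_i proportional to exp(lam * x_i)."""
    w = np.exp(lam * x)
    p = w / w.sum()
    return p @ x

# The mean is monotone in lambda, so a bracketed root-finder suffices.
lam = brentq(lambda l: mean_given_lambda(l) - target_mean, -5.0, 5.0)
p = np.exp(lam * x)
p /= p.sum()

print("lambda =", lam)
print("maximum-entropy probabilities:", np.round(p, 4))
print("entropy (nats):", -(p * np.log(p)).sum())
```

The same construction, with autocorrelation lags in place of the mean as the constraints, is what connects MAXENT to the autoregressive spectra of conclusion 4).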
Citations
Journal ArticleDOI
TL;DR: This work compared 16 modelling methods over 226 species from 6 regions of the world, creating the most comprehensive set of model comparisons to date and found that presence-only data were effective for modelling species' distributions for many species and regions.
Abstract: Prediction of species' distributions is central to diverse applications in ecology, evolution and conservation science. There is increasing electronic access to vast sets of occurrence records in museums and herbaria, yet little effective guidance on how best to use this information in the context of numerous approaches for modelling distributions. To meet this need, we compared 16 modelling methods over 226 species from 6 regions of the world, creating the most comprehensive set of model comparisons to date. We used presence-only data to fit models, and independent presence-absence data to evaluate the predictions. Along with well-established modelling methods such as generalised additive models, GARP, and BIOCLIM, we explored methods that either have been developed recently or have rarely been applied to modelling species' distributions. These include machine-learning methods and community models, both of which have features that may make them particularly well suited to noisy or sparse information, as is typical of species' occurrence data. Presence-only data were effective for modelling species' distributions for many species and regions. The novel methods consistently outperformed more established methods. The results of our analysis are promising for the use of data from museums and herbaria, especially as methods suited to the noise inherent in such data improve.
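A minimal sketch (not from the study; the data are synthetic, and ordinary logistic regression stands in for the modelling methods compared) of the workflow the abstract describes: fit to presence records against background points, then evaluate on independent presence-absence data with AUC.

```python
# Illustrative presence-only workflow on synthetic data (logistic regression is a
# simple stand-in for the species-distribution modelling methods compared in the study).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def suitability(env):
    """'True' habitat suitability as a function of two environmental covariates."""
    return 1.0 / (1.0 + np.exp(-(2.0 * env[:, 0] - 1.5 * env[:, 1])))

# Presence records: locations sampled where suitability is high.
env_all = rng.normal(size=(5000, 2))
presence = env_all[rng.random(5000) < suitability(env_all)][:300]

# Background points drawn from the whole region (no absence information).
background = rng.normal(size=(1000, 2))

X_train = np.vstack([presence, background])
y_train = np.r_[np.ones(len(presence)), np.zeros(len(background))]
model = LogisticRegression().fit(X_train, y_train)

# Independent presence-absence data for evaluation.
env_eval = rng.normal(size=(500, 2))
y_eval = (rng.random(500) < suitability(env_eval)).astype(int)
auc = roc_auc_score(y_eval, model.predict_proba(env_eval)[:, 1])
print(f"evaluation AUC: {auc:.3f}")
```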

7,589 citations


Cites methods from "On the rationale of maximum-entropy..."

  • ...Similarly, maximum entropy methods are well known in other fields (Jaynes 1982) but only recently developed for questions of species’ distributions (Phillips et al. 2006)....


Book
01 Jan 2019
TL;DR: The book presents a thorough treatment of the central ideas of Kolmogorov complexity, with a wide range of illustrative applications, and will be ideal for advanced undergraduate students, graduate students, and researchers in computer science, mathematics, cognitive sciences, philosophy, artificial intelligence, statistics, and physics.
Abstract: "The book is outstanding and admirable in many respects. ... is necessary reading for all kinds of readers from undergraduate students to top authorities in the field." (Journal of Symbolic Logic). Written by two experts in the field, this is the only comprehensive and unified treatment of the central ideas of Kolmogorov complexity and their applications. The book presents a thorough treatment of the subject with a wide range of illustrative applications. Such applications include the randomness of finite objects or infinite sequences, Martin-Löf tests for randomness, information theory, computational learning theory, the complexity of algorithms, and the thermodynamics of computing. It will be ideal for advanced undergraduate students, graduate students, and researchers in computer science, mathematics, cognitive sciences, philosophy, artificial intelligence, statistics, and physics. The book is self-contained in that it contains the basic requirements from mathematics and computer science. Also included are numerous problem sets, comments, source references, and hints to solutions of problems. New topics in this edition include Omega numbers, Kolmogorov-Loveland randomness, universal learning, communication complexity, Kolmogorov's random graphs, time-limited universal distribution, Shannon information and others.
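Kolmogorov complexity itself is uncomputable, but a compressed file length gives a computable upper bound that conveys the "randomness of finite objects" theme mentioned above. A toy sketch (an illustration, not from the book) using zlib as the stand-in compressor:

```python
# Compressed length as a crude, computable upper bound on Kolmogorov complexity
# (illustrative only; the true quantity is uncomputable).
import os
import zlib

structured = b"ab" * 5000            # highly regular string
random_bytes = os.urandom(10000)     # incompressible with high probability

for name, s in [("structured", structured), ("random", random_bytes)]:
    c = len(zlib.compress(s, 9))
    print(f"{name}: {len(s)} bytes -> {c} bytes compressed")
```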

3,361 citations

Journal ArticleDOI
TL;DR: In this article, estimation is based on the minimum description length (MDL) principle: minimize the total number of binary digits required to rewrite the observed data, when each observation is given with some precision.
Abstract: An earlier introduced estimation principle, which calls for minimization of the number of bits required to write down the observed data, has been reformulated to extend the classical maximum likelihood principle. The principle permits estimation of the number of the parameters in statistical models in addition to their values and even of the way the parameters appear in the models; i.e., of the model structures. The principle rests on a new way to interpret and construct a universal prior distribution for the integers, which makes sense even when the parameter is an individual object. Truncated real-valued parameters are converted to integers by dividing them by their precision, and their prior is determined from the universal prior for the integers by optimizing the precision.

1. Introduction. In this paper we study estimation based upon the principle of minimizing the total number of binary digits required to rewrite the observed data, when each observation is given with some precision. Instead of attempting an absolutely shortest description, which would be futile, we look for the optimum relative to a class of parametrically given distributions. This Minimum Description Length (MDL) principle, which we introduced in a less comprehensive form in [25], turns out to degenerate to the more familiar Maximum Likelihood (ML) principle in case the number of parameters in the models is fixed, so that the description length of the parameters themselves can be ignored. In another extreme case, where the parameters determine the data, it similarly degenerates to Jaynes's principle of maximum entropy [14]. But the main power of the new criterion is that it permits estimates of the entire model, its parameters, their number, and even the way the parameters appear in the model; i.e., the model structure. Hence, there will be no need to supplement the estimated parameters with a separate hypothesis test to decide whether a model is adequately parameterized or, perhaps, over-parameterized.
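A toy sketch of the two-part code idea described above (not from the paper; the synthetic data are an assumption, and the (k/2) log2 n parameter cost is the familiar asymptotic form that results from encoding each parameter to its optimized precision): choose the polynomial degree that minimizes the total description length of parameters plus residuals.

```python
# Two-part MDL for polynomial-degree selection (toy sketch; the (k/2)*log2(n)
# parameter cost and the Gaussian residual code length are stated up to constants).
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = np.linspace(-1, 1, n)
y = 1.0 - 2.0 * x + 0.5 * x**3 + rng.normal(scale=0.1, size=n)  # true model is cubic

def description_length(degree):
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)
    k = degree + 1
    data_bits = 0.5 * n * np.log2(rss / n)   # code length of residuals (Gaussian model, up to a constant)
    param_bits = 0.5 * k * np.log2(n)        # cost of stating k parameters to optimized precision
    return data_bits + param_bits

best = min(range(0, 10), key=description_length)
print("degree chosen by MDL:", best)
```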

1,762 citations


Cites background from "On the rationale of maximum-entropy..."

  • ...In the final Section 5, we show that Jaynes's principle of maximum entropy [14, 15, 16, 17] may be viewed as a particular instance of ours....


  • ...This is of some significance because there are a number of important applications where the ML principle fails but where the maximum entropy formalism has been highly successful [16, 17]....


Book
Stan Z. Li1
01 Jan 2001
TL;DR: This detailed and thoroughly enhanced third edition presents a comprehensive study of, and reference for, theories, methodologies and recent developments in solving computer vision problems based on MRFs, statistics and optimisation.
Abstract: Markov random field (MRF) theory provides a basis for modeling contextual constraints in visual processing and interpretation. It enables systematic development of optimal vision algorithms when used with optimization principles. This detailed and thoroughly enhanced third edition presents a comprehensive study of, and reference for, theories, methodologies and recent developments in solving computer vision problems based on MRFs, statistics and optimisation. It treats various problems in low- and high-level computational vision in a systematic and unified way within the MAP-MRF framework. Among the main issues covered are: how to use MRFs to encode contextual constraints that are indispensable to image understanding; how to derive the objective function for the optimal solution to a problem; and how to design computational algorithms for finding an optimal solution. Easy-to-follow and coherent, the revised edition is accessible, includes the most recent advances, and has new and expanded sections on such topics as:
  • Discriminative Random Fields (DRF)
  • Strong Random Fields (SRF)
  • Spatial-Temporal Models
  • Total Variation Models
  • Learning MRF for Classification (motivation + DRF)
  • Relation to Graphic Models
  • Graph Cuts
  • Belief Propagation
Features:
  • Focuses on the application of Markov random fields to computer vision problems, such as image restoration and edge detection in the low-level domain, and object matching and recognition in the high-level domain
  • Presents various vision models in a unified framework, including image restoration and reconstruction, edge and region segmentation, texture, stereo and motion, object matching and recognition, and pose estimation
  • Uses a variety of examples to illustrate how to convert a specific vision problem involving uncertainties and constraints into essentially an optimization problem under the MRF setting
  • Introduces readers to the basic concepts, important models and various special classes of MRFs on the regular image lattice and MRFs on relational graphs derived from images
  • Examines the problems of parameter estimation and function optimization
  • Includes an extensive list of references
This broad-ranging and comprehensive volume is an excellent reference for researchers working in computer vision, image processing, statistical pattern recognition and applications of MRFs. It has been class-tested and is suitable as a textbook for advanced courses relating to these areas.
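As a concrete, deliberately tiny instance of the MAP-MRF recipe above, the following sketch (not from the book; the energy weights and test image are arbitrary assumptions) denoises a binary image with an Ising-style smoothness prior using iterated conditional modes (ICM).

```python
# MAP-MRF binary denoising by ICM: minimize a data term plus an Ising smoothness prior
# (illustrative sketch; weights and data are arbitrary choices).
import numpy as np

rng = np.random.default_rng(2)

# Ground-truth binary image: a bright square on a dark background, plus 20% flip noise.
truth = np.zeros((40, 40), dtype=int)
truth[10:30, 10:30] = 1
noisy = np.where(rng.random(truth.shape) < 0.2, 1 - truth, truth)

beta = 1.5   # strength of the smoothness (contextual) prior
lam = 1.0    # strength of the data (likelihood) term

def local_energy(img, i, j, label):
    """Energy contribution of assigning `label` to pixel (i, j)."""
    energy = lam * (label != noisy[i, j])               # penalize disagreeing with the observation
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-neighbourhood smoothness term
        ni, nj = i + di, j + dj
        if 0 <= ni < img.shape[0] and 0 <= nj < img.shape[1]:
            energy += beta * (label != img[ni, nj])
    return energy

est = noisy.copy()
for _ in range(5):                                       # a few ICM sweeps
    for i in range(est.shape[0]):
        for j in range(est.shape[1]):
            est[i, j] = min((0, 1), key=lambda l: local_energy(est, i, j, l))

print("pixels wrong before:", int((noisy != truth).sum()))
print("pixels wrong after: ", int((est != truth).sum()))
```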

1,694 citations


Cites background from "On the rationale of maximum-entropy..."

  • ...The maximum entropy criterion is simply taking this fact into account: Configurations with higher entropy are more likely because nature can generate them in more ways (Jaynes 1982)....


Book
01 Aug 1995
TL;DR: This book presents a comprehensive study on the use of MRFs for solving computer vision problems, and covers the following parts essential to the subject: introduction to fundamental theories, formulations of MRF vision models, MRF parameter estimation, and optimization algorithms.
Abstract: From the Publisher: Markov random field (MRF) theory provides a basis for modeling contextual constraints in visual processing and interpretation. It enables us to develop optimal vision algorithms systematically when used with optimization principles. This book presents a comprehensive study on the use of MRFs for solving computer vision problems. The book covers the following parts essential to the subject: introduction to fundamental theories, formulations of MRF vision models, MRF parameter estimation, and optimization algorithms. Various vision models are presented in a unified framework, including image restoration and reconstruction, edge and region segmentation, texture, stereo and motion, object matching and recognition, and pose estimation. This book is an excellent reference for researchers working in computer vision, image processing, statistical pattern recognition, and applications of MRFs. It is also suitable as a text for advanced courses in these areas.

1,333 citations


Cites background from "On the rationale of maximum-entropy..."

  • ...can generate them in more ways and the maximum entropy criterion is simply taking this fact into account (Jaynes 1982)....


References
Journal ArticleDOI
E. T. Jaynes1
TL;DR: In this article, the authors consider statistical mechanics as a form of statistical inference rather than as a physical theory, and show that the usual computational rules, starting with the determination of the partition function, are an immediate consequence of the maximum-entropy principle.
Abstract: Information theory provides a constructive criterion for setting up probability distributions on the basis of partial knowledge, and leads to a type of statistical inference which is called the maximum-entropy estimate. It is the least biased estimate possible on the given information; i.e., it is maximally noncommittal with regard to missing information. If one considers statistical mechanics as a form of statistical inference rather than as a physical theory, it is found that the usual computational rules, starting with the determination of the partition function, are an immediate consequence of the maximum-entropy principle. In the resulting "subjective statistical mechanics," the usual rules are thus justified independently of any physical argument, and in particular independently of experimental verification; whether or not the results agree with experiment, they still represent the best estimates that could have been made on the basis of the information available. It is concluded that statistical mechanics need not be regarded as a physical theory dependent for its validity on the truth of additional assumptions not contained in the laws of mechanics (such as ergodicity, metric transitivity, equal a priori probabilities, etc.). Furthermore, it is possible to maintain a sharp distinction between its physical and statistical aspects. The former consists only of the correct enumeration of the states of a system and their properties; the latter is a straightforward example of statistical inference.
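The step from this principle to the partition function takes only a few lines. The derivation below is standard textbook material rather than a quotation from the paper: maximize the entropy subject to normalization and a mean-energy constraint.

```latex
% Maximize S = -\sum_i p_i \ln p_i subject to \sum_i p_i = 1 and \sum_i p_i E_i = \langle E \rangle.
% Introduce Lagrange multipliers \lambda_0 and \beta and set the variation to zero:
\frac{\partial}{\partial p_i}\Big[-\sum_j p_j \ln p_j
      - \lambda_0\Big(\sum_j p_j - 1\Big)
      - \beta\Big(\sum_j p_j E_j - \langle E\rangle\Big)\Big]
  = -\ln p_i - 1 - \lambda_0 - \beta E_i = 0 .
% Solving and normalizing gives the canonical distribution and the partition function:
p_i = \frac{e^{-\beta E_i}}{Z(\beta)}, \qquad
Z(\beta) = \sum_i e^{-\beta E_i}, \qquad
\langle E \rangle = -\frac{\partial \ln Z}{\partial \beta}.
```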

12,099 citations

01 Jan 1967

2,053 citations

Journal ArticleDOI
20 Apr 1978-Nature
TL;DR: In this article, a technique for image reconstruction by a maximum entropy method is presented, which is sufficiently fast to be useful for large and complicated images and is applicable in spectroscopy, electron microscopy, X-ray crystallography, geophysics and virtually any type of optical image processing.
Abstract: Results are presented of a powerful technique for image reconstruction by a maximum entropy method, which is sufficiently fast to be useful for large and complicated images. Although our examples are taken from the fields of radio and X-ray astronomy, the technique is immediately applicable in spectroscopy, electron microscopy, X-ray crystallography, geophysics and virtually any type of optical image processing. Applied to radioastronomical data, the algorithm reveals details not seen by conventional analysis, but which are known to exist.
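The paper's own algorithm is not reproduced here; the toy sketch below only illustrates the general idea of trading image entropy against misfit to blurred, noisy data, using plain gradient ascent on a 1-D "image" with assumed step sizes and weights.

```python
# Toy maximum-entropy reconstruction of a 1-D signal from blurred, noisy data:
# maximize  S(f) - lam * chi2(f)  by projected gradient ascent (illustrative only;
# this is not the algorithm of the paper).
import numpy as np

rng = np.random.default_rng(3)
n = 64
truth = np.zeros(n)
truth[20] = 5.0
truth[40:44] = 2.0                       # a spike and a small block

# Blurring operator A (normalized Gaussian point-spread function) and noisy data d.
kernel = np.exp(-0.5 * (np.arange(-5, 6) / 2.0) ** 2)
kernel /= kernel.sum()
A = np.array([np.convolve(np.eye(n)[k], kernel, mode="same") for k in range(n)]).T
sigma = 0.2
d = A @ truth + rng.normal(scale=sigma, size=n)

m = np.full(n, truth.sum() / n)          # flat default model
lam = 0.5                                # trade-off between entropy and data misfit
f = m.copy()
step = 1e-3

for _ in range(5000):
    resid = A @ f - d
    grad_chi2 = 2.0 * A.T @ resid / sigma**2
    grad_entropy = -(np.log(f / m) + 1.0)
    f = np.clip(f + step * (grad_entropy - lam * grad_chi2), 1e-8, None)  # keep the image positive

chi2 = np.sum((A @ f - d) ** 2) / sigma**2
print(f"chi-squared per datum after fitting: {chi2 / n:.2f}")
print(f"total flux: truth {truth.sum():.1f}, reconstruction {f.sum():.1f}")
```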

969 citations

Journal ArticleDOI
TL;DR: In this paper, the authors summarize progress to date toward generalizing Gibbs' variational principle to nonequilibrium conditions, and conclude that the outlook is good in that the basic principles are believed known; but they do not yet know whether they can be reduced to simple rules immediately useful in practice, in the way that the Gibbs phase rule is useful.
Abstract: It seems intuitively reasonable that Gibbs' variational principle determining the conditions of heterogeneous equilibrium can be generalized to nonequilibrium conditions. That is, a nonequilibrium steady state should be the one that makes some kind of generalized-entropy production stationary; and even in the presence of irreversible fluxes, the condition for migrational equilibrium should still be the equality of some generalized chemical potentials. We summarize progress to date toward this goal, reviewing (a) the early history, (b) work of Onsager and first attempts at generalization, (c) the new direction the field took after 1967 with the work of Tykodi and Mitchell, and (d) the present situation and prospects. Our conclusion will be, briefly, that the outlook is good in that the basic principles are believed known; but we do not yet know whether they can be reduced to simple rules immediately useful in practice, in the way that the Gibbs phase rule is useful. For this, we need more experience in the technique of applying them to particular cases, and more data to test some conjectures.

400 citations

Journal ArticleDOI
TL;DR: Maximum entropy spectral analysis is a method for estimating power spectra with higher resolution than conventional techniques; this is achieved by extrapolating the autocorrelation function so that the entropy of the corresponding probability density function is maximized at each step of the extrapolation.
Abstract: Maximum entropy spectral analysis is a method for the estimation of power spectra with a higher resolution than can be obtained with conventional techniques. This is achieved by extrapolation of the autocorrelation function in such a way that the entropy of the corresponding probability density function is maximized in each step of the extrapolation. This correspondence also gives a simple interpretation of the method without entropy considerations.
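The equivalence stated in the abstract, that the maximum-entropy extrapolation of a finite set of autocorrelation lags yields an all-pole (autoregressive) spectrum, can be demonstrated with the Levinson-Durbin recursion. A minimal sketch (not from the paper; the test signal and model order are assumptions):

```python
# Maximum-entropy (all-pole / AR) spectral estimate from the first few autocorrelation
# lags, via the Levinson-Durbin recursion (illustrative sketch).
import numpy as np

rng = np.random.default_rng(4)
N = 1024
t = np.arange(N)
x = np.sin(2 * np.pi * 0.12 * t) + 0.8 * np.sin(2 * np.pi * 0.2 * t) + rng.normal(size=N)

# Biased sample autocorrelation at lags 0..p.
p = 12
r = np.array([np.dot(x[: N - k], x[k:]) / N for k in range(p + 1)])

# Levinson-Durbin: solve the Yule-Walker equations for the AR(p) coefficients.
a = np.zeros(p + 1)
a[0] = 1.0
err = r[0]
for k in range(1, p + 1):
    acc = r[k] + np.dot(a[1:k], r[k - 1:0:-1])
    kappa = -acc / err
    a[1:k] = a[1:k] + kappa * a[k - 1:0:-1]
    a[k] = kappa
    err *= 1.0 - kappa**2

# The maximum-entropy spectrum consistent with r[0..p] is exactly this AR(p) spectrum.
freqs = np.linspace(0, 0.5, 512)
z = np.exp(-2j * np.pi * np.outer(freqs, np.arange(p + 1)))
spectrum = err / np.abs(z @ a) ** 2

print("dominant spectral peak near f =", freqs[np.argmax(spectrum)])
```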

268 citations