Posted Content

On Information and Sufficiency

01 Feb 1997 - Research Papers in Economics (Santa Fe Institute)
TL;DR: The information deviation between any two finite measures cannot be increased by any statistical operations (Markov morphisms) and is invariant if and only if the morphism is sufficient for these two measures, as mentioned in this paper.
Abstract: The information deviation between any two finite measures cannot be increased by any statistical operations (Markov morphisms). It is invariant if and only if the morphism is sufficient for these two measures.
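
The theorem can be illustrated numerically in the discrete case, where a Markov morphism is a row-stochastic matrix. A minimal Python sketch (the distributions and kernel below are arbitrary examples, not from the paper):

```python
import numpy as np

def kl(p, q):
    # Kullback-Leibler divergence D(p || q) for discrete distributions.
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(4))            # two finite measures on 4 points
q = rng.dirichlet(np.ones(4))
K = rng.dirichlet(np.ones(3), size=4)    # row-stochastic kernel: 4 inputs -> 3 outputs

# Passing both measures through the same channel cannot increase divergence:
# D(pK || qK) <= D(p || q), with equality iff K is sufficient for {p, q}.
print(kl(p, q), kl(p @ K, q @ K))
assert kl(p @ K, q @ K) <= kl(p, q) + 1e-12
```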
Citations
01 Dec 2010
TL;DR: This book discusses quantum information theory, public-key cryptography and the RSA cryptosystem, and the proof of Lieb's theorem.
Abstract: Part I. Fundamental Concepts: 1. Introduction and overview 2. Introduction to quantum mechanics 3. Introduction to computer science Part II. Quantum Computation: 4. Quantum circuits 5. The quantum Fourier transform and its application 6. Quantum search algorithms 7. Quantum computers: physical realization Part III. Quantum Information: 8. Quantum noise and quantum operations 9. Distance measures for quantum information 10. Quantum error-correction 11. Entropy and information 12. Quantum information theory Appendices References Index.

14,825 citations

Journal ArticleDOI
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium; it reviews deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.

14,635 citations


Cites methods from "On Information and Sufficiency"

  • ...Many UL methods are designed to maximize entropy-related, information-theoretic (Boltzmann, 1909; Kullback & Leibler, 1951; Shannon, 1948) objectives (e.g., Amari, Cichocki, & Yang, 1996; Barlow et al., 1989; Dayan & Zemel, 1995; Deco & Parra, 1997; Field, 1994; Hinton, Dayan, Frey, & Neal, 1995; Linsker, …...


  • ...Many UL methods are designed to maximize entropy-related, information-theoretic (Boltzmann, 1909; Shannon, 1948; Kullback and Leibler, 1951) objectives (e....


Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions and linear models for regression and classification are given in this article, along with a discussion of combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: A thorough exposition of community structure, or clustering, is attempted, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists.
Abstract: The modern science of networks has brought significant advances to our understanding of complex systems. One of the most relevant features of graphs representing real systems is community structure, or clustering, i.e. the organization of vertices in clusters, with many edges joining vertices of the same cluster and comparatively few edges joining vertices of different clusters. Such clusters, or communities, can be considered as fairly independent compartments of a graph, playing a role similar to that of, e.g., the tissues or the organs in the human body. Detecting communities is of great importance in sociology, biology and computer science, disciplines where systems are often represented as graphs. This problem is very hard and not yet satisfactorily solved, despite the huge effort of a large interdisciplinary community of scientists working on it over the past few years. We will attempt a thorough exposition of the topic, from the definition of the main elements of the problem, to the presentation of most methods developed, with a special focus on techniques designed by statistical physicists, from the discussion of crucial issues like the significance of clustering and how methods should be tested and compared against each other, to the description of applications to real networks.
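
The defining property in the abstract (many edges inside clusters, few between them) can be made concrete with a toy graph; a minimal sketch, with an assumed example of two triangles joined by a single bridge edge:

```python
# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
labels = [0, 0, 0, 1, 1, 1]   # assumed community assignment per vertex

intra = sum(labels[u] == labels[v] for u, v in edges)
inter = len(edges) - intra
print(f"intra-community edges: {intra}, inter-community edges: {inter}")  # 6 vs 1
```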

9,057 citations


Cites methods from "On Information and Sufficiency"

  • ...Here the snapshot cost is the Kullback-Leibler (KL) divergence [389] between the adjacency/similarity matrix at time t and the matrix describing the community structure of the graph at time t; the historical cost is the KL divergence between the matrices describing the community structure of the graph at times t − 1 and t....

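The exact cost functions belong to the cited evolutionary-clustering work; purely as a generic sketch of a KL divergence between two nonnegative matrices (each normalized to a joint distribution, with hypothetical example matrices):

```python
import numpy as np

def matrix_kl(A, B, eps=1e-12):
    # KL divergence after normalizing each nonnegative matrix to sum to 1.
    P, Q = A / A.sum(), B / B.sum()
    m = P > 0
    return float(np.sum(P[m] * np.log(P[m] / (Q[m] + eps))))

# Hypothetical similarity matrix at time t and a block matrix encoding
# its two communities {0, 1} and {2}.
A_t = np.array([[1.0, 0.9, 0.1],
                [0.9, 1.0, 0.2],
                [0.1, 0.2, 1.0]])
Z_t = np.array([[1.0, 1.0, 0.0],
                [1.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
print(matrix_kl(A_t, Z_t))   # a snapshot-style cost in the spirit of the passage
```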

Journal ArticleDOI
TL;DR: Various facets of such multimodel inference are presented here, particularly methods of model averaging, which can be derived as a non-Bayesian result.
Abstract: The model selection literature has been generally poor at reflecting the deep foundations of the Akaike information criterion (AIC) and at making appropriate comparisons to the Bayesian information criterion (BIC)...
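
One concrete instance of such model averaging is the standard Akaike-weight construction from this literature, w_i = exp(−Δ_i/2) / Σ_j exp(−Δ_j/2) with Δ_i = AIC_i − min_j AIC_j; a minimal sketch with hypothetical AIC scores:

```python
import numpy as np

def akaike_weights(aic):
    # w_i = exp(-Delta_i / 2) / sum_j exp(-Delta_j / 2), Delta_i = AIC_i - min AIC.
    delta = np.asarray(aic, float) - min(aic)
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Hypothetical AIC values for three candidate models; the weights measure
# relative support for each model and serve as model-averaging weights.
print(akaike_weights([100.0, 102.1, 110.4]))
```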

8,933 citations


Cites background from "On Information and Sufficiency"

  • ...In 1951 S. Kullback and R. A. Leibler published a now-famous paper (Kullback and Leibler 1951) that quantified the meaning of “information” as related to R. A. Fisher's concept of sufficient statistics....


References
01 Jan 1936

6,325 citations


"On Information and Sufficiency" refers background in this paper

  • ...A special case of this divergence is Mahalanobis' generalized distance [13]....


Journal ArticleDOI
TL;DR: In this paper, the authors define the center of location as the abscissa of a frequency curve for which the sampling errors of optimum location are uncorrelated with those of optimum scaling.
Abstract: Centre of Location. That abscissa of a frequency curve for which the sampling errors of optimum location are uncorrelated with those of optimum scaling.

3,392 citations

Journal ArticleDOI
01 Jul 1925
TL;DR: It has been pointed out to me that some of the statistical ideas employed in the following investigation have never received a strictly logical definition and analysis, and it is desirable to set out for criticism the manner in which the logical foundations of these ideas may be established.
Abstract: It has been pointed out to me that some of the statistical ideas employed in the following investigation have never received a strictly logical definition and analysis. The idea of a frequency curve, for example, evidently implies an infinite hypothetical population distributed in a definite manner; but equally evidently the idea of an infinite hypothetical population requires a more precise logical specification than is contained in that phrase. The same may be said of the intimately connected idea of random sampling. These ideas have grown up in the minds of practical statisticians and lie at the basis especially of recent work; there can be no question of their pragmatic value. It was no part of my original intention to deal with the logical bases of these ideas, but some comments which Dr Burnside has kindly made have convinced me that it may be desirable to set out for criticism the manner in which I believe the logical foundations of these ideas may be established.

2,464 citations

Journal ArticleDOI
TL;DR: It is shown that a certain differential form depending on the values of the parameters in a law of chance is invariant for all transformations of the parameter when the law is differentiable with regard to all parameters.
Abstract: It is shown that a certain differential form depending on the values of the parameters in a law of chance is invariant for all transformations of the parameters when the law is differentiable with regard to all parameters. For laws containing a location and a scale parameter a form with a somewhat restricted type of invariance is found even when the law is not everywhere differentiable with regard to the parameters. This form has the properties required to give a general rule for stating the prior probability in a large class of estimation problems.
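
The invariant form described here is the one now associated with the Jeffreys prior, built from the Fisher information (a standard identification, stated for context):

```latex
I(\theta)_{jk} = \mathbb{E}_\theta\!\left[
  \frac{\partial \log f(X;\theta)}{\partial \theta_j}\,
  \frac{\partial \log f(X;\theta)}{\partial \theta_k}\right],
\qquad
\pi(\theta)\,d\theta \propto \sqrt{\det I(\theta)}\,d\theta .
```

Under a smooth reparametrization \phi = g(\theta), the Fisher information transforms with the square of the Jacobian, so \sqrt{\det I} picks up exactly the factor needed to leave the form unchanged.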

2,292 citations


"On Information and Sufficiency" refers methods in this paper

  • ...Jeffreys (par....


  • ...The particular measure of divergence we use has been considered by Jeffreys ([10], [11]) in another connection....

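For context, the divergence Jeffreys considered is the symmetrized form of the information deviation (a standard identity):

```latex
J(P, Q) = D(P \Vert Q) + D(Q \Vert P)
        = \int \bigl(p(x) - q(x)\bigr)\,\log\frac{p(x)}{q(x)}\,dx .
```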

01 Jan 1943

2,183 citations


"On Information and Sufficiency" refers background in this paper

  • ...We are also concerned with the statistical problem of discrimination ([3], [17]), by considering a measure of the "distance" or "divergence" between statistical populations ([1], [2], [13]) in terms of our measure of information....
