Journal Article

# A mathematical theory of communication

01 Jul 1948 - Bell System Technical Journal (Wiley-Blackwell) - Vol. 27, Iss. 3, pp. 379-423

TL;DR: This final installment of the paper considers the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now.
Abstract: In this final installment of the paper we consider the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now. To a considerable extent the continuous case can be obtained through a limiting process from the discrete case by dividing the continuum of messages and signals into a large but finite number of small regions and calculating the various parameters involved on a discrete basis. As the size of the regions is decreased these parameters in general approach as limits the proper values for the continuous case. There are, however, a few new effects that appear and also a general change of emphasis in the direction of specialization of the general results to particular cases.
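The limiting process described in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes a zero-mean Gaussian source as the test case, and the function names (`quantized_entropy_bits`, `differential_entropy_bits`) are hypothetical. It divides the real line into small regions of width `delta`, computes the discrete entropy of the resulting distribution, and compares that value plus `log2(delta)` with the closed-form continuous (differential) entropy of a Gaussian.

```python
import math


def gaussian_cdf(x, sigma=1.0):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def quantized_entropy_bits(delta, sigma=1.0, span=12.0):
    """Discrete entropy (bits) after cutting the Gaussian into bins of width delta.

    Assumption: the range is truncated to +/- span standard deviations,
    beyond which the probability mass is negligible.
    """
    n_bins = int(math.ceil(2.0 * span * sigma / delta))
    left_edge = -n_bins * delta / 2.0
    entropy = 0.0
    for i in range(n_bins):
        a = left_edge + i * delta
        p = gaussian_cdf(a + delta, sigma) - gaussian_cdf(a, sigma)
        if p > 0.0:
            entropy -= p * math.log2(p)
    return entropy


def differential_entropy_bits(sigma=1.0):
    """Closed-form differential entropy of a Gaussian: 0.5 * log2(2*pi*e*sigma^2)."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * sigma ** 2)


if __name__ == "__main__":
    target = differential_entropy_bits()
    for delta in (0.5, 0.1, 0.01):
        approx = quantized_entropy_bits(delta) + math.log2(delta)
        print(f"delta={delta}: H + log2(delta) = {approx:.6f}, gap = {approx - target:.2e}")
```

The discrete entropy itself grows without bound as `delta` shrinks; only after adding `log2(delta)` does it approach the continuous value, which is one way to see the "new effects" the abstract mentions for the continuous case.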
##### Citations

Journal Article
Lee J. Cronbach
Abstract: A general formula (α) of which a special case is the Kuder-Richardson coefficient of equivalence is shown to be the mean of all split-half coefficients resulting from different splittings of a test. α is therefore an estimate of the correlation between two random samples of items from a universe of items like those in the test. α is found to be an appropriate index of equivalence and, except for very short tests, of the first-factor concentration in the test. Tests divisible into distinct subtests should be so divided before using the formula. The index $\bar r_{ij}$, derived from α, is shown to be an index of inter-item homogeneity. Comparison is made to the Guttman and Loevinger approaches. Parallel split coefficients are shown to be unnecessary for tests of common types. In designing tests, maximum interpretability of scores is obtained by increasing the first-factor concentration in any separately-scored subtest and avoiding substantial group-factor clusters within a subtest. Scalability is not a requisite.

34,054 citations

Proceedings Article
01 Jan 1973
TL;DR: The classical maximum likelihood principle can be considered to be a method of asymptotic realization of an optimum estimate with respect to a very general information theoretic criterion to provide answers to many practical problems of statistical model fitting.
Abstract: In this paper it is shown that the classical maximum likelihood principle can be considered to be a method of asymptotic realization of an optimum estimate with respect to a very general information theoretic criterion. This observation shows an extension of the principle to provide answers to many practical problems of statistical model fitting.

17,414 citations

Book Chapter
01 Jan 1973
Abstract: In this paper it is shown that the classical maximum likelihood principle can be considered to be a method of asymptotic realization of an optimum estimate with respect to a very general information theoretic criterion. This observation shows an extension of the principle to provide answers to many practical problems of statistical model fitting.

15,032 citations

Book
01 Jan 1996
TL;DR: A valuable reference for the novice as well as for the expert who needs a wider scope of coverage within the area of cryptography, this book provides easy and rapid access of information and includes more than 200 algorithms and protocols.
Abstract: From the Publisher: A valuable reference for the novice as well as for the expert who needs a wider scope of coverage within the area of cryptography, this book provides easy and rapid access of information and includes more than 200 algorithms and protocols; more than 200 tables and figures; more than 1,000 numbered definitions, facts, examples, notes, and remarks; and over 1,250 significant references, including brief comments on each paper.

13,370 citations

### Cites background from "A mathematical theory of communicat..."

• ...The concept of unconditional security was introduced in the seminal paper by Shannon [1120]....

• ...2 The concept of entropy was introduced in the seminal paper of Shannon [1120]....

Journal Article
Jürgen Schmidhuber
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, and reviews deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Abstract: In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarizes relevant work, much of it from the previous millennium. Shallow and Deep Learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

11,176 citations

### Cites background from "A mathematical theory of communicat..."

• ...Many UL methods are designed to maximize entropy-related, information-theoretic (Boltzmann, 1909; Kullback & Leibler, 1951; Shannon, 1948) objectives (e.g., Amari, Cichocki, & Yang, 1996; Barlow et al., 1989; Dayan & Zemel, 1995; Deco & Parra, 1997; Field, 1994; Hinton, Dayan, Frey, & Neal, 1995; Linsker,…...

• ...…another RNN to the stack improves a bound on the data’s description length – equivalent to the negative logarithm of its probability (Huffman, 1952; Shannon, 1948) – as long as there is remaining local learnable predictability in the data representation on the corresponding level of the…...

##### References

Journal Article
Abstract: A quantitative measure of “information” is developed which is based on physical as contrasted with psychological considerations. How the rate of transmission of this information over a system is limited by the distortion resulting from storage of energy is discussed from the transient viewpoint. The relation between the transient and steady state viewpoints is reviewed. It is shown that when the storage of energy is used to restrict the steady state transmission to a limited range of frequencies the amount of information that can be transmitted is proportional to the product of the width of the frequency-range by the time it is available. Several illustrations of the application of this principle to practical systems are included. In the case of picture transmission and television the spacial variation of intensity is analyzed by a steady state method analogous to that commonly used for variations with time.

1,614 citations

Journal Article
Harry Nyquist
Abstract: This paper considers two fundamental factors entering into the maximum speed of transmission of intelligence by telegraph. These factors are signal shaping and choice of codes. The first is concerned with the best wave shape to be impressed on the transmitting medium so as to permit of greater speed without undue interference either in the circuit under consideration or in those adjacent, while the latter deals with the choice of codes which will permit of transmitting a maximum amount of intelligence with a given number of signal elements. It is shown that the wave shape depends somewhat on the type of circuit over which intelligence is to be transmitted and that for most cases the optimum wave is neither rectangular nor a half cycle sine wave as is frequently used but a wave of special form produced by sending a simple rectangular wave through a suitable network. The impedances usually associated with telegraph circuits are such as to produce a fair degree of signal shaping when a rectangular voltage wave is impressed. Consideration of the choice of codes shows that while it is desirable to use those involving more than two current values, there are limitations which prevent a large number of current values being used. A table of comparisons shows the relative speed efficiencies of various codes proposed. It is shown that no advantages result from the use of a sine wave for telegraph transmission as proposed by Squier and others and that their arguments are based on erroneous assumptions.

436 citations

##### Network Information
###### Related Papers (5)
01 Jan 1991

Thomas M. Cover, Joy A. Thomas

Claude E. Shannon, Warren Weaver

E. T. Jaynes

Claude E. Shannon

Solomon Kullback, R. A. Leibler

##### Performance
###### Metrics
Number of citations received by the paper in previous years:

| Year | Citations |
|------|-----------|
| 2022 | 103 |
| 2021 | 4,334 |
| 2020 | 4,042 |
| 2019 | 3,479 |
| 2018 | 3,042 |
| 2017 | 2,953 |