scispace - formally typeset


Elements of information theory

01 Jan 1991-

TL;DR: The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.

AbstractPreface to the Second Edition. Preface to the First Edition. Acknowledgments for the Second Edition. Acknowledgments for the First Edition. 1. Introduction and Preview. 1.1 Preview of the Book. 2. Entropy, Relative Entropy, and Mutual Information. 2.1 Entropy. 2.2 Joint Entropy and Conditional Entropy. 2.3 Relative Entropy and Mutual Information. 2.4 Relationship Between Entropy and Mutual Information. 2.5 Chain Rules for Entropy, Relative Entropy, and Mutual Information. 2.6 Jensen's Inequality and Its Consequences. 2.7 Log Sum Inequality and Its Applications. 2.8 Data-Processing Inequality. 2.9 Sufficient Statistics. 2.10 Fano's Inequality. Summary. Problems. Historical Notes. 3. Asymptotic Equipartition Property. 3.1 Asymptotic Equipartition Property Theorem. 3.2 Consequences of the AEP: Data Compression. 3.3 High-Probability Sets and the Typical Set. Summary. Problems. Historical Notes. 4. Entropy Rates of a Stochastic Process. 4.1 Markov Chains. 4.2 Entropy Rate. 4.3 Example: Entropy Rate of a Random Walk on a Weighted Graph. 4.4 Second Law of Thermodynamics. 4.5 Functions of Markov Chains. Summary. Problems. Historical Notes. 5. Data Compression. 5.1 Examples of Codes. 5.2 Kraft Inequality. 5.3 Optimal Codes. 5.4 Bounds on the Optimal Code Length. 5.5 Kraft Inequality for Uniquely Decodable Codes. 5.6 Huffman Codes. 5.7 Some Comments on Huffman Codes. 5.8 Optimality of Huffman Codes. 5.9 Shannon-Fano-Elias Coding. 5.10 Competitive Optimality of the Shannon Code. 5.11 Generation of Discrete Distributions from Fair Coins. Summary. Problems. Historical Notes. 6. Gambling and Data Compression. 6.1 The Horse Race. 6.2 Gambling and Side Information. 6.3 Dependent Horse Races and Entropy Rate. 6.4 The Entropy of English. 6.5 Data Compression and Gambling. 6.6 Gambling Estimate of the Entropy of English. Summary. Problems. Historical Notes. 7. Channel Capacity. 7.1 Examples of Channel Capacity. 7.2 Symmetric Channels. 7.3 Properties of Channel Capacity. 7.4 Preview of the Channel Coding Theorem. 7.5 Definitions. 7.6 Jointly Typical Sequences. 7.7 Channel Coding Theorem. 7.8 Zero-Error Codes. 7.9 Fano's Inequality and the Converse to the Coding Theorem. 7.10 Equality in the Converse to the Channel Coding Theorem. 7.11 Hamming Codes. 7.12 Feedback Capacity. 7.13 Source-Channel Separation Theorem. Summary. Problems. Historical Notes. 8. Differential Entropy. 8.1 Definitions. 8.2 AEP for Continuous Random Variables. 8.3 Relation of Differential Entropy to Discrete Entropy. 8.4 Joint and Conditional Differential Entropy. 8.5 Relative Entropy and Mutual Information. 8.6 Properties of Differential Entropy, Relative Entropy, and Mutual Information. Summary. Problems. Historical Notes. 9. Gaussian Channel. 9.1 Gaussian Channel: Definitions. 9.2 Converse to the Coding Theorem for Gaussian Channels. 9.3 Bandlimited Channels. 9.4 Parallel Gaussian Channels. 9.5 Channels with Colored Gaussian Noise. 9.6 Gaussian Channels with Feedback. Summary. Problems. Historical Notes. 10. Rate Distortion Theory. 10.1 Quantization. 10.2 Definitions. 10.3 Calculation of the Rate Distortion Function. 10.4 Converse to the Rate Distortion Theorem. 10.5 Achievability of the Rate Distortion Function. 10.6 Strongly Typical Sequences and Rate Distortion. 10.7 Characterization of the Rate Distortion Function. 10.8 Computation of Channel Capacity and the Rate Distortion Function. Summary. Problems. Historical Notes. 11. Information Theory and Statistics. 11.1 Method of Types. 11.2 Law of Large Numbers. 11.3 Universal Source Coding. 11.4 Large Deviation Theory. 11.5 Examples of Sanov's Theorem. 11.6 Conditional Limit Theorem. 11.7 Hypothesis Testing. 11.8 Chernoff-Stein Lemma. 11.9 Chernoff Information. 11.10 Fisher Information and the Cram-er-Rao Inequality. Summary. Problems. Historical Notes. 12. Maximum Entropy. 12.1 Maximum Entropy Distributions. 12.2 Examples. 12.3 Anomalous Maximum Entropy Problem. 12.4 Spectrum Estimation. 12.5 Entropy Rates of a Gaussian Process. 12.6 Burg's Maximum Entropy Theorem. Summary. Problems. Historical Notes. 13. Universal Source Coding. 13.1 Universal Codes and Channel Capacity. 13.2 Universal Coding for Binary Sequences. 13.3 Arithmetic Coding. 13.4 Lempel-Ziv Coding. 13.5 Optimality of Lempel-Ziv Algorithms. Compression. Summary. Problems. Historical Notes. 14. Kolmogorov Complexity. 14.1 Models of Computation. 14.2 Kolmogorov Complexity: Definitions and Examples. 14.3 Kolmogorov Complexity and Entropy. 14.4 Kolmogorov Complexity of Integers. 14.5 Algorithmically Random and Incompressible Sequences. 14.6 Universal Probability. 14.7 Kolmogorov complexity. 14.9 Universal Gambling. 14.10 Occam's Razor. 14.11 Kolmogorov Complexity and Universal Probability. 14.12 Kolmogorov Sufficient Statistic. 14.13 Minimum Description Length Principle. Summary. Problems. Historical Notes. 15. Network Information Theory. 15.1 Gaussian Multiple-User Channels. 15.2 Jointly Typical Sequences. 15.3 Multiple-Access Channel. 15.4 Encoding of Correlated Sources. 15.5 Duality Between Slepian-Wolf Encoding and Multiple-Access Channels. 15.6 Broadcast Channel. 15.7 Relay Channel. 15.8 Source Coding with Side Information. 15.9 Rate Distortion with Side Information. 15.10 General Multiterminal Networks. Summary. Problems. Historical Notes. 16. Information Theory and Portfolio Theory. 16.1 The Stock Market: Some Definitions. 16.2 Kuhn-Tucker Characterization of the Log-Optimal Portfolio. 16.3 Asymptotic Optimality of the Log-Optimal Portfolio. 16.4 Side Information and the Growth Rate. 16.5 Investment in Stationary Markets. 16.6 Competitive Optimality of the Log-Optimal Portfolio. 16.7 Universal Portfolios. 16.8 Shannon-McMillan-Breiman Theorem (General AEP). Summary. Problems. Historical Notes. 17. Inequalities in Information Theory. 17.1 Basic Inequalities of Information Theory. 17.2 Differential Entropy. 17.3 Bounds on Entropy and Relative Entropy. 17.4 Inequalities for Types. 17.5 Combinatorial Bounds on Entropy. 17.6 Entropy Rates of Subsets. 17.7 Entropy and Fisher Information. 17.8 Entropy Power Inequality and Brunn-Minkowski Inequality. 17.9 Inequalities for Determinants. 17.10 Inequalities for Ratios of Determinants. Summary. Problems. Historical Notes. Bibliography. List of Symbols. Index. more

Content maybe subject to copyright    Report

More filters
18 Nov 2016
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

26,972 citations

01 Dec 2010
TL;DR: This chapter discusses quantum information theory, public-key cryptography and the RSA cryptosystem, and the proof of Lieb's theorem.
Abstract: Part I. Fundamental Concepts: 1. Introduction and overview 2. Introduction to quantum mechanics 3. Introduction to computer science Part II. Quantum Computation: 4. Quantum circuits 5. The quantum Fourier transform and its application 6. Quantum search algorithms 7. Quantum computers: physical realization Part III. Quantum Information: 8. Quantum noise and quantum operations 9. Distance measures for quantum information 10. Quantum error-correction 11. Entropy and information 12. Quantum information theory Appendices References Index.

14,183 citations

Journal ArticleDOI
TL;DR: Using distributed antennas, this work develops and analyzes low-complexity cooperative diversity protocols that combat fading induced by multipath propagation in wireless networks and develops performance characterizations in terms of outage events and associated outage probabilities, which measure robustness of the transmissions to fading.
Abstract: We develop and analyze low-complexity cooperative diversity protocols that combat fading induced by multipath propagation in wireless networks. The underlying techniques exploit space diversity available through cooperating terminals' relaying signals for one another. We outline several strategies employed by the cooperating radios, including fixed relaying schemes such as amplify-and-forward and decode-and-forward, selection relaying schemes that adapt based upon channel measurements between the cooperating terminals, and incremental relaying schemes that adapt based upon limited feedback from the destination terminal. We develop performance characterizations in terms of outage events and associated outage probabilities, which measure robustness of the transmissions to fading, focusing on the high signal-to-noise ratio (SNR) regime. Except for fixed decode-and-forward, all of our cooperative diversity protocols are efficient in the sense that they achieve full diversity (i.e., second-order diversity in the case of two terminals), and, moreover, are close to optimum (within 1.5 dB) in certain regimes. Thus, using distributed antennas, we can provide the powerful benefits of space diversity without need for physical arrays, though at a loss of spectral efficiency due to half-duplex operation and possibly at the cost of additional receive hardware. Applicable to any wireless setting, including cellular or ad hoc networks-wherever space constraints preclude the use of physical arrays-the performance characterizations reveal that large power or energy savings result from the use of these protocols.

12,519 citations

Cites background from "Elements of information theory"

  • ..., achievable rates, via three structurally different random coding sche mes: • facilitation [5, Theorem 2], in which the relay does not actively help the source, but rather, facilitates the sourc e transmission by inducing as little interference as possibl e; • cooperation [5, Theorem 1], in which the relay fully decodes the source message and retransmits, jointly with the source, a bin index (in the sense of Slepian-Wolf coding [14], [15]) of the previous source message; • observation2 [5, Theorem 6], in which the relay encodes a...


  • ...quantized version of its received signal, using ideas from source coding with side information [14], [16], [17]....


Journal ArticleDOI
Simon Haykin1
TL;DR: Following the discussion of interference temperature as a new metric for the quantification and management of interference, the paper addresses three fundamental cognitive tasks: radio-scene analysis, channel-state estimation and predictive modeling, and the emergent behavior of cognitive radio.
Abstract: Cognitive radio is viewed as a novel approach for improving the utilization of a precious natural resource: the radio electromagnetic spectrum. The cognitive radio, built on a software-defined radio, is defined as an intelligent wireless communication system that is aware of its environment and uses the methodology of understanding-by-building to learn from the environment and adapt to statistical variations in the input stimuli, with two primary objectives in mind: /spl middot/ highly reliable communication whenever and wherever needed; /spl middot/ efficient utilization of the radio spectrum. Following the discussion of interference temperature as a new metric for the quantification and management of interference, the paper addresses three fundamental cognitive tasks. 1) Radio-scene analysis. 2) Channel-state estimation and predictive modeling. 3) Transmit-power control and dynamic spectrum management. This work also discusses the emergent behavior of cognitive radio.

11,825 citations

Christopher M. Bishop1
01 Jan 2006
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

More filters
Journal ArticleDOI

2,252 citations

Richard E. Blahut1
01 Jan 1987

1,036 citations

01 Nov 1987

93 citations

Journal ArticleDOI
G. Longo1
01 Oct 1979
TL;DR: The stochastic integral with respect to processes with values in a reflexive Banach space, Theor.
Abstract: 14. M. Metivier, The stochastic integral with respect to processes with values in a reflexive Banach space, Theor. Probability 19 (1974), 758-787. 15. M. Metivier and G. Pistone, Une formule d'isométrie pour r intégrale stochastique Hilbertienne et équations d'évolution linéaires stochastiques, Z. Wahrscheinlicnkeitstheorie und Verw. Gebiete 33 (1975), 1-18. 16. P. A. Meyer, A decomposition theorem for supermartingales, Illinois J. Math. 6 (1962), 193-205. 17. , Intégrales stochastiques. IV, Lecture Notes in Math., vol. 39, Springer-Verlag, Berlin and New York, 1967, pp. 142-162. 18. , Un cours sur les intégrales stochastiques, Lecture Notes in Math., vol. 511, Springer-Verlag, Berlin and New York, 1976, pp. 245-400. 19. , Intégrales Hilbertiennes, Lecture Notes in Math., vol. 581, Springer-Verlag, Berlin and New York, 1977, pp. 446-461. 20. R. E. A. C. Paley, N. Wiener and A. Zygmund, Notes on random functions, Math. Z. 37 (1933), 647-668. 21. J. Pellaumail, Sur lintégrale stochastique et la décomposition de Doob-Meyer, Asterique 9 (1973), 1-125. 22. P. E. Protter, Markov solutions of stochastic differential equations, TL Wahrscheinlichkeitstheorie und Verw. Gebiete 41 (1977), 39-58. 23. , A comparison of stochastic integrals, Ann. Probability (to appear). 24. R. L. Stratonovich, A new representation for stochastic integrals and equations, SIAM. J. Control 4 (1966), 362-371. 25. N. Wiener, Differential-space, J. Math, and Physics 2 (1923), 131-174.

37 citations