scispace - formally typeset
Search or ask a question

An Introduction to Kolmogorov Complexity and Its Applications

01 Jan 2019-
TL;DR: The book presents a thorough treatment of the central ideas and their applications of Kolmogorov complexity with a wide range of illustrative applications, and will be ideal for advanced undergraduate students, graduate students, and researchers in computer science, mathematics, cognitive sciences, philosophy, artificial intelligence, statistics, and physics.
Abstract: The book is outstanding and admirable in many respects. ... is necessary reading for all kinds of readers from undergraduate students to top authorities in the field. Journal of Symbolic Logic Written by two experts in the field, this is the only comprehensive and unified treatment of the central ideas and their applications of Kolmogorov complexity. The book presents a thorough treatment of the subject with a wide range of illustrative applications. Such applications include the randomness of finite objects or infinite sequences, Martin-Loef tests for randomness, information theory, computational learning theory, the complexity of algorithms, and the thermodynamics of computing. It will be ideal for advanced undergraduate students, graduate students, and researchers in computer science, mathematics, cognitive sciences, philosophy, artificial intelligence, statistics, and physics. The book is self-contained in that it contains the basic requirements from mathematics and computer science. Included are also numerous problem sets, comments, source references, and hints to solutions of problems. New topics in this edition include Omega numbers, KolmogorovLoveland randomness, universal learning, communication complexity, Kolmogorov's random graphs, time-limited universal distribution, Shannon information and others.
Citations
More filters
Journal ArticleDOI
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

14,635 citations


Cites background from "An Introduction to Kolmogorov Compl..."

  • ...Most FNN applications focused on FNNs with few hidden layers....

    [...]

  • ...In Supervised Learning (SL), certain NN output events xt may be associated with teacher-given, real-valued labels or targets dt yielding errors et , e.g., et = 1/2(xt − dt)2....

    [...]

  • ...…of a solution candidate by the length of the shortest program that computes it (e.g., Blumer, Ehrenfeucht, Haussler, & Warmuth, 1987; Chaitin, 1966; Grünwald,Myung, & Pitt, 2005; Kolmogorov, 1965b; Levin, 1973a; Li & Vitányi, 1997; Rissanen, 1986; Solomonoff, 1964, 1978;Wallace & Boulton, 1968)....

    [...]

Journal ArticleDOI
TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.
Abstract: In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.

10,696 citations


Cites background from "An Introduction to Kolmogorov Compl..."

  • ...…is important to keep in mind that there exist several fundamentally different approaches such as Minimum Description Length (cf. e.g. Rissanen 1978, Li and Vitányi 1993) which is based on the idea that the simplicity of an estimate, and therefore also its plausibility is based on the information…...

    [...]

Journal ArticleDOI
TL;DR: This survey tries to provide a structured and comprehensive overview of the research on anomaly detection by grouping existing techniques into different categories based on the underlying approach adopted by each technique.
Abstract: Anomaly detection is an important problem that has been researched within diverse research areas and application domains. Many anomaly detection techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection. We have grouped existing techniques into different categories based on the underlying approach adopted by each technique. For each category we have identified key assumptions, which are used by the techniques to differentiate between normal and anomalous behavior. When applying a given technique to a particular domain, these assumptions can be used as guidelines to assess the effectiveness of the technique in that domain. For each category, we provide a basic anomaly detection technique, and then show how the different existing techniques in that category are variants of the basic technique. This template provides an easier and more succinct understanding of the techniques belonging to each category. Further, for each category, we identify the advantages and disadvantages of the techniques in that category. We also provide a discussion on the computational complexity of the techniques since it is an important issue in real application domains. We hope that this survey will provide a better understanding of the different directions in which research has been done on this topic, and how techniques developed in one area can be applied in domains for which they were not intended to begin with.

9,627 citations

Book
01 Jan 2009
TL;DR: The motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer modelssuch as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks are discussed.
Abstract: Can machine learning deliver AI? Theoretical results, inspiration from the brain and cognition, as well as machine learning experiments suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one would need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers, graphical models with many levels of latent variables, or in complicated propositional formulae re-using many sub-formulae. Each level of the architecture represents features at a different level of abstraction, defined as a composition of lower-level features. Searching the parameter space of deep architectures is a difficult task, but new algorithms have been discovered and a new sub-area has emerged in the machine learning community since 2006, following these discoveries. Learning algorithms such as those for Deep Belief Networks and other related unsupervised learning algorithms have recently been proposed to train deep architectures, yielding exciting results and beating the state-of-the-art in certain areas. Learning Deep Architectures for AI discusses the motivations for and principles of learning algorithms for deep architectures. By analyzing and comparing recent results with different learning algorithms for deep architectures, explanations for their success are proposed and discussed, highlighting challenges and suggesting avenues for future explorations in this area.

7,767 citations


Cites background from "An Introduction to Kolmogorov Compl..."

  • ...According to learning theory (Vapnik, 1995; Li & Vitanyi, 1997), to obtain good generalization it is enough that the total number of bits needed to encode thewhole training setbe small, compared to the size of the training set....

    [...]

MonographDOI
01 Jan 2006
TL;DR: This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms, into planning under differential constraints that arise when automating the motions of virtually any mechanical system.
Abstract: Planning algorithms are impacting technical disciplines and industries around the world, including robotics, computer-aided design, manufacturing, computer graphics, aerospace applications, drug design, and protein folding. This coherent and comprehensive book unifies material from several sources, including robotics, control theory, artificial intelligence, and algorithms. The treatment is centered on robot motion planning but integrates material on planning in discrete spaces. A major part of the book is devoted to planning under uncertainty, including decision theory, Markov decision processes, and information spaces, which are the “configuration spaces” of all sensor-based planning problems. The last part of the book delves into planning under differential constraints that arise when automating the motions of virtually any mechanical system. Developed from courses taught by the author, the book is intended for students, engineers, and researchers in robotics, artificial intelligence, and control theory as well as computer graphics, algorithms, and computational biology.

6,340 citations


Cites background or methods from "An Introduction to Kolmogorov Compl..."

  • ...As long as the instance encoding is within a polynomial factor of the optimal encoding (this can be made precise using Kolmogorov complexity [633]), then this bizarre behavior is avoided....

    [...]

  • ...This brings it closer to the Kolmogorov complexity [250, 633] of the state transition graph, which is the shortest bit size to which it can possibly be compressed and then fully recovered by a Turing machine....

    [...]

References
More filters
Journal ArticleDOI
TL;DR: This chapter discusses the application of the diagonal process of the universal computing machine, which automates the calculation of circle and circle-free numbers.
Abstract: 1. Computing machines. 2. Definitions. Automatic machines. Computing machines. Circle and circle-free numbers. Computable sequences and numbers. 3. Examples of computing machines. 4. Abbreviated tables Further examples. 5. Enumeration of computable sequences. 6. The universal computing machine. 7. Detailed description of the universal machine. 8. Application of the diagonal process. Pagina 1 di 38 On computable numbers, with an application to the Entscheidungsproblem A. M. ...

7,642 citations

Journal ArticleDOI
TL;DR: In this paper, the authors describe the possibility of simulating physics in the classical approximation, a thing which is usually described by local differential equations, and the possibility that there is to be an exact simulation, that the computer will do exactly the same as nature.
Abstract: This chapter describes the possibility of simulating physics in the classical approximation, a thing which is usually described by local differential equations. But the physical world is quantum mechanical, and therefore the proper problem is the simulation of quantum physics. A computer which will give the same probabilities as the quantum system does. The present theory of physics allows space to go down into infinitesimal distances, wavelengths to get infinitely great, terms to be summed in infinite order, and so forth; and therefore, if this proposition is right, physical law is wrong. Quantum theory and quantizing is a very specific type of theory. The chapter talks about the possibility that there is to be an exact simulation, that the computer will do exactly the same as nature. There are interesting philosophical questions about reasoning, and relationship, observation, and measurement and so on, which computers have stimulated people to think about anew, with new types of thinking.

7,202 citations

Proceedings ArticleDOI
03 May 1971
TL;DR: It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing machine can be “reduced” to the problem of determining whether a given propositional formula is a tautology.
Abstract: It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing machine can be “reduced” to the problem of determining whether a given propositional formula is a tautology. Here “reduced” means, roughly speaking, that the first problem can be solved deterministically in polynomial time provided an oracle is available for solving the second. From this notion of reducible, polynomial degrees of difficulty are defined, and it is shown that the problem of determining tautologyhood has the same polynomial degree as the problem of determining whether the first of two given graphs is isomorphic to a subgraph of the second. Other examples are discussed. A method of measuring the complexity of proof procedures for the predicate calculus is introduced and discussed.

6,675 citations

Book ChapterDOI
TL;DR: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady.
Abstract: This chapter reproduces the English translation by B. Seckler of the paper by Vapnik and Chervonenkis in which they gave proofs for the innovative results they had obtained in a draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady. The paper was first published in Russian as Вапник В. Н. and Червоненкис А. Я. О равномерноЙ сходимости частот появления событиЙ к их вероятностям. Теория вероятностеЙ и ее применения 16(2), 264–279 (1971).

3,939 citations

Journal ArticleDOI
TL;DR: In this paper, it was shown that the likelihood ratio test for fixed sample size can be reduced to this form, and that for large samples, a sample of size $n$ with the first test will give about the same probabilities of error as a sample with the second test.
Abstract: In many cases an optimum or computationally convenient test of a simple hypothesis $H_0$ against a simple alternative $H_1$ may be given in the following form. Reject $H_0$ if $S_n = \sum^n_{j=1} X_j \leqq k,$ where $X_1, X_2, \cdots, X_n$ are $n$ independent observations of a chance variable $X$ whose distribution depends on the true hypothesis and where $k$ is some appropriate number. In particular the likelihood ratio test for fixed sample size can be reduced to this form. It is shown that with each test of the above form there is associated an index $\rho$. If $\rho_1$ and $\rho_2$ are the indices corresponding to two alternative tests $e = \log \rho_1/\log \rho_2$ measures the relative efficiency of these tests in the following sense. For large samples, a sample of size $n$ with the first test will give about the same probabilities of error as a sample of size $en$ with the second test. To obtain the above result, use is made of the fact that $P(S_n \leqq na)$ behaves roughly like $m^n$ where $m$ is the minimum value assumed by the moment generating function of $X - a$. It is shown that if $H_0$ and $H_1$ specify probability distributions of $X$ which are very close to each other, one may approximate $\rho$ by assuming that $X$ is normally distributed.

3,760 citations