Posted Content•

# The Generalized Universal Law of Generalization

...read more

##### Citations

More filters

••

[...]

TL;DR: It is argued that the top-down approach to modeling cognition yields greater flexibility for exploring the representations and inductive biases that underlie human cognition.

Abstract: Cognitive science aims to reverse-engineer the mind, and many of the engineering challenges the mind faces involve induction. The probabilistic approach to modeling cognition begins by identifying ideal solutions to these inductive problems. Mental processes are then modeled using algorithms for approximating these solutions, and neural processes are viewed as mechanisms for implementing these algorithms, with the result being a top-down analysis of cognition starting with the function of cognitive processes. Typical connectionist models, by contrast, follow a bottom-up approach, beginning with a characterization of neural mechanisms and exploring what macro-level functional phenomena might emerge. We argue that the top-down approach yields greater flexibility for exploring the representations and inductive biases that underlie human cognition.

399 citations

••

[...]

TL;DR: A novel, principled and unified technique for pattern analysis and generation that ensures computational efficiency and enables a straightforward incorporation of domain knowledge will be presented and has the potential to reduce computational time significantly.

Abstract: The advent of multiple-point geostatistics (MPS) gave rise to the integration of complex subsurface geological structures and features into the model by the concept of training images Initial algorithms generate geologically realistic realizations by using these training images to obtain conditional probabilities needed in a stochastic simulation framework More recent pattern-based geostatistical algorithms attempt to improve the accuracy of the training image pattern reproduction In these approaches, the training image is used to construct a pattern database Consequently, sequential simulation will be carried out by selecting a pattern from the database and pasting it onto the simulation grid One of the shortcomings of the present algorithms is the lack of a unifying framework for classifying and modeling the patterns from the training image In this paper, an entirely different approach will be taken toward geostatistical modeling A novel, principled and unified technique for pattern analysis and generation that ensures computational efficiency and enables a straightforward incorporation of domain knowledge will be presented In the developed methodology, patterns scanned from the training image are represented as points in a Cartesian space using multidimensional scaling The idea behind this mapping is to use distance functions as a tool for analyzing variability between all the patterns in a training image These distance functions can be tailored to the application at hand Next, by significantly reducing the dimensionality of the problem and using kernel space mapping, an improved pattern classification algorithm is obtained This paper discusses the various implementation details to accomplish these ideas Several examples are presented and a qualitative comparison is made with previous methods An improved pattern continuity and data-conditioning capability is observed in the generated realizations for both continuous and categorical variables We show how the proposed methodology is much less sensitive to the user-provided parameters, and at the same time has the potential to reduce computational time significantly

256 citations

••

[...]

TL;DR: This paper presents a series of analyses of phonological cues and distributional cues and their potential for distinguishing grammatical categories of words in corpus analyses and indicates that phonological and Distributional cues contribute differentially towards grammatical categorisation.

Abstract: Recognising the grammatical categories of words is a necessary skill for the acquisition of syntax and for on-line sentence processing. The syntactic and semantic context of the word contribute as cues for grammatical category assignment, but phonological cues, too, have been implicated as important sources of information. The value of phonological and distributional cues has not, with very few exceptions, been empirically assessed. This paper presents a series of analyses of phonological cues and distributional cues and their potential for distinguishing grammatical categories of words in corpus analyses. The corpus analyses indicated that phonological cues were more reliable for less frequent words, whereas distributional information was most valuable for high frequency words. We tested this prediction in an artificial language learning experiment, where the distributional and phonological cues of categories of nonsense words were varied. The results corroborated the corpus analyses. For high-frequency nonwords, distributional information was more useful, whereas for low-frequency words there was more reliance on phonological cues. The results indicate that phonological and distributional cues contribute differentially towards grammatical categorisation.

199 citations

••

[...]

TL;DR: How and why the P-Cognition thesis may be overly restrictive is explained, risking the exclusion of veridical computational-level theories from scientific investigation, and an argument is made to replace the Tractable Cognition thesis by the FPT-Cognitive thesis as an alternative formalization.

Abstract: The recognition that human minds/brains are finite systems with limited resources for computation has led some researchers to advance the Tractable Cognition thesis: Human cognitive capacities are constrained by computational tractability. This thesis, if true, serves cognitive psychology by constraining the space of computational-level theories of cognition. To utilize this constraint, a precise and workable definition of "computational tractability" is needed. Following computer science tradition, many cognitive scientists and psychologists define computational tractability as polynomial-time computability, leading to the P-Cognition thesis. This article explains how and why the P-Cognition thesis may be overly restrictive, risking the exclusion of veridical computational-level theories from scientific investigation. An argument is made to replace the P-Cognition thesis by the FPT-Cognition thesis as an alternative formalization of the Tractable Cognition thesis (here, FPT stands for fixed-parameter tractable). Possible objections to the Tractable Cognition thesis, and its proposed formalization, are discussed, and existing misconceptions are clarified.

183 citations

••

[...]

TL;DR: It is suggested that, for the vast majority of classic findings in cognitive science, embodied cognition offers no scientifically valuable insight and is also unable to adequately address the basic experiences of cognitive life.

Abstract: In recent years, there has been rapidly growing interest in embodied cognition, a multifaceted theoretical proposition that (1) cognitive processes are influenced by the body, (2) cognition exists in the service of action, (3) cognition is situated in the environment, and (4) cognition may occur without internal representations. Many proponents view embodied cognition as the next great paradigm shift for cognitive science. In this article, we critically examine the core ideas from embodied cognition, taking a "thought exercise" approach. We first note that the basic principles from embodiment theory are either unacceptably vague (e.g., the premise that perception is influenced by the body) or they offer nothing new (e.g., cognition evolved to optimize survival, emotions affect cognition, perception-action couplings are important). We next suggest that, for the vast majority of classic findings in cognitive science, embodied cognition offers no scientifically valuable insight. In most cases, the theory has no logical connections to the phenomena, other than some trivially true ideas. Beyond classic laboratory findings, embodiment theory is also unable to adequately address the basic experiences of cognitive life.

100 citations

##### References

More filters

•

[...]

01 Jan 1991

TL;DR: The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.

Abstract: Preface to the Second Edition. Preface to the First Edition. Acknowledgments for the Second Edition. Acknowledgments for the First Edition. 1. Introduction and Preview. 1.1 Preview of the Book. 2. Entropy, Relative Entropy, and Mutual Information. 2.1 Entropy. 2.2 Joint Entropy and Conditional Entropy. 2.3 Relative Entropy and Mutual Information. 2.4 Relationship Between Entropy and Mutual Information. 2.5 Chain Rules for Entropy, Relative Entropy, and Mutual Information. 2.6 Jensen's Inequality and Its Consequences. 2.7 Log Sum Inequality and Its Applications. 2.8 Data-Processing Inequality. 2.9 Sufficient Statistics. 2.10 Fano's Inequality. Summary. Problems. Historical Notes. 3. Asymptotic Equipartition Property. 3.1 Asymptotic Equipartition Property Theorem. 3.2 Consequences of the AEP: Data Compression. 3.3 High-Probability Sets and the Typical Set. Summary. Problems. Historical Notes. 4. Entropy Rates of a Stochastic Process. 4.1 Markov Chains. 4.2 Entropy Rate. 4.3 Example: Entropy Rate of a Random Walk on a Weighted Graph. 4.4 Second Law of Thermodynamics. 4.5 Functions of Markov Chains. Summary. Problems. Historical Notes. 5. Data Compression. 5.1 Examples of Codes. 5.2 Kraft Inequality. 5.3 Optimal Codes. 5.4 Bounds on the Optimal Code Length. 5.5 Kraft Inequality for Uniquely Decodable Codes. 5.6 Huffman Codes. 5.7 Some Comments on Huffman Codes. 5.8 Optimality of Huffman Codes. 5.9 Shannon-Fano-Elias Coding. 5.10 Competitive Optimality of the Shannon Code. 5.11 Generation of Discrete Distributions from Fair Coins. Summary. Problems. Historical Notes. 6. Gambling and Data Compression. 6.1 The Horse Race. 6.2 Gambling and Side Information. 6.3 Dependent Horse Races and Entropy Rate. 6.4 The Entropy of English. 6.5 Data Compression and Gambling. 6.6 Gambling Estimate of the Entropy of English. Summary. Problems. Historical Notes. 7. Channel Capacity. 7.1 Examples of Channel Capacity. 7.2 Symmetric Channels. 7.3 Properties of Channel Capacity. 7.4 Preview of the Channel Coding Theorem. 7.5 Definitions. 7.6 Jointly Typical Sequences. 7.7 Channel Coding Theorem. 7.8 Zero-Error Codes. 7.9 Fano's Inequality and the Converse to the Coding Theorem. 7.10 Equality in the Converse to the Channel Coding Theorem. 7.11 Hamming Codes. 7.12 Feedback Capacity. 7.13 Source-Channel Separation Theorem. Summary. Problems. Historical Notes. 8. Differential Entropy. 8.1 Definitions. 8.2 AEP for Continuous Random Variables. 8.3 Relation of Differential Entropy to Discrete Entropy. 8.4 Joint and Conditional Differential Entropy. 8.5 Relative Entropy and Mutual Information. 8.6 Properties of Differential Entropy, Relative Entropy, and Mutual Information. Summary. Problems. Historical Notes. 9. Gaussian Channel. 9.1 Gaussian Channel: Definitions. 9.2 Converse to the Coding Theorem for Gaussian Channels. 9.3 Bandlimited Channels. 9.4 Parallel Gaussian Channels. 9.5 Channels with Colored Gaussian Noise. 9.6 Gaussian Channels with Feedback. Summary. Problems. Historical Notes. 10. Rate Distortion Theory. 10.1 Quantization. 10.2 Definitions. 10.3 Calculation of the Rate Distortion Function. 10.4 Converse to the Rate Distortion Theorem. 10.5 Achievability of the Rate Distortion Function. 10.6 Strongly Typical Sequences and Rate Distortion. 10.7 Characterization of the Rate Distortion Function. 10.8 Computation of Channel Capacity and the Rate Distortion Function. Summary. Problems. Historical Notes. 11. Information Theory and Statistics. 11.1 Method of Types. 11.2 Law of Large Numbers. 11.3 Universal Source Coding. 11.4 Large Deviation Theory. 11.5 Examples of Sanov's Theorem. 11.6 Conditional Limit Theorem. 11.7 Hypothesis Testing. 11.8 Chernoff-Stein Lemma. 11.9 Chernoff Information. 11.10 Fisher Information and the Cram-er-Rao Inequality. Summary. Problems. Historical Notes. 12. Maximum Entropy. 12.1 Maximum Entropy Distributions. 12.2 Examples. 12.3 Anomalous Maximum Entropy Problem. 12.4 Spectrum Estimation. 12.5 Entropy Rates of a Gaussian Process. 12.6 Burg's Maximum Entropy Theorem. Summary. Problems. Historical Notes. 13. Universal Source Coding. 13.1 Universal Codes and Channel Capacity. 13.2 Universal Coding for Binary Sequences. 13.3 Arithmetic Coding. 13.4 Lempel-Ziv Coding. 13.5 Optimality of Lempel-Ziv Algorithms. Compression. Summary. Problems. Historical Notes. 14. Kolmogorov Complexity. 14.1 Models of Computation. 14.2 Kolmogorov Complexity: Definitions and Examples. 14.3 Kolmogorov Complexity and Entropy. 14.4 Kolmogorov Complexity of Integers. 14.5 Algorithmically Random and Incompressible Sequences. 14.6 Universal Probability. 14.7 Kolmogorov complexity. 14.9 Universal Gambling. 14.10 Occam's Razor. 14.11 Kolmogorov Complexity and Universal Probability. 14.12 Kolmogorov Sufficient Statistic. 14.13 Minimum Description Length Principle. Summary. Problems. Historical Notes. 15. Network Information Theory. 15.1 Gaussian Multiple-User Channels. 15.2 Jointly Typical Sequences. 15.3 Multiple-Access Channel. 15.4 Encoding of Correlated Sources. 15.5 Duality Between Slepian-Wolf Encoding and Multiple-Access Channels. 15.6 Broadcast Channel. 15.7 Relay Channel. 15.8 Source Coding with Side Information. 15.9 Rate Distortion with Side Information. 15.10 General Multiterminal Networks. Summary. Problems. Historical Notes. 16. Information Theory and Portfolio Theory. 16.1 The Stock Market: Some Definitions. 16.2 Kuhn-Tucker Characterization of the Log-Optimal Portfolio. 16.3 Asymptotic Optimality of the Log-Optimal Portfolio. 16.4 Side Information and the Growth Rate. 16.5 Investment in Stationary Markets. 16.6 Competitive Optimality of the Log-Optimal Portfolio. 16.7 Universal Portfolios. 16.8 Shannon-McMillan-Breiman Theorem (General AEP). Summary. Problems. Historical Notes. 17. Inequalities in Information Theory. 17.1 Basic Inequalities of Information Theory. 17.2 Differential Entropy. 17.3 Bounds on Entropy and Relative Entropy. 17.4 Inequalities for Types. 17.5 Combinatorial Bounds on Entropy. 17.6 Entropy Rates of Subsets. 17.7 Entropy and Fisher Information. 17.8 Entropy Power Inequality and Brunn-Minkowski Inequality. 17.9 Inequalities for Determinants. 17.10 Inequalities for Ratios of Determinants. Summary. Problems. Historical Notes. Bibliography. List of Symbols. Index.

42,928 citations

[...]

01 Jan 1979

42,654 citations

[...]

01 Jan 1979

12,322 citations

•

[...]

01 Jan 1948

TL;DR: The Mathematical Theory of Communication (MTOC) as discussed by the authors was originally published as a paper on communication theory more than fifty years ago and has since gone through four hardcover and sixteen paperback printings.

Abstract: Scientific knowledge grows at a phenomenal pace--but few books have had as lasting an impact or played as important a role in our modern world as The Mathematical Theory of Communication, published originally as a paper on communication theory more than fifty years ago. Republished in book form shortly thereafter, it has since gone through four hardcover and sixteen paperback printings. It is a revolutionary work, astounding in its foresight and contemporaneity. The University of Illinois Press is pleased and honored to issue this commemorative reprinting of a classic.

10,210 citations

••

[...]

TL;DR: This chapter discusses the application of the diagonal process of the universal computing machine, which automates the calculation of circle and circle-free numbers.

Abstract: 1. Computing machines. 2. Definitions. Automatic machines. Computing machines. Circle and circle-free numbers. Computable sequences and numbers. 3. Examples of computing machines. 4. Abbreviated tables Further examples. 5. Enumeration of computable sequences. 6. The universal computing machine. 7. Detailed description of the universal machine. 8. Application of the diagonal process. Pagina 1 di 38 On computable numbers, with an application to the Entscheidungsproblem A. M. ...

7,330 citations

##### Related Papers (5)

[...]

[...]

[...]

[...]