Open Access Book

The Informational Complexity of Learning: Perspectives on Neural Networks and Generative Grammar

Partha Niyogi
TL;DR
The Informational Complexity of Learning: Perspectives on Neural Networks and Generative Grammar brings together two important but very different learning problems - learning functional mappings with neural networks and learning natural language grammars - and analyzes both within a single formal framework.
Abstract
From the Publisher: Among other topics, The Informational Complexity of Learning: Perspectives on Neural Networks and Generative Grammar brings together two important but very different learning problems within the same analytical framework. The first concerns learning functional mappings using neural networks; the second concerns learning natural language grammars in the principles-and-parameters tradition of Chomsky. These two learning problems are seemingly very different: neural networks are real-valued, infinite-dimensional, continuous mappings, whereas grammars are boolean-valued, finite-dimensional, discrete (symbolic) mappings. Furthermore, the research communities that work in the two areas almost never overlap. The book's objective is to bridge this gap. It uses the formal techniques developed in statistical learning theory and theoretical computer science over the last decade to analyze both kinds of learning problems. By asking the same question - how much information does it take to learn? - of both problems, it highlights their similarities and differences. Specific results include model selection in neural networks, active learning, language learning, and evolutionary models of language change.
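How much information it takes to learn is usually made precise as sample complexity. As an illustrative sketch of the kind of bound statistical learning theory provides (standard PAC notation, not a formula quoted from the book): for a hypothesis class $\mathcal{H}$ of VC dimension $d$, learning to accuracy $\varepsilon$ with confidence $1-\delta$ requires a number of labeled examples on the order of

$$ m(\varepsilon, \delta) \;=\; O\!\left( \frac{1}{\varepsilon}\left( d \log\frac{1}{\varepsilon} + \log\frac{1}{\delta} \right) \right), $$

so the informational demand scales with the richness of the class being learned, whether that class consists of neural network mappings or of finitely parameterized grammars.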


Citations
Journal ArticleDOI

The covering number in learning theory

TL;DR: This work gives estimates for the covering number of a ball of a reproducing kernel Hilbert space, regarded as a subset of the space of continuous functions, in terms of the regularity of the Mercer kernel K, and provides an example of a Mercer kernel showing that L_K^{1/2} may not be generated by a Mercer kernel.
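For context, the covering number in question is the standard one (the notation here is illustrative): for a subset $S$ of the space $C(X)$ of continuous functions and a scale $\eta > 0$,

$$ \mathcal{N}(S, \eta) \;=\; \min\Big\{ \ell \in \mathbb{N} : \exists\, f_1, \dots, f_\ell \in C(X) \text{ with } S \subseteq \bigcup_{j=1}^{\ell} \{ f : \| f - f_j \|_{\infty} \le \eta \} \Big\}, $$

the smallest number of $\eta$-balls in the uniform norm needed to cover $S$; here $S$ is a ball of the reproducing kernel Hilbert space induced by the Mercer kernel $K$.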
Journal ArticleDOI

SVM Soft Margin Classifiers: Linear Programming versus Quadratic Programming

TL;DR: This article shows that the convergence behavior of the linear programming SVM is almost the same as that of the quadratic programming SVM, and proposes an upper bound for the misclassification error for general probability distributions.
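As a sketch of the two formulations being compared (standard textbook forms; the article's normalization may differ), the quadratic programming soft margin classifier solves

$$ \min_{w,\,b,\,\xi}\; \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{s.t. } y_i\big(\langle w, \phi(x_i)\rangle + b\big) \ge 1 - \xi_i,\; \xi_i \ge 0, $$

while the linear programming variant penalizes the 1-norm of the coefficients in the kernel expansion $f(x) = \sum_j \alpha_j K(x, x_j) + b$:

$$ \min_{\alpha,\,b,\,\xi}\; \sum_{j=1}^{m} |\alpha_j| + C \sum_{i=1}^{m} \xi_i \quad \text{s.t. } y_i f(x_i) \ge 1 - \xi_i,\; \xi_i \ge 0. $$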
Proceedings Article

Almost-everywhere algorithmic stability and generalization error

TL;DR: In this paper, the authors introduce the notion of training stability of a learning algorithm and show that, in a general setting, it is sufficient for good bounds on generalization error; in the PAC setting, training stability is both necessary and sufficient for learnability.
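For background, the classical and stronger notion is uniform stability, sketched here in standard notation rather than the paper's: an algorithm $A$ is uniformly $\beta$-stable if for every training set $S$ of size $m$, every index $i$, and every point $z$,

$$ \big| \ell(A_S, z) - \ell(A_{S^{\setminus i}}, z) \big| \le \beta, $$

where $S^{\setminus i}$ is $S$ with the $i$-th example removed. Roughly speaking, the paper's training stability is a weaker, probabilistic variant of requirements of this type, holding with high probability over the draw of the training set.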
Journal ArticleDOI

Modeling Language Evolution

TL;DR: This work describes a model for the evolution of the languages used by the agents of a society and proves convergence of these languages to a common one under certain conditions.
ReportDOI

A Dynamical Systems Model for Language Change

TL;DR: Formalizing linguists' intuitions of language change as a dynamical system, the computer model is applied to the historical loss of Verb Second from Old French to Modern French, showing that otherwise adequate grammatical theories can fail the authors' new evolutionary criterion.
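To make the dynamical-systems reading concrete, below is a minimal toy sketch in Python (illustrative assumptions only: the two-grammar setup, the acquisition rule, and the parameters n, a1, a2 are not taken from the report). The population state is the fraction x_t of speakers using grammar G1; each new learner samples n sentences and adopts G1 if the unambiguous evidence for G1 outnumbers that for G2, and iterating the resulting map traces language change across generations.

from math import comb  # binomial coefficients (Python 3.8+)

def acquisition_probability(x, n=10, a1=0.7, a2=0.6):
    """Probability that a new learner settles on grammar G1.

    x  : fraction of the parent generation using G1
    n  : number of sentences the learner samples
    a1 : chance a G1 speaker utters a sentence only G1 can parse (assumed)
    a2 : chance a G2 speaker utters a sentence only G2 can parse (assumed)
    """
    p1 = x * a1              # probability a sampled sentence is unambiguous G1 evidence
    p2 = (1 - x) * a2        # probability it is unambiguous G2 evidence
    p0 = 1.0 - p1 - p2       # ambiguous sentences carry no information
    prob = 0.0
    # Sum the multinomial probabilities of all evidence counts (k1, k2, k0)
    # in which G1 evidence strictly outnumbers G2 evidence.
    for k1 in range(n + 1):
        for k2 in range(n - k1 + 1):
            if k1 > k2:
                k0 = n - k1 - k2
                prob += (comb(n, k1) * comb(n - k1, k2)
                         * p1 ** k1 * p2 ** k2 * p0 ** k0)
    return prob

def iterate(x0=0.5, generations=30):
    """Iterate the generational map x_{t+1} = P(learner acquires G1 | x_t)."""
    x, trajectory = x0, [x0]
    for _ in range(generations):
        x = acquisition_probability(x)
        trajectory.append(x)
    return trajectory

if __name__ == "__main__":
    for t, x in enumerate(iterate()):
        print(f"generation {t:2d}: fraction using G1 = {x:.3f}")

Depending on the parameters, the iterated map drives the population toward one grammar or the other, which is the kind of evolutionary behavior against which grammatical theories are evaluated in the report.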