Journal Article
A survey of grammatical inference methods for natural language learning
TL;DR: This paper surveys the methodologies for inferring context-free grammars from examples developed by researchers in the last decade, and aims to give the reader an introduction to the major concepts and current approaches in Natural Language Learning research.
Abstract:
The high complexity of natural language and the large amount of human effort and time needed to produce grammars have led several researchers in Natural Language Processing to investigate ways of automating grammar generation and updating. Many algorithms for context-free grammar inference have been proposed in the literature. This paper surveys the methodologies for inferring context-free grammars from examples developed by researchers in the last decade. After introducing some preliminary definitions and notation concerning learning and inductive inference, it describes some of the most relevant existing grammatical inference methods for natural language and classifies them according to the kind of presentation (text or informant) and the type of information (supervised, unsupervised, or semi-supervised). It also presents the state of the art in strategies for evaluating and comparing different grammar inference methods. The goal of the paper is to provide the reader with an introduction to the major concepts and current approaches in Natural Language Learning research.
Citations
Journal Article
Mixture of experts: a literature survey
Saeed Masoudnia, Reza Ebrahimpour
TL;DR: A categorisation of the ME literature into two groups is presented: the first, which partitions the problem space implicitly through a tacit competitive process between the experts, is called the mixture of implicitly localised experts (MILE); the second is called the mixture of explicitly localised experts (MELE), as it uses pre-specified clusters.
Journal Article
Cognitive science in the era of artificial intelligence: A roadmap for reverse-engineering the infant language-learner.
TL;DR: It is argued that, on the computational side, it is important to move from toy problems to the full complexity of the learning situation, and to take as input reconstructions of the sensory signals available to infants that are as faithful as possible.
Dissertation
Unsupervised learning for text-to-speech synthesis
TL;DR: The distributional analysis proposed here places the textual objects analysed in a continuous-valued space, rather than specifying a hard categorisation of those objects, so that the models generalise over objects’ surface forms in a way that is acoustically relevant.
Journal Article
Learning Grammars for Architecture-Specific Facade Parsing
TL;DR: Experimental validation and comparison with state-of-the-art grammar-based methods on four different datasets show that the learned grammar leads to much faster convergence while producing parsing results that are as accurate as or more accurate than those of handcrafted grammars and of grammars learned by other methods.
Journal Article
Automatic Learning of Linguistic Resources for Stopword Removal and Stemming from Text
TL;DR: This paper proposes a methodology to automatically learn linguistic resources for a natural language starting from texts written in that language, and experimental results show that its application may effectively provide useful linguistic resources in a fully automatic manner.
References
Book
Introduction to Automata Theory, Languages, and Computation
TL;DR: This book is a rigorous exposition of formal languages and models of computation, with an introduction to computational complexity, appropriate for upper-level computer science undergraduates who are comfortable with mathematical arguments.
Report
Building a large annotated corpus of English: the Penn Treebank
TL;DR: As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Proceedings Article
Combining labeled and unlabeled data with co-training
Avrim Blum, Tom M. Mitchell
TL;DR: A PAC-style analysis is provided for a problem setting motivated by the task of learning to classify web pages, in which the description of each example can be partitioned into two distinct views, allowing inexpensive unlabeled data to augment a much smaller set of labeled examples.
Journal Article
The CHILDES Project: Tools for Analyzing Talk
Clifton Pye, Brian MacWhinney
TL;DR: This book describes three basic tools for language analysis of transcript data by computer that have been developed in the context of the "Child Language Data Exchange System (CHILDES)" project, and focuses on their use in the child language field, believing that researchers from other areas can make the necessary analogies to their own topics.