Open Access

Contributions to the theory of finite-state based grammars

TLDR
It is argued that the findings of this dissertation help to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing.
Abstract
This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIGs) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs:
(i) Computational complexity of grammars under limiting parameters. In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction.
(ii) Linguistically applicable structural representations. Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left and right branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear-time parseable.
(iii) Compilation and simplification of linguistic constraints. Efficient compilation methods for certain regular operations such as the generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched.
These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing.


Citations
Proceedings Article

Multitiered nonlinear morphology using multitape finite automata: A case study on Syriac and Arabic

TL;DR: The authors presented a computational model for nonlinear morphology with illustrations from Syriac and Arabic, which allows for multiple lexical representations corresponding to the multiple tiers of autosegmental phonology.
Book Chapter

Applications of Diamonded Double Negation

TL;DR: The paper demonstrates that the GR operation has an interesting potential in expressing regular languages, various kinds of grammars, bimorphisms and relations, and motivates a further study of optimized implementation of the operator.
Proceedings Article

Constraint Grammar Parsing with Left and Right Sequential Finite Transducers

TL;DR: It is shown that the method can improve on the worst-case asymptotic bound of Constraint Grammar parsing from cubic to quadratic in the length of input sentences.
Journal Article

Framework and resources for natural language parser evaluation

TL;DR: The paper presents a framework (called FEPa) that can be used to carry out practical parser evaluations and comparisons, together with a set of new evaluation resources: FiEval, a Finnish treebank under construction, and MGTS and RobSet, parser evaluation resources in English.