
Showing papers on "Tree-adjoining grammar published in 2006"


Journal ArticleDOI
27 Apr 2006-Nature
TL;DR: It is shown that European starlings (Sturnus vulgaris) accurately recognize acoustic patterns defined by a recursive, self-embedding, context-free grammar, and this finding opens a new range of complex syntactic processing mechanisms to physiological investigation.
Abstract: Noam Chomsky's work on ‘generative grammar’ led to the concept of a set of rules that can generate a natural language with a hierarchical grammar, and the idea that this represents a uniquely human ability. In a series of experiments with European starlings, in which several types of ‘warble’ and ‘rattle’ took the place of words in a human language, the birds learnt to classify phrase structure grammars in a way that met the same criteria. Their performance can be said to be almost human on this yardstick. So if there are language processing capabilities that are uniquely human, they may be more context-free or at a higher level in the Chomsky hierarchy. Or perhaps there is no single property or processing capacity that differentiates human language from non-human communication systems. Humans regularly produce new utterances that are understood by other members of the same language community1. Linguistic theories account for this ability through the use of syntactic rules (or generative grammars) that describe the acceptable structure of utterances2. The recursive, hierarchical embedding of language units (for example, words or phrases within shorter sentences) that is part of the ability to construct new utterances minimally requires a ‘context-free’ grammar2,3 that is more complex than the ‘finite-state’ grammars thought sufficient to specify the structure of all non-human communication signals. Recent hypotheses make the central claim that the capacity for syntactic recursion forms the computational core of a uniquely human language faculty4,5. Here we show that European starlings (Sturnus vulgaris) accurately recognize acoustic patterns defined by a recursive, self-embedding, context-free grammar. They are also able to classify new patterns defined by the grammar and reliably exclude agrammatical patterns. Thus, the capacity to classify sequences from recursive, centre-embedded grammars is not uniquely human. 
This finding opens a new range of complex syntactic processing mechanisms to physiological investigation.
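The contrast the study turns on — finite-state (AB)ⁿ sequences versus context-free, centre-embedded AⁿBⁿ sequences — can be sketched with two string recognizers. This is an illustrative simplification: the actual stimuli were starling 'rattle' and 'warble' motifs, here replaced by the letters A and B.

```python
import re

def is_finite_state(seq):
    """Match the (AB)^n pattern: alternating AB pairs (a regular language)."""
    return re.fullmatch(r"(AB)+", seq) is not None

def is_context_free(seq):
    """Match the A^n B^n pattern: n A's followed by n B's (context-free,
    centre-embedded; not expressible by any finite-state grammar)."""
    n = len(seq) // 2
    return len(seq) % 2 == 0 and seq == "A" * n + "B" * n

assert is_finite_state("ABABAB")
assert not is_finite_state("AABB")
assert is_context_free("AAABBB")
assert not is_context_free("ABAB")
```

The second pattern requires matching the count of A's against the count of B's, which is exactly what pushes it beyond finite-state power.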

510 citations


Proceedings ArticleDOI
08 Jun 2006
TL;DR: This work presents a new model of the translation process: quasi-synchronous grammar (QG), and evaluates the cross-entropy of QGs on unseen text and shows that a better fit to bilingual data is achieved by allowing greater syntactic divergence.
Abstract: Many syntactic models in machine translation are channels that transform one tree into another, or synchronous grammars that generate trees in parallel. We present a new model of the translation process: quasi-synchronous grammar (QG). Given a source-language parse tree T1, a QG defines a monolingual grammar that generates translations of T1. The trees T2 allowed by this monolingual grammar are inspired by pieces of substructure in T1 and aligned to T1 at those points. We describe experiments learning quasi-synchronous context-free grammars from bitext. As with other monolingual language models, we evaluate the cross-entropy of QGs on unseen text and show that a better fit to bilingual data is achieved by allowing greater syntactic divergence. When evaluated on a word alignment task, QG matches standard baselines.

112 citations


Book ChapterDOI
25 Sep 2006
TL;DR: An arc-consistency algorithm for context-free grammars is devised, logic combinations of grammar constraints are investigated for tractability, and the boundaries between regular, context-free, and context-sensitive grammar filtering are studied.
Abstract: By introducing the Regular Membership Constraint, Gilles Pesant pioneered the idea of basing constraints on formal languages. The paper presented here is highly motivated by this work, taking the obvious next step, namely to investigate constraints based on grammars higher up in the Chomsky hierarchy. We devise an arc-consistency algorithm for context-free grammars, investigate when logic combinations of grammar constraints are tractable, show how to exploit non-constant size grammars and reorderings of languages, and study where the boundaries run between regular, context-free, and context-sensitive grammar filtering.
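The membership test underlying context-free grammar filtering can be realized with CYK-style dynamic programming. A minimal recognition sketch for a grammar in Chomsky normal form follows; the paper's arc-consistency algorithm additionally prunes domain values, which this sketch omits.

```python
def cyk(word, rules, start="S"):
    """CYK recognition for a grammar in Chomsky normal form.
    rules maps a nonterminal to a list of bodies, each either a
    terminal character or a pair (B, C) of nonterminals."""
    n = len(word)
    # table[i][j] = nonterminals deriving the substring word[i : i+j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        for head, bodies in rules.items():
            if any(isinstance(b, str) and b == ch for b in bodies):
                table[i][0].add(head)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                for head, bodies in rules.items():
                    for body in bodies:
                        if isinstance(body, tuple):
                            B, C = body
                            if (B in table[i][split - 1]
                                    and C in table[i + split][span - split - 1]):
                                table[i][span - 1].add(head)
    return start in table[0][n - 1]

# Balanced a^n b^n (n >= 1) in CNF: S -> A T | A B, T -> S B, A -> a, B -> b
rules = {"S": [("A", "T"), ("A", "B")], "T": [("S", "B")],
         "A": ["a"], "B": ["b"]}
assert cyk("aabb", rules)
assert not cyk("aab", rules)
```

The cubic-time table is also the natural data structure over which a propagator can record, per position, which values remain consistent with some parse.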

51 citations


Proceedings Article
01 Apr 2006
TL;DR: The tree relations definable by synchronous tree-substitution grammars (STSG) are shown to be exactly those definable by linear complete bimorphisms, providing for the first time a clear relationship between synchronous grammars and tree transducers.
Abstract: We place synchronous tree-adjoining grammars and tree transducers in the single overarching framework of bimorphisms, continuing the unification of synchronous grammars and tree transducers initiated by Shieber (2004). Along the way, we present a new definition of the tree-adjoining grammar derivation relation based on a novel direct inter-reduction of TAG and monadic macro tree transducers. Tree transformation systems such as tree transducers and synchronous grammars have seen renewed interest, based on a perceived relevance to new applications, such as importing syntactic structure into statistical machine translation models or founding a formalism for speech command and control. The exact relationship among a variety of formalisms has been unclear, with a large number of seemingly unrelated formalisms being independently proposed or characterized. An initial step toward unifying the formalisms was taken (Shieber, 2004) in making use of the formal-language-theoretic device of bimorphisms, previously used to characterize the tree relations definable by tree transducers. In particular, the tree relations definable by synchronous tree-substitution grammars (STSG) were shown to be just those definable by linear complete bimorphisms, thereby providing for the first time a clear relationship between synchronous grammars and tree transducers.

46 citations


Book ChapterDOI
08 Nov 2006
TL;DR: The validation of a context-free grammar obtained by the analysis against XML schemas is considered and two algorithms for deciding inclusion L(G1)⊆L(G2) are developed, which are efficient in practice although they have exponential complexity.
Abstract: String expression analysis conservatively approximates the possible string values generated by a program. We consider the validation of a context-free grammar obtained by the analysis against XML schemas and develop two algorithms for deciding inclusion L(G1)⊆L(G2), where G1 is a context-free grammar and G2 is either an XML-grammar or a regular hedge grammar. The algorithms for XML-grammars and regular hedge grammars have exponential and doubly exponential time complexity, respectively. We have incorporated the algorithms into the PHP string analyzer and validated several publicly available PHP programs against the XHTML DTD. The experiments show that both of the algorithms are efficient in practice although they have exponential complexity.
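For the simpler case where the schema language is an ordinary regular language given as a total DFA, inclusion L(G1)⊆L(G2) can be decided by intersecting G1 with the complement of the DFA (the Bar-Hillel product) and testing emptiness. The sketch below illustrates that idea only; it is not the paper's XML-grammar or hedge-grammar algorithm, and it assumes a CNF grammar without epsilon rules.

```python
def cfg_subset_dfa(rules, start, delta, q0, accept, states):
    """Decide L(G) <= L(A) for a CNF grammar G (no epsilon rules) and a
    *total* DFA A, via emptiness of L(G) intersected with A's complement."""
    gen = set()   # (p, X, q): X derives some word driving A from p to q
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            for body in bodies:
                if isinstance(body, str):              # terminal rule X -> a
                    new = {(p, head, delta[(p, body)]) for p in states}
                else:                                  # binary rule X -> B C
                    B, C = body
                    new = {(p, head, q)
                           for (p, Y, r) in gen if Y == B
                           for (r2, Z, q) in gen if Z == C and r2 == r}
                if not new <= gen:
                    gen |= new
                    changed = True
    # inclusion holds iff no derivable word ends in a rejecting state
    return not any((q0, start, q) in gen for q in states if q not in accept)

# G: a^n b^n (n >= 1) in CNF
rules = {"S": [("A", "T"), ("A", "B")], "T": [("S", "B")],
         "A": ["a"], "B": ["b"]}
# A1 accepts a*b* (state 2 is a dead state); A2 accepts (ab)*
d1 = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 1,
      (2, "a"): 2, (2, "b"): 2}
d2 = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 2, (1, "b"): 0,
      (2, "a"): 2, (2, "b"): 2}
assert cfg_subset_dfa(rules, "S", d1, 0, {0, 1}, {0, 1, 2})   # a^n b^n in a*b*
assert not cfg_subset_dfa(rules, "S", d2, 0, {0}, {0, 1, 2})  # not in (ab)*
```

The exponential blow-ups in the paper come from the tree-structured (XML) side, which this regular-language sketch does not exhibit.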

35 citations


Journal ArticleDOI
01 Feb 2006-Lingua
TL;DR: The degree to which the explanations offered by these different approaches generalize across A- and A′-movement, across different structural contexts, and across the phenomena of displacement and agreement is explored, and it is asked whether such generalization is empirically warranted in each case.

34 citations


Journal ArticleDOI
TL;DR: It is shown that nearly all of these methods for modeling RNA and protein structure are based on the same core principles and can be converted into equivalent approaches in the framework of tree-adjoining grammars and related formalisms.
Abstract: Since the first application of context-free grammars to RNA secondary structures in 1988, many researchers have used both ad hoc and formal methods from computational linguistics to model RNA and protein structure. We show how nearly all of these methods are based on the same core principles and can be converted into equivalent approaches in the framework of tree-adjoining grammars and related formalisms. We also propose some new approaches that extend these core principles in novel ways.
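The core principle — nested base pairs mirror the derivations of a context-free grammar — is visible in the classic Nussinov algorithm, whose dynamic-programming recurrence corresponds to a small CFG over RNA strings. A minimal sketch, maximizing pair count and ignoring minimum loop lengths and thermodynamic energies:

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def nussinov(seq):
    """Maximum number of nested base pairs (Nussinov DP). The recurrence
    mirrors the context-free grammar S -> aS | Sa | aSb | SS."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            cand = max(best[i + 1][j], best[i][j - 1])   # i or j unpaired
            if (seq[i], seq[j]) in PAIRS:
                cand = max(cand, best[i + 1][j - 1] + 1) # pair (i, j)
            for k in range(i + 1, j):                    # bifurcation S -> S S
                cand = max(cand, best[i][k] + best[k + 1][j])
            best[i][j] = cand
    return best[0][n - 1]

assert nussinov("GCGC") == 2
assert nussinov("GAAAC") == 1
assert nussinov("AAAA") == 0
```

Pseudoknots, which cross rather than nest, fall outside this grammar — one motivation for the move to tree-adjoining grammars discussed in the paper.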

33 citations


Journal Article
TL;DR: Adaptive star grammars are proposed as an extension of node and hyperedge replacement grammars, and unrestricted adaptive star grammars are shown to be capable of generating every type-0 string language.
Abstract: We propose an extension of node and hyperedge replacement grammars, called adaptive star grammars, and study their basic properties. A rule in an adaptive star grammar is actually a rule schema which, via the so-called cloning operation, yields a potentially infinite number of concrete rules. Adaptive star grammars are motivated by application areas such as modeling and refactoring object-oriented programs. We prove that cloning can be applied lazily. Unrestricted adaptive star grammars are shown to be capable of generating every type-0 string language. However, we identify a reasonably large subclass for which the membership problem is decidable.

31 citations


Book ChapterDOI
17 Sep 2006
TL;DR: It is proved that cloning can be applied lazily, and a reasonably large subclass for which the membership problem is decidable is identified.
Abstract: We propose an extension of node and hyperedge replacement grammars, called adaptive star grammars, and study their basic properties. A rule in an adaptive star grammar is actually a rule schema which, via the so-called cloning operation, yields a potentially infinite number of concrete rules. Adaptive star grammars are motivated by application areas such as modeling and refactoring object-oriented programs. We prove that cloning can be applied lazily. Unrestricted adaptive star grammars are shown to be capable of generating every type-0 string language. However, we identify a reasonably large subclass for which the membership problem is decidable.

31 citations


Journal ArticleDOI
TL;DR: The generalized LR parsing algorithm for context-free grammars is extended to the case of Boolean grammars, which are a generalization of the context-free grammars with logical connectives added to the formalism of rules.
Abstract: The generalized LR parsing algorithm for context-free grammars is extended for the case of Boolean grammars, which are a generalization of the context-free grammars with logical connectives added to the formalism of rules. In addition to the standard LR operations, Shift and Reduce, the new algorithm uses a third operation called Invalidate, which reverses a previously made reduction. This operation makes the mathematical justification of the algorithm significantly different from its prototype. On the other hand, the changes in the implementation are not very substantial, and the algorithm still works in time O(n⁴).
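Boolean grammars add conjunction (and negation) to context-free rules. The textbook example is the non-context-free language aⁿbⁿcⁿ, definable by a conjunctive rule S → AB & DC, where AB derives aⁱbⁱc* and DC derives a*bʲcʲ. A sketch that checks the two context-free conjuncts directly — plain string tests stand in for full parsers, and the function names are illustrative:

```python
import re

def in_conjunct_ab(w):
    """a^i b^i c^j — conjunct A B (A -> aAb | eps, B -> cB | eps)."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    return m is not None and len(m.group(1)) == len(m.group(2))

def in_conjunct_bc(w):
    """a^i b^j c^j — conjunct D C (D -> aD | eps, C -> bCc | eps)."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    return m is not None and len(m.group(2)) == len(m.group(3))

def in_abc(w):
    """Conjunctive rule S -> AB & DC: the intersection of the two
    context-free conjuncts is the non-context-free a^n b^n c^n."""
    return in_conjunct_ab(w) and in_conjunct_bc(w)

assert in_abc("aabbcc")
assert not in_abc("aabbc")
assert not in_abc("abcabc")
```

A GLR parser for such grammars must be able to retract a reduction when a sibling conjunct later fails — the role of the Invalidate operation described in the abstract.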

27 citations


Proceedings ArticleDOI
13 Nov 2006
TL;DR: An induction method is given to infer node replacement graph grammars from various structural representations and the correctness of an inferred grammar is verified by parsing graphs not present in the training set.
Abstract: Computer programs that can be expressed in two or more dimensions are typically called visual programs. The underlying theories of visual programming languages involve graph grammars. As graph grammars are usually constructed manually, construction can be a time-consuming process that demands technical knowledge. Therefore, a technique for automatically constructing graph grammars - at least in part - is desirable. An induction method is given to infer node replacement graph grammars. The method operates on labeled graphs of broad applicability. It is evaluated by its performance on inferring graph grammars from various structural representations. The correctness of an inferred grammar is verified by parsing graphs not present in the training set.

Book ChapterDOI
20 Sep 2006
TL;DR: This work presents the first polynomial-time algorithm for inferring Simple External Contextual Grammars, a class of mildly context-sensitive grammars, from positive examples.
Abstract: Natural languages contain regular, context-free, and context-sensitive syntactic constructions, yet none of these classes of formal languages can be identified in the limit from positive examples. Mildly context-sensitive languages are able to represent some context-sensitive constructions, those most common in natural languages, such as multiple agreement, crossed agreement, and duplication. These languages are attractive for natural language applications due to their expressiveness, and the fact that they are not fully context-sensitive should lead to computational advantages as well. We realize one such computational advantage by presenting the first polynomial-time algorithm for inferring Simple External Contextual Grammars, a class of mildly context-sensitive grammars, from positive examples.

Proceedings ArticleDOI
17 Jul 2006
TL;DR: This paper proposes a generic mathematical formalism for the combination of various structures: strings, trees, dags, graphs and products of them that is both elementary and powerful enough to strongly simulate many grammar formalisms.
Abstract: This paper proposes a generic mathematical formalism for the combination of various structures: strings, trees, dags, graphs and products of them. The polarization of the objects of the elementary structures controls the saturation of the final structure. This formalism is both elementary and powerful enough to strongly simulate many grammar formalisms, such as rewriting systems, dependency grammars, TAG, HPSG and LFG.

Journal ArticleDOI
TL;DR: Two results extending classical language properties into 2D are proved: non-recursive tile rewriting grammars (TRG) coincide with tiling systems (TS), and non-self-embedding TRG, suitably defined as corner grammars, generate TS languages.

Proceedings Article
01 Jul 2006
TL;DR: Two methods for the analysis of distorted (fuzzy) string patterns are presented: an error-correcting approach, in which a minimum distance measure is used for error-correcting parsing, and a stochastic approach.
Abstract: Two methods of the analysis of distorted (fuzzy) string patterns are presented. The methods are based on the use of GDPLL(k) grammars generating a large subclass of context sensitive languages. The first one utilizes error-correcting approach: a minimum distance measure is used for error-correcting parsing. The second one utilizes stochastic approach: the decision about the production to be applied in a derivation step is given according to the probability measure.
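Minimum-distance error-correcting parsing can be sketched for the regular case: a Dijkstra search over (input position, automaton state) pairs yields the least number of edits turning the input into a string of the language. This is an illustration of the distance measure only; the paper applies it to GDPLL(k) grammars, which this sketch does not cover.

```python
import heapq

def min_distance_to_dfa(word, delta, start, accept, alphabet):
    """Least edit distance (insert/delete/substitute) from `word` to any
    string accepted by a total DFA, via Dijkstra over (position, state)."""
    dist = {(0, start): 0}
    heap = [(0, 0, start)]
    while heap:
        d, i, q = heapq.heappop(heap)
        if d > dist.get((i, q), float("inf")):
            continue                     # stale queue entry
        if i == len(word) and q in accept:
            return d                     # first settled goal is optimal
        moves = []
        if i < len(word):
            moves.append((i + 1, q, 1))  # delete word[i]
            for a in alphabet:           # match / substitute word[i] with a
                moves.append((i + 1, delta[(q, a)], 0 if a == word[i] else 1))
        for a in alphabet:               # insert a language symbol
            moves.append((i, delta[(q, a)], 1))
        for i2, q2, c in moves:
            if d + c < dist.get((i2, q2), float("inf")):
                dist[(i2, q2)] = d + c
                heapq.heappush(heap, (d + c, i2, q2))
    return float("inf")

# DFA for (ab)*; state 2 is a dead state
delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 2, (1, "b"): 0,
         (2, "a"): 2, (2, "b"): 2}
assert min_distance_to_dfa("abab", delta, 0, {0}, "ab") == 0
assert min_distance_to_dfa("aab", delta, 0, {0}, "ab") == 1
```

An error-correcting parser then accepts the input while reporting the cheapest repair, rather than rejecting outright on the first distortion.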

Book ChapterDOI
26 Jun 2006
TL;DR: A new semantics for Boolean grammars [A. Okhotin, Information and Computation 194 (2004)] is proposed which applies to all such grammars, independently of their syntax, based on the well-founded approach to negation.
Abstract: Boolean grammars [A. Okhotin, Information and Computation 194 (2004) 19-48] are a promising extension of context-free grammars that supports conjunction and negation. In this paper we give a novel semantics for boolean grammars which applies to all such grammars, independently of their syntax. The key idea of our proposal comes from the area of negation in logic programming, and in particular from the so-called well-founded semantics which is widely accepted in this area to be the “correct” approach to negation. We show that for every boolean grammar there exists a distinguished (three-valued) language which is a model of the grammar and at the same time the least fixed point of an operator associated with the grammar. Every boolean grammar can be transformed into an equivalent (under the new semantics) grammar in normal form. Based on this normal form, we propose an O(n³) algorithm for parsing that applies to any such normalized boolean grammar. In summary, the main contribution of this paper is to provide a semantics which applies to all boolean grammars while at the same time retaining the complexity of parsing associated with this type of grammars.


Journal ArticleDOI
15 Jan 2006
TL;DR: It is shown that for a given lattice-valued regular grammar (LRG) there exists a lattice-valued finite automaton (LA) accepting the same language, and vice versa, and the equivalence between deterministic lattice-valued regular grammars and deterministic lattice-valued finite automata is also established.
Abstract: In this study, we introduce the concept of lattice-valued regular grammars. Such grammars have become a necessary tool for the analysis of fuzzy finite automata. The relationship between lattice-valued finite automata (LA) and lattice-valued regular grammars (LRG) is discussed and we get the following results: for a given LRG, there exists an LA such that they accept the same languages, and vice versa. We also show the equivalence between deterministic lattice-valued regular grammars and deterministic lattice-valued finite automata.
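The crisp (two-valued) core of the grammar-to-automaton direction is the textbook conversion of a right-linear grammar into an NFA. A sketch assuming single lowercase characters as terminals and uppercase names as nonterminals, with the lattice of truth values collapsed to {0, 1}:

```python
def grammar_to_nfa(rules):
    """Right-linear grammar (bodies 'aY', 'a', or '') -> NFA whose
    states are the nonterminals plus a fresh final state 'F'."""
    delta, accept = {}, {"F"}
    for head, bodies in rules.items():
        for body in bodies:
            if body == "":
                accept.add(head)                       # X -> eps
            elif len(body) == 1:
                delta.setdefault((head, body), set()).add("F")          # X -> a
            else:
                delta.setdefault((head, body[0]), set()).add(body[1:])  # X -> aY
    return delta, accept

def nfa_accepts(delta, accept, word, start="S"):
    states = {start}
    for ch in word:
        states = set().union(*(delta.get((q, ch), set()) for q in states))
    return bool(states & accept)

# (ab)*:  S -> aB | eps,  B -> bS
delta, accept = grammar_to_nfa({"S": ["aB", ""], "B": ["bS"]})
assert nfa_accepts(delta, accept, "abab")
assert nfa_accepts(delta, accept, "")
assert not nfa_accepts(delta, accept, "aba")
```

In the lattice-valued setting of the paper, each production carries a truth degree and the subset construction above is replaced by computing the join of degrees over all derivations.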

Journal ArticleDOI
TL;DR: The algorithm computes a canonical representation of a simple language, converting its arbitrary simple grammar into prime normal form (PNF); a simple grammar is in PNF if all its nonterminals define primes.

Journal ArticleDOI
TL;DR: It is shown that (i) the elementary tree representing the logical form of a wh-word provides a generalized quantifier, and (ii) the semantic composition of the pied-piped material and the wh-word is achieved through adjoining in the semantics of the former onto the latter, using Synchronous Tree Adjoining Grammar.
Abstract: In relative clauses, the wh relative pronoun can be embedded in a larger phrase, as in a boy [whose brother] Mary hit. In such examples, we say that the larger phrase has pied-piped along with the wh-word. In this paper, using a similar syntactic analysis for wh pied-piping as in Han (2002) and further developed in Kallmeyer and Scheffler (2004), I propose a compositional semantics for relative clauses based on Synchronous Tree Adjoining Grammar. It will be shown that (i) the elementary tree representing the logical form of a wh-word provides a generalized quantifier, and (ii) the semantic composition of the pied-piped material and the wh-word is achieved through adjoining in the semantics of the former onto the latter.


Proceedings ArticleDOI
17 Jul 2006
TL;DR: This work reflects on the experience with the Russian resource grammar trying to answer the questions: how well Russian fits into the common interface and where the line between language-independent and language-specific should be drawn.
Abstract: A resource grammar is a standard library for the GF grammar formalism. It raises the abstraction level of writing domain-specific grammars by taking care of the general grammatical rules of a language. GF resource grammars have been built in parallel for eleven languages and share a common interface, which simplifies multilingual applications. We reflect on our experience with the Russian resource grammar trying to answer the questions: how well Russian fits into the common interface and where the line between language-independent and language-specific should be drawn.

01 Jan 2006
TL;DR: It is argued that embracing the unboundedness assumption of Treebank Grammars also brings the justification of smoothing techniques within the scope of Estimation Theory.
Abstract: State-of-the-art syntactic disambiguators for natural language employ “Treebank Grammars”: probabilistic grammars directly projected from annotated corpora (treebanks). Treebank Grammars mark a paradigm shift from the manually constructed, a priori fixed linguistic grammars. In this paper we show that for describing these systems in the framework of Statistical Estimation Theory one must assume an unbounded number of parameters. The unboundedness assumption of Treebank Grammars expresses persistent uncertainty over the formal grammar of natural language. We argue that embracing the unboundedness assumption also brings the justification of smoothing techniques within the scope of Estimation Theory.

Book ChapterDOI
26 Jun 2006
TL;DR: In this paper, Bag Context (BC) is introduced as a device for regulated rewriting in tree grammars, and it is shown that the class of bc tree languages is the closure of the random context tree languages under linear top-down tree transductions.
Abstract: We introduce bag context, a device for regulated rewriting in tree grammars. Rather than being part of the developing tree, bag context (bc) evolves on its own during a derivation. We show that the class of bc tree languages is the closure of the class of random context tree languages under linear top-down tree transductions. Further, an interchange theorem for subtrees of dense trees in bc tree languages is established. This result implies that the class of bc tree languages is incomparable with the class of branching synchronization tree languages.

Journal ArticleDOI
01 Dec 2006
TL;DR: This theoretical paper studies how to translate finite state automata into categorial grammars and back, and shows that the generalization operators employed in both domains can be compared and that their result can always be represented by generalized automata, called "recursive automata".
Abstract: In this theoretical paper, we compare the "classical" learning techniques used to infer regular grammars from positive examples with the ones used to infer categorial grammars. To this aim, we first study how to translate finite state automata into categorial grammars and back. We then show that the generalization operators employed in both domains can be compared, and that their result can always be represented by generalized automata, called "recursive automata". The relation between these generalized automata and categorial grammars is studied in detail. Finally, new learnable subclasses of categorial grammars are defined, for which learning from strings is hardly more expensive than learning from structures.

01 Jul 2006
TL;DR: It is shown that multi-component TAG does not necessarily retain the well-nestedness constraint, while this constraint is inherent to Coupled Context-Free Grammar (Hotz and Pitsch, 1996).
Abstract: The ability to represent cross-serial dependencies is one of the central features of Tree Adjoining Grammar (TAG). The class of dependency structures representable by lexicalized TAG derivations can be captured by two graph-theoretic properties: a bound on the gap degree of the structures, and a constraint called well-nestedness. In this paper, we compare formalisms from two strands of extensions to TAG in the context of the question, how they behave with respect to these constraints. In particular, we show that multi-component TAG does not necessarily retain the well-nestedness constraint, while this constraint is inherent to Coupled Context-Free Grammar (Hotz and Pitsch, 1996).
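The two graph-theoretic properties can be checked directly on a dependency tree given as a head (parent) array. A brute-force sketch with illustrative helper names (a projection is the set of positions a subtree covers; a gap is a maximal missing interval in that set):

```python
def projections(heads):
    """heads[i] = parent index of word i, or None for the root."""
    n = len(heads)
    proj = [{i} for i in range(n)]
    for i in range(n):
        j = heads[i]
        while j is not None:          # add i to every ancestor's projection
            proj[j].add(i)
            j = heads[j]
    return proj

def gap_degree(heads):
    """Maximum number of gaps in any node's projection."""
    worst = 0
    for block in projections(heads):
        s = sorted(block)
        worst = max(worst, sum(1 for a, b in zip(s, s[1:]) if b > a + 1))
    return worst

def well_nested(heads):
    """No two disjoint subtrees interleave (i1 < j1 < i2 < j2)."""
    projs = projections(heads)
    for t1 in projs:
        for t2 in projs:
            if t1 & t2:
                continue              # nested or identical subtrees are fine
            if any(i1 < j1 < i2 < j2
                   for i1 in t1 for j1 in t2 for i2 in t1 for j2 in t2):
                return False
    return True

assert gap_degree([None, 0, 0, 2]) == 0 and well_nested([None, 0, 0, 2])
# two sibling subtrees covering {1,3} and {2,4} interleave: ill-nested
assert gap_degree([None, 0, 0, 1, 2]) == 1
assert not well_nested([None, 0, 0, 1, 2])
```

Projective trees have gap degree 0; lexicalized TAG derivations stay within gap degree 1 and well-nestedness, which is exactly the boundary the paper probes for multi-component extensions.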

Journal Article
TL;DR: A parsing methodology is presented to recognize a set of symbols represented by an adjacency grammar, a grammar that describes a symbol in terms of the primitives that form it and the relations among these primitives.
Abstract: Syntactic approaches on structural symbol recognition are characterized by defining symbols using a grammar. Following the grammar productions a parser is constructed to recognize symbols: given an input, the parser detects whether it belongs to the language generated by the grammar, recognizing the symbol, or not. In this paper, we describe a parsing methodology to recognize a set of symbols represented by an adjacency grammar. An adjacency grammar is a grammar that describes a symbol in terms of the primitives that form it and the relations among these primitives. These relations are called constraints, which are validated using a defined cost function. The cost function approximates the distortion degree associated to the constraint. When a symbol has been recognized the cost associated to the symbol is like a similarity value. The evaluation of the method has been realized from a qualitative point of view, asking some users to draw some sketches. From a quantitative point of view a benchmarking database of sketched symbols has been used.

Proceedings Article
01 Jan 2006
TL;DR: It is advocated that two-dimensional context-free grammars can be successfully used in the analysis of images containing objects that exhibit structural relations, and it is demonstrated, in a pilot study concerning recognition of off-line hand-written mathematical formulae, that they have the potential to deal with real-life noisy images.
Abstract: This contribution advocates that two-dimensional context-free grammars can be successfully used in the analysis of images containing objects that exhibit structural relations. The idea of structural construction is further developed. The approach can be made computationally efficient, practical, and able to cope with noise. We have developed and tested the method in a pilot study aiming at recognition of off-line mathematical formulae. The other novelty is not treating symbol segmentation in the image and structural analysis as two separate processes. This allows the system to recover from errors made in initial symbol segmentation.
1 Motivation and Taxonomy of Approaches. The paper serves two main purposes. First, it intends to point the reader's attention to the theory of two-dimensional (2D) languages. It focuses on context-free grammars having the potential to cope with structural relations in images. Second, the paper demonstrates, in a pilot study concerning recognition of off-line hand-written mathematical formulae, that 2D context-free grammars have the potential to deal with real-life noisy images. The enthusiasm for grammar-based methods in pattern recognition from the 1970s [6] has gradually faded due to an inability to cope with errors and noise. Even mathematical linguistics, in which the formal grammar approach was pioneered [4], has tended towards statistical methods since the 1990s. M.I. Schlesinger from the Ukrainian Academy of Sciences in Kiev has been developing 2D grammar-based pattern recognition theory in the context of engineering-drawings analysis since the late 1970s. His theory was explicated in English for the first time in the 10th chapter of the monograph [17]. The first author of this paper independently studied the theoretical limits of 2D grammars [14] and proved them to be rather restrictive.
The main motivation of the authors of the reported work is to discover to what extent 2D grammars are applicable to practical image analysis. This paper provides insight into ongoing work on a pilot study aiming at off-line recognition of mathematical formulae. We have chosen this application domain because there is a clear structure in formulae and work by others exists which can be used for comparison. The approaches to mathematical formulae recognition can be categorized along two directions: on-line recognition (the timing of the pen strokes is available) versus off-line recognition (only an image is available), and printed versus hand-written formulae. We deal with off-line recognition of hand-written formulae in this contribution; of course, the approach can also be applied to printed formulae. (Appeared in the Proceedings of the Prague Stringology Conference '06.)

Journal ArticleDOI
TL;DR: It is obtained that Eulerian, Hamiltonian, planar and bipartite graphs, as well as regular graphs of degree at least three, are pr-universal in the sense that any language which can be generated by programmed grammars can also be generated by programmed grammars whose underlying graph belongs to the given special class of graphs.
Abstract: Programmed grammars, one of the most important and well investigated classes of grammars with context-free rules and a mechanism controlling the application of the rules, can be described by graphs. We investigate whether or not the restriction to special classes of graphs restricts the generative power of programmed grammars with erasing rules and without appearance checking, too. We obtain that Eulerian, Hamiltonian, planar and bipartite graphs and regular graphs of degree at least three are pr-universal in the sense that any language which can be generated by programmed grammars (with erasing rules and without appearance checking) can be obtained by programmed grammars where the underlying graph belongs to the given special class of graphs, whereas complete graphs, regular graphs of degree 2 and backbone graphs lead to proper subfamilies of the family of programmed languages.
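The control mechanism can be illustrated with a tiny interpreter for a programmed grammar (without appearance checking) generating the non-context-free language aⁿbⁿcⁿ. This is a hypothetical sketch of the formalism, not a construction from the paper; each rule carries a label, a context-free core, and a go-to set of successor labels.

```python
RULES = {
    1: ("A", "aA", [2]), 2: ("B", "bB", [3]), 3: ("C", "cC", [1, 4]),
    4: ("A", "a", [5]),  5: ("B", "b", [6]),  6: ("C", "c", []),
}

def derive(choices, sentential="ABC", label=1):
    """Run one derivation of the programmed grammar above. `choices`
    supplies the successor label whenever the go-to set offers a choice."""
    choices = iter(choices)
    while True:
        lhs, rhs, gotos = RULES[label]
        if lhs not in sentential:
            return None                       # derivation blocks
        sentential = sentential.replace(lhs, rhs, 1)  # leftmost rewrite
        if not gotos:
            return sentential                 # empty go-to set: stop
        label = gotos[0] if len(gotos) == 1 else next(choices)

assert derive([], label=4) == "abc"           # skip the pumping cycle
assert derive([4]) == "aabbcc"                # one pass through rules 1-3
assert derive([1, 4]) == "aaabbbccc"          # two passes, then terminate
```

The go-to sets form exactly the directed graph the paper studies: restricting that graph to Eulerian, Hamiltonian, planar, or bipartite form is what the pr-universality results are about.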

Proceedings ArticleDOI
15 Jul 2006
TL;DR: It is argued that in it-clefts such as It was Ohno who won, the cleft pronoun and the cleft clause form a discontinuous syntactic constituent and a semantic unit as a definite description, presenting arguments from Percus and Hedberg.
Abstract: In this paper, we argue that in it-clefts as in It was Ohno who won, the cleft pronoun (it) and the cleft clause (who won) form a discontinuous syntactic constituent, and a semantic unit as a definite description, presenting arguments from Percus (1997) and Hedberg (2000). We propose a syntax of it-clefts using Tree-Local Multi-Component Tree Adjoining Grammar and a compositional semantics on the proposed syntax using Synchronous Tree Adjoining Grammar.