Showing papers on "Tree-adjoining grammar" published in 2005


Patent
21 Oct 2005
TL;DR: A method of building a mixed-initiative grammar can include receiving one or more conjoin phrases, wherein each conjoin phrase is associated with a selected one of the plurality of directed dialog grammars, and receiving a user input specifying a selected grammar generation technique.
Abstract: A method of building a mixed-initiative grammar can include receiving one or more conjoin phrases, wherein each conjoin phrase is associated with a selected one of the plurality of directed dialog grammars, and receiving a user input specifying a selected grammar generation technique. The mixed-initiative grammar can be automatically generated, in accordance with the selected grammar generation technique, such that the mixed-initiative grammar specifies an allowable ordering of sets when interpreting a user spoken utterance and whether duplicative phrases are allowable within the user spoken utterance.

122 citations


01 Jan 2005
TL;DR: This paper explains the model integration approach tackling a common case study and implements triple graph grammars which are declarative and brings them together with OMG’s MOF standard.
Abstract: Model Driven Application Development (MDA) is OMG’s vision of model-based software system development. MDA is based on the idea of automatically transforming abstract models into more specific models. In this paper we explain our model integration approach tackling a common case study. Our approach implements triple graph grammars which are declarative and brings them together with OMG’s MOF standard. From a set of declarative triple graph grammar rules we (semi-)automatically derive operational graph grammar rules that can be used for consistency checking, consistency recovery, and model transformation using rule application mechanisms. Thus, triple graph grammars are suitable for model integration in general and model transformation in particular.

79 citations


Journal ArticleDOI
TL;DR: It is shown that determining circularity of remote attribute grammars is undecidable; a family of conservative tests of noncircularity is described, and it is shown how they can be used to “schedule” a remote attribute grammar using standard techniques.
Abstract: Describing the static semantics of programming languages with attribute grammars is eased when the formalism allows direct dependencies to be induced between rules for nodes arbitrarily far away in the tree. Such direct non-local dependencies cannot be analyzed using classical methods, which enable efficient evaluation. This article defines an attribute grammar extension (“remote attribute grammars”) to permit references to objects with fields to be passed through the attribute system. Fields may be read and written through these references. The extension has a declarative semantics in the spirit of classical attribute grammars. It is shown that determining circularity of remote attribute grammars is undecidable. The article then describes a family of conservative tests of noncircularity and shows how they can be used to “schedule” a remote attribute grammar using standard techniques. The article discusses practical batch and incremental evaluation of remote attribute grammars.

71 citations


Proceedings ArticleDOI
06 Oct 2005
TL;DR: This paper investigates some computational problems associated with probabilistic translation models that have recently been adopted in the literature on machine translation, and reports two hardness results for the class NP, along with an exponential time lower-bound for certain classes of algorithms that are currently used in the literature.
Abstract: This paper investigates some computational problems associated with probabilistic translation models that have recently been adopted in the literature on machine translation. These models can be viewed as pairs of probabilistic context-free grammars working in a 'synchronous' way. Two hardness results for the class NP are reported, along with an exponential time lower-bound for certain classes of algorithms that are currently used in the literature.

66 citations


PatentDOI
William D. Ramsey
TL;DR: A method and apparatus are provided for automatically forming a grammar: N-grams are formed from example text strings, and a rule in the grammar is then generated automatically based in part on the N-grams.
Abstract: A method and apparatus are provided for automatically forming a grammar. Example text strings are received and N-grams are formed based on the text strings. A rule in the grammar is then generated automatically based in part on the N-grams.

61 citations
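
The N-gram formation step mentioned in the entry above can be illustrated with a minimal sketch. It assumes whitespace tokenization and word-level N-grams; the function names and the example strings are hypothetical, and the rule-generation step of the patent is not reproduced here.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams (as tuples) found in one text string."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(examples, n):
    """Count every word n-gram across a list of example text strings."""
    counts = Counter()
    for text in examples:
        counts.update(word_ngrams(text, n))
    return counts


# Hypothetical example strings, not taken from the patent.
examples = [
    "show me flights to boston",
    "show me flights to denver",
    "list flights to boston",
]
bigram_counts = count_ngrams(examples, 2)
```

Frequent N-grams such as `("flights", "to")` are the kind of evidence a grammar-forming procedure could draw on; how the counts are turned into rules is left open here.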


Journal ArticleDOI
TL;DR: An efficient hypothesis representation method is proposed which consists of a table-like data structure similar to the parse table used in efficient parsing algorithms for context-free grammars such as the Cocke-Younger-Kasami algorithm.

59 citations
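
For readers unfamiliar with the parse table the entry above refers to, the following is a minimal sketch of a standard Cocke-Younger-Kasami recognizer for a grammar in Chomsky normal form. It shows only the classical table, not the hypothesis representation proposed in the paper; the rule encoding and the toy grammar for a^n b^n are assumptions made for illustration.

```python
def cyk_recognize(words, unary_rules, binary_rules, start="S"):
    """Return True if the CNF grammar derives the word sequence.

    unary_rules:  list of (lhs, terminal) pairs, e.g. ("A", "a")
    binary_rules: list of (lhs, (left, right)) pairs, e.g. ("S", ("A", "B"))
    """
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = {lhs for lhs, terminal in unary_rules if terminal == word}
    for span in range(2, n + 1):           # length of the substring
        for i in range(n - span + 1):      # start position
            j = i + span - 1               # end position
            for k in range(i, j):          # split point
                for lhs, (left, right) in binary_rules:
                    if left in table[i][k] and right in table[k + 1][j]:
                        table[i][j].add(lhs)
    return start in table[0][n - 1]


# Toy grammar for { a^n b^n : n >= 1 } in CNF (illustrative assumption):
# S -> A B | A T,  T -> S B,  A -> a,  B -> b
UNARY = [("A", "a"), ("B", "b")]
BINARY = [("S", ("A", "B")), ("S", ("A", "T")), ("T", ("S", "B"))]
```

The table has O(n^2) cells and each cell is filled by trying every split point, which gives the familiar cubic running time in the input length.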


01 Jan 2005
TL;DR: This article investigates drawings as models of syntactic structure and shows that well-nested drawings allow for efficient processing by defining a simple constraint language for them and presenting an algorithm that decides in polynomial time whether a formula in that constraint language is satisfiable on a well-nested drawing.
Abstract: This paper investigates drawings (totally ordered forests) as models of syntactic structure. It offers a new model-based perspective on lexicalised Tree Adjoining Grammar by characterising a class of drawings structurally equivalent to TAG derivations. The drawings in this class are distinguished by a restricted form of non-projectivity (gap degree at most one) and the absence of interleaving substructures (well-nestedness). We demonstrate that well-nested drawings allow for efficient processing by defining a simple constraint language for them and presenting an algorithm that decides in polynomial time whether a formula in that constraint language is satisfiable on a well-nested drawing.

52 citations
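
The gap-degree restriction mentioned in the entry above can be made concrete with a small sketch. It assumes a drawing is encoded as a list `heads`, where `heads[i]` is the position of node `i`'s parent (or `None` for a root) and list order is the total order on nodes; this encoding and the function names are illustrative assumptions, not the paper's notation. A gap is counted wherever the sorted positions of a node's yield skip over a position.

```python
def yields(heads):
    """Return, for each node, the set of positions it dominates (itself included).

    Assumes `heads` describes a forest, i.e. contains no cycles.
    """
    result = [{i} for i in range(len(heads))]
    for i in range(len(heads)):
        parent = heads[i]
        while parent is not None:   # walk up to the root, registering i
            result[parent].add(i)
            parent = heads[parent]
    return result


def gap_degree(heads):
    """Return the maximum number of gaps in the yield of any node."""
    worst = 0
    for node_yield in yields(heads):
        positions = sorted(node_yield)
        gaps = sum(1 for a, b in zip(positions, positions[1:]) if b - a > 1)
        worst = max(worst, gaps)
    return worst
```

For example, `gap_degree([1, None, 1])` is 0 because every yield is contiguous, while `gap_degree([2, None, 1])` is 1 because node 2 dominates positions 0 and 2 but not 1. Checking well-nestedness would need an additional test for interleaving disjoint subtrees, which is not sketched here.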


Book ChapterDOI
15 Jun 2005
TL;DR: The paper shows empirically that AGE is as good as GE for a classical problem, and proves that including semantics in the grammar can improve GE performance, and concludes that adding too much semantics can make the search difficult.
Abstract: This paper describes Attribute Grammar Evolution (AGE), a new Automatic Evolutionary Programming algorithm that extends standard Grammar Evolution (GE) by replacing context-free grammars by attribute grammars. GE only takes into account syntactic restrictions to generate valid individuals. AGE adds semantics to ensure that both semantically and syntactically valid individuals are generated. Attribute grammars make it possible to semantically describe the solution. The paper shows empirically that AGE is as good as GE for a classical problem, and proves that including semantics in the grammar can improve GE performance. An important conclusion is that adding too much semantics can make the search difficult.

47 citations


Book ChapterDOI
07 Nov 2005
TL;DR: This paper proposes to use triple graph grammars as a declarative specification formalism to enable a graphical specification of model transformation rules, which can be specified within the FUJABA tool, and it is argued that these rules can be more easily specified and that they become more understandable and maintainable.
Abstract: Models and model transformations are the core concepts of OMG’s MDA™ approach. Within this approach, most models are derived from the MOF and have a graph-based nature. In contrast, most of the current model transformations are specified textually. To enable a graphical specification of model transformation rules, this paper proposes to use triple graph grammars as a declarative specification formalism. These triple graph grammars can be specified within the FUJABA tool and we argue that these rules can be more easily specified and that they become more understandable and maintainable. To show the practicability of our approach, we present how to generate Tefkat rules from triple graph grammar rules, which helps to integrate triple graph grammars with a state-of-the-art model transformation tool and shows the expressiveness of the concept.

43 citations


Journal ArticleDOI
TL;DR: It is proved that this model has greater generative capacity than the tiling systems of Giammarresi and Restivo and the grammars of Matz, another generalization of context-free string grammars to 2D.

41 citations


Journal ArticleDOI
TL;DR: This paper describes approaches for machine learning of context-free grammars (CFGs) from positive and negative sample strings, as implemented in the Synapse system, together with mechanisms for incremental learning and search.

Journal ArticleDOI
02 Feb 2005
TL;DR: It is proved that scattered context grammars having two context sensing productions and five nonterminals are sufficient to generate all recursively enumerable languages, and it is shown that the same power can be reached by simple semi-conditional grammars having 10 conditional productions with conditions of length two.
Abstract: We improve the upper bounds of certain descriptional complexity measures of two types of rewriting mechanisms regulated by context conditions. We prove that scattered context grammars having two context sensing productions and five nonterminals are sufficient to generate all recursively enumerable languages and we also show that the same power can be reached by simple semi-conditional grammars having 10 conditional productions with conditions of length two or eight conditional productions with conditions of length three. The results are based on the common idea of using the so-called Geffert normal forms for phrase structure grammars.

Journal ArticleDOI
TL;DR: The methodology and tools applied in the Parallel Grammar project (ParGram) to support consistency and parallelism of linguistic representations across multilingual Lexical Functional Grammar (LFG) grammars are discussed.
Abstract: This paper discusses the methodology and tools applied in the Parallel Grammar project (ParGram) to support consistency and parallelism of linguistic representations across multilingual Lexical Functional Grammar (LFG) grammars. A particular issue is that the grammars in the ParGram project are developed at different international sites. The approach that was established over several years relies on (i) a technical tool for checking adherence to the best-practice feature declaration for linguistic representations, (ii) the coordinated, systematic use of templates for expressing generalizations across lexicon entries and grammar rules, and (iii) a grammar code reviewing committee in which extensions to the existing representations are critically discussed.

Journal ArticleDOI
TL;DR: This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples that employs a distributed, steady-state genetic algorithm, with a fitness function incorporating a prior over the space of possible grammars.

Book ChapterDOI
28 Apr 2005
TL;DR: It is shown that the universal membership problem for the class of lexicalized ACGs is NP-complete and the languages generated by lexicalized ACGs form a subclass of NP which includes some NP-complete languages.
Abstract: Previous studies have shown that some well-known classes of grammars can be simulated by Abstract Categorial Grammars (de Groote 2001) in straightforward ways. These classes of grammars all generate subclasses of the PTIME languages. While the exact generative capacity of the class of ACGs and the complexity of its universal membership problem are both unknown, we show that the universal membership problem for the class of lexicalized ACGs is NP-complete and the languages generated by lexicalized ACGs form a subclass of NP which includes some NP-complete languages.

Journal ArticleDOI
TL;DR: It is shown that the syntax of a small domain-specific language can be inferred from positive and negative programs provided by domain experts, using the genetic programming approach in grammatical inference.

Patent
13 Jun 2005
TL;DR: Static analysis of speech grammars is provided prior to the speech grammars being deployed in a speech system.
Abstract: The present invention provides static analysis of speech grammars prior to the speech grammars being deployed in a speech system.

01 Jan 2005
TL;DR: It is described how multimodal grammars for dialogue systems can be written using the Grammatical Framework (GF) formalism, and a proof-of-concept dialogue system constructed using these techniques is presented.
Abstract: We describe how multimodal grammars for dialogue systems can be written using the Grammatical Framework (GF) formalism. A proof-of-concept dialogue system constructed using these techniques is also presented. The software engineering problem of keeping grammars for different languages, modalities and systems (such as speech recognizers and parsers) in sync is reduced by the formal relationship between the abstract and concrete syntaxes, and by generating equivalent grammars from GF grammars.

01 Jan 2005
TL;DR: The mechanism of an attribute grammar is proposed to maintain GE as a pluggable component to any search-algorithm whereby it serves to facilitate the generation of viable solutions to the problem at hand.
Abstract: Research extending the capabilities of the well-known evolutionary algorithm (EA) of Grammatical Evolution (GE) is presented. GE essentially describes a software component for (potentially) any search algorithm (more prominently an EA) whereby it serves to facilitate the generation of viable solutions to the problem at hand. In this way, GE can be thought of as a generally applicable, robust and pluggable component to any search algorithm. Facilitating this pluggability is the ability to hand-describe the structure of solutions to a particular problem, under the guise of the concise and effective notation of a grammar definition. This grammar may be thought of as the rules for the generation of solutions to a problem. Recent research has shown that for static problems (problems whose optimum solution resides within a finitely describable set, for the set of all possible solutions), the ability to focus the search (for the optimum) on the more promising regions of this set has provided the best-performing approaches to date. As such, it is suggested that search be biased toward more promising areas of the set of all possible solutions. In its use of a grammar, GE provides such a bias (as a language bias), yet remains unable to effectively bias the search for problems of constrained optimisation. As such, and as detailed in this thesis, the mechanism of an attribute grammar is proposed to maintain GE as a pluggable component

Journal ArticleDOI
TL;DR: This work shows that non-circularity remains decidable in EXPTIME and establishes the complexity of the non-emptiness and equivalence problem of extended AGs to be complete for EXPTIME, and shows that the Region Algebra expressions can be efficiently translated into extended AGs.

Proceedings ArticleDOI
16 Oct 2005
TL;DR: This work uses the genetic programming approach for grammatical inference and proposes the use of frequent sequences, syntax graphs and incremental construction of grammars in order to infer a more comprehensive set of context-free grammars.
Abstract: We propose a new application area for grammar inference which intends to make domain-specific language development easier and finds a second application in renovation tools for legacy systems. We use the genetic programming approach for grammatical inference and propose the use of frequent sequences, syntax graphs and incremental construction of grammars in order to be able to infer a more comprehensive set of context-free grammars.

Journal ArticleDOI
TL;DR: The paper formulates the Hays and Gaifman dependency grammar (HGDG) in terms of constraints on a string-based encoding of dependency trees and develops an approach to obtain a regular approximation for these grammars.
Abstract: The paper formulates the Hays and Gaifman dependency grammar (HGDG) in terms of constraints on a string-based encoding of dependency trees and develops an approach to obtain a regular approximation for these grammars. Our encoding of dependency trees uses brackets in a novel fashion: pairs of brackets indicate dependencies between pairs of positions rather than boundaries of phrases. This leads to several advantages: (i) HGDG rules over the balanced bracketing can be expressed using regular languages. (ii) A new homomorphic representation for context-free languages is obtained. (iii) A star-free regular approximation for the original projective dependency grammar is obtained by limiting the number of stacked dependencies. (iv) By relaxing certain constraints, the encoding can be extended to non-projective dependency trees and graphs. (v) The strong generative power of HGDGs can now be characterized through sets of bracketed strings.

Journal ArticleDOI
TL;DR: State-alternating context-free grammars are introduced, and the language classes obtained from them are compared to the classes of the Chomsky hierarchy as well as to some well-known complexity classes.

Book ChapterDOI
28 Apr 2005
TL;DR: Dependency Structure Grammars (DSG), which are rewriting rule grammars generating sentences together with their dependency structures, are more expressive than CF-grammars and non-equivalent to mildly context-sensitive grammars.
Abstract: In this paper, we define Dependency Structure Grammars (DSG), which are rewriting rule grammars generating sentences together with their dependency structures, and which are more expressive than CF-grammars and non-equivalent to mildly context-sensitive grammars. We show that DSG are weakly equivalent to Categorial Dependency Grammars (CDG) recently introduced in [6,3]. In particular, these dependency grammars naturally express long distance dependencies and enjoy good mathematical properties.

01 Jan 2005
TL;DR: The CRITTER translation system makes use of a single grammar to perform analysis and synthesis tasks, which is a variant of DCG (Definite Clause Grammars), in which annotations have been added to allow for dual compilations of the grammar into analysis and synthesis Prolog programs sharing the same declarative content.
Abstract: The CRITTER translation system makes use of a single grammar to perform analysis and synthesis tasks. The formalism used is a variant of DCG (Definite Clause Grammars), in which annotations have been added to allow for dual compilations of the grammar into analysis and synthesis Prolog programs sharing the same declarative content. These annotations are of two types: 1) annotations separating the declarative content of rules (logic) from goal-processing order (control), and 2) annotations which act as directives for the compiler(s) to perform "optimization" transformations on groups of rules making the target Prolog procedures better adapted to the analysis or synthesis task at hand.

01 Jun 2005
TL;DR: It is argued that the findings of this dissertation help to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing.
Abstract: This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIGs) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs:
(i) Computational complexity of grammars under limiting parameters. In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchy and the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction.
(ii) Linguistically applicable structural representations. Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left- and right-branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear-time parseable.
(iii) Compilation and simplification of linguistic constraints. Efficient compilation methods for certain regular operations such as the generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched.
These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing.

Journal ArticleDOI
TL;DR: This paper investigates various definitions of OLP and discusses their interrelations, proving that some of the OLP variants are indeed undecidable, and presents a novel, decidable OLP constraint which is more liberal than the existing decidable ones.
Abstract: Unification grammars are known to be Turing-equivalent; given a grammar G and a word w, it is undecidable whether w ∈ L(G). In order to ensure decidability, several constraints on grammars, commonly known as off-line parsability (OLP), were suggested, such that the recognition problem is decidable for grammars which satisfy OLP. An open question is whether it is decidable if a given grammar satisfies OLP. In this paper we investigate various definitions of OLP and discuss their interrelations, proving that some of the OLP variants are indeed undecidable. We then present a novel, decidable OLP constraint which is more liberal than the existing decidable ones.

Journal Article
TL;DR: The known result that insertion grammars with weight at least 7 can characterize recursively enumerable languages is improved by decreasing the weight of the insertion grammars used to 5.
Abstract: Insertion grammars have been introduced in [1] and their computational power has been studied in several places. In [7] it is proved that insertion grammars with weight at least 7 can characterize recursively enumerable languages (modulo a weak coding and an inverse morphism), and the question was formulated whether or not this result can be improved. In this paper, we come up with a positive answer to this question, by decreasing the weight of the insertion grammar used to 5. We also give a characterization of recursively enumerable languages in terms of right quotients of insertion languages.

Journal ArticleDOI
TL;DR: It is shown that the class of languages generated by esl-tag (ESL-TAL) properly includes the class of languages generated by sl-tag (SL-TAL) and the class of languages generated by cfg, and that SL-TAL is a full trio and ESL-TAL is a substitution closed full AFL.
Abstract: Several grammars have been proposed for representing RNA secondary structure including pseudoknots such as simple linear tree adjoining grammar (sl-tag), extended sl-tag (esl-tag) and RNA pseudoknot grammar (rpg). The main purpose of this paper is to compare the generative power of these grammars by identifying them as subclasses of multiple context-free grammars (mcfg). Specifically, it is shown that the class of languages generated by esl-tag (ESL-TAL) properly includes the class of languages generated by sl-tag (SL-TAL) and the class of languages generated by cfg. Also, we show that the class of languages generated by rpg coincides with the class of languages generated by mcfg with dimension one or two and rank one or two. Furthermore, it is shown that SL-TAL is a full trio and ESL-TAL is a substitution closed full AFL. key words: RNA secondary structure, pseudoknot, multiple context-free grammar, tree adjoining grammar

Journal ArticleDOI
TL;DR: Subclasses of monadic context-free tree grammars (CFTGs) are compared and it is examined whether the restrictions of linearity and nondeletion on monadic CFTGs are necessary to generate the same class of languages.