
Showing papers on "Context-sensitive grammar" published in 2005


01 Jan 2005
TL;DR: This paper explains a model integration approach by tackling a common case study; the approach implements declarative triple graph grammars and brings them together with OMG’s MOF standard.
Abstract: Model Driven Application Development (MDA) is OMG’s vision of model-based software system development. MDA is based on the idea of automatically transforming abstract models into more specific models. In this paper we explain our model integration approach tackling a common case study. Our approach implements triple graph grammars, which are declarative, and brings them together with OMG’s MOF standard. From a set of declarative triple graph grammar rules we (semi-)automatically derive operational graph grammar rules that can be used for consistency checking, consistency recovery, and model transformation using rule application mechanisms. Thus, triple graph grammars are suitable for model integration in general and model transformation in particular.

79 citations


Journal ArticleDOI
TL;DR: It is shown that determining circularity of remote attribute grammars is undecidable; a family of conservative tests of noncircularity is described, and it is shown how they can be used to “schedule” a remote attribute grammar using standard techniques.
Abstract: Describing the static semantics of programming languages with attribute grammars is eased when the formalism allows direct dependencies to be induced between rules for nodes arbitrarily far away in the tree. Such direct non-local dependencies cannot be analyzed using classical methods, which enable efficient evaluation. This article defines an attribute grammar extension (“remote attribute grammars”) to permit references to objects with fields to be passed through the attribute system. Fields may be read and written through these references. The extension has a declarative semantics in the spirit of classical attribute grammars. It is shown that determining circularity of remote attribute grammars is undecidable. The article then describes a family of conservative tests of noncircularity and shows how they can be used to “schedule” a remote attribute grammar using standard techniques. The article discusses practical batch and incremental evaluation of remote attribute grammars.

71 citations


Proceedings ArticleDOI
06 Oct 2005
TL;DR: This paper investigates some computational problems associated with probabilistic translation models that have recently been adopted in the literature on machine translation, and reports two hardness results for the class NP, along with an exponential time lower-bound for certain classes of algorithms that are currently used in the literature.
Abstract: This paper investigates some computational problems associated with probabilistic translation models that have recently been adopted in the literature on machine translation. These models can be viewed as pairs of probabilistic context-free grammars working in a 'synchronous' way. Two hardness results for the class NP are reported, along with an exponential time lower-bound for certain classes of algorithms that are currently used in the literature.

66 citations


Book ChapterDOI
15 Jun 2005
TL;DR: The paper shows empirically that AGE is as good as GE for a classical problem, and proves that including semantics in the grammar can improve GE performance, and concludes that adding too much semantics can make the search difficult.
Abstract: This paper describes Attribute Grammar Evolution (AGE), a new Automatic Evolutionary Programming algorithm that extends standard Grammar Evolution (GE) by replacing context-free grammars by attribute grammars. GE only takes into account syntactic restrictions to generate valid individuals. AGE adds semantics to ensure that both semantically and syntactically valid individuals are generated. Attribute grammars make it possible to semantically describe the solution. The paper shows empirically that AGE is as good as GE for a classical problem, and proves that including semantics in the grammar can improve GE performance. An important conclusion is that adding too much semantics can make the search difficult.
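As a rough illustration of the syntactic step that GE relies on (and that AGE extends with attributes), the sketch below maps a list of integer codons to a sentence by repeatedly expanding the leftmost nonterminal with the alternative numbered `codon % number_of_alternatives`. The grammar, the function name `ge_map`, and the `max_wraps` parameter are hypothetical choices made for this example; the attribute evaluation that AGE adds is not shown, and, as a simplification, one codon is consumed per expansion.

```python
# Toy context-free grammar (hypothetical, for illustration only).
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["1"]],
}


def ge_map(codons, grammar, start="<expr>", max_wraps=2):
    """Map integer codons to a sentence, or return None if mapping fails.

    Simplifying assumption: one codon is consumed per expansion, and the
    codon list may be reused ("wrapped") at most max_wraps times.
    """
    sentence = [start]
    used = 0
    limit = len(codons) * (max_wraps + 1)
    while True:
        # Find the leftmost nonterminal still present in the sentence.
        idx = next((i for i, s in enumerate(sentence) if s in grammar), None)
        if idx is None:
            return "".join(sentence)  # only terminals remain
        if used >= limit:
            return None  # ran out of codons: invalid individual
        options = grammar[sentence[idx]]
        choice = options[codons[used % len(codons)] % len(options)]
        sentence[idx:idx + 1] = choice
        used += 1


print(ge_map([0, 1, 0, 0, 1, 1], GRAMMAR))  # x+1
```

A mapping like this only guarantees syntactic validity; the point made in the abstract is that an attribute grammar can additionally reject or steer semantically invalid individuals.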

47 citations


Proceedings ArticleDOI
10 Jan 2005
TL;DR: This work specifies a polymorphic type system for an applied lambda calculus that refines the string type with a subtype hierarchy derived from language containment and presents two algorithms that solve language inclusion constraints with respect to a fixed context-free reference grammar.
Abstract: We specify a polymorphic type system for an applied lambda calculus that refines the string type with a subtype hierarchy derived from language containment. It enables us to find a language for each string-type expression such that the value of the expression is a member of that language. Type inference for this system infers language inclusion constraints that can be viewed as a context-free grammar with a nonterminal for each string-valued expression. Then we present two algorithms that solve language inclusion constraints with respect to a fixed context-free reference grammar. The solutions are sound but incomplete because the general problem of context-free language inclusion is undecidable. Both algorithms are derived from Earley's parsing algorithm for context-free languages. Taking the two parts together enables us to answer questions like: Is the value of a string-type expression derivable from a given nonterminal in the reference grammar?

47 citations


Journal ArticleDOI
TL;DR: It is proved that this model has greater generative capacity than the tiling systems of Giammarresi and Restivo and the grammars of Matz, another generalization of context-free string grammars to 2D.

41 citations


Journal ArticleDOI
TL;DR: This paper describes approaches for machine learning of context-free grammars (CFGs) from positive and negative sample strings, which are implemented in the Synapse system, together with mechanisms for incremental learning and search.

41 citations


Journal ArticleDOI
TL;DR: The methodology and tools applied in the Parallel Grammar project (ParGram) to support consistency and parallelism of linguistic representations across multilingual Lexical Functional Grammar (LFG) grammars are discussed.
Abstract: This paper discusses the methodology and tools applied in the Parallel Grammar project (ParGram) to support consistency and parallelism of linguistic representations across multilingual Lexical Functional Grammar (LFG) grammars. A particular issue is that the grammars in the ParGram project are developed at different international sites. The approach that was established over several years relies on (i) a technical tool for checking adherence to the best-practice feature declaration for linguistic representations, (ii) the coordinated, systematic use of templates for expressing generalizations across lexicon entries and grammar rules, and (iii) a grammar code reviewing committee in which extensions to the existing representations are critically discussed.

34 citations


Journal ArticleDOI
TL;DR: This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples that employs a distributed, steady-state genetic algorithm, with a fitness function incorporating a prior over the space of possible grammars.

28 citations


Book ChapterDOI
28 Apr 2005
TL;DR: It is shown that the universal membership problem for the class of lexicalized ACGs is NP-complete and the languages generated by lexicalized ACGs form a subclass of NP which includes some NP-complete languages.
Abstract: Previous studies have shown that some well-known classes of grammars can be simulated by Abstract Categorial Grammars (de Groote 2001) in straightforward ways. These classes of grammars all generate subclasses of the PTIME languages. While the exact generative capacity of the class of ACGs and the complexity of its universal membership problem are both unknown, we show that the universal membership problem for the class of lexicalized ACGs is NP-complete and the languages generated by lexicalized ACGs form a subclass of NP which includes some NP-complete languages.

23 citations


Journal ArticleDOI
TL;DR: It is shown that syntax of a small domain-specific language can be inferred from positive and negative programs provided by domain experts, using the genetic programming approach in grammatical inference.

Book ChapterDOI
28 Apr 2005
TL;DR: It is demonstrated that removing the SMC from the revised MG-version increases the generative power in such a way that the resulting formalism is not mildly context-sensitive anymore, suggesting that intuitions to the contrary notwithstanding, imposing an LC as such, here the SPIC, does not necessarily reduce formal complexity.
Abstract: Locality Conditions (LCs) on (unbounded) dependencies have played a major role in the development of generative syntax ever since the seminal work by Ross [22]. Descriptively, they fall into two groups. On the one hand there are intervention-based LCs (ILCs) often formulated as “minimality constraints” (“minimal link condition,” “minimize chain links”, “shortest move”, “attract closest,” etc.). On the other hand there are containment-based LCs (CLCs) typically defined in terms of (generalized) grammatical functions (“adjunct island”, “subject island”, “specifier island”, etc.). Research on LCs has been dominated by two very general trends. First, attempts have been made at unifying ILCs and CLCs on the basis of notions such as “government” and “barrier” (e.g. [4]). Secondly, research has often been guided by the intuition that, beyond empirical coverage, LCs somehow contribute to restricting the formal capacity of grammars (cf. [3–p. 125], [6–p. 14f]). Both these issues, we are going to argue, can be fruitfully studied within the framework of minimalist grammars (MGs) as defined by Stabler [25]. In particular, we are going to demonstrate that there is a specific asymmetry between the influence of ILCs and CLCs on complexity. Thus, MGs, including an ILC, namely, the shortest move condition (SMC) have been shown to belong to the mildly context-sensitive grammar formalisms by Michaelis [14]. The same has been shown in [16, 18] for a revised version of MGs introduced in [26], which includes the SMC and an additional CLC, namely, the specifier island condition (SPIC). In particular [14] and [16, 18] show that, in terms of derivable string languages, both the original MG-type and the revised MG-type constitute a subclass of the class of linear context-free rewriting systems (LCFRSs) in the sense of [28, 29], and thus, a series of other formalism classes all generating the same class of string languages as LCFRSs.
Here we will demonstrate that removing the SMC from the revised MG-version increases the generative power in such a way that the resulting formalism is not mildly context-sensitive anymore. This suggests that intuitions to the contrary notwithstanding, imposing an LC as such, here the SPIC, does not necessarily reduce formal complexity.

01 Jan 2005
TL;DR: It is described how multimodal grammars for dialogue systems can be written using the Grammatical Framework (GF) formalism, and a proof-of-concept dialogue system constructed using these techniques is presented.
Abstract: We describe how multimodal grammars for dialogue systems can be written using the Grammatical Framework (GF) formalism. A proof-of-concept dialogue system constructed using these techniques is also presented. The software engineering problem of keeping grammars for different languages, modalities and systems (such as speech recognizers and parsers) in sync is reduced by the formal relationship between the abstract and concrete syntaxes, and by generating equivalent grammars from GF grammars.

Journal ArticleDOI
TL;DR: This work shows that non-circularity remains decidable in EXPTIME and establishes the complexity of the non-emptiness and equivalence problem of extended AGs to be complete for EXPTIME, and shows that the Region Algebra expressions can be efficiently translated into extended AGs.

Proceedings ArticleDOI
16 Oct 2005
TL;DR: This work uses the genetic programming approach for grammatical inference and proposes the use of frequent sequences, syntax graphs and incremental construction of grammars in order to infer a more comprehensive set of context-free grammars.
Abstract: We propose a new application area for grammar inference which intends to make domain-specific language development easier and finds a second application in renovation tools for legacy systems. We use the genetic programming approach for grammatical inference and propose the use of frequent sequences, syntax graphs and incremental construction of grammars in order to be able to infer a more comprehensive set of context-free grammars.

Journal ArticleDOI
TL;DR: State-alternating context-free grammars are introduced, and the language classes obtained from them are compared to the classes of the Chomsky hierarchy as well as to some well-known complexity classes.

Book ChapterDOI
28 Apr 2005
TL;DR: Dependency Structure Grammars (DSG), which are rewriting rule grammars generating sentences together with their dependency structures, are more expressive than CF-grammars and non-equivalent to mildly context-sensitive grammars.
Abstract: In this paper, we define Dependency Structure Grammars (DSG), which are rewriting rule grammars generating sentences together with their dependency structures, are more expressive than CF-grammars, and are non-equivalent to mildly context-sensitive grammars. We show that DSG are weakly equivalent to Categorial Dependency Grammars (CDG) recently introduced in [6,3]. In particular, these dependency grammars naturally express long distance dependencies and enjoy good mathematical properties.

Book ChapterDOI
11 Oct 2005
TL;DR: It turns out that such a grammar is more expressive in modeling the translational equivalences of parallel texts for machine translation, and the use of CSG as a basis for building a machine translation (MT) system for Portuguese to Chinese translation is proposed.
Abstract: This paper proposes a variation of synchronous grammar based on the formalism of context-free grammar by generalizing the first component of productions that models the source text, named Constraint-based Synchronous Grammar (CSG). Unlike other synchronous grammars, CSG allows multiple target productions to be associated with a single source production rule, which can be used to guide a parser to infer different possible translational equivalences for a recognized input string according to the feature constraints of symbols in the pattern. Furthermore, CSG is augmented with independent rewriting that allows expressing discontinuous constituents in the inference rules. It turns out that such a grammar is more expressive in modeling the translational equivalences of parallel texts for machine translation, and in this paper, we propose the use of CSG as a basis for building a machine translation (MT) system for Portuguese to Chinese translation.

Journal Article
TL;DR: This paper gives a positive answer to the question of whether the result that insertion grammars with weight at least 7 can characterize recursively enumerable languages can be improved, by decreasing the weight of the insertion grammar used to 5.
Abstract: Insertion grammars have been introduced in [1] and their computational power has been studied in several places. In [7] it is proved that insertion grammars with weight at least 7 can characterize recursively enumerable languages (modulo a weak coding and an inverse morphism), and the question was formulated whether or not this result can be improved. In this paper, we come up with a positive answer to this question, by decreasing the weight of the insertion grammar used to 5. We also give a characterization of recursively enumerable languages in terms of right quotients of insertion languages.

Book ChapterDOI
02 Dec 2005
TL;DR: It is proved that the membership problem for monotone AC-tree automata is PSPACE-complete, and the family of AC-regular tree languages is strictly subsumed in that of AC-monotone tree languages.
Abstract: We consider several questions about monotone AC-tree automata, a class of equational tree automata whose transition rules correspond to rules in Kuroda normal form of context-sensitive grammars. Whereas it has been proved that this class has a decision procedure to determine if, given a monotone AC-tree automaton, it accepts no terms, other important decidability or complexity results have not been well-investigated yet. In the paper, we prove that the membership problem for monotone AC-tree automata is PSPACE-complete. We then study the expressiveness of monotone AC-tree automata: precisely, we prove that the family of AC-regular tree languages is strictly subsumed in that of AC-monotone tree languages. The proof technique used in obtaining the above result yields the answers to two different questions, specifically that the family of monotone AC-tree languages is not closed under complementation, and that the inclusion problem for monotone AC-tree automata is undecidable.
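For readers unfamiliar with the Kuroda normal form mentioned above: a grammar is in this form when every production has one of the shapes AB → CD, A → BC, A → B, or A → a, with A, B, C, D nonterminals and a a terminal. The following is a minimal sketch that checks individual string-rewriting rules against these shapes; the function name and the tuple encoding of rules are assumptions made for this example, and it says nothing about how the paper encodes such rules as AC-tree automaton transitions.

```python
def is_kuroda_rule(lhs, rhs, nonterminals, terminals):
    """Return True if the rule lhs -> rhs has a Kuroda normal form shape.

    lhs and rhs are tuples of symbols (a hypothetical encoding).
    """
    if len(lhs) == 2 and len(rhs) == 2:
        # AB -> CD: all four symbols must be nonterminals.
        return all(s in nonterminals for s in lhs + rhs)
    if len(lhs) == 1 and lhs[0] in nonterminals:
        if len(rhs) == 2:
            # A -> BC
            return all(s in nonterminals for s in rhs)
        if len(rhs) == 1:
            # A -> B or A -> a
            return rhs[0] in nonterminals or rhs[0] in terminals
    return False


N = {"S", "A", "B", "C"}
T = {"a", "b"}
print(is_kuroda_rule(("A", "B"), ("B", "A"), N, T))  # True
print(is_kuroda_rule(("A",), ("a", "B"), N, T))      # False
```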

Journal ArticleDOI
TL;DR: Metalinear CD grammar systems, as discussed by the authors, are context-free CD grammar systems where each component consists of metalinear productions.
Abstract: Metalinear CD grammar systems are context-free CD grammar systems where each component consists of metalinear productions. The maximal number of nonterminals in all starting productions is referred...


Journal Article
TL;DR: The main result proved shows that the natural embedding of any recursively enumerable one-dimensional array language in the two-dimensional space can be characterized by the projection of a two-dimensional array language generated by a contextual array grammar working in the t-mode and with norm one.
Abstract: The main result proved in this paper shows that the natural embedding of any recursively enumerable one-dimensional array language in the two-dimensional space can be characterized by the projection of a two-dimensional array language generated by a contextual array grammar working in the t-mode and with norm one. Moreover, we show that any recursively enumerable one-dimensional array language can even be characterized by the projection of a two-dimensional array language generated by a contextual array grammar working in the t-mode where in the selectors of the contextual array productions only the ability to distinguish between blank and non-blank positions is necessary; in that case, the norm of the two-dimensional contextual array grammar working in the t-mode cannot be bounded.

Journal Article
TL;DR: In this paper, the authors introduce stratified semantics for Boolean grammars and show how to check if a Boolean grammar generates a language according to this semantics, which covers a class of important and natural languages.
Abstract: We study Boolean grammars. We introduce stratified semantics for Boolean grammars. We show how to check whether a Boolean grammar generates a language according to this semantics. We show that stratified semantics covers a class of important and natural languages. We introduce a recognition algorithm for Boolean grammars compliant with this semantics.

01 Sep 2005
TL;DR: These lecture notes present categorial grammars as deductive systems, and include detailed proofs of their main properties, and define the mapping from a syntactic analysis to a higher-order logical formula, which describes the semantics of the parsed sentence.
Abstract: These lecture notes present categorial grammars as deductive systems, and include detailed proofs of their main properties. The first chapter deals with Ajdukiewicz and Bar-Hillel categorial grammars (AB grammars), their relation to context-free grammars and their learning algorithms. The second chapter is devoted to the Lambek calculus as a deductive system; the weak equivalence with context-free grammars is proved; we also define the mapping from a syntactic analysis to a higher-order logical formula, which describes the semantics of the parsed sentence. The third and last chapter is about proof-nets as parse structures for Lambek grammars; we show the linguistic relevance of these graphs, in particular through the study of a performance question. Although definitions, theorems and proofs have been reformulated for pedagogical reasons, these notes contain no personal results except in the proof-net chapter.

Book ChapterDOI
28 Apr 2005
TL;DR: By adding the single operation of intersection, borrowed from conjunctive grammar, PMCFG becomes equivalent to sLMG and RCG and is therefore able to describe exactly the class of languages recognizable in polynomial time.
Abstract: It is already known that parallel multiple context-free grammar (PMCFG) [1] is an instance of the equivalent formalisms simple literal movement grammar (sLMG) [2, 3] and range concatenation grammar (RCG) [4, 5]. In this paper we show that by adding the single operation of intersection, borrowed from conjunctive grammar [6], PMCFG becomes equivalent to sLMG and RCG. As a corollary we get that PMCFG with intersection describe exactly the class of languages recognizable in polynomial time.

Book ChapterDOI
29 Aug 2005
TL;DR: Stratified semantics for Boolean grammars is introduced; it is shown how to check whether a Boolean grammar generates a language according to this semantics, and that stratified semantics covers a class of important and natural languages.
Abstract: We study Boolean grammars. We introduce stratified semantics for Boolean grammars. We show how to check whether a Boolean grammar generates a language according to this semantics. We show that stratified semantics covers a class of important and natural languages. We introduce a recognition algorithm for Boolean grammars compliant with this semantics.

Journal ArticleDOI
TL;DR: The yield languages of synchronized tree automata, called the synchronized context-free (SCF) languages, are considered and it is shown that their language family coincides with the family of ET0L languages using both studied types of synchronization.

01 Jan 2005
TL;DR: A few families of context-free grammars Gn (n ≥ 1) in Chomsky normal form such that Gn generates Cn, and a family of minimal unambiguous grammars for which ν and π are linear.
Abstract: Let Ln be the finite language of all n! strings that are permutations of n different symbols (n ≥ 1). We consider context-free grammars Gn in Chomsky normal form that generate Ln. In particular we study a few families {Gn} (n ≥ 1), satisfying L(Gn) = Ln for n ≥ 1, with respect to their descriptional complexity, i.e. we determine the number of nonterminal symbols and the number of production rules of Gn as functions of n.
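To make the two counted quantities concrete, here is a minimal sketch of one naive grammar in Chomsky normal form for the permutation language. It is an illustration chosen for this example and is not claimed to be one of the families studied in the paper. It uses one nonterminal per non-empty subset S of the alphabet, with rules A_S → A_{a} A_{S∖{a}} for each a in S (when |S| ≥ 2) and A_{a} → a. The function names are hypothetical.

```python
from itertools import combinations, permutations


def build_grammar(symbols):
    """Build the subset-based CNF grammar; nonterminals are frozensets."""
    symbols = tuple(symbols)
    # A_{a} -> a
    terminal_rules = {frozenset([a]): a for a in symbols}
    # A_S -> A_{a} A_{S \ {a}} for every a in S, |S| >= 2
    binary_rules = {}
    for size in range(2, len(symbols) + 1):
        for subset in combinations(symbols, size):
            s = frozenset(subset)
            binary_rules[s] = [(frozenset([a]), s - {a}) for a in subset]
    return terminal_rules, binary_rules


def language(nt, terminal_rules, binary_rules):
    """Enumerate all strings derivable from nonterminal nt."""
    if nt in terminal_rules:
        return {terminal_rules[nt]}
    words = set()
    for left, right in binary_rules[nt]:
        for u in language(left, terminal_rules, binary_rules):
            for v in language(right, terminal_rules, binary_rules):
                words.add(u + v)
    return words


term, binary = build_grammar("abc")
start = frozenset("abc")
generated = language(start, term, binary)
print(generated == {"".join(p) for p in permutations("abc")})  # True
print(len(term) + len(binary))                                 # 7 nonterminals
print(len(term) + sum(len(r) for r in binary.values()))        # 12 productions
```

For n symbols this naive construction needs 2^n − 1 nonterminals, which shows why the dependence of both counts on n is a natural object of study.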

Book ChapterDOI
17 Aug 2005
TL;DR: The authors showed that the set of languages defined by general leftist grammars is not included in CFL, answering in negative a question from [9], where the accessibility problem for some general protection system was related to the membership problem of these grammars.
Abstract: Leftist grammars can be characterized in terms of rules of the form a→ ba and cd→ d, without distinction between terminals and nonterminals. They were introduced by Motwani et al. [9], where the accessibility problem for some general protection system was related to the membership problem of these grammars. This protection system was originally proposed in [3,10] in the context of Java virtual worlds. We show that the set of languages defined by general leftist grammars is not included in CFL, answering in negative a question from [9]. Moreover, we relate some restricted but naturally defined variants of leftist grammars to the language classes of the Chomsky hierarchy.
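The two rule shapes can be illustrated with a small rewriting sketch: an insertion rule a → ba places b immediately to the left of an occurrence of a, and a deletion rule cd → d removes c when it stands immediately to the left of d. The pair encoding of rules, the function names, and the length bound used to keep the search finite are assumptions made for this example; the sketch only explores one-step rewrites and does not implement the acceptance condition or any decision procedure from the paper.

```python
def successors(word, insert_rules, delete_rules):
    """Return all words obtainable from `word` in one rewriting step.

    insert_rules: set of pairs (b, a) encoding the rule a -> ba
    delete_rules: set of pairs (c, d) encoding the rule cd -> d
    """
    out = set()
    for i, sym in enumerate(word):
        for b, a in insert_rules:
            if sym == a:
                out.add(word[:i] + b + word[i:])   # insert b left of a
        if i + 1 < len(word):
            for c, d in delete_rules:
                if sym == c and word[i + 1] == d:
                    out.add(word[:i] + word[i + 1:])  # delete c left of d
    return out


def reachable(start, insert_rules, delete_rules, max_len):
    """Explore all words reachable from `start`, ignoring words longer than max_len."""
    seen = {start}
    frontier = [start]
    while frontier:
        word = frontier.pop()
        for nxt in successors(word, insert_rules, delete_rules):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


rules_ins = {("b", "a")}   # a -> ba
rules_del = {("b", "a")}   # ba -> a
print(sorted(reachable("a", rules_ins, rules_del, max_len=3)))  # ['a', 'ba', 'bba']
```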