
Showing papers on "Context-free grammar published in 2010"


Book
25 Aug 2010
TL;DR: This book provides an extensive overview of the formal language landscape between CFG and PTIME, moving from Tree Adjoining Grammars to Multiple Context-Free Grammars and then to Range Concatenation Grammars while explaining available parsing techniques for these formalisms.
Abstract: Given that context-free grammars (CFG) cannot adequately describe natural languages, grammar formalisms beyond CFG that are still computationally tractable are of central interest for computational linguists. This book provides an extensive overview of the formal language landscape between CFG and PTIME, moving from Tree Adjoining Grammars to Multiple Context-Free Grammars and then to Range Concatenation Grammars while explaining available parsing techniques for these formalisms. Although familiarity with the basic notions of parsing and formal languages is helpful when reading this book, it is not a strict requirement. The presentation is supported with many illustrations and examples relating to the different formalisms and algorithms, and chapter summaries, problems and solutions. The book will be useful for students and researchers in computational linguistics and in formal language theory.

134 citations


Journal Article
TL;DR: This work proposes a novel compromise by inferring a probabilistic tree substitution grammar, a formalism which allows for arbitrarily large tree fragments and thereby better represents complex linguistic structures, and demonstrates the model's efficacy on supervised phrase-structure parsing and unsupervised dependency grammar induction.
Abstract: Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has favoured model simplicity (and thus learnability) over representational capacity by using context-free grammars and first-order dependency grammars, which are not sufficiently expressive to model many common linguistic constructions. We propose a novel compromise by inferring a probabilistic tree substitution grammar, a formalism which allows for arbitrarily large tree fragments and thereby better represents complex linguistic structures. To limit the model's complexity we employ a Bayesian non-parametric prior which biases the model towards a sparse grammar with shallow productions. We demonstrate the model's efficacy on supervised phrase-structure parsing, where we induce a latent segmentation of the training treebank, and on unsupervised dependency grammar induction. In both cases the model uncovers interesting latent linguistic structures while producing competitive results.

98 citations


Proceedings Article
11 Jul 2010
TL;DR: A novel approach to authorship attribution, the task of identifying the author of a document, that builds a probabilistic context-free grammar for each author and uses it as a language model for classification.
Abstract: In this paper, we present a novel approach for authorship attribution, the task of identifying the author of a document, using probabilistic context-free grammars. Our approach involves building a probabilistic context-free grammar for each author and using this grammar as a language model for classification. We evaluate the performance of our method on a wide range of datasets to demonstrate its efficacy.

88 citations
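The approach summarized above, one probabilistic context-free grammar per author used as a language model, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the helper names and toy productions are invented, and a real system would first parse raw text with a treebank-derived parser to obtain each document's productions.

```python
from collections import Counter
from math import log

def train_pcfg(productions):
    """Estimate rule and left-hand-side counts from (lhs, rhs) pairs."""
    rule_counts = Counter(productions)
    lhs_counts = Counter(lhs for lhs, _ in productions)
    return rule_counts, lhs_counts

def log_score(productions, model, alpha=1.0, rules_per_lhs=100):
    """Add-alpha smoothed log-probability of a document's productions
    under one author's PCFG (rules_per_lhs is a crude smoothing constant)."""
    rule_counts, lhs_counts = model
    total = 0.0
    for lhs, rhs in productions:
        num = rule_counts[(lhs, rhs)] + alpha
        den = lhs_counts[lhs] + alpha * rules_per_lhs
        total += log(num / den)
    return total

def classify(doc_productions, models):
    """Attribute the document to the author whose grammar scores it highest."""
    return max(models, key=lambda a: log_score(doc_productions, models[a]))

# Toy treebank productions per author (entirely made up)
author_a = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("VP", ("V", "NP"))] * 3
author_b = [("S", ("VP",)), ("VP", ("V",)), ("NP", ("N",))] * 3
models = {"A": train_pcfg(author_a), "B": train_pcfg(author_b)}
doc = [("S", ("NP", "VP")), ("NP", ("Det", "N"))]
print(classify(doc, models))  # prints A
```

The per-author grammar acts exactly like an n-gram language model in classical attribution work, only over syntactic productions instead of word sequences.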


Proceedings ArticleDOI
17 Jan 2010
TL;DR: The design and theory of a new parsing engine, YAKKER, capable of satisfying the many needs of modern programmers and modern data processing applications is presented and its use on examples ranging from difficult programming language grammars to web server logs to binary data specification is illustrated.
Abstract: We present the design and theory of a new parsing engine, YAKKER, capable of satisfying the many needs of modern programmers and modern data processing applications. In particular, our new parsing engine handles (1) full scannerless context-free grammars with (2) regular expressions as right-hand sides for defining nonterminals. YAKKER also includes (3) facilities for binding variables to intermediate parse results and (4) using such bindings within arbitrary constraints to control parsing. These facilities allow the kind of data-dependent parsing commonly needed in systems applications, particularly those that operate over binary data. In addition, (5) nonterminals may be parameterized by arbitrary values, which gives the system good modularity and abstraction properties in the presence of data-dependent parsing. Finally, (6) legacy parsing libraries, such as sophisticated libraries for dates and times, may be directly incorporated into parser specifications. We illustrate the importance and utility of this rich collection of features by presenting its use on examples ranging from difficult programming language grammars to web server logs to binary data specification. We also show that our grammars have important compositionality properties and explain why such properties are important in modern applications such as automatic grammar induction. In terms of technical contributions, we provide a traditional high-level semantics for our new grammar formalization and show how to compile grammars into nondeterministic automata. These automata are stack-based, somewhat like conventional push-down automata, but are also equipped with environments to track data-dependent parsing state. We prove the correctness of our translation of data-dependent grammars into these new automata and then show how to implement the automata efficiently using a variation of Earley's parsing algorithm.

61 citations


DOI
01 Jan 2010
TL;DR: PetitParser combines ideas from scannerless parsing, parser combinators, parsing expression grammars and packrat parsers to model grammars and parsers as objects that can be reconfigured dynamically.
Abstract: Grammars for programming languages are traditionally specified statically. They are hard to compose and reuse due to ambiguities that inevitably arise. PetitParser combines ideas from scannerless parsing, parser combinators, parsing expression grammars and packrat parsers to model grammars and parsers as objects that can be reconfigured dynamically. Through examples and benchmarks we demonstrate that dynamic grammars are not only flexible but highly practical.

61 citations


Book ChapterDOI
13 Sep 2010
TL;DR: It is shown that there is a natural class of context-free languages, including the class of regular languages, that can be polynomially learned from a MAT, using an algorithm that extends Angluin's LSTAR algorithm.
Abstract: Angluin showed that the class of regular languages could be learned from a Minimally Adequate Teacher (MAT) providing membership and equivalence queries. Clark and Eyraud (2007) showed that some context-free grammars can be identified in the limit from positive data alone by identifying the congruence classes of the language. In this paper we consider learnability of context-free languages using a MAT. We show that there is a natural class of context-free languages, which includes the class of regular languages, that can be polynomially learned from a MAT, using an algorithm that is an extension of Angluin's LSTAR algorithm.

54 citations


Journal ArticleDOI
TL;DR: A language for specifying strategies for solving exercises is introduced, which makes it easier to automatically calculate feedback, for example when a user makes an erroneous step in a calculation.
Abstract: Strategies specify how a wide range of exercises can be solved incrementally, such as bringing a logic proposition to disjunctive normal form, reducing a matrix, or calculating with fractions. In this paper we introduce a language for specifying strategies for solving exercises. This language makes it easier to automatically calculate feedback, for example when a user makes an erroneous step in a calculation. We can automatically generate worked-out examples, track the progress of a student by inspecting submitted intermediate answers, and report back suggestions in case the student deviates from the strategy. Thus it becomes less labor-intensive and less ad-hoc to specify new exercise domains and exercises within that domain. A strategy describes valid sequences of rewrite rules, which turns tracking intermediate steps into a parsing problem. This is a promising view at interactive exercises because it allows us to take advantage of many years of experience in parsing sentences of context-free languages, and transfer this knowledge and technology to the domain of stepwise solving exercises. In this paper we work out the similarities between parsing and solving exercises incrementally, we discuss generating feedback on strategies, and the implementation of a strategy recognizer.

46 citations


Proceedings ArticleDOI
06 May 2010
TL;DR: Methods are proposed to automatically insert cut operators into some practical grammars without changing the accepted languages, enabling packrat parsers to handle grammars including the Java grammar in mostly constant space without requiring any extra annotations.
Abstract: Packrat parsing is a powerful parsing algorithm presented by Ford in 2002. Packrat parsers can handle complicated grammars and recursive structures in lexical elements more easily than the traditional LL(k) or LR(1) parsing algorithms. However, packrat parsers require O(n) space for memoization, where n is the length of the input. This space inefficiency makes packrat parsers impractical in some applications. In our earlier work, we had proposed a packrat parser generator that accepts grammars extended with cut operators, which enable the generated parsers to reduce the amount of storage required. Experiments showed that parsers generated from cut-inserted grammars can parse Java programs and subset XML files in bounded space. In this study, we propose methods to automatically insert cut operators into some practical grammars without changing the accepted languages. Our experimental evaluations indicated that using our methods, packrat parsers can handle some practical grammars including the Java grammar in mostly constant space without requiring any extra annotations.

37 citations
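The space cost described above is easy to see in code: a packrat parser memoises every (rule, position) result, so the memo table grows linearly with the input. The recogniser below, for the hypothetical toy grammar Expr <- Num ('+' Num)*, is a minimal sketch of that bookkeeping (not the authors' generator); a cut operator marks points after which entries for earlier positions can never be revisited and may safely be discarded, which is how the paper bounds the table.

```python
# Toy packrat recogniser for  Expr <- Num ('+' Num)*  over a digit string.
def make_parser(text):
    memo = {}  # (rule_name, pos) -> end position of the match, or None

    def num(i):
        # Num <- [0-9]+  (not memoised here, for brevity)
        j = i
        while j < len(text) and text[j].isdigit():
            j += 1
        return j if j > i else None

    def expr(i):
        key = ("expr", i)
        if key in memo:          # every result is remembered ...
            return memo[key]
        j = num(i)
        while j is not None and j < len(text) and text[j] == "+":
            k = num(j + 1)
            if k is None:
                break
            j = k
        memo[key] = j            # ... so the table grows with the input
        return j

    return expr, memo

expr, memo = make_parser("1+23+4")
print(expr(0), len(memo))  # 6 1: whole input matched, one memo entry
```

With backtracking alternatives in the grammar, the same table is what turns exponential reparsing into linear time, at the price of the O(n) space the paper attacks.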


Book ChapterDOI
01 Jan 2010
TL;DR: This chapter defines context-free grammars and presents a general polynomial parsing algorithm, which is not fast enough to be practical, then considers faster, linear-time algorithms, recursive-descent parsing and LL-parsing, that can be used for some classes of context-free grammars.
Abstract: In chapter 5 we use finite automata for text parsing. As noted, there are rather simple structures (e.g., nested comments) that cannot be parsed with finite automata. There is a more powerful formalism called context-free grammars that is often used when finite automata are not enough. In section 15.1 we define context-free grammars and consider a general polynomial parsing algorithm. However, this algorithm is not fast enough to be practical, and in the next two sections we consider faster (linear time) algorithms that can be used for some classes of context-free grammars, called recursive-descent parsing (section 15.2) and LL-parsing (section 15.3).

36 citations
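The motivating example above, nested comments, is a natural fit for recursive descent: the parser simply calls itself at each inner comment, using the call stack for the unbounded counting a finite automaton lacks. A minimal sketch, assuming a hypothetical (* ... *) comment syntax:

```python
# Recursive-descent recogniser for nested comments (* ... (* ... *) ... *).
def parse_comment(s, i=0):
    """If s[i:] starts with a balanced comment, return the index just past
    it; otherwise return None."""
    if s[i:i+2] != "(*":
        return None
    i += 2
    while i < len(s):
        if s[i:i+2] == "*)":
            return i + 2
        if s[i:i+2] == "(*":
            i = parse_comment(s, i)   # recurse into the nested comment
            if i is None:
                return None
        else:
            i += 1
    return None  # unterminated comment

print(parse_comment("(* a (* nested *) b *)"))  # 22: whole string consumed
```

Each recursive call corresponds to one application of a grammar rule like Comment -> "(*" Body "*)", which is exactly the correspondence the chapter develops for recursive-descent and LL parsing.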


Journal ArticleDOI
TL;DR: It is observed that there is a simple linguistic characterization of the grammar ambiguity problem, and it is shown how to exploit this by presenting an ambiguity analysis framework based on conservative language approximations.

36 citations


Journal ArticleDOI
TL;DR: This paper gives a concise description of PGF, covering syntax, semantics, and parser generation, and discusses the technique of embedded grammars, where language processing tasks defined by PGF grammars are integrated in larger systems.
Abstract: Portable Grammar Format (PGF) is a core language for type-theoretical grammars. It is the target language to which grammars written in the high-level formalism Grammatical Framework (GF) are compiled. Low-level and simple, PGF is easy to reason about, so that its language-theoretic properties can be established. It is also easy to write interpreters that perform parsing and generation with PGF grammars, and compilers converting PGF to other formats. This paper gives a concise description of PGF, covering syntax, semantics, and parser generation. It also discusses the technique of embedded grammars, where language processing tasks defined by PGF grammars are integrated in larger systems.

Book ChapterDOI
13 Sep 2010
TL;DR: This work presents a learning algorithm for context free grammars which uses positive data and membership queries, and proves its correctness under the identification in the limit paradigm.
Abstract: The Syntactic Concept Lattice is a residuated lattice based on the distributional structure of a language; the natural representation based on this is a context sensitive formalism. Here we examine the possibility of basing a context free grammar (CFG) on the structure of this lattice; in particular by choosing non-terminals to correspond to concepts in this lattice. We present a learning algorithm for context free grammars which uses positive data and membership queries, and prove its correctness under the identification in the limit paradigm. Since the lattice itself may be infinite, we consider only a polynomially bounded subset of the set of concepts, in order to get an efficient algorithm. We compare this on the one hand to learning algorithms for context free grammars, where the non-terminals correspond to congruence classes, and on the other hand to the use of context sensitive techniques such as Binary Feature Grammars and Distributional Lattice Grammars. The class of CFGs that can be learned in this way includes inherently ambiguous and thus non-deterministic languages; this approach therefore breaks through an important barrier in CFG inference.

Journal ArticleDOI
TL;DR: The aim of this paper is to experiment, on several grammars of domain-specific languages and of general-purpose languages, with existing grammar metrics together with new metrics based on the grammar's LR automaton and on the language recognized.
Abstract: Grammar metrics have been introduced to measure the quality and the complexity of formal grammars. The aim of this paper is to explore the meaning of these notions and to experiment, on several grammars of domain-specific languages and of general-purpose languages, with existing grammar metrics together with new metrics that are based on the grammar's LR automaton and on the language recognized. We discuss the results of this experiment, focusing on the comparison between grammars of domain-specific languages and of general-purpose languages, and on the evolution of the metrics between several versions of the same language.

Proceedings Article
02 Jun 2010
TL;DR: A variational inference algorithm for adaptor grammars is described, providing an alternative to Markov chain Monte Carlo methods, and a significant speed-up is shown when parallelizing the algorithm.
Abstract: Adaptor grammars extend probabilistic context-free grammars to define prior distributions over trees with "rich get richer" dynamics. Inference for adaptor grammars seeks to find parse trees for raw text. This paper describes a variational inference algorithm for adaptor grammars, providing an alternative to Markov chain Monte Carlo methods. To derive this method, we develop a stick-breaking representation of adaptor grammars, a representation that enables us to define adaptor grammars with recursion. We report experimental results on a word segmentation task, showing that variational inference performs comparably to MCMC. Further, we show a significant speed-up when parallelizing the algorithm. Finally, we report promising results for a new application for adaptor grammars, dependency grammar induction.

01 Jan 2010
TL;DR: The grammar in the grammar-based Genetic Programming (GP) approach of Grammatical Evolution (GE) is explored, and a meta-grammar GE, which allows a larger grammar with a different bias, is studied by adopting a divide-and-conquer strategy.
Abstract: The grammar in the grammar-based Genetic Programming (GP) approach of Grammatical Evolution (GE) is explored. The GE algorithm solves problems by using a grammar representation and an automated and parallel trial-and-error approach, Evolutionary Computation (EC). The search for solutions in EC is driven by evaluating each solution, selecting the fittest and replacing these into a population of solutions which are modified to further guide the search. Representations have a strong impact on the efficiency of search and by using a generative grammar domain knowledge is encoded into the population of solutions. The grammar in GE biases the search for solutions, and in combination with a linear representation this is what distinguishes GE from other GP-systems. After a review of grammars in EC and a description of GE, several different constructions of grammars and operators for manipulating the grammars and the evolutionary algorithm are studied. The thesis goes on to study a meta-grammar GE, which allows a larger grammar with different bias. By adopting a divide-and-conquer strategy the goal is to investigate how a modular GE approach solves problems of increasing size and in dynamically changing environments. The results show some benefit from using meta-grammars in GE, for the meta-grammar Genetic Algorithm (mGGA) and they re-emphasize the grammar’s impact on GE’s performance. In addition, GE and meta-grammars are more formally described. The bias, both declarative and search, arising from the use of a Context-Free Grammar representation and the constraints of GE and the mGGA are analyzed and their implications are examined. This is done by studying the effects of the mapping and operations on the input, single and multiple changes in input, as well as the preservation of output after a change. 
Furthermore, a matrix view of a grammar and different suggestions for measurements of grammars are investigated, in order to allow the practitioner to get an alternative view of the mapping process and of how operations work.

Proceedings ArticleDOI
01 Dec 2010
TL;DR: This paper describes a context-free grammar for the Bangla language and develops a Bangla parser based on it; the parser uses the top-down parsing method, and the idea of left factoring is adopted to avoid left recursion.
Abstract: Parsing is a process of transforming natural language into an internal system representation, which can be trees, dependency graphs, frames or some other structural representations. If a natural language can be successfully parsed, then grammar checking for that language becomes easy. In this paper we describe a context-free grammar for the Bangla language and develop a Bangla parser based on the grammar. Our approach is general enough to apply to Bangla sentences, and the method is a well-accepted one for parsing the language of a grammar. The scheme is based on the top-down parsing method, and the idea of left factoring is adopted to avoid left recursion.
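A top-down parser cannot handle a left-recursive rule such as E -> E + T | T directly, because it would recurse forever without consuming input. The standard grammar transformation that such parsers rely on can be written out mechanically; the function below is a generic sketch (not code from the paper) of eliminating immediate left recursion, a transformation closely related to the left factoring the authors adopt:

```python
def eliminate_left_recursion(nonterminal, rules):
    """Remove immediate left recursion:
        A -> A a1 | ... | A am | b1 | ... | bn
    becomes
        A  -> b1 A' | ... | bn A'
        A' -> a1 A' | ... | am A' | []      ([] stands for epsilon)
    Each rule is a list of symbols; returns a dict of new rules."""
    recursive = [r[1:] for r in rules if r and r[0] == nonterminal]
    other = [r for r in rules if not r or r[0] != nonterminal]
    if not recursive:
        return {nonterminal: rules}   # nothing to do
    aprime = nonterminal + "'"
    return {
        nonterminal: [r + [aprime] for r in other],
        aprime: [r + [aprime] for r in recursive] + [[]],
    }

# Classic example: E -> E + T | T
new = eliminate_left_recursion("E", [["E", "+", "T"], ["T"]])
print(new)
# {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], []]}
```

The transformed grammar accepts the same language but is safe for a recursive-descent parser, at the cost of changing the shape of the parse trees (the recursion becomes right-branching).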

Journal ArticleDOI
TL;DR: A simple, direct proof of the fact that second-order ACGs are simulated by hyperedge replacement grammars is given, which implies that the string and tree generating power of the former is included in that of the latter.
Abstract: Second-order abstract categorial grammars (de Groote in Association for Computational Linguistics, 39th annual meeting and 10th conference of the European chapter, proceedings of the conference, pp. 148–155, 2001) and hyperedge replacement grammars (Bauderon and Courcelle in Math Syst Theory 20:83–127, 1987; Habel and Kreowski in STACS 87: 4th Annual symposium on theoretical aspects of computer science. Lecture notes in computer science, vol 247, Springer, Berlin, pp 207–219, 1987) are two natural ways of generalizing "context-free" grammar formalisms for string and tree languages. It is known that the string generating power of both formalisms is equivalent to (non-erasing) multiple context-free grammars (Seki et al. in Theor Comput Sci 88:191–229, 1991) or linear context-free rewriting systems (Weir in Characterizing mildly context-sensitive grammar formalisms, University of Pennsylvania, 1988). In this paper, we give a simple, direct proof of the fact that second-order ACGs are simulated by hyperedge replacement grammars, which implies that the string and tree generating power of the former is included in that of the latter. The normal form for tree-generating hyperedge replacement grammars given by Engelfriet and Maneth (Graph transformation. Lecture notes in computer science, vol 1764. Springer, Berlin, pp 15–29, 2000) can then be used to show that the tree generating power of second-order ACGs is exactly the same as that of hyperedge replacement grammars.

Book ChapterDOI
07 Apr 2010
TL;DR: A novel evolutionary engine for the evolution of context-free grammars is presented; it relies on specially designed graph-based crossover and mutation operators and is able to create diverse and interesting families of shapes even when the initial population is composed of minimal grammars.
Abstract: We present a novel evolutionary engine for the evolution of context-free grammars. The system relies on specially designed graph-based crossover and mutation operators. While in most evolutionary art systems each individual corresponds to a single artwork, in our approach each individual is a context-free grammar that specifies a family of shapes following the same production rules. To assess the adequacy and completeness of the system we perform experiments using automated fitness assignment and user-guided evolution. The experimental results show that the system is able to create diverse and interesting families of shapes even when the initial population is composed of minimal grammars.

Journal ArticleDOI
TL;DR: In adaptive star grammars, rules are actually schemata which, via the cloning of so-called multiple nodes, may adapt to potentially infinitely many contexts when they are applied, and they turn out to be restricted enough to share some of the basic characteristics of context-free devices.

Proceedings ArticleDOI
27 Sep 2010
TL;DR: An analytic comparison of the performance of both setups, i.e., grammatical evolution and tree-adjunct grammatical evolution, across a number of classic genetic programming benchmarking problems indicates that tree-adjunct grammatical evolution has a better overall performance (measured in terms of finding the global optima).
Abstract: In this paper we investigate the application of tree-adjunct grammars to grammatical evolution. The standard type of grammar used by grammatical evolution, context-free grammars, produce a subset of the languages that tree-adjunct grammars can produce, making tree-adjunct grammars, expressively, more powerful. In this study we shed some light on the effects of tree-adjunct grammars on grammatical evolution, or tree-adjunct grammatical evolution. We perform an analytic comparison of the performance of both setups, i.e., grammatical evolution and tree-adjunct grammatical evolution, across a number of classic genetic programming benchmarking problems. The results firmly indicate that tree-adjunct grammatical evolution has a better overall performance (measured in terms of finding the global optima).

Journal ArticleDOI
TL;DR: If it is furthermore required that each rule of the general form A->w has a nonempty w, then a substantial subfamily of conjunctive languages can be generated, yet it remains unknown whether such grammars are as powerful as conjunctive grammars of the general form.

Book ChapterDOI
06 Jul 2010
TL;DR: Three open questions in the theory of regulated rewriting are addressed: whether every permitting random context grammar has a non-erasing equivalent, whether the same is true for matrix grammars without appearance checking, and whether permitting random context grammars have the same generative capacity as matrix grammars without appearance checking.
Abstract: Three open questions in the theory of regulated rewriting are addressed. The first is whether every permitting random context grammar has a non-erasing equivalent. The second asks whether the same is true for matrix grammars without appearance checking. The third concerns whether permitting random context grammars have the same generative capacity as matrix grammars without appearance checking. The main result is a positive answer to the first question. For the other two, conjectures are presented. It is then deduced from the main result that at least one of the two holds.

Book ChapterDOI
01 Sep 2010
TL;DR: It is shown how the approximative Noncanonical Unambiguity Test by Schmitz can be extended to conservatively identify production rules that do not contribute to the ambiguity of a grammar.
Abstract: Context-free grammars are widely used but still hindered by ambiguity. This stresses the need for detailed detection methods that point out the sources of ambiguity in a grammar. In this paper we show how the approximative Noncanonical Unambiguity Test by Schmitz can be extended to conservatively identify production rules that do not contribute to the ambiguity of a grammar. We prove the correctness of our approach and consider its practical applicability.

Proceedings ArticleDOI
09 Jun 2010
TL;DR: A grammar-based mutation testing framework is proposed, together with effective mutation operators, coverage concepts and algorithms for test sequence generation, which enables complementary or alternative use of regular grammars, depending on the preferences of the test engineer.
Abstract: Model-based approaches, especially based on directed graphs (DG), are becoming popular for mutation testing as they enable definition of simple, nevertheless powerful, mutation operators and effective coverage criteria. However, these models easily become intractable if the system under consideration is too complex or large. Moreover, existing DG-based algorithms for test generation and optimization are rare and rather in an initial stage. Finally, DG models fail to represent languages beyond type-3 (regular). This paper proposes a grammar-based mutation testing framework, together with effective mutation operators, coverage concepts and algorithms for test sequence generation. The objective is to establish a formal framework for model-based mutation testing which enables complementary or alternative use of regular grammars, depending on the preferences of the test engineer. A case study validates the approach and analyzes its characteristic issues.

01 Oct 2010
TL;DR: This paper shows how the approach proposed by Warth et al. for direct left-recursive packrat parsing can be adapted for ‘pure’ PEGs, and outlines a restrictive subset of left-recursive PEGs which can safely work with this algorithm.
Abstract: Parsing Expression Grammars (PEGs) are specifications of unambiguous recursive-descent style parsers. PEGs incorporate both lexing and parsing phases and have valuable properties, such as being closed under composition. In common with most recursive-descent systems, raw PEGs cannot handle left-recursion; traditional approaches to left-recursion elimination lead to incorrect parses. In this paper, I show how the approach proposed for direct left-recursive Packrat parsing by Warth et al. can be adapted for ‘pure’ PEGs. I then demonstrate that this approach results in incorrect parses for some PEGs, before outlining a restrictive subset of left-recursive PEGs which can safely work with this algorithm. Finally I suggest an alteration to Warth et al.’s algorithm that can correctly parse a less restrictive subset of directly recursive PEGs.
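The Warth et al. approach discussed above can be illustrated for a single directly left-recursive rule: plant a failing "seed" in the memo table, then repeatedly re-evaluate the rule body, letting each pass consume more input, until the parse stops growing. The following is a simplified sketch of that idea under an assumed toy grammar, expr <- expr '-' num / num; the function names and grammar are hypothetical, not code from the paper:

```python
memo = {}  # (text, pos) -> end position of the expr match, or None

def num(text, i):
    # num <- [0-9]  (single digit, to keep the sketch short)
    return i + 1 if i < len(text) and text[i].isdigit() else None

def expr_body(text, i):
    # expr '-' num
    j = expr(text, i)
    if j is not None and j < len(text) and text[j] == "-":
        k = num(text, j + 1)
        if k is not None:
            return k
    # / num
    return num(text, i)

def expr(text, i):
    key = (text, i)
    if key in memo:
        return memo[key]
    memo[key] = None              # the seed: the recursive call first fails
    result = None
    while True:
        new = expr_body(text, i)
        if new is None or (result is not None and new <= result):
            break                 # the parse stopped growing
        result = new
        memo[key] = result        # grow the seed and try again
    memo[key] = result
    return result

print(expr("1-2-3", 0))  # 5: the whole left-associative expression
```

The paper's contribution is to show that this scheme misparses some PEGs and to carve out a subset of left-recursive grammars for which it (or a corrected variant) is safe.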

Proceedings Article
11 Jul 2010
TL;DR: A novel training method for the model using a blocked Metropolis-Hastings sampler in place of the previous method's local Gibbs sampler, which enables efficient blocked inference for training and also improves the parsing algorithm.
Abstract: Learning a tree substitution grammar is very challenging due to derivational ambiguity. Our recent approach used a Bayesian non-parametric model to induce good derivations from treebanked input (Cohn et al., 2009), biasing towards small grammars composed of small generalisable productions. In this paper we present a novel training method for the model using a blocked Metropolis-Hastings sampler in place of the previous method's local Gibbs sampler. The blocked sampler makes considerably larger moves than the local sampler and consequently converges in less time. A core component of the algorithm is a grammar transformation which represents an infinite tree substitution grammar in a finite context free grammar. This enables efficient blocked inference for training and also improves the parsing algorithm. Both algorithms are shown to improve parsing accuracy.

Journal ArticleDOI
TL;DR: VisualLISA, a new visual language for attribute grammars (AGs), is presented, together with a solution for rapid development of its editor using DEViL and of the associated programming environment.
Abstract: The focus of this paper is on crafting a new visual language for attribute grammars (AGs), and on the development of the associated programming environment. We present a solution for rapid development of the VisualLISA editor using DEViL. DEViL uses traditional attribute grammars to specify the language's syntax and semantics, extended by visual representations associated with grammar symbols. From these specifications a visual programming environment is automatically generated. In our case, the environment allows us to edit a visual description of an AG that is automatically translated into textual notations, including an XML-based representation for attribute grammars (XAGra), and is intended to be helpful for beginners and for rapid development of small AGs. XAGra allows us to use VisualLISA with other compiler-compiler tools.

Proceedings ArticleDOI
13 Sep 2010
TL;DR: In this article, the authors present a toolkit for context-free grammars, which mainly consists of several algorithms for sentence generation or enumeration and for coverage analysis of context-free grammars.
Abstract: Producing sentences from a grammar, according to various criteria, is required in many applications. It is also a basic building block for grammar engineering. This paper presents a toolkit for context-free grammars, which mainly consists of several algorithms for sentence generation or enumeration and for coverage analysis for context-free grammars. The toolkit deals with general context-free grammars. Besides providing implementations of algorithms, the toolkit also provides a simple graphical user interface, through which the user can use the toolkit directly. The toolkit is implemented in Java and is available at http://lcs.ios.ac.cn/hiwu/toolkit.php. The paper presents an overview of the toolkit and a description of the GUI, and also reports experimental results and preliminary applications of the toolkit.
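The basic building block above, producing sentences from a general CFG, fits in a few lines. The following random-derivation sketch is illustrative only (the toolkit itself is in Java and also offers enumeration and coverage-directed generation); the toy grammar is invented:

```python
import random

def generate(grammar, symbol, rng):
    """Randomly derive a terminal string from a CFG given as a dict
    {nonterminal: list of right-hand sides}; symbols without an entry
    are terminals. (No depth bound: heavily recursive grammars may loop.)"""
    if symbol not in grammar:
        return [symbol]
    rhs = rng.choice(grammar[symbol])
    return [tok for part in rhs for tok in generate(grammar, part, rng)]

# A toy grammar (hypothetical, not from the toolkit)
grammar = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V"], ["V", "NP"]],
    "N": [["dog"], ["cat"]],
    "V": [["sees"], ["sleeps"]],
}
print(" ".join(generate(grammar, "S", random.Random(0))))
```

Practical generators refine exactly this loop: weighting rule choices, bounding derivation depth, or steering choices to cover every production at least once.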

Proceedings ArticleDOI
23 Mar 2010
TL;DR: It is shown that query results have rational probabilities with a polynomial-size bit representation and, more importantly, an efficient query-evaluation algorithm is presented.
Abstract: Stochastic context-free grammars (SCFGs) have long been recognized as useful for a large variety of tasks including natural language processing, morphological parsing, speech recognition, information extraction, Web-page wrapping and even analysis of RNA. A string and an SCFG jointly represent a probabilistic interpretation of the meaning of the string, in the form of a (possibly infinite) probability space of parse trees. The problem of evaluating a query over this probability space is considered under the conventional semantics of querying a probabilistic database. For general SCFGs, extremely simple queries may have results that include irrational probabilities. But, for a large subclass of SCFGs (that includes all the standard studied subclasses of SCFGs) and the language of tree-pattern queries with projection (and child/descendant edges), it is shown that query results have rational probabilities with a polynomial-size bit representation and, more importantly, an efficient query-evaluation algorithm is presented.

Journal ArticleDOI
TL;DR: In this paper, contextual star grammars are proposed as a graph grammar approach that allows for simple parsing and that is powerful enough for specifying non-trivial software models, such as program graphs, a language-independent model of object-oriented programs.
Abstract: The precise specification of software models is a major concern in model-driven design of object-oriented software. Metamodelling and graph grammars are apparent choices for such specifications. Metamodelling has several advantages: it is easy to use, and provides procedures that check automatically whether a model is valid or not. However, it is less suited for proving properties of models, or for generating large sets of example models. Graph grammars, in contrast, offer a natural procedure - the derivation process - for generating example models, and they support proofs because they define a graph language inductively. However, not all graph grammars that allow to specify practically relevant models are easily parseable. In this paper, we propose contextual star grammars as a graph grammar approach that allows for simple parsing and that is powerful enough for specifying non-trivial software models. This is demonstrated by defining program graphs, a language-independent model of object-oriented programs, with a focus on shape (static structure) rather than behavior.