
Showing papers on "Context-sensitive grammar published in 2015"


Book ChapterDOI
26 Mar 2015
TL;DR: This article reviews the main classes of probabilistic grammars and points to some active areas of research.
Abstract: Formal grammars are widely used in speech recognition, language translation, and language understanding systems. Grammars rich enough to accommodate natural language generate multiple interpretations of typical sentences. These ambiguities are a fundamental challenge to practical application. Grammars can be equipped with probability distributions, and the various parameters of these distributions can be estimated from data (e.g., acoustic representations of spoken words or a corpus of hand-parsed sentences). The resulting probabilistic grammars help to interpret spoken or written language unambiguously. This article reviews the main classes of probabilistic grammars and points to some active areas of research.
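The core idea of probabilistic grammars can be made concrete with a minimal sketch (invented for illustration, not taken from the article): each rule of a nonterminal carries a probability, the probabilities of a nonterminal's alternatives sum to 1, and a derivation is scored by the product of the probabilities of the rules it uses.

```python
# Toy probabilistic CFG: rule alternatives carry probabilities summing to 1.
# (Hypothetical grammar and numbers, purely for illustration.)
PCFG = {
    "S":  [(0.7, ["NP", "VP"]), (0.3, ["VP"])],
    "NP": [(1.0, ["she"])],
    "VP": [(0.6, ["runs"]), (0.4, ["sleeps"])],
}

def derivation_probability(rules_used):
    """Multiply the probabilities of the rules appearing in a derivation."""
    p = 1.0
    for lhs, rhs in rules_used:
        for prob, alternative in PCFG[lhs]:
            if alternative == rhs:
                p *= prob
                break
    return p

# "she runs" via S -> NP VP, NP -> she, VP -> runs: 0.7 * 1.0 * 0.6
p = derivation_probability([("S", ["NP", "VP"]),
                            ("NP", ["she"]),
                            ("VP", ["runs"])])
```

When a sentence has several parses, comparing their derivation probabilities is what lets a probabilistic grammar prefer one interpretation over another.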

38 citations


Proceedings ArticleDOI
17 Oct 2015
TL;DR: This article gives the first truly sub-cubic algorithm that computes language edit distance almost optimally, and further shows that the language edit distances of all substrings of a given string can be estimated near-optimally in the same time with high probability.
Abstract: Given a context-free grammar G over alphabet Σ and a string s, the language edit distance problem seeks the minimum number of edits (insertions, deletions and substitutions) required to convert s into a valid member of the language L(G). The well-known dynamic programming algorithm solves this problem in cubic time in the string length [Aho, Peterson 1972; Myers 1985]. Despite its numerous applications, to date there exists no algorithm that computes the exact or approximate language edit distance in truly sub-cubic time. In this paper we give the first such truly sub-cubic algorithm that computes language edit distance almost optimally. We further solve the local alignment problem: for all substrings of s, we can estimate their language edit distance near-optimally in the same time with high probability. Next, we design the very first sub-cubic algorithm that, given an arbitrary stochastic context-free grammar and a string, returns a nearly-optimal maximum-likelihood parsing of that string. Stochastic context-free grammars significantly generalize hidden Markov models; they lie at the foundation of statistical natural language processing and have found widespread applications in many other fields. To complement our upper bound result, we show that exact computation of maximum-likelihood parsing of stochastic grammars or of language edit distance in truly sub-cubic time would imply a truly sub-cubic algorithm for all-pairs shortest paths, a long-standing open question. This would result in a breakthrough for a large range of problems in graphs and matrices due to sub-cubic equivalence. By a known lower bound result [Lee 2002] and a recent development [Abboud et al. 2015], even the much simpler problem of parsing a context-free grammar requires fast matrix multiplication time. Therefore any nontrivial multiplicative approximation algorithms for either of the two problems in time less than matrix multiplication are unlikely to exist.
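The general algorithms above are involved, but the problem itself can be illustrated on the simplest context-free language, balanced parentheses. The sketch below (illustrative only, specialized to this one language rather than an arbitrary grammar) is the classical cubic-time dynamic program computing the minimum number of deletions and substitutions turning a string into a member of the Dyck language:

```python
from functools import lru_cache

def dyck_edit_distance(s):
    """Minimum number of deletions/substitutions over {'(', ')'} turning s
    into a balanced-parenthesis string -- a tiny, cubic-time instance of the
    language edit distance problem the paper attacks for general grammars."""
    @lru_cache(maxsize=None)
    def d(i, j):
        if i >= j:
            return 0                           # empty substring is balanced
        best = 1 + d(i + 1, j)                 # delete s[i]
        best = min(best, 1 + d(i, j - 1))      # delete s[j-1]
        if j - i >= 2:
            # pair s[i] with s[j-1], substituting whichever endpoints mismatch
            cost = (s[i] != '(') + (s[j - 1] != ')')
            best = min(best, cost + d(i + 1, j - 1))
        for k in range(i + 1, j):              # try every split point
            best = min(best, d(i, k) + d(k, j))
        return best
    return d(0, len(s))
```

The three cases (delete an endpoint, pair the endpoints, split) visit O(n^2) substrings with O(n) work each, which is exactly the cubic behaviour the paper improves upon for general grammars.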

27 citations


Journal ArticleDOI
TL;DR: CDGs represent a class of completely lexicalized dependency grammars that express both projective and non-projective dependencies and generate non-context-free languages.

26 citations


01 Apr 2015
TL;DR: S-graph grammars are introduced, a new grammar formalism for computing graph-based semantic representations that uses graphs as semantic representations in a way that is consistent with more classical views on semantic construction.
Abstract: We introduce s-graph grammars, a new grammar formalism for computing graph-based semantic representations. Semantically annotated corpora which use graphs as semantic representations have recently become available, and there have been a number of data-driven systems for semantic parsing that can be trained on these corpora. However, it is hard to map the linguistic assumptions of these systems onto more classical insights on semantic construction. S-graph grammars use graphs as semantic representations, in a way that is consistent with more classical views on semantic construction. We illustrate this with a number of hand-written toy grammars, sketch the use of s-graph grammars for data-driven semantic parsing, and discuss formal aspects.

25 citations


Posted Content
TL;DR: It is proved that the SyGuS problem is undecidable for the theory of equality with uninterpreted functions (EUF) and for a very simple bit-vector theory with concatenation, both for context-free grammars and for tree grammars.
Abstract: Syntax-guided synthesis (SyGuS) is a recently proposed framework for program synthesis problems. The SyGuS problem is to find an expression or program generated by a given grammar that meets a correctness specification. Correctness specifications are given as formulas in suitable logical theories, typically amongst those studied in satisfiability modulo theories (SMT). In this work, we analyze the decidability of the SyGuS problem for different classes of grammars and correctness specifications. We prove that the SyGuS problem is undecidable for the theory of equality with uninterpreted functions (EUF). We identify a fragment of EUF, which we call regular-EUF, for which the SyGuS problem is decidable. We prove that this restricted problem is EXPTIME-complete and that the sets of solution expressions are precisely the regular tree languages. For theories that admit a unique, finite domain, we give a general algorithm to solve the SyGuS problem on tree grammars. Finite-domain theories include the bit-vector theory without concatenation. We prove SyGuS undecidable for a very simple bit-vector theory with concatenation, both for context-free grammars and for tree grammars. Finally, we give some additional results for linear arithmetic and bit-vector arithmetic along with a discussion of the implications of these results.
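The spirit of syntax-guided synthesis can be sketched with a naive enumerator: generate expressions from a small grammar and return the first one meeting the specification. The grammar, depth bound, and example-based "specification" below are all invented for illustration; real SyGuS solvers check candidates against logical specifications via an SMT solver rather than input/output examples.

```python
import itertools

def expressions(depth):
    """Yield (description, function) pairs for the toy grammar
    E ::= x | 1 | E + E | E * E, up to the given nesting depth."""
    if depth == 0:
        yield ("x", lambda x: x)
        yield ("1", lambda x: 1)
        return
    yield from expressions(depth - 1)
    subs = list(expressions(depth - 1))
    for (ld, lf), (rd, rf) in itertools.product(subs, subs):
        yield (f"({ld} + {rd})", lambda x, lf=lf, rf=rf: lf(x) + rf(x))
        yield (f"({ld} * {rd})", lambda x, lf=lf, rf=rf: lf(x) * rf(x))

def synthesize(examples, max_depth=2):
    """Return the first enumerated expression consistent with all examples."""
    for desc, f in expressions(max_depth):
        if all(f(x) == y for x, y in examples):
            return desc
    return None

# Ask for a term computing 2*x + 1 on the given examples:
result = synthesize([(0, 1), (1, 3), (5, 11)])
```

Even this brute-force loop shows why the grammar matters: it is the grammar, not the specification, that delimits the search space of candidate programs.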

18 citations


Journal ArticleDOI
TL;DR: This work provides a fully worked frameshift-aware, semiglobal DNA-protein alignment algorithm whose grammar is composed of products of small, atomic grammars; an embedding in Haskell as a domain-specific language makes the theory directly accessible for writing and using grammar products without the detour of an external compiler.
Abstract: We develop a theory of algebraic operations over linear and context-free grammars that makes it possible to combine simple “atomic” grammars operating on single sequences into complex, multi-dimensional grammars. We demonstrate the utility of this framework by constructing the search spaces of complex alignment problems on multiple input sequences explicitly as algebraic expressions of very simple one-dimensional grammars. In particular, we provide a fully worked frameshift-aware, semiglobal DNA-protein alignment algorithm whose grammar is composed of products of small, atomic grammars. The compiler accompanying our theory makes it easy to experiment with the combination of multiple grammars and different operations. Composite grammars can be written out in LaTeX for documentation and as a guide to implementation of dynamic programming algorithms. An embedding in Haskell as a domain-specific language makes the theory directly accessible to writing and using grammar products without the detour of an external compiler. Software and supplemental files available here: http://www.bioinf.uni-leipzig.de/Software/gramprod/

18 citations


Proceedings ArticleDOI
23 Oct 2015
TL;DR: This work considers from a practical perspective the problem of checking equivalence of context-free grammars, and proposes an algorithm for proving equivalence that is complete for LL grammars, yet can be invoked on any context-free grammar, including ambiguous grammars.
Abstract: We consider from a practical perspective the problem of checking equivalence of context-free grammars. We present techniques for proving equivalence, as well as techniques for finding counter-examples that establish non-equivalence. Among the key building blocks of our approach is a novel algorithm for efficiently enumerating and sampling words and parse trees from arbitrary context-free grammars; the algorithm supports polynomial time random access to words belonging to the grammar. Furthermore, we propose an algorithm for proving equivalence of context-free grammars that is complete for LL grammars, yet can be invoked on any context-free grammar, including ambiguous grammars. Our techniques successfully find discrepancies between different syntax specifications of several real-world languages, and are capable of detecting fine-grained incremental modifications performed on grammars. Our evaluation shows that our tool improves significantly on the available state-of-the-art tools. In addition, we used these algorithms to develop an online tutoring system for grammars that we then used in an undergraduate course on computer language processing. On questions involving grammar constructions, our system was able to automatically evaluate the correctness of 95% of the solutions submitted by students: it disproved 74% of cases and proved 21% of them.
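The counting idea that underlies polynomial-time random access to a grammar's words can be sketched for grammars in Chomsky normal form: count derivations by length for each nonterminal, then walk the counts to sample. The toy grammar below is hypothetical, and the sketch samples uniformly over derivation trees rather than words (the two coincide only for unambiguous grammars):

```python
from functools import lru_cache
import random

# Hypothetical toy grammar in Chomsky normal form: A -> B C or A -> 'a'.
CNF = {
    "S": [("A", "B"), ("B", "A")],
    "A": [("a",)],
    "B": [("b",), ("S", "B")],
}

@lru_cache(maxsize=None)
def count(nt, n):
    """Number of derivation trees of nt whose yield has length n."""
    total = 0
    for rule in CNF[nt]:
        if len(rule) == 1:                      # terminal rule
            total += (n == 1)
        else:
            b, c = rule
            for k in range(1, n):               # split the length
                total += count(b, k) * count(c, n - k)
    return total

def sample(nt, n, rng):
    """Sample a length-n word, uniformly over derivation trees of nt."""
    target = rng.randrange(count(nt, n))
    for rule in CNF[nt]:
        if len(rule) == 1:
            if n == 1:
                if target == 0:
                    return rule[0]
                target -= 1
        else:
            b, c = rule
            for k in range(1, n):
                ways = count(b, k) * count(c, n - k)
                if target < ways:
                    return sample(b, k, rng) + sample(c, n - k, rng)
                target -= ways
    raise AssertionError("unreachable")

word = sample("S", 4, random.Random(0))
```

Replacing the random `target` by a fixed index gives exactly the random-access primitive the abstract mentions: the i-th word of each length is addressable without enumerating its predecessors.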

18 citations


Journal ArticleDOI
TL;DR: This paper proposes a graph-grammar-based approach to the semantic analysis of model graphs and uses Rekers and Schürr’s Layered Graph Grammars, which may be regarded as a pure generalization of standard context-sensitive string grammars.
Abstract: In this paper, we present a method to convert a metamodel in the form of a UML class diagram into a context-sensitive graph grammar whose language comprises precisely the set of model graphs (UML object diagrams) that conform to the input metamodel. Compared to other approaches that deal with the same problem, we use a graph grammar formalism that does not employ any advanced graph grammar features, such as application conditions, precedence rules, and production schemes. Specifically, we use Rekers and Schürr's Layered Graph Grammars, which may be regarded as a pure generalization of standard context-sensitive string grammars. We show that elementary grammatical features, i.e., grammar labels and context-sensitive graph rewrite rules, suffice to represent metamodels with arbitrary multiplicities and inheritance. Inspired by attribute string grammars, we also propose a graph-grammar-based approach to the semantic analysis of model graphs.

17 citations


Journal ArticleDOI
TL;DR: Even though contextual hyperedge-replacement grammars are not context-free, they inherit several of the nice properties of hyperedge replacement grammars; in particular, their membership problem is in NP.
Abstract: Contextual hyperedge-replacement grammars (contextual grammars, for short) are an extension of hyperedge replacement grammars. They have recently been proposed as a grammatical method for capturing the structure of object-oriented programs, thus serving as an alternative to the use of meta-models like UML class diagrams in model-driven software design. In this paper, we study the properties of contextual grammars. Even though these grammars are not context-free, one can show that they inherit several of the nice properties of hyperedge replacement grammars. In particular, they possess useful normal forms and their membership problem is in NP.

16 citations


Journal ArticleDOI
TL;DR: It is shown that the mechanism for on-the-fly modification of syntax rules can be useful for defining grammars in a modular way, implementing almost all types of language composition in the context of specifying extensible languages.

14 citations


Journal ArticleDOI
TL;DR: This paper proposes a more general model, in which context specifications may be two-sided, that is, both the left and the right contexts can be specified by the corresponding operators.

Journal ArticleDOI
TL;DR: A new approach to model transformation development is proposed that simplifies the developed transformations and improves their quality by exploiting the languages' structures; it is shown that such transformations have important properties: they terminate and are sound, complete, and deterministic.

Posted Content
TL;DR: This paper investigates several milestones between the two extremes of language equivalence and grammar identity, and proposes a methodology for inconsistency management in grammar engineering.
Abstract: Relating formal grammars is a hard problem that balances between language equivalence (which is known to be undecidable) and grammar identity (which is trivial). In this paper, we investigate several milestones between those two extremes and propose a methodology for inconsistency management in grammar engineering. While conventional grammar convergence is a practical approach relying on human experts to encode differences as transformation steps, guided grammar convergence is a more narrowly applicable technique that infers such transformation steps automatically by normalising the grammars and establishing a structural equivalence relation between them. This allows us to perform a case study with automatically inferring bidirectional transformations between 11 grammars (in a broad sense) of the same artificial functional language: parser specifications with different combinator libraries, definite clause grammars, concrete syntax definitions, algebraic data types, metamodels, XML schemata, object models.

Proceedings ArticleDOI
13 Jan 2015
TL;DR: This work on the formalization of language theory proves formally, in the Agda dependently typed programming language, that each of the four transformations taking a context-free grammar to Chomsky normal form is correct in the sense of making progress toward normality and preserving the language of the given grammar.
Abstract: Every context-free grammar can be transformed into an equivalent one in the Chomsky normal form by a sequence of four transformations. In this work on formalization of language theory, we prove formally in the Agda dependently typed programming language that each of these transformations is correct in the sense of making progress toward normality and preserving the language of the given grammar. Also, we show that the right sequence of these transformations leads to a grammar in the Chomsky normal form (since each next transformation preserves the normality properties established by the previous ones) that accepts the same language as the given grammar. As we work in a constructive setting, soundness and completeness proofs are functions converting between parse trees in the normalized and original grammars.
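One of the four classical transformations, elimination of unit rules A → B, can be rendered in a few lines (an illustrative, unverified re-implementation with an invented toy grammar; the paper's contribution is the machine-checked Agda proof, not the algorithm itself):

```python
def eliminate_unit_rules(grammar):
    """grammar: dict mapping nonterminal -> list of right-hand sides (tuples).
    Returns an equivalent grammar with no rules of the form A -> B."""
    # unit_pairs[A] = all nonterminals reachable from A via chains of unit rules
    unit_pairs = {a: {a} for a in grammar}
    changed = True
    while changed:
        changed = False
        for a in grammar:
            for b in list(unit_pairs[a]):
                for rhs in grammar[b]:
                    if len(rhs) == 1 and rhs[0] in grammar and rhs[0] not in unit_pairs[a]:
                        unit_pairs[a].add(rhs[0])
                        changed = True
    # Copy every non-unit rule of B up to each A with A =>* B by unit rules.
    result = {}
    for a in grammar:
        result[a] = []
        for b in unit_pairs[a]:
            for rhs in grammar[b]:
                if not (len(rhs) == 1 and rhs[0] in grammar) and rhs not in result[a]:
                    result[a].append(rhs)
    return result

# Hypothetical grammar: S -> A | aS, A -> B | b, B -> c
G2 = eliminate_unit_rules({"S": [("A",), ("a", "S")],
                           "A": [("B",), ("b",)],
                           "B": [("c",)]})
```

The "progress toward normality" the abstract mentions is visible here: the output generates the same language but contains no unit rule, so the next transformation can rely on that invariant.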

Journal ArticleDOI
TL;DR: A family of formal grammars with an operator for referring to the left context of the substring being defined, as well as with a conjunction operation, is considered and proved to be computationally equivalent to an extension of one-way real-time cellular automata with an extra data channel.
Abstract: The paper considers a family of formal grammars that extends linear context-free grammars with an operator for referring to the left context of a substring being defined, as well as with a conjunction operation (as in linear conjunctive grammars). These grammars are proved to be computationally equivalent to an extension of one-way real-time cellular automata with an extra data channel. The main result is the undecidability of the emptiness problem for grammars restricted to a one-symbol alphabet, which is proved by simulating a Turing machine by a cellular automaton with feedback. The same construction proves the $\Sigma_2^0$-completeness of the finiteness problem for these grammars and automata.

Proceedings ArticleDOI
14 Dec 2015
TL;DR: It is established that the Watson-Crick regular grammars are closed under almost all of the main closure operations, while the differences between other Watson-Crick grammars and their corresponding Chomsky grammars depend on the computational power of the Watson-Crick grammars, which still needs to be studied.
Abstract: In this paper, we define Watson-Crick context-free grammars, as an extension of Watson-Crick regular grammars and Watson-Crick linear grammars with context-free grammar rules. We show the relation of Watson-Crick (regular and linear) grammars to the sticker systems, and study some of the important closure properties of the Watson-Crick grammars. We establish that the Watson-Crick regular grammars are closed under almost all of the main closure operations, while the differences between other Watson-Crick grammars and their corresponding Chomsky grammars depend on the computational power of the Watson-Crick grammars, which still needs to be studied.

Proceedings ArticleDOI
13 Jan 2015
TL;DR: This paper examines the class of Linearly Ordered Attribute Grammars (LOAGs), for which strict, bounded-size evaluators can be generated, and applies an augmenting dependency selection algorithm that allows determining membership in the class LOAG.
Abstract: Attribute Grammars (AGs) extend Context-Free Grammars with attributes: information gathered on the syntax tree that adds semantics to the syntax. AGs are very well suited for describing static analyses, code generation and other phases incorporated in a compiler. AGs are divided into classes based on the nature of the dependencies between the attributes. In this paper we examine the class of Linearly Ordered Attribute Grammars (LOAGs), for which strict, bounded size evaluators can be generated. Deciding whether an Attribute Grammar is linearly ordered is an NP-hard problem. The Ordered Attribute Grammars form a subclass of LOAG for which membership is tested in polynomial time by Kastens' algorithm (1980). On top of this algorithm we apply an augmenting dependency selection algorithm, allowing it to determine membership for the class LOAG. Although the worst-case complexity of our algorithm is exponential, the algorithm turns out to be efficient for practical full-sized AGs. As a result, we can compile the main AG of the Utrecht Haskell Compiler without the manual addition of augmenting dependencies. The reader is provided with insight in the difficulty of deciding whether an AG is linearly ordered, what optimistic choice is made by Kastens' algorithm and how augmenting dependencies can resolve these difficulties.
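The attribute flow itself is easy to sketch: synthesized attributes travel up the tree, inherited attributes travel down. The toy evaluator below is hypothetical (real AG systems generate such evaluators from declarative attribute equations, and scheduling those equations is exactly the ordering problem the paper studies); it computes one attribute of each kind on a small syntax tree:

```python
class Node:
    """A syntax-tree node carrying one inherited and one synthesized attribute."""
    def __init__(self, label, *children):
        self.label, self.children = label, children
        self.depth = None    # inherited: flows from parent to child
        self.size = None     # synthesized: flows from children to parent

def evaluate(node, depth=0):
    node.depth = depth                                   # inherited equation
    if not node.children:
        node.size = 1                                    # leaf base case
    else:
        for child in node.children:
            evaluate(child, depth + 1)
        node.size = sum(c.size for c in node.children)   # synthesized equation
    return node

tree = evaluate(Node("S", Node("NP", Node("she")),
                          Node("VP", Node("runs"))))
```

Here the evaluation order is trivial (one top-down pass suffices); the classes discussed in the paper characterize grammars whose attribute dependencies still admit such a statically fixed visit order.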

Journal ArticleDOI
TL;DR: A normal form is established for formal grammars equipped with operators for specifying the form of the context of a substring, in which extended left contexts are never used, whereas left contexts may be applied only to individual symbols, so that all rules are of the form A → …

Journal ArticleDOI
TL;DR: Using linear algebra and a branching analog of the classic Euler theorem, it is shown that, under the assumption that the terminal alphabet is fixed, the membership problem for regular grammars is in P, and that the equivalence problem for context-free grammars is in $\mathrm{\Pi_2^P}$.
Abstract: We consider commutative regular and context-free grammars, or, in other words, Parikh images of regular and context-free languages. By using linear algebra and a branching analog of the classic Euler theorem, we show that, under an assumption that the terminal alphabet is fixed, the membership problem for regular grammars (given v in binary and a regular commutative grammar G, does G generate v?) is in P, and that the equivalence problem for context-free grammars (do G_1 and G_2 generate the same language?) is in $\mathrm{\Pi_2^P}$.
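The commutative membership problem is very concrete: only the letter counts of a word matter, not their order. The brute-force check below is a toy, exponential-time illustration of the problem statement (the paper's point is that linear algebra solves it in polynomial time for a fixed alphabet); the language (ab)*, whose Parikh image is {(n, n) : n ≥ 0}, is an invented example:

```python
from collections import Counter
from itertools import product

def in_parikh_image(vector, alphabet, accepts):
    """Does some word with exactly these letter counts belong to the language?
    Naive check: enumerate every word of the right length."""
    n = sum(vector.values())
    return any(
        Counter(w) == vector and accepts("".join(w))
        for w in product(alphabet, repeat=n)
    )

def accepts_ab_star(w):
    """Membership in the regular language (ab)*."""
    return len(w) % 2 == 0 and all(w[i] == "ab"[i % 2] for i in range(len(w)))

ok  = in_parikh_image(Counter({"a": 2, "b": 2}), "ab", accepts_ab_star)
bad = in_parikh_image(Counter({"a": 2, "b": 1}), "ab", accepts_ab_star)
```

The exponential blow-up of this enumeration is precisely what makes the paper's polynomial-time bound for the fixed-alphabet case non-obvious.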

Book ChapterDOI
30 Jun 2015
TL;DR: In this paper, the authors describe a tool for intersecting context-free grammars for safety verification of recursive multi-threaded programs, using a refinement-based approach.
Abstract: This paper describes a tool for intersecting context-free grammars. Since this problem is undecidable the tool follows a refinement-based approach and implements a novel refinement which is complete for regularly separable grammars. We show its effectiveness for safety verification of recursive multi-threaded programs.

Journal ArticleDOI
TL;DR: It is described how Boolean grammars can improve programming language expressiveness and be used for agile parsing, and their potential for source transformation systems is discussed.

Book ChapterDOI
Ryo Yoshinaka1
02 Mar 2015
TL;DR: It is shown that conjunctive grammars are also learnable by a distributional learning technique; the resulting learner is stronger than that of Clark et al. and, in particular, learns every exact contextual binary feature grammar, which theirs does not.
Abstract: Approaches based on the idea generically called distributional learning have been making great success in the algorithmic learning of context-free languages and their extensions. We in this paper show that conjunctive grammars are also learnable by a distributional learning technique. Conjunctive grammars are context-free grammars enhanced with conjunctive rules to extract the intersection of two languages. We also compare our result with the closely related work by Clark et al. (JMLR 2010) on contextual binary feature grammars (cbfgs). Our learner is stronger than theirs. In particular our learner learns every exact cbfg, while theirs does not. Clark et al. emphasized the importance of exact cbfgs but they only conjectured there should be a learning algorithm for exact cbfgs. This paper shows that their conjecture is true.

Proceedings ArticleDOI
27 Jul 2015
TL;DR: 2D regular expressions and array grammars are introduced as generators; the authors reason about the theoretical capability of these constructs, develop practical use cases for their application in procedural content generation for games, and show that 2D regular expressions can be used for enumeration of all possible tilings that can be generated.
Abstract: Procedural content generation for games often uses tile sets. Tilings generated with tile sets are equivalent to pictures generated from a fixed alphabet of characters such as those explored in the area of vision. Formal languages over pictures and their methods of definition such as 2D regular expressions, automata, and array grammars are directly applicable to generation of tilings using finite tile sets. Though grammars such as string grammars, L-systems, and graph grammars have been explored and found useful for the definition of certain content, formal methods have mostly been ignored. We introduce 2D regular expressions and array grammars as generators. We reason about the theoretical capability of these constructs and develop some practical use cases for their application in procedural content generation for games. One capability lacking in search-based approaches to procedural content generation is an enumeration of all possible tilings that can be generated. We show that 2D regular expressions can be used for enumeration.
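The enumeration use case can be sketched directly: exhaustively generate every tiling of a small grid that satisfies local adjacency constraints. The tile set and constraint below are invented for illustration and stand in for the formally defined picture languages discussed in the paper:

```python
from itertools import product

def tilings(rows, cols, tiles, h_ok):
    """All row-major tilings of a rows x cols grid in which every pair of
    horizontal neighbours satisfies the predicate h_ok."""
    result = []
    for cells in product(tiles, repeat=rows * cols):
        grid = [cells[r * cols:(r + 1) * cols] for r in range(rows)]
        if all(h_ok(row[i], row[i + 1]) for row in grid for i in range(cols - 1)):
            result.append(grid)
    return result

# Hypothetical tile set {'G'rass, 'W'ater} where horizontal neighbours
# must match, so each row is uniform while rows vary independently.
same = lambda a, b: a == b
all_2x2 = tilings(2, 2, "GW", same)
```

Enumerating the language like this is exactly what a search-based generator cannot do on its own, which is the gap the paper fills with 2D regular expressions.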

Proceedings ArticleDOI
01 Dec 2015
TL;DR: It is shown that the family of arbitrary sticker languages, generated from arbitrary sticker systems, is included in the family of Watson-Crick linear languages, generated from Watson-Crick linear grammars.
Abstract: Deoxyribonucleic acid, popularly known as DNA, continues to inspire many theoretical computing models, such as sticker systems and Watson-Crick grammars. Sticker systems are the abstraction of ligation processes performed on DNA, while Watson-Crick grammars are models motivated by Watson-Crick finite automata and Chomsky grammars. Both of these theoretical models benefit from the Watson-Crick complementarity rule. In this paper, we establish results on the relationship between Watson-Crick linear grammars, which are included in Watson-Crick context-free grammars, and sticker systems. We show that the family of arbitrary sticker languages, generated from arbitrary sticker systems, is included in the family of Watson-Crick linear languages, generated from Watson-Crick linear grammars.

Proceedings ArticleDOI
10 Jan 2015
TL;DR: This position paper proposes to detect unspecified information with appropriate ontologies and exploits the descriptive power of constraints both for defining sentence acceptability and for inferring lexical knowledge from a word’s sentential context, even when foreign.
Abstract: Womb Grammars are a recently introduced constraint-based methodology for acquiring linguistic information on a given language from that of another, implemented in CHRG (Constraint Handling Rule Grammars). This is a position paper that discusses their possible adaptation to multilingual text parsing. In particular, we propose to detect unspecified information with appropriate ontologies. Our proposed methodology exploits the descriptive power of constraints both for defining sentence acceptability and for inferring lexical knowledge from a word’s sentential context, even when foreign.

Journal ArticleDOI
TL;DR: This paper describes a method of grammar regularization based on an algorithm for eliminating the left/right-hand side recursion of nonterminals, which ultimately converts a context-free grammar into a regular one.
Abstract: Regularization of translational context-free grammar via equivalent transformations is a mandatory step in developing a reliable processor of a formal language defined by this grammar. In the 1970s, multi-component oriented graphs with basic equivalent transformations were proposed to represent a formal grammar of ALGOL-68 in a compiler for IBM/360 compatibles. This paper describes a method of grammar regularization with the help of an algorithm of eliminating the left/right-hand side recursion of nonterminals which ultimately converts a context-free grammar into a regular one. The algorithm is based on special equivalent transformations of the grammar syntactic graph: elimination of recursions and insertion of iterations. When implemented in the system SynGT, it has demonstrated over 25% reduction of the memory size required to store the respective intermediate control tables, compared to the algorithm used in Flex/Bison parsers.
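The classical string-level version of the central step, eliminating immediate left recursion, is easy to state: rules A → Aα | β become A → βA′ and A′ → αA′ | ε. A minimal sketch of that textbook transformation (SynGT's graph-based transformations are more elaborate than this):

```python
def eliminate_left_recursion(nt, rules):
    """rules: list of right-hand sides (tuples of symbols) for nonterminal nt.
    Returns an equivalent rule set with no immediate left recursion."""
    recursive = [r[1:] for r in rules if r and r[0] == nt]   # the alphas
    other     = [r for r in rules if not r or r[0] != nt]    # the betas
    if not recursive:
        return {nt: rules}
    fresh = nt + "'"                                         # new nonterminal A'
    return {
        nt:    [beta + (fresh,) for beta in other],
        fresh: [alpha + (fresh,) for alpha in recursive] + [()],  # () = epsilon
    }

# E -> E + T | T   becomes   E -> T E',  E' -> + T E' | epsilon
Gnew = eliminate_left_recursion("E", [("E", "+", "T"), ("T",)])
```

Repeating this (together with the dual right-recursion case and iteration insertion, as the abstract describes) is what drives a grammar toward a regular form.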

Posted Content
TL;DR: This paper presents a formalization, using the Coq proof assistant, of the fact that general context-free grammars generate languages that can also be generated by simpler and equivalent context-free grammars.
Abstract: Context-free grammar simplification is a subject of high importance in computer language processing technology as well as in formal language theory. This paper presents a formalization, using the Coq proof assistant, of the fact that general context-free grammars generate languages that can be also generated by simpler and equivalent context-free grammars. Namely, useless symbol elimination, inaccessible symbol elimination, unit rules elimination and empty rules elimination operations were described and proven correct with respect to the preservation of the language generated by the original grammar.
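Two of the formalized simplification steps, removing non-productive symbols and then inaccessible ones, can be rendered executably. This is an illustrative Python sketch of the standard algorithms on an invented grammar; the paper's contribution is the machine-checked Coq proof that such operations preserve the generated language, not the algorithms themselves:

```python
def simplify(grammar, start, terminals):
    """Remove non-productive symbols, then symbols inaccessible from start."""
    # 1. Productive symbols: those that can derive a terminal string.
    productive = set(terminals)
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            if a not in productive and any(all(s in productive for s in r) for r in rhss):
                productive.add(a)
                changed = True
    g = {a: [r for r in rhss if all(s in productive for s in r)]
         for a, rhss in grammar.items() if a in productive}
    # 2. Accessible symbols: those reachable from the start symbol.
    accessible, frontier = {start}, [start]
    while frontier:
        a = frontier.pop()
        for r in g.get(a, []):
            for s in r:
                if s in g and s not in accessible:
                    accessible.add(s)
                    frontier.append(s)
    return {a: rhss for a, rhss in g.items() if a in accessible}

G2 = simplify({"S": [("a", "A"), ("B",)],
               "A": [("b",)],
               "B": [("B", "b")],    # non-productive: never derives a string
               "C": [("c",)]},       # productive but inaccessible from S
              "S", {"a", "b", "c"})
```

The order matters: removing non-productive symbols first may make further symbols inaccessible, which is why the formalization treats the operations separately and composes them.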

Proceedings ArticleDOI
21 May 2015
TL;DR: This paper investigates an alternative approach to inferring grammars via pattern languages and elementary formal system frameworks, summarizes inferability results for subclasses of both frameworks, and discusses how they map to the Chomsky hierarchy.
Abstract: Formal Language Theory for Security (LANGSEC) has proposed that formal language theory and grammars be used to define and secure protocols and parsers. The assumption is that by restricting languages to lower levels of the Chomsky hierarchy, it is easier to control and verify parser code. In this paper, we investigate an alternative approach to inferring grammars via pattern languages and elementary formal system frameworks. We summarize inferability results for subclasses of both frameworks and discuss how they map to the Chomsky hierarchy. Finally, we present initial results of pattern language learning on logged HTTP sessions and suggest future areas of research.
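Pattern languages are simple to demonstrate: a pattern mixes constant tokens with variables, and a string belongs to the language if some substitution of non-empty strings for the variables (the same string at every occurrence of a variable) yields it. Below is a small membership checker with an invented token syntax where variables start with `$` (a brute-force sketch; membership for pattern languages is hard in general, and the paper is concerned with *inferring* such patterns from data):

```python
def matches(pattern, s, bindings=None):
    """pattern: list of tokens; tokens starting with '$' are variables that
    must bind to the same non-empty string at every occurrence."""
    bindings = bindings or {}
    if not pattern:
        return s == ""
    head, rest = pattern[0], pattern[1:]
    if not head.startswith("$"):                  # constant token
        return s.startswith(head) and matches(rest, s[len(head):], bindings)
    if head in bindings:                          # already-bound variable
        v = bindings[head]
        return s.startswith(v) and matches(rest, s[len(v):], bindings)
    return any(                                   # try every non-empty binding
        matches(rest, s[i:], {**bindings, head: s[:i]})
        for i in range(1, len(s) + 1)
    )

# Pattern $x "-" $x : the same string repeated around a separator.
ok  = matches(["$x", "-", "$x"], "ab-ab")
bad = matches(["$x", "-", "$x"], "ab-ba")
```

The repeated-variable constraint is what lets patterns capture non-regular structure (such as repeated fields in a protocol message) while remaining far simpler than a full grammar.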

Posted Content
TL;DR: This paper presents a formalization, using the Coq proof assistant, of fundamental results related to context-free grammars and languages, including closure properties, grammar simplification, and the existence of a Chomsky Normal Form.
Abstract: Context-free language theory is a subject of high importance in computer language processing technology as well as in formal language theory. This paper presents a formalization, using the Coq proof assistant, of fundamental results related to context-free grammars and languages. These include closure properties (union, concatenation and Kleene star), grammar simplification (elimination of useless symbols, inaccessible symbols, empty rules and unit rules) and the existence of a Chomsky Normal Form for context-free grammars.

Book ChapterDOI
24 Nov 2015
TL;DR: This paper introduces another variant of P2DCFG that corresponds to "rightmost" rewriting in string context-free grammars and examines the effect of regulating the rewriting in an (r/d)P2DCFG by suitably adapting two well-known control mechanisms in string grammars, namely, control words and matrix control.
Abstract: Pure two-dimensional context-free grammar (P2DCFG) is a simple but effective non-isometric 2D grammar model to generate picture arrays. This 2D grammar uses only one kind of symbol as in a pure string grammar and rewrites in parallel all the symbols in a column or row by a set of context-free type rules. P2DCFG and a variant called (l/u)P2DCFG, which was recently introduced motivated by the "leftmost" rewriting mode in string context-free grammars, have been investigated for different properties. In this paper we introduce another variant of P2DCFG that corresponds to "rightmost" rewriting in string context-free grammars. The resulting grammar is called (r/d)P2DCFG and rewrites in parallel all the symbols only in the rightmost column or the lowermost row of a picture array by a set of context-free type rules. Unlike the case of string context-free grammars, the picture language families of P2DCFG and the two variants (l/u)P2DCFG and (r/d)P2DCFG are mutually incomparable, although they are not disjoint. We also examine the effect of regulating the rewriting in an (r/d)P2DCFG by suitably adapting two well-known control mechanisms in string grammars, namely, control words and matrix control.