
Showing papers on "Context-free grammar published in 1997"


Book ChapterDOI
01 Apr 1997
TL;DR: A tree generating system called tree-adjoining grammar (TAG) is described and a number of formal results have been established for TAGs, which are of interest to researchers in formal languages and automata, including those interested in tree grammars and tree automata.
Abstract: In this paper, we will describe a tree generating system called tree-adjoining grammar (TAG) and state some of the recent results about TAGs. The work on TAGs is motivated by linguistic considerations. However, a number of formal results have been established for TAGs which, we believe, would be of interest to researchers in formal languages and automata, including those interested in tree grammars and tree automata.

787 citations


Proceedings Article
Eugene Charniak1
27 Jul 1997
TL;DR: A parsing system based upon a language model for English that is, in turn, based upon assigning probabilities to possible parses for a sentence that outperforms previous schemes is described.
Abstract: We describe a parsing system based upon a language model for English that is, in turn, based upon assigning probabilities to possible parses for a sentence. This model is used in a parsing system by finding the parse for the sentence with the highest probability. This system outperforms previous schemes. As this is the third in a series of parsers by different authors that are similar enough to invite detailed comparisons but different enough to give rise to different levels of performance, we also report on some experiments designed to identify what aspects of these systems best explain their relative performance.
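The core idea of choosing the parse with the highest probability can be sketched with a toy probabilistic CFG and a CYK-style Viterbi chart. This is a minimal illustration, not Charniak's model; the grammar, words and probabilities below are invented for the example:

```python
import math
from collections import defaultdict

# Toy PCFG in Chomsky Normal Form (illustrative, not from the paper).
binary = {("S", ("NP", "VP")): 1.0,
          ("VP", ("V", "NP")): 1.0}
lexical = {("NP", "she"): 0.4, ("NP", "fish"): 0.6,
           ("V", "eats"): 1.0}

def best_parse_logprob(words):
    """CYK/Viterbi: log-probability of the most likely parse rooted at S."""
    n = len(words)
    chart = defaultdict(lambda: float("-inf"))  # (i, j, sym) -> best log-prob
    for i, w in enumerate(words):
        for (sym, word), p in lexical.items():
            if word == w:
                chart[(i, i + 1, sym)] = max(chart[(i, i + 1, sym)],
                                             math.log(p))
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # try every split point
                for (lhs, (b, c)), p in binary.items():
                    score = math.log(p) + chart[(i, k, b)] + chart[(k, j, c)]
                    if score > chart[(i, j, lhs)]:
                        chart[(i, j, lhs)] = score
    return chart[(0, n, "S")]
```

For "she eats fish" the best parse has probability 0.4 × 1.0 × 0.6 = 0.24; a full parser would also keep backpointers to recover the tree itself.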

582 citations


Book
01 Jan 1997
TL;DR: A monograph on contextual grammars, covering their origin and motivation, formal language theory prerequisites, generative capacity, linguistically relevant properties, and generalizations such as n-contextual grammars.
Abstract: 1. Origin and Motivation. 2. Formal Language Theory Prerequisites. 3. Contexts (Adjoining) Everywhere. 4. Basic Classes of Contextual Grammars. 5. Generative Capacity. 6. Language Theoretic Properties. 7. Linguistically Relevant Properties. 8. Grammars with Restricted Selection. 9. Grammars with Minimal/Maximal Use of Selectors. 10. Variants of Contextual Grammars. 11. Two-Level Contextual Grammars. 12. Regulated Contextual Grammars. 13. A Generalization: n-Contextual Grammars. 14. A Dual Model: Insertion Grammars. 15. Further Topics. 16. Open Problems and Research Topics. Bibliography. Subject Index.

179 citations


Journal ArticleDOI
TL;DR: A structural characterisation of the reachable markings of Petri nets in which every transition has exactly one input place is provided, and the reachability problem for this class is proved to be NP-complete.
Abstract: The paper provides a structural characterisation of the reachable markings of Petri nets in which every transition has exactly one input place. As a corollary, the reachability problem for this class is proved to be NP-complete. Further consequences are: the uniform word problem for commutative context-free grammars is NP-complete; weak-bisimilarity is semidecidable for Basic Parallel Processes.

151 citations


Book ChapterDOI
01 Jan 1997
TL;DR: This chapter discusses some of the most characteristic links between proof theory and formal grammars, aiming to persuade the reader of the generic unity of proof structures in appropriate deductive systems and of the syntactic and semantic structures generated by corresponding grammars.
Abstract: In the traditional sense of the term, "mathematical linguistics" is a branch of applied algebra mainly concerned with formal languages, formal grammars, and automata, the latter being purely computational devices that generate formal languages. A natural link between proof theory and semantics has been established by the constructive approaches in logic via the so-called "formulas-as-types" interpretation: typed lambda terms can be interpreted as formal proofs in natural deduction systems. This chapter discusses some of the most characteristic links between proof theory and formal grammars. It aims to persuade the reader of the generic unity of proof structures in appropriate deductive systems and of the syntactic and semantic structures generated by corresponding grammars. The chapter discusses some algebra connected with syntactic structures determined by proofs in the deductive part of grammars, and considers the algebraic models of the deductive systems underlying grammars. The algebraic models of logical systems are a traditional domain of metalogic. Substructural logics relevant to the theory of grammar give rise to special algebraic structures: residuated algebras.

102 citations


Posted Content
TL;DR: Tree Adjoining Grammar (TAG) is described as a system that arises naturally in the process of lexicalizing CFGs, and can be compared directly to a Meaning-Text Model (MTM).
Abstract: The central role of the lexicon in Meaning-Text Theory (MTT) and other dependency-based linguistic theories cannot be replicated in linguistic theories based on context-free grammars (CFGs). We describe Tree Adjoining Grammar (TAG) as a system that arises naturally in the process of lexicalizing CFGs. A TAG grammar can therefore be compared directly to a Meaning-Text Model (MTM). We illustrate this point by discussing the computational complexity of certain non-projective constructions, and suggest a way of incorporating locality of word-order definitions into the Surface-Syntactic Component of MTT.

85 citations


Journal ArticleDOI
TL;DR: Algorithms are provided to implement a YACC-like tool, embedded in the VLCC system, for the automatic compiler generation of visual languages described by positional grammars.
Abstract: The Visual Language Compiler-Compiler (VLCC) is a grammar-based graphical system for the automatic generation of visual programming environments. In this paper the theoretical and algorithmic issues of VLCC are discussed in detail. The parsing methodology we present is based on the "positional grammar" model. Positional grammars naturally extend context-free grammars by considering new relations in addition to string concatenation. Thanks to this, most of the results from LR parsing can be extended to positional grammars, which inherit the efficiency of the well-known LR technique. In particular, we provide algorithms to implement a YACC-like tool embedded in the VLCC system for the automatic compiler generation of visual languages described by positional grammars.

71 citations


Proceedings ArticleDOI
06 Oct 1997
TL;DR: The authors present an approach for generating the components of a software renovation factory; each component is generated from a context-free grammar definition that recognizes the code that has to be renovated.
Abstract: The authors present an approach for the generation of components for a software renovation factory. These components are generated from a context-free grammar definition that recognizes the code that has to be renovated. They generate analysis and transformation components that can be instantiated with a specific transformation or analysis task. They apply their approach to COBOL and discuss the construction of realistic software renovation components using the approach.

64 citations


Proceedings ArticleDOI
07 Jul 1997
TL;DR: Adapting a result on the complexity of ID/LP grammars to the dependency framework, the authors prove that recognition (and thus parsing) of linguistically adequate dependency grammars is NP-complete, filling an apparent gap in complexity results for dependency-based formalisms.
Abstract: Results of computational complexity exist for a wide range of phrase structure-based grammar formalisms, while there is an apparent lack of such results for dependency-based formalisms. We here adapt a result on the complexity of ID/LP-grammars to the dependency framework. Contrary to previous studies on heavily restricted dependency grammars, we prove that recognition (and thus, parsing) of linguistically adequate dependency grammars is NP-complete.

61 citations


Patent
30 Jun 1997
TL;DR: A method for automated generation of tests for software comprising the steps of establishing a set of formal generative requirements, establishing a set of formal constraining requirements, developing information from a test case structure and encoding it as a context-free grammar, and applying the generative and constraining requirements to indicate the tests to be included in a test suite.
Abstract: A method for automated generation of tests for software comprises the steps of establishing a set of formal generative requirements; establishing a set of formal constraining requirements; developing information from a test case structure and encoding the structure as a context free grammar; and applying the generative requirements and the constraining requirements to indicate tests to be included in a test suite.

58 citations


Book ChapterDOI
27 Feb 1997
TL;DR: A new concept of regular expression and context-free grammar for picture languages (sets of matrices over a finite alphabet) is introduced and these two formalisms are compared and connected.
Abstract: We introduce a new concept of regular expression and context-free grammar for picture languages (sets of matrices over a finite alphabet) and compare and connect these two formalisms.


Journal ArticleDOI
TL;DR: It is proved that the property of consistency is guaranteed for all SCFGs without restrictions, when the probability distributions are learned from the classical inside-outside and Viterbi algorithms, both of which are based on growth transformations.
Abstract: An important problem related to the probabilistic estimation of stochastic context-free grammars (SCFGs) is guaranteeing the consistency of the estimated model. This problem was considered by Booth-Thompson (1973) and Wetherell (1980) and studied by Maryanski (1974) and Chaudhuri et al. (1983) for unambiguous SCFGs only, when the probability distributions were estimated by the relative frequencies in a training sample. In this work, we extend this result by proving that the property of consistency is guaranteed for all SCFGs without restrictions, when the probability distributions are learned from the classical inside-outside and Viterbi algorithms, both of which are based on growth transformations. Other important probabilistic properties which are related to these results are also proven.
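Consistency here means that the grammar's probability mass over finite derivations sums to one. For a toy SCFG, the classical Booth-Thompson criterion can be checked numerically. The grammar below and the power-iteration estimator are illustrative sketches, not taken from the paper:

```python
import math

# Toy SCFG (illustrative rules):
#   S -> A A  (0.5) | 'a' (0.5)    one rewrite of S yields 1.0 A on average
#   A -> S    (0.3) | 'b' (0.7)    one rewrite of A yields 0.3 S on average
# Booth-Thompson criterion: the grammar is consistent when the spectral
# radius of the first-moment matrix M is below 1, where M[X][Y] is the
# expected number of Y symbols produced by one rewrite of X.
M = [[0.0, 1.0],   # row S: expected (S, A) progeny
     [0.3, 0.0]]   # row A

def spectral_radius(M, iters=200):
    """Estimate the spectral radius via the growth rate of ||M^k v||."""
    v = [1.0] * len(M)
    log_growth = 0.0
    for _ in range(iters):
        v = [sum(row[j] * v[j] for j in range(len(v))) for row in M]
        norm = max(map(abs, v))
        log_growth += math.log(norm)
        v = [x / norm for x in v]   # renormalize to avoid underflow
    return math.exp(log_growth / iters)
```

Here the radius is sqrt(0.3) ≈ 0.55 < 1, so the toy grammar is consistent: derivations terminate almost surely.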

Journal ArticleDOI
TL;DR: The Chomsky Conjecture is proved for both the full Lambek calculus and its product-free fragment, and a construction of context-free grammars involving only product-free types is presented.
Abstract: In this paper we prove the Chomsky Conjecture (all languages recognized by the Lambek calculus are context-free) for both the full Lambek calculus and its product-free fragment. For the latter case we present a construction of context-free grammars involving only product-free types.

Proceedings ArticleDOI
23 Apr 1997
TL;DR: A context-sensitive graph grammar called reserved graph grammar is presented which can explicitly, efficiently and completely describe the syntax of a wide range of diagrams using labeled graphs, and whose parsing algorithm is of polynomial time complexity in most cases.
Abstract: When implementing textual languages, formal grammars are commonly used to facilitate understanding languages and creating parsers. In the implementation of a diagrammatic visual programming language (VPL), this rarely happens, though graph grammars, with their well-established theoretical background, may be used as a natural and powerful syntax definition formalism. Yet all graph grammar parsing algorithms presented up to now are either unable to recognize interesting visual languages or tend to be hopelessly inefficient (with exponential time complexity) when applied to graphs with a large number of nodes and edges. The paper presents a context-sensitive graph grammar called reserved graph grammar which can explicitly, efficiently and completely describe the syntax of a wide range of diagrams using labeled graphs. Moreover, its parsing algorithm is of polynomial time complexity in most cases.

Proceedings ArticleDOI
01 Mar 1997
TL;DR: A collection of new and enhanced tools, written in Java, for experimenting with concepts in formal languages and automata theory, including JFLAP for creating and simulating finite automata, pushdown automata and Turing machines, and PumpLemma for proving that specific languages are not regular.
Abstract: We present a collection of new and enhanced tools for experimenting with concepts in formal languages and automata theory. New tools, written in Java, include JFLAP for creating and simulating finite automata, pushdown automata and Turing machines; Pâte for parsing restricted and unrestricted grammars and transforming context-free grammars to Chomsky Normal Form; and PumpLemma for proving specific languages are not regular. Enhancements to previous tools LLparse and LRparse, instructional tools for parsing LL(1) and LR(1) grammars, include parsing LL(2) grammars, displaying parse trees, and parsing any context-free grammar with conflict resolution.
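One core step of the Chomsky Normal Form transformation that such tools perform is binarizing long right-hand sides. A minimal sketch of that single step follows; the rule representation and fresh-variable naming are illustrative, and epsilon- and unit-rule elimination (also needed for full CNF) are omitted:

```python
def binarize(rules):
    """Split rules whose right-hand side is longer than 2 into a chain of
    binary rules, introducing fresh nonterminals X1, X2, ..."""
    out, fresh = [], 0
    for lhs, rhs in rules:
        rhs = list(rhs)
        while len(rhs) > 2:
            fresh += 1
            new = f"X{fresh}"
            out.append((lhs, [rhs[0], new]))  # A -> B C D E becomes A -> B X1
            lhs, rhs = new, rhs[1:]           # ...then X1 -> C D E, recursively
        out.append((lhs, rhs))
    return out
```

For example, `S -> A B C D` becomes the chain `S -> A X1`, `X1 -> B X2`, `X2 -> C D`.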

01 Jan 1997
TL;DR: An evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples is described, using a genetic algorithm, with a fitness function derived from a minimum description length principle.
Abstract: This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a genetic algorithm, with a fitness function derived from a minimum description length principle. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. We provide details of our fitness function for grammars and present the results of a number of experiments in learning grammars for a range of formal languages. Keywords: grammatical inference, genetic algorithms, language modelling, formal languages, induction, minimum description length. Introduction: Grammatical inference (Gold 1978) is a fundamental problem in many areas of artificial intelligence and cognitive science, including speech and language processing, syntactic pattern recognition and automated programming. Although a wide variety of techniques for automated grammatical inference have been devised…

Book ChapterDOI
01 Apr 1997
TL;DR: The systematic investigation of natural languages by means of algebraic, combinatorial and set-theoretic models began in the 1950s concomitantly in Europe and the U.S.A.; the year 1957 seems to have marked the start of the development of analytical mathematical models of languages.
Abstract: The systematic investigation of natural languages by means of algebraic, combinatorial and set-theoretic models began in the 1950s concomitantly in Europe and the U.S.A. An important year in this respect seems to be 1957, when Chomsky published his pioneering book [6] concerning the new generative approach to syntactic structures, and some Russian mathematicians proposed a conceptual framework for the study of general morphological categories [10], of the category of grammatical case (A. N. Kolmogorov; see [74]), and of the category of part of speech [75], marking the start of the development of analytical mathematical models of languages.

01 Jan 1997
TL;DR: Given a context-free grammar (CFG) G and an integer n >= 0, the authors present an algorithm for generating strings of length n derivable from the grammar, such that all strings of length n are equally likely.
Abstract: Given a context-free grammar (CFG) G and an integer n >= 0, we present an algorithm for generating strings of length n derivable from the grammar, such that all strings of length n are equally likely. The algorithm requires a pre-processing stage which calculates the number of strings of length k <= n derivable from each suffix of each production of the grammar. This step requires O(n^2) time and O(n^2) space. The subsequent string generation step uses these counts to generate a string in O(n) time and O(n) space.
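The count-then-sample idea can be sketched for a grammar in Chomsky Normal Form. The toy grammar below (all strings over {a, b}) is unambiguous, so uniformity over derivations coincides with uniformity over strings; this is a simplified illustration, not the paper's suffix-based formulation:

```python
import random
from collections import defaultdict

# Toy unambiguous CNF grammar generating all strings over {a, b}:
binary = {"S": [("A", "S"), ("B", "S")]}         # S -> A S | B S
terminals = {"S": ["a", "b"], "A": ["a"], "B": ["b"]}

def counts(n):
    """Pre-processing: c[A][k] = number of length-k strings derivable from A."""
    c = defaultdict(lambda: defaultdict(int))
    for A, ts in terminals.items():
        c[A][1] = len(ts)
    for k in range(2, n + 1):
        for A, rules in binary.items():
            for B, C in rules:
                for j in range(1, k):            # split point of the span
                    c[A][k] += c[B][j] * c[C][k - j]
    return c

def sample(A, k, c, rng=random):
    """Draw a length-k string from A, uniformly over derivations (hence
    uniformly over strings when the grammar is unambiguous)."""
    if k == 1:
        return rng.choice(terminals[A])
    # Pick a rule and split point with probability proportional to its count.
    r = rng.randrange(c[A][k])
    for B, C in binary[A]:
        for j in range(1, k):
            w = c[B][j] * c[C][k - j]
            if r < w:
                return sample(B, j, c, rng) + sample(C, k - j, c, rng)
            r -= w
```

Here `c["S"][k]` is 2^k, and each of the 2^k strings of length k is drawn with equal probability.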

01 Jan 1997
TL;DR: This chapter contains sections titled: Motivation, The Approximation Method, Formal Properties, Implementation and Example, Informal Analysis, Related Work and Conclusions, Appendix—APSG formalism and example.
Abstract: This chapter contains sections titled: Motivation, The Approximation Method, Formal Properties, Implementation and Example, Informal Analysis, Related Work and Conclusions, Appendix—APSG formalism and example, Acknowledgments, References

Posted Content
TL;DR: In this paper, the authors propose quantum versions of finite-state and push-down automata, and regular and context-free grammars, and find analogs of classical theorems, including pumping lemmas, closure properties, rational and algebraic generating functions, and Greibach normal form.
Abstract: To study quantum computation, it might be helpful to generalize structures from language and automata theory to the quantum case. To that end, we propose quantum versions of finite-state and push-down automata, and regular and context-free grammars. We find analogs of several classical theorems, including pumping lemmas, closure properties, rational and algebraic generating functions, and Greibach normal form. We also show that there are quantum context-free languages that are not context-free.

01 Jan 1997
TL;DR: This algorithm is an extension of Angluin's ID procedure to an incremental framework and is guaranteed to converge to a minimum state DFA corresponding to the target regular grammar.
Abstract: We present an efficient incremental algorithm for learning regular grammars from labeled examples and membership queries. This algorithm is an extension of Angluin's ID procedure to an incremental framework. The learning algorithm is intermittently provided with labeled examples and has access to a knowledgeable teacher capable of answering membership queries. Based on the observed examples and the teacher's responses to membership queries, the learner constructs a deterministic finite automaton (DFA) with which all examples observed thus far are consistent. When additional examples are observed, the learner modifies this DFA suitably to encompass the information provided by the new examples. In the limit this algorithm is guaranteed to converge to a minimum state DFA corresponding to the target regular grammar. We prove the convergence of this algorithm in the limit and analyze its time and space complexities.

Posted Content
TL;DR: A method is presented here for calculating finite-state approximations from context-free grammars that is essentially different from the algorithm introduced by Pereira and Wright (1991; 1996), is faster in some cases, and has the advantage of being open-ended and adaptable.
Abstract: Although adequate models of human language for syntactic analysis and semantic interpretation are of at least context-free complexity, for applications such as speech processing in which speed is important finite-state models are often preferred. These requirements may be reconciled by using the more complex grammar to automatically derive a finite-state approximation which can then be used as a filter to guide speech recognition or to reject many hypotheses at an early stage of processing. A method is presented here for calculating such finite-state approximations from context-free grammars. It is essentially different from the algorithm introduced by Pereira and Wright (1991; 1996), is faster in some cases, and has the advantage of being open-ended and adaptable.
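As a contrast to such methods, the crudest way to obtain a finite (hence regular) approximation is to bound recursion depth, which yields a subset of the CFG's language; the paper's algorithm instead builds a finite-state automaton, typically accepting a superset. A depth-bounding sketch with an illustrative toy grammar:

```python
# Crude finite approximation of a CFG by bounding recursion depth
# (a subset approximation; NOT the paper's algorithm).
grammar = {"S": [["a", "S", "b"], ["a", "b"]]}   # toy CFG for {a^n b^n}

def expand(symbol, depth):
    """All strings derivable from `symbol` using nesting depth <= depth."""
    if symbol not in grammar:            # terminal symbol
        return {symbol}
    if depth == 0:                       # recursion budget exhausted
        return set()
    out = set()
    for rhs in grammar[symbol]:
        prefixes = {""}                  # cross-product of the RHS expansions
        for s in rhs:
            prefixes = {p + w for p in prefixes for w in expand(s, depth - 1)}
        out |= prefixes
    return out
```

With depth 3 this yields {ab, aabb, aaabbb}; the resulting finite set can be compiled into an acceptor, at the cost of rejecting deeper nestings.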

Journal ArticleDOI
TL;DR: This paper investigates the concept of unconditional transfer within various forms of regulated grammars, such as programmed grammars, matrix grammars, grammars with regular control, grammars controlled by bicoloured digraphs, periodically time-variant grammars and variants thereof, especially regarding their descriptive capacity.
Abstract: In this paper, we investigate the concept of unconditional transfer within various forms of regulated grammars like programmed grammars, matrix grammars, grammars with regular control, grammars controlled by bicoloured digraphs, periodically time-variant grammars and variants thereof, especially regarding their descriptive capacity. In this way, we solve some problems from the literature. Furthermore, we correct a construction from the literature. Most of the results of the present paper have been announced in [11].

Journal Article
TL;DR: Non-constituent objects, complete-llnk and complete-sequence are defined as basic units of dependency structure, and the probabilities of them are reestimated in a reestimation and BFP algorithm for probabilistic dependency grummars (PDG).
Abstract: This paper presents a reesthnation algorithm and a best-first parsing (BFP) algorithm for probabilistic dependency grummars (PDG). The proposed reestimation algorithm is a variation of the inside-outside algorithm adapted to probabilistic dependency grammars. The inside-outside algorithm is a probabilistic parameter reestimation algorithm for phrase structure grammars in Chomsky Normal Form (CNF). Dependency grammar represents a sentence structure as a set of dependency links between arbitrary two words in the sentence, and can not be reestimated by the inside-outside algorithm directly. In this paper, non-constituent objects, complete-llnk and complete-sequence are defined as basic units of dependency structure, and the probabilities of them are reestimated. The reestimation and BFP algorithms utilize CYK-style chart and the nonconstituent objects as chart entries. Both algoritbrn~ have O(n s) time complexities. 1 I n t r o d u c t i o n There have been many efforts to induce grammars automatically from corpus by utilizing the vast amount of corpora with various degrees of annotations. Corpus-based, stochastic grammar induction has many profitable advantages such as simple acquisition and extension of linguistic knowledges, easy treatment of ambiguities by virtue of its innate scoring mechanism, and fail-soi~ reaction to ill-formed or extra-grammatical sentences. Most of corpus-based grammar inductions have concentrated on phrase structure gram° mars (Black, Lafferty, and Roukos, 1992, Lari and Young, 1990, Magerman, 1994). The typical works on phrase structure grammar induction are as follows(Lari and Young, 1990, Carroll, 1992b): (1) generating all the possible rules, (2) reestimating the probabilities of rules using the inside-outside algorithm, and (3) finally finding a stable grammar by eliminating the rules which have probability values close to 0. 
Generating all the rules is done by restricting the number of nonterminals and/or the number of the right hand side symbols in the rules and enumerating all the possible combinations. Chen extracts rules by some heuristics and reestimates the probabilities of rules using the inside-outside algorithm (Chen, 1995). The inside-outside algorithm learns a grammar by iteratively adjusting the rule probabilities to minimize the training corpus entropy. It is extensively used as reestimation algorithm for phrase structure grammars. Most of the works on phrase structure grammar induction, however, have partially succeeded. Estimating phrase structure grammars by minimizing the training corpus on-

Journal ArticleDOI
TL;DR: The family of recursively enumerable languages is characterized by scattered context grammars with four nonterminals, and by scattered context grammars with three nonterminals if these grammars start their derivations from a word rather than a symbol.
Abstract: The family of recursively enumerable languages is characterized by scattered context grammars with four nonterminals. Moreover, this family is characterized by scattered context grammars with three nonterminals if these grammars start their derivations from a word rather than a symbol. Three open problem areas are suggested.


Proceedings ArticleDOI
07 Jul 1997
TL;DR: A method is presented here for calculating finite-state approximations from context-free grammars that is essentially different from the algorithm introduced by Pereira and Wright (1991; 1996), is faster in some cases, and has the advantage of being open-ended and adaptable.
Abstract: Although adequate models of human language for syntactic analysis and semantic interpretation are of at least context-free complexity, for applications such as speech processing in which speed is important finite-state models are often preferred. These requirements may be reconciled by using the more complex grammar to automatically derive a finite-state approximation which can then be used as a filter to guide speech recognition or to reject many hypotheses at an early stage of processing. A method is presented here for calculating such finite-state approximations from context-free grammars. It is essentially different from the algorithm introduced by Pereira and Wright (1991; 1996), is faster in some cases, and has the advantage of being open-ended and adaptable.

Book
01 Jan 1997
TL;DR: A handbook volume surveying complexity from a language-theoretic point of view, parsing of context-free languages, grammars with controlled derivations, grammar systems, contextual grammars, language theory and molecular genetics, string editing, automata for matching patterns, symbolic dynamics, and language-theoretic aspects of cryptology.
Abstract: Contents of Volume 2: 1. Complexity: A Language-Theoretic Point of View. 2. Parsing of Context-Free Languages. 3. Grammars with Controlled Derivations. 4. Grammar Systems. 5. Contextual Grammars and Natural Languages. 6. Contextual Grammars and Formal Languages. 7. Language Theory and Molecular Genetics. 8. String Editing and Longest Common Subsequences. 9. Automata for Matching Patterns. 10. Symbolic Dynamics and Finite Automata. 11. Cryptology: Language-Theoretic Aspects.

Book ChapterDOI
25 Aug 1997
TL;DR: A Chomsky normal form theorem is proved for multiplicative valence grammars, thereby solving the open question of the existence of normal forms for unordered vector grammars and giving an alternative proof of the inclusion of context-free unordered vector languages in LOG(CFL).
Abstract: Valences are a very simple and yet powerful method of regulated rewriting. In this paper we give an overview of different aspects of this subject. We discuss closure properties of valence languages. It is shown that matrix grammars can be simulated by valence grammars over finite monoids. A Chomsky normal form theorem is proved for multiplicative valence grammars, thereby solving the open question of the existence of normal forms for unordered vector grammars. This also gives an alternative proof of the inclusion of context-free unordered vector languages in LOG(CFL). Moreover, we investigate valences in parallel systems, thereby solving some of the open problems posed in [5, p. 267].