
Showing papers on "Rule-based machine translation" published in 1987


Proceedings ArticleDOI
06 Jul 1987
TL;DR: The present paper describes a simple parsing algorithm for the "combinatory" extension of Categorial Grammars, which offers a uniform treatment for "spurious" syntactic ambiguities and the "genuine" structural ambiguities which any processor must cope with.
Abstract: There has recently been a revival of interest in Categorial Grammars (CG) among computational linguists. The various versions noted below which extend pure CG by including operations such as functional composition have been claimed to offer simple and uniform accounts of a wide range of natural language (NL) constructions involving bounded and unbounded "movement" and coordination "reduction" in a number of languages. Such grammars have obvious advantages for computational applications, provided that they can be parsed efficiently. However, many of the proposed extensions engender proliferating semantically equivalent surface syntactic analyses. These "spurious analyses" have been claimed to compromise their efficient parseability. The present paper describes a simple parsing algorithm for our own "combinatory" extension of CG. This algorithm offers a uniform treatment for "spurious" syntactic ambiguities and the "genuine" structural ambiguities which any processor must cope with, by exploiting the associativity of functional composition and the procedural neutrality of the combinatory rules of grammar in a bottom-up, left-to-right parser which delivers all semantically distinct analyses via a novel unification-based extension of chart-parsing.

69 citations
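
The mechanics behind the spurious-ambiguity problem can be seen in a few lines. The sketch below is an illustration of the phenomenon, not the paper's parsing algorithm: with only forward application and forward composition over toy categories, a single three-category input already receives two syntactically distinct but semantically equivalent derivations.

```python
# Toy combinatory categorial fragment. ("A", "B") encodes the category A/B.

def apply_(x, y):
    # Forward application: X/Y + Y => X
    if isinstance(x, tuple) and x[1] == y:
        return x[0]
    return None

def compose(x, y):
    # Forward composition: X/Y + Y/Z => X/Z
    if isinstance(x, tuple) and isinstance(y, tuple) and x[1] == y[0]:
        return (x[0], y[1])
    return None

def derivations(cats):
    # Naive enumeration of all binary-branching derivations.
    if len(cats) == 1:
        yield cats[0]
        return
    for i in range(1, len(cats)):
        for left in derivations(cats[:i]):
            for right in derivations(cats[i:]):
                for rule in (apply_, compose):
                    result = rule(left, right)
                    if result is not None:
                        yield result

# (A/B (B/C C)) by application twice, and ((A/B B/C) C) by composition then
# application: one result per derivation, both "A" -- a spurious ambiguity.
print(list(derivations([("A", "B"), ("B", "C"), "C"])))  # ['A', 'A']
```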


Proceedings Article
23 Aug 1987
TL;DR: The grammatical formalism developed for the task is a metagrammatical notation which is a more expressive and computationally tractable variant of Generalized Phrase Structure Grammar; a software system has also been implemented which provides a grammarian with an environment that the authors have found essential for rapid but successful production of a substantial grammar.
Abstract: Natural language grammars with large coverage are typically the result of many person-years of effort, working with clumsy formalisms and sub-optimal software support for grammar development. This paper describes our approach to the task of writing a substantial grammar, as part of a collaboration to produce a general purpose morphological and syntactic analyser for English. The grammatical formalism we have developed for the task is a metagrammatical notation which is a more expressive and computationally tractable variant of Generalized Phrase Structure Grammar. We have also implemented a software system which provides a highly integrated and very powerful set of tools for developing and managing a large grammar based on this notation. The system provides a grammarian with an environment which we have found to be essential for rapid but successful production of a substantial grammar.

67 citations



Proceedings ArticleDOI
07 Jan 1987
TL;DR: This work has suggested that a single unified formalism for specifying the syntax and semantics of the language is likely to result in a simpler, more robust implementation.
Abstract: Intuitively considered, a grammar is bidirectional if it can be used by processes of approximately equal computational complexity to parse and generate sentences of a language. Because we, as computational linguists, are concerned with the meaning of the sentences we process, a bidirectional grammar must specify a correspondence between sentences and meaning representations, and this correspondence must be represented in a manner that allows one to be computed from the other. Most research in computational linguistics has focused on one or the other of the two sides of the problem, with the result that relatively little attention has been given to the issues raised by the incorporation of a single grammar into a system for tasks of both comprehension and generation. Clearly, if it were possible to have truly bidirectional grammars in which both parsing and generation processes were efficient, there would be some compelling reasons for adopting them. First, Occam's razor suggests that, if language behavior can be explained by hypothesizing only one linguistic representation, such an explanation is clearly preferable to two that are applicable in complementary circumstances. Also, from the practical standpoint of designing systems that will carry on sophisticated dialogues with their users, a single unified formalism for specifying the syntax and semantics of the language is likely to result in a simpler, more robust implementation. The problems of maintaining consistency between comprehension and generation components when one of them changes have been eliminated. The lexicon is also simpler because its entries need be made but once, and there is no problem of maintaining consistency between different lexical entries for understanding and generation. It is obvious that not all grammars are bidirectional. The most fundamental requirement of any bidirectional grammar is that it be represented declaratively. If any information is represented procedurally, it must of necessity be represented differently for parsing and generation processes, resulting in an asymmetry between the two. Any change in the grammar would have to be made in two places to maintain the equivalence between the syntactic and semantic analyses given to sentences by each process. A grammar like DIAGRAM [8] is an example of a grammar for which the encoding of linguistic information

54 citations
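
The declarativeness requirement is easy to illustrate: keep the grammar as pure data and let both directions consult it. The toy sketch below (invented rules, not from the paper) uses one rule table for both generation and recognition, so a single edit to the rules keeps the two processes equivalent.

```python
# One declarative rule table shared by both processing directions.
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["mary"], ["john"]],
    "VP": [["sleeps"], ["sees", "NP"]],
}

def generate(symbol):
    # Enumerate every word sequence derivable from `symbol`.
    if symbol not in RULES:           # terminal
        yield [symbol]
        return
    for expansion in RULES[symbol]:
        yield from expand(expansion)

def expand(symbols):
    if not symbols:
        yield []
        return
    for head in generate(symbols[0]):
        for tail in expand(symbols[1:]):
            yield head + tail

def recognize(words, symbol="S"):
    # Same RULES, other direction (only adequate for this finite toy language).
    return words in list(generate(symbol))

print(list(generate("S")))
print(recognize(["mary", "sees", "john"]))  # True
```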


Proceedings Article
Fred Lakin
13 Jul 1987
TL;DR: The research described in this paper aims at spatially parsing expressions in formal visual languages to recover their underlying syntactic structure.
Abstract: In modern user interfaces, graphics play an important role in the communication between human and computer. When a person employs text and graphic objects in communication, those objects have meaning under a system of interpretation, or "visual language." Formal visual languages are ones which have been explicitly designed to be syntactically and semantically unambiguous. The research described in this paper aims at spatially parsing expressions in formal visual languages to recover their underlying syntactic structure. Such "spatial parsing" allows a general purpose graphics editor to be used as a visual language interface, giving the user the freedom to first simply create some text and graphics, and later have the system process those objects under a particular system of interpretation. The task of spatial parsing can be simplified for the interface designer/implementer through the use of visual grammars. For each of the four formal visual languages described in this paper, there is a specifiable set of spatial arrangements of elements for well-formed visual expressions in that language. Visual Grammar Notation is a way to describe those sets of spatial arrangements; the context-free grammars expressed in this notation are not only visual, but also machine-readable, and are used directly to guide the parsing.

49 citations
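
As a rough illustration of spatial parsing (the production and object types below are invented, not Lakin's Visual Grammar Notation), a grammar rule can constrain spatial relations between graphic objects rather than left-to-right adjacency:

```python
from dataclasses import dataclass

@dataclass
class Obj:
    kind: str            # e.g. "text" or "line"
    x: float
    y: float             # centre coordinates, y grows downward

def above(a, b, tol=5.0):
    # a sits above b in roughly the same column
    return a.y < b.y and abs(a.x - b.x) < tol

def parse_fraction(objects):
    # Production: FRACTION -> text ABOVE line ABOVE text
    for num in objects:
        for bar in objects:
            for den in objects:
                if (num.kind, bar.kind, den.kind) == ("text", "line", "text") \
                        and above(num, bar) and above(bar, den):
                    return ("FRACTION", num, bar, den)
    return None

scene = [Obj("text", 10, 0), Obj("line", 10, 10), Obj("text", 10, 20)]
print(parse_fraction(scene))
```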


Journal ArticleDOI
01 May 1987-Lingua
TL;DR: This paper argues that contemporary morphological theories are undermined by the concept of the linguistic sign, the morpheme or the word, which forms their basis, and that grammars must contain autonomous lexical and morphological components in order to have sufficient power to explain the independence of the sets of conditions on lexical, syntactic and morphological rules.

47 citations


Book ChapterDOI
01 Jan 1987
TL;DR: Two less self-evident claims are examined in this chapter: that different types of linguistic information, or different representational vocabularies, are called on at different points in the creation of a sentence's syntactic structure, and that these levels of processing are hierarchically organized, with no interaction between lower and higher levels.
Abstract: Typically, the production of speech involves the conversion of ideas into sounds. The ideas seem to precede the sounds. These truisms form the rudiments of two less self-evident claims to be examined in this chapter. The first is that different types of linguistic information, or different representational vocabularies, are called on at different points in the creation of a sentence’s syntactic structure. This is the levels-of-processing hypothesis. The second claim is that these levels of processing are hierarchically organized, with no interaction between lower and higher levels. This is the non-interaction hypothesis.

42 citations


Proceedings Article
01 Jan 1987
TL;DR: The efficiency and the potential for parallelism of various attribute grammar evaluation methods are studied, and the design of a combined evaluator, which seeks to combine the potential for concurrency of dynamic evaluators and the (sequential) efficiency of static evaluators, is outlined.
Abstract: Experiments with parallel compilation of programming languages are reported. In order to take advantage of the potential parallelism, the language translation process is expressed as an attribute grammar evaluation problem. Three primary benefits to using attribute grammars are noted. The efficiency and the potential for parallelism of various attribute grammar evaluation methods are studied, and the design of a combined evaluator, which seeks to combine the potential for concurrency of dynamic evaluators and the (sequential) efficiency of static evaluators, is outlined. The methods were used to generate a parallel compiler for a large Pascal subset.

34 citations
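
A minimal sketch of where the parallelism comes from: a synthesized attribute depends only on the attributes of a node's children, so disjoint subtrees can be evaluated concurrently. The toy evaluator below is an invented simplification, far from the paper's combined evaluator.

```python
from concurrent.futures import ThreadPoolExecutor

def eval_val(node):
    # node is a number (leaf) or (op, left, right); "val" is synthesized
    # bottom-up, so the two subtrees are independent units of work.
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    with ThreadPoolExecutor(max_workers=2) as pool:
        lv = pool.submit(eval_val, left)
        rv = pool.submit(eval_val, right)
        l, r = lv.result(), rv.result()
    return l + r if op == "+" else l * r

tree = ("+", ("*", 2, 3), ("*", 4, 5))
print(eval_val(tree))  # 26
```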



Journal ArticleDOI
TL;DR: It is argued that Paninian-style generative rules and meta-rules could assist in further advances in NLP.

30 citations


Journal ArticleDOI
TL;DR: Two ‘gap’ theorems are shown for languages formed with words that fail to be prefixes of an infinite word: such languages can never be described by unambiguous context-free grammars.

Proceedings ArticleDOI
06 Jul 1987
TL;DR: It is shown that one benefit of FUG, the ability to state global constraints on choice separately from syntactic rules, is difficult to achieve in generation systems based on augmented context-free grammars (e.g., Definite Clause Grammars).
Abstract: In this paper, we show that one benefit of FUG, the ability to state global constraints on choice separately from syntactic rules, is difficult to achieve in generation systems based on augmented context-free grammars (e.g., Definite Clause Grammars). These require that such constraints be expressed locally as part of syntactic rules and therefore duplicated in the grammar. Finally, we discuss a reimplementation of FUG that achieves levels of efficiency similar to those of Rubinoff's adaptation of MUMBLE, a deterministic language generator.
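
The operation that makes this separation possible in FUG is unification of feature structures: a global constraint is itself a feature structure, unified once against a candidate description instead of being copied into every rule. A minimal sketch with nested-dict feature structures and an invented agreement constraint:

```python
def unify(a, b):
    # Unify two feature structures (nested dicts); None signals failure.
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if a == b else None
    out = dict(a)
    for key, bval in b.items():
        if key in out:
            sub = unify(out[key], bval)
            if sub is None:
                return None
            out[key] = sub
        else:
            out[key] = bval
    return out

candidate  = {"cat": "S", "subj": {"num": "sg"}, "verb": {"num": "sg"}}
constraint = {"subj": {"num": "sg"}, "verb": {"num": "sg"}}  # stated once
print(unify(candidate, constraint) is not None)  # True: candidate admissible
```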

Proceedings Article
23 Aug 1987
TL;DR: The universal parser provides principled run-time integration of syntax and semantics, while preserving the generality of domain-independent syntactic grammars and language-independent domain knowledge bases; the optimized cross product is generated automatically in the precompilation phase.
Abstract: Machine translation should be semantically-accurate, linguistically-principled, user-interactive, and extensible to multiple languages and domains. This paper presents the universal parser architecture that strives to meet these objectives. In essence, linguistic knowledge bases (syntactic, semantic, lexical, pragmatic), encoded in theoretically-motivated formalisms such as lexical-functional grammars, are unified and precompiled into fast run-time grammars for parsing and generation. Thus, the universal parser provides principled run-time integration of syntax and semantics, while preserving the generality of domain-independent syntactic grammars and language-independent domain knowledge bases; the optimized cross product is generated automatically in the precompilation phase. Initial results for bi-directional English-Japanese translation show considerable promise, both in terms of demonstrating the theoretical feasibility of the approach and in terms of subsequent practical utility.

Proceedings Article
23 Aug 1987
TL;DR: Direct Memory Access Translation (DMTRANS) is a theory of translation developed at the Center for Machine Translation (CMT) of CMU in which translation is viewed as an integrated part of cognitive processing.
Abstract: Direct Memory Access Translation (DMTRANS) is a theory of translation developed at the CMT of CMU in which translation is viewed as an integrated part of cognitive processing. In this paradigm, understanding in the source language is the recognition of input in terms of existing knowledge in memory and the integration of that input into the memory. The context of sentences is established as what is left in memory after understanding previous sentences (or a preceding part of a sentence). Decisions made during translation are influenced by what is dynamically modified in memory through preceding recognitions. Since knowledge in memory is directly shared with the rest of cognition, other cognitive processes such as inference can dynamically participate in the translation process.
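
A toy sketch of the memory-driven idea (all names and data invented): each sentence is recognized against, and then integrated into, a shared memory, and a later lexical choice consults what earlier sentences left behind.

```python
memory = set()

def understand(sentence):
    # "Recognition" = matching tokens against memory; "integration" = adding them.
    recognized = [w for w in sentence.split() if w in memory]
    memory.update(sentence.split())
    return recognized

def translate_word(word, lexicon):
    # Target-word choice is influenced by what earlier understanding left in
    # memory, e.g. a domain cue that disambiguates "bank".
    for cue, target in lexicon[word]:
        if cue in memory:
            return target
    return lexicon[word][-1][1]  # default reading

lexicon = {"bank": [("river", "Ufer"), ("money", "Bank")]}
understand("they walked along the river")
print(translate_word("bank", lexicon))  # Ufer: context left in memory decides
```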

Proceedings ArticleDOI
01 Apr 1987
TL;DR: A project for machine translation of Czech computer manuals into Russian is described, first presenting the overall system structure and then concentrating mainly on input text preparation and a parsing algorithm based on a bottom-up parser programmed in Colmerauer's Q-systems.
Abstract: A project for machine translation of Czech computer manuals into Russian is described, first presenting the overall system structure and then concentrating mainly on input text preparation and a parsing algorithm based on a bottom-up parser programmed in Colmerauer's Q-systems.

Proceedings ArticleDOI
01 Apr 1987
TL;DR: A recent extension of the linguistic framework of the Rosetta system is discussed, which enables us to divide a grammar into subgrammars in a linguistically motivated way and to control explicitly the application of rules in a subgrammar.
Abstract: The paper discusses a recent extension of the linguistic framework of the Rosetta system. The original framework is elegant and has proved its value in practice, but it also has a number of deficiencies, of which the most salient is the impossibility of assigning an explicit structure to the grammars. This may cause problems, especially in a situation where large grammars have to be written by a group of people. The newly developed framework enables us to divide a grammar into subgrammars in a linguistically motivated way and to control explicitly the application of rules in a subgrammar. It also enables us to divide the set of grammar rules into rule classes in such a way that we get hold of the more difficult translation relations. The use of both these divisions naturally leads to a highly modular structure of the system, which helps in controlling its complexity. We will show that these divisions also give insight into a class of difficult translation problems in which there is a mismatch of categories.

Proceedings Article
13 Jul 1987
TL;DR: The UNITRAN system relies on principle-based descriptions of grammar rather than rule-oriented descriptions, and is based on linguistically motivated principles and their associated parameters of variation.
Abstract: Machine translation has been a particularly difficult problem in the area of Natural Language Processing for over two decades. Early approaches to translation failed in part because interaction effects of complex phenomena made translation appear to be unmanageable. Later approaches to the problem have succeeded but are based on many language-specific rules. To capture all natural language phenomena, rule-based systems require an overwhelming number of rules; thus, such translation systems either have limited coverage or poor performance due to formidable grammar size. This paper presents an implementation of an "interlingual" approach to natural language translation. The UNITRAN system relies on principle-based descriptions of grammar rather than rule-oriented descriptions. The model is based on linguistically motivated principles and their associated parameters of variation. Because a few principles cover all languages, the unmanageable grammar size of alternative approaches is no longer a problem.
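
The principles-and-parameters idea can be caricatured in a few lines: one phrase-building principle, together with a per-language head-direction parameter, stands in for many construction-specific rules. The sketch below is an invented simplification, not UNITRAN's actual principle set.

```python
PARAMS = {"english": "head-initial", "japanese": "head-final"}

def order(head, complement, language):
    # One principle: a phrase is head plus complement; the parameter fixes order.
    if PARAMS[language] == "head-initial":
        return [head, complement]
    return [complement, head]

print(order("read", "books", "english"))   # ['read', 'books']
print(order("yomu", "hon-o", "japanese"))  # ['hon-o', 'yomu']
```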


Journal ArticleDOI
TL;DR: A formal model of the mental representation of task languages is presented that allows the representation of family resemblances between individual task-action mappings and makes predictions about the relative learnability of different task language designs.
Abstract: We present a formal model of the mental representation of task languages. The model is a metalanguage for defining task-action grammars: generative grammars which rewrite simple tasks into action specifications. Important features of the model are: (1) identification of the "simple tasks" that users can perform routinely and which require no control structure; (2) representation of simple tasks by collections of semantic components reflecting a categorisation of the task world; (3) marking of tokens in rewrite rules with the semantic features of the task world to supply selection restrictions on the rewriting of simple tasks into action specifications. This device allows the representation of family resemblances between individual task-action mappings. Simple complexity metrics over task-action grammars make predictions about the relative learnability of different task language designs. Some empirical support for these predictions is derived from the existing empirical literature on command language learning and from two unreported experiments. Task-action grammars also provide designers with an analytic tool for exposing the configural properties of task languages.
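
A hypothetical sketch of the central device: a simple task is a bundle of semantic components, and one rule schema keyed on those features rewrites a whole family of tasks into action specifications. The task set and keybindings below are invented for illustration.

```python
TASKS = {
    "move-forward":  {"direction": "forward",  "unit": "char"},
    "move-backward": {"direction": "backward", "unit": "char"},
}

KEY_FOR = {("forward", "char"): "f", ("backward", "char"): "b"}

def task_to_action(task):
    # Rule schema: Task[direction, unit] -> press CTRL + key(direction, unit).
    # The shared schema captures the family resemblance between the mappings.
    feats = TASKS[task]
    return "CTRL-" + KEY_FOR[(feats["direction"], feats["unit"])]

print(task_to_action("move-forward"))   # CTRL-f
print(task_to_action("move-backward"))  # CTRL-b
```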

Patent
25 Jun 1987
TL;DR: A translation machine system capable of displaying the original input sentence and its translation includes a device which optionally splits a sentence into multiple sentences and a device which connects multiple sentences into a single sentence.
Abstract: A translation machine system capable of displaying the original input sentence and the sentence translated from the original sentence includes a device which optionally splits a sentence into multiple sentences and a device which connects multiple sentences into a single sentence.

Proceedings Article
L. J. De Haas
23 Aug 1987
TL;DR: A system is described that can generate programs for machine vision systems which have to measure a number of parameters of an industrial object in a camera image, using context-free attribute grammars.
Abstract: A system is described which can generate programs for machine vision systems which have to measure a number of parameters of an industrial object in a camera image. Programs are generated starting from descriptive object models. The object models used are context-free attribute grammars, hence the generated programs are parsers. Errors in generated programs, caused by using inaccurate models, are screened by comparing the measurements produced by generated programs with values of the desired parameters provided for example images.

Journal ArticleDOI
TL;DR: A linguistic model for two system elements, the knowledge base and the recognition strategy, is outlined; an extension of pattern recognition approaches, it consists of a powerful object specification language and a simultaneous syntactic-semantic analysis in this language.

Proceedings ArticleDOI
01 Apr 1987
TL;DR: From the linguistic point of view, this work tries to elucidate the complexity of the inflectional system using a lexical model which follows the recent work of Lieber (1980), Selkirk (1982), Kiparsky (1982), and others.
Abstract: In this paper, we present a morphological processor for Modern Greek. From the linguistic point of view, we try to elucidate the complexity of the inflectional system using a lexical model which follows the recent work of Lieber (1980), Selkirk (1982), Kiparsky (1982), and others. The implementation is based on the concept of "validation grammars" (Courtin 1977). The morphological processing is controlled by a finite automaton, and it combines (a) a dictionary containing the stems for a representative fragment of Modern Greek and all the inflectional affixes with (b) a grammar which carries out the transmission of the linguistic information needed for the processing. Words are structured by concatenating a stem with an inflectional part. In certain cases, phonological rules are added to the grammar in order to capture lexical phonological phenomena.
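
The dictionary-plus-grammar control scheme can be sketched as follows (toy entries, not the paper's Greek lexicon or its validation grammar): segment a word into stem plus inflectional affix and accept the analysis only if the affix class is licensed by the stem's paradigm.

```python
STEMS   = {"grammat": "neuter-3", "anthrop": "masc-2"}   # stem -> paradigm
AFFIXES = {"a": {"neuter-3"}, "os": {"masc-2"}, "ou": {"neuter-3", "masc-2"}}

def analyse(word):
    # Two-step control, like a small automaton: stem state, then affix state,
    # accepting only when the grammatical information agrees.
    for i in range(1, len(word)):
        stem, affix = word[:i], word[i:]
        if stem in STEMS and affix in AFFIXES and STEMS[stem] in AFFIXES[affix]:
            yield (stem, affix, STEMS[stem])

print(list(analyse("grammata")))   # [('grammat', 'a', 'neuter-3')]
print(list(analyse("anthropos")))  # [('anthrop', 'os', 'masc-2')]
```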

01 May 1987
TL;DR: The semantic interface between a systemic functional grammar for text generation and the environment the grammar operates in is described; the framework is presented as a kind of semantics for systemic grammars and related to other semantic approaches.
Abstract: This report describes the semantic interface between a systemic functional grammar for text generation and the environment the grammar operates in. The grammar is organized as a network of choice points, and the semantic interface provides a method for making the grammatical choices in a purposeful way. Each grammatical choice point is equipped with its own semantic procedure for choosing: one or more questions are addressed to one of the components of the environment, such as the knowledge base, so that the information needed to select the appropriate choice alternative can be obtained. The paper presents the framework as a kind of semantics for systemic grammars and also relates it to other semantic approaches. Keywords: Artificial intelligence, Choice experts, Choosers, Text generation, Inquiry semantics, Natural language, Nigel, Penman, Systemic semantics.
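
A minimal sketch of a chooser (the inquiries and knowledge base are invented): each grammatical choice point owns a procedure that puts questions to the environment and returns the alternative to select.

```python
KNOWLEDGE_BASE = {"event_time": "past"}

def number_inquiry(entity):
    # Inquiry addressed to the represented entity rather than the grammar.
    return "plural" if entity.get("count", 1) > 1 else "singular"

CHOOSERS = {
    # choice point -> procedure from environment to grammatical alternative
    "NUMBER": lambda env, entity: number_inquiry(entity),
    "TENSE":  lambda env, entity: env["event_time"],
}

def choose(choice_point, env, entity):
    return CHOOSERS[choice_point](env, entity)

print(choose("NUMBER", KNOWLEDGE_BASE, {"count": 3}))  # plural
print(choose("TENSE",  KNOWLEDGE_BASE, {}))            # past
```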

Journal ArticleDOI
01 Sep 1987-Robotics
TL;DR: The concept of programmed grammars is generalized to two-dimensional programmed grammars, and the results may have useful applications in picture description, region filling, visual languages, pattern recognition, robotics, and artificial intelligence.

Journal ArticleDOI
TL;DR: This article found that the combination of experts' judgments and syntactic information about the sentence on which the item was based predicted difficulty better than either judgments or syntactic information alone.
Abstract: An earlier investigation (Bejar, 1983) had argued that experts' judgment of item difficulty could perhaps be usefully supplemented with linguistic information about the sentence from which the item was derived. To investigate that idea, we analyzed items from the earlier study to determine their syntactic structure. Three potential independent variables were studied by themselves and in conjunction with subject-matter ratings. The analysis suggested that the combination of experts' judgments and syntactic information about the sentence on which the item was based collectively predicted difficulty better than either judgments or syntactic information alone. Moreover, the proportion of variance in item difficulty accounted for by the judgments and syntactic information together was 31%.
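
The underlying analysis is ordinary multiple regression. The example below runs on synthetic data with invented coefficients, purely to show the mechanics of combining expert ratings with a syntactic predictor and reading off the proportion of variance accounted for (R^2).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
rating = rng.normal(size=n)            # expert judgment of difficulty
depth  = rng.normal(size=n)            # a syntactic measure, e.g. parse depth
difficulty = 0.5 * rating + 0.4 * depth + rng.normal(scale=0.8, size=n)

X = np.column_stack([np.ones(n), rating, depth])   # intercept + two predictors
beta, *_ = np.linalg.lstsq(X, difficulty, rcond=None)
pred = X @ beta
r2 = 1 - ((difficulty - pred) ** 2).sum() / ((difficulty - difficulty.mean()) ** 2).sum()
print(f"R^2 with both predictors: {r2:.2f}")
```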

Book ChapterDOI
01 Jan 1987
TL;DR: From the very beginning of the modern study of generative grammar, there has been another "functional" motivation that has been used — though less frequently — to constrain the class of possible grammars: the demand of parsability or, in its dual sense, of generability.
Abstract: A principal goal of modern linguistic theory has been to formulate constraints on grammars so as to explain how it is that linguistic knowledge can be acquired. The aim has been to narrow the class of possible grammars so that it is as small as possible, consistent with observed variation in natural grammars. But from the very beginning of the modern study of generative grammar, there has been another "functional" motivation that has been used — though less frequently — to constrain the class of possible grammars. This is the demand of parsability or (in its dual sense) of generability. For example, we might require that natural grammars be amenable to "easy" recognition or generation, in some sense. This demand has actually been explicit since the earliest days of the field, as the following quote from Chomsky's Morphophonemics of Modern Hebrew [1951] indicates: The criteria of simplicity governing the ordering of statements is as follows: that the shorter grammar is the simpler, and that among equally short grammars, the simplest is that in which the average length of derivation of sentences is least.


Book ChapterDOI
Steven Cushing
01 Jan 1987
TL;DR: This paper concerns itself with the second of these two questions, the expressive power of language (the semantic contents that can or cannot be conveyed by a sentence, word, or phrase), and with two explanatory principles that appear to provide at least the beginnings of an answer.
Abstract: Contemporary linguistic theory has concerned itself, to a large extent, with the question of which possible grammars are capable of serving as the actual grammars of natural languages. Considerably less attention has been given to the cognate question of which possible meanings are capable of serving as meanings actually expressed by natural languages. The first question deals with the so-called generative power of language, the grammatical structures that can or cannot relate the constituents that comprise a sentence, word, or phrase. The second question deals with the expressive power of language, the semantic contents that can or cannot be conveyed by a sentence, word, or phrase. In this paper, we will concern ourselves with the second of these two questions and, in particular, with two explanatory principles that appear to be helpful in providing it with at least the beginnings of an answer.

Proceedings ArticleDOI
01 Apr 1987
TL;DR: The system described recognizes function words and uses a set of syntactic constraints to estimate which words are likely to form phrases; this paper is the first to report on parsing details specifically for synthesis while using only a small dictionary (of about 300 words).
Abstract: In automatic synthesis of speech from English text, the quality of the output speech is highly dependent upon realistic intonation patterns. Most synthesizers have difficulty obtaining sufficient linguistic information from an input text to specify intonation properly. The syntactic structure of the text often specifies where a speaker should pause and which words to stress. However, the problem of parsing natural English is as yet unsolved. The problem is further complicated in systems which may wish to limit memory space and access time by minimizing dictionary size. In most other references, the parsing problem is only mentioned in passing, or parsing occurs on a local basis, ignoring important syntactic structures that encompass the entire sentence. The system described here recognizes function words and uses a set of syntactic constraints to estimate which words are likely to form phrases. This paper is the first to report on parsing details specifically for synthesis, while using only a small dictionary (of about 300 words).
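
The function-word strategy can be sketched simply (a tiny invented word list, not the paper's 300-word dictionary): detect function words and start a new phrase-sized chunk at each one, yielding candidate sites for pauses and stress.

```python
FUNCTION_WORDS = {"the", "a", "of", "in", "on", "to", "and", "is", "that"}

def chunk(sentence):
    # Open a new phrase whenever a function word follows content words.
    phrases, current = [], []
    for word in sentence.lower().split():
        if word in FUNCTION_WORDS and current and current[-1] not in FUNCTION_WORDS:
            phrases.append(current)
            current = []
        current.append(word)
    if current:
        phrases.append(current)
    return phrases

print(chunk("The quality of the output speech depends on the intonation"))
# [['the', 'quality'], ['of', 'the', 'output', 'speech', 'depends'],
#  ['on', 'the', 'intonation']]
```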