
Showing papers in "Computational Linguistics in 2012"


Journal ArticleDOI
TL;DR: The REG problem is introduced and early work in this area is described, discussing what basic assumptions lie behind it, and showing how its remit has widened in recent years.
Abstract: This article offers a survey of computational research on referring expression generation (REG). It introduces the REG problem and describes early work in this area, discussing what basic assumptions lie behind it, and showing how its remit has widened in recent years. We discuss computational frameworks underlying REG, and demonstrate a recent trend that seeks to link REG algorithms with well-established Knowledge Representation techniques. Considerable attention is given to recent efforts at evaluating REG algorithms and the lessons that they allow us to learn. The article concludes with a discussion of the way forward in REG, focusing on references in larger and more realistic settings.

352 citations


Journal ArticleDOI
TL;DR: A linguistic-oriented computational model is put forward which has at its core an algorithm articulating the effect of factuality relations across levels of syntactic embedding, implemented in De Facto, a factuality profiler for eventualities mentioned in text and tested against a corpus built specifically for the task.
Abstract: Identifying the veracity, or factuality, of event mentions in text is fundamental for reasoning about eventualities in discourse. Inferences derived from events judged as not having happened, or as being only possible, are different from those derived from events evaluated as factual. Event factuality involves two separate levels of information. On the one hand, it deals with polarity, which distinguishes between positive and negative instantiations of events. On the other, it has to do with degrees of certainty (e.g., possible, probable), an information level generally subsumed under the category of epistemic modality. This article aims at contributing to a better understanding of how event factuality is articulated in natural language. For that purpose, we put forward a linguistic-oriented computational model which has at its core an algorithm articulating the effect of factuality relations across levels of syntactic embedding. As a proof of concept, this model has been implemented in De Facto, a factuality profiler for eventualities mentioned in text, and tested against a corpus built specifically for the task, yielding an F1 of 0.70 (macro-averaging) and 0.80 (micro-averaging). These two measures mutually compensate for each other's over-emphasis (on the less and the more populated categories, respectively), and can therefore be interpreted as the lower and upper bounds of De Facto's performance.
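
As a reminder of how the two averaging schemes behave (standard definitions, not taken from the article itself), macro-averaged F1 weights every factuality category equally, so sparsely populated categories pull it down, while micro-averaged F1 pools the counts and is dominated by the frequent categories:

```latex
% Standard macro- vs. micro-averaged F1 over k categories (illustrative only;
% the article reports 0.70 macro-averaged and 0.80 micro-averaged).
\[
F_1^{\mathrm{macro}} = \frac{1}{k}\sum_{c=1}^{k}\frac{2\,P_c R_c}{P_c + R_c},
\qquad
F_1^{\mathrm{micro}} = \frac{2\,P_\mu R_\mu}{P_\mu + R_\mu},
\]
\[
P_\mu = \frac{\sum_c \mathit{TP}_c}{\sum_c(\mathit{TP}_c + \mathit{FP}_c)},
\qquad
R_\mu = \frac{\sum_c \mathit{TP}_c}{\sum_c(\mathit{TP}_c + \mathit{FN}_c)}.
\]
```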

150 citations


Journal ArticleDOI
TL;DR: This work extends the FactBank corpus, which contains semantically driven veridicality annotations, with pragmatically informed ones and shows that context and world knowledge play a significant role in shaping veridicality.
Abstract: Natural language understanding depends heavily on assessing veridicality (whether events mentioned in a text are viewed as happening or not), but little consideration is given to this property in current relation and event extraction systems. Furthermore, the work that has been done has generally assumed that veridicality can be captured by lexical semantic properties, whereas we show that context and world knowledge play a significant role in shaping veridicality. We extend the FactBank corpus, which contains semantically driven veridicality annotations, with pragmatically informed ones. Our annotations are more complex than the lexical assumption predicts but systematic enough to be included in computational work on textual understanding. They also indicate that veridicality judgments are not always categorical, and should therefore be modeled as distributions. We build a classifier to automatically assign event veridicality distributions based on our new annotations. The classifier relies not only on lexical features like hedges or negations, but also on structural features and approximations of world knowledge, thereby providing a nuanced picture of the diverse factors that shape veridicality. "All I know is what I read in the papers" (Will Rogers)

140 citations


Journal ArticleDOI
TL;DR: An overview of how modality and negation have been modeled in computational linguistics is provided.
Abstract: Traditionally, most research in NLP has focused on propositional aspects of meaning. To truly understand language, however, extra-propositional aspects are equally important. Modality and negation typically contribute significantly to these extra-propositional meaning aspects. Although modality and negation have often been neglected by mainstream computational linguistics, interest has grown in recent years, as evidenced by several annotation projects dedicated to these phenomena. Researchers have started to work on modeling factuality, belief and certainty, detecting speculative sentences and hedging, identifying contradictions, and determining the scope of expressions of modality and negation. In this article, we will provide an overview of how modality and negation have been modeled in computational linguistics.

139 citations


Journal ArticleDOI
TL;DR: This article explores a combination of deep and shallow approaches to the problem of resolving the scope of speculation and negation within a sentence, specifically in the domain of biomedical research literature and shows that although both approaches perform well in isolation, even better results can be obtained by combining them.
Abstract: This article explores a combination of deep and shallow approaches to the problem of resolving the scope of speculation and negation within a sentence, specifically in the domain of biomedical research literature. The first part of the article focuses on speculation. After first showing how speculation cues can be accurately identified using a very simple classifier informed only by local lexical context, we go on to explore two different syntactic approaches to resolving the in-sentence scopes of these cues. Whereas one uses manually crafted rules operating over dependency structures, the other automatically learns a discriminative ranking function over nodes in constituent trees. We provide an in-depth error analysis and discussion of various linguistic properties characterizing the problem, and show that although both approaches perform well in isolation, even better results can be obtained by combining them, yielding the best published results to date on the CoNLL-2010 Shared Task data. The last part of the article describes how our speculation system is ported to also resolve the scope of negation. With only modest modifications to the initial design, the system obtains state-of-the-art results on this task also.

92 citations


Journal ArticleDOI
TL;DR: It is found that contextual information and final intonation figure as the most salient cues to automatic disambiguation of affirmative cue words, a family of cue words that speakers use frequently in conversation.
Abstract: We present a series of studies of affirmative cue words-a family of cue words such as "okay" or "alright" that speakers use frequently in conversation. These words pose a challenge for spoken dialogue systems because of their ambiguity: They may be used for agreeing with what the interlocutor has said, indicating continued attention, or for cueing the start of a new topic, among other meanings. We describe differences in the acoustic/prosodic realization of such functions in a corpus of spontaneous, task-oriented dialogues in Standard American English. These results are important both for interpretation and for production in spoken language applications. We also assess the predictive power of computational methods for the automatic disambiguation of these words. We find that contextual information and final intonation figure as the most salient cues to automatic disambiguation.

78 citations


Journal ArticleDOI
TL;DR: A unified subcategorization of semantic uncertainty is introduced, since different domain applications can apply different uncertainty categories, and domain adaptation for training the models offers an efficient solution for cross-domain and cross-genre semantic uncertainty recognition.
Abstract: Uncertainty is an important linguistic phenomenon that is relevant in various Natural Language Processing applications, in diverse genres from medical to community-generated, newswire or scientific discourse, and domains from science to humanities. The semantic uncertainty of a proposition can be identified in most cases by using a finite dictionary (i.e., lexical cues) and the key steps of uncertainty detection in an application include the steps of locating the (genre- and domain-specific) lexical cues, disambiguating them, and linking them with the units of interest for the particular application (e.g., identified events in information extraction). In this study, we focus on the genre and domain differences of the context-dependent semantic uncertainty cue recognition task. We introduce a unified subcategorization of semantic uncertainty as different domain applications can apply different uncertainty categories. Based on this categorization, we normalized the annotation of three corpora and present results with a state-of-the-art uncertainty cue recognition model for four fine-grained categories of semantic uncertainty. Our results reveal the domain and genre dependence of the problem; nevertheless, we also show that even a distant source domain data set can contribute to the recognition and disambiguation of uncertainty cues, efficiently reducing the annotation costs needed to cover a new domain. Thus, the unified subcategorization and domain adaptation for training the models offer an efficient solution for cross-domain and cross-genre semantic uncertainty recognition.

77 citations


Journal ArticleDOI
TL;DR: Using a corpus of implicit arguments for ten predicates from NomBank, a discriminative model is trained that is able to identify implicit arguments with an F1 score of 50%, significantly outperforming an informed baseline model.
Abstract: Nominal predicates often carry implicit arguments. Recent work on semantic role labeling has focused on identifying arguments within the local context of a predicate; implicit arguments, however, have not been systematically examined. To address this limitation, we have manually annotated a corpus of implicit arguments for ten predicates from NomBank. Through analysis of this corpus, we find that implicit arguments add 71% to the argument structures that are present in NomBank. Using the corpus, we train a discriminative model that is able to identify implicit arguments with an F1 score of 50%, significantly outperforming an informed baseline model. This article describes our investigation, explores a wide variety of features important for the task, and discusses future directions for work on implicit argument identification.

77 citations


Journal ArticleDOI
TL;DR: In this article, a vector lattice ordering is used to represent textual entailment, inspired by a strengthened form of the distributional hypothesis, and a degree of entailment is defined in the form of a conditional probability.
Abstract: Formalizing “meaning as context” mathematically leads to a new, algebraic theory of meaning, in which composition is bilinear and associative. These properties are shared by other methods that have been proposed in the literature, including the tensor product, vector addition, point-wise multiplication, and matrix multiplication. Entailment can be represented by a vector lattice ordering, inspired by a strengthened form of the distributional hypothesis, and a degree of entailment is defined in the form of a conditional probability. Approaches to the task of recognizing textual entailment, including the use of subsequence matching, lexical entailment probability, and latent Dirichlet allocation, can be described within our framework.
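
One plausible way to instantiate the lattice view sketched above (our illustration, with the component-wise minimum assumed as the lattice meet; the article's exact definitions may differ) is:

```latex
% Hedged sketch: entailment as a lattice ordering over non-negative context
% vectors u, v, with a degree of entailment read as a conditional probability.
\[
(u \wedge v)_i = \min(u_i, v_i),
\qquad
u \sqsubseteq v \;\iff\; u \wedge v = u,
\]
\[
\mathrm{Ent}(u \Rightarrow v) \;=\; \frac{\lVert u \wedge v \rVert_1}{\lVert u \rVert_1}
\;\approx\; P(v \mid u).
\]
```

Under this reading, u entails v to degree 1 exactly when every context supporting u also supports v at least as strongly.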

61 citations


Journal ArticleDOI
TL;DR: The resulting system significantly outperformed a linguistically naive baseline model, and reached the highest scores yet reported on the NIST 2009 Urdu–English test set, which supports the hypothesis that both syntactic and semantic information can improve translation quality.
Abstract: This article describes the resource-and system-building efforts of an 8-week Johns Hopkins University Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically Informed Machine Translation (SIMT). We describe a new modality/negation (MN) annotation scheme, the creation of a (publicly available) MN lexicon, and two automated MN taggers that we built using the annotation scheme and lexicon. Our annotation scheme isolates three components of modality and negation: a trigger (a word that conveys modality or negation), a target (an action associated with modality or negation), and a holder (an experiencer of modality). We describe how our MN lexicon was semi-automatically produced and we demonstrate that a structure-based MN tagger results in precision around 86% (depending on genre) for tagging of a standard LDC data set. We apply our MN annotation scheme to statistical machine translation using a syntactic framework that supports the inclusion of semantic annotations. Syntactic tags enriched with semantic annotations are assigned to parse trees in the target-language training texts through a process of tree grafting. Although the focus of our work is modality and negation, the tree grafting procedure is general and supports other types of semantic information. We exploit this capability by including named entities, produced by a pre-existing tagger, in addition to the MN elements produced by the taggers described here. The resulting system significantly outperformed a linguistically naive baseline model (Hiero), and reached the highest scores yet reported on the NIST 2009 Urdu-English test set. This finding supports the hypothesis that both syntactic and semantic information can improve translation quality.

50 citations


Journal ArticleDOI
TL;DR: This work brings together insights obtained from empirical studies in order to determine what should be contained in the summaries of this form of non-linguistic input data, and how the information required for realizing the selected content can be extracted from the visual image and the textual components of the graphic.
Abstract: Information graphics (such as bar charts and line graphs) play a vital role in many multimodal documents. The majority of information graphics that appear in popular media are intended to convey a message and the graphic designer uses deliberate communicative signals, such as highlighting certain aspects of the graphic, in order to bring that message out. The graphic, whose communicative goal (intended message) is often not captured by the document’s accompanying text, contributes to the overall purpose of the document and cannot be ignored. This article presents our approach to providing the high-level content of a non-scientific information graphic via a brief textual summary which includes the intended message and the salient features of the graphic. This work brings together insights obtained from empirical studies in order to determine what should be contained in the summaries of this form of non-linguistic input data, and how the information required for realizing the selected content can be extracted from the visual image and the textual components of the graphic. This work also presents a novel bottom–up generation approach to simultaneously construct the discourse and sentence structures of textual summaries by leveraging different discourse related considerations such as the syntactic complexity of realized sentences and clause embeddings. The effectiveness of our work was validated by different evaluation studies.

Journal ArticleDOI
TL;DR: This work aims to reduce the annotation effort involved in creating resources for semantic role labeling via semi-supervised learning by formalizing the detection of similar sentences and the projection of role annotations as a graph alignment problem, which it solves exactly using integer linear programming.
Abstract: Large-scale annotated corpora are a prerequisite to developing high-performance semantic role labeling systems. Unfortunately, such corpora are expensive to produce, limited in size, and may not be representative. Our work aims to reduce the annotation effort involved in creating resources for semantic role labeling via semi-supervised learning. The key idea of our approach is to find novel instances for classifier training based on their similarity to manually labeled seed instances. The underlying assumption is that sentences that are similar in their lexical material and syntactic structure are likely to share a frame semantic analysis. We formalize the detection of similar sentences and the projection of role annotations as a graph alignment problem, which we solve exactly using integer linear programming. Experimental results on semantic role labeling show that the automatic annotations produced by our method improve performance over using hand-labeled instances alone.
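
To make the graph-alignment formulation concrete, the following is a generic integer linear program of the kind the abstract describes (our notation and simplifications, not the authors' exact program): binary variables align nodes of a labeled seed graph g to nodes of an unlabeled graph g', maximizing lexical-syntactic similarity while using each node at most once.

```latex
% Hypothetical sketch: x_{uv} = 1 iff node u of the labeled graph is aligned
% to node v of the unlabeled graph; sim(u, v) is a similarity score.
\[
\max_{x}\; \sum_{u \in g}\sum_{v \in g'} \mathrm{sim}(u, v)\, x_{uv}
\quad\text{s.t.}\quad
\sum_{v \in g'} x_{uv} \le 1 \;\;\forall u,
\qquad
\sum_{u \in g} x_{uv} \le 1 \;\;\forall v,
\qquad
x_{uv} \in \{0,1\}.
\]
```

Role annotations are then projected from each aligned seed node to its counterpart in the unlabeled sentence.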

Journal ArticleDOI
TL;DR: A global inference algorithm is proposed that, given a target concept, learns on the fly all entailment rules between predicates that co-occur with that concept, using a global transitivity constraint on the graph of predicates to learn the optimal set of edges.
Abstract: Identifying entailment relations between predicates is an important part of applied semantic inference. In this article we propose a global inference algorithm that learns such entailment rules. First, we define a graph structure over predicates that represents entailment relations as directed edges. Then, we use a global transitivity constraint on the graph to learn the optimal set of edges, formulating the optimization problem as an Integer Linear Program. The algorithm is applied in a setting where, given a target concept, the algorithm learns on the fly all entailment rules between predicates that co-occur with this concept. Results show that our global algorithm improves performance over baseline algorithms by more than 10%.
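
The transitivity requirement mentioned above can be encoded with a compact set of linear constraints; the following is a standard formulation of such a program (edge weights w_{ij} assumed to come from a local entailment classifier), offered as a sketch rather than as the authors' exact objective.

```latex
% x_{ij} = 1 iff the edge "predicate i entails predicate j" is kept.
\[
\max_{x}\; \sum_{i \ne j} w_{ij}\, x_{ij}
\quad\text{s.t.}\quad
x_{ij} + x_{jk} - x_{ik} \le 1 \;\;\forall i, j, k,
\qquad
x_{ij} \in \{0, 1\}.
\]
% The constraint forces the edge i -> k to be selected whenever i -> j and
% j -> k both are, so the resulting edge set is transitively closed.
```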

Journal ArticleDOI
TL;DR: A study on the automatic acquisition of semantic classes for Catalan adjectives from distributional and morphological information is presented, with particular emphasis on polysemous adjectives; it argues that the second model, which captures regular polysemy in terms of simultaneous membership in multiple basic classes, is both theoretically and empirically more adequate.
Abstract: We present a study on the automatic acquisition of semantic classes for Catalan adjectives from distributional and morphological information, with particular emphasis on polysemous adjectives. The aim is to distinguish and characterize broad classes, such as qualitative (gran ‘big’) and relational (pulmonar ‘pulmonary’) adjectives, as well as to identify polysemous adjectives such as economic (‘economic ∣ cheap’). We specifically aim at modeling regular polysemy, that is, types of sense alternations that are shared across lemmata. To date, both semantic classes for adjectives and regular polysemy have only been sparsely addressed in empirical computational linguistics. Two main specific questions are tackled in this article. First, what is an adequate broad semantic classification for adjectives? We provide empirical support for the qualitative and relational classes as defined in theoretical work, and uncover one type of adjective that has not received enough attention, namely, the event-related class. Se...

Journal ArticleDOI
TL;DR: A computational model is described for planning phrases like “more than a quarter” and “25.9 per cent”, which describe proportions at different levels of precision; the task is modeled as a constraint satisfaction problem, with solutions subsequently ranked by preferences.
Abstract: We describe a computational model for planning phrases like "more than a quarter" and "25.9 per cent" which describe proportions at different levels of precision. The model lays out the key choices in planning a numerical description, using formal definitions of mathematical form (e.g., the distinction between fractions and percentages) and roundness adapted from earlier studies. The task is modeled as a constraint satisfaction problem, with solutions subsequently ranked by preferences (e.g., for roundness). Detailed constraints are based on a corpus of numerical expressions collected in the NumGen project, and evaluated through empirical studies in which subjects were asked to produce (or complete) numerical expressions in specified contexts.
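
A minimal Python sketch of the constrain-then-rank idea (our own illustration, not the NumGen model or its constraint set): candidate descriptions of a proportion are generated under a simple precision constraint, and the survivors are ranked by a toy roundness preference.

```python
from fractions import Fraction

# Hypothetical inventory of "round" fractions, used only for illustration.
ROUND_FRACTIONS = [Fraction(1, d) for d in (2, 3, 4, 5, 10)] + [Fraction(3, 4)]

def candidates(p, tolerance=0.03):
    """Generate candidate descriptions of proportion p (0 <= p <= 1) whose
    approximation error satisfies the tolerance constraint."""
    out = []
    for frac in ROUND_FRACTIONS:
        diff = p - float(frac)
        if abs(diff) <= tolerance:
            hedge = "about" if abs(diff) < 0.005 else ("more than" if diff > 0 else "less than")
            out.append((f"{hedge} {frac.numerator} in {frac.denominator}", abs(diff), True))
    out.append((f"{round(p * 100, 1)} per cent", 0.0, False))  # exact percentage
    return out

def rank(cands):
    """Preference ranking: round fractions first, then smaller error."""
    return sorted(cands, key=lambda c: (not c[2], c[1]))

if __name__ == "__main__":
    for text, err, _ in rank(candidates(0.259)):
        print(f"{text:>20}  (error={err:.3f})")
    # "more than 1 in 4" is preferred here over the exact "25.9 per cent"
```

The real model distinguishes mathematical forms (fractions vs. percentages) and uses corpus-derived constraints and preferences; the sketch only shows the two-stage constrain-then-rank architecture.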

Journal ArticleDOI
TL;DR: The large scale distributed composite language model gives drastic perplexity reduction over n-grams and achieves significantly better translation quality measured by the Bleu score and “readability” of translations when applied to the task of re-ranking the N-best list from a state-of-the-art parsing-based machine translation system.
Abstract: This paper presents an attempt at building a large scale distributed composite language model that is formed by seamlessly integrating an n-gram model, a structured language model, and probabilistic latent semantic analysis under a directed Markov random field paradigm to simultaneously account for local word lexical information, mid-range sentence syntactic structure, and long-span document semantic content. The composite language model has been trained by performing a convergent N-best list approximate EM algorithm and a follow-up EM algorithm to improve word prediction power on corpora with up to a billion tokens and stored on a supercomputer. The large scale distributed composite language model gives drastic perplexity reduction over n-grams and achieves significantly better translation quality measured by the Bleu score and "readability" of translations when applied to the task of re-ranking the N-best list from a state-of-the-art parsing-based machine translation system.
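
As a rough orientation only (not the paper's exact parameterization of the directed Markov random field), a composite model of this kind can be pictured as a weighted product of the three component predictions, each contributing its own view of the history h of word w:

```latex
% Hedged sketch of a product-style composite language model.
\[
p(w \mid h) \;\propto\;
p_{\text{ngram}}(w \mid h)^{\lambda_1}\,
p_{\text{SLM}}(w \mid h)^{\lambda_2}\,
p_{\text{PLSA}}(w \mid h)^{\lambda_3},
\qquad \lambda_i \ge 0,
\]
% where the n-gram component captures local lexical information, the
% structured language model (SLM) mid-range syntactic structure, and
% PLSA long-span document-level semantic content.
```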

Journal ArticleDOI
TL;DR: This work presents methods for reducing the worst-case and typical-case complexity of a context-free parsing pipeline via hard constraints derived from finite-state pre-processing and demonstrates that this method generalizes across multiple grammars and is complementary to other pruning techniques by presenting empirical results for both exact and approximate inference.
Abstract: We present methods for reducing the worst-case and typical-case complexity of a context-free parsing pipeline via hard constraints derived from finite-state pre-processing. We perform O(n) predictions to determine if each word in the input sentence may begin or end a multi-word constituent in chart cells spanning two or more words, or allow single-word constituents in chart cells spanning the word itself. These pre-processing constraints prune the search space for any chart-based parsing algorithm and significantly decrease decoding time. In many cases cell population is reduced to zero, which we term chart cell "closing." We present methods for closing a sufficient number of chart cells to ensure provably quadratic or even linear worst-case complexity of context-free inference. In addition, we apply high precision constraints to achieve large typical-case speedups and combine both high precision and worst-case bound constraints to achieve superior performance on both short and long strings. These bounds on processing are achieved without reducing the parsing accuracy, and in some cases accuracy improves. We demonstrate that our method generalizes across multiple grammars and is complementary to other pruning techniques by presenting empirical results for both exact and approximate inference using the exhaustive CKY algorithm, the Charniak parser, and the Berkeley parser. We also report results parsing Chinese, where we achieve the best reported results for an individual model on the commonly reported data set.
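
The effect of cell closing on a chart parser can be seen in a small sketch (ours, not the authors' implementation, and using a toy CNF grammar): a CKY-style loop simply skips any cell spanning two or more words whose first word is predicted not to begin, or whose last word is predicted not to end, a multi-word constituent.

```python
from collections import defaultdict

# Toy CNF grammar: binary rules (B, C) -> A and a word-to-tag lexicon.
BINARY = {("NP", "VP"): "S", ("DT", "NN"): "NP", ("VB", "NP"): "VP"}
LEXICON = {"the": "DT", "dog": "NN", "bit": "VB", "man": "NN"}

def cky_with_cell_closing(words, may_begin, may_end):
    """may_begin[i] / may_end[j] are finite-state predictions saying whether
    word i may begin / word j may end a multi-word constituent; cells spanning
    two or more words are 'closed' (skipped entirely) when either is False."""
    n = len(words)
    chart = defaultdict(set)
    for i, w in enumerate(words):                       # width-1 cells stay open
        chart[(i, i + 1)].add(LEXICON[w])
    for width in range(2, n + 1):                       # multi-word cells
        for i in range(n - width + 1):
            j = i + width
            if not (may_begin[i] and may_end[j - 1]):   # closed cell: prune
                continue
            for k in range(i + 1, j):
                for b in chart[(i, k)]:
                    for c in chart[(k, j)]:
                        if (b, c) in BINARY:
                            chart[(i, j)].add(BINARY[(b, c)])
    return chart

words = ["the", "dog", "bit", "the", "man"]
chart = cky_with_cell_closing(words, may_begin=[True] * 5, may_end=[True] * 5)
print("S" in chart[(0, 5)])   # True: with no cells closed, the full parse survives
```

Closing a cell removes all work for that span and for any larger span that would have combined with it, which is where the quadratic or linear worst-case bounds come from once enough cells are provably closed.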

Journal ArticleDOI
TL;DR: This work presents a framework for empirical risk minimization of probabilistic grammars using the log-loss, and derives sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting.
Abstract: Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk.
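
For orientation, the empirical risk being minimized has the usual log-loss form (standard definitions, with p_theta a probabilistic grammar over derivations; the abstract's two settings differ only in whether derivations are observed):

```latex
% Supervised setting: derivations z_i are observed.
\[
\hat{R}_n(\theta) \;=\; -\frac{1}{n}\sum_{i=1}^{n}\log p_\theta(z_i).
\]
% Unsupervised setting: only the strings x_i are observed, so the grammar's
% probability is marginalized over all derivations yielding x_i.
\[
\hat{R}_n(\theta) \;=\; -\frac{1}{n}\sum_{i=1}^{n}\log \sum_{z\,:\,\mathrm{yield}(z)=x_i} p_\theta(z).
\]
```

Minimizing the second objective exactly is the NP-hard problem mentioned above, which motivates the EM-like approximation.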

Journal ArticleDOI
TL;DR: A record of my encounters with language and my changing views of what one ought to believe about language and how one might represent its properties is offered in this essay.
Abstract: First of all, I am overwhelmed and humbled by the honor the ACL Executive Committee has shown me, an honor that should be shared by the colleagues and students I’ve been lucky enough to have around me this past decade-and-a-half while I’ve been engaged in the FrameNet Project at the International Computer Science Institute in Berkeley. I’ve been asked to say something about the evolution of the ideas behind the work with which I’ve been associated, so my remarks will be a bit more autobiographical than I might like. I’d like to comment on my changing views of what language is like, and how the facts of language can be represented. As I am sure the ACL Executive Committee knows, I have never been a direct participant in efforts in language engineering, but I have been a witness to, a neighbor of, and an indirect participant in some parts of it, and I have been pleased to learn that some of the resources my colleagues and I are building have been found by some researchers to be useful. I offer a record of my encounters with language and my changing views of what one ought to believe about language and how one might represent its properties. In the course of the narrative I will take note of changes I have observed over the past seven decades or so in both technical and conceptual tools in linguistics and language engineering. One theme in this essay is how these tools, and the representations they support, obscure or reveal the properties of language and therefore affect what one might believe about language. The time frame my life occupies has presented many opportunities to ponder this complex relationship.

Journal ArticleDOI
TL;DR: It has been claimed in the literature that for every tree-adjoining grammar, one can construct a strongly equivalent lexicalized version, but it is shown that such a procedure does not exist: Tree-adjoining grammars are not closed under strong lexicalization.
Abstract: A lexicalized tree-adjoining grammar is a tree-adjoining grammar where each elementary tree contains some overt lexical item. Such grammars are being used to give lexical accounts of syntactic phenomena, where an elementary tree defines the domain of locality of the syntactic and semantic dependencies of its lexical items. It has been claimed in the literature that for every tree-adjoining grammar, one can construct a strongly equivalent lexicalized version. We show that such a procedure does not exist: Tree-adjoining grammars are not closed under strong lexicalization.

Journal ArticleDOI
TL;DR: An algorithm is presented that, for a given LFG grammar and acyclic f-structure, produces a context-free grammar describing exactly the set of strings that the grammar associates with that f-structure; this context-free grammar serves as a compact representation of all generation results that the LFG grammar assigns to the input.
Abstract: This article describes an approach to Lexical-Functional Grammar (LFG) generation that is based on the fact that the set of strings that an LFG grammar relates to a particular acyclic f-structure is a context-free language. We present an algorithm that produces for an arbitrary LFG grammar and an arbitrary acyclic input f-structure a context-free grammar describing exactly the set of strings that the given LFG grammar associates with that f-structure. The individual sentences are then available through a standard context-free generator operating on that grammar. The context-free grammar is constructed by specializing the context-free backbone of the LFG grammar for the given f-structure and serves as a compact representation of all generation results that the LFG grammar assigns to the input. This approach extends to other grammatical formalisms with explicit context-free backbones, such as PATR, and also to formalisms that permit a context-free skeleton to be extracted from richer specifications. It provides a general mathematical framework for understanding and improving the operation of a family of chart-based generation algorithms.

Journal ArticleDOI
TL;DR: The formal power of Multi Bottom-Up Tree Transducers is examined from the point of view of syntax-based machine translation.
Abstract: Tree transducers are defined as relations between trees, but in syntax-based machine translation, we are ultimately concerned with the relations between the strings at the yields of the input and output trees. We examine the formal power of Multi Bottom-Up Tree Transducers from this point of view.

Journal ArticleDOI
TL;DR: It is discussed how well the Fruit Carts domain meets four desired features: unscripted, context-constrained, controllable difficulty, and separability into semi-independent subdialogues.
Abstract: We describe a novel domain, Fruit Carts, aimed at eliciting human language production for the twin purposes of (a) dialogue system research and development and (b) psycholinguistic research. Fruit Carts contains five tasks: choosing a cart, placing it on a map, painting the cart, rotating the cart, and filling the cart with fruit. Fruit Carts has been used for research in psycholinguistics and in dialogue systems. Based on these experiences, we discuss how well the Fruit Carts domain meets four desired features: unscripted, context-constrained, controllable difficulty, and separability into semi-independent subdialogues. We describe the domain in sufficient detail to allow others to replicate it; researchers interested in using the corpora themselves are encouraged to contact the authors directly.

Reference EntryDOI
TL;DR: The survey introduces the REG problem and describes early work in this area, discusses some computational frameworks underlying REG, and demonstrates a recent trend that seeks to link REG algorithms with well-established Knowledge Representation techniques.
Abstract: This article offers a survey of computational research on referring expression generation (REG). It introduces the REG problem and describes early work in this area, discussing what basic assumptio...

Journal ArticleDOI
TL;DR: The goals of this work are to detect the most relevant features for this denotative distinction between event and result nominalizations, and to build an automatic classification system of deverbal nominalizations according to their denotation.
Abstract: This article deals with deverbal nominalizations in Spanish; concretely, we focus on the denotative distinction between event and result nominalizations. The goal of this work is twofold: first, to detect the most relevant features for this denotative distinction; and, second, to build an automatic classification system of deverbal nominalizations according to their denotation. We have based our study on theoretical hypotheses dealing with this semantic distinction and we have analyzed them empirically by means of Machine Learning techniques which are the basis of the ADN-Classifier. This is the first tool that aims to automatically classify deverbal nominalizations into event, result, or underspecified denotation types in Spanish. The ADN-Classifier has helped us to quantitatively evaluate the validity of our claims regarding deverbal nominalizations. We set up a series of experiments in order to test the ADN-Classifier with different models and in different realistic scenarios depending on the knowledge resources and natural language processors available. The ADN-Classifier achieved good results (87.20% accuracy).

Journal ArticleDOI
TL;DR: Quantitative Syntax Analysis is a recent work on QL by Reinhard Köhler that not only provides a comprehensive introduction to the work of QL on the syntactic level, but also sketches the theoretical grounds, the research paradigm, and the ultimate goals of quantitative linguistics in general.
Abstract: Quantitative linguistics (QL) is a discipline of linguistics that, using real texts, studies languages with quantitative mathematical approaches, aiming to precisely describe and explain, with a system of mathematical laws, the operation and development of language systems. Later in this review, we will address the relationship between QL and computational linguistics. Quantitative Syntax Analysis is a recent work on QL by Reinhard Köhler that not only provides a comprehensive introduction to the work of QL on the syntactic level, but also sketches the theoretical grounds, the research paradigm, and the ultimate goals of quantitative linguistics in general.


Journal ArticleDOI
TL;DR: Victor Yngve was a major contributor in a number of fields within computational linguistics: as the leading researcher in machine translation at the Massachusetts Institute of Technology (MIT), as editor of its first journal, as designer and developer of the first non-numerical programming language (COMIT), and as an influential contributor to linguistic theory.
Abstract: Victor Yngve (5 July 1920 to 15 January 2012) was a major contributor in a number of fields within computational linguistics: as the leading researcher in machine translation (MT) at the Massachusetts Institute of Technology (MIT), as editor of its first journal, as designer and developer of the first non-numerical programming language (COMIT), and as an influential contributor to linguistic theory. While still completing his Ph.D. on cosmic ray physics at the University of Chicago during 1950–1953, Yngve had an idea for using the newly invented computers to translate languages. He contemplated building a translation machine based on simple dictionary lookup. At this time he knew nothing of the earlier speculations of Warren Weaver and others (Hutchins 1997). Then during a visit to Claude Shannon at Bell Telephone Laboratories in early 1952 he heard about a conference on machine translation to be held at MIT in June of that year. He attended the opening public meeting and participated in conference discussions, and then, after Bar-Hillel’s departure from MIT, he was appointed in July 1953 by Jerome Wiesner at the Research Laboratory for Electronics (RLE) to lead the MT research effort there. (For a retrospective survey of his MT research activities see Yngve [2000].) Yngve, along with many others at the time, deprecated the premature publicity around the Georgetown–IBM system demonstrated in January 1954. Yngve was appalled to see research of such a limited nature reported in newspapers; his background in physics required experiments to be carefully planned, with their assumptions made plain, and properly tested and reviewed by other researchers. He was determined to set the new field of MT on a proper scientific course. The first step was a journal for the field, to be named Mechanical Translation (the field became "machine translation" in later years). He found a collaborator for the journal in William N. Locke of the MIT Modern Languages department. The aim was to provide a forum for information about what research was going on in the form of abstracts, and then for peer-reviewed articles. The first issue appeared in March 1954. Yngve's first experiments at MIT in October 1953 were an implementation of his earlier ideas on word-for-word translation. The results of translating from German were published in the collection edited by Locke and Booth (Yngve 1955b). One example of output began: