
Showing papers in "Computational Linguistics in 2001"


Journal ArticleDOI
TL;DR: The learning approach to coreference resolution of noun phrases in unrestricted text is presented, indicating that on the general noun phrase coreference task, the learning approach holds promise and achieves accuracy comparable to that of nonlearning approaches.
Abstract: In this paper, we present a learning approach to coreference resolution of noun phrases in unrestricted text. The approach learns from a small, annotated corpus and the task includes resolving not just a certain type of noun phrase (e.g., pronouns) but rather general noun phrases. It also does not restrict the entity types of the noun phrases; that is, coreference is assigned whether they are of "organization," "person," or other types. We evaluate our approach on common data sets (namely, the MUC-6 and MUC-7 coreference corpora) and obtain encouraging results, indicating that on the general noun phrase coreference task, the learning approach holds promise and achieves accuracy comparable to that of nonlearning approaches. Our system is the first learning-based system that offers performance comparable to that of state-of-the-art nonlearning systems on these data sets.
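
A learning approach of this kind is usually pictured as a classify-then-link pipeline over mention pairs. The sketch below is only an illustration of that pipeline, not the authors' system: the feature set, the scikit-learn decision tree, and the closest-first linking rule are assumptions standing in for the full method.

```python
# Minimal mention-pair coreference sketch (illustrative only).
# Assumes mentions are dicts with head, gender, number, and position fields,
# and that training labels are 1 for coreferent pairs, 0 otherwise.
from sklearn.tree import DecisionTreeClassifier

def pair_features(antecedent, anaphor):
    """Toy feature vector for a candidate antecedent/anaphor pair."""
    return [
        int(antecedent["head"].lower() == anaphor["head"].lower()),  # string match
        int(antecedent["gender"] == anaphor["gender"]),              # gender agreement
        int(antecedent["number"] == anaphor["number"]),              # number agreement
        anaphor["position"] - antecedent["position"],                # distance in mentions
    ]

def train(pairs, labels):
    clf = DecisionTreeClassifier(max_depth=5)
    clf.fit([pair_features(a, b) for a, b in pairs], labels)
    return clf

def resolve(clf, mentions):
    """Link each mention to the closest preceding mention classified as coreferent."""
    links = {}
    for j, anaphor in enumerate(mentions):
        for i in range(j - 1, -1, -1):          # scan right to left
            if clf.predict([pair_features(mentions[i], anaphor)])[0] == 1:
                links[j] = i
                break
    return links
```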

1,059 citations



Journal ArticleDOI
TL;DR: This study reports the results of using minimum description length (MDL) analysis to model unsupervised learning of the morphological segmentation of European languages, using corpora ranging in size from 5,000 words to 500,000 words.
Abstract: This study reports the results of using minimum description length (MDL) analysis to model unsupervised learning of the morphological segmentation of European languages, using corpora ranging in size from 5,000 words to 500,000 words. We develop a set of heuristics that rapidly develop a probabilistic morphological grammar, and use MDL as our primary tool to determine whether the modifications proposed by the heuristics will be adopted or not. The resulting grammar matches well the analysis that would be developed by a human morphologist. In the final section, we discuss the relationship of this style of MDL grammatical analysis to the notion of evaluation metric in early generative grammar.
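
The core quantity in any MDL analysis is a single cost: the length of the grammar plus the length of the corpus encoded with that grammar, and a modification is adopted only if it lowers that total. The fragment below is a schematic cost function under simplifying assumptions (uniform character coding, a unigram code for morphs); it is not the paper's actual heuristics.

```python
import math
from collections import Counter

def description_length(segmented_corpus, bits_per_char=5.0):
    """Total MDL cost: model cost (morph inventory) + data cost (corpus given model).

    segmented_corpus: list of words, each a list of morph strings,
    e.g. [["walk", "ing"], ["walk", "ed"], ["jump", "ing"]].
    """
    morph_counts = Counter(m for word in segmented_corpus for m in word)
    total_tokens = sum(morph_counts.values())

    # Model cost: spell out each distinct morph once.
    model_cost = sum(len(m) * bits_per_char for m in morph_counts)

    # Data cost: -log2 P(morph) for every morph token, under a unigram model.
    data_cost = sum(
        -count * math.log2(count / total_tokens) for count in morph_counts.values()
    )
    return model_cost + data_cost

# A candidate re-segmentation is kept only if it lowers the total cost:
# if description_length(new_segmentation) < description_length(old_segmentation): adopt it
```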

789 citations


Journal ArticleDOI
Brian Roark1
TL;DR: The authors proposed a probabilistic top-down parser for language modeling for speech recognition, which performs very well in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers.
Abstract: This paper describes the functioning of a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The paper first introduces key notions in language modeling and probabilistic parsing, and briefly reviews some previous approaches to using syntactic structure for language modeling. A lexicalized probabilistic top-down parser is then presented, which performs very well, in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers. A new language model that utilizes probabilistic top-down parsing is then outlined, and empirical results show that it improves upon previous work in test corpus perplexity. Interpolation with a trigram model yields an exceptional improvement relative to the improvement observed by other models, demonstrating the degree to which the information captured by our parsing model is orthogonal to that captured by a trigram model. A small recognition experiment also demonstrates the utility of the model.
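
At its simplest, the interpolation the abstract mentions is a per-word mixture of the parser's conditional probability and a trigram probability, with perplexity computed from the mixed stream. The sketch below only illustrates that arithmetic; the fixed mixture weight and the two probability functions are placeholders, not the paper's model.

```python
import math

def interpolated_logprob(words, p_parser, p_trigram, lam=0.6):
    """Per-word interpolation: P(w|h) = lam * P_parser(w|h) + (1 - lam) * P_trigram(w|h).

    p_parser and p_trigram are callables returning conditional probabilities;
    lam is an interpolation weight (fixed here; normally tuned on held-out data).
    """
    total = 0.0
    for i, w in enumerate(words):
        history = words[:i]
        p = lam * p_parser(w, history) + (1 - lam) * p_trigram(w, history)
        total += math.log2(p)
    return total

def perplexity(words, p_parser, p_trigram, lam=0.6):
    """Test-corpus perplexity of the interpolated model."""
    return 2 ** (-interpolated_logprob(words, p_parser, p_trigram, lam) / len(words))
```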

336 citations


Journal ArticleDOI
TL;DR: This work reports on supervised learning experiments to automatically classify three major types of English verbs, based on their argument structure, specifically the thematic roles they assign to participants, using linguistically-motivated statistical indicators extracted from large annotated corpora to train the classifier.
Abstract: Automatic acquisition of lexical knowledge is critical to a wide range of natural language processing tasks. Especially important is knowledge about verbs, which are the primary source of relational information in a sentence: the predicate-argument structure that relates an action or state to its participants (i.e., who did what to whom). In this work, we report on supervised learning experiments to automatically classify three major types of English verbs, based on their argument structure, specifically the thematic roles they assign to participants. We use linguistically-motivated statistical indicators extracted from large annotated corpora to train the classifier, achieving 69.8% accuracy for a task whose baseline is 34%, and whose expert-based upper bound we calculate at 86.5%. A detailed analysis of the performance of the algorithm and of its errors confirms that the proposed features capture properties related to the argument structure of the verbs. Our results validate our hypotheses that knowledge about thematic relations is crucial for verb classification, and that it can be gleaned from a corpus by automatic means. We thus demonstrate an effective combination of deeper linguistic knowledge with the robustness and scalability of statistical techniques.
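
The statistical indicators are, in essence, corpus counts summarizing how a verb behaves syntactically, which then feed a standard supervised classifier. The sketch below shows one hypothetical way to turn per-occurrence observations into such features; the specific features and the decision-tree learner are illustrative assumptions, not the authors' exact indicator set.

```python
from sklearn.tree import DecisionTreeClassifier

def verb_features(usages):
    """usages: list of dicts describing one occurrence of a verb, e.g.
    {"transitive": True, "passive": False, "subject_animate": True, "past_tense": True}.
    Returns relative-frequency indicators of the verb's argument-structure behavior."""
    n = len(usages)
    return [
        sum(u["transitive"] for u in usages) / n,        # transitive use
        sum(u["passive"] for u in usages) / n,           # passive voice
        sum(u["subject_animate"] for u in usages) / n,   # animate subject
        sum(u["past_tense"] for u in usages) / n,        # simple past vs. participle use
    ]

def train_verb_classifier(verb_usage_lists, class_labels):
    """class_labels: one verb-class label per verb, e.g. 'unergative', 'unaccusative', 'object-drop'."""
    X = [verb_features(usages) for usages in verb_usage_lists]
    clf = DecisionTreeClassifier()
    clf.fit(X, class_labels)
    return clf
```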

216 citations


Journal ArticleDOI
TL;DR: It is examined how differences in language models, learned by different data-driven systems performing the same NLP task, can be exploited to yield a higher accuracy than the best individual system.
Abstract: We examine how differences in language models, learned by different data-driven systems performing the same NLP task, can be exploited to yield a higher accuracy than the best individual system. We do this by means of experiments involving the task of morphosyntactic word class tagging, on the basis of three different tagged corpora. Four well-known tagger generators (hidden Markov model, memory-based, transformation rules, and maximum entropy) are trained on the same corpus data. After comparison, their outputs are combined using several voting strategies and second-stage classifiers. All combination taggers outperform their best component. The reduction in error rate varies with the material in question, but can be as high as 24.3% with the LOB corpus.
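
The simplest of the combination strategies described here is plain majority voting over the component taggers' outputs, with ties broken in favor of one designated tagger. The sketch below illustrates only that baseline; the weighted-voting and second-stage-classifier variants in the paper are not shown.

```python
from collections import Counter

def majority_vote(tag_sequences, tiebreak_index=0):
    """tag_sequences: one tag sequence per component tagger, all of equal length.
    Returns the combined sequence; ties go to the tagger at tiebreak_index."""
    combined = []
    for position_tags in zip(*tag_sequences):
        counts = Counter(position_tags)
        best_tag, best_count = counts.most_common(1)[0]
        if list(counts.values()).count(best_count) > 1:   # tie between taggers
            best_tag = position_tags[tiebreak_index]
        combined.append(best_tag)
    return combined

# Example: four taggers disagree on the second token.
print(majority_vote([["DT", "NN"], ["DT", "VB"], ["DT", "NN"], ["DT", "NN"]]))
# ['DT', 'NN']
```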

208 citations


Journal ArticleDOI
TL;DR: The authors used suffix arrays to compute term frequency (tf) and document frequency (df) for all n-grams in two large corpora, an English corpus of 50 million words of Wall Street Journal and a Japanese corpus of 216 million characters of Mainichi Shimbun.
Abstract: Bigrams and trigrams are commonly used in statistical natural language processing; this paper will describe techniques for working with much longer n-grams. Suffix arrays (Manber and Myers 1990) were first introduced to compute the frequency and location of a substring (n-gram) in a sequence (corpus) of length N. To compute frequencies over all N(N + 1)/2 substrings in a corpus, the substrings are grouped into a manageable number of equivalence classes. In this way, a prohibitive computation over substrings is reduced to a manageable computation over classes. This paper presents both the algorithms and the code that were used to compute term frequency (tf) and document frequency (df) for all n-grams in two large corpora, an English corpus of 50 million words of Wall Street Journal and a Japanese corpus of 216 million characters of Mainichi Shimbun. The second half of the paper uses these frequencies to find "interesting" substrings. Lexicographers have been interested in n-grams with high mutual information (MI) where the joint term frequency is higher than what would be expected by chance, assuming that the parts of the n-gram combine independently. Residual inverse document frequency (RIDF) compares document frequency to another model of chance where terms with a particular term frequency are distributed randomly throughout the collection. MI tends to pick out phrases with noncompositional semantics (which often violate the independence assumption) whereas RIDF tends to highlight technical terminology, names, and good keywords for information retrieval (which tend to exhibit nonrandom distributions over documents). The combination of both MI and RIDF is better than either by itself in a Japanese word extraction task.
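
Two of the quantities in this entry are easy to state compactly: residual IDF compares a term's observed document frequency with what a Poisson model would predict from its term frequency, and pointwise mutual information compares a phrase's joint frequency with the product of its parts. A naive suffix-array construction is also shown; it is quadratic in memory and is meant only to make the idea concrete, not to scale to 50-million-word corpora.

```python
import math

def suffix_array(text):
    """Naive construction: sort all suffix start positions lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def ridf(tf, df, num_docs):
    """Residual IDF = observed IDF minus the IDF expected under a Poisson model
    with rate tf / num_docs."""
    observed_idf = -math.log2(df / num_docs)
    poisson_rate = tf / num_docs
    expected_idf = -math.log2(1.0 - math.exp(-poisson_rate))
    return observed_idf - expected_idf

def pointwise_mi(freq_xy, freq_x, freq_y, n):
    """PMI of a two-word phrase: log2 of P(xy) / (P(x) * P(y)), with n total tokens."""
    return math.log2((freq_xy / n) / ((freq_x / n) * (freq_y / n)))
```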

207 citations


Journal ArticleDOI
TL;DR: This work presents a sense tagger which uses several knowledge sources and attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words.
Abstract: Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial intelligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results. We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus. Our system attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems.

182 citations


Journal ArticleDOI
TL;DR: It is demonstrated that layout offers a rich resource for achieving presentational coherence, alongside more traditional resources such as text-formatting and the text-internal marking of discourse connections, and an integrated approach to layout, text, and diagram generation is introduced.
Abstract: Combining elements appropriately within a coherent page layout is a well-recognized and crucial aspect of sophisticated information presentation. The precise function and nature of layout has not, however, been sufficiently addressed within computational approaches; attention is often restricted to relatively local issues of typography and text-formatting, leaving broader issues of layout unaddressed. In this paper we focus on the selection and function of layout in pages that appropriately combine textual and graphical representation styles to yield coherent presentation designs. We demonstrate that layout offers a rich resource for achieving presentational coherence, alongside more traditional resources such as text-formatting and the text-internal marking of discourse connections. We also introduce an integrated approach to layout, text, and diagram generation. Our approach is developed on the basis of a preliminary empirical investigation of professionally produced layouts, followed by implementation within a prototype information system in the area of art history.

119 citations


Journal ArticleDOI
TL;DR: In this article, a statistical model for segmentation and word discovery in continuous speech is presented, and an incremental unsupervised learning algorithm to infer word boundaries based on this model is described.
Abstract: A statistical model for segmentation and word discovery in continuous speech is presented. An incremental unsupervised learning algorithm to infer word boundaries based on this model is described. Results are also presented of empirical tests showing that the algorithm is competitive with other models that have been used for similar tasks.
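
An incremental segmenter of this general kind can be reduced to a Viterbi-style dynamic program: score every way of cutting the next utterance into words under the current lexicon, keep the best one, and update the lexicon counts from it. The sketch below follows that recipe with a crude unigram/novel-word score; it is a toy stand-in, not the paper's model.

```python
import math
from collections import Counter

lexicon = Counter()   # counts of word types discovered so far
total_tokens = 0

def word_score(word):
    """Negative log probability: seen words scored by relative frequency,
    unseen words penalized per character (a crude novel-word model)."""
    if total_tokens and lexicon[word]:
        return -math.log(lexicon[word] / total_tokens)
    return len(word) * math.log(30.0)     # assumes roughly a 30-symbol alphabet

def segment(utterance):
    """Viterbi over cut points: best[i] = lowest cost of segmenting utterance[:i]."""
    n = len(utterance)
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(max(0, i - 12), i):          # cap candidate word length at 12
            cost = best[j] + word_score(utterance[j:i])
            if cost < best[i]:
                best[i], back[i] = cost, j
    words, i = [], n
    while i > 0:
        words.append(utterance[back[i]:i])
        i = back[i]
    return list(reversed(words))

def observe(utterance):
    """Incremental learning: segment the utterance, then update the lexicon with the result."""
    global total_tokens
    words = segment(utterance)
    lexicon.update(words)
    total_tokens += len(words)
    return words
```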

118 citations


Journal ArticleDOI
TL;DR: A probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units is presented and a significant reduction in error is achieved by combining the prosodic and word-based knowledge sources.
Abstract: We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov models and decision trees. Lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech waveforms. We evaluate our approach on the Broadcast News corpus, using the DARPA-TDT evaluation metrics. Results show that the prosodic model alone is competitive with word-based segmentation methods. Furthermore, we achieve a significant reduction in error by combining the prosodic and word-based knowledge sources.

Journal ArticleDOI
TL;DR: A centering algorithm (Left-Right Centering) is introduced that adheres to the constraints and rules of centering theory and is an alternative to Brennan, Friedman, and Pollard's (1987) algorithm.
Abstract: In this paper we compare pronoun resolution algorithms and introduce a centering algorithm (Left-Right Centering) that adheres to the constraints and rules of centering theory and is an alternative to Brennan, Friedman, and Pollard's (1987) algorithm. We then use the Left-Right Centering algorithm to see if two psycholinguistic claims on Cf-list ranking will actually improve pronoun resolution accuracy. Our results from this investigation lead to the development of a new syntax-based ranking of the Cf-list and corpus-based evidence that contradicts the psycholinguistic claims.
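
In outline, a left-to-right centering resolver works incrementally: when a pronoun is encountered, it first searches the entities already realized to the pronoun's left in the current utterance, then the ranked Cf-list of the previous utterance, taking the first candidate that passes agreement checks. The sketch below is a schematic rendering of that search order under assumed data structures, not the authors' implementation.

```python
def agrees(candidate, pronoun):
    """Morphological filter; entities are dicts with gender and number fields."""
    return (candidate["gender"] == pronoun["gender"]
            and candidate["number"] == pronoun["number"])

def resolve_pronoun(pronoun, current_cf_partial, previous_cf_ranked):
    """Search the partial Cf-list of the current utterance (left-to-right),
    then the ranked Cf-list of the previous utterance; return the first match."""
    for candidate in current_cf_partial + previous_cf_ranked:
        if agrees(candidate, pronoun):
            return candidate
    return None
```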

Journal ArticleDOI
TL;DR: The purpose of this paper is to suggest the use of a complementary backup method to increase the robustness of any hand-crafted or machine-learning-based NE tagger, and to explore the effectiveness of using more fine-grained evidence, namely syntactic and semantic contextual knowledge, in classifying NEs.
Abstract: Proper nouns form an open class, making the incompleteness of manually or automatically learned classification rules an obvious problem. The purpose of this paper is twofold: first, to suggest the use of a complementary "backup" method to increase the robustness of any hand-crafted or machine-learning-based NE tagger; and second, to explore the effectiveness of using more fine-grained evidence, namely syntactic and semantic contextual knowledge, in classifying NEs.

Journal ArticleDOI
TL;DR: An algorithm for identifying noun phrase antecedents of third person personal pronouns, demonstrative pronouns, reflexive pronouns, and omitted pronouns (zero pronouns) in unrestricted Spanish texts is presented.
Abstract: This paper presents an algorithm for identifying noun phrase antecedents of third person personal pronouns, demonstrative pronouns, reflexive pronouns, and omitted pronouns (zero pronouns) in unrestricted Spanish texts. We define a list of constraints and preferences for different types of pronominal expressions, and we document in detail the importance of each kind of knowledge (lexical, morphological, syntactic, and statistical) in anaphora resolution for Spanish. The paper also provides a definition for syntactic conditions on Spanish NP-pronoun noncoreference using partial parsing. The algorithm has been evaluated on a corpus of 1,677 pronouns and achieved a success rate of 76.8%. We have also implemented four competitive algorithms and tested their performance in a blind evaluation on the same test corpus. This new approach could easily be extended to other languages such as English, Portuguese, Italian, or Japanese.
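
The constraints-and-preferences architecture described here amounts to a two-stage decision: hard constraints eliminate candidate antecedents, then weighted preferences rank the survivors. The sketch below shows that control flow with invented constraint and preference functions; the actual lexical, morphological, syntactic, and statistical knowledge in the paper is far richer.

```python
def resolve(pronoun, candidates, constraints, preferences):
    """constraints: predicates(candidate, pronoun) that must all hold.
    preferences: (weight, scorer(candidate, pronoun)) pairs used to rank survivors."""
    survivors = [
        c for c in candidates
        if all(constraint(c, pronoun) for constraint in constraints)
    ]
    if not survivors:
        return None
    return max(
        survivors,
        key=lambda c: sum(w * scorer(c, pronoun) for w, scorer in preferences),
    )

# Hypothetical knowledge sources: agreement as a constraint, recency and
# subjecthood as preferences.
constraints = [lambda c, p: c["number"] == p["number"] and c["gender"] == p["gender"]]
preferences = [(1.0, lambda c, p: -abs(p["position"] - c["position"])),   # recency
               (0.5, lambda c, p: 1.0 if c["role"] == "subject" else 0.0)]
```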

Journal ArticleDOI
TL;DR: This elicit-build-test technique compiles lexical and inflectional information elicited from a human into a finite-state transducer lexicon and combines this with a sequence of morphographemic rewrite rules that is induced using transformation-based learning from the elicited examples.
Abstract: This paper presents a semiautomatic technique for developing broad-coverage finite-state morphological analyzers for use in natural language processing applications. It consists of three components: elicitation of linguistic information from humans, a machine learning bootstrapping scheme, and a testing environment. The three components are applied iteratively until a threshold of output quality is attained. The initial application of this technique is for the morphology of low-density languages in the context of the Expedition project at NMSU Computing Research Laboratory. This elicit-build-test technique compiles lexical and inflectional information elicited from a human into a finite-state transducer lexicon and combines this with a sequence of morphographemic rewrite rules that is induced using transformation-based learning from the elicited examples. The resulting morphological analyzer is then tested against a test set, and any corrections are fed back into the learning procedure, which then builds an improved analyzer.

Journal ArticleDOI
TL;DR: It is shown how the DSG formalism, which is designed to inherit many of the characteristics of LTAG, can be used to express a variety of linguistic analyses not available in LTAG.
Abstract: There is considerable interest among computational linguists in lexicalized grammatical frameworks; lexicalized tree adjoining grammar (LTAG) is one widely studied example. In this paper, we investigate how derivations in LTAG can be viewed not as manipulations of trees but as manipulations of tree descriptions. Changing the way the lexicalized formalism is viewed raises questions as to the desirability of certain aspects of the formalism. We present a new formalism, d-tree substitution grammar (DSG). Derivations in DSG involve the composition of d-trees, special kinds of tree descriptions. Trees are read off from derived d-trees. We show how the DSG formalism, which is designed to inherit many of the characteristics of LTAG, can be used to express a variety of linguistic analyses not available in LTAG.

Journal ArticleDOI
TL;DR: A new formulation of Rule 2 of centering theory is proposed that incorporates these principles as well as a streamlined version of Strube and Hahn's (1999) notion of cheapness, and is argued that this formulation provides a natural way to handle topic switches that appear to violate the canonical preference ordering.
Abstract: The standard preference ordering on the well-known centering transitions Continue, Retain, Shift is argued to be unmotivated: a partial, context-dependent ordering emerges from the interaction between principles dubbed cohesion (maintaining the same center of attention) and salience (realizing the center of attention as the most prominent NP). A new formulation of Rule 2 of centering theory is proposed that incorporates these principles as well as a streamlined version of Strube and Hahn's (1999) notion of cheapness. It is argued that this formulation provides a natural way to handle "topic switches" that appear to violate the canonical preference ordering.

Journal ArticleDOI
TL;DR: The ROSANA approach, which generalizes the verification of coindexing restrictions in order to make it applicable to the deficient syntactic descriptions that are provided by a robust state-of-the-art parser, and proves that the robust implementation of syntactic disjoint reference is nearly optimal.
Abstract: Syntactic coindexing restrictions are by now known to be of central importance to practical anaphor resolution approaches. Since, in particular due to structural ambiguity, the assumption of the availability of a unique syntactic reading proves to be unrealistic, robust anaphor resolution relies on techniques to overcome this deficiency.This paper describes the ROSANA approach, which generalizes the verification of coindexing restrictions in order to make it applicable to the deficient syntactic descriptions that are provided by a robust state-of-the-art parser. By a formal evaluation on two corpora that differ with respect to text genre and domain, it is shown that ROSANA achieves high-quality robust coreference resolution. Moreover, by an in-depth analysis, it is proven that the robust implementation of syntactic disjoint reference is nearly optimal. The study reveals that, compared with approaches that rely on shallow preprocessing, the largely nonheuristic disjoint reference algorithmization opens up the possibility for a slight improvement. Furthermore, it is shown that more significant gains are to be expected elsewhere, particularly from a text-genre-specific choice of preference strategies.The performance study of the ROSANA system crucially rests on an enhanced evaluation methodology for coreference resolution systems, the development of which constitutes the second major contribution of the paper. As a supplement to the model-theoretic scoring scheme that was developed for the Message Understanding Conference (MUC) evaluations, additional evaluation measures are defined that, on one hand, support the developer of anaphor resolution systems, and, on the other hand, shed light on application aspects of pronoun interpretation.

Journal ArticleDOI
TL;DR: The drive toward knowledge-poor and robust approaches was further motivated by the emergence of cheaper and more reliable corpus-based NLP tools such as part-of-speech taggers and shallow parsers, alongside the increasing availability of corpora and other NLP resources.
Abstract: Anaphora accounts for cohesion in texts and is a phenomenon under active study in formal and computational linguistics alike. The correct interpretation of anaphora is vital for natural language processing (NLP). For example, anaphora resolution is a key task in natural language interfaces, machine translation, text summarization, information extraction, question answering, and a number of other NLP applications. After considerable initial research, followed by years of relative silence in the early 1980s, anaphora resolution has attracted the attention of many researchers in the last 10 years and a great deal of successful work on the topic has been carried out. Discourse-oriented theories and formalisms such as Discourse Representation Theory and Centering Theory inspired new research on the computational treatment of anaphora. The drive toward corpus-based robust NLP solutions further stimulated interest in alternative and/or data-enriched approaches. Last, but not least, application-driven research in areas such as automatic abstracting and information extraction independently highlighted the importance of anaphora and coreference resolution, boosting research in this area. Much of the earlier work in anaphora resolution heavily exploited domain and linguistic knowledge (Sidner 1979; Carter 1987; Rich and LuperFoy 1988; Carbonell and Brown 1988), which was difficult both to represent and to process, and which required considerable human input. However, the pressing need for the development of robust and inexpensive solutions to meet the demands of practical NLP systems encouraged many researchers to move away from extensive domain and linguistic knowledge and to embark instead upon knowledge-poor anaphora resolution strategies. A number of proposals in the 1990s deliberately limited the extent to which they relied on domain and/or linguistic knowledge and reported promising results in knowledge-poor operational environments (Dagan and Itai 1990, 1991; Lappin and Leass 1994; Nasukawa 1994; Kennedy and Boguraev 1996; Williams, Harvey, and Preston 1996; Baldwin 1997; Mitkov 1996, 1998b). The drive toward knowledge-poor and robust approaches was further motivated by the emergence of cheaper and more reliable corpus-based NLP tools such as part-of-speech taggers and shallow parsers, alongside the increasing availability of corpora and other NLP resources (e.g., ontologies). In fact, the availability of corpora, both raw and annotated with coreferential links, provided a strong impetus to anaphora resolution.

Journal ArticleDOI
TL;DR: A new reporting standard is proposed that improves the exposition of individual results and the possibility for readers to compare techniques across studies, and an informative new performance metric, the resolution rate, is proposed for use in addition to precision and recall.
Abstract: Pronoun resolution studies compute performance inconsistently and describe results incompletely. We propose a new reporting standard that improves the exposition of individual results and the possibility for readers to compare techniques across studies. We also propose an informative new performance metric, the resolution rate, for use in addition to precision and recall.
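
Under one common reading of these metrics, precision is computed over the pronouns a system attempts, recall over the pronouns it should have attempted, and the resolution rate over every pronoun in the corpus regardless of what the system did. The sketch below encodes that reading; the precise definitions are the paper's, and this formulation is an assumption made for illustration.

```python
def pronoun_metrics(correct, attempted, in_scope, total_in_corpus):
    """correct: pronouns resolved to the right antecedent.
    attempted: pronouns the system tried to resolve.
    in_scope: pronouns the evaluation says should have been resolved.
    total_in_corpus: every pronoun in the evaluation texts."""
    precision = correct / attempted if attempted else 0.0
    recall = correct / in_scope if in_scope else 0.0
    resolution_rate = correct / total_in_corpus if total_in_corpus else 0.0
    return {"precision": precision, "recall": recall, "resolution_rate": resolution_rate}

# Example: 60 correct out of 80 attempted, 90 in scope, 100 pronouns in the corpus.
print(pronoun_metrics(60, 80, 90, 100))
# precision 0.75, recall ~0.667, resolution_rate 0.6
```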





Journal ArticleDOI
TL;DR: An extensive analysis of the alignment procedure used in the MUC-6 evaluation of information extraction technology reveals effects that interfere with the stated goals of the evaluation, and argues strongly for the use of accurate alignment criteria in natural language evaluations.
Abstract: As evaluations of computational linguistics technology progress toward higher-level interpretation tasks, the problem of determining alignments between system responses and answer key entries may become less straightforward. We present an extensive analysis of the alignment procedure used in the MUC-6 evaluation of information extraction technology, which reveals effects that interfere with the stated goals of the evaluation. These effects are shown to be pervasive enough that they have the potential to adversely impact the technology development process. These results argue strongly for the use of accurate alignment criteria in natural language evaluations, and for maintaining the independence of alignment criteria and mechanisms used to calculate scores.

Journal Article
TL;DR: Language as a multifunctional system and process and the construal of experience: consciousness in daily life and in cognitive science and the making of meaning.
Abstract: Part I: Introduction 1. Theoretical preliminaries Part II: The ideation base 2. Overview of the general ideational potential 3. Sequences 4. Figures 5. Elements 6. Grammatical metaphors 7. Comparison with Chinese Part III: The meaning base as a resource in language processing systems 8. Building an ideation base 9. Using the ideation base in text processing Part IV: Theoretical and descriptive alternatives 10. Alternative approaches to meaning. 11. Distortion and transformation 12. Figures and processes Part V: Language and the construal of experience 13. Language as a multifunctional system and process 14. Construing ideational models: consciousness in daily life and in cognitive science 15. Language and the making of meaning.



