Journal ISSN: 0378-4169

Lingvisticae Investigationes 

John Benjamins Publishing Company
About: Lingvisticae Investigationes is an academic journal. It publishes mainly in the areas of Verb and Noun. Its ISSN is 0378-4169. Over its lifetime, the journal has published 573 publications, which have received 5855 citations.


Papers
Journal Article
TL;DR: Observations about languages, named entity types, domains and textual genres studied in the literature, along with other critical aspects of NERC such as features and evaluation methods, are reported.
Abstract: This survey covers fifteen years of research in the Named Entity Recognition and Classification (NERC) field, from 1991 to 2006. We report observations about languages, named entity types, domains and textual genres studied in the literature. From the start, NERC systems have been developed using hand-made rules, but now machine learning techniques are widely used. These techniques are surveyed along with other critical aspects of NERC such as features and evaluation methods. Features are word-level, dictionary-level and corpus-level representations of words in a document. Evaluation techniques, ranging from intuitive exact match to very complex matching techniques with adjustable cost of errors, are an indisputable key to progress.

2,537 citations

Book Chapter
TL;DR: This paper argues that no parametric difference between English and Japanese results in essentially different deep structure configurations; the difference is simply that agreement is forced in English but not in Japanese.
Abstract: English has visible wh-movement; Japanese doesn’t. Japanese scrambles and word order is free; English doesn’t scramble and has an orderly word order. The topic is prominent in Japanese; it is not in English. Japanese has double or multiple subject structures; English does not. Such are the major typological differences between English and Japanese, and some linguists entertain the idea that parametric differences concerning Deep Structure exist between English and Japanese which are responsible for these differences. It has been proposed that English is configurational while Japanese is nonconfigurational; cf: Hale (1980), Chomsky (1981), among others. Or it has been suggested that Japanese clauses are Max(V), while English ones are Max(I); for example, Chomsky in a lecture at UCSD, 1985. I would like to sketch in this paper a claim to the contrary that there is no parametric difference between English and Japanese that results in essentially different deep structure configurations. Instead, the parametric difference between English and Japanese consists simply of the following: Agreement is forced in English, it is not in Japanese.

461 citations

Journal Article
TL;DR: A new annotation schema for deep contextual opinion analysis using discourse relations is proposed, and the distribution of its categories is analyzed in three types of online corpora: movie reviews, Letters to the Editor and news reports.
Abstract: We present an analysis of opinion in texts based on a detailed semantic analysis of a wide class of expressions. We propose a new annotation schema for a deep contextual opinion analysis using discourse relations. We analyze the distribution of our categories in three different types of online corpora, movie reviews, Letters to the Editor and news reports, in English and French.

76 citations

Journal Article
Paul M. Postal

47 citations

Journal Article
TL;DR: The paper reports on the development of a Named Entity Recognition (NER) system for Bengali using a tagged Bengali news corpus, and on the subsequent transliteration of the recognized Bengali Named Entities (NEs) into English.
Abstract: The paper reports about the development of a Named Entity Recognition (NER) system in Bengali using a tagged Bengali news corpus and the subsequent transliteration of the recognized Bengali Named Entities (NEs) into English. Three different models of the NER have been developed. A semi-supervised learning method has been adopted to develop the first two models, one without linguistic features (Model A) and the other with linguistic features (Model B). The third one (Model C) is based on statistical Hidden Markov Model. A modified joint-source channel model has been used along with a number of alternatives to generate the English transliterations of Bengali NEs and vice-versa. The transliteration models learn the mappings from the bilingual training sets optionally guided by linguistic knowledge in the form of conjuncts and diphthongs in Bengali and their representations in English. The NER system has demonstrated the highest average Recall, Precision and F-Score values of 89.62%, 78.67% and 83.79% respectively in Model C. Evaluation of the proposed transliteration models demonstrated that the modified joint source-channel model performs best in terms of evaluation metrics for person and location names for both Bengali to English (B2E) transliteration and English to Bengali transliteration (E2B). The use of the linguistic knowledge during training of the transliteration models improves performance.

47 citations

Network Information
Related Journals (5)
Lingua
4.1K papers, 70.2K citations
79% related
Journal of Linguistics
2K papers, 37.2K citations
79% related
Computational Linguistics
1.4K papers, 154.8K citations
78% related
Natural Language Engineering
761 papers, 29.3K citations
78% related
Linguistics
2.3K papers, 52K citations
77% related
Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2021    6
2020    14
2019    11
2018    15
2017    12
2016    18