Topic
Shallow parsing
About: Shallow parsing is a research topic. Over its lifetime, 397 publications have been published within this topic, receiving 10,211 citations.
Papers published on a yearly basis
Papers
17 Dec 2006
TL;DR: A novel selection method for tri-training learning in which a newly labeled sentence is selected for a classifier when the other two classifiers agree on its labels while the classifier itself disagrees.
Abstract: This paper presents a practical tri-training method for Chinese chunking using a small amount of labeled training data and a much larger pool of unlabeled data. We propose a novel selection method for tri-training learning in which newly labeled sentences are selected by comparing the agreements of three classifiers. In detail, in each iteration, a new sample is selected for a classifier if the other two classifiers agree on its labels while the classifier itself disagrees. We compare the proposed tri-training approach with a co-training approach on the UPenn Chinese Treebank V4.0 (CTB4). The experimental results show that the proposed approach can improve the performance significantly.
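The selection rule described above can be sketched as follows; the classifier objects and their `predict` interface are hypothetical stand-ins for the paper's three chunkers:

```python
# Sketch of the tri-training selection rule: a sentence joins a
# classifier's training set only when the other two classifiers agree
# on its label sequence while the target classifier itself disagrees.

def select_for(target, other1, other2, unlabeled):
    """Return (sentence, labels) pairs to add to `target`'s training set."""
    selected = []
    for sentence in unlabeled:
        y1 = other1.predict(sentence)
        y2 = other2.predict(sentence)
        if y1 == y2 and target.predict(sentence) != y1:
            # the agreed-upon labels become the new training labels
            selected.append((sentence, y1))
    return selected
```

In the paper's setting this step runs once per iteration, once for each of the three classifiers in turn.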
16 citations
25 Aug 2009
TL;DR: This article presents a formalism and a beta version of a new tool for simultaneous morphosyntactic disambiguation and shallow parsing, which facilitates shallow parsing of morphosyntactically ambiguous or erroneously disambiguated input.
Abstract: This article presents a formalism and a beta version of a new tool for simultaneous morphosyntactic disambiguation and shallow parsing. Unlike other shallow parsing formalisms, the rules of the grammar allow for explicit morphosyntactic disambiguation statements, independently of structure-building statements, which facilitates the task of shallow parsing of morphosyntactically ambiguous or erroneously disambiguated input.
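As an illustration of keeping disambiguation statements separate from structure-building statements, here is a minimal sketch (not the paper's actual formalism): one rule prunes morphosyntactic interpretations, a separate rule builds chunks, and the chunking rule tolerates tokens that remain ambiguous:

```python
# Tokens are modeled as sets of candidate morphosyntactic tags.

def disambiguate(tokens):
    """Disambiguation statement: after an unambiguous determiner,
    drop verb readings of the following token."""
    out = [set(t) for t in tokens]
    for i in range(1, len(out)):
        if out[i - 1] == {"det"}:
            pruned = out[i] - {"verb"}
            if pruned:  # never delete a token's last interpretation
                out[i] = pruned
    return out

def chunk(tokens):
    """Structure-building statement: group det + noun into an NP.
    A token counts as a noun if 'noun' is among its surviving
    interpretations, so ambiguous input is still chunkable."""
    groups, i = [], 0
    while i < len(tokens):
        if tokens[i] == {"det"} and i + 1 < len(tokens) and "noun" in tokens[i + 1]:
            groups.append(("NP", i, i + 1))
            i += 2
        else:
            i += 1
    return groups
```

Because the two rule types are independent, the chunker works both on pruned input and on input that is still ambiguous or was disambiguated incorrectly upstream, which is the property the abstract emphasizes.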
15 citations
16 Feb 2003
TL;DR: The adequacy of the clustering method when applied to a syntactically tagged corpus, and the relevance of the semantic content of the resulting clusters, are evaluated.
Abstract: The context of this paper is the application of unsupervised Machine Learning techniques to building ontology extraction tools for Natural Language Processing. Our method relies on exploiting large amounts of linguistically annotated text, and on linguistic concepts such as selectional restrictions and co-composition.
We work with a corpus of medical texts in English. First we apply a shallow parser to the corpus to get subject-verb-object structures. We then extract verb-noun relations, and apply a clustering algorithm to them to build semantic classes of nouns. We have evaluated the adequacy of the clustering method when applied to a syntactically tagged corpus, and the relevance of the semantic content of the resulting clusters.
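The pipeline above (shallow parse → subject-verb-object triples → verb-noun relations → noun clusters) can be sketched as follows; the co-occurrence vectors and the greedy similarity grouping are illustrative stand-ins, since the specific clustering algorithm is not given here:

```python
from collections import defaultdict
from math import sqrt

def verb_noun_vectors(svo_triples):
    """Map each noun to a count vector over (role, verb) contexts."""
    vec = defaultdict(lambda: defaultdict(int))
    for subj, verb, obj in svo_triples:
        vec[subj][("subj", verb)] += 1
        vec[obj][("obj", verb)] += 1
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in set(u) & set(v))
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster_nouns(vec, threshold=0.5):
    """Greedy single-link grouping: a noun joins the first cluster
    containing a sufficiently similar noun, else starts a new one."""
    clusters = []
    for noun in vec:
        for c in clusters:
            if any(cosine(vec[noun], vec[m]) >= threshold for m in c):
                c.add(noun)
                break
        else:
            clusters.append({noun})
    return clusters
```

The intuition matches the selectional-restriction idea in the abstract: nouns that occur as arguments of the same verbs in the same roles end up in the same semantic class.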
15 citations
TL;DR: The structure of written Thai is highly ambiguous, requiring more sophisticated techniques than are necessary for comparable IE tasks in most European languages, along with large amounts of domain knowledge to cope with these ambiguities.
Abstract: The development of an information extraction (IE) system for Thai documents raises a number of issues which are not important for IE in English and other European languages. We describe the characteristics of written Thai, the problems they pose, and our approach to a Thai IE system. The structure of written Thai is highly ambiguous, which requires more sophisticated techniques than are necessary for comparable IE tasks in most European languages, and large amounts of domain knowledge to cope with these ambiguities. The basic characteristic of this system is that it provides different natural language components to assess the surface structure of the documents. These components include word segmentation, identification of specific lexical structure terms, and part-of-speech tagging. Further analysis performs shallow parsing over the relevant regions that contain the specific trigger terms or patterns specified in the extraction templates. Finally, the information of interest is extracted from the grammar trees according to predefined concept definitions, and the users are returned a list of answers for each concept.
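A minimal sketch of the trigger-driven step described above, assuming hypothetical trigger and slot patterns (the actual templates and concept definitions are not given in the abstract): only regions around a trigger are analyzed further, and slots are filled from those regions.

```python
import re

# Hypothetical extraction template: a trigger pattern plus slot
# patterns applied only inside the region around the trigger.
TEMPLATE = {
    "concept": "appointment",
    "trigger": re.compile(r"appointed"),
    "slots": {
        "person": re.compile(r"Mr\.\s+\w+"),
        "role": re.compile(r"\bas\s+(\w+\s?\w*)"),
    },
}

def extract(segments, template, window=1):
    """Analyze only segments near a trigger, then fill the slots."""
    answers = []
    for i, seg in enumerate(segments):
        if template["trigger"].search(seg):
            # the relevant region: the trigger segment plus neighbors
            region = " ".join(segments[max(0, i - window): i + window + 1])
            filled = {}
            for slot, pat in template["slots"].items():
                m = pat.search(region)
                if m:
                    filled[slot] = m.group(1) if m.groups() else m.group(0)
            answers.append((template["concept"], filled))
    return answers
```

The real system shallow-parses the trigger regions rather than applying regular expressions, but the control flow is the same: segment, locate triggers, analyze only the relevant regions, return one answer list per concept.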
15 citations
13 Sep 2000
TL;DR: This work produces tagging and chunking in a single process using an Integrated Language Model, formalized as Markov Models, that integrates several knowledge sources: lexical probabilities, a contextual Language Model for every chunk, and a contextual LM for the sentences.
Abstract: In this work, we present a stochastic approach to shallow parsing. Most of the current approaches to shallow parsing have a common characteristic: they take the sequence of lexical tags proposed by a POS tagger as input for the chunking process. Our system produces tagging and chunking in a single process using an Integrated Language Model (ILM) formalized as Markov Models. This model integrates several knowledge sources: lexical probabilities, a contextual Language Model (LM) for every chunk, and a contextual LM for the sentences. We have extended the ILM by adding lexical information to the contextual LMs. We have applied this approach to the CoNLL-2000 shared task, improving the performance of the chunker.
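The single-process idea can be sketched as Viterbi decoding over composite (POS tag, chunk label) states, so tagging and chunking decisions are made jointly rather than in a pipeline; the states and probabilities below are illustrative toy values, not the paper's trained ILM:

```python
from math import log

STATES = [("DT", "B-NP"), ("NN", "I-NP"), ("VB", "B-VP")]
EMIT = {  # P(word | state): stand-in for the lexical model
    ("the", ("DT", "B-NP")): 0.9,
    ("dog", ("NN", "I-NP")): 0.8,
    ("barks", ("VB", "B-VP")): 0.7,
}
TRANS = {  # P(state' | state): stand-in for the contextual LMs
    (("DT", "B-NP"), ("NN", "I-NP")): 0.9,
    (("NN", "I-NP"), ("VB", "B-VP")): 0.8,
}
START = {("DT", "B-NP"): 0.8, ("NN", "I-NP"): 0.1, ("VB", "B-VP"): 0.1}
SMALL = 1e-6  # probability floor for unseen events

def viterbi(words):
    """Return the best joint (tag, chunk) sequence for `words`."""
    probs = [{s: log(START[s]) + log(EMIT.get((words[0], s), SMALL))
              for s in STATES}]
    back = []
    for w in words[1:]:
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: probs[-1][p]
                       + log(TRANS.get((p, s), SMALL)))
            row[s] = (probs[-1][best] + log(TRANS.get((best, s), SMALL))
                      + log(EMIT.get((w, s), SMALL)))
            ptr[s] = best
        probs.append(row)
        back.append(ptr)
    state = max(STATES, key=lambda s: probs[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return list(reversed(path))
```

Because each state carries both a tag and a chunk label, one decoding pass yields the tagged and chunked sentence together, which is the contrast the abstract draws with tagger-then-chunker pipelines.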
15 citations