Topic

Shallow parsing

About: Shallow parsing (also called chunking) is a natural language processing task that identifies non-recursive phrase constituents, such as noun phrases, without building a full parse tree. Over the lifetime, 397 publications have been published within this topic, receiving 10,211 citations.


Papers
Journal ArticleDOI
TL;DR: Different strategies to improve a super-chunker based on Conditional Random Fields by combining it with a finite-state symbolic super-chunker driven by lexical and grammatical resources are presented.
Abstract: In this paper, we focus on chunking that includes contiguous multiword expression recognition, namely super-chunking. In particular, we present different strategies for improving a super-chunker based on Conditional Random Fields by combining it with a finite-state symbolic super-chunker driven by lexical and grammatical resources. We report a substantial gain of 7.6 points in overall accuracy.
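One way to combine a statistical chunker with a symbolic, lexicon-driven one is to let the lexical resource override the statistical labels wherever a known multiword expression matches. The toy sketch below illustrates that idea only; the lexicon entries, the `B-MWE`/`I-MWE` label names, and the override-always strategy are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: a symbolic MWE lexicon overrides statistical BIO chunk tags.
MWE_LEXICON = {("in", "spite", "of"), ("as", "well", "as")}  # toy lexical resource

def override_with_lexicon(tokens, crf_tags):
    """Replace statistical chunk tags wherever a lexicon MWE matches."""
    tags = list(crf_tags)
    for start in range(len(tokens)):
        for mwe in MWE_LEXICON:
            n = len(mwe)
            if tuple(t.lower() for t in tokens[start:start + n]) == mwe:
                tags[start:start + n] = ["B-MWE"] + ["I-MWE"] * (n - 1)
    return tags

tokens = ["He", "came", "in", "spite", "of", "the", "rain"]
crf_tags = ["B-NP", "B-VP", "B-PP", "B-NP", "B-PP", "B-NP", "I-NP"]
print(override_with_lexicon(tokens, crf_tags))
# -> ['B-NP', 'B-VP', 'B-MWE', 'I-MWE', 'I-MWE', 'B-NP', 'I-NP']
```

A real hybrid would also have to arbitrate conflicts between the two chunkers rather than always trusting the lexicon.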

2 citations

Proceedings ArticleDOI
24 Aug 2002
TL;DR: A parser for robust and flexible interpretation of user utterances in a multi-modal system for web search in newspaper databases that integrates shallow parsing techniques with knowledge-based text retrieval to allow for robust processing and coordination of input modes.
Abstract: We describe a parser for robust and flexible interpretation of user utterances in a multi-modal system for web search in newspaper databases. Users can speak or type, and they can navigate and follow links using mouse clicks. Spoken or written queries may combine search expressions with browser commands and search space restrictions. In interpreting input queries, the system has to be fault-tolerant to account for spontaneous speech phenomena as well as typing or speech recognition errors, which often distort the meaning of the utterance and are difficult to detect and correct. Our parser integrates shallow parsing techniques with knowledge-based text retrieval to allow for robust processing and coordination of input modes. Parsing relies on a two-layered approach: typical meta-expressions like those concerning search, newspaper types and dates are identified and excluded from the search string to be sent to the search engine. The search terms which are left after preprocessing are then grouped according to co-occurrence statistics which have been derived from a newspaper corpus. These co-occurrence statistics concern typical noun phrases as they appear in newspaper texts.
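The first layer of the two-layered approach can be pictured as pattern matching that peels meta-expressions off the query, leaving the bare search string. The sketch below is a minimal illustration under assumed patterns and newspaper names; the real system's meta-expression grammar is far richer.

```python
import re

# Hypothetical sketch: meta-expressions (newspaper names, dates) are matched
# first and removed; what remains is the search string for the engine.
META_PATTERNS = {
    "newspaper": re.compile(r"\bin the (Times|Guardian)\b", re.I),
    "date": re.compile(r"\bfrom (\d{4})\b"),
}

def split_query(query):
    """Separate meta-expressions from the remaining search terms."""
    meta = {}
    rest = query
    for name, pat in META_PATTERNS.items():
        m = pat.search(rest)
        if m:
            meta[name] = m.group(0)
            rest = pat.sub("", rest)
    return meta, " ".join(rest.split())

meta, terms = split_query("articles about climate policy in the Guardian from 2001")
print(meta)   # -> {'newspaper': 'in the Guardian', 'date': 'from 2001'}
print(terms)  # -> articles about climate policy
```

The second layer, grouping the remaining terms by corpus-derived co-occurrence statistics, would then operate on `terms` alone.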

2 citations

13 Sep 2005
TL;DR: This work uses dictionary definition sentences to extract ‘representative’ arguments of predicative definition words; e.g. ‘arrest’ is likely to take police as the subject and criminal as its object and an architecture of zero pronoun resolution using these representative arguments is described.
Abstract: We propose a method to alleviate the problem of referential granularity for Japanese zero pronoun resolution. We use dictionary definition sentences to extract ‘representative’ arguments of predicative definition words; e.g. ‘arrest’ is likely to take police as the subject and criminal as its object. These representative arguments are far more informative than ‘person’ that is provided by other valency dictionaries. They are automatically extracted using both shallow parsing and deep parsing for greater quality and quantity. Initial results are highly promising, yielding more specific information about selectional preferences. An architecture of zero pronoun resolution using these representative arguments is described.
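Once representative arguments are available, resolving a zero pronoun reduces to preferring the discourse candidate that matches the predicate's representative argument. The sketch below is a hypothetical illustration of that lookup, reusing the paper's own ‘arrest’ example; the data structure, the semantic-class labels, and the resolution function are assumptions.

```python
# Hypothetical sketch: representative arguments (extracted from dictionary
# definition sentences) act as selectional preferences for a zero subject.
REPRESENTATIVE_ARGS = {
    "arrest": {"subject": "police", "object": "criminal"},  # the paper's example
}

def resolve_zero_subject(predicate, candidates):
    """Pick the discourse candidate whose semantic class matches the
    predicate's representative subject; None if nothing matches."""
    pref = REPRESENTATIVE_ARGS.get(predicate, {}).get("subject")
    for candidate, semantic_class in candidates:
        if semantic_class == pref:
            return candidate
    return None

# "(phi) arrested the suspect" -- candidates come from the prior discourse
print(resolve_zero_subject("arrest", [("the mayor", "person"), ("the officers", "police")]))
# -> the officers
```

A full resolver would score candidates against several preferences (case frames, salience) instead of taking the first exact class match.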

2 citations

Proceedings Article
01 Dec 2006
TL;DR: A hybrid method for extracting Chinese noun phrase collocations that combines a statistical model with rule-based linguistic knowledge and a set of statistic-based association measures (AMs) as filters is presented.
Abstract: This paper presents a hybrid method for extracting Chinese noun phrase collocations that combines a statistical model with rule-based linguistic knowledge. The algorithm first extracts all the noun phrase collocations from a shallow parsed corpus by using syntactic knowledge in the form of phrase rules. It then removes pseudo collocations by using a set of statistics-based association measures (AMs) as filters. There are two main purposes for the design of this hybrid algorithm: (1) to maintain a reasonable recall while improving the precision, and (2) to investigate the proposed association measures on Chinese noun phrase collocations. The performance is compared with a pure statistical model and a pure rule-based method on a 60MB PoS tagged corpus. The experimental results show that the proposed hybrid method achieves a precision of 92.65% and recall of 47% based on 29 randomly selected noun headwords, compared with the precision of 78.87% and recall of 27.19% of a statistics-based extraction system. The F-score improvement is 55.7%.
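The filtering stage can be sketched with one common association measure, pointwise mutual information (PMI): rule-extracted candidate pairs are kept only when the measure exceeds a threshold. The counts, threshold, and the choice of PMI below are illustrative assumptions; the paper evaluates a set of association measures, not necessarily this one.

```python
from math import log2

# Hypothetical sketch of the statistical filter: candidate noun phrase pairs
# produced by phrase rules are kept only if their PMI clears a threshold.
def pmi(pair_count, w1_count, w2_count, total):
    """Pointwise mutual information of a word pair under corpus counts."""
    return log2((pair_count * total) / (w1_count * w2_count))

def filter_collocations(candidates, pair_counts, unigram_counts, total, threshold):
    return [p for p in candidates
            if pmi(pair_counts[p], unigram_counts[p[0]], unigram_counts[p[1]], total) >= threshold]

candidates = [("economic", "growth"), ("the", "growth")]
pair_counts = {("economic", "growth"): 30, ("the", "growth"): 40}
unigram_counts = {"economic": 50, "growth": 60, "the": 5000}

print(filter_collocations(candidates, pair_counts, unigram_counts, total=100000, threshold=5.0))
# -> [('economic', 'growth')]
```

The frequent-but-uninformative pair with "the" gets a low PMI (about 3.7 here) and is filtered out as a pseudo collocation, while the genuinely associated pair (PMI about 10) survives.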

2 citations

Book ChapterDOI
17 Sep 2014
TL;DR: This work describes an information extraction methodology based on shallow parsing that, instead of machine learning, uses predefined frame templates and vocabulary stored within a domain ontology whose elements are related to the frame templates.
Abstract: This work describes an information extraction methodology which uses shallow parsing. We present detailed information on the extraction process and the data structures used within that process, as well as an evaluation of the described method. The extraction is fully automatic. Instead of machine learning, it uses predefined frame templates and vocabulary stored within a domain ontology with elements related to the frame templates. The architecture of the information extractor is modular, and the main extraction module is capable of processing various languages when lexicalization for these languages is provided.
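The template-plus-ontology idea can be pictured as slot filling: each slot of a predefined frame names an ontology class, and shallow-parsed chunks are assigned to slots whose class vocabulary contains them. The sketch below is a minimal illustration with invented classes, frames, and chunks; it is not the paper's architecture.

```python
# Hypothetical sketch: frame slots name ontology classes; shallow-parsed
# chunks fill the first open slot whose class vocabulary contains them.
ONTOLOGY = {
    "Company": {"Acme Corp", "Globex"},
    "Product": {"widgets", "gears"},
}

def fill_frame(chunks, slots):
    """Map each chunk to the first unfilled slot whose ontology class lists it."""
    frame = {}
    for chunk in chunks:
        for slot, ontology_class in slots.items():
            if slot not in frame and chunk in ONTOLOGY[ontology_class]:
                frame[slot] = chunk
    return frame

slots = {"producer": "Company", "goods": "Product"}
print(fill_frame(["Acme Corp", "manufactures", "widgets"], slots))
# -> {'producer': 'Acme Corp', 'goods': 'widgets'}
```

Because matching is driven purely by the ontology vocabulary, swapping in a lexicalization for another language requires no retraining, which mirrors the modularity claim in the abstract.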

2 citations


Network Information
Related Topics (5)
Machine translation: 22.1K papers, 574.4K citations (81% related)
Natural language: 31.1K papers, 806.8K citations (79% related)
Language model: 17.5K papers, 545K citations (79% related)
Parsing: 21.5K papers, 545.4K citations (79% related)
Query language: 17.2K papers, 496.2K citations (74% related)
Performance Metrics
No. of papers in the topic in previous years:

Year  Papers
2021  7
2020  12
2019  6
2018  5
2017  11
2016  11