Topic
Shallow parsing
About: Shallow parsing is a research topic. Over its lifetime, 397 publications have appeared within this topic, receiving 10,211 citations.
Papers
TL;DR: Presents strategies for improving a super-chunker based on Conditional Random Fields by combining it with a finite-state symbolic super-chunker driven by lexical and grammatical resources.
Abstract: In this paper, we focus on chunking that includes contiguous multiword expression recognition, namely super-chunking. In particular, we present different strategies to improve a super-chunker based on Conditional Random Fields by combining it with a finite-state symbolic super-chunker driven by lexical and grammatical resources. We report a substantial gain of 7.6 points in overall accuracy.
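The chunking task the abstract refers to is usually framed as BIO labeling over POS-tagged tokens. A minimal rule-based sketch (not the paper's CRF or finite-state model; POS tags follow Penn Treebank conventions, and the NP rule is invented for illustration):

```python
# Minimal BIO-style chunking sketch: group POS-tagged tokens into
# non-recursive noun-phrase chunks. A run of determiners, adjectives,
# and nouns forms one NP; everything else is outside (O).

def bio_chunk(tagged):
    """Assign B-NP/I-NP/O labels to (token, pos) pairs."""
    labels = []
    in_np = False
    for tok, pos in tagged:
        if pos in ("DT", "JJ", "NN", "NNS"):
            labels.append("I-NP" if in_np else "B-NP")
            in_np = True
        else:
            labels.append("O")
            in_np = False
    return labels

sentence = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"), ("runs", "VBZ")]
print(bio_chunk(sentence))  # ['B-NP', 'I-NP', 'I-NP', 'O']
```

A statistical chunker such as a CRF learns these label transitions from data instead of hand-written tag sets, which is what the hybrid approach in the paper combines with symbolic resources.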
2 citations
24 Aug 2002
TL;DR: A parser for robust and flexible interpretation of user utterances in a multi-modal system for web search in newspaper databases that integrates shallow parsing techniques with knowledge-based text retrieval to allow for robust processing and coordination of input modes.
Abstract: We describe a parser for robust and flexible interpretation of user utterances in a multi-modal system for web search in newspaper databases. Users can speak or type, and they can navigate and follow links using mouse clicks. Spoken or written queries may combine search expressions with browser commands and search space restrictions. In interpreting input queries, the system has to be fault-tolerant to account for spontaneous speech phenomena as well as typing or speech recognition errors, which often distort the meaning of the utterance and are difficult to detect and correct. Our parser integrates shallow parsing techniques with knowledge-based text retrieval to allow for robust processing and coordination of input modes. Parsing relies on a two-layered approach: typical meta-expressions like those concerning search, newspaper types and dates are identified and excluded from the search string to be sent to the search engine. The search terms left after preprocessing are then grouped according to co-occurrence statistics derived from a newspaper corpus. These co-occurrence statistics concern typical noun phrases as they appear in newspaper texts.
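The first layer of the two-layered approach can be sketched as pattern-based stripping of meta-expressions before the residue goes to the search engine. All patterns below are invented for illustration; the second layer (co-occurrence grouping) is not shown:

```python
import re

# Layer 1 of a hypothetical two-layered query preprocessor: remove
# meta-expressions (search verbs, newspaper-name phrases, date-range
# restrictions) so only content terms remain as the search string.

META_PATTERNS = [
    r"\bsearch( for)?\b",
    r"\bin the [A-Z]\w+\b",          # crude newspaper-name pattern
    r"\bfrom \d{4}( to \d{4})?\b",   # date-range restriction
]

def strip_meta(query):
    """Return the query with meta-expressions removed."""
    for pat in META_PATTERNS:
        query = re.sub(pat, " ", query)
    return " ".join(query.split())

print(strip_meta("search for climate policy in the Guardian from 2000 to 2002"))
# climate policy
```

In the system described, the removed expressions are not discarded: they are interpreted as browser commands and search-space restrictions while only the remaining terms are sent to retrieval.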
2 citations
13 Sep 2005
TL;DR: This work uses dictionary definition sentences to extract ‘representative’ arguments of predicative definition words; e.g. ‘arrest’ is likely to take police as the subject and criminal as its object and an architecture of zero pronoun resolution using these representative arguments is described.
Abstract: We propose a method to alleviate the problem of referential granularity for Japanese zero pronoun resolution. We use dictionary definition sentences to extract 'representative' arguments of predicative definition words; e.g. 'arrest' is likely to take police as the subject and criminal as its object. These representative arguments are far more informative than the 'person' provided by other valency dictionaries. They are auto-extracted using both shallow parsing and deep parsing for greater quality and quantity. Initial results are highly promising, obtaining more specific information about selectional preferences. An architecture for zero pronoun resolution using these representative arguments is described.
2 citations
01 Dec 2006
TL;DR: A hybrid method for extracting Chinese noun phrase collocations that combines a statistical model with rule-based linguistic knowledge and a set of statistics-based association measures (AMs) as filters is presented.
Abstract: This paper presents a hybrid method for extracting Chinese noun phrase collocations that combines a statistical model with rule-based linguistic knowledge. The algorithm first extracts all noun phrase collocations from a shallow-parsed corpus by using syntactic knowledge in the form of phrase rules. It then removes pseudo collocations by using a set of statistics-based association measures (AMs) as filters. There are two main purposes for the design of this hybrid algorithm: (1) to maintain a reasonable recall while improving the precision, and (2) to investigate the proposed association measures on Chinese noun phrase collocations. The performance is compared with a pure statistical model and a pure rule-based method on a 60MB PoS-tagged corpus. The experimental results show that, on 29 randomly selected noun headwords, the proposed hybrid method achieves a precision of 92.65% and a recall of 47%, compared with a precision of 78.87% and a recall of 27.19% for a statistics-based extraction system. The F-score improvement is 55.7%.
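The filtering stage can be sketched with pointwise mutual information, a standard association measure of the kind the paper investigates (the specific AMs, counts, and threshold below are illustrative, not the paper's):

```python
import math

# Sketch of AM-based filtering: rule-extracted collocation candidates
# are kept only if their association score exceeds a threshold,
# discarding pseudo collocations such as function-word pairs.

def pmi(pair_count, w1_count, w2_count, total):
    """Pointwise mutual information of a word pair, in bits."""
    p_pair = pair_count / total
    p1, p2 = w1_count / total, w2_count / total
    return math.log2(p_pair / (p1 * p2))

# toy corpus statistics: (pair, pair_count, w1_count, w2_count)
candidates = [
    (("economic", "policy"), 50, 200, 300),
    (("the", "policy"), 60, 5000, 300),
]
TOTAL, THRESHOLD = 10000, 3.0
kept = [pair for pair, c, c1, c2 in candidates
        if pmi(c, c1, c2, TOTAL) >= THRESHOLD]
print(kept)  # [('economic', 'policy')]
```

The high-frequency pair with "the" co-occurs often in absolute terms but no more than chance predicts, so its PMI is low and the filter drops it, which is exactly the precision-improving role the AMs play in the hybrid pipeline.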
2 citations
17 Sep 2014
TL;DR: Describes an information extraction methodology that uses shallow parsing together with predefined frame templates and vocabulary stored within a domain ontology whose elements are related to the frame templates.
Abstract: This work describes an information extraction methodology that uses shallow parsing. We present detailed information on the extraction process and the data structures used within it, as well as an evaluation of the described method. The extraction is fully automatic. Instead of machine learning, it uses predefined frame templates and vocabulary stored within a domain ontology with elements related to the frame templates. The architecture of the information extractor is modular, and the main extraction module can process various languages when lexicalization for those languages is provided.
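The template-driven extraction the abstract describes can be sketched as slot filling: ontology vocabulary maps surface tokens to frame slots, with no learned model involved. The frame, vocabulary, and example sentence below are all invented for illustration:

```python
# Hypothetical sketch of frame-template extraction: each frame slot is
# filled by the first token whose ontology vocabulary entry matches it.

FRAME = {"event": None, "location": None}
VOCAB = {
    "event": {"concert", "lecture"},
    "location": {"Berlin", "Prague"},
}

def fill_frame(tokens):
    """Fill a fresh copy of FRAME from a token sequence."""
    frame = dict(FRAME)
    for tok in tokens:
        for slot, words in VOCAB.items():
            if tok in words and frame[slot] is None:
                frame[slot] = tok
    return frame

print(fill_frame("a concert will take place in Prague".split()))
# {'event': 'concert', 'location': 'Prague'}
```

Because the vocabulary lives in the ontology rather than in code, supporting a new language reduces to providing a lexicalization for the same ontology elements, which is the modularity claim the abstract makes.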
2 citations