Topic

Shallow parsing

About: Shallow parsing is a research topic. Over its lifetime, 397 publications have been published within this topic, receiving 10,211 citations.


Papers
Proceedings Article
06 Oct 2005
TL;DR: It is established that combining the output of context-free and finite-state parsers gives much better results than the previous best published results on several common tasks.
Abstract: In this paper, we look at comparing high-accuracy context-free parsers with high-accuracy finite-state (shallow) parsers on several shallow parsing tasks. We show that previously reported comparisons greatly under-estimated the performance of context-free parsers for these tasks. We also demonstrate that context-free parsers can train effectively on relatively little training data, and are more robust to domain shift for shallow parsing tasks than has been previously reported. Finally, we establish that combining the output of context-free and finite-state parsers gives much higher results than the previous-best published results, on several common tasks. While the efficiency benefit of finite-state models is inarguable, the results presented here show that the corresponding cost in accuracy is higher than previously thought.

22 citations
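For a concrete picture of what "combining the output" of several parsers can look like, below is a minimal sketch of one common strategy: per-token majority voting over BIO chunk tags. The voting scheme, tag set, and function names are illustrative assumptions, not the combination method used in the paper.

```python
from collections import Counter

def combine_chunkers(tag_sequences, fallback_index=0):
    """Per-token majority vote over BIO chunk tags from several chunkers.

    tag_sequences: list of equal-length lists of BIO tags, one per chunker.
    fallback_index: which chunker to trust on a tie (an assumption, not the
    paper's method).
    """
    combined = []
    for position, tags in enumerate(zip(*tag_sequences)):
        counts = Counter(tags)
        tag, freq = counts.most_common(1)[0]
        # On a tie, defer to the designated fallback chunker.
        if list(counts.values()).count(freq) > 1:
            tag = tag_sequences[fallback_index][position]
        combined.append(tag)
    return combined

# Example: chunk tags from a context-free parser (converted to chunks) and
# two finite-state chunkers for the fragment "He reckons the deficit".
cf_tags  = ["B-NP", "B-VP", "B-NP", "I-NP"]
fs1_tags = ["B-NP", "B-VP", "B-NP", "B-NP"]
fs2_tags = ["B-NP", "B-VP", "B-NP", "I-NP"]
print(combine_chunkers([cf_tags, fs1_tags, fs2_tags]))
# ['B-NP', 'B-VP', 'B-NP', 'I-NP']
```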

01 Jan 2007
TL;DR: This paper gives a complete account of the contest: how the data for the three languages was released, the performance of the participating systems, and an overview of the approaches followed for POS tagging and chunking.
Abstract: As part of the IJCAI workshop on "Shallow Parsing for South Asian Languages", a contest was held in which the participants trained and tested their shallow parsing systems for Hindi, Bengali and Telugu. This paper gives a complete account of the contest in terms of how the data for the three languages was released, the performances of the participating systems and an overview of the approaches followed for POS tagging and chunking. We finally give an analysis of the systems, which offers insights into directions for future research on shallow parsing for South Asian languages.

22 citations

Journal Article
TL;DR: A novel phrase chunking model based on a proposed mask method that, without employing external knowledge or multiple learners, automatically derives additional training examples from the original training data, significantly improving system performance.
Abstract: Automatic text chunking aims to recognize grammatical phrase structures in natural language text. Text chunking provides downstream syntactic information for further analysis, and is an important technology in text mining (TM) and natural language processing (NLP). Existing chunking systems make use of external knowledge, e.g. grammar parsers, or integrate multiple learners to achieve higher performance. However, such external knowledge is unavailable in many domains and languages. Besides, employing multiple learners not only complicates the system architecture but also increases training and testing time costs. In this paper, we present a novel phrase chunking model based on the proposed mask method, which employs neither external knowledge nor multiple learners. The mask method automatically derives additional training examples from the original training data, which significantly improves system performance. We evaluated our method on different chunking tasks and languages in comparison to previous studies. The experimental results show that our method achieves state-of-the-art performance in chunking tasks. In two English chunking tasks, i.e., shallow parsing and base-chunking, our method achieves F(β=1) rates of 94.22 and 93.23. When ported to Chinese, the F(β=1) rate is 92.30. Our chunker is also quite efficient: the complete chunking time for a 50K-word document is less than 10 seconds.

21 citations
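The F(β=1) scores quoted above are the standard chunk-level measure F(β=1) = 2PR/(P+R), where precision and recall are computed over whole chunks rather than individual tokens. The sketch below is a simplified, self-contained way to compute it from BIO tag sequences; it is an illustrative stand-in for the official CoNLL evaluation script, not the authors' code.

```python
def bio_to_chunks(tags):
    """Turn a BIO tag sequence into a set of (start, end, type) chunk spans."""
    chunks, start, ctype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last chunk
        if tag == "O" or tag.startswith("B-") or (tag.startswith("I-") and ctype != tag[2:]):
            if start is not None:
                chunks.add((start, i, ctype))
                start, ctype = None, None
        if tag.startswith("B-"):
            start, ctype = i, tag[2:]
        elif tag.startswith("I-") and start is None:  # tolerate I- after O
            start, ctype = i, tag[2:]
    return chunks

def f_beta1(gold_tags, pred_tags):
    """Chunk-level precision, recall and F(beta=1), CoNLL-style."""
    gold, pred = bio_to_chunks(gold_tags), bio_to_chunks(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = ["B-NP", "I-NP", "B-VP", "B-NP", "O"]
pred = ["B-NP", "I-NP", "B-VP", "B-NP", "B-NP"]
print(f_beta1(gold, pred))  # (0.75, 1.0, 0.857...)
```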

01 Jan 2007
TL;DR: A system is described that uses lexical shallow parsing to find adjectival “appraisal groups” in sentences, which convey a positive or negative appraisal of an item.
Abstract: We describe a system which uses lexical shallow parsing to find adjectival “appraisal groups” in sentences, which convey a positive or negative appraisal of an item. We used a simple heuristic to detect opinion holders, determining whether a person was being quoted in a specific sentence or not, and if so, who. We also explored the use of unsupervised learners and voting to increase our coverage.

20 citations
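As an illustration of the kind of lexical shallow parsing involved, the sketch below collects runs of adverbs and adjectives ending in an adjective as candidate appraisal groups and assigns a polarity from a tiny made-up lexicon. The grammar, lexicon, and the omission of negation handling are assumptions for illustration only, not the authors' system.

```python
# Assumed toy polarity lexicon; a real system would use a much larger resource.
POLARITY = {"good": "+", "great": "+", "bad": "-", "awful": "-"}

def appraisal_groups(tagged_tokens):
    """Collect maximal RB*/JJ runs that contain an adjective, with a polarity
    taken from the head (last) word of each group."""
    groups, current = [], []
    for word, pos in tagged_tokens:
        if pos in ("RB", "RBR", "RBS", "JJ", "JJR", "JJS"):
            current.append((word, pos))
        else:
            if any(p.startswith("JJ") for _, p in current):
                groups.append([w for w, _ in current])
            current = []
    if any(p.startswith("JJ") for _, p in current):
        groups.append([w for w, _ in current])
    return [(g, POLARITY.get(g[-1].lower(), "?")) for g in groups]

tagged = [("The", "DT"), ("plot", "NN"), ("was", "VBD"),
          ("not", "RB"), ("very", "RB"), ("good", "JJ"), (".", ".")]
print(appraisal_groups(tagged))
# [(['not', 'very', 'good'], '+')]  -- negation handling is deliberately omitted
```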

01 Jan 2008
TL;DR: The aim of this paper is to present the construction of a hybrid, three-stage named entity recognizer for Tamil that performs an in-place tagging task for a given Tamil document in three phases, namely shallow parsing, shallow semantic parsing and statistical processing.
Abstract: The aim of this paper is to present the construction of a hybrid, three-stage named entity recognizer for Tamil. Named entity recognition performs an in-place tagging task for a given Tamil document in three phases, namely shallow parsing, shallow semantic parsing and statistical processing. The E-M algorithm (HMM) is used in the statistical processing phase, with initial probabilities obtained from the shallow parsing phase, and a modification to the E-M algorithm deals with inputs from the shallow semantic parsing phase. This study concentrates on entity names (personal names, location names and organization names), temporal expressions (dates and times) and number expressions. Both NER tags and POS tags are used as the hidden variables in the E-M algorithm. The average F-value obtained from the system is 72.72% across the various entity types.

20 citations
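To make the statistical phase more concrete, the sketch below shows Viterbi decoding for a small HMM tagger, i.e. finding the most probable tag sequence once the model's probabilities are fixed. A full E-M (Baum-Welch) training loop and the shallow-parsing initialisation described above are outside its scope; the toy tag set, probabilities, and unknown-word fallback are illustrative assumptions.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Most probable tag sequence for `words` under a simple HMM.

    start_p[t], trans_p[prev][t], emit_p[t][word] are probabilities; unseen
    emissions fall back to the small constant `unk` (a simplifying assumption,
    standing in for probabilities initialised from the shallow-parsing phase).
    """
    def logp(p):
        return math.log(p) if p > 0 else float("-inf")

    V = [{t: logp(start_p.get(t, 0)) + logp(emit_p[t].get(words[0], unk)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        V.append({})
        back.append({})
        for t in tags:
            best_prev = max(tags, key=lambda p: V[i - 1][p] + logp(trans_p[p].get(t, 0)))
            V[i][t] = (V[i - 1][best_prev] + logp(trans_p[best_prev].get(t, 0))
                       + logp(emit_p[t].get(words[i], unk)))
            back[i][t] = best_prev
    # Trace back the best path from the highest-scoring final tag.
    last = max(tags, key=lambda t: V[-1][t])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))

# Toy model with two tags: PER (person name) and O (other).
tags = ["PER", "O"]
start_p = {"PER": 0.3, "O": 0.7}
trans_p = {"PER": {"PER": 0.4, "O": 0.6}, "O": {"PER": 0.2, "O": 0.8}}
emit_p = {"PER": {"kamal": 0.5}, "O": {"met": 0.3, "yesterday": 0.3}}
print(viterbi(["kamal", "met", "yesterday"], tags, start_p, trans_p, emit_p))
# ['PER', 'O', 'O']
```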


Network Information
Related Topics (5)
Machine translation: 22.1K papers, 574.4K citations (81% related)
Natural language: 31.1K papers, 806.8K citations (79% related)
Language model: 17.5K papers, 545K citations (79% related)
Parsing: 21.5K papers, 545.4K citations (79% related)
Query language: 17.2K papers, 496.2K citations (74% related)
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2021    7
2020    12
2019    6
2018    5
2017    11
2016    11