Topic

Shallow parsing

About: Shallow parsing is a research topic. Over its lifetime, 397 publications have been published within this topic, receiving 10,211 citations.


Papers
01 Jan 2005
TL;DR: This dissertation introduces a framework, inference with classifiers, for NLP problems that involve complex structured outputs, applies it to shallow parsing and semantic role labeling, and shows the significance of incorporating constraints into the inference stage as a way to correct and improve the decisions of the stand-alone classifiers.
Abstract: A large number of problems in natural language processing (NLP) involve outputs with complex structure. Conceptually in such problems, the task is to assign values to multiple variables which represent the outputs of several interdependent components. A natural approach to this task is to formulate it as a two-stage process. In the first stage, the variables are assigned initial values using machine learning based programs. In the second, an inference procedure uses the outcomes of the first stage classifiers along with domain specific constraints in order to infer a globally consistent final prediction. This dissertation introduces a framework, inference with classifiers, to study such problems. The framework is applied to two important and fundamental NLP problems that involve complex structured outputs, shallow parsing and semantic role labeling. In shallow parsing, the goal is to identify syntactic phrases in sentences, which has been found useful in a variety of large-scale NLP applications. Semantic role labeling is the task of identifying predicate-argument structure in sentences, a crucial step toward a deeper understanding of natural language. In both tasks, we develop state-of-the-art systems which have been used in practice. In this framework, we have shown the significance of incorporating constraints into the inference stage as a way to correct and improve the decisions of the stand-alone classifiers. Although it is clear that incorporating constraints into inference necessarily improves global coherency, there is no guarantee of improvement in performance measured in terms of the accuracy of the local predictions, the metric that is of interest for most applications. We develop a better theoretical understanding of this issue. Under a reasonable assumption, we prove a sufficient condition to guarantee that using constraints cannot degrade the performance with respect to Hamming loss. In addition, we provide an experimental study suggesting that constraints can improve performance even when the sufficient conditions are not fully satisfied.
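The two-stage process described in this abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the dissertation: the BIO tag set, the per-token probabilities standing in for first-stage classifier outputs, the single constraint (an I- tag may only follow a B- or I- tag of the same type), and the Viterbi-style search used as the inference procedure.

```python
import math

TAGS = ["B-NP", "I-NP", "O"]  # hypothetical tag set for noun-phrase chunking

def allowed(prev, cur):
    """Hypothetical constraint: I-X may only follow B-X or I-X."""
    if cur.startswith("I-"):
        return prev is not None and prev[0] in "BI" and prev[2:] == cur[2:]
    return True

def local_decode(scores):
    """Stage 1 only: take each classifier's best tag independently."""
    return [max(s, key=s.get) for s in scores]

def constrained_decode(scores):
    """Stage 2: best-scoring tag sequence that satisfies the constraint."""
    best = {t: (math.log(scores[0][t]), [t]) for t in TAGS if allowed(None, t)}
    for s in scores[1:]:
        nxt = {}
        for cur in TAGS:
            cands = [(v + math.log(s[cur]), path + [cur])
                     for prev, (v, path) in best.items() if allowed(prev, cur)]
            if cands:
                nxt[cur] = max(cands, key=lambda c: c[0])
        best = nxt
    return max(best.values(), key=lambda c: c[0])[1]

def hamming_loss(pred, gold):
    """Fraction of positions where the predicted tag differs from the gold tag."""
    return sum(p != g for p, g in zip(pred, gold)) / len(gold)

# Made-up classifier probabilities for the tokens "The dog barked".
scores = [
    {"B-NP": 0.2, "I-NP": 0.7, "O": 0.1},
    {"B-NP": 0.1, "I-NP": 0.8, "O": 0.1},
    {"B-NP": 0.1, "I-NP": 0.1, "O": 0.8},
]
gold = ["B-NP", "I-NP", "O"]
local = local_decode(scores)
constrained = constrained_decode(scores)
```

In this toy input the independent classifiers start the sentence with I-NP, which violates the constraint; the constrained search repairs that position, so the Hamming loss against `gold` drops from 1/3 to 0. The abstract is explicit that such an improvement is not guaranteed in general.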
Book Chapter
22 Oct 2012
TL;DR: A new method of reordering in phrase based statistical machine translation using shallow parsing and transformation rules to reorder the source sentence is presented.
Abstract: Reordering is of essential importance for phrase based statistical machine translation. In this paper, we present a new method of reordering in phrase based statistical machine translation. Inspired by [1], we follow a preprocessing approach to reordering: we use shallow parsing and transformation rules to reorder the source sentence. Experimental results on the English-Vietnamese pair show that our approach achieves significant improvements over MOSES, which is the state-of-the-art phrase based system.
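As a rough illustration of preprocessing reordering driven by shallow parsing, the sketch below applies one transformation rule to chunked source text before it would be passed to a translation system. The chunk format, the Penn Treebank style POS tags, and the rule itself (move adjectives to the end of a noun phrase) are hypothetical and are not taken from the paper.

```python
def reorder_np(tokens):
    """Hypothetical rule: inside an NP, move adjectives (JJ) after the other tokens."""
    adjectives = [t for t in tokens if t[1] == "JJ"]
    others = [t for t in tokens if t[1] != "JJ"]
    return others + adjectives

def reorder(chunks):
    """Apply the NP rule chunk by chunk and return the reordered word list."""
    out = []
    for label, tokens in chunks:
        if label == "NP":
            tokens = reorder_np(tokens)
        out.extend(word for word, _ in tokens)
    return out

# Assumed shallow-parser output: a list of (chunk label, [(word, POS), ...]).
chunks = [
    ("NP", [("She", "PRP")]),
    ("VP", [("bought", "VBD")]),
    ("NP", [("a", "DT"), ("red", "JJ"), ("car", "NN")]),
]
reordered = reorder(chunks)
```

Because the rule only permutes tokens inside a chunk, the chunk boundaries found by the shallow parser limit how far any word can move.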
Dissertation
01 Jan 2010
TL;DR: Because only 10 results are displayed per page, the page needs a pagination widget to browse across all the documents.
Abstract: widget. Since only 10 results are displayed per page, we also need a pagination widget to browse across all the documents.
Proceedings Article
01 Dec 2015
TL;DR: This approach combines the benefits of heuristic rules, a large unlabelled corpus, and supervised learning to model complex underlying characteristics of noun phrase occurrences, and it outperforms the baselines it is compared against.
Abstract: Information Extraction from Indian languages requires effective shallow parsing, especially identification of “meaningful” noun phrases. Particularly, for an agglutinative and free word order language like Marathi, this problem is quite challenging. We model this task of extracting noun phrases as a sequence labelling problem. A Distant Supervision framework is used to automatically create a large labelled dataset for training the sequence labelling model. The framework exploits a set of heuristic rules based on corpus statistics for the automatic labelling. Our approach combines the benefits of heuristic rules, a large unlabelled corpus, and supervised learning to model complex underlying characteristics of noun phrase occurrences. Our method performs better than both a simple English-like chunking baseline and a publicly available Marathi Shallow Parser.
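To make the sequence labelling formulation concrete, the sketch below turns a heuristic rule into BIO labels that could serve as automatically created training data. The rule (a maximal run of tokens whose POS tag is in a fixed set counts as a noun phrase), the tag set, and the English example are all hypothetical; the paper's rules are based on corpus statistics and target Marathi.

```python
NP_POS = {"DT", "JJ", "NN", "NNS"}  # hypothetical set of tags allowed inside an NP

def heuristic_bio(tagged):
    """Label each (word, POS) pair as B-NP, I-NP, or O using the rule above."""
    labels = []
    inside = False
    for _, pos in tagged:
        if pos in NP_POS:
            labels.append("I-NP" if inside else "B-NP")
            inside = True
        else:
            labels.append("O")
            inside = False
    return labels

tagged = [("the", "DT"), ("old", "JJ"), ("temple", "NN"),
          ("stands", "VBZ"), ("near", "IN"), ("rivers", "NNS")]
labels = heuristic_bio(tagged)
```

Labels produced this way are noisy by design; in a distant supervision setup they are the input to a supervised sequence labelling model rather than the final output.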

Network Information
Related Topics (5)
Machine translation: 22.1K papers, 574.4K citations, 81% related
Natural language: 31.1K papers, 806.8K citations, 79% related
Language model: 17.5K papers, 545K citations, 79% related
Parsing: 21.5K papers, 545.4K citations, 79% related
Query language: 17.2K papers, 496.2K citations, 74% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2021    7
2020    12
2019    6
2018    5
2017    11
2016    11