Topic

Shallow parsing

About: Shallow parsing is a research topic. Over its lifetime, 397 publications have been published within this topic, receiving 10,211 citations.


Papers
01 Jan 2002
TL;DR: A text processing system that uses shallow parsing techniques to extract information from sentences in text documents and store frames of information in a knowledge base, approaching more complete text understanding in a practical way that avoids expensive processing such as full parsing of the documents.
Abstract: The system described in this paper automatically extracts and stores information from documents. We have implemented a text processing system that uses shallow parsing techniques to extract information from sentences in text documents and stores frames of information in a knowledge base. We intend to use this system in two main application areas: open domain Question & Answering (Q&A) and specific domain information extraction. Extraction from Documents The system described in this paper uses a Natural Language Processing system developed at the Center for Natural Language Processing to extract information from documents and store it in a knowledge base. In the past, applications were aimed at MUC-style information extraction that filled in templates of specific types of information. Our current goal is to produce a system that can extract generic frames of information about all entities and events in the sentences of the text and represent relationships between them. This type of system is approaching more complete text understanding in a practical way that does not require expensive processing such as full parsing of the documents. The heart of the generic extraction system is a set of rules written for a finite-state system that recognizes the patterns of text. These rules are applied in several phases including part-of-speech tagging, bracketing of noun phrases, and categorization of proper noun phrases. Later phases recognize the surface structure of phrases in each sentence and map the phrases to the case frame of the verbs, recognizing the phrases taking the roles of agent, object, point-in-time, etc., and creating a frame representing an “event”. The case roles are similar to those in case grammars (Fillmore 1968). Consider the example sentence: In addition to these most recent incidents, the Abu Sayyaf have bought Russian uranium on Basilan Island.
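The phased pipeline described above (tagging, noun-phrase bracketing, then mapping chunks onto case roles) can be illustrated with a small sketch. This is not the paper's finite-state rule set: it uses NLTK's off-the-shelf tagger and a regular-expression chunker, and the role-assignment heuristic (pre-verbal NP as agent, first post-verbal NP as object, PPs as modifiers) is an assumption for illustration only.

```python
import nltk

# One-time model downloads (assumed available):
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

GRAMMAR = r"""
  NP: {<DT|JJ|NNP|NN.*>+}    # bracket simple noun phrases
  PP: {<IN><NP>}             # prepositional phrases over chunked NPs
  VP: {<VB.*>+<NP|PP>*}      # verb group plus its complements
"""
chunker = nltk.RegexpParser(GRAMMAR)

def extract_event_frame(sentence):
    """Tag, chunk, and map chunks to rough case roles around the main verb."""
    tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
    tree = chunker.parse(tagged)
    frame = {"agent": None, "event": None, "object": None, "modifiers": []}
    for subtree in tree.subtrees(lambda t: t.label() in ("NP", "VP", "PP")):
        phrase = " ".join(tok for tok, _ in subtree.leaves())
        if subtree.label() == "VP" and frame["event"] is None:
            # Keep only the verb group as the event trigger.
            frame["event"] = " ".join(tok for tok, tag in subtree.leaves()
                                      if tag.startswith("VB"))
        elif subtree.label() == "NP" and frame["event"] is None and frame["agent"] is None:
            frame["agent"] = phrase            # NP before the verb -> agent
        elif subtree.label() == "NP" and frame["event"] is not None and frame["object"] is None:
            frame["object"] = phrase           # first NP after the verb -> object
        elif subtree.label() == "PP":
            frame["modifiers"].append(phrase)  # location / time adjuncts
    return frame

print(extract_event_frame(
    "The Abu Sayyaf have bought Russian uranium on Basilan Island."))
```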

4 citations

Proceedings Article
01 Jan 2006
TL;DR: A probability model is proposed to score the confidence of protein-protein interactions based on both text mining results and gene expression profiles, and experimental results are presented to show the feasibility of this framework.
Abstract: Protein-protein interactions, the associations of protein molecules, are crucial for many biological functions. Since most knowledge about them remains hidden in biological publications, there is an increasing focus on mining information from the vast biological literature such as MedLine. Many approaches, such as pattern matching, shallow parsing and deep parsing, have been proposed to automatically extract protein-protein interaction information from text sources, though with limited success. Moreover, to the best of our knowledge, none of the existing approaches performs automatic validation of the mining results. In this paper, we describe a novel framework in which text mining results are automatically validated using knowledge mined from gene expression profiles. A probability model is proposed to score the confidence of protein-protein interactions based on both text mining results and gene expression profiles. Experimental results are presented to show the feasibility of this framework.
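As a rough illustration of combining the two evidence sources, the sketch below blends a normalized text-mining score with the co-expression of the two proteins. The logistic blend, its weights, and the function names are assumptions for illustration; the paper's actual probability model is not reproduced here.

```python
import numpy as np

def interaction_confidence(text_score, expr_a, expr_b,
                           w_text=2.0, w_expr=2.0, bias=-2.0):
    """Score a candidate protein-protein interaction (illustrative only).

    text_score: normalized literature-mining evidence in [0, 1]
                (e.g. fraction of extracted sentences asserting the interaction).
    expr_a, expr_b: expression profiles of the two proteins across conditions.
    """
    # Co-expression evidence: absolute Pearson correlation of the two profiles.
    expr_score = abs(np.corrcoef(expr_a, expr_b)[0, 1])
    # Simple logistic combination of the two evidence sources (assumed weights).
    z = w_text * text_score + w_expr * expr_score + bias
    return 1.0 / (1.0 + np.exp(-z))

# Example: strong textual evidence, moderately correlated expression profiles.
a = np.array([0.1, 0.4, 0.9, 1.2, 0.8])
b = np.array([0.2, 0.5, 0.7, 1.1, 0.9])
print(round(interaction_confidence(0.8, a, b), 3))
```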

4 citations

Proceedings ArticleDOI
20 Jun 2008
TL;DR: This paper proposes a new method to detect and resolve zero pronouns in Chinese text that integrates automatic main verb identification, verbal logic valence and a machine learning approach, and demonstrates that this zero pronoun identification and resolution method works effectively.
Abstract: This paper proposes a new method to detect and resolve zero pronouns in Chinese text that integrates automatic main verb identification, verbal logic valence and a machine learning approach. Zero pronoun recognition is treated as the problem of finding missing logical arguments of verbs. First, based on automatic main verb identification, syntax hierarchies are analysed. Second, combining the syntax hierarchy with verbal logic valence theory, zero pronouns are identified. Then, using a machine learning approach, zero pronouns are resolved. Experimental results on 150 news articles indicate that the precision and recall of zero pronoun detection are 72.9% and 92.7% respectively, and the accuracy of antecedent estimation is 64.3%. These results demonstrate that the zero pronoun identification and resolution method works effectively.
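The detection step, finding verbs whose required logical arguments are missing, can be sketched roughly as below. The toy valence lexicon and the clause representation are hypothetical; the paper's syntax-hierarchy analysis and machine-learned resolution step are not shown.

```python
# Hypothetical valence lexicon: how many logical arguments each verb requires.
VERB_VALENCE = {
    "买": 2,    # "buy": requires agent + object
    "下雨": 0,  # "rain": weather verb, no logical subject
}

def detect_zero_pronoun(clause):
    """clause: dict with the main 'verb' and the overt argument roles found by
    shallow parsing, e.g. {'verb': '买', 'overt_roles': ['object']}."""
    required = VERB_VALENCE.get(clause["verb"], 1)
    has_subject = "subject" in clause["overt_roles"]
    # A zero pronoun is posited when the verb needs a subject but none is overt.
    return required >= 1 and not has_subject

print(detect_zero_pronoun({"verb": "买", "overt_roles": ["object"]}))  # True
print(detect_zero_pronoun({"verb": "下雨", "overt_roles": []}))         # False
```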

4 citations

01 Jan 2003
TL;DR: A corpus-based shallow parsing approach to syntactic analysis of natural language from the perspective of practicality and economy is adopted, and a new memory-based algorithm for learning shallow syntax, called rote sequence learning, is contributed.
Abstract: For natural language applications to become widespread, they must be both practical and economical. Practicality demands that systems are robust and efficient enough to handle realistic input. Economy demands that systems are inexpensive to construct and maintain. This dissertation explores syntactic analysis of natural language from the perspective of practicality and economy. We adopt a corpus-based shallow parsing approach to syntactic analysis. Shallow parsing addresses practicality by avoiding difficult attachment decisions and by employing simple, efficient algorithms. Corpus-based language learning addresses economy by applying machine learning techniques to develop language processing components. In particular, we contributed a new memory-based algorithm for learning shallow syntax, called rote sequence learning. Our experiments demonstrate that rote sequence learning achieves comparable performance to other, more complex, shallow parsing methods. Moreover, rote sequence learning possesses a number of desirable properties, including simplicity, efficiency, and portability. To support rote sequence learning, we developed algorithms for pruning bad rules from the grammar, for incorporating arbitrary additional information into the grammar using statistical models, and for determining the best parse among all possible parses. Rote sequence learning addresses the practicality requirement for shallow parsing. To address economy, we investigated learning strategies that allow the machine learner to manipulate its training setting. The goal of these strategies is to reduce the cost of training by reducing the number of examples needed and/or by reducing the cost of assembling the examples. In particular, active learning allows the learner to select training examples and ask the human teacher for answers, and weakly supervised learning allows the learner to guess at answers to some of the examples on its own. In experiments with these two strategies, we discovered interesting behaviors of each. Finally, we contributed a new learning strategy, cooperative learning, that combines the best aspects of active and weakly supervised learning.
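A bare-bones reading of the "rote" idea, memorizing tag sequences together with their chunk labelings and replaying them at test time, might look like the sketch below. This illustrates memory-based sequence learning in general, not the dissertation's algorithm, its pruning, or its statistical back-off.

```python
from collections import Counter, defaultdict

class RoteChunker:
    """Memorize POS-tag sequences and their IOB chunk labelings verbatim."""

    def __init__(self):
        # Maps a tuple of POS tags to a counter over IOB label sequences.
        self.memory = defaultdict(Counter)

    def train(self, sentences):
        """sentences: iterable of (pos_tags, iob_labels) pairs."""
        for pos_tags, iob_labels in sentences:
            self.memory[tuple(pos_tags)][tuple(iob_labels)] += 1

    def chunk(self, pos_tags):
        """Replay the most frequent memorized labeling, or all 'O' if unseen."""
        candidates = self.memory.get(tuple(pos_tags))
        if candidates:
            return list(candidates.most_common(1)[0][0])
        return ["O"] * len(pos_tags)

chunker = RoteChunker()
chunker.train([(["DT", "NN", "VBZ"], ["B-NP", "I-NP", "B-VP"])])
print(chunker.chunk(["DT", "NN", "VBZ"]))  # ['B-NP', 'I-NP', 'B-VP']
```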

4 citations

Proceedings Article
01 Jan 2008
TL;DR: A shallow parsing formalism aimed at machine translation between closely related languages, allowing grammar rules to be written that help to (partially) disambiguate chunks in input sentences.
Abstract: This paper describes a shallow parsing formalism aimed at machine translation between closely related languages. The formalism allows grammar rules to be written that help to (partially) disambiguate chunks in input sentences. The chunks are then translated into the target language without any deep syntactic or semantic processing. A stochastic ranker then selects the best translation according to a target language model. Results obtained for Czech and Slovak are presented.
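The overall pipeline, translating chunks independently and letting a target-side language model pick among the alternatives, can be caricatured as below. The chunk table, the unigram scorer standing in for the language model, and all names are placeholders, not the paper's resources.

```python
from itertools import product

CHUNK_TABLE = {            # hypothetical source chunk -> target candidates
    "chunk_a": ["tgt_a1", "tgt_a2"],
    "chunk_b": ["tgt_b1"],
}

TARGET_LM = {              # hypothetical target-side scores (higher = better)
    "tgt_a1": 0.2, "tgt_a2": 0.7, "tgt_b1": 0.5,
}

def translate(chunks):
    """Enumerate chunk-by-chunk translations, rank them by the target LM."""
    candidates = product(*(CHUNK_TABLE[c] for c in chunks))

    def score(sentence):
        return sum(TARGET_LM.get(tok, 0.0) for tok in sentence)

    return max(candidates, key=score)

print(translate(["chunk_a", "chunk_b"]))  # ('tgt_a2', 'tgt_b1')
```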

4 citations


Network Information
Related Topics (5)

Topic                  Papers    Citations    Related
Machine translation    22.1K     574.4K       81%
Natural language       31.1K     806.8K       79%
Language model         17.5K     545K         79%
Parsing                21.5K     545.4K       79%
Query language         17.2K     496.2K       74%
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2021    7
2020    12
2019    6
2018    5
2017    11
2016    11