
Showing papers by "Mihai Surdeanu published in 2010"


Proceedings Article
09 Oct 2010
TL;DR: This work proposes a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision; the approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora.
Abstract: Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier's entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sieve-based approaches could be applied to other NLP tasks.
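The tiered sieve described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the two toy tiers (exact string match, head-word match) and the mention dictionaries are hypothetical stand-ins for the paper's deterministic models.

```python
def exact_match_sieve(mention, antecedents):
    """Highest-precision tier: link only on an exact string match."""
    for ant in antecedents:
        if mention["text"].lower() == ant["text"].lower():
            return ant
    return None

def head_match_sieve(mention, antecedents):
    """Lower-precision tier: link on a matching head word."""
    for ant in antecedents:
        if mention["head"] == ant["head"]:
            return ant
    return None

def run_sieve(mentions, tiers):
    """Apply tiers one at a time, from highest to lowest precision.
    Each tier sees the cluster assignments left by earlier, stronger tiers,
    so weaker features never override stronger ones."""
    cluster_of = {m["id"]: m["id"] for m in mentions}  # start as singletons
    for tier in tiers:
        for i, m in enumerate(mentions):
            if cluster_of[m["id"]] != m["id"]:
                continue  # already linked by an earlier (higher-precision) tier
            ant = tier(m, mentions[:i])
            if ant is not None:
                cluster_of[m["id"]] = cluster_of[ant["id"]]
    return cluster_of

mentions = [
    {"id": 0, "text": "Barack Obama", "head": "Obama"},
    {"id": 1, "text": "Obama", "head": "Obama"},
    {"id": 2, "text": "Barack Obama", "head": "Obama"},
]
clusters = run_sieve(mentions, [exact_match_sieve, head_match_sieve])
```

Mention 2 is linked by the exact-match tier; mention 1 is only linked later, by the weaker head-match tier, once the stronger tier has had its turn. A full system would also propagate attributes (gender, number) within each cluster, which this sketch omits.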

389 citations


Proceedings Article
02 Jun 2010
TL;DR: This study shows that fast and accurate ensemble parsers can be built with minimal effort, and that the simplest scoring model for re-parsing (unweighted voting) performs essentially as well as more complex models.
Abstract: Previous work on dependency parsing used various kinds of combination models but a systematic analysis and comparison of these approaches is lacking. In this paper we implemented such a study for English dependency parsing and find several non-obvious facts: (a) the diversity of base parsers is more important than complex models for learning (e.g., stacking, supervised meta-classification), (b) approximate, linear-time re-parsing algorithms guarantee well-formed dependency trees without significant performance loss, and (c) the simplest scoring model for re-parsing (unweighted voting) performs essentially as well as other more complex models. This study proves that fast and accurate ensemble parsers can be built with minimal effort.
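The unweighted-voting scheme the abstract singles out can be sketched as follows. This is a hedged illustration, not the paper's code: the three base-parser outputs are hypothetical, and the re-parsing step that guarantees a well-formed tree is omitted.

```python
from collections import Counter

def vote_heads(predictions):
    """Unweighted voting over per-token heads: each token keeps the head
    proposed by the largest number of base parsers. Plain voting does not
    by itself guarantee a well-formed dependency tree; the approximate,
    linear-time re-parsing step discussed in the paper (omitted here)
    handles that."""
    n_tokens = len(predictions[0])
    combined = []
    for tok in range(n_tokens):
        votes = Counter(parse[tok] for parse in predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined

# Head indices for a 4-token sentence from three hypothetical base parsers
# (one head per token; 0 marks the root).
p1 = [2, 0, 2, 3]
p2 = [2, 0, 2, 2]
p3 = [3, 0, 2, 3]
combined = vote_heads([p1, p2, p3])  # majority head for each token
```

Note how diversity matters more than the combination model: wherever two of the three parsers agree, their shared head wins regardless of any weighting.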

90 citations


Journal Article
TL;DR: Describes the design and implementation of the slot filling system prepared by Stanford’s natural language processing group for the 2010 Knowledge Base Population track at the Text Analysis Conference (TAC); the system attained the median rank among all participating systems.
Abstract: This paper describes the design and implementation of the slot filling system prepared by Stanford’s natural language processing group for the 2010 Knowledge Base Population (KBP) track at the Text Analysis Conference (TAC). Our system relies on a simple distant supervision approach using mainly resources furnished by the track organizers: we used slot examples from the provided knowledge base, which we mapped to documents from several corpora, i.e., those distributed by the organizers, Wikipedia, and web snippets. Our implementation attained the median rank among all participating systems.
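The distant supervision idea described above can be sketched in a few lines. This is a simplified illustration under an assumed data shape (entity, slot, value) triples and raw sentence strings, not the Stanford system itself.

```python
def generate_training_data(kb_tuples, sentences):
    """Distant supervision sketch: any sentence that mentions both the
    entity and the slot value from a known KB tuple is taken as a (noisy)
    positive training example for that slot. Real systems add entity
    matching, negative sampling, and noise reduction on top of this."""
    examples = []
    for entity, slot, value in kb_tuples:
        for sent in sentences:
            if entity in sent and value in sent:
                examples.append((sent, entity, slot, value))
    return examples

# Hypothetical KB slot example and candidate sentences.
kb = [("Barack Obama", "per:spouse", "Michelle Obama")]
docs = [
    "Barack Obama married Michelle Obama in 1992.",
    "Barack Obama was born in Hawaii.",
]
train = generate_training_data(kb, docs)
```

Only the first sentence contains both the entity and the slot value, so it becomes the lone (noisy) positive example; a classifier is then trained on features of such sentences.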

64 citations


Proceedings Article
02 Jun 2010
TL;DR: This work incorporates Selectional Preferences (SP) into a Semantic Role (SR) Classification system and shows that the inclusion of the refined SPs yields statistically significant improvements on both in-domain and out-of-domain data.
Abstract: This work incorporates Selectional Preferences (SP) into a Semantic Role (SR) Classification system. We learn separate selectional preferences for noun phrases and prepositional phrases and we integrate them in a state-of-the-art SR classification system both in the form of features and individual class predictors. We show that the inclusion of the refined SPs yields statistically significant improvements on both in-domain and out-of-domain data (14.07% and 11.67% error reduction, respectively). The key factor for success is the combination of several SP methods with the original classification model using meta-classification.
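The meta-classification step named as the key factor can be sketched as a weighted combination of base predictions. This is a hedged toy version: the labels, the base predictors, and the fixed weights are all hypothetical, and a real meta-classifier would learn its combination function from held-out data.

```python
def meta_classify(base_predictions, weights):
    """Meta-classification sketch: combine the original SR classifier's
    prediction with those of several selectional-preference predictors by
    summing per-predictor weights for each proposed label and returning
    the highest-scoring label. (Weights here are fixed for illustration;
    a real system learns the combination.)"""
    scores = {}
    for label, w in zip(base_predictions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

# Hypothetical predictions for one argument: the original SR model,
# a noun-phrase SP predictor, and a prepositional-phrase SP predictor.
base = ["ARG0", "ARG1", "ARG0"]
weights = [0.5, 0.3, 0.2]
label = meta_classify(base, weights)
```

The original model's vote carries the most weight, but the SP predictors can jointly overturn it when they agree, which is how the combined system improves over any single component.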

18 citations