Proceedings Article

Distant Supervision for Relation Extraction with an Incomplete Knowledge Base

Bonan Min1, Ralph Grishman1, Li Wan1, Chang Wang2, David C. Gondek2 
01 Jun 2013-pp 777-782
TL;DR: It is shown that a significant number of “negative” examples generated by the labeling process are false negatives because the knowledge base is incomplete; therefore, the heuristic for generating negative examples has a serious flaw.
Abstract: Distant supervision, heuristically labeling a corpus using a knowledge base, has emerged as a popular choice for training relation extractors. In this paper, we show that a significant number of “negative” examples generated by the labeling process are false negatives because the knowledge base is incomplete. Therefore the heuristic for generating negative examples has a serious flaw. Building on a state-of-the-art distantly-supervised extraction algorithm, we propose an algorithm that learns from only positive and unlabeled labels at the pair-of-entity level. Experimental results demonstrate its advantage over existing algorithms.
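The labeling heuristic and its flaw can be illustrated with a toy sketch (all entity pairs, relation names, and the missing fact below are invented for illustration, not taken from the paper's data):

```python
# A deliberately incomplete knowledge base of known relation triples.
kb = {
    ("Barack Obama", "Honolulu"): "born_in",
    ("Steve Jobs", "Apple"): "founder_of",
    # Missing: ("Bill Gates", "Seattle") -> born_in (a true fact the KB lacks)
}

# Entity pairs that co-occur in some corpus sentence.
corpus_pairs = [
    ("Barack Obama", "Honolulu"),
    ("Steve Jobs", "Apple"),
    ("Bill Gates", "Seattle"),   # true fact, but absent from the KB
]

def heuristic_label(pair):
    """Standard DS heuristic: pairs found in the KB are positive examples;
    every other co-occurring pair is labeled 'negative'."""
    return kb.get(pair, "negative")

labels = {pair: heuristic_label(pair) for pair in corpus_pairs}
# ("Bill Gates", "Seattle") is labeled "negative" even though the relation
# holds -- a false negative caused by KB incompleteness. The paper's remedy
# is to treat unmatched pairs as *unlabeled* rather than negative:
pu_labels = {pair: kb.get(pair, "unlabeled") for pair in corpus_pairs}
```

The second dictionary reflects the positive-and-unlabeled view at the entity-pair level: matched pairs stay positive, and unmatched pairs carry no label rather than a (possibly wrong) negative one.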
Citations
Proceedings ArticleDOI
24 Aug 2014
TL;DR: The Knowledge Vault is a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories that computes calibrated probabilities of fact correctness.
Abstract: Recent years have witnessed a proliferation of large-scale knowledge bases, including Wikipedia, Freebase, YAGO, Microsoft's Satori, and Google's Knowledge Graph. To increase the scale even further, we need to explore automatic methods for constructing knowledge bases. Previous approaches have primarily focused on text-based extraction, which can be very noisy. Here we introduce Knowledge Vault, a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories. We employ supervised machine learning methods for fusing these distinct information sources. The Knowledge Vault is substantially bigger than any previously published structured knowledge repository, and features a probabilistic inference system that computes calibrated probabilities of fact correctness. We report the results of multiple studies that explore the relative utility of the different information sources and extraction methods.

1,657 citations


Cites methods from "Distant Supervision for Relation Ex..."


  • ...There are more sophisticated methods for training models that don’t make this assumption (such as [28, 36]), but we leave the integration of such methods into KV to future work....


Journal ArticleDOI
17 Jul 2015-Science
TL;DR: This work describes successes and challenges in this rapidly advancing area of natural language processing, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services.
Abstract: Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today’s researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area.

859 citations


Cites background from "Distant Supervision for Relation Ex..."

  • ...More recent systems have used increasingly sophisticated probabilistic inference to discern which textual clauses map to which facts in the knowledge base, or to something else entirely (40, 41)....


Proceedings ArticleDOI
01 Jul 2015
TL;DR: This work replaces this large pattern set with a few patterns for canonically structured sentences, and shifts the focus to a classifier which learns to extract self-contained clauses from longer sentences to determine the maximally specific arguments for each candidate triple.
Abstract: Relation triples produced by open domain information extraction (open IE) systems are useful for question answering, inference, and other IE tasks. Traditionally these are extracted using a large set of patterns; however, this approach is brittle on out-of-domain text and long-range dependencies, and gives no insight into the substructure of the arguments. We replace this large pattern set with a few patterns for canonically structured sentences, and shift the focus to a classifier which learns to extract self-contained clauses from longer sentences. We then run natural logic inference over these short clauses to determine the maximally specific arguments for each candidate triple. We show that our approach outperforms a state-of-the-art open IE system on the end-to-end TAC-KBP 2013 Slot Filling task.

704 citations


Cites background from "Distant Supervision for Relation Ex..."

  • ...This arises from the incomplete negatives problem in distantly supervised relation extraction (Min et al., 2013): since our knowledge base is not exhaustive, we cannot be sure if an extracted relation is incorrect or correct but previously unknown....


Posted Content
TL;DR: A new algorithm MINERVA is proposed, which addresses the much more difficult and practical task of answering questions where the relation is known, but only one entity, and significantly outperforms prior methods.
Abstract: Knowledge bases (KB), both automatically and manually constructed, are often incomplete --- many valid facts can be inferred from the KB by synthesizing existing information. A popular approach to KB completion is to infer new relations by combinatory reasoning over the information found along other paths connecting a pair of entities. Given the enormous size of KBs and the exponential number of paths, previous path-based models have considered only the problem of predicting a missing relation given two entities or evaluating the truth of a proposed triple. Additionally, these methods have traditionally used random paths between fixed entity pairs or more recently learned to pick paths between them. We propose a new algorithm MINERVA, which addresses the much more difficult and practical task of answering questions where the relation is known, but only one entity. Since random walks are impractical in a setting with combinatorially many destinations from a start node, we present a neural reinforcement learning approach which learns how to navigate the graph conditioned on the input query to find predictive paths. Empirically, this approach obtains state-of-the-art results on several datasets, significantly outperforming prior methods.

367 citations


Cites background from "Distant Supervision for Relation Ex..."

  • ...KBs are highly incomplete [22], and facts not directly stored in a KB can often be inferred from those that are, creating exciting opportunities and challenges for automated reasoning....



Proceedings ArticleDOI
07 Apr 2014
TL;DR: A way to leverage existing Web-search-based question-answering technology to fill in the gaps in knowledge bases in a targeted way by learning the best set of queries to ask, such that the answer snippets returned by the search engine are most likely to contain the correct value for that attribute.
Abstract: Over the past few years, massive amounts of world knowledge have been accumulated in publicly available knowledge bases, such as Freebase, NELL, and YAGO. Yet despite their seemingly huge size, these knowledge bases are greatly incomplete. For example, over 70% of people included in Freebase have no known place of birth, and 99% have no known ethnicity. In this paper, we propose a way to leverage existing Web-search-based question-answering technology to fill in the gaps in knowledge bases in a targeted way. In particular, for each entity attribute, we learn the best set of queries to ask, such that the answer snippets returned by the search engine are most likely to contain the correct value for that attribute. For example, if we want to find Frank Zappa's mother, we could ask the query `who is the mother of Frank Zappa'. However, this is likely to return `The Mothers of Invention', which was the name of his band. Our system learns that it should (in this case) add disambiguating terms, such as Zappa's place of birth, in order to make it more likely that the search results contain snippets mentioning his mother. Our system also learns how many different queries to ask for each attribute, since in some cases, asking too many can hurt accuracy (by introducing false positives). We discuss how to aggregate candidate answers across multiple queries, ultimately returning probabilistic predictions for possible values for each attribute. Finally, we evaluate our system and show that it is able to extract a large number of facts with high confidence.
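The query-selection and answer-aggregation ideas above can be sketched as follows. Everything here is illustrative: the search function is a stand-in with canned snippets (not a real API), and the mother's name is a hypothetical value used only to show the training signal.

```python
from collections import Counter

def simulated_search(query):
    """Stand-in for a web search engine: returns canned answer snippets."""
    canned = {
        "who is the mother of Frank Zappa":
            ["The Mothers of Invention", "Rose Marie Zappa"],
        "who is the mother of Frank Zappa born in Baltimore":
            ["Rose Marie Zappa"],
    }
    return canned.get(query, [])

def snippet_precision(query, known_value):
    """Training signal: fraction of snippets containing the known true value."""
    snippets = simulated_search(query)
    return sum(known_value in s for s in snippets) / max(len(snippets), 1)

queries = [
    "who is the mother of Frank Zappa",
    "who is the mother of Frank Zappa born in Baltimore",
]
# Learn which query to prefer: the disambiguated query scores higher because
# none of its snippets mention the band name.
best = max(queries, key=lambda q: snippet_precision(q, "Rose Marie Zappa"))

# At extraction time, aggregate candidate answers across queries by voting.
votes = Counter(s for q in queries for s in simulated_search(q))
```

With real search results, the precision scores would be estimated over many training entities per attribute, and the vote counts would be turned into calibrated probabilities as the abstract describes.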

311 citations

References
Proceedings ArticleDOI
02 Aug 2009
TL;DR: This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Abstract: Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic database of several thousand relations, to provide distant supervision. For each pair of entities that appears in some Freebase relation, we find all sentences containing those entities in a large unlabeled corpus and extract textual features to train a relation classifier. Our algorithm combines the advantages of supervised IE (combining 400,000 noisy pattern features in a probabilistic classifier) and unsupervised IE (extracting large numbers of relations from large corpora of any domain). Our model is able to extract 10,000 instances of 102 relations at a precision of 67.6%. We also analyze feature performance, showing that syntactic parse features are particularly helpful for relations that are ambiguous or lexically distant in their expression.
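The pipeline described in this abstract can be reduced to a few lines. This is a hedged sketch with a toy KB and corpus; real feature extraction uses lexical and syntactic features, but here it is simplified to the words between the two entity mentions:

```python
from collections import defaultdict

# Toy stand-in for Freebase: entity pairs with their known relation.
kb = {("Steve Jobs", "Apple"): "founder_of",
      ("Barack Obama", "Honolulu"): "born_in"}

# Toy unlabeled corpus (pre-tokenized, space-separated).
corpus = [
    "Steve Jobs founded Apple in a garage .",
    "Barack Obama was born in Honolulu .",
    "Steve Jobs returned to Apple in 1997 .",
]

def features(sentence, e1, e2):
    """Simplified feature extractor: words between the two mentions."""
    toks = sentence.split()
    i, j = toks.index(e1.split()[-1]), toks.index(e2.split()[-1])
    lo, hi = min(i, j), max(i, j)
    return toks[lo + 1:hi]

# For each KB pair, collect features from every sentence mentioning both
# entities; these (feature -> relation) examples would then train a
# multiclass relation classifier.
training = defaultdict(list)
for (e1, e2), rel in kb.items():
    for sent in corpus:
        if e1 in sent and e2 in sent:
            training[rel].extend(features(sent, e1, e2))
```

Note how features are pooled across all sentences mentioning a pair, which is exactly the aggregation that later multi-instance work (Riedel et al., 2010; Hoffmann et al., 2011) revisits.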

2,965 citations


"Distant Supervision for Relation Ex..." refers background in this paper

  • ...Since then, it has gained popularity (Mintz et al., 2009; Bunescu and Mooney, 2007; Wu and Weld, 2007; Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012; Nguyen and Moschitti, 2011)....


  • ...Recently, Distant Supervision (DS) (Craven and Kumlien, 1999; Mintz et al., 2009) has emerged to be a popular choice for training relation extractors without using manually labeled data....


  • ...Mintz++ is a strong baseline (Surdeanu et al., 2012) and an improved version of Mintz et al. (2009)....



Book ChapterDOI
20 Sep 2010
TL;DR: A novel approach to distant supervision that can alleviate the problem of noisy patterns that hurt precision by using a factor graph and applying constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in the authors' training KB.
Abstract: Several recent works on relation extraction have been applying the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as source of supervision. Crucially, these approaches are trained based on the assumption that each sentence which mentions the two related entities is an expression of the given relation. Here we argue that this leads to noisy patterns that hurt precision, in particular if the knowledge base is not directly related to the text we are working with. We present a novel approach to distant supervision that can alleviate this problem based on the following two ideas: First, we use a factor graph to explicitly model the decision whether two entities are related, and the decision whether this relation is mentioned in a given sentence; second, we apply constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in our training KB. We apply our approach to extract relations from the New York Times corpus and use Freebase as knowledge base. When compared to a state-of-the-art approach for relation extraction under distant supervision, we achieve 31% error reduction.

1,304 citations

Proceedings Article
19 Jun 2011
TL;DR: A novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts is presented.
Abstract: Information extraction (IE) holds the promise of generating a large-scale knowledge base from the Web's natural language text. Knowledge-based weak supervision, using structured data to heuristically label a training corpus, works towards this goal by enabling the automated learning of a potentially unbounded number of relation extractors. Recently, researchers have developed multi-instance learning algorithms to combat the noisy training data that can come from heuristic labeling, but their models assume relations are disjoint --- for example they cannot extract the pair Founded(Jobs, Apple) and CEO-of(Jobs, Apple). This paper presents a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts. We apply our model to learn extractors for NY Times text using weak supervision from Free-base. Experiments show that the approach runs quickly and yields surprising gains in accuracy, at both the aggregate and sentence level.
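The corpus-level aggregation idea is easy to make concrete. This is an illustrative sketch of the core mechanism only (not the actual learned model): per-sentence relation predictions for one entity pair are combined with a deterministic OR, so a pair can carry several relations at once, e.g. both Founded(Jobs, Apple) and CEO-of(Jobs, Apple):

```python
def aggregate(sentence_predictions):
    """Corpus-level fact set for an entity pair: a relation holds if at
    least one sentence in the bag is predicted to express it
    (deterministic-OR aggregation)."""
    facts = set()
    for preds in sentence_predictions:  # relations predicted for one sentence
        facts.update(preds)
    facts.discard(None)                 # None = sentence expresses no relation
    return facts

# Three sentences mentioning the pair (Jobs, Apple):
bag = [{"founder_of"}, {None}, {"ceo_of"}]
print(aggregate(bag))  # both overlapping relations survive aggregation
```

Because the OR is taken over relations rather than forcing a single label per pair, overlapping relations are preserved, which is exactly the disjointness restriction the abstract says earlier multi-instance models could not handle.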

1,015 citations


"Distant Supervision for Relation Ex..." refers background or methods in this paper


  • ...Previous approaches (Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012) bypassed this problem by heavily under-sampling the “negative” class....


  • ...Since then, it has gained popularity (Mintz et al., 2009; Bunescu and Mooney, 2007; Wu and Weld, 2007; Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012; Nguyen and Moschitti, 2011)....



  • ...MultiR (Hoffmann et al., 2011) and Multi-Instance Multi-Label (MIML) learning (Surdeanu et al., 2012) further improve it to support multiple relations expressed by different sentences in a bag....


Proceedings Article
12 Jul 2012
TL;DR: This work proposes a novel approach to multi-instance multi-label learning for RE, which jointly models all the instances of a pair of entities in text and all their labels using a graphical model with latent variables that performs competitively on two difficult domains.
Abstract: Distant supervision for relation extraction (RE) -- gathering training data by aligning a database of facts with text -- is an efficient approach to scale RE to thousands of different relations. However, this introduces a challenging learning scenario where the relation expressed by a pair of entities found in a sentence is unknown. For example, a sentence containing Balzac and France may express BornIn or Died, an unknown relation, or no relation at all. Because of this, traditional supervised learning, which assumes that each example is explicitly mapped to a label, is not appropriate. We propose a novel approach to multi-instance multi-label learning for RE, which jointly models all the instances of a pair of entities in text and all their labels using a graphical model with latent variables. Our model performs competitively on two difficult domains.

770 citations


"Distant Supervision for Relation Ex..." refers background or methods in this paper

  • ...MultiR (Hoffmann et al., 2011) and Multi-Instance Multi-Label (MIML) learning (Surdeanu et al., 2012) further improve it to support multiple relations expressed by different sentences in a bag....


  • ...Previous approaches (Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012) bypassed this problem by heavily under-sampling the “negative” class....


  • ...We randomly picked 200 unlabeled bags from each of the two datasets (Riedel et al., 2010; Surdeanu et al., 2012) generated by DS, and we manually annotated all relation mentions in these bags....


  • ...Mintz++ is a strong baseline (Surdeanu et al., 2012) and an improved version of Mintz et al. (2009)....


  • ...Since then, it has gained popularity (Mintz et al., 2009; Bunescu and Mooney, 2007; Wu and Weld, 2007; Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012; Nguyen and Moschitti, 2011)....


Proceedings ArticleDOI
25 Jun 2005
TL;DR: This paper investigates the incorporation of diverse lexical, syntactic and semantic knowledge in feature-based relation extraction using SVM and illustrates that the base phrase chunking information is very effective for relation extraction and contributes to most of the performance improvement from syntactic aspect while additional information from full parsing gives limited further enhancement.
Abstract: Extracting semantic relationships between entities is challenging. This paper investigates the incorporation of diverse lexical, syntactic and semantic knowledge in feature-based relation extraction using SVM. Our study illustrates that the base phrase chunking information is very effective for relation extraction and contributes to most of the performance improvement from syntactic aspect while additional information from full parsing gives limited further enhancement. This suggests that most of useful information in full parse trees for relation extraction is shallow and can be captured by chunking. We also demonstrate how semantic information such as WordNet and Name List, can be used in feature-based relation extraction to further improve the performance. Evaluation on the ACE corpus shows that effective incorporation of diverse features enables our system outperform previously best-reported systems on the 24 ACE relation subtypes and significantly outperforms tree kernel-based systems by over 20 in F-measure on the 5 ACE relation types.

759 citations


"Distant Supervision for Relation Ex..." refers background in this paper

  • ...Relation Extraction is a well-studied problem (Miller et al., 2000; Zhou et al., 2005; Kambhatla, 2004; Min et al., 2012a)....
