Proceedings Article

Distant Supervision for Relation Extraction with an Incomplete Knowledge Base

Bonan Min1, Ralph Grishman1, Li Wan1, Chang Wang2, David C. Gondek2 
01 Jun 2013-pp 777-782
TL;DR: It is shown that a significant number of “negative” examples generated by the labeling process are false negatives because the knowledge base is incomplete; therefore, the heuristic for generating negative examples has a serious flaw.
Abstract: Distant supervision, heuristically labeling a corpus using a knowledge base, has emerged as a popular choice for training relation extractors. In this paper, we show that a significant number of “negative” examples generated by the labeling process are false negatives because the knowledge base is incomplete. Therefore the heuristic for generating negative examples has a serious flaw. Building on a state-of-the-art distantly-supervised extraction algorithm, we propose an algorithm that learns from only positive and unlabeled labels at the pair-of-entity level. Experimental results demonstrate its advantage over existing algorithms.
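The labeling heuristic and its flaw can be illustrated with a toy sketch (all entity pairs, relation names, and the missing fact below are invented for illustration, not taken from the paper's data):

```python
# A deliberately incomplete knowledge base of known relation triples.
kb = {
    ("Barack Obama", "Honolulu"): "born_in",
    ("Steve Jobs", "Apple"): "founder_of",
    # Missing: ("Bill Gates", "Seattle") -> born_in (a true fact the KB lacks)
}

# Entity pairs that co-occur in some corpus sentence.
corpus_pairs = [
    ("Barack Obama", "Honolulu"),
    ("Steve Jobs", "Apple"),
    ("Bill Gates", "Seattle"),   # true fact, but absent from the KB
]

def heuristic_label(pair):
    """Standard DS heuristic: pairs found in the KB are positive examples;
    every other co-occurring pair is labeled 'negative'."""
    return kb.get(pair, "negative")

labels = {pair: heuristic_label(pair) for pair in corpus_pairs}
# ("Bill Gates", "Seattle") is labeled "negative" even though the relation
# holds -- a false negative caused by KB incompleteness. The paper's remedy
# is to treat unmatched pairs as *unlabeled* rather than negative:
pu_labels = {pair: kb.get(pair, "unlabeled") for pair in corpus_pairs}
```

The second dictionary reflects the positive-and-unlabeled view at the entity-pair level: matched pairs stay positive, and unmatched pairs carry no label rather than a (possibly wrong) negative one.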
Citations
Proceedings ArticleDOI
24 Aug 2014
TL;DR: The Knowledge Vault is a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories that computes calibrated probabilities of fact correctness.
Abstract: Recent years have witnessed a proliferation of large-scale knowledge bases, including Wikipedia, Freebase, YAGO, Microsoft's Satori, and Google's Knowledge Graph. To increase the scale even further, we need to explore automatic methods for constructing knowledge bases. Previous approaches have primarily focused on text-based extraction, which can be very noisy. Here we introduce Knowledge Vault, a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories. We employ supervised machine learning methods for fusing these distinct information sources. The Knowledge Vault is substantially bigger than any previously published structured knowledge repository, and features a probabilistic inference system that computes calibrated probabilities of fact correctness. We report the results of multiple studies that explore the relative utility of the different information sources and extraction methods.

1,657 citations


Cites methods from "Distant Supervision for Relation Ex..."


  • ...There are more sophisticated methods for training models that don’t make this assumption (such as [28, 36]), but we leave the integration of such methods into KV to future work....


Journal ArticleDOI
17 Jul 2015-Science
TL;DR: This work describes successes and challenges in this rapidly advancing area of natural language processing, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services.
Abstract: Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today’s researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area.

859 citations


Cites background from "Distant Supervision for Relation Ex..."

  • ...More recent systems have used increasingly sophisticated probabilistic inference to discern which textual clauses map to which facts in the knowledge base, or to something else entirely (40, 41)....


Proceedings ArticleDOI
01 Jul 2015
TL;DR: This work replaces this large pattern set with a few patterns for canonically structured sentences, and shifts the focus to a classifier which learns to extract self-contained clauses from longer sentences to determine the maximally specific arguments for each candidate triple.
Abstract: Relation triples produced by open domain information extraction (open IE) systems are useful for question answering, inference, and other IE tasks. Traditionally these are extracted using a large set of patterns; however, this approach is brittle on out-of-domain text and long-range dependencies, and gives no insight into the substructure of the arguments. We replace this large pattern set with a few patterns for canonically structured sentences, and shift the focus to a classifier which learns to extract self-contained clauses from longer sentences. We then run natural logic inference over these short clauses to determine the maximally specific arguments for each candidate triple. We show that our approach outperforms a state-of-the-art open IE system on the end-to-end TAC-KBP 2013 Slot Filling task.

704 citations


Cites background from "Distant Supervision for Relation Ex..."

  • ...This arises from the incomplete negatives problem in distantly supervised relation extraction (Min et al., 2013): since our knowledge base is not exhaustive, we cannot be sure if an extracted relation is incorrect or correct but previously unknown....


Posted Content
TL;DR: A new algorithm MINERVA is proposed, which addresses the much more difficult and practical task of answering questions where the relation is known, but only one entity, and significantly outperforms prior methods.
Abstract: Knowledge bases (KB), both automatically and manually constructed, are often incomplete --- many valid facts can be inferred from the KB by synthesizing existing information. A popular approach to KB completion is to infer new relations by combinatory reasoning over the information found along other paths connecting a pair of entities. Given the enormous size of KBs and the exponential number of paths, previous path-based models have considered only the problem of predicting a missing relation given two entities or evaluating the truth of a proposed triple. Additionally, these methods have traditionally used random paths between fixed entity pairs or more recently learned to pick paths between them. We propose a new algorithm MINERVA, which addresses the much more difficult and practical task of answering questions where the relation is known, but only one entity. Since random walks are impractical in a setting with combinatorially many destinations from a start node, we present a neural reinforcement learning approach which learns how to navigate the graph conditioned on the input query to find predictive paths. Empirically, this approach obtains state-of-the-art results on several datasets, significantly outperforming prior methods.

367 citations


Cites background from "Distant Supervision for Relation Ex..."

  • ...KBs are highly incomplete [22], and facts not directly stored in a KB can often be inferred from those that are, creating exciting opportunities and challenges for automated reasoning....



Proceedings ArticleDOI
07 Apr 2014
TL;DR: A way to leverage existing Web-search-based question-answering technology to fill in the gaps in knowledge bases in a targeted way by learning the best set of queries to ask, such that the answer snippets returned by the search engine are most likely to contain the correct value for that attribute.
Abstract: Over the past few years, massive amounts of world knowledge have been accumulated in publicly available knowledge bases, such as Freebase, NELL, and YAGO. Yet despite their seemingly huge size, these knowledge bases are greatly incomplete. For example, over 70% of people included in Freebase have no known place of birth, and 99% have no known ethnicity. In this paper, we propose a way to leverage existing Web-search-based question-answering technology to fill in the gaps in knowledge bases in a targeted way. In particular, for each entity attribute, we learn the best set of queries to ask, such that the answer snippets returned by the search engine are most likely to contain the correct value for that attribute. For example, if we want to find Frank Zappa's mother, we could ask the query `who is the mother of Frank Zappa'. However, this is likely to return `The Mothers of Invention', which was the name of his band. Our system learns that it should (in this case) add disambiguating terms, such as Zappa's place of birth, in order to make it more likely that the search results contain snippets mentioning his mother. Our system also learns how many different queries to ask for each attribute, since in some cases, asking too many can hurt accuracy (by introducing false positives). We discuss how to aggregate candidate answers across multiple queries, ultimately returning probabilistic predictions for possible values for each attribute. Finally, we evaluate our system and show that it is able to extract a large number of facts with high confidence.
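The query-selection and answer-aggregation ideas above can be sketched as follows. Everything here is illustrative: the search function is a stand-in with canned snippets (not a real API), and the mother's name is a hypothetical value used only to show the training signal.

```python
from collections import Counter

def simulated_search(query):
    """Stand-in for a web search engine: returns canned answer snippets."""
    canned = {
        "who is the mother of Frank Zappa":
            ["The Mothers of Invention", "Rose Marie Zappa"],
        "who is the mother of Frank Zappa born in Baltimore":
            ["Rose Marie Zappa"],
    }
    return canned.get(query, [])

def snippet_precision(query, known_value):
    """Training signal: fraction of snippets containing the known true value."""
    snippets = simulated_search(query)
    return sum(known_value in s for s in snippets) / max(len(snippets), 1)

queries = [
    "who is the mother of Frank Zappa",
    "who is the mother of Frank Zappa born in Baltimore",
]
# Learn which query to prefer: the disambiguated query scores higher because
# none of its snippets mention the band name.
best = max(queries, key=lambda q: snippet_precision(q, "Rose Marie Zappa"))

# At extraction time, aggregate candidate answers across queries by voting.
votes = Counter(s for q in queries for s in simulated_search(q))
```

With real search results, the precision scores would be estimated over many training entities per attribute, and the vote counts would be turned into calibrated probabilities as the abstract describes.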

311 citations

References
Proceedings ArticleDOI
02 Aug 2009
TL;DR: This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Abstract: Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic database of several thousand relations, to provide distant supervision. For each pair of entities that appears in some Freebase relation, we find all sentences containing those entities in a large unlabeled corpus and extract textual features to train a relation classifier. Our algorithm combines the advantages of supervised IE (combining 400,000 noisy pattern features in a probabilistic classifier) and unsupervised IE (extracting large numbers of relations from large corpora of any domain). Our model is able to extract 10,000 instances of 102 relations at a precision of 67.6%. We also analyze feature performance, showing that syntactic parse features are particularly helpful for relations that are ambiguous or lexically distant in their expression.
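The pipeline described in this abstract can be reduced to a few lines. This is a hedged sketch with a toy KB and corpus; real feature extraction uses lexical and syntactic features, but here it is simplified to the words between the two entity mentions:

```python
from collections import defaultdict

# Toy stand-in for Freebase: entity pairs with their known relation.
kb = {("Steve Jobs", "Apple"): "founder_of",
      ("Barack Obama", "Honolulu"): "born_in"}

# Toy unlabeled corpus (pre-tokenized, space-separated).
corpus = [
    "Steve Jobs founded Apple in a garage .",
    "Barack Obama was born in Honolulu .",
    "Steve Jobs returned to Apple in 1997 .",
]

def features(sentence, e1, e2):
    """Simplified feature extractor: words between the two mentions."""
    toks = sentence.split()
    i, j = toks.index(e1.split()[-1]), toks.index(e2.split()[-1])
    lo, hi = min(i, j), max(i, j)
    return toks[lo + 1:hi]

# For each KB pair, collect features from every sentence mentioning both
# entities; these (feature -> relation) examples would then train a
# multiclass relation classifier.
training = defaultdict(list)
for (e1, e2), rel in kb.items():
    for sent in corpus:
        if e1 in sent and e2 in sent:
            training[rel].extend(features(sent, e1, e2))
```

Note how features are pooled across all sentences mentioning a pair, which is exactly the aggregation that later multi-instance work (Riedel et al., 2010; Hoffmann et al., 2011) revisits.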

2,965 citations


"Distant Supervision for Relation Ex..." refers background in this paper

  • ...Since then, it has gained popularity (Mintz et al., 2009; Bunescu and Mooney, 2007; Wu and Weld, 2007; Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012; Nguyen and Moschitti, 2011)....


  • ...Recently, Distant Supervision (DS) (Craven and Kumlien, 1999; Mintz et al., 2009) has emerged to be a popular choice for training relation extractors without using manually labeled data....


  • ...Mintz++ is a strong baseline (Surdeanu et al., 2012) and an improved version of Mintz et al. (2009)....



Book ChapterDOI
20 Sep 2010
TL;DR: A novel approach to distant supervision that can alleviate the problem of noisy patterns that hurt precision by using a factor graph and applying constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in the authors' training KB.
Abstract: Several recent works on relation extraction have been applying the distant supervision paradigm: instead of relying on annotated text to learn how to predict relations, they employ existing knowledge bases (KBs) as source of supervision. Crucially, these approaches are trained based on the assumption that each sentence which mentions the two related entities is an expression of the given relation. Here we argue that this leads to noisy patterns that hurt precision, in particular if the knowledge base is not directly related to the text we are working with. We present a novel approach to distant supervision that can alleviate this problem based on the following two ideas: First, we use a factor graph to explicitly model the decision whether two entities are related, and the decision whether this relation is mentioned in a given sentence; second, we apply constraint-driven semi-supervision to train this model without any knowledge about which sentences express the relations in our training KB. We apply our approach to extract relations from the New York Times corpus and use Freebase as knowledge base. When compared to a state-of-the-art approach for relation extraction under distant supervision, we achieve 31% error reduction.

1,304 citations

Proceedings Article
19 Jun 2011
TL;DR: A novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts is presented.
Abstract: Information extraction (IE) holds the promise of generating a large-scale knowledge base from the Web's natural language text. Knowledge-based weak supervision, using structured data to heuristically label a training corpus, works towards this goal by enabling the automated learning of a potentially unbounded number of relation extractors. Recently, researchers have developed multi-instance learning algorithms to combat the noisy training data that can come from heuristic labeling, but their models assume relations are disjoint --- for example they cannot extract the pair Founded(Jobs, Apple) and CEO-of(Jobs, Apple). This paper presents a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts. We apply our model to learn extractors for NY Times text using weak supervision from Free-base. Experiments show that the approach runs quickly and yields surprising gains in accuracy, at both the aggregate and sentence level.
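The corpus-level aggregation idea is easy to make concrete. This is an illustrative sketch of the core mechanism only (not the actual learned model): per-sentence relation predictions for one entity pair are combined with a deterministic OR, so a pair can carry several relations at once, e.g. both Founded(Jobs, Apple) and CEO-of(Jobs, Apple):

```python
def aggregate(sentence_predictions):
    """Corpus-level fact set for an entity pair: a relation holds if at
    least one sentence in the bag is predicted to express it
    (deterministic-OR aggregation)."""
    facts = set()
    for preds in sentence_predictions:  # relations predicted for one sentence
        facts.update(preds)
    facts.discard(None)                 # None = sentence expresses no relation
    return facts

# Three sentences mentioning the pair (Jobs, Apple):
bag = [{"founder_of"}, {None}, {"ceo_of"}]
print(aggregate(bag))  # both overlapping relations survive aggregation
```

Because the OR is taken over relations rather than forcing a single label per pair, overlapping relations are preserved, which is exactly the disjointness restriction the abstract says earlier multi-instance models could not handle.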

1,015 citations


"Distant Supervision for Relation Ex..." refers background or methods in this paper


  • ...Previous approaches (Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012) bypassed this problem by heavily under-sampling the “negative” class....


  • ...Since then, it has gained popularity (Mintz et al., 2009; Bunescu and Mooney, 2007; Wu and Weld, 2007; Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012; Nguyen and Moschitti, 2011)....



  • ...MultiR (Hoffmann et al., 2011) and Multi-Instance Multi-Label (MIML) learning (Surdeanu et al., 2012) further improve it to support multiple relations expressed by different sentences in a bag....


Proceedings Article
12 Jul 2012
TL;DR: This work proposes a novel approach to multi-instance multi-label learning for RE, which jointly models all the instances of a pair of entities in text and all their labels using a graphical model with latent variables that performs competitively on two difficult domains.
Abstract: Distant supervision for relation extraction (RE) -- gathering training data by aligning a database of facts with text -- is an efficient approach to scale RE to thousands of different relations. However, this introduces a challenging learning scenario where the relation expressed by a pair of entities found in a sentence is unknown. For example, a sentence containing Balzac and France may express BornIn or Died, an unknown relation, or no relation at all. Because of this, traditional supervised learning, which assumes that each example is explicitly mapped to a label, is not appropriate. We propose a novel approach to multi-instance multi-label learning for RE, which jointly models all the instances of a pair of entities in text and all their labels using a graphical model with latent variables. Our model performs competitively on two difficult domains.

770 citations


"Distant Supervision for Relation Ex..." refers background or methods in this paper

  • ...MultiR (Hoffmann et al., 2011) and Multi-Instance Multi-Label (MIML) learning (Surdeanu et al., 2012) further improve it to support multiple relations expressed by different sentences in a bag....


  • ...Previous approaches (Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012) bypassed this problem by heavily under-sampling the “negative” class....


  • ...We randomly picked 200 unlabeled bags from each of the two datasets (Riedel et al., 2010; Surdeanu et al., 2012) generated by DS, and we manually annotated all relation mentions in these bags....


  • ...Mintz++ is a strong baseline (Surdeanu et al., 2012) and an improved version of Mintz et al. (2009)....


  • ...Since then, it has gained popularity (Mintz et al., 2009; Bunescu and Mooney, 2007; Wu and Weld, 2007; Riedel et al., 2010; Hoffmann et al., 2011; Surdeanu et al., 2012; Nguyen and Moschitti, 2011)....


Proceedings ArticleDOI
25 Jun 2005
TL;DR: This paper investigates the incorporation of diverse lexical, syntactic and semantic knowledge in feature-based relation extraction using SVM and illustrates that the base phrase chunking information is very effective for relation extraction and contributes to most of the performance improvement from syntactic aspect while additional information from full parsing gives limited further enhancement.
Abstract: Extracting semantic relationships between entities is challenging. This paper investigates the incorporation of diverse lexical, syntactic and semantic knowledge in feature-based relation extraction using SVM. Our study illustrates that the base phrase chunking information is very effective for relation extraction and contributes to most of the performance improvement from syntactic aspect while additional information from full parsing gives limited further enhancement. This suggests that most of useful information in full parse trees for relation extraction is shallow and can be captured by chunking. We also demonstrate how semantic information such as WordNet and Name List, can be used in feature-based relation extraction to further improve the performance. Evaluation on the ACE corpus shows that effective incorporation of diverse features enables our system outperform previously best-reported systems on the 24 ACE relation subtypes and significantly outperforms tree kernel-based systems by over 20 in F-measure on the 5 ACE relation types.

759 citations


"Distant Supervision for Relation Ex..." refers background in this paper

  • ...Relation Extraction is a well-studied problem (Miller et al., 2000; Zhou et al., 2005; Kambhatla, 2004; Min et al., 2012a)....
