Relation Extraction with Matrix Factorization and Universal Schemas

Proceedings Article

Sebastian Riedel¹, Limin Yao², Andrew McCallum², Benjamin M. Marlin²

University College London¹, University of Massachusetts Amherst²

01 Jan 2013-pp 74-84

TL;DR: In this article, a matrix factorization model is used to learn latent feature vectors for entity tuples and relations in a universal schema, which has an almost unlimited set of relations (due to surface forms).

Abstract: © 2013 Association for Computational Linguistics. Traditional relation extraction predicts relations within some fixed and finite target schema. Machine learning approaches to this task require either manual annotation or, in the case of distant supervision, existing structured sources of the same schema. The need for existing datasets can be avoided by using a universal schema: the union of all involved schemas (surface form predicates as in OpenIE, and relations in the schemas of preexisting databases). This schema has an almost unlimited set of relations (due to surface forms), and supports integration with existing structured data (through the relation types of existing databases). To populate a database of such schema we present matrix factorization models that learn latent feature vectors for entity tuples and relations. We show that such latent models achieve substantially higher accuracy than a traditional classification approach. More importantly, by operating simultaneously on relations observed in text and in pre-existing structured DBs such as Freebase, we are able to reason about unstructured and structured data in mutually-supporting ways. By doing so our approach outperforms state-of-the-art distant supervision.
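The abstract describes learning latent feature vectors for entity tuples and relations by factorizing a matrix whose rows are entity tuples and whose columns are relations (surface patterns and database relations alike). The following is a minimal sketch of that idea, not the authors' implementation: it assumes a logistic loss with one sampled unobserved cell per observed cell, and the entity names, relation strings, function names and hyperparameters are all made up for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def train(observed, tuples, relations, dim=4, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Fit one latent vector per entity tuple (row) and per relation (column)."""
    rng = random.Random(seed)
    rows = {t: [rng.gauss(0.0, 0.1) for _ in range(dim)] for t in tuples}
    cols = {r: [rng.gauss(0.0, 0.1) for _ in range(dim)] for r in relations}
    cells = sorted(observed)
    for _ in range(epochs):
        rng.shuffle(cells)
        for t, r in cells:
            # One observed cell (label 1) plus, when available, one sampled
            # unobserved cell in the same column treated as a negative (label 0).
            batch = [(t, 1.0)]
            unobserved = [u for u in tuples if (u, r) not in observed]
            if unobserved:
                batch.append((rng.choice(unobserved), 0.0))
            for u, label in batch:
                err = label - sigmoid(dot(rows[u], cols[r]))
                for k in range(dim):
                    a, v = rows[u][k], cols[r][k]
                    rows[u][k] += lr * (err * v - reg * a)
                    cols[r][k] += lr * (err * a - reg * v)
    return rows, cols


def predict(rows, cols, t, r):
    """Estimated probability that relation r holds for entity tuple t."""
    return sigmoid(dot(rows[t], cols[r]))


# Hypothetical toy data: two surface patterns and one database-style relation.
TUPLES = [("Ada", "UniA"), ("Bob", "UniB"), ("Cy", "UniB"), ("Dee", "UniC")]
RELATIONS = ["X professor at Y", "X historian at Y", "employee(X, Y)"]
OBSERVED = {
    (TUPLES[0], "X professor at Y"), (TUPLES[0], "employee(X, Y)"),
    (TUPLES[1], "X professor at Y"), (TUPLES[1], "X historian at Y"),
    (TUPLES[1], "employee(X, Y)"),
    (TUPLES[2], "X historian at Y"),
    (TUPLES[3], "X professor at Y"),
}

rows, cols = train(OBSERVED, TUPLES, RELATIONS)
# Score a cell that was never observed: a database relation for a tuple
# that only appears with a surface pattern.
p_unseen = predict(rows, cols, TUPLES[3], "employee(X, Y)")
```

Because surface patterns and database relations share one column space, a tuple seen only with a textual pattern still receives a score for a database relation, which is the kind of joint reasoning over text and structured data that the abstract refers to.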

Citations

Journal Article

Knowledge Graph Embedding: A Survey of Approaches and Applications

Quan Wang¹, Zhendong Mao¹, Bin Wang¹, Li Guo¹

Chinese Academy of Sciences¹

01 Dec 2017-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This article provides a systematic review of existing techniques of Knowledge graph embedding, including not only the state-of-the-arts but also those with latest trends, based on the type of information used in the embedding task.

Abstract: Knowledge graph (KG) embedding is to embed components of a KG including entities and relations into continuous vector spaces, so as to simplify the manipulation while preserving the inherent structure of the KG. It can benefit a variety of downstream tasks such as KG completion and relation extraction, and hence has quickly gained massive attention. In this article, we provide a systematic review of existing techniques, including not only the state-of-the-arts but also those with latest trends. Particularly, we make the review based on the type of information used in the embedding task. Techniques that conduct embedding using only facts observed in the KG are first introduced. We describe the overall framework, specific model design, typical training procedures, as well as pros and cons of such techniques. After that, we discuss techniques that further incorporate additional information besides facts. We focus specifically on the use of entity types, relation paths, textual descriptions, and logical rules. Finally, we briefly introduce how KG embedding can be applied to and benefit a wide variety of downstream tasks such as KG completion, relation extraction, question answering, and so forth.

1,905 citations

Journal Article

A Review of Relational Machine Learning for Knowledge Graphs

Maximilian Nickel¹, Kevin Murphy², Volker Tresp³, Evgeniy Gabrilovich²

Massachusetts Institute of Technology¹, Google², Siemens³

01 Jan 2016

TL;DR: This paper provides a review of how statistical models can be “trained” on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph) and how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web.

Abstract: Relational machine learning studies methods for the statistical analysis of relational, or graph-structured, data. In this paper, we provide a review of how such statistical models can be “trained” on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). In particular, we discuss two fundamentally different kinds of statistical relational models, both of which can scale to massive data sets. The first is based on latent feature models such as tensor factorization and multiway neural networks. The second is based on mining observable patterns in the graph. We also show how to combine these latent and observable models to get improved modeling power at decreased computational cost. Finally, we discuss how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web. To this end, we also discuss Google's knowledge vault project as an example of such combination.

1,452 citations

Proceedings Article

Complex embeddings for simple link prediction

Théo Trouillon¹, Johannes Welbl², Sebastian Riedel², Eric Gaussier¹, Guillaume Bouchard²

University of Grenoble¹, University College London²

19 Jun 2016

TL;DR: This work makes use of complex valued embeddings to solve the link prediction problem through latent factorization, and uses the Hermitian dot product, the complex counterpart of the standard dot product between real vectors.

Abstract: In statistical relational learning, the link prediction problem is key to automatically understand the structure of large knowledge bases. As in previous studies, we propose to solve this problem through latent factorization. However, here we make use of complex valued embeddings. The composition of complex embeddings can handle a large variety of binary relations, among them symmetric and antisymmetric relations. Compared to state-of-the-art models such as Neural Tensor Network and Holographic Embeddings, our approach based on complex embeddings is arguably simpler, as it only uses the Hermitian dot product, the complex counterpart of the standard dot product between real vectors. Our approach is scalable to large datasets as it remains linear in both space and time, while consistently outperforming alternative approaches on standard link prediction benchmarks.

1,113 citations
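The entry above says that complex-valued embeddings scored with the Hermitian dot product can handle both symmetric and antisymmetric relations. A minimal sketch of such a score is given here, assuming the commonly used form Re(Σ_k w_k · s_k · conj(o_k)); the function name and the embedding values are illustrative assumptions, not taken from the entry.

```python
def complex_score(w_r, e_s, e_o):
    """Real part of the trilinear product <w_r, e_s, conj(e_o)>."""
    total = sum(w * s * o.conjugate() for w, s, o in zip(w_r, e_s, e_o))
    return total.real


# Hypothetical two-dimensional complex embeddings for a subject and an object.
e_s = [1 + 2j, 0.5 - 1j]
e_o = [2 - 1j, -1 + 0.5j]

# A purely real relation vector gives a symmetric score:
# swapping subject and object leaves it unchanged.
w_sym = [1 + 0j, 2 + 0j]

# A purely imaginary relation vector gives an antisymmetric score:
# swapping subject and object flips the sign.
w_anti = [1j, 2j]

sym_forward = complex_score(w_sym, e_s, e_o)
sym_backward = complex_score(w_sym, e_o, e_s)
anti_forward = complex_score(w_anti, e_s, e_o)
anti_backward = complex_score(w_anti, e_o, e_s)
```

The score costs one pass over the embedding dimensions, which is consistent with the linear time and space claim in the abstract.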

Posted Content

Complex Embeddings for Simple Link Prediction

Théo Trouillon¹, Johannes Welbl², Sebastian Riedel², Eric Gaussier³, Guillaume Bouchard²

Xerox¹, University College London², University of Grenoble³

20 Jun 2016-arXiv: Artificial Intelligence

TL;DR: In this article, the authors make use of complex valued embeddings to handle a large variety of binary relations, among them symmetric and antisymmetric relations, and their approach is scalable to large datasets as it remains linear in both space and time.

1,100 citations

Journal Article

Advances in natural language processing.

Julia Hirschberg¹, Christopher D. Manning²

Columbia University¹, Stanford University²

17 Jul 2015-Science

TL;DR: This work describes successes and challenges in this rapidly advancing area of natural language processing, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services.

Abstract: Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today’s researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area.

859 citations

References

Book

Introduction to Information Retrieval

Christopher D. Manning¹, Prabhakar Raghavan², Hinrich Schütze³

Stanford University¹, Google², University of Stuttgart³

01 Jan 2008

TL;DR: In this article, the authors present an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections.

Abstract: Class-tested and coherent, this groundbreaking new textbook teaches web-era information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.

11,804 citations

Proceedings Article

Factorization meets the neighborhood: a multifaceted collaborative filtering model

Yehuda Koren¹

AT&T¹

24 Aug 2008

TL;DR: The factor and neighborhood models can now be smoothly merged, thereby building a more accurate combined model and a new evaluation metric is suggested, which highlights the differences among methods, based on their performance at a top-K recommendation task.

Abstract: Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborative Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The two more successful approaches to CF are latent factor models, which directly profile both users and products, and neighborhood models, which analyze similarities between products or users. In this work we introduce some innovations to both approaches. The factor and neighborhood models can now be smoothly merged, thereby building a more accurate combined model. Further accuracy improvements are achieved by extending the models to exploit both explicit and implicit feedback by the users. The methods are tested on the Netflix data. Results are better than those previously published on that dataset. In addition, we suggest a new evaluation metric, which highlights the differences among methods, based on their performance at a top-K recommendation task.

3,975 citations

Proceedings Article

BPR: Bayesian personalized ranking from implicit feedback

Steffen Rendle¹, Christoph Freudenthaler¹, Zeno Gantner¹, Lars Schmidt-Thieme¹

University of Hildesheim¹

18 Jun 2009

TL;DR: In this article, the authors proposed a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem, which is based on stochastic gradient descent with bootstrap sampling.

Abstract: Item recommendation is the task of predicting a personalized ranking on a set of items (e.g. websites, movies, products). In this paper, we investigate the most common scenario with implicit feedback (e.g. clicks, purchases). There are many methods for item recommendation from implicit feedback like matrix factorization (MF) or adaptive k-nearest-neighbor (kNN). Even though these methods are designed for the item prediction task of personalized ranking, none of them is directly optimized for ranking. In this paper we present a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem. We also provide a generic learning algorithm for optimizing models with respect to BPR-Opt. The learning method is based on stochastic gradient descent with bootstrap sampling. We show how to apply our method to two state-of-the-art recommender models: matrix factorization and adaptive kNN. Our experiments indicate that for the task of personalized ranking our optimization method outperforms the standard learning techniques for MF and kNN. The results show the importance of optimizing models for the right criterion.

3,429 citations
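The entry above describes optimizing a ranking criterion with stochastic gradient descent over bootstrap-sampled examples. A minimal sketch for matrix factorization follows, assuming the usual pairwise form of the criterion (maximize ln σ(x_ui − x_uj) with L2 regularization, where i is an observed item and j an unobserved one for user u); the function name, hyperparameters and toy interactions are hypothetical.

```python
import math
import random


def bpr_mf(interactions, n_users, n_items, dim=4, steps=2000, lr=0.05, reg=0.01, seed=0):
    """Pairwise-ranking matrix factorization trained on sampled (u, i, j) triples."""
    rng = random.Random(seed)
    P = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    Q = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    pos = {}
    for u, i in interactions:
        pos.setdefault(u, set()).add(i)
    # Only users with at least one unobserved item can yield a training triple.
    users = sorted(u for u in pos if len(pos[u]) < n_items)
    for _ in range(steps):
        # Draw a triple with replacement: user, observed item, unobserved item.
        u = rng.choice(users)
        i = rng.choice(sorted(pos[u]))
        j = rng.choice([k for k in range(n_items) if k not in pos[u]])
        x_uij = sum(P[u][f] * (Q[i][f] - Q[j][f]) for f in range(dim))
        g = 1.0 / (1.0 + math.exp(x_uij))  # derivative of ln(sigmoid(x)) at x_uij
        for f in range(dim):
            pu, qi, qj = P[u][f], Q[i][f], Q[j][f]
            P[u][f] += lr * (g * (qi - qj) - reg * pu)
            Q[i][f] += lr * (g * pu - reg * qi)
            Q[j][f] += lr * (-g * pu - reg * qj)
    return P, Q


# Hypothetical implicit feedback: (user, item) pairs such as clicks.
INTERACTIONS = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 0), (2, 3)]
P, Q = bpr_mf(INTERACTIONS, n_users=3, n_items=4)
```

The update only ever compares an observed item against an unobserved one for the same user, so the model is trained on relative order rather than on absolute scores.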

Proceedings Article

Distant supervision for relation extraction without labeled data

Mike D. Mintz¹, Steven Bills¹, Rion Snow¹, Dan Jurafsky¹

Stanford University¹

02 Aug 2009

TL;DR: This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.

Abstract: Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic database of several thousand relations, to provide distant supervision. For each pair of entities that appears in some Freebase relation, we find all sentences containing those entities in a large unlabeled corpus and extract textual features to train a relation classifier. Our algorithm combines the advantages of supervised IE (combining 400,000 noisy pattern features in a probabilistic classifier) and unsupervised IE (extracting large numbers of relations from large corpora of any domain). Our model is able to extract 10,000 instances of 102 relations at a precision of 67.6%. We also analyze feature performance, showing that syntactic parse features are particularly helpful for relations that are ambiguous or lexically distant in their expression.

2,965 citations
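The labeling step described in the entry above (for each entity pair in a knowledge-base relation, collect every sentence that mentions both entities) can be sketched as follows. This is an illustration under simplifying assumptions: entities are matched by plain substring search, and the facts, relation name and sentences are invented. The feature extraction and classifier training that the abstract mentions are not shown.

```python
def distant_label(kb_facts, sentences):
    """Pair each knowledge-base fact with the sentences mentioning both entities."""
    examples = []
    for (e1, e2), relation in sorted(kb_facts.items()):
        for sentence in sentences:
            if e1 in sentence and e2 in sentence:
                examples.append((sentence, e1, e2, relation))
    return examples


# Hypothetical knowledge-base fact and unlabeled corpus.
KB_FACTS = {("Barack Obama", "Hawaii"): "place_of_birth"}
SENTENCES = [
    "Barack Obama was born in Hawaii.",
    "Barack Obama visited Hawaii last week.",
    "Hawaii is an island state.",
]

examples = distant_label(KB_FACTS, SENTENCES)
```

Both of the first two sentences receive the `place_of_birth` label even though only the first expresses it, which is the kind of noise the abstract alludes to when it speaks of noisy pattern features.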

Proceedings Article

Toward an architecture for never-ending language learning

Andrew Carlson¹, Justin Betteridge¹, Bryan Kisiel¹, Burr Settles¹, Estevam R. Hruschka², Tom M. Mitchell¹

Carnegie Mellon University¹, Federal University of São Carlos²

11 Jul 2010

TL;DR: This work proposes an approach and a set of design principles for an intelligent computer agent that runs forever and describes a partial implementation of such a system that has already learned to extract a knowledge base containing over 242,000 beliefs.

Abstract: We consider here the problem of building a never-ending language learner; that is, an intelligent computer agent that runs forever and that each day must (1) extract, or read, information from the web to populate a growing structured knowledge base, and (2) learn to perform this task better than on the previous day. In particular, we propose an approach and a set of design principles for such an agent, describe a partial implementation of such a system that has already learned to extract a knowledge base containing over 242,000 beliefs with an estimated precision of 74% after running for 67 days, and discuss lessons learned from this preliminary attempt to build a never-ending learning agent.

2,010 citations