An Information-Theoretic Definition of Similarity

Home
/
Papers
/
An Information-Theoretic Definition of Similarity

Proceedings Article•

An Information-Theoretic Definition of Similarity

24 Jul 1998-pp 296-304

TL;DR: This work presents an informationtheoretic definition of similarity that is applicable as long as there is a probabilistic model and demonstrates how this definition can be used to measure the similarity in a number of different domains.

read less

Abstract: Similarity is an important and widely used concept Previous definitions of similarity are tied to a particular application or a form of knowledge representation We present an informationtheoretic definition of similarity that is applicable as long as there is a probabilistic model We demonstrate how our definition can be used to measure the similarity in a number of different domains

...read moreread less

Citations

PDF

Open Access

More filters

Book•

Speech and Language Processing

[...]

Dan Jurafsky, James Martin

01 Dec 1999

TL;DR: It is now clear that HAL's creator, Arthur C. Clarke, was a little optimistic in predicting when an artiﬁcial agent such as HAL would be avail-able as discussed by the authors.

...read moreread less

Abstract: is one of the most recognizablecharacters in 20th century cinema. HAL is an artiﬁcial agent capable of such advancedlanguage behavior as speaking and understanding English, and at a crucial moment inthe plot, even reading lips. It is now clear that HAL’s creator, Arthur C. Clarke, wasa little optimistic in predicting when an artiﬁcial agent such as HAL would be avail-able. But just how far off was he? What would it take to create at least the language-relatedpartsofHAL?WecallprogramslikeHALthatconversewithhumansinnatural

...read moreread less

3,077 citations

Book Chapter•DOI•

A Survey of Clustering Data Mining Techniques

[...]

Pavel Berkhin¹•Institutions (1)

Yahoo!¹

01 Jan 2006

TL;DR: This survey concentrates on clustering algorithms from a data mining perspective as a data modeling technique that provides for concise summaries of the data.

...read moreread less

Abstract: Clustering is the division of data into groups of similar objects. In clustering, some details are disregarded in exchange for data simplification. Clustering can be viewed as a data modeling technique that provides for concise summaries of the data. Clustering is therefore related to many disciplines and plays an important role in a broad range of applications. The applications of clustering usually deal with large datasets and data with many attributes. Exploration of such data is a subject of data mining. This survey concentrates on clustering algorithms from a data mining perspective.

...read moreread less

3,047 citations

Book•

Ontology Matching

[...]

Jérôme Euzenat, Pavel Shvaiko

05 Jun 2007

TL;DR: The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content.

...read moreread less

Abstract: Ontologies tend to be found everywhere. They are viewed as the silver bullet for many applications, such as database integration, peer-to-peer systems, e-commerce, semantic web services, or social networks. However, in open or evolving systems, such as the semantic web, different parties would, in general, adopt different ontologies. Thus, merely using ontologies, like using XML, does not reduce heterogeneity: it just raises heterogeneity problems to a higher level. Euzenat and Shvaikos book is devoted to ontology matching as a solution to the semantic heterogeneity problem faced by computer systems. Ontology matching aims at finding correspondences between semantically related entities of different ontologies. These correspondences may stand for equivalence as well as other relations, such as consequence, subsumption, or disjointness, between ontology entities. Many different matching solutions have been proposed so far from various viewpoints, e.g., databases, information systems, and artificial intelligence. The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content. In particular, the book includes a new chapter dedicated to the methodology for performing ontology matching. It also covers emerging topics, such as data interlinking, ontology partitioning and pruning, context-based matching, matcher tuning, alignment debugging, and user involvement in matching, to mention a few. More than 100 state-of-the-art matching systems and frameworks were reviewed. With Ontology Matching, researchers and practitioners will find a reference book that presents currently available work in a uniform framework. In particular, the work and the techniques presented in this book can be equally applied to database schema matching, catalog integration, XML schema matching and other related problems. The objectives of the book include presenting (i) the state of the art and (ii) the latest research results in ontology matching by providing a systematic and detailed account of matching techniques and matching systems from theoretical, practical and application perspectives.

...read moreread less

2,579 citations

Cites methods from "An Information-Theoretic Definition..."

...This is the case in the Jiang–Conrath method (Jiang and Conrath 1997) or the Lin information-theoretic similarity (Lin 1998)....
[...]

Journal Article•DOI•

Link prediction in complex networks: A survey

[...]

Linyuan Lü¹, Linyuan Lü², Linyuan Lü³, Tao Zhou⁴, Tao Zhou² - Show less +1 more•Institutions (4)

University of Shanghai for Science and Technology¹, University of Electronic Science and Technology of China², University of Fribourg³, University of Science and Technology of China⁴

15 Mar 2011-Physica A-statistical Mechanics and Its Applications

TL;DR: Recent progress about link prediction algorithms is summarized, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods.

...read moreread less

Abstract: Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summaries recent progress about link prediction algorithms, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanism and classification of partially labeled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.

...read moreread less

2,530 citations

Cites background from "An Information-Theoretic Definition..."

...The algorithms’ performance can be effectively enhanced by considering some external information, like the attributes of nodes [35]....
[...]
...Node similarity can be defined by using the essential attributes of nodes: two nodes are considered to be similar if they have many common features [35]....
[...]

Proceedings Article•

Computing semantic relatedness using Wikipedia-based explicit semantic analysis

[...]

Evgeniy Gabrilovich¹, Shaul Markovitch¹•Institutions (1)

Technion – Israel Institute of Technology¹

06 Jan 2007

TL;DR: This work proposes Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia that results in substantial improvements in correlation of computed relatedness scores with human judgments.

...read moreread less

Abstract: Computing semantic relatedness of natural language texts requires access to vast amounts of common-sense and domain-specific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia. We use machine learning techniques to explicitly represent the meaning of any text as a weighted vector of Wikipedia-based concepts. Assessing the relatedness of texts in this space amounts to comparing the corresponding vectors using conventional metrics (e.g., cosine). Compared with the previous state of the art, using ESA results in substantial improvements in correlation of computed relatedness scores with human judgments: from r = 0.56 to 0.75 for individual words and from r = 0.60 to 0.72 for texts. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users.

...read moreread less

2,285 citations

Cites methods from "An Information-Theoretic Definition..."

...Quite a few metrics have been defined that compute relatedness using various properties of the underlying graph structure of these resources [Budanitsky and Hirst, 2006; Jarmasz, 2003; Banerjee and Pedersen, 2003; Resnik, 1999; Lin, 1998; Jiang and Conrath, 1997; Grefenstette, 1992]....
[...]

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

References

PDF

Open Access

More filters

Book•

Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference

[...]

Judea Pearl¹•Institutions (1)

University of California, Los Angeles¹

01 Jan 1988

TL;DR: Probabilistic Reasoning in Intelligent Systems as mentioned in this paper is a complete and accessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty, and provides a coherent explication of probability as a language for reasoning with partial belief.

...read moreread less

Abstract: From the Publisher: Probabilistic Reasoning in Intelligent Systems is a complete andaccessible account of the theoretical foundations and computational methods that underlie plausible reasoning under uncertainty. The author provides a coherent explication of probability as a language for reasoning with partial belief and offers a unifying perspective on other AI approaches to uncertainty, such as the Dempster-Shafer formalism, truth maintenance systems, and nonmonotonic logic. The author distinguishes syntactic and semantic approaches to uncertaintyand offers techniques, based on belief networks, that provide a mechanism for making semantics-based systems operational. Specifically, network-propagation techniques serve as a mechanism for combining the theoretical coherence of probability theory with modern demands of reasoning-systems technology: modular declarative inputs, conceptually meaningful inferences, and parallel distributed computation. Application areas include diagnosis, forecasting, image interpretation, multi-sensor fusion, decision support systems, plan recognition, planning, speech recognitionin short, almost every task requiring that conclusions be drawn from uncertain clues and incomplete information. Probabilistic Reasoning in Intelligent Systems will be of special interest to scholars and researchers in AI, decision theory, statistics, logic, philosophy, cognitive psychology, and the management sciences. Professionals in the areas of knowledge-based systems, operations research, engineering, and statistics will find theoretical and computational tools of immediate practical use. The book can also be used as an excellent text for graduate-level courses in AI, operations research, or applied probability.

...read moreread less

15,671 citations

Journal Article•DOI•

Features of Similarity

[...]

Amos Tversky

01 Jul 1977-Psychological Review

TL;DR: The metric and dimensional assumptions that underlie the geometric representation of similarity are questioned on both theoretical and empirical grounds and a set of qualitative assumptions are shown to imply the contrast model, which expresses the similarity between objects as a linear combination of the measures of their common and distinctive features.

...read moreread less

Abstract: The metric and dimensional assumptions that underlie the geometric representation of similarity are questioned on both theoretical and empirical grounds. A new set-theoretical approach to similarity is developed in which objects are represented as collections of features, and similarity is described as a feature-matching process. Specifically, a set of qualitative assumptions is shown to imply the contrast model, which expresses the similarity between objects as a linear combination of the measures of their common and distinctive features. Several predictions of the contrast model are tested in studies of similarity with both semantic and perceptual stimuli. The model is used to uncover, analyze, and explain a variety of empirical phenomena such as the role of common and distinctive features, the relations between judgments of similarity and difference, the presence of asymmetric similarities, and the effects of context on judgments of similarity. The contrast model generalizes standard representations of similarity data in terms of clusters and trees. It is also used to analyze the relations of prototypicality and family resemblance

...read moreread less

7,251 citations

"An Information-Theoretic Definition..." refers methods in this paper

...We demonstrate how our definition can be used to measure the similarity in a number of different domains....
[...]

Journal Article•DOI•

Introduction to WordNet: An On-line Lexical Database

[...]

George A. Miller¹, Richard Beckwith¹, Christiane Fellbaum¹, Derek Gross², Katherine J. Miller¹ - Show less +1 more•Institutions (2)

Princeton University¹, University of Rochester²

01 Dec 1990-International Journal of Lexicography

TL;DR: Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list.

...read moreread less

Abstract: Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list. Unfortunately, there is no obvious alternative, no other simple way for lexicographers to keep track of what has been done or for readers to find the word they are looking for. But a frequent objection to this solution is that finding things on an alphabetical list can be tedious and time-consuming. Many people who would like to refer to a dictionary decide not to bother with it because finding the information would interrupt their work and break their train of thought.

...read moreread less

5,038 citations

Journal Article•DOI•

Instance-Based Learning Algorithms

[...]

David W. Aha¹, Dennis F. Kibler¹, Marc K. Albert¹•Institutions (1)

University of California, Irvine¹

03 Jan 1991-Machine Learning

TL;DR: This paper describes how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy and extends the nearest neighbor algorithm, which has large storage requirements.

...read moreread less

Abstract: Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.

...read moreread less

4,499 citations

Posted Content•

Using Information Content to Evaluate Semantic Similarity in a Taxonomy

[...]

Philip Resnik¹•Institutions (1)

Sun Microsystems Laboratories¹

29 Nov 1995-arXiv: Computation and Language

TL;DR: In this article, a new measure of semantic similarity in an IS-A taxonomy based on the notion of information content is presented, and experimental evaluation suggests that the measure performs encouragingly well (a correlation of r = 0.79 with a benchmark set of human similarity judgments, with an upper bound of r < 0.90 for human subjects performing the same task).

...read moreread less

Abstract: This paper presents a new measure of semantic similarity in an IS-A taxonomy, based on the notion of information content. Experimental evaluation suggests that the measure performs encouragingly well (a correlation of r = 0.79 with a benchmark set of human similarity judgments, with an upper bound of r = 0.90 for human subjects performing the same task), and significantly better than the traditional edge counting approach (r = 0.66).

...read moreread less

3,533 citations