Proceedings Article

SimRank: a measure of structural-context similarity

Glen Jeh, +1 more
- pp 538-543
TLDR
The paper proposes an approach, complementary to domain-specific measures and applicable in any domain with object-to-object relationships, that scores the similarity of objects by the structural context in which they occur, that is, by their relationships with other objects.
Abstract
The problem of measuring "similarity" of objects arises in many applications, and many domain-specific measures have been developed, e.g., matching text across documents or computing overlap among item-sets. We propose a complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects. Effectively, we compute a measure that says "two objects are similar if they are related to similar objects." This general similarity measure, called SimRank, is based on a simple and intuitive graph-theoretic model. For a given domain, SimRank can be combined with other domain-specific similarity measures. We suggest techniques for efficient computation of SimRank scores, and provide experimental results on two application domains showing the computational feasibility and effectiveness of our approach.
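
The recursive intuition in the abstract ("two objects are similar if they are related to similar objects") can be illustrated with a minimal sketch. The abstract itself gives no formula, so the code below assumes the recurrence commonly stated for SimRank: a node has similarity 1 with itself, and the similarity of two distinct nodes is a decay constant times the average similarity over all pairs of their in-neighbors (0 if either has none). The function name `simrank`, the `decay` value of 0.8, the iteration count, and the toy graph are all illustrative assumptions, and this naive all-pairs loop is not one of the efficient techniques the paper refers to.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Naive fixed-point iteration for SimRank scores over all node pairs.

    in_neighbors maps each node to the list of nodes that point to it.
    decay and iterations are illustrative defaults (assumptions), not
    values taken from the abstract.
    """
    nodes = list(in_neighbors)
    # Start from the identity: each node is similar only to itself.
    scores = {(a, b): 1.0 if a == b else 0.0
              for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        updated = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                updated[(a, b)] = 1.0
                continue
            in_a, in_b = in_neighbors[a], in_neighbors[b]
            if not in_a or not in_b:
                # No in-neighbors means no structural evidence of similarity.
                updated[(a, b)] = 0.0
                continue
            # Average similarity over every pair of in-neighbors, then decay.
            total = sum(scores[(x, y)] for x, y in product(in_a, in_b))
            updated[(a, b)] = decay * total / (len(in_a) * len(in_b))
        scores = updated
    return scores


# Hypothetical toy graph: "hub" links to "a" and "b"; "a" links to "c",
# and "b" links to "d".
graph = {"hub": [], "a": ["hub"], "b": ["hub"], "c": ["a"], "d": ["b"]}
scores = simrank(graph)
print(round(scores[("a", "b")], 2))  # 0.8
print(round(scores[("c", "d")], 2))  # 0.64
```

In this toy graph, "a" and "b" share their only in-neighbor and so score the full decay value, while "c" and "d" are similar only because their in-neighbors are similar, which is why their score is one decay step lower.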


Citations
Book

Data Mining: Concepts and Techniques

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Journal Issue

The link-prediction problem for social networks

TL;DR: Experiments on large coauthorship networks suggest that information about future interactions can be extracted from network topology alone, and that fairly subtle measures for detecting node proximity can outperform more direct measures.
Journal Article

Link prediction in complex networks: A survey

TL;DR: Recent progress on link prediction algorithms is summarized, emphasizing the contributions from physical perspectives and approaches, such as random-walk-based methods and maximum likelihood methods.
Book

Mining of Massive Datasets

TL;DR: This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets, and explains the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing.
Journal Article

PathSim: meta path-based top-K similarity search in heterogeneous information networks

TL;DR: Under the meta path framework, a novel similarity measure called PathSim is defined that is able to find peer objects in the network (e.g., authors in a similar field and with a similar reputation), which turns out to be more meaningful in many scenarios than random-walk-based similarity measures.