Proceedings ArticleDOI
SimRank: a measure of structural-context similarity
Glen Jeh,Jennifer Widom +1 more
- pp 538-543
TLDR
A complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects is proposed.Abstract:
The problem of measuring "similarity" of objects arises in many applications, and many domain-specific measures have been developed, e.g., matching text across documents or computing overlap among item-sets. We propose a complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects. Effectively, we compute a measure that says "two objects are similar if they are related to similar objects:" This general similarity measure, called SimRank, is based on a simple and intuitive graph-theoretic model. For a given domain, SimRank can be combined with other domain-specific similarity measures. We suggest techniques for efficient computation of SimRank scores, and provide experimental results on two application domains showing the computational feasibility and effectiveness of our approach.read more
Citations
More filters
Book
Data Mining: Concepts and Techniques
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Journal IssueDOI
The link-prediction problem for social networks
David Liben-Nowell,Jon Kleinberg +1 more
TL;DR: Experiments on large coauthorship networks suggest that information about future interactions can be extracted from network topology alone, and that fairly subtle measures for detecting node proximity can outperform more direct measures.
Journal ArticleDOI
Link prediction in complex networks: A survey
TL;DR: Recent progress about link prediction algorithms is summarized, emphasizing on the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods.
Book
Mining of Massive Datasets
TL;DR: This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets, and explains the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing.
Journal ArticleDOI
PathSim: meta path-based top-K similarity search in heterogeneous information networks
TL;DR: Under the meta path framework, a novel similarity measure called PathSim is defined that is able to find peer objects in the network (e.g., find authors in the similar field and with similar reputation), which turns out to be more meaningful in many scenarios compared with random-walk based similarity measures.
References
More filters
Proceedings Article
The PageRank Citation Ranking : Bringing Order to the Web
TL;DR: This paper describes PageRank, a mathod for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.
Book
Modern Information Retrieval
TL;DR: In this article, the authors present a rigorous and complete textbook for a first course on information retrieval from the computer science (as opposed to a user-centred) perspective, which provides an up-to-date student oriented treatment of the subject.
Journal ArticleDOI
An algorithm for suffix stripping
TL;DR: An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL, and performs slightly better than a much more elaborate system with which it has been compared.
Posted Content
Empirical Analysis of Predictive Algorithms for Collaborative Filtering
TL;DR: In this article, the authors compare the predictive accuracy of various methods in a set of representative problem domains, including correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods.
Proceedings Article
Empirical analysis of predictive algorithms for collaborative filtering
TL;DR: Several algorithms designed for collaborative filtering or recommender systems are described, including techniques based on correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods, to compare the predictive accuracy of the various methods in a set of representative problem domains.