Book Chapter

Calculating Similarity Efficiently in a Small World

TLDR
This paper proposes SW-SimRank, an algorithm that speeds up similarity calculation by not recalculating the similarity scores of unreachable pairs after the first several iterations, and demonstrates the efficiency of this approach on web datasets.
Abstract
SimRank is a well-known algorithm for similarity calculation based on link analysis. However, it suffers from high computational cost. It has been shown that the World Wide Web graph is a "small world graph". In this paper, we observe that for this kind of small world graph, node pairs whose similarity scores are zero after the first several iterations remain zero in the final output. Based on this observation, we propose a novel algorithm called SW-SimRank that speeds up similarity calculation by not recalculating the similarity scores of those unreachable pairs. Our experimental results on web datasets show the efficiency of our approach. The larger the proportion of unreachable pairs in the relationship graph, the greater the improvement SW-SimRank achieves. In addition, SW-SimRank can be integrated with other SimRank acceleration methods.
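The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `simrank`, the `warmup` parameter, the decay factor `c = 0.8`, and the toy graph are all assumptions made for illustration. The sketch runs the standard naive SimRank update (the score of a pair is `c` times the average score over pairs of in-neighbors, with every node scoring 1 against itself) and, once `warmup` iterations have passed, stops updating pairs whose score is still zero.

```python
from itertools import product


def simrank(in_nbrs, c=0.8, iters=5, warmup=None):
    """Naive iterative SimRank over a dict {node: [in-neighbors]}.

    If `warmup` is given (a hypothetical parameter, not from the paper),
    pairs whose score is still zero after `warmup` iterations are frozen
    at zero and skipped in all later iterations.
    Returns (scores, number of pair updates performed).
    """
    nodes = list(in_nbrs)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    active = [(a, b) for a, b in product(nodes, repeat=2) if a != b]
    updates = 0
    for k in range(1, iters + 1):
        new = dict(sim)  # pairs that are skipped keep their previous score
        for a, b in active:
            ia, ib = in_nbrs[a], in_nbrs[b]
            if not ia or not ib:
                continue  # a node without in-neighbors scores 0 with others
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new[(a, b)] = c * total / (len(ia) * len(ib))
            updates += 1
        sim = new
        if warmup is not None and k == warmup:
            # Assumed pruning step: keep only pairs that became non-zero.
            active = [p for p in active if sim[p] > 0.0]
    return sim, updates


# Toy graph with two disconnected components (illustrative data only):
# a->b, a->c, b->g, c->h   and   d->e, d->f
graph = {
    "a": [], "b": ["a"], "c": ["a"], "g": ["b"], "h": ["c"],
    "d": [], "e": ["d"], "f": ["d"],
}

full_scores, full_updates = simrank(graph)
pruned_scores, pruned_updates = simrank(graph, warmup=2)
print(full_updates, pruned_updates)
print(full_scores == pruned_scores)
```

On this toy graph the pair (g, h) only becomes non-zero in the second iteration, so a warm-up of one iteration would freeze it at zero by mistake. How many warm-up iterations are safe depends on the graph, which is where the small-world property mentioned in the abstract comes in.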


Citations
Proceedings Article

Axiomatic ranking of network role similarity

TL;DR: RoleSim is a role similarity metric that satisfies axiomatic properties and can be computed with a simple iterative algorithm; the authors rigorously prove that RoleSim satisfies all the axiomatic properties and demonstrate its superior interpretative power on both synthetic and real datasets.
Proceedings Article

Scalable similarity search for SimRank

TL;DR: This paper proposes a very fast and scalable algorithm for the SimRank-based similarity search problem, and establishes a Monte Carlo based algorithm to compute a single-pair SimRank score s(u,v), based on the random-walk interpretation of the linear recursive formula.
Journal Article

Scalable and axiomatic ranking of network role similarity

TL;DR: RoleSim is presented, a new similarity metric that satisfies several axiomatic properties necessary for a role similarity measure and that can be computed with a simple iterative algorithm; its interpretative power is demonstrated on both synthetic and real datasets.
Posted Content

Axiomatic Ranking of Network Role Similarity

TL;DR: RoleSim is presented, a role similarity metric that satisfies all the axiomatic properties and can be computed with a simple iterative algorithm; its superior interpretative power is demonstrated on both synthetic and real datasets.
Book Chapter

A fast two-stage algorithm for computing SimRank and its extensions

TL;DR: A new algorithm, fast two-stage SimRank (F2S-SimRank), is proposed; it avoids storing unnecessary zeros, accelerates the computation without accuracy loss, uses less computation time, and occupies less main memory.
References
Book

Data Mining: Concepts and Techniques

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Journal Article

Co-citation in the scientific literature: A new measure of the relationship between two documents

TL;DR: A new form of document coupling called co-citation is defined as the frequency with which two documents are cited together, and clusters of co-cited papers provide a new way to study the specialty structure of science.
Journal Article

Graph structure in the Web

TL;DR: The study of the web as a graph yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution.
Journal Article

Bibliographic coupling between scientific papers

TL;DR: The population of papers under study was ordered into groups that satisfy the stated criterion of interrelation and an examination of the papers that constitute the groups shows a high degree of logical correlation.
Proceedings Article

SimRank: a measure of structural-context similarity

TL;DR: A complementary approach, applicable in any domain with object-to-object relationships, that measures similarity of the structural context in which objects occur, based on their relationships with other objects is proposed.