Topic

SimRank

About: SimRank is a research topic. Over its lifetime, 250 publications have appeared within this topic, receiving 21,163 citations.

Proceedings Article•DOI•

PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs

Zhewei Wei¹, Xiaodong He², Xiaokui Xiao³, Sibo Wang⁴, Yu Liu⁵, Xiaoyong Du¹, Ji-Rong Wen¹•Institutions (5)

Renmin University of China¹, Paradigm², National University of Singapore³, The Chinese University of Hong Kong⁴, Peking University⁵

25 Jun 2019

TL;DR: The authors propose PRSim, an algorithm for single-source SimRank queries that uses an index of size O(m), where m is the number of edges in the graph, and guarantees a query time that depends on the reverse PageRank distribution of the input graph.

Abstract: SimRank is a classic measure of the similarities of nodes in a graph. Given a node $u$ in graph $G = (V, E)$, a single-source SimRank query returns the SimRank similarities $s(u, v)$ between node $u$ and each node $v \in V$. This type of query has numerous applications in web search and social network analysis, such as link prediction, web mining, and spam detection. Existing methods for single-source SimRank queries, however, incur query cost at least linear in the number of nodes $n$, which renders them inapplicable for real-time and interactive analysis. This paper proposes PRSim, an algorithm that exploits the structure of graphs to efficiently answer single-source SimRank queries. PRSim uses an index of size $O(m)$, where $m$ is the number of edges in the graph, and guarantees a query time that depends on the reverse PageRank distribution of the input graph. In particular, we prove that PRSim runs in sub-linear time if the degree distribution of the input graph follows the power-law distribution, a property possessed by many real-world graphs. Based on the theoretical analysis, we show that the empirical query time of all existing SimRank algorithms also depends on the reverse PageRank distribution of the graph. Finally, we present the first experimental study that evaluates the absolute errors of various SimRank algorithms on large graphs, and we show that PRSim outperforms the state of the art in terms of query time, accuracy, index size, and scalability.

23 citations
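
For orientation, the single-source query defined in the abstract above can be sketched with the classic fixed-point SimRank iteration. This is not PRSim: it is the naive all-pairs computation whose cost PRSim is designed to avoid. The decay factor `c = 0.6`, the iteration count, the function names, and the toy graph are all assumptions made for illustration.

```python
from itertools import product


def simrank(in_neighbors, c=0.6, iterations=10):
    """Naive all-pairs SimRank by fixed-point iteration.

    in_neighbors maps every node to the list of nodes with an edge into it.
    c (decay factor) and iterations are illustrative defaults, not values
    taken from the paper.
    """
    nodes = list(in_neighbors)
    # Initial scores: a node is fully similar to itself, 0 to all others.
    sim = {(u, v): 1.0 if u == v else 0.0 for u, v in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for u, v in product(nodes, nodes):
            if u == v:
                new_sim[(u, v)] = 1.0
                continue
            in_u, in_v = in_neighbors[u], in_neighbors[v]
            if not in_u or not in_v:
                # A node without in-neighbors is similar only to itself.
                new_sim[(u, v)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, then decay.
            total = sum(sim[(a, b)] for a in in_u for b in in_v)
            new_sim[(u, v)] = c * total / (len(in_u) * len(in_v))
        sim = new_sim
    return sim


def single_source(in_neighbors, u, c=0.6, iterations=10):
    """Return s(u, v) for every node v (the single-source query)."""
    sim = simrank(in_neighbors, c, iterations)
    return {v: sim[(u, v)] for v in in_neighbors}


# Hypothetical toy graph with edges a->b, a->c, b->d, c->d.
graph = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
scores = single_source(graph, "b")
```

On this toy graph, `b` and `c` share their only in-neighbor `a`, so the query from `b` scores `c` at 0.6, itself at 1, and the other two nodes at 0. Each iteration touches every node pair, which is the at-least-linear cost per query that the abstract describes as the limitation of existing methods.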

Proceedings Article•DOI•

MapReduce-Based SimRank Computation and Its Application in Social Recommender System

Lina Li¹, Cuiping Li¹, Hong Chen¹, Xiaoyong Du¹•Institutions (1)

Renmin University of China¹

27 Jun 2013

TL;DR: This paper proposes parallel algorithms for SimRank computation on the MapReduce framework, specifically its open-source implementation Hadoop, and uses them to compute similarities for recommending appropriate products to users in social recommender systems.

Abstract: Recently there has been a lot of interest in graph-based analysis, with examples including social network analysis, recommendation systems, document classification and clustering, and so on. A graph is an abstraction that naturally captures data objects as well as relationships among those objects. Objects are represented as nodes and relationships are represented as edges in the graph. There are many cases in which similarities among nodes need to be computed. SimRank is one of the simple and intuitive algorithms for this purpose. It is rigorously based on random walk theory. Existing methods for SimRank computation suffer from one limitation: the computing cost can be very high in practice. In order to optimize the computation of SimRank, a few techniques have been proposed. However, the performance of these methods is still limited by the processing ability of a single computer. Ideally, we would like to develop new parallel solutions that can offer improved processing power to compute SimRank on large data sets. In this paper, we propose parallel algorithms for SimRank computation on the MapReduce framework, and more specifically its open-source implementation, Hadoop. Two different parallel methods are proposed and their performance is evaluated and compared. Furthermore, we employ the proposed methods to do the similarity computation in order to recommend appropriate products to users in social recommender systems.

23 citations

Book Chapter•DOI•

PathSimExt: Revisiting PathSim in Heterogeneous Information Networks

U Leong Hou¹, Kun Yao¹, Hoi Fong Mak¹•Institutions (1)

University of Macau¹

16 Jun 2014

TL;DR: The definition of PathSim, which captures the similarity of two objects based on their connectivity along a semantic path, is revisited by introducing external support to enrich the result of PathSim.

Abstract: Similarity queries in graph databases have been studied over the past few decades. Typically, similarity queries are used in homogeneous networks, where random-walk-based approaches (e.g., Personalized PageRank and SimRank) are the representative methods. However, these approaches are not well suited to heterogeneous networks that consist of multi-typed and interconnected objects, such as bibliographic information, social media networks, crowdsourcing data, etc. Intuitively, two objects are similar in heterogeneous networks if they have strong connections among the heterogeneous relationships. PathSim is the first work to address this problem; it captures the similarity of two objects based on their connectivity along a semantic path. However, PathSim only considers the information in the semantic path and simply omits other supportive information (e.g., the number of citations in bibliographic data). Thus we revisit the definition of PathSim by introducing external support to enrich the result of PathSim.

23 citations
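
The path-based similarity that the abstract above builds on can be sketched as follows. This shows only baseline PathSim for a symmetric meta-path, s(x, y) = 2·M[x][y] / (M[x][x] + M[y][y]), where M counts path instances between objects; it does not include the external support that PathSimExt adds. The author-by-venue counts, the author–venue–author meta-path, and the function names are hypothetical.

```python
def commuting_matrix(adjacency):
    """Return M = W * W^T, where W[i][k] counts links from object i to
    intermediate object k. M[i][j] is then the number of path instances
    i -> k -> j along the symmetric meta-path."""
    n = len(adjacency)
    return [
        [sum(a * b for a, b in zip(adjacency[i], adjacency[j])) for j in range(n)]
        for i in range(n)
    ]


def pathsim(m, x, y):
    """Baseline PathSim: path count between x and y, normalised by the
    path counts from each object back to itself."""
    denom = m[x][x] + m[y][y]
    return 2 * m[x][y] / denom if denom else 0.0


# Hypothetical data: rows are authors, columns are venues,
# entries are numbers of papers published by that author at that venue.
author_venue = [
    [2, 1],  # author 0
    [2, 1],  # author 1: same publishing profile as author 0
    [0, 3],  # author 2: publishes only in the second venue
]
m = commuting_matrix(author_venue)
same_profile = pathsim(m, 0, 1)
different_profile = pathsim(m, 0, 2)
```

Authors 0 and 1 have identical venue profiles and score 1.0, while authors 0 and 2 score 3/7. Nothing outside the meta-path (such as citation counts) enters the score, which is the omission the abstract points to.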

Journal Article•DOI•

ASCOS++: An Asymmetric Similarity Measure for Weighted Networks to Address the Problem of SimRank

Hung-Hsuan Chen¹, C. Lee Giles²•Institutions (2)

Industrial Technology Research Institute¹, Pennsylvania State University²

12 Oct 2015-ACM Transactions on Knowledge Discovery From Data

TL;DR: This article argues that SimRank and its families, such as P-Rank and SimRank++, fail to capture similar node pairs in certain conditions, and presents new similarity measures ASCOS and ASCOS++ to address the problem.

Abstract: In this article, we explore the relationships among digital objects in terms of their similarity based on vertex similarity measures. We argue that SimRank, a famous similarity measure, and its families, such as P-Rank and SimRank++, fail to capture similar node pairs in certain conditions, especially when two nodes can only reach each other through paths of odd lengths. We present new similarity measures ASCOS and ASCOS++ to address the problem. ASCOS outputs a more complete similarity score than SimRank and SimRank's families. ASCOS++ enriches ASCOS to include edge weight into the measure, giving all edges and network weights an opportunity to make their contribution. We show that both ASCOS++ and ASCOS can be reformulated and applied on a distributed environment for parallel contribution. Experimental results show that ASCOS++ reports a better score than SimRank and several famous similarity measures. Finally, we re-examine previous use cases of SimRank, and explain appropriate and inappropriate use cases. We suggest that future SimRank users follow the rules proposed here before naively applying it. We also discuss the relationship between ASCOS++ and PageRank.

22 citations
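
The odd-length-path limitation named in the abstract above can be checked by hand on the smallest possible case. The derivation below assumes the standard SimRank recurrence with decay factor $C \in (0, 1)$ and neighbor sets $I(\cdot)$, neither of which is spelled out in the abstract, and an undirected graph with the single edge $a$–$b$, so that $I(a) = \{b\}$ and $I(b) = \{a\}$. It also uses the symmetry $s(b, a) = s(a, b)$.

```latex
s(a,b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{x \in I(a)} \sum_{y \in I(b)} s(x,y)
       = C \, s(b,a) = C \, s(a,b)
\;\Longrightarrow\; (1 - C)\, s(a,b) = 0
\;\Longrightarrow\; s(a,b) = 0 .
```

Two directly linked nodes therefore receive a SimRank score of 0 in this graph, because the only path between them has length 1.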

Proceedings Article•DOI•

PRSim: Sublinear Time SimRank Computation on Large Power-Law Graphs

Zhewei Wei¹, Xiaodong He², Xiaokui Xiao³, Sibo Wang⁴, Yu Liu⁵, Xiaoyong Du¹, Ji-Rong Wen¹•Institutions (5)

Renmin University of China¹, Paradigm², National University of Singapore³, The Chinese University of Hong Kong⁴, Peking University⁵

07 May 2019-arXiv: Data Structures and Algorithms

TL;DR: PRSim is proposed, an algorithm that exploits the structure of graphs to efficiently answer single-source SimRank queries and runs in sub-linear time if the degree distribution of the input graph follows the power-law distribution, a property possessed by many real-world graphs.

Abstract: SimRank is a classic measure of the similarities of nodes in a graph. Given a node $u$ in graph $G = (V, E)$, a single-source SimRank query returns the SimRank similarities $s(u, v)$ between node $u$ and each node $v \in V$. This type of query has numerous applications in web search and social network analysis, such as link prediction, web mining, and spam detection. Existing methods for single-source SimRank queries, however, incur query cost at least linear in the number of nodes $n$, which renders them inapplicable for real-time and interactive analysis. This paper proposes PRSim, an algorithm that exploits the structure of graphs to efficiently answer single-source SimRank queries. PRSim uses an index of size $O(m)$, where $m$ is the number of edges in the graph, and guarantees a query time that depends on the reverse PageRank distribution of the input graph. In particular, we prove that PRSim runs in sub-linear time if the degree distribution of the input graph follows the power-law distribution, a property possessed by many real-world graphs. Based on the theoretical analysis, we show that the empirical query time of all existing SimRank algorithms also depends on the reverse PageRank distribution of the graph. Finally, we present the first experimental study that evaluates the absolute errors of various SimRank algorithms on large graphs, and we show that PRSim outperforms the state of the art in terms of query time, accuracy, index size, and scalability.

22 citations

Performance Metrics

Papers: 250

Citations: 22,828

No. of papers in the topic in previous years
Year	Papers
2021	15
2020	26
2019	16
2018	17
2017	19
2016	16
