Topic

SimRank

About: SimRank is a research topic. Over its lifetime, 250 publications have been published within this topic, receiving 21163 citations.

Papers

Proceedings Article•DOI•

LinkClus: efficient clustering via heterogeneous semantic links

Xiaoxin Yin¹, Jiawei Han¹, Philip S. Yu²•Institutions (2)

University of Illinois at Urbana–Champaign¹, IBM²

01 Sep 2006

TL;DR: This paper takes advantage of the power law distribution of links, and develops a hierarchical structure called SimTree to represent similarities in multi-granularity manner, to compute similarities between objects by avoiding pairwise similarity computations through merging computations that go through the same branches in the SimTree.

Abstract: Data objects in a relational database are cross-linked with each other via multi-typed links. Links contain rich semantic information that may indicate important relationships among objects. Most current clustering methods rely only on the properties that belong to the objects per se. However, the similarities between objects are often indicated by the links, and desirable clusters cannot be generated using only the properties of objects. In this paper we explore linkage-based clustering, in which the similarity between two objects is measured based on the similarities between the objects linked with them. In comparison with a previous study (SimRank) that computes links recursively on all pairs of objects, we take advantage of the power law distribution of links, and develop a hierarchical structure called SimTree to represent similarities in a multi-granularity manner. This method avoids the high cost of computing and storing pairwise similarities but still thoroughly explores relationships among objects. An efficient algorithm is proposed to compute similarities between objects by avoiding pairwise similarity computations through merging computations that go through the same branches in the SimTree. Experiments show the proposed approach achieves high efficiency, scalability, and accuracy in clustering multi-typed linked objects.

124 citations

Proceedings Article•DOI•

Parallel SimRank computation on large graphs with iterative aggregation

Guoming He¹, Haijun Feng¹, Cuiping Li¹, Hong Chen¹•Institutions (1)

Renmin University of China¹

25 Jul 2010

TL;DR: This paper exploits the inherent parallelism and high memory bandwidth of graphics processing units (GPU) to accelerate the computation of SimRank on large graphs and proposes to utilize the iterative aggregation techniques for uncoupling Markov chains to compute SimRank scores in parallel for large graphs.

Abstract: Recently there has been a lot of interest in graph-based analysis. One of the most important aspects of graph-based analysis is to measure similarity between nodes in a graph. SimRank is a simple and influential measure of this kind, based on a solid graph theoretical model. However, existing methods on SimRank computation suffer from two limitations: 1) the computing cost can be very high in practice; and 2) they can only be applied on static graphs. In this paper, we exploit the inherent parallelism and high memory bandwidth of graphics processing units (GPU) to accelerate the computation of SimRank on large graphs. Furthermore, based on the observation that SimRank is essentially a first-order Markov Chain, we propose to utilize the iterative aggregation techniques for uncoupling Markov chains to compute SimRank scores in parallel for large graphs. The iterative aggregation method can be applied on dynamic graphs. Moreover, it can handle not only the link-updating problem but also the node-updating problem. Extensive experiments on synthetic and real data sets verify that the proposed methods are efficient and effective.

113 citations
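
For readers unfamiliar with the measure these papers build on, the following is a minimal sketch of the standard iterative SimRank definition (two nodes are similar if their in-neighbors are similar), not the GPU or iterative-aggregation method of the paper above. The function name `simrank`, the decay constant `C=0.8`, the iteration count, and the dictionary-of-in-neighbors graph layout are all illustrative assumptions.

```python
from itertools import product


def simrank(in_neighbors, C=0.8, iterations=10):
    """Naive all-pairs SimRank.

    in_neighbors maps every node to the list of its in-neighbors
    (hypothetical layout chosen for this sketch).
    """
    nodes = list(in_neighbors)
    # Initial scores: s(a, a) = 1, s(a, b) = 0 for a != b.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # A node without in-neighbors is similar only to itself.
                new[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, damped by C.
            total = sum(sim[(i, j)] for i in ia for j in ib)
            new[(a, b)] = C * total / (len(ia) * len(ib))
        sim = new
    return sim


# Toy graph: w points to both u and v.
graph = {"w": [], "u": ["w"], "v": ["w"]}
scores = simrank(graph)
print(scores[("u", "v")])
```

Each iteration touches every node pair and every pair of their in-neighbors, which is the cost that the papers in this list try to avoid or parallelize.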

Proceedings Article•DOI•

Axiomatic ranking of network role similarity

Ruoming Jin¹, Victor E. Lee¹, Hui Hong¹•Institutions (1)

Kent State University¹

21 Aug 2011

TL;DR: This paper presents RoleSim, a role similarity metric that satisfies a set of axiomatic properties and can be computed with a simple iterative algorithm. The authors rigorously prove that RoleSim satisfies all the axiomatic properties and demonstrate its superior interpretative power on both synthetic and real datasets.

Abstract: A key task in analyzing social networks and other complex networks is role analysis: describing and categorizing nodes by how they interact with other nodes. Two nodes have the same role if they interact with equivalent sets of neighbors. The most fundamental role equivalence is automorphic equivalence. Unfortunately, the fastest algorithm known for graph automorphism is nonpolynomial. Moreover, since exact equivalence is rare, a more meaningful task is measuring the role similarity between any two nodes. This task is closely related to the link-based similarity problem that SimRank addresses. However, SimRank and other existing similarity measures are not sufficient because they do not guarantee to recognize automorphically or structurally equivalent nodes. This paper makes two contributions. First, we present and justify several axiomatic properties necessary for a role similarity measure or metric. Second, we present RoleSim, a role similarity metric which satisfies these axioms and which can be computed with a simple iterative algorithm. We rigorously prove that RoleSim satisfies all the axiomatic properties and demonstrate its superior interpretative power on both synthetic and real datasets.

109 citations

Proceedings Article•DOI•

Scalable similarity search for SimRank

Mitsuru Kusumoto, Takanori Maehara¹, Ken-ichi Kawarabayashi¹•Institutions (1)

National Institute of Informatics¹

18 Jun 2014

TL;DR: This paper proposes a very fast and scalable algorithm for the SimRank-based similarity search problem, and establishes a Monte-Carlo based algorithm to compute a single-pair SimRank score s(u,v), which is based on the random-walk interpretation of the linear recursive formula.

Abstract: SimRank, proposed by Jeh and Widom, provides a good similarity score and has been successfully used in many applications. Although many algorithms have been proposed to compute SimRank, none of them scales to graphs with billions of edges. Motivated by this fact, we consider the following SimRank-based similarity search problem: given a query vertex u, find the top-k vertices v with the k highest SimRank scores s(u,v) with respect to u. We propose a very fast and scalable algorithm for this similarity search problem. Our method consists of the following ingredients: (1) We first introduce a "linear" recursive formula for SimRank. This allows us to formulate a problem for which we can propose a very fast algorithm. (2) We establish a Monte-Carlo based algorithm to compute a single-pair SimRank score s(u,v), which is based on the random-walk interpretation of our linear recursive formula. (3) We empirically show that the SimRank score s(u,v) decreases rapidly as the distance d(u,v) increases. Therefore, to compute SimRank scores for a query vertex u, we only need to look at a very "local" area. (4) We combine two upper bounds for the SimRank score s(u,v) (which can be obtained by Monte-Carlo simulation in our preprocessing), together with an adaptive sampling technique, to prune the similarity search procedure. This results in a much faster algorithm. Once our preprocessing is done (which only takes O(n) time), our algorithm finds, given a query vertex u, the top-20 vertices v with the highest SimRank scores s(u,v) in less than a few seconds, even for graphs with billions of edges. To the best of our knowledge, this is the first method to scale to graphs with billions of edges (for the single-source case).

99 citations
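
To illustrate what a Monte-Carlo single-pair estimate can look like, here is a rough sketch based on the classic random-walk interpretation of SimRank from Jeh and Widom: s(u,v) is the expected value of C^t, where t is the first step at which two independent backward random walks from u and v meet. This is not the algorithm of the paper above, which relies on its own "linear" recursive formula; the function name `simrank_mc`, the parameters `walks`, `max_steps`, and `seed`, and the graph layout are assumptions made for the example.

```python
import random


def simrank_mc(in_neighbors, u, v, C=0.8, walks=1000, max_steps=20, seed=0):
    """Estimate s(u, v) by simulating pairs of backward random walks."""
    if u == v:
        return 1.0
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(walks):
        a, b = u, v
        for step in range(1, max_steps + 1):
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                break  # a walk is stuck, so this pair never meets
            # Both walkers move one step backward along a random in-edge.
            a, b = rng.choice(ia), rng.choice(ib)
            if a == b:
                total += C ** step  # first meeting after `step` steps
                break
    return total / walks


# Toy graph: w points to both u and v, so the walks always meet after one step.
graph = {"w": [], "u": ["w"], "v": ["w"]}
print(simrank_mc(graph, "u", "v"))
```

Truncating walks at `max_steps` introduces a small downward bias of at most C raised to `max_steps`, and the variance shrinks as `walks` grows.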

Proceedings Article•DOI•

To randomize or not to randomize: space optimal summaries for hyperlink analysis

Tamas Sarlos¹, András A. Benczúr¹, Károly Csalogány¹, Dániel Fogaras², Balázs Rácz²•Institutions (2)

Hungarian Academy of Sciences¹, Budapest University of Technology and Economics²

23 May 2006

TL;DR: This paper achieves unrestricted personalization by combining rounding and randomized sketching techniques in the dynamic programming algorithm of Jeh and Widom and shows that the algorithms use an optimal amount of space by also improving earlier asymptotic worst-case lower bounds.

Abstract: Personalized PageRank expresses link-based page quality around user-selected pages. The only previous personalized PageRank algorithm that can serve on-line queries for an unrestricted choice of pages on large graphs is our Monte Carlo algorithm [WAW 2004]. In this paper we achieve unrestricted personalization by combining rounding and randomized sketching techniques in the dynamic programming algorithm of Jeh and Widom [WWW 2003]. We evaluate the precision of approximation experimentally on large-scale real-world data and find significant improvement over previous results. As a key theoretical contribution we show that our algorithms use an optimal amount of space by also improving earlier asymptotic worst-case lower bounds. Our lower bounds and algorithms apply to SimRank as well; of independent interest is the reduction of the SimRank computation to personalized PageRank.

92 citations
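
As background for the entry above, personalized PageRank for a single source page can be sketched with plain power iteration: a random surfer follows out-links and, with some probability, jumps back to the source. This is only the textbook definition, not the rounding or sketching algorithms of the paper. The function name `personalized_pagerank`, the teleport probability of 0.15, the iteration count, and the choice to send dangling-node mass back to the source are assumptions for illustration.

```python
def personalized_pagerank(out_neighbors, source, teleport=0.15, iterations=50):
    """Power iteration for personalized PageRank of one source node.

    out_neighbors maps every node to the list of nodes it links to
    (hypothetical layout chosen for this sketch).
    """
    nodes = list(out_neighbors)
    rank = {n: 1.0 if n == source else 0.0 for n in nodes}
    for _ in range(iterations):
        new = {n: 0.0 for n in nodes}
        for n in nodes:
            mass = rank[n]
            outs = out_neighbors[n]
            if outs:
                # Follow a random out-link with probability 1 - teleport ...
                share = (1.0 - teleport) * mass / len(outs)
                for m in outs:
                    new[m] += share
                # ... otherwise jump back to the source.
                new[source] += teleport * mass
            else:
                # Assumption: a dangling node returns all its mass to the source.
                new[source] += mass
        rank = new
    return rank


# Toy graph: a and b link to each other, c links to a.
graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
ranks = personalized_pagerank(graph, "a")
print(ranks)
```

Because no walk starting at `a` can reach `c`, node `c` keeps a score of zero, which shows how the scores concentrate around the chosen source.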

Performance Metrics

Papers: 250
Citations: 22,828

No. of papers in the topic in previous years
Year	Papers
2021	15
2020	26
2019	16
2018	17
2017	19
2016	16
