Topic

SimRank

About: SimRank is a research topic. Over its lifetime, 250 publications have been published within this topic, receiving 21,163 citations.

Papers

Journal Article•DOI•

UniWalk: Unidirectional Random Walk Based Scalable SimRank Computation over Large Graph

Junshuai Song¹, Xiongcai Luo¹, Jun Gao¹, Chang Zhou², Hu Wei², Jeffery Xu Yu³•Institutions (3)

Peking University¹, Alibaba Group², The Chinese University of Hong Kong³

01 May 2018-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A Monte Carlo based method to enable fast top-$k$ SimRank computation over large undirected graphs, which outperforms the state-of-the-art methods by orders of magnitude and is extended to existing distributed graph processing frameworks to improve its scalability.

Abstract: SimRank is an important measure of vertex-pair similarity according to the structure of graphs. Although progress has been achieved, existing methods still face challenges in handling large graphs. Besides huge index construction and maintenance cost, existing methods may require considerable search space and time overheads in the online SimRank query. In this paper, we design a Monte Carlo based method, UniWalk, to enable the fast top-$k$ SimRank computation over large undirected graphs. UniWalk directly locates the top-$k$ similar vertices for any single source vertex $u$ via $R$ sampling paths originating from $u$, which avoids selecting candidate vertex set $\mathcal{C}$ and the following $O(|\mathcal{C}|R)$ bidirectional sampling paths. We also devise a path enumeration strategy to improve the SimRank precision by using path probabilities instead of path frequencies when sampling, a space-efficient method to reduce intermediate results, and a path-sharing strategy to lower the redundant path sampling cost for multiple source vertices. Furthermore, we extend UniWalk to existing distributed graph processing frameworks to improve its scalability. We conduct extensive experiments to illustrate that UniWalk has high scalability, and outperforms the state-of-the-art methods by orders of magnitude.
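
The abstract contrasts UniWalk with methods that sample paths from both vertices of each candidate pair. As a rough illustration of that baseline (not UniWalk itself), the sketch below estimates the SimRank of one pair on an undirected graph as the average of c**t, where t is the first step at which two independent random walks meet. The function name `mc_simrank` and the parameters `c`, `walks`, `max_len`, and `seed` are assumptions made for this example, and truncating walks at `max_len` slightly underestimates the score.

```python
import random

def mc_simrank(neighbors, u, v, c=0.6, walks=1000, max_len=10, seed=0):
    """Estimate s(u, v) on an undirected graph by pairwise random walks.

    neighbors: dict mapping each vertex to a list of adjacent vertices.
    c, walks, max_len, seed are illustrative defaults, not values from the paper.
    """
    if u == v:
        return 1.0
    rng = random.Random(seed)
    total = 0.0
    for _ in range(walks):
        a, b = u, v
        for step in range(1, max_len + 1):
            if not neighbors[a] or not neighbors[b]:
                break  # a walk cannot continue; this sample contributes 0
            a = rng.choice(neighbors[a])
            b = rng.choice(neighbors[b])
            if a == b:
                total += c ** step  # first meeting at this step
                break
    return total / walks

# Star graph: vertices 1 and 2 are both attached to the hub 0.
star = {0: [1, 2], 1: [0], 2: [0]}
print(mc_simrank(star, 1, 2))
```

Scoring every candidate against a source this way costs one pair of walks per candidate per sample, which is the $O(|\mathcal{C}|R)$ sampling cost the abstract says UniWalk avoids.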

14 citations

Proceedings Article•DOI•

A Graph-Theoretic Algorithm for Automatic Extension of Translation Lexicons

Beate Dorow¹, Florian Laws¹, Lukas Michelbacher¹, Christian Scheible¹, Jason Utt¹•Institutions (1)

University of Stuttgart¹

31 Mar 2009

TL;DR: A graph-theoretic approach to the identification of yet-unknown word translations that is based on the recursive SimRank algorithm and relies on the intuition that two words are similar if they establish similar grammatical relationships with similar other words.

Abstract: This paper presents a graph-theoretic approach to the identification of yet-unknown word translations. The proposed algorithm is based on the recursive SimRank algorithm and relies on the intuition that two words are similar if they establish similar grammatical relationships with similar other words. We also present a formulation of SimRank in matrix form and extensions for edge weights, edge labels and multiple graphs.
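
For orientation, the standard SimRank recursion and one common way of writing it in matrix form are given below. This is the textbook formulation, stated here as an assumption; the paper's own matrix form and its weighted and labeled extensions may differ. The symbols $I(a)$ (in-neighbors of $a$), $C$ (decay factor in $(0,1)$), $A$ (adjacency matrix) and $W$ (column-normalized adjacency matrix) are introduced for this illustration.

```latex
% Pairwise recursion, with s(a,a) = 1 and s(a,b) = 0 if I(a) or I(b) is empty:
s(a,b) = \frac{C}{|I(a)|\,|I(b)|} \sum_{i \in I(a)} \sum_{j \in I(b)} s(i,j),
\qquad a \neq b.

% Matrix form, with W_{ij} = A_{ij} / |I(j)| and the maximum taken entry-wise:
S = \max\!\left( C \, W^{\top} S \, W,\; I \right).
```

Entry $(a,b)$ of $W^{\top} S W$ is exactly the normalized double sum in the recursion, and the entry-wise maximum with the identity restores $s(a,a) = 1$ on the diagonal.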

14 citations

Journal Article•DOI•

CiteRank: combination similarity and static ranking with research paper searching

Pijitra Jomsri¹, Siripun Sanguansintukul¹, Worasit Choochaiwattana²•Institutions (2)

Chulalongkorn University¹, Dhurakij Pundit University²

01 Apr 2011-International Journal of Internet Technology and Secured Transactions

TL;DR: Proposes CiteRank, a combination of a similarity ranking with a static ranking, and finds that it can improve the effectiveness of research paper searching on social bookmarking websites.

Abstract: Search engines and social bookmarking systems are important tools for web resource discovery. The performance and capabilities of web search engines are vital. This paper proposes CiteRank, a combination of a similarity ranking with a static ranking. Similarity ranking measures the match between a query and a research paper index, while a static ranking, or query-independent ranking, measures the quality of a research paper. For this particular study, the static ranking score was determined from a group of factors: the number of groups containing the posted paper, the year of publication, the time the research paper was posted, and the priority of the research paper. NDCG was used as the evaluation metric. CiteRank was compared with SimRank and StaticRank. The results of the experiment showed that CiteRank produces a better ranking than the other methods. This implies that CiteRank can improve the effectiveness of research paper searching on social bookmarking websites.
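
The abstract names NDCG as the evaluation metric without defining it. A minimal sketch of one common variant (linear gain, log2 rank discount) follows; the paper may use a different gain function or a cutoff, and the function names `dcg` and `ndcg` are assumptions made for this example.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance grades listed in ranked order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# A ranking that places the most relevant paper last scores below 1.
print(ndcg([1, 2, 3]))
```

A score of 1 means the ranking already lists results in descending order of relevance, which is how rankings from different methods can be compared on the same queries.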

13 citations

Journal Article•DOI•

A fuzzy clustering based method for attributed graph partitioning

Chaobo He¹, Shuangyin Liu¹, Lei Zhang¹, Jianhua Zheng¹•Institutions (1)

Zhongkai University of Agriculture and Engineering¹

01 Sep 2019-Journal of Ambient Intelligence and Humanized Computing

TL;DR: This paper proposes a novel method for attributed graph partitioning based on fuzzy clustering that devises a unified similarity measure using SimRank to construct the fuzzy similarity matrix of the attributed graph and deduces the corresponding fuzzy equivalent matrix using fuzzy set theory.

Abstract: Graph partitioning methods in data mining have been widely used to discover protein complexes in protein–protein interaction (PPI) networks. However, PPI networks with attributes need more effective attributed graph partitioning methods. Attributed graph partitioning aims to obtain high-quality partitions satisfying the requirement that nodes in the same partition not only connect to each other more densely but also share more similar attribute values. In this paper, we propose a novel method for attributed graph partitioning based on fuzzy clustering. This method first devises a unified similarity measure using SimRank to construct the fuzzy similarity matrix of the attributed graph, integrating structural and attribute similarities of nodes into a flexible weighted framework. Then it deduces the corresponding fuzzy equivalent matrix using fuzzy set theory. Finally, the partitioning result is obtained using a fuzzy clustering algorithm. We conduct experiments on several typical attributed graphs, which can also simulate PPI networks with attributes. The results show that our method is very effective at identifying high-quality partitions of attributed graphs and even performs better than some representative methods.
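
The abstract does not say how the fuzzy equivalent matrix is derived. One textbook route, assumed here purely for illustration, is to take the max-min transitive closure of the fuzzy similarity matrix and then cut it at a threshold to read off partitions. The function names and the threshold parameter `lam` are hypothetical.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def fuzzy_equivalent(r):
    """Square a reflexive, symmetric fuzzy matrix until it stops changing."""
    while True:
        squared = max_min_compose(r, r)
        if squared == r:
            return r
        r = squared

def lambda_cut_clusters(t, lam):
    """Group nodes whose equivalence degree is at least lam."""
    n = len(t)
    assigned = [False] * n
    clusters = []
    for i in range(n):
        if not assigned[i]:
            members = [j for j in range(n) if t[i][j] >= lam]
            for j in members:
                assigned[j] = True
            clusters.append(members)
    return clusters

similarity = [[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.5],
              [0.2, 0.5, 1.0]]
closure = fuzzy_equivalent(similarity)
print(lambda_cut_clusters(closure, 0.6))
```

Raising `lam` yields more, smaller partitions, and lowering it merges them, so the threshold controls the granularity of the result.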

13 citations

Journal Article•DOI•

Dynamical SimRank search on time-varying networks

Weiren Yu¹, Xuemin Lin², Wenjie Zhang², Julie A. McCann³•Institutions (3)

Aston University¹, University of New South Wales², Imperial College London³

01 Feb 2018

TL;DR: The efficient dynamical computation of all-pairs SimRanks on time-varying graphs is studied and it is shown that the SimRank update in response to every link update is expressible as a rank-one Sylvester matrix equation.

Abstract: SimRank is an appealing pair-wise similarity measure based on graph structure. It iteratively follows the intuition that two nodes are assessed as similar if they are pointed to by similar nodes. Many real graphs are large, and links are constantly subject to minor changes. In this article, we study the efficient dynamical computation of all-pairs SimRanks on time-varying graphs. Existing methods for the dynamical SimRank computation [e.g., LTSF (Shao et al. in PVLDB 8(8):838-849, 2015) and READS (Zhang et al. in PVLDB 10(5):601-612, 2017)] mainly focus on top-k search with respect to a given query. For all-pairs dynamical SimRank search, Li et al.'s approach (Li et al. in EDBT, 2010) was proposed. It first factorizes the graph via a singular value decomposition (SVD) and then incrementally maintains such a factorization in response to link updates at the expense of exactness. As a result, all pairs of SimRanks are updated approximately, yielding $O(r^4 n^2)$ time and $O(r^2 n^2)$ memory in a graph with $n$ nodes, where $r$ is the target rank of the low-rank SVD. Our solution to the dynamical computation of SimRank comprises five ingredients: (1) We first consider edge updates that do not accompany new node insertions. We show that the SimRank update $\Delta \mathbf{S}$ in response to every link update is expressible as a rank-one Sylvester matrix equation. This provides an incremental method requiring $O(Kn^2)$ time and $O(n^2)$ memory in the worst case to update $n^2$ pairs of similarities for $K$ iterations. (2) To speed up the computation further, we propose a lossless pruning strategy that captures the "affected areas" of $\Delta \mathbf{S}$ to eliminate unnecessary retrieval. This reduces the time of the incremental SimRank to $O(K(m+|\mathsf{AFF}|))$, where $m$ is the number of edges in the old graph, and $|\mathsf{AFF}| \ (\le n^2)$ is the size of "affected areas" in $\Delta \mathbf{S}$, and in practice, $|\mathsf{AFF}| \ll n^2$. (3) We also consider edge updates that accompany node insertions, and categorize them into three cases, according to which end of the inserted edge is a new node. For each case, we devise an efficient incremental algorithm that can support new node insertions and accurately update the affected SimRanks. (4) We next study batch updates for dynamical SimRank computation, and design an efficient batch incremental method that handles "similar sink edges" simultaneously and eliminates redundant edge updates. (5) To achieve linear memory, we devise a memory-efficient strategy that dynamically updates all pairs of SimRanks column by column in just $O(Kn+m)$ memory, without the need to store all $(n^2)$ pairs of old SimRank scores. Experimental studies on various datasets demonstrate that our solution substantially outperforms the existing incremental SimRank methods and is faster and more memory-efficient than its competitors on million-scale graphs.
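
For contrast with the incremental approach, the sketch below shows the naive batch iteration that recomputes every pair from scratch after a graph change. It is the standard fixed-point iteration of the "pointed to by similar nodes" intuition, not the paper's algorithm, and the function name `simrank`, the decay factor `c`, and the default iteration count are assumptions made for this example.

```python
def simrank(in_neighbors, c=0.6, iterations=10):
    """All-pairs SimRank by naive iteration.

    in_neighbors: dict mapping each node to a list of nodes pointing to it.
    Returns a nested dict s with s[a][b] the similarity of a and b.
    """
    nodes = list(in_neighbors)
    s = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        nxt = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    nxt[a][b] = 1.0
                elif in_neighbors[a] and in_neighbors[b]:
                    # Average similarity over all pairs of in-neighbors.
                    total = sum(s[i][j]
                                for i in in_neighbors[a]
                                for j in in_neighbors[b])
                    nxt[a][b] = c * total / (
                        len(in_neighbors[a]) * len(in_neighbors[b]))
                else:
                    nxt[a][b] = 0.0  # a node with no in-neighbors
        s = nxt
    return s

# Node 0 points to both 1 and 2, so 1 and 2 share their only in-neighbor.
scores = simrank({0: [], 1: [0], 2: [0]})
print(scores[1][2])
```

Every pass touches all node pairs and stores a full score table, which is the cost that the rank-one update, pruning, and column-by-column strategies described in the abstract aim to avoid after a single link change.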

12 citations

Metrics

Papers: 250

Citations: 22,828
No. of papers in the topic in previous years
Year	Papers
2021	15
2020	26
2019	16
2018	17
2017	19
2016	16
