Topic

SimRank

About: SimRank is a research topic. Over its lifetime, 250 publications have been published within this topic, receiving 21,163 citations.

Papers

Book Chapter

Accuracy Estimation of Link-Based Similarity Measures and Its Application

Yinglong Zhang¹, Yinglong Zhang², Cuiping Li², Chengwang Xie¹, Hong Chen²

East China Jiaotong University¹, Renmin University of China²

16 Jun 2014

TL;DR: This work designs accurate and tight upper bounds for Personalized PageRank (PPR) and SimRank (SR) based on human intuition, and demonstrates the effectiveness of the new bounds for top-k similar-node queries, where they speed up the query.

Abstract: Link-based similarity measures play a significant role in many graph-based applications. Consequently, measuring node similarity in a graph is a fundamental problem in graph data mining. Personalized PageRank (PPR) and SimRank (SR) have emerged as the most popular and influential link-based similarity measures. In practice, PPR and SR scores are computed iteratively, and the overhead grows with the number of iterations. Ideally, the computation would run for only the minimum number of iterations needed to guarantee a desired accuracy. However, the existing upper bounds are too coarse to be useful in general. Therefore, we focus on designing accurate and tight upper bounds for PPR and SR in this paper. Our upper bounds are based on the following human intuition: “the smaller the difference between the results of two consecutive iteration steps, the smaller the difference between the iterative similarity scores and the theoretical ones”. Furthermore, we demonstrate the effectiveness of our novel upper bounds for top-k similar-node queries, where they speed up the query. Finally, we run a comprehensive set of experiments on real data sets to verify the effectiveness and efficiency of our upper bounds.
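
The intuition quoted in the abstract can be illustrated with a minimal Python sketch of the naive iterative SimRank computation (in the form introduced by Jeh and Widom) that stops once two consecutive iterations differ by less than a tolerance. This is not the paper's upper bound, which the abstract does not spell out; the function name `simrank`, the decay factor `c=0.8`, and the tolerance `tol` are assumptions chosen for illustration.

```python
from itertools import product


def simrank(in_neighbors, c=0.8, tol=1e-4, max_iter=100):
    """Naive iterative SimRank with a consecutive-difference stopping rule.

    in_neighbors: dict mapping every node to a list of its in-neighbors.
    c: decay factor (assumed value, not taken from the paper).
    tol: stop once no score changes by more than tol between iterations.
    Returns (scores, iterations_run).
    """
    nodes = list(in_neighbors)
    # Iteration 0: a node is fully similar to itself, and to nothing else.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # A node without in-neighbors is similar only to itself.
                new_sim[(a, b)] = 0.0
                continue
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new_sim[(a, b)] = c * total / (len(ia) * len(ib))
        # Largest change between two consecutive iterations.
        delta = max(abs(new_sim[pair] - sim[pair]) for pair in sim)
        sim = new_sim
        if delta <= tol:
            break
    return sim, iteration


# Toy graph: u points to both a and b, so a and b share their only in-neighbor.
graph = {"u": [], "a": ["u"], "b": ["u"]}
scores, iterations = simrank(graph)
print(scores[("a", "b")], iterations)
```

The stopping rule here is only a heuristic: a small change between iterations does not by itself certify a small distance to the exact scores, which is the gap that a tight bound of the kind the paper describes is meant to close.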

1 citation

Proceedings Article

Inferring borrower network in a microfinancing framework (KIVA)

Aritra Ghosh¹, Jithin Vachery¹

Indian Institute of Technology Madras¹

01 Aug 2016

TL;DR: A novel tripartite extension of SimRank is formulated using the network of lenders, loans and borrowers to capture the inherent pattern in the system; extensive experiments validate the effectiveness of the modeling and the proposed disambiguation scheme for borrowers.

Abstract: Microfinance institutions aim to offer financial services to people in the low-income category, who typically lack access to traditional banking systems. To date, more than 15 billion U.S. dollars have been infused into microfinancing, assisting more than 160 million people in developing countries. With the tremendous growth of the World Wide Web, a number of microfinance institutions have recently moved online. One such noble initiative is KIVA, a crowdsourced online microfinance platform that connects borrowers (small entrepreneurs and individuals) to lenders through field partners. Of particular interest to such microfinancing institutions is the analysis of the network of borrowers, which can help them improve the percentage of loan requests fulfilled. KIVA provides a rich dataset capturing the lending activities on the website. In this paper, we analyze the data to find and extract the structure in the KIVA framework. We formulate a novel tripartite extension of SimRank using the network of lenders, loans and borrowers to capture the inherent pattern in the system. We also propose a multipartite extension of SimRank useful for real-world settings. Extensive experiments validate the effectiveness of our modeling and the proposed disambiguation scheme for borrowers.

1 citation

Leveraging structural-context similarity of Wikipedia links to predict twitter user locations

Chuanqi Huang

31 Dec 2016

TL;DR: This thesis proposes a novel framework for predicting the location of a social media user by leveraging structural-context similarity over Wikipedia links and provides a list of ranked "probable" cities based on the distances between candidate locations and their weights.

Abstract: Twitter is a widely used social media service. Several efforts have targeted understanding the patterns of information dissemination underlying this social network. A user’s location is one of the most important pieces of information for analyzing content. However, location information tends to be unavailable because most users do not (want to) include geo-tags in their tweets. To predict a user’s location, existing approaches require voluminous training data sets of geo-tagged tweets. However, some characteristics of tweets, such as compact, non-traditional linguistic expressions, pose significant challenges when applying model-fitting approaches. In this thesis, we propose a novel framework for predicting the location of a social media user by leveraging structural-context similarity over Wikipedia links. We compute SimRank scores between pages over the Wikipedia dump dataset and build a knowledge base that maps location information (e.g., cities and states) to related vocabularies, along with the likelihood of these mappings. Our results evolve as the user’s tweet stream grows. We have implemented this framework using Apache Storm to observe real-time tweets. Finally, our framework provides a list of ranked "probable" cities based on the distances between candidate locations and their weights. This thesis includes empirical evaluations that demonstrate performance in line with current state-of-the-art location prediction approaches.

1 citation

Journal Article

Large-scale supervised similarity learning in networks

Shiyu Chang¹, Guo-Jun Qi², Yingzhen Yang¹, Charu C. Aggarwal³, Jiayu Zhou⁴, Meng Wang⁵, Thomas S. Huang¹

University of Illinois at Urbana–Champaign¹, University of Central Florida², IBM³, Michigan State University⁴, Hefei University of Technology⁵

01 Sep 2016-Knowledge and Information Systems

TL;DR: A factorized similarity learning (FSL) method is proposed to integrate the link, node content, and user supervision into a uniform framework by using matrix factorization, and the final similarities are approximated by the span of low-rank matrices.

Abstract: The problem of similarity learning is relevant to many data mining applications, such as recommender systems, classification, and retrieval. This problem is particularly challenging in the context of networks, which contain different aspects such as the topological structure, content, and user supervision. These different aspects need to be combined effectively in order to create a holistic similarity function. In particular, while most similarity learning methods in networks, such as SimRank, utilize the topological structure, the user supervision and content are rarely considered. In this paper, a factorized similarity learning (FSL) method is proposed to integrate the link, node content, and user supervision into a uniform framework. This is learned by using matrix factorization, and the final similarities are approximated by the span of low-rank matrices. The proposed framework is further extended to a noise-tolerant version by adopting a hinge loss as an alternative. To facilitate efficient computation on large-scale data, a parallel extension is developed. Experiments are conducted on the DBLP and CoRA data sets. The results show that FSL is robust and efficient and outperforms the state of the art. The code for the learning algorithm used in our experiments is available at http://www.ifp.illinois.edu/~chang87/.

1 citation

Journal Article

A Big Graph Clustering Algorithm Based on MapReduce

Yong Lin Leng¹, Qing Chen Zhang²

Bohai University¹, Dalian University of Technology²

01 Oct 2014-Advanced Materials Research

TL;DR: A distributed SimRank algorithm based on MapReduce is proposed to measure the similarity of graph nodes; results show that the method computes node similarity efficiently and clusters large graphs effectively.

Abstract: Graph clustering is an important technique in graph analysis, and measuring the similarity between nodes of a graph is the premise of graph clustering. SimRank, proposed by Jeh and Widom, is a general model for computing structural similarity. Because SimRank computes the similarity between nodes iteratively, its time and space complexity are very high. With the rapid growth of data, a single machine cannot meet the requirements of large-scale computation. In this paper, a distributed SimRank algorithm based on MapReduce is proposed and used to measure the similarity of graph nodes. A distributed AP clustering algorithm is then designed to cluster the graph nodes. Experiments compare clustering running time and speedup, and the results show that the method computes node similarity efficiently and clusters large graphs effectively.
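
To make the idea of a MapReduce formulation concrete, here is a minimal single-process Python sketch of one SimRank iteration split into map, shuffle and reduce steps. The abstract does not describe the authors' actual job design, so this is only one plausible decomposition; the function names (`map_phase`, `shuffle`, `reduce_phase`, `simrank_round`) and the decay factor `c=0.8` are hypothetical, and no real MapReduce framework is used.

```python
from collections import defaultdict


def map_phase(sim, out_neighbors):
    """For each scored pair (x, y), emit its score to every successor pair (a, b)."""
    for (x, y), score in sim.items():
        if score == 0.0:
            continue  # zero scores contribute nothing downstream
        for a in out_neighbors[x]:
            for b in out_neighbors[y]:
                if a != b:  # diagonal scores are fixed at 1 in the reduce step
                    yield (a, b), score


def shuffle(pairs):
    """Group emitted values by key, as a MapReduce runtime would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, in_degree, nodes, c):
    """Sum the contributions for each pair and normalize by the in-degrees."""
    new_sim = {(v, v): 1.0 for v in nodes}
    for (a, b), values in groups.items():
        new_sim[(a, b)] = c * sum(values) / (in_degree[a] * in_degree[b])
    return new_sim


def simrank_round(sim, out_neighbors, c=0.8):
    """Run one SimRank iteration; every node must appear as a key of out_neighbors."""
    nodes = list(out_neighbors)
    in_degree = {v: 0 for v in nodes}
    for targets in out_neighbors.values():
        for t in targets:
            in_degree[t] += 1
    emitted = map_phase(sim, out_neighbors)
    return reduce_phase(shuffle(emitted), in_degree, nodes, c)


# Toy graph: u points to a and b. Scores are stored sparsely (missing pair = 0).
out_edges = {"u": ["a", "b"], "a": [], "b": []}
initial = {(v, v): 1.0 for v in out_edges}
print(simrank_round(initial, out_edges))
```

Keying the emitted values by the target pair means each reducer sees every contribution to one similarity score, so the per-pair sums can be computed independently. The number of emitted records can grow quickly on dense graphs, which is consistent with the high time and space cost the abstract mentions.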

1 citation

Metrics

Papers: 250
Citations: 22,828

No. of papers in the topic in previous years
Year	Papers
2021	15
2020	26
2019	16
2018	17
2017	19
2016	16
