
Showing papers on "SimRank" published in 2008


Journal Article
01 Aug 2008
TL;DR: It is argued that Simrank fails to properly identify query similarities in the authors' application, and two enhanced versions of Simrank are presented: one that exploits weights on click graph edges and another that exploits "evidence."
Abstract: We focus on the problem of query rewriting for sponsored search. We base rewrites on a historical click graph that records the ads that have been clicked on in response to past user queries. Given a query q, we first consider Simrank [7] as a way to identify queries similar to q, i.e., queries whose ads a user may be interested in. We argue that Simrank fails to properly identify query similarities in our application, and we present two enhanced versions of Simrank: one that exploits weights on click graph edges and another that exploits "evidence." We experimentally evaluate our new schemes against Simrank, using actual click graphs and queries from Yahoo!, and using a variety of metrics. Our results show that the enhanced methods can yield more and better query rewrites.

188 citations


Journal Article
01 Aug 2008
TL;DR: This technique provides a way to find out the number of iterations required to achieve a desired accuracy when computing SimRank iteratively, and introduces a threshold sieving heuristic and its accuracy estimation that further improves the efficiency of the method.
Abstract: The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. SimRank is a simple and intuitive measure of this kind, based on a graph-theoretic model. SimRank is typically computed iteratively, in the spirit of PageRank. However, existing work on SimRank lacks accuracy estimation of iterative computation and has discouraging time complexity. In this paper we present a technique to estimate the accuracy of computing SimRank iteratively. This technique provides a way to find out the number of iterations required to achieve a desired accuracy when computing SimRank. We also present optimization techniques that improve the computational complexity of the iterative algorithm from O(n^4) to O(n^3) in the worst case. We also introduce a threshold sieving heuristic and its accuracy estimation that further improves the efficiency of the method. As a practical illustration of our techniques we computed SimRank scores on a subset of the English Wikipedia corpus, consisting of the complete set of articles and category links.

168 citations
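
The iterative computation that the abstract above refers to can be illustrated with a minimal sketch of the naive SimRank iteration. This is not the optimized algorithm from the paper: the function name `simrank`, the `in_neighbors` dictionary layout, the decay factor `C = 0.8`, and the fixed iteration count are all assumptions chosen for illustration, and every in-neighbor is assumed to appear as a key in the dictionary.

```python
from itertools import product


def simrank(in_neighbors, C=0.8, iterations=5):
    """Naive iterative SimRank (illustrative sketch, not the paper's optimized method).

    in_neighbors maps each node to the list of nodes that link to it.
    Returns a dict mapping (a, b) pairs to similarity scores.
    """
    nodes = list(in_neighbors)
    # Start from the identity: a node is fully similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}

    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # Nodes without in-neighbors have zero similarity to others.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, scaled by C.
            total = sum(sim[(i, j)] for i in ia for j in ib)
            new_sim[(a, b)] = C * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical toy graph: node "u" links to both "a" and "b".
scores = simrank({"u": [], "a": ["u"], "b": ["u"]}, C=0.8, iterations=3)
print(scores[("a", "b")])
```

In this toy graph, "a" and "b" share their only in-neighbor, so their score settles at the decay factor after the first iteration. Each pass visits every node pair and every pair of their in-neighbors, which is the cost that makes the naive iteration expensive on large graphs.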


Journal Article
01 Aug 2008
TL;DR: SimRank is a simple and intuitive measure of similarity between objects, based on similarity scores, that is applicable to many areas of computer science, including information retrieval.
Abstract: The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. SimRank is a simple and intuitive measure of this kind, based on ...

62 citations


Proceedings Article
21 Apr 2008
TL;DR: It is argued that Simrank fails to properly identify query similarities in this application, and two enhanced versions of Simrank are presented: one that exploits weights on click graph edges and another that exploits evidence.
Abstract: We focus on the problem of query rewriting for sponsored search. We base rewrites on a historical click graph that records the ads that have been clicked on in response to past user queries. Given a query q, we first consider Simrank [2] as a way to identify queries similar to q, i.e., queries whose ads a user may be interested in. We argue that Simrank fails to properly identify query similarities in our application, and we present two enhanced versions of Simrank: one that exploits weights on click graph edges and another that exploits "evidence." We experimentally evaluate our new schemes against Simrank, using actual click graphs and queries from Yahoo!, and using a variety of metrics. Our results show that the enhanced methods can yield more and better query rewrites.

31 citations


Book Chapter
08 Oct 2008
TL;DR: This paper proposes a new method that combines content analysis and link analysis to compute the similarity of research papers so that these papers can be clustered more accurately, and develops a strategy to deal with the relationship graph separately without affecting the accuracy.
Abstract: Both content analysis and link analysis have their advantages in measuring relationships among documents. In this paper, we propose a new method to combine these two methods to compute the similarity of research papers so that we can cluster these papers more accurately. In order to improve the efficiency of similarity calculation, we develop a strategy to deal with the relationship graph separately without affecting the accuracy. We also design an approach to assign different weights to different links to the papers, which can enhance the accuracy of similarity calculation. Experimental results on the ACM data set show that our new algorithm, S-SimRank, outperforms other algorithms.

19 citations


Journal Article
TL;DR: An algorithm for mining 'effect-effect' similarity relations in TCM based on the SimRank method is proposed, and evaluation of the results by TCM experts shows that the algorithm's accuracy is comparatively high.
Abstract: Analysis of 'effect-effect' relations in Traditional Chinese Medicine (TCM) is one of the most fundamental and important issues in TCM research and is of great significance for research on TCM prescription effects. The paper uses data mining technology to automatically mine the similarity relations in TCM prescription data and derive the degree of similarity between different drug effects. To this end, an algorithm for mining 'effect-effect' similarity relations in TCM based on the SimRank method is proposed. Evaluation of the results by TCM experts shows that the algorithm's accuracy is comparatively high: results rated 'good' or 'reasonable' together account for 70.568%.

1 citation