The stochastic approach for link-structure analysis (SALSA) and the TKC effect
Citations
Google's PageRank and Beyond: The Science of Search Engine Rankings
Link mining: a survey
Vital nodes identification in complex networks
Deeper Inside PageRank
Co-authorship networks in the digital library research community
References
The anatomy of a large-scale hypertextual Web search engine
Co-citation in the scientific literature: A new measure of the relationship between two documents
Citation analysis as a tool in journal evaluation.
Frequently Asked Questions (9)
Q2. What are the contributions in this paper?
In this context, Jon Kleinberg introduced the notion of two distinct types of Web sites: hubs and authorities. The authors present SALSA, a new stochastic approach for link structure analysis, which examines random walks on graphs derived from the link structure. The authors show that both SALSA and Kleinberg's mutual reinforcement approach employ the same meta-algorithm. The authors then prove that SALSA is equivalent to a weighted in-degree analysis of the link structure of World Wide Web subgraphs, making it computationally more efficient than the mutual reinforcement approach. These comparisons reveal a topological phenomenon called the TKC (Tightly Knit Community) effect which, in certain cases, prevents the mutual reinforcement approach from identifying meaningful authorities.
Q3. What is the main goal of broad-topic searches?
It is important to keep in mind the main goal of broad-topic World Wide Web searches, which is to enhance the precision at 10 of the results, not to rank the entire collection of sites correctly.
Q4. What is the principal eigenvector of an irreducible, aperiodic stochastic matrix?
By the ergodic theorem [9], the principal eigenvector of an irreducible, aperiodic stochastic matrix is actually the stationary distribution of the underlying Markov chain, and its high entries correspond to sites most frequently visited by the (infinite) random walk.
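The statement above can be checked numerically. The following is a minimal sketch, not the paper's code: it approximates the stationary distribution by power iteration on a small, made-up row-stochastic matrix. The function name `stationary_distribution`, the tolerance, and the example matrix `P` are all illustrative assumptions.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    (a list of rows) by repeatedly applying pi <- pi * P from a uniform start."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        # One step of the random walk: new_pi[j] = sum_i pi[i] * P[i][j]
        new_pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_pi, pi)) < tol:
            return new_pi
        pi = new_pi
    return pi


# Hypothetical 3-state chain: irreducible, and aperiodic because of the self-loops.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_distribution(P)
# The largest entry of pi marks the state the walk visits most often (state 1 here).
most_visited = max(range(len(pi)), key=lambda j: pi[j])
```

For this particular matrix the balance equations give (0.25, 0.5, 0.25), so the middle state is the one the walk visits most frequently.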
Q5. What is the probability of a random walk on A?
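The random walk on A, governed by the transition matrix $P_A$ and started from all states with equal probability, will converge to a stationary distribution as follows: $\lim_{n \to \infty} e P_A^n = \tilde{\pi}$, where $\tilde{\pi}_j = \frac{|A_{c(j)}|}{|A|} \, \pi^{c(j)}_j$. Here $e$ is the uniform initial distribution, $c(j)$ is the index of the irreducible component that contains site $j$, and $\pi^{c(j)}$ is the stationary distribution of that component.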
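As a small illustration of this limit, the sketch below combines per-component stationary distributions into one vector, scaling each component by its share of the sites. It assumes the components and their individual stationary distributions are already known; the function name `combined_stationary` and the example values are hypothetical.

```python
def combined_stationary(components):
    """components: list of dicts, one per irreducible component, each mapping
    site -> stationary probability within that component (summing to 1).
    Returns the limit distribution of a walk started uniformly over all sites."""
    total_sites = sum(len(comp) for comp in components)
    result = {}
    for comp in components:
        # A uniform start puts mass |component| / |all sites| on this component,
        # and that mass never leaves it.
        weight = len(comp) / total_sites
        for site, prob in comp.items():
            result[site] = weight * prob
    return result


# Hypothetical example: one component of three sites and one isolated site.
components = [
    {"a": 0.5, "b": 0.25, "c": 0.25},
    {"d": 1.0},
]
limit = combined_stationary(components)
```

In this example the three-site component keeps 3/4 of the total mass, so site "a" ends with 0.375, while the isolated site "d" ends with 0.25.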
Q6. What is the simplest way to assign authority weights to sites?
In order to assign such weights, Kleinberg uses the following iterative algorithm:
(1) Initialize $a(s) \leftarrow 1$, $h(s) \leftarrow 1$ for all sites $s \in C$.
(2) Repeat the following three operations until convergence:
• Update the authority weight of each site $s$ (the I operation): $a(s) \leftarrow \sum_{\{x \mid x \text{ points to } s\}} h(x)$
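A minimal sketch of this mutual reinforcement iteration follows. The excerpt above only spells out the authority update; the hub update and the normalization step in the code are taken from the standard formulation of Kleinberg's algorithm, not from this excerpt. The function name `hits`, the fixed iteration count (used in place of a convergence test), and the example graph are illustrative assumptions.

```python
import math


def hits(links, iterations=50):
    """links: dict mapping each site to the list of sites it points to.
    Returns (authority, hub) weight dicts with unit Euclidean norm."""
    sites = set(links) | {t for targets in links.values() for t in targets}
    # Step (1): initialize all weights to 1 (a is recomputed from h right away).
    a = {s: 1.0 for s in sites}
    h = {s: 1.0 for s in sites}
    for _ in range(iterations):
        # Authority update: a(s) <- sum of h(x) over all x that point to s.
        a = {s: 0.0 for s in sites}
        for x, targets in links.items():
            for s in targets:
                a[s] += h[x]
        # Hub update (assumed, standard formulation): h(s) <- sum of a(x)
        # over all x that s points to.
        h = {s: sum(a[t] for t in links.get(s, ())) for s in sites}
        # Normalization (assumed): rescale both vectors to unit length.
        a_norm = math.sqrt(sum(v * v for v in a.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in h.values())) or 1.0
        a = {s: v / a_norm for s, v in a.items()}
        h = {s: v / h_norm for s, v in h.items()}
    return a, h


# Hypothetical graph: three hub pages pointing at two candidate authorities.
links = {"h1": ["x", "y"], "h2": ["x"], "h3": ["x", "y"]}
authority, hub = hits(links)
```

Here "x" receives a higher authority weight than "y" because every hub points to it, and the pages with no incoming links get an authority weight of zero.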
Q7. What is the simplest algorithm for calculating the stochastic ranking?
This mathematical analysis, in addition to providing insight about the ranking that is produced by SALSA, also suggests a very simple algorithm for calculating the stochastic ranking: simply calculate, for all sites, the sum of weights on their incoming (outgoing) edges, and normalize these two vectors.
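The simple algorithm described here can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: it normalizes over the whole collection, which corresponds to the case where the link graph forms a single connected component. The function name `salsa_scores` and the edge list are hypothetical.

```python
def salsa_scores(edges):
    """edges: iterable of (source, target, weight) triples.
    Returns (authority, hub) dicts: the normalized sums of weights on
    incoming and outgoing edges, respectively."""
    authority = {}
    hub = {}
    for source, target, weight in edges:
        authority[target] = authority.get(target, 0.0) + weight
        hub[source] = hub.get(source, 0.0) + weight
    # Normalize each vector so that its entries sum to 1.
    a_total = sum(authority.values()) or 1.0
    h_total = sum(hub.values()) or 1.0
    authority = {s: v / a_total for s, v in authority.items()}
    hub = {s: v / h_total for s, v in hub.items()}
    return authority, hub


# Hypothetical unweighted graph (all weights 1): three hubs, two authorities.
edges = [
    ("h1", "x", 1.0), ("h2", "x", 1.0), ("h3", "x", 1.0),
    ("h1", "y", 1.0), ("h3", "y", 1.0),
]
authority, hub = salsa_scores(edges)
```

Unlike the iterative approach, this needs a single pass over the edges, which is the efficiency gain the paper points to.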
Q8. What kinds of links are used by the Web?
There are many kinds of links which confer little or no authority [4], such as intra-domain (inner) links (whose purpose is to provide navigational aid in a complex Web site of some organization), commercial/sponsor links, and links which result from link-exchange agreements.
Q9. What is the corresponding eigenvector of a hub matrix?
• Given a topic $t$, construct a site collection $C$ which should contain many $t$-hubs and $t$-authorities, but should not contain many hubs or authorities for any other topic $t'$. Let $n = |C|$.
• Derive, from $C$ and the link structure induced by it, two $n \times n$ association matrices: a hub matrix $H$ and an authority matrix $A$. Association matrices are widely used in classification algorithms [22] and will be used here in order to classify the Web sites into communities of hubs/authorities.
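The excerpt does not say how the two association matrices are derived. One choice, which corresponds to Kleinberg's mutual reinforcement approach, is to take the adjacency matrix W of the collection and form H = W Wᵀ and A = Wᵀ W. The sketch below assumes that choice; the function name `association_matrices` and the example adjacency matrix are illustrative.

```python
def association_matrices(W):
    """W: n x n adjacency matrix as a list of rows, W[i][j] = 1 if site i
    links to site j. Returns (H, A) with H = W * W^T and A = W^T * W."""
    n = len(W)
    # H[i][j]: number of sites that both i and j point to.
    H = [[sum(W[i][k] * W[j][k] for k in range(n)) for j in range(n)]
         for i in range(n)]
    # A[i][j]: number of sites that point to both i and j.
    A = [[sum(W[k][i] * W[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return H, A


# Hypothetical collection of n = 4 sites: sites 0 and 1 link to site 2,
# and site 1 also links to site 3.
W = [
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
H, A = association_matrices(W)
```

In this example A[2][2] is 2 because two sites point to site 2, and H[0][1] is 1 because sites 0 and 1 share one target.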