Proceedings ArticleDOI
Inferring Web communities from link topology
David Gibson,Jon Kleinberg,Prabhakar Raghavan +2 more
- pp 225-234
Reads0
Chats0
TLDR
This investigation shows that although the process by which users of the Web create pages and links is very difficult to understand at a “local” level, it results in a much greater degree of orderly high-level structure than has typically been assumed.Abstract:
The World Wide Web grows through a decentralized, almost anarchic process, and this has resulted in a large hyperlinked corpus without the kind of logical organization that can be built into more tradit,ionally-created hypermedia. To extract, meaningful structure under such circumstances, we develop a notion of hyperlinked communities on the www t,hrough an analysis of the link topology. By invoking a simple, mathematically clean method for defining and exposing the structure of these communities, we are able to derive a number of themes: The communities can be viewed as containing a core of central, “authoritative” pages linked togh and they exhibit a natural type of hierarchical topic generalization that can be inferred directly from the pat,t,ern of linkage. Our investigation shows that although the process by which users of the Web create pages and links is very difficult to understand at a “local” level, it results in a much greater degree of orderly high-level structure than has typically been assumed.read more
Citations
More filters
Journal ArticleDOI
Authoritative sources in a hyperlinked environment
TL;DR: This work proposes and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure, and has connections to the eigenvectors of certain matrices associated with the link graph.
Journal ArticleDOI
Evolution of networks
TL;DR: The recent rapid progress in the statistical physics of evolving networks is reviewed, and how growing networks self-organize into scale-free structures is discussed, and the role of the mechanism of preferential linking is investigated.
Data Mining: Concepts and Techniques (2nd edition)
Jiawei Han,Micheline Kamber +1 more
TL;DR: There have been many data mining books published in recent years, including Predictive Data Mining by Weiss and Indurkhya [WI98], Data Mining Solutions: Methods and Tools for Solving Real-World Problems by Westphal and Blaxton [WB98], Mastering Data Mining: The Art and Science of Customer Relationship Management by Berry and Linofi [BL99].
Journal ArticleDOI
Friends and neighbors on the Web
Lada A. Adamic,Eytan Adar +1 more
TL;DR: In this paper, the authors show that some factors are better indicators of social connections than others, and that these indicators vary between user populations, and provide potential applications in automatically inferring real world connections and discovering, labeling, and characterizing communities.
Journal ArticleDOI
Web usage mining: discovery and applications of usage patterns from Web data
TL;DR: Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications as mentioned in this paper, where preprocessing, pattern discovery, and pattern analysis are described in detail.
References
More filters
Proceedings Article
The PageRank Citation Ranking : Bringing Order to the Web
TL;DR: This paper describes PageRank, a mathod for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.
Journal ArticleDOI
Indexing by Latent Semantic Analysis
TL;DR: A new method for automatic indexing and retrieval to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Journal ArticleDOI
Authoritative sources in a hyperlinked environment
TL;DR: This work proposes and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure, and has connections to the eigenvectors of certain matrices associated with the link graph.