scispace - formally typeset
Search or ask a question

Showing papers by "Ron Weiss published in 1996"


Proceedings ArticleDOI
01 Mar 1996
TL;DR: Experience with HyPursuit suggests that abstraction functions based on hypertext clustering can be used to construct meaningful and scalable cluster hierarchies, and is encouraged by preliminary results on clustering based on both document contents and hyperlink structures.
Abstract: HyPursuit is a new hierarchical network search engine that clusters hypertext documents to structure a given information space for browsing and search act ivities. Our content-link clustering algorithm is based on the semantic information embedded in hyperlink structures and document contents. HyPursuit admits multiple, coexisting cluster hierarchies based on different principles for grouping documents, such as the Library of Congress catalog scheme and automatically created hypertext clusters. HyPursuit’s abstraction functions summarize cluster contents to support scalable query processing. The abstraction functions satisfy system resource limitations with controlled information 10SS. The result of query processing operations on a cluster summary approximates the result of performing the operations on the entire information space. We constructed a prototype system comprising 100 leaf WorldWide Web sites and a hierarchy of 42 servers that route queries to the leaf sites. Experience with our system suggests that abstraction functions based on hypertext clustering can be used to construct meaningful and scalable cluster hierarchies. We are also encouraged by preliminary results on clustering based on both document contents and hyperlink structures.

342 citations