Proceedings Article
Rough set based clustering in dense web domain
Rajhans Mishra, Pradeep Kumar, Bharat Bhasker +2 more
- pp 521-526
TLDR
The clustering task for sequence data (web page visits) is demonstrated in three ways, namely capturing content information, sequence information, and a combination of both, suggesting that the measure which captures both content and sequence forms compact clusters, thus putting web users with similar interests in one group.
Abstract:
Clustering is a widely used technique in data mining applications. It groups objects on the basis of the similarity among them. The web has evolved enormously in the past few years, which has resulted in a sharp increase in the number of web users and web pages. Web personalization has become a challenging task for e-Commerce companies due to information overload on the web and the growing number of web users. Web users are matched with the available information in order to make personalization effective. Web usage data coming from a single domain tends to be dense, as many web users fetch pages from the same domain or application area. This scenario is prevalent on e-Commerce websites. Rough set theory is a soft computing technique that is efficient in dealing with ambiguities present in data. In this paper we use rough set based clustering with similarity upper approximation to derive the clusters. The clusters evolve in steps and finally converge into a well-defined clustering scheme. Developers are trying to customize web sites to the needs of specific users with the help of knowledge acquired from users' navigational behaviour. Since user page visits are intrinsically sequential, efficient clustering algorithms with a suitable distance/similarity measure for sequential data are needed. In the current paper, we demonstrate the clustering task for sequence data (web page visits) in three ways, namely capturing content information, sequence information, and a combination of both. Experimental results suggest that the measure which captures both content and sequence forms compact clusters, thus putting web users with similar interests in one group.
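The abstract does not give the formulas behind the three measures or the upper approximation step, so the following is only a minimal sketch under stated assumptions. It assumes content similarity is the Jaccard overlap of visited pages, sequence similarity is the longest common subsequence length normalized by the longer session, and the combined measure is a weighted sum with a hypothetical weight `p`. The function names, the `threshold` parameter, and the iterative merging rule are illustrative choices, not the authors' implementation.

```python
from typing import Callable, List, Sequence, Set


def content_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed content measure: Jaccard overlap of visited pages (order ignored)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sequence_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Assumed sequence measure: LCS length divided by the longer session length."""
    longest = max(len(a), len(b))
    return lcs_length(a, b) / longest if longest else 0.0


def combined_similarity(a: Sequence[str], b: Sequence[str], p: float = 0.5) -> float:
    """Assumed combination: weight p on order, 1 - p on content."""
    return p * sequence_similarity(a, b) + (1 - p) * content_similarity(a, b)


def upper_approximation_clusters(
    sessions: List[Sequence[str]],
    threshold: float,
    sim: Callable[[Sequence[str], Sequence[str]], float] = combined_similarity,
) -> List[Set[int]]:
    """Grow each session's similarity neighbourhood until no set changes."""
    n = len(sessions)
    # Step 1: sessions at least `threshold` similar to session i (i itself included).
    clusters = [
        {i} | {j for j in range(n) if sim(sessions[i], sessions[j]) >= threshold}
        for i in range(n)
    ]
    # Later steps: absorb the neighbourhoods of current members; the sets only
    # grow and are bounded by n, so the loop terminates.
    while True:
        expanded = [set().union(*(clusters[j] for j in c)) for c in clusters]
        if expanded == clusters:
            return clusters
        clusters = expanded
```

Calling `upper_approximation_clusters(sessions, threshold=0.5)` on a list of page-visit sequences returns one index set per session; sessions that end up with the same set belong to the same group. Passing `sim=content_similarity` or `sim=sequence_similarity` gives the content-only and sequence-only variants for comparison.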
Citations
Certain Investigations for Retrieving Web Documents using Soft Computing Techniques
TL;DR: This paper performs a complete survey of different techniques that are available for retrieving the web documents and presents research directions for using soft computing techniques for web document retrieval.
References
Proceedings Article
Letizia: an agent that assists web browsing
TL;DR: Letizia is a user interface agent that assists a user browsing the World Wide Web by automating a browsing strategy consisting of a best-first search augmented by heuristics that infer user interest from browsing behavior.
Journal Article
Interval Set Clustering of Web Users with Rough K-Means
Pawan Lingras, Chad West +1 more
TL;DR: A variation of the K-means clustering algorithm based on properties of rough sets is proposed, which represents clusters as interval or rough sets.
Journal Article
Discovering Internet marketing intelligence through online analytical web usage mining
Alex G. Büchner, Maurice Mulvenna +1 more
TL;DR: A novel way of combining data mining techniques on Internet data in order to discover actionable marketing intelligence in electronic commerce scenarios is described, which include marketing expertise as domain knowledge and are specifically designed for electronic commerce purposes.
Clickstream clustering using weighted longest common subsequences
TL;DR: A novel and effective algorithm for clustering web users based on a function of the longest common subsequence of their clickstreams that takes into account both the trajectory taken through a website and the time spent at each page.