Open Access Journal Article

A conceptual clustering approach for user profiling in personal information agents

Abstract
Information agents have emerged in the last decade as an alternative for helping users cope with the increasing volume of information available on the Web. To provide personalized assistance, these agents rely on knowledge about users contained in user profiles, i.e., models of users' preferences and interests gathered by observing user behavior. User profiles have to summarize categories corresponding not only to diverse user information interests but also to different levels of abstraction, so that agents can decide on the relevance of new pieces of information. To this end, discovering interest categories through document clustering offers the advantage that no a priori knowledge of user interests is needed, so the process of acquiring profiles is completely unsupervised. However, most document clustering algorithms are not applicable to the problem of incrementally acquiring and modeling interests, because of either the kind of solutions they provide, which do not resemble user interests, or the way they build such solutions, which is generally not incremental. In this paper we describe and evaluate a document clustering algorithm, named WebDCC (Web Document Conceptual Clustering), designed to support the learning of user interests by personal information agents. The WebDCC algorithm carries out incremental, unsupervised concept learning over Web documents with the goal of building and maintaining user profiles that are both accurate and comprehensible. We present an empirical evaluation of this algorithm for user profiling and of its advantages with respect to other clustering algorithms.
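
To make the idea of incremental, unsupervised profile acquisition concrete, the following is a minimal sketch of a single-pass ("leader"-style) document clusterer. It is not the WebDCC algorithm, whose details are not given in the abstract: the class `IncrementalProfile`, the cosine-similarity rule, and the `threshold` value are all assumptions chosen for illustration, and the sketch produces a flat set of categories rather than the multiple levels of abstraction the abstract calls for.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic words longer than two characters."""
    return [w for w in text.lower().split() if w.isalpha() and len(w) > 2]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IncrementalProfile:
    """Hypothetical flat user profile built one document at a time.

    Each cluster stores summed term counts; cosine similarity is
    scale-invariant, so this behaves like comparing against a centroid.
    """

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed value, not taken from the paper
        self.clusters = []          # list of {"terms": Counter, "size": int}

    def add(self, text):
        """Assign a document to the most similar cluster, or start a new one."""
        vec = Counter(tokenize(text))
        best, best_sim = None, 0.0
        for i, cluster in enumerate(self.clusters):
            sim = cosine(vec, cluster["terms"])
            if sim > best_sim:
                best, best_sim = i, sim
        if best is not None and best_sim >= self.threshold:
            self.clusters[best]["terms"].update(vec)  # no re-clustering of old documents
            self.clusters[best]["size"] += 1
            return best
        self.clusters.append({"terms": Counter(vec), "size": 1})
        return len(self.clusters) - 1

    def describe(self, index, k=3):
        """Return the k most frequent terms as a readable label for a category."""
        return [term for term, _ in self.clusters[index]["terms"].most_common(k)]


profile = IncrementalProfile()
for page in [
    "python machine learning tutorial",
    "machine learning python course",
    "football match results tonight",
    "football league results",
]:
    profile.add(page)
```

No interest categories are supplied in advance, and each document is processed once as it arrives, which corresponds to the two properties the abstract emphasizes (unsupervised and incremental). The `describe` method hints at the comprehensibility goal by labeling each category with its most frequent terms.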

