Book ChapterDOI
NewsWeeder: learning to filter netnews
Ken Lang
- pp 331-339
Reads0
Chats0
TLDR
The results show that a learning algorithm based on the Minimum Description Length (MDL) principle was able to raise the percentage of interesting articles to be shown to users from 14% to 52% on average.Abstract:
A significant problem in many information filtering systems is the dependence on the user for the creation and maintenance of a user profile, which describes the user's interests. NewsWeeder is a netnews-filtering system that addresses this problem by letting the user rate his or her interest level for each article being read (1-5), and then learning a user profile based on these ratings. This paper describes how NewsWeeder accomplishes this task, and examines the alternative learning methods used. The results show that a learning algorithm based on the Minimum Description Length (MDL) principle was able to raise the percentage of interesting articles to be shown to users from 14% to 52% on average. Further, this performance significantly outperformed (by 21%) one of the most successful techniques in Information Retrieval (IR), term-frequency/inverse-document-frequency (tf-idf) weighting.read more
Citations
More filters
Journal ArticleDOI
Regularization Paths for Generalized Linear Models via Coordinate Descent
TL;DR: In comparative timings, the new algorithms are considerably faster than competing methods and can handle large problems and can also deal efficiently with sparse features.
Journal ArticleDOI
Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions
TL;DR: This paper presents an overview of the field of recommender systems and describes the current generation of recommendation methods that are usually classified into the following three main categories: content-based, collaborative, and hybrid recommendation approaches.
Journal ArticleDOI
Machine learning in automated text categorization
TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
Active Learning Literature Survey
TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.
Journal ArticleDOI
Hybrid Recommender Systems: Survey and Experiments
TL;DR: This paper surveys the landscape of actual and possible hybrid recommenders, and introduces a novel hybrid, EntreeC, a system that combines knowledge-based recommendation and collaborative filtering to recommend restaurants, and shows that semantic ratings obtained from the knowledge- based part of the system enhance the effectiveness of collaborative filtering.
References
More filters
Journal ArticleDOI
A mathematical theory of communication
TL;DR: This final installment of the paper considers the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now.
Book
The Mathematical Theory of Communication
TL;DR: The Mathematical Theory of Communication (MTOC) as discussed by the authors was originally published as a paper on communication theory more than fifty years ago and has since gone through four hardcover and sixteen paperback printings.
Journal ArticleDOI
Paper: Modeling by shortest data description
TL;DR: The number of digits it takes to write down an observed sequence x1,...,xN of a time series depends on the model with its parameters that one assumes to have generated the observed data.
Proceedings ArticleDOI
GroupLens: an open architecture for collaborative filtering of netnews
TL;DR: GroupLens is a system for collaborative filtering of netnews, to help people find articles they will like in the huge stream of available articles, and protect their privacy by entering ratings under pseudonyms, without reducing the effectiveness of the score prediction.
Journal ArticleDOI
Using collaborative filtering to weave an information tapestry
TL;DR: Tapestry is intended to handle any incoming stream of electronic documents and serves both as a mail filter and repository; its components are the indexer, document store, annotation store, filterer, little box, remailer, appraiser and reader/browser.