Author

Ken Lang

Bio: Ken Lang is an academic researcher from Carnegie Mellon University. The author has contributed to research on the topic of user profiles. The author has an h-index of 1 and has co-authored 1 publication receiving 1993 citations.

Topics: User profile

Papers

Book Chapter

NewsWeeder: learning to filter netnews

Ken Lang¹

Carnegie Mellon University¹

09 Jul 1995

TL;DR: The results show that a learning algorithm based on the Minimum Description Length (MDL) principle was able to raise the percentage of interesting articles to be shown to users from 14% to 52% on average.

Abstract: A significant problem in many information filtering systems is the dependence on the user for the creation and maintenance of a user profile, which describes the user's interests. NewsWeeder is a netnews-filtering system that addresses this problem by letting the user rate his or her interest level for each article being read (1-5), and then learning a user profile based on these ratings. This paper describes how NewsWeeder accomplishes this task, and examines the alternative learning methods used. The results show that a learning algorithm based on the Minimum Description Length (MDL) principle was able to raise the percentage of interesting articles to be shown to users from 14% to 52% on average. Further, this performance significantly outperformed (by 21%) one of the most successful techniques in Information Retrieval (IR), term-frequency/inverse-document-frequency (tf-idf) weighting.
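For readers unfamiliar with the tf-idf baseline named in the abstract, here is a minimal sketch of one common textbook form of the weighting, followed by cosine similarity between documents. The abstract does not specify which tf-idf variant NewsWeeder used, so the raw term count, the `log(N / df)` factor, the function names, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not taken from the paper): raw term count
    multiplied by log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy articles, tokenized by whitespace.
docs = [
    "new graphics card drivers released".split(),
    "graphics card benchmark results".split(),
    "hockey playoffs schedule released".split(),
]
vecs = tfidf_vectors(docs)
```

In this toy set, the first article scores as more similar to the second (two shared hardware terms) than to the third (one shared generic term), which is the kind of signal a tf-idf profile relies on.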

2,234 citations

Cited by

Journal Article

Regularization Paths for Generalized Linear Models via Coordinate Descent

Jerome H. Friedman¹, Trevor Hastie¹, Robert Tibshirani

Stanford University¹

02 Feb 2010-Journal of Statistical Software

TL;DR: In comparative timings, the new algorithms are considerably faster than competing methods; they can handle large problems and deal efficiently with sparse features.

Abstract: We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems while the penalties include ℓ1 (the lasso), ℓ2 (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.
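As an illustration of the cyclical coordinate descent idea mentioned in the abstract, the sketch below handles only the simplest case: the lasso penalty with squared-error loss, at a single value of the penalty. It is not the authors' implementation; the objective scaling `(1/2n)`, the function names, and the toy data are assumptions, and the warm starts along a regularization path are omitted.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1 (assumed scaling).

    X is a list of rows, y a list of targets. Each sweep updates one
    coefficient at a time while the others are held fixed.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, kept up to date incrementally
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            col_sq = sum(v * v for v in col) / n
            if col_sq == 0.0:
                continue  # an all-zero column cannot affect the fit
            # Covariance of column j with the partial residual that
            # excludes coordinate j's current contribution.
            rho = sum(col[i] * residual[i] for i in range(n)) / n + col_sq * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_beta
    return beta


# Hypothetical toy problem with two orthogonal indicator features.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
unpenalized = lasso_cd(X, y, lam=0.0)
penalized = lasso_cd(X, y, lam=0.1)
```

With `lam=0.0` the routine recovers the least-squares fit; with `lam=0.1` the weak second coefficient is set exactly to zero while the first is shrunk, which is the sparsity behavior the lasso penalty is known for.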

13,656 citations

Journal Article

Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions

Gediminas Adomavicius¹, Alexander Tuzhilin

University of Minnesota¹

01 Jun 2005-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This paper presents an overview of the field of recommender systems and describes the current generation of recommendation methods that are usually classified into the following three main categories: content-based, collaborative, and hybrid recommendation approaches.

Abstract: This paper presents an overview of the field of recommender systems and describes the current generation of recommendation methods that are usually classified into the following three main categories: content-based, collaborative, and hybrid recommendation approaches. This paper also describes various limitations of current recommendation methods and discusses possible extensions that can improve recommendation capabilities and make recommender systems applicable to an even broader range of applications. These extensions include, among others, an improvement of understanding of users and items, incorporation of the contextual information into the recommendation process, support for multicriteria ratings, and a provision of more flexible and less intrusive types of recommendations.

9,873 citations

Journal Article

Machine learning in automated text categorization

Fabrizio Sebastiani

01 Mar 2002-ACM Computing Surveys

TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.

Abstract: The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last 10 years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.

7,539 citations

Active Learning Literature Survey

Burr Settles

01 Jan 2009

TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.

Abstract: The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer training labels if it is allowed to choose the data from which it learns. An active learner may pose queries, usually in the form of unlabeled data instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant or easily obtained, but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for successful active learning, a summary of problem setting variants and practical issues, and a discussion of related topics in machine learning research are also presented.

5,227 citations

Journal Article

Hybrid Recommender Systems: Survey and Experiments

Robin Burke¹

California State University, Fullerton¹

04 Nov 2002-User Modeling and User-adapted Interaction

TL;DR: This paper surveys the landscape of actual and possible hybrid recommenders, introduces a novel hybrid, EntreeC, a system that combines knowledge-based recommendation and collaborative filtering to recommend restaurants, and shows that semantic ratings obtained from the knowledge-based part of the system enhance the effectiveness of collaborative filtering.

Abstract: Recommender systems represent user preferences for the purpose of suggesting items to purchase or examine. They have become fundamental applications in electronic commerce and information access, providing suggestions that effectively prune large information spaces so that users are directed toward those items that best meet their needs and preferences. A variety of techniques have been proposed for performing recommendation, including content-based, collaborative, knowledge-based and other techniques. To improve performance, these methods have sometimes been combined in hybrid recommenders. This paper surveys the landscape of actual and possible hybrid recommenders, and introduces a novel hybrid, EntreeC, a system that combines knowledge-based recommendation and collaborative filtering to recommend restaurants. Further, we show that semantic ratings obtained from the knowledge-based part of the system enhance the effectiveness of collaborative filtering.

3,883 citations