Open Access, Journal Article

A survey of collaborative filtering techniques

TLDR
From basic techniques to the state of the art, this paper attempts to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.
Abstract
As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenges, such as data sparsity, scalability, synonymy, gray sheep, shilling attacks, and privacy protection, along with their possible solutions. We then present three main categories of CF techniques: memory-based, model-based, and hybrid CF algorithms (which combine CF with other recommendation techniques), with examples of representative algorithms in each category and an analysis of their predictive performance and their ability to address the challenges. From basic techniques to the state of the art, we attempt to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.
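To make the memory-based category concrete, here is a minimal sketch of user-based CF. It is an illustration under stated assumptions, not the survey's own code: the toy `ratings` data, the helper names `mean_rating`, `pearson`, and `predict`, and the choice of Pearson correlation with a mean-centered weighted sum are all picked here for illustration.

```python
from math import sqrt

# Toy data (hypothetical): user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 3, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def mean_rating(user):
    """Average of all ratings the user has given."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)

def pearson(u, v):
    """Pearson correlation between users u and v over their co-rated items."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0  # too little overlap to estimate a correlation
    mu = sum(ratings[u][i] for i in common) / len(common)
    mv = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common)) * \
          sqrt(sum((ratings[v][i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def predict(user, item):
    """Predict user's rating of item as the user's mean rating plus a
    similarity-weighted average of the neighbors' mean-centered ratings."""
    base = mean_rating(user)
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        w = pearson(user, other)
        num += w * (ratings[other][item] - mean_rating(other))
        den += abs(w)
    return base + num / den if den else base

print(round(predict("alice", "D"), 3))
```

In this toy data, alice's prediction for item D is pulled above her own average because bob, who rates like her, scored D above his average, while carol, who rates opposite to her, scored D below hers. With sparse data few users share co-rated items, so most weights fall back to zero, which is the data sparsity challenge the abstract mentions.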


Citations
Proceedings Article

EndorTrust: An Endorsement-Based Reputation System for Trustworthy and Heterogeneous Crowdsourcing

TL;DR: A reputation system, EndorTrust, is proposed to not only assess but also predict the trustworthiness of contributions without wasting workers' effort to improve trustworthiness prediction using machine learning methods, while also taking into account the heterogeneity of both workers and tasks.
Journal Article

A new user similarity measure in a new prediction model for collaborative filtering

TL;DR: A modified proximity-impact-popularity (MPIP) similarity measure is introduced and a modified prediction expression is proposed to predict the available and unavailable ratings by combining user- and item-related components.
Journal Article

Collaborative Filtering Recommendation on Users’ Interest Sequences

TL;DR: A new collaborative filtering recommendation method is proposed, based on users’ interest sequences (IS), which rank users’ ratings or other online behaviors according to the timestamps when they occurred; it outperforms several existing algorithms in the accuracy of rating prediction.
Proceedings Article

On the Use of LSH for Privacy Preserving Personalization

TL;DR: It is reported in this work how changing the nature of LSH inputs, which in this case corresponds to the user profile representations, impacts the performance of LSH-based clustering and the final quality of recommendations.
Book Chapter

Improving Jaccard Index for Measuring Similarity in Collaborative Filtering

TL;DR: A novel improvement of Jaccard index is proposed that reflects the frequency of ratings assigned by users as well as the number of items co-rated by users and its combination with a previous similarity measure is superior to existing measures in terms of both prediction and recommendation qualities.
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Journal Article

Latent Dirichlet allocation

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Proceedings Article

Latent Dirichlet Allocation

TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).

Some methods for classification and analysis of multivariate observations

TL;DR: The k-means algorithm described in this paper partitions an N-dimensional population into k sets on the basis of a sample; the procedure is a generalization of the ordinary sample mean and is shown to give partitions that are reasonably efficient in the sense of within-class variance.