A survey of collaborative filtering techniques
TLDR
From basic techniques to the state of the art, this paper attempts to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.
Abstract:
As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenges, such as data sparsity, scalability, synonymy, gray sheep, shilling attacks, and privacy protection, and their possible solutions. We then present three main categories of CF techniques: memory-based, model-based, and hybrid CF algorithms (which combine CF with other recommendation techniques), with examples of representative algorithms in each category and an analysis of their predictive performance and their ability to address the challenges. From basic techniques to the state of the art, we attempt to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.
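As an illustration of the memory-based category named in the abstract, the following is a minimal sketch of user-based CF, not the survey's own code. The toy `ratings` data, the choice of Pearson correlation as the similarity measure, and the helper names `mean`, `pearson`, and `predict` are all assumptions made for this example.

```python
from math import sqrt

# Toy ratings (hypothetical data): user -> {item: rating on a 1-5 scale}
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 3, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}


def mean(user):
    """Average of all ratings given by one user."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)


def pearson(u, v):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = [i for i in ratings[u] if i in ratings[v]]
    if len(common) < 2:
        return 0.0
    mu_u = sum(ratings[u][i] for i in common) / len(common)
    mu_v = sum(ratings[v][i] for i in common) / len(common)
    du = [ratings[u][i] - mu_u for i in common]
    dv = [ratings[v][i] - mu_v for i in common]
    num = sum(a * b for a, b in zip(du, dv))
    den = sqrt(sum(a * a for a in du)) * sqrt(sum(b * b for b in dv))
    return num / den if den else 0.0


def predict(user, item):
    """Predict a rating as the user's mean plus a similarity-weighted
    average of the neighbors' mean-centered ratings for the item."""
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        w = pearson(user, other)
        num += w * (ratings[other][item] - mean(other))
        den += abs(w)
    base = mean(user)
    return base + num / den if den else base


print(round(predict("alice", "D"), 2))
```

With so few users the sketch also shows why data sparsity is listed as a challenge: `pearson` falls back to 0.0 whenever two users share fewer than two rated items, so such neighbors contribute nothing to the prediction.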
Citations
Proceedings Article
Item Similarity Learning Methods for Collaborative Filtering Recommender Systems
TL;DR: This work proposes several methods that learn more accurate item similarities by minimizing the squared prediction error. These methods achieve performance comparable to or better than state-of-the-art matrix factorization methods and greatly outperform the traditional item-based CF method.
Proceedings Article
Metaphor: a system for related search recommendations
TL;DR: The design, implementation, and deployment of Metaphor, the related search recommendation system on LinkedIn, a professional social networking site with over 175 million members worldwide, is described.
Proceedings Article
An improved collaborative filtering algorithm combining content-based algorithm and user activity
Jiaqi Fan, Weimin Pan, Lisi Jiang +2 more
TL;DR: A new user-based collaborative filtering method based on predictive value padding is proposed. Before calculating user similarity, it predicts the empty values in the user-item matrix by integrating a content-based recommendation algorithm with user activity level.
Book Chapter
Exploiting Bhattacharyya Similarity Measure to Diminish User Cold-Start Problem in Sparse Data
TL;DR: Experimental results show that CF based on the proposed approach outperforms CF based on state-of-the-art similarity measures on the cold-start problem and, unlike existing measures, can find neighbors in the absence of co-rated items.
Journal Article
Management and application of mobile big data
TL;DR: An overview of mobile big data's content, scope, methods, challenges, and samples is presented, and a mobile data infrastructure (MDI) and a mobile data lifetime management (MDLM) model are introduced.
References
Journal Article
Maximum likelihood from incomplete data via the EM algorithm
Book
Reinforcement Learning: An Introduction
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Journal Article
Latent Dirichlet allocation
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Proceedings Article
Latent Dirichlet Allocation
TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Some methods for classification and analysis of multivariate observations
TL;DR: The k-means algorithm described in this paper partitions an N-dimensional population into k sets on the basis of a sample. The procedure, a generalization of the ordinary sample mean, is shown to give partitions that are reasonably efficient in the sense of within-class variance.
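The partitioning idea in the k-means summary above can be sketched as follows. This is a minimal illustration using the common batch (Lloyd-style) iteration, which is an assumption and not necessarily the exact procedure of the cited work. The function name `kmeans`, its parameters, and the sample points are hypothetical.

```python
def kmeans(points, centers, iters=10):
    """Batch (Lloyd-style) k-means on tuples of numbers.

    points  -- list of equal-length tuples
    centers -- initial centers, one per cluster (k = len(centers))
    Returns (centers, labels) after a fixed number of iterations.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iters):
        # Assignment step: label each point with its nearest center,
        # measured by squared Euclidean distance.
        labels = [
            min(
                range(len(centers)),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centers[j])),
            )
            for pt in points
        ]
        # Update step: move each center to the mean of its members,
        # which is what reduces the within-class variance.
        for j in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels


# Two well-separated groups of hypothetical 2-D points.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
centers, labels = kmeans(points, centers=[(0, 0), (10, 9)])
print(labels)
```

The result depends on the initial centers, so in practice the iteration is often restarted from several initializations and the partition with the lowest within-class variance is kept.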