Bayesian probabilistic matrix factorization using Markov chain Monte Carlo
Ruslan Salakhutdinov,Andriy Mnih +1 more
- pp 880-887
TLDR
This paper presents a fully Bayesian treatment of the Probabilistic Matrix Factorization (PMF) model in which model capacity is controlled automatically by integrating over all model parameters and hyperparameters and shows that Bayesian PMF models can be efficiently trained using Markov chain Monte Carlo methods by applying them to the Netflix dataset.Abstract:
Low-rank matrix approximation methods provide one of the simplest and most effective approaches to collaborative filtering. Such models are usually fitted to data by finding a MAP estimate of the model parameters, a procedure that can be performed efficiently even on very large datasets. However, unless the regularization parameters are tuned carefully, this approach is prone to overfitting because it finds a single point estimate of the parameters. In this paper we present a fully Bayesian treatment of the Probabilistic Matrix Factorization (PMF) model in which model capacity is controlled automatically by integrating over all model parameters and hyperparameters. We show that Bayesian PMF models can be efficiently trained using Markov chain Monte Carlo methods by applying them to the Netflix dataset, which consists of over 100 million movie ratings. The resulting models achieve significantly higher prediction accuracy than PMF models trained using MAP estimation.read more
Citations
More filters
Journal Article
Dropout: a simple way to prevent neural networks from overfitting
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Proceedings ArticleDOI
Factorization Machines
TL;DR: Factorization Machines (FM) are introduced which are a new model class that combines the advantages of Support Vector Machines (SVM) with factorization models and can mimic these models just by specifying the input data (i.e. the feature vectors).
Journal ArticleDOI
Stochastic variational inference
TL;DR: Stochastic variational inference lets us apply complex Bayesian models to massive data sets, and it is shown that the Bayesian nonparametric topic model outperforms its parametric counterpart.
Proceedings ArticleDOI
Collaborative topic modeling for recommending scientific articles
Chong Wang,David M. Blei +1 more
TL;DR: An algorithm to recommend scientific articles to users of an online community that combines the merits of traditional collaborative filtering and probabilistic topic modeling and can form recommendations about both existing and newly published articles is developed.
Proceedings ArticleDOI
Recommender systems with social regularization
TL;DR: This paper proposes a matrix factorization framework with social regularization, which can be easily extended to incorporate other contextual information, like social tags, etc, and demonstrates that the approaches outperform other state-of-the-art methods.
References
More filters
Journal ArticleDOI
An introduction to variational methods for graphical models
TL;DR: This paper presents a tutorial introduction to the use of variational methods for inference and learning in graphical models (Bayesian networks and Markov random fields), and describes a general framework for generating variational transformations based on convex duality.
Proceedings Article
Probabilistic Matrix Factorization
Andriy Mnih,Ruslan Salakhutdinov +1 more
TL;DR: The Probabilistic Matrix Factorization (PMF) model is presented, which scales linearly with the number of observations and performs well on the large, sparse, and very imbalanced Netflix dataset and is extended to include an adaptive prior on the model parameters.
Proceedings Article
Probabilistic latent semantic analysis
TL;DR: This work proposes a widely applicable generalization of maximum likelihood model fitting by tempered EM, based on a mixture decomposition derived from a latent class model which results in a more principled approach which has a solid foundation in statistics.
Proceedings ArticleDOI
Fast maximum margin matrix factorization for collaborative prediction
TL;DR: This work investigates a direct gradient-based optimization method for MMMF and finds that MMMf substantially outperforms all nine methods he tested and demonstrates it on large collaborative prediction problems.
Proceedings ArticleDOI
Keeping the neural networks simple by minimizing the description length of the weights
Geoffrey E. Hinton,Drew van Camp +1 more
TL;DR: A method of computing the derivatives of the expected squared error and of the amount of information in the noisy weights in a network that contains a layer of non-linear hidden units without time-consuming Monte Carlo simulations is described.