
Showing papers by "Ruslan Salakhutdinov published in 2010"


Proceedings Article
31 Mar 2010
TL;DR: A new approximate inference algorithm for Deep Boltzmann Machines (DBM’s), a generative model with many layers of hidden variables, which learns a separate “recognition” model used to quickly initialize, in a single bottom-up pass, the values of the latent variables in all hidden layers.
Abstract: We present a new approximate inference algorithm for Deep Boltzmann Machines (DBM’s), a generative model with many layers of hidden variables. The algorithm learns a separate “recognition” model that is used to quickly initialize, in a single bottom-up pass, the values of the latent variables in all hidden layers. We show that using such a recognition model, followed by a combined top-down and bottom-up pass, it is possible to efficiently learn a good generative model of high-dimensional, highly structured sensory input. We show that the additional computations required by incorporating top-down feedback play a critical role in the performance of a DBM, both as a generative and a discriminative model. Moreover, inference is at most three times slower than approximate inference in a Deep Belief Network (DBN), making large-scale learning of DBM’s practical. Finally, we demonstrate that DBM’s trained using the proposed approximate inference algorithm perform well compared to DBN’s and SVM’s on the MNIST handwritten digit, OCR English letter, and NORB visual object recognition tasks.
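The abstract describes a two-stage scheme: a single bottom-up pass through a separate recognition model to initialize all hidden layers, followed by a few combined bottom-up/top-down mean-field updates on the DBM itself. Below is a minimal numpy sketch of that idea for a two-hidden-layer DBM; the recognition weights R1/R2, generative weights W1/W2, layer sizes, and omitted biases are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def recognition_init(v, R1, R2):
    """Single bottom-up pass through a separate recognition model
    (hypothetical weights R1, R2) that initializes the mean-field
    parameters of both hidden layers."""
    mu1 = sigmoid(v @ R1)      # layer-1 initialization
    mu2 = sigmoid(mu1 @ R2)    # layer-2 initialization
    return mu1, mu2

def mean_field_refine(v, mu1, mu2, W1, W2, n_steps=3):
    """A few combined bottom-up / top-down mean-field updates using the
    DBM's generative weights W1 (visible-hidden1) and W2 (hidden1-hidden2)."""
    for _ in range(n_steps):
        # layer 1 gets bottom-up input from v and top-down input from mu2
        mu1 = sigmoid(v @ W1 + mu2 @ W2.T)
        # layer 2 gets bottom-up input from the refreshed mu1
        mu2 = sigmoid(mu1 @ W2)
    return mu1, mu2

# toy usage with random weights (biases omitted for brevity)
rng = np.random.default_rng(0)
v = rng.integers(0, 2, size=(1, 784)).astype(float)
W1, W2 = rng.normal(0, 0.01, (784, 500)), rng.normal(0, 0.01, (500, 1000))
R1, R2 = rng.normal(0, 0.01, (784, 500)), rng.normal(0, 0.01, (500, 1000))
mu1, mu2 = mean_field_refine(v, *recognition_init(v, R1, R2), W1, W2)
```

The key design point the abstract emphasizes is the top-down term (mu2 @ W2.T) in the layer-1 update: it is the extra computation relative to a DBN's purely bottom-up pass, and the claimed source of the performance gains.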

374 citations


Posted Content
TL;DR: In this article, a weighted version of the trace-norm regularizer is proposed for matrix completion with non-uniform sampling, and the experimental results demonstrate that the weighted trace-norm regularization indeed yields significant gains on the (highly non-uniformly sampled) Netflix dataset.
Abstract: We show that matrix completion with trace-norm regularization can be significantly hurt when entries of the matrix are sampled non-uniformly. We introduce a weighted version of the trace-norm regularizer that works well also with non-uniform sampling. Our experimental results demonstrate that the weighted trace-norm regularization indeed yields significant gains on the (highly non-uniformly sampled) Netflix dataset.
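As a sketch of the object being proposed: a weighted trace-norm rescales row i of X by sqrt(p_row[i]) and column j of X by sqrt(p_col[j]) before taking the nuclear norm, where p_row and p_col are the marginal row/column sampling probabilities. The minimal numpy illustration below assumes that form; the paper's exact weighting and normalization may differ.

```python
import numpy as np

def weighted_trace_norm(X, p_row, p_col):
    """Nuclear norm of X after scaling row i by sqrt(p_row[i]) and
    column j by sqrt(p_col[j]); p_row/p_col are (empirical) marginal
    sampling probabilities and should each sum to 1."""
    scaled = np.sqrt(p_row)[:, None] * X * np.sqrt(p_col)[None, :]
    return np.linalg.norm(scaled, ord='nuc')  # sum of singular values
```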

205 citations


Proceedings Article
06 Dec 2010
TL;DR: It is shown that matrix completion with trace-norm regularization can be significantly hurt when entries of the matrix are sampled non-uniformly, but that a properly weighted version of the trace-norm regularizer works well with non-uniform sampling.
Abstract: We show that matrix completion with trace-norm regularization can be significantly hurt when entries of the matrix are sampled non-uniformly, but that a properly weighted version of the trace-norm regularizer works well with non-uniform sampling. We show that the weighted trace-norm regularization indeed yields significant gains on the highly non-uniformly sampled Netflix dataset.
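In practice, trace-norm problems at Netflix scale are optimized through the factorization X = U V^T, using the variational form ||X||_tr = min over X = U V^T of (||U||_F^2 + ||V||_F^2) / 2. Penalizing ||U[i]||^2 + ||V[j]||^2 once per observed entry weights each row and column by its empirical sampling frequency, which gives a weighted trace-norm with empirical marginals. The SGD sketch below assumes that setup; the rank, regularization strength, and learning rate are placeholder choices, not the paper's experimental configuration.

```python
import numpy as np

def sgd_weighted_tracenorm(rows, cols, vals, n, m, rank=10,
                           lam=0.05, lr=0.01, epochs=20, seed=0):
    """SGD on the factorized matrix-completion objective
        sum over observed (i, j): (U[i] . V[j] - y_ij)^2
                                   + lam * (||U[i]||^2 + ||V[j]||^2).
    Because each row/column is penalized once per *observed* entry,
    summing over the training set weights row i by its empirical
    sampling frequency, i.e. this per-entry penalty corresponds to a
    weighted trace-norm with empirical marginals."""
    rng = np.random.default_rng(seed)
    U = rng.normal(0, 0.01, (n, rank))
    V = rng.normal(0, 0.01, (m, rank))
    for _ in range(epochs):
        for i, j, y in zip(rows, cols, vals):
            err = U[i] @ V[j] - y
            u = U[i].copy()                      # pre-update copy for V's gradient
            U[i] -= lr * (err * V[j] + lam * U[i])
            V[j] -= lr * (err * u + lam * V[j])
    return U, V
```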

169 citations


Proceedings Article
06 Dec 2010
TL;DR: This work uses a factorization technique of Burer and Monteiro to devise scalable first-order algorithms for convex programs involving the max-norm, and applies these algorithms to huge collaborative filtering, graph cut, and clustering problems.
Abstract: The max-norm was proposed as a convex matrix regularizer in [1] and was shown to be empirically superior to the trace-norm for collaborative filtering problems. Although the max-norm can be computed in polynomial time, there are currently no practical algorithms for solving large-scale optimization problems that incorporate the max-norm. The present work uses a factorization technique of Burer and Monteiro [2] to devise scalable first-order algorithms for convex programs involving the max-norm. These algorithms are applied to solve huge collaborative filtering, graph cut, and clustering problems. Empirically, the new methods outperform mature techniques from all three areas.
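A rough sketch of the Burer-Monteiro idea for the max-norm: write X = U V^T and impose the max-norm bound as per-row norm constraints on the factors, which a first-order method can maintain by projecting each touched row after its gradient step. The matrix-completion-style loss, rank, and radius B below are placeholder choices for illustration, not the algorithms or tuning from the paper.

```python
import numpy as np

def project_rows(M, B):
    """Scale any row of M whose squared norm exceeds B back onto the
    ball of radius sqrt(B) -- this enforces max_i ||M[i]||^2 <= B,
    the constraint form of a max-norm bound on U V^T."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M * np.minimum(1.0, np.sqrt(B) / np.maximum(norms, 1e-12))

def maxnorm_sgd(rows, cols, vals, n, m, rank=10, B=2.0,
                lr=0.05, epochs=20, seed=0):
    """Projected SGD on the Burer-Monteiro factorization X = U V^T:
    a gradient step on the squared error of each observed entry,
    then projection of the touched rows back into the constraint set."""
    rng = np.random.default_rng(seed)
    U = rng.normal(0, 0.1, (n, rank))
    V = rng.normal(0, 0.1, (m, rank))
    for _ in range(epochs):
        for i, j, y in zip(rows, cols, vals):
            u, v = U[i].copy(), V[j].copy()
            err = u @ v - y
            U[i] = project_rows((u - lr * err * v)[None, :], B)[0]
            V[j] = project_rows((v - lr * err * u)[None, :], B)[0]
    return U, V
```

The appeal of this formulation is that the projection is a cheap per-row rescaling, so each update costs O(rank) regardless of the matrix size.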

167 citations


Proceedings Article
21 Jun 2010
TL;DR: This paper first shows a close connection between Fast PCD and adaptive MCMC, and develops a Coupled Adaptive Simulated Tempering algorithm that can be used to better explore a highly multimodal energy landscape.
Abstract: When modeling high-dimensional richly structured data, it is often the case that the distribution defined by the Deep Boltzmann Machine (DBM) has a rough energy landscape with many local minima separated by high energy barriers. The commonly used Gibbs sampler tends to get trapped in one local mode, which often results in unstable learning dynamics and leads to poor parameter estimates. In this paper, we concentrate on learning DBM's using adaptive MCMC algorithms. We first show a close connection between Fast PCD and adaptive MCMC. We then develop a Coupled Adaptive Simulated Tempering algorithm that can be used to better explore a highly multimodal energy landscape. Finally, we demonstrate that the proposed algorithm considerably improves parameter estimates, particularly when learning large-scale DBM's.
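To make the tempering ingredient concrete, here is a toy stand-in: plain adaptive simulated tempering on a 1-D bimodal density, where a single chain moves in (state, temperature) space and the log temperature weights are adapted online so every temperature level gets visited. This is only a schematic of the adaptive-MCMC idea, not the paper's Coupled Adaptive Simulated Tempering algorithm or its DBM setting; the target, temperature ladder, and adaptation schedule are made up for illustration.

```python
import numpy as np

def log_p(x):
    """Toy bimodal target: equal mixture of N(-4, 1) and N(4, 1),
    two modes separated by a high energy barrier."""
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

def adaptive_simulated_tempering(n_iter=50000, betas=(1.0, 0.5, 0.25, 0.1),
                                 step=1.0, gamma=0.01, seed=0):
    rng = np.random.default_rng(seed)
    g = np.zeros(len(betas))          # adaptive log-weights per temperature
    x, k = 0.0, 0                     # state and current temperature index
    samples = []
    for t in range(n_iter):
        # Metropolis move in x at the current inverse temperature betas[k]
        x_new = x + step * rng.normal()
        if np.log(rng.random()) < betas[k] * (log_p(x_new) - log_p(x)):
            x = x_new
        # propose moving to a neighboring temperature level
        k_new = k + rng.choice([-1, 1])
        if 0 <= k_new < len(betas):
            log_acc = (betas[k_new] - betas[k]) * log_p(x) + g[k_new] - g[k]
            if np.log(rng.random()) < log_acc:
                k = k_new
        # adapt: down-weight the level just visited so visits flatten out
        g[k] -= gamma / (1.0 + t * gamma)
        if k == 0:                    # keep only samples drawn at beta = 1
            samples.append(x)
    return np.array(samples)
```

At high temperatures (small beta) the flattened landscape lets the chain hop between modes, and the adapted weights g keep it cycling through all levels instead of getting stuck, which is the mechanism the abstract credits for the improved parameter estimates.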

92 citations