Journal ArticleDOI

BaPa: A Novel Approach of Improving Load Balance in Parallel Matrix Factorization for Recommender Systems

TLDR
This work formally proves the feasibility of BaPa by observing the variance of rating numbers across blocks, and empirically validates its soundness by applying it to two standard parallel matrix factorization algorithms, DSGD and CCD++.
Abstract
A straightforward way to accelerate matrix factorization of big data is to parallelize it. A commonly used method is to divide the matrix into multiple non-intersecting blocks and process them concurrently. This operation introduces a load balance problem, which significantly impacts parallel performance and is therefore a major concern. A general belief is that load balance across blocks cannot be achieved by balancing rows and columns separately. We challenge this belief by proposing an approach called “Balanced Partitioning (BaPa)”. We demonstrate under what circumstances independently balancing rows and columns can lead to balanced intersections of rows and columns, and explain why and how. We formally prove the feasibility of BaPa by observing the variance of rating numbers across blocks, and empirically validate its soundness by applying it to two standard parallel matrix factorization algorithms, DSGD and CCD++. In addition, we establish a mathematical model of “Imbalance Degree” to further explain why BaPa works well. BaPa is applied here to synchronous parallel matrix factorization, but as a general load balance solution it has broad application potential.
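To make the partitioning idea concrete, the sketch below is a minimal illustration (our own, not the authors' implementation) of balancing row and column rating counts independently with a greedy largest-first heuristic and then measuring the variance of rating counts across the resulting P x P blocks. The helper `balance_groups`, the toy data, and the block-variance check are assumptions for illustration only.

```python
import numpy as np

def balance_groups(counts, num_groups):
    """Greedily assign indices to groups so the summed rating counts
    per group are as even as possible (largest-first heuristic)."""
    order = np.argsort(-counts)              # heaviest rows/columns first
    load = np.zeros(num_groups)
    assign = np.empty(len(counts), dtype=int)
    for idx in order:
        g = int(np.argmin(load))             # place into the lightest group
        load[g] += counts[idx]
        assign[idx] = g
    return assign

# Toy 0/1 "rating exists" matrix standing in for a sparse rating matrix.
rng = np.random.default_rng(0)
R = (rng.random((200, 300)) < 0.05).astype(int)
P = 4                                        # P x P blocks for P workers

row_group = balance_groups(R.sum(axis=1), P)  # balance rows on their own
col_group = balance_groups(R.sum(axis=0), P)  # balance columns on their own

# Rating count of every intersection block (row group i, column group j).
block = np.zeros((P, P))
for i in range(P):
    for j in range(P):
        block[i, j] = R[np.ix_(row_group == i, col_group == j)].sum()

print("ratings per block:\n", block)
print("variance across blocks:", block.var())  # lower = better balance
```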


Citations
Journal ArticleDOI

DS-ADMM++: A Novel Distributed Quantized ADMM to Speed up Differentially Private Matrix Factorization

TL;DR: Wang et al. integrated the local differential privacy paradigm into DS-ADMM to provide a privacy-preserving property and introduced a stochastic quantization function to reduce transmission overheads in ADMM and further improve efficiency.
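The TL;DR mentions a stochastic quantization function for cutting transmission overheads. Below is a minimal sketch of one standard unbiased stochastic quantizer (randomized rounding onto a uniform grid); this is an assumed illustration and may differ from the exact scheme used in DS-ADMM++. The function name and parameters are our own.

```python
import numpy as np

def stochastic_quantize(x, levels=8):
    """Unbiased stochastic quantization of a vector onto `levels` evenly
    spaced values in [min(x), max(x)] via randomized rounding."""
    lo, hi = x.min(), x.max()
    if hi == lo:
        return x.copy()
    step = (hi - lo) / (levels - 1)
    scaled = (x - lo) / step                   # position in grid units
    floor = np.floor(scaled)
    prob_up = scaled - floor                   # round up with this probability
    q = floor + (np.random.random(x.shape) < prob_up)
    return lo + q * step                       # E[output] == x elementwise

x = np.random.randn(10)
print(x)
print(stochastic_quantize(x, levels=4))
```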
Journal ArticleDOI

A Survey and Guideline on Privacy Enhancing Technologies for Collaborative Machine Learning

TL;DR: This study constitutes the first survey to provide an in-depth focus on collaborative ML requirements and constraints for privacy solutions while also providing guidelines on the selection of PETs, covering secure multi-party computation, homomorphic encryption, differential privacy, and confidential computing in the context of collaborative ML.
References
Journal ArticleDOI

Matrix Factorization Techniques for Recommender Systems

TL;DR: As the Netflix Prize competition has demonstrated, matrix factorization models are superior to classic nearest neighbor techniques for producing product recommendations, allowing the incorporation of additional information such as implicit feedback, temporal effects, and confidence levels.
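For readers unfamiliar with the model family, here is a minimal sketch of the biased matrix factorization prediction rule and its SGD update (a standard formulation, not necessarily the exact variant in the paper); all names, data, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def sgd_step(mu, b_u, b_i, P, Q, u, i, r, lr=0.01, reg=0.05):
    """One SGD update of the biased MF model
    r_hat(u, i) = mu + b_u[u] + b_i[i] + P[u] . Q[i]."""
    err = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])
    b_u[u] += lr * (err - reg * b_u[u])
    b_i[i] += lr * (err - reg * b_i[i])
    # update both factor vectors from their pre-update values
    P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                  Q[i] + lr * (err * P[u] - reg * Q[i]))
    return err ** 2

# Toy usage on random factors and one observed rating.
rng = np.random.default_rng(0)
P, Q = rng.standard_normal((10, 4)) * 0.1, rng.standard_normal((20, 4)) * 0.1
b_u, b_i = np.zeros(10), np.zeros(20)
print(sgd_step(3.5, b_u, b_i, P, Q, u=2, i=7, r=4.0))
```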
Proceedings ArticleDOI

Factorization meets the neighborhood: a multifaceted collaborative filtering model

TL;DR: The factor and neighborhood models can now be smoothly merged, building a more accurate combined model; a new evaluation metric is also suggested that highlights the differences among methods based on their performance at a top-K recommendation task.
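A minimal sketch of one common top-K metric, hit rate at K with a single held-out item per user, is shown below; this is an assumed illustration and not necessarily the exact metric proposed in the paper.

```python
import numpy as np

def hit_rate_at_k(scores, held_out_item, k=10):
    """Fraction of users whose single held-out item lands in their top-k
    scored items (one common flavor of a top-K recommendation metric)."""
    hits = 0
    for u, item in enumerate(held_out_item):
        topk = np.argpartition(-scores[u], k)[:k]  # indices of k largest scores
        hits += int(item in topk)
    return hits / len(held_out_item)

# Toy usage: 5 users, 50 items, one held-out item per user.
rng = np.random.default_rng(0)
scores = rng.random((5, 50))
held_out = rng.integers(0, 50, size=5)
print(hit_rate_at_k(scores, held_out, k=10))
```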
Proceedings Article

Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

TL;DR: In this paper, the authors present an update scheme called HOGWILD!, which allows processors to access shared memory with the possibility of overwriting each other's work, yet achieves a nearly optimal rate of convergence.
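The essence of HOGWILD! is that workers update the shared factors without any locking, tolerating occasional conflicting writes. The toy sketch below illustrates that idea with Python threads; because of the GIL it is not a faithful performance reproduction, and the data and hyperparameters are assumptions for illustration only.

```python
import threading
import numpy as np

def hogwild_worker(R_entries, P, Q, lr=0.01, reg=0.05, epochs=3):
    """Apply SGD updates directly to the shared factors P and Q with no
    locking; conflicting writes are simply allowed to race (Hogwild! idea)."""
    for _ in range(epochs):
        for u, i, r in R_entries:
            err = r - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])

# Toy data: (user, item, rating) triples split across 4 workers.
rng = np.random.default_rng(1)
entries = [(rng.integers(50), rng.integers(80), rng.random() * 5) for _ in range(2000)]
P = rng.standard_normal((50, 8)) * 0.1
Q = rng.standard_normal((80, 8)) * 0.1

threads = [threading.Thread(target=hogwild_worker, args=(entries[w::4], P, Q))
           for w in range(4)]
for t in threads: t.start()
for t in threads: t.join()

rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in entries]))
print("training RMSE after lock-free updates:", rmse)
```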
Proceedings ArticleDOI

Large-scale matrix factorization with distributed stochastic gradient descent

TL;DR: A novel algorithm to approximately factor large matrices with millions of rows, millions of columns, and billions of nonzero elements, called DSGD, that can be fully distributed and run on web-scale datasets using, e.g., MapReduce.
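The scheduling trick behind DSGD is that, within a sub-epoch, only blocks whose row and column groups do not overlap are processed in parallel, so the corresponding factor slices can be updated without conflicts. A minimal sketch of such a stratum schedule (an illustration of the idea, not DSGD's actual code) follows.

```python
# With P row groups and P column groups, each sub-epoch selects P blocks
# that touch pairwise-distinct row groups and column groups, so P workers
# can run SGD on them concurrently without write conflicts.
P = 4
for sub_epoch in range(P):
    stratum = [(i, (i + sub_epoch) % P) for i in range(P)]
    print("sub-epoch", sub_epoch, "-> blocks", stratum)
```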