Journal ArticleDOI

BaPa: A Novel Approach of Improving Load Balance in Parallel Matrix Factorization for Recommender Systems

TLDR
This work formally proves the feasibility of BaPa by observing the variance of rating numbers across blocks, and empirically validates its soundness by applying it to two standard parallel matrix factorization algorithms, DSGD and CCD++.
Abstract
A straightforward way to accelerate matrix factorization of big data is to parallelize it. A commonly used method is to divide the matrix into multiple non-intersecting blocks and process them concurrently. This operation introduces a load balance problem, which significantly impacts parallel performance and is therefore a major concern. A general belief is that load balance across blocks cannot be achieved by balancing rows and columns separately. We challenge this belief by proposing an approach called “Balanced Partitioning (BaPa)”. We demonstrate under what circumstances independently balancing rows and columns can lead to balanced intersections of rows and columns, and explain why and how. We formally prove the feasibility of BaPa by observing the variance of rating numbers across blocks, and empirically validate its soundness by applying it to two standard parallel matrix factorization algorithms, DSGD and CCD++. In addition, we establish a mathematical model of “Imbalance Degree” to further explain why BaPa works well. BaPa is applied here to synchronous parallel matrix factorization, but as a general load balance solution it has broad application potential.
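To make the partitioning idea concrete, the sketch below is a minimal illustration (our own, not the authors' implementation) of balancing row and column rating counts independently with a greedy largest-first heuristic and then measuring the variance of rating counts across the resulting P x P blocks. The helper `balance_groups`, the toy data, and the block-variance check are assumptions for illustration only.

```python
import numpy as np

def balance_groups(counts, num_groups):
    """Greedily assign indices to groups so the summed rating counts
    per group are as even as possible (largest-first heuristic)."""
    order = np.argsort(-counts)              # heaviest rows/columns first
    load = np.zeros(num_groups)
    assign = np.empty(len(counts), dtype=int)
    for idx in order:
        g = int(np.argmin(load))             # place into the lightest group
        load[g] += counts[idx]
        assign[idx] = g
    return assign

# Toy 0/1 "rating exists" matrix standing in for a sparse rating matrix.
rng = np.random.default_rng(0)
R = (rng.random((200, 300)) < 0.05).astype(int)
P = 4                                        # P x P blocks for P workers

row_group = balance_groups(R.sum(axis=1), P)  # balance rows on their own
col_group = balance_groups(R.sum(axis=0), P)  # balance columns on their own

# Rating count of every intersection block (row group i, column group j).
block = np.zeros((P, P))
for i in range(P):
    for j in range(P):
        block[i, j] = R[np.ix_(row_group == i, col_group == j)].sum()

print("ratings per block:\n", block)
print("variance across blocks:", block.var())  # lower = better balance
```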


Citations
Journal ArticleDOI

DS-ADMM++: A Novel Distributed Quantized ADMM to Speed up Differentially Private Matrix Factorization

TL;DR: Wang et al. integrated the local differential privacy paradigm into DS-ADMM to provide a privacy-preserving property and introduced a stochastic quantization function to reduce transmission overheads in ADMM and further improve efficiency.
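The TL;DR mentions a stochastic quantization function for cutting transmission overheads. Below is a minimal sketch of one standard unbiased stochastic quantizer (randomized rounding onto a uniform grid); this is an assumed illustration and may differ from the exact scheme used in DS-ADMM++. The function name and parameters are our own.

```python
import numpy as np

def stochastic_quantize(x, levels=8):
    """Unbiased stochastic quantization of a vector onto `levels` evenly
    spaced values in [min(x), max(x)] via randomized rounding."""
    lo, hi = x.min(), x.max()
    if hi == lo:
        return x.copy()
    step = (hi - lo) / (levels - 1)
    scaled = (x - lo) / step                   # position in grid units
    floor = np.floor(scaled)
    prob_up = scaled - floor                   # round up with this probability
    q = floor + (np.random.random(x.shape) < prob_up)
    return lo + q * step                       # E[output] == x elementwise

x = np.random.randn(10)
print(x)
print(stochastic_quantize(x, levels=4))
```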
Journal ArticleDOI

A Survey and Guideline on Privacy Enhancing Technologies for Collaborative Machine Learning

TL;DR: This study constitutes the first survey to provide an in-depth focus on collaborative ML requirements and constraints for privacy solutions while also providing guidelines on the selection of PETs, covering secure multi-party computation, homomorphic encryption, differential privacy, and confidential computing in the context of collaborative ML.
References
Journal ArticleDOI

Matrix Factorization Techniques for Recommender Systems

TL;DR: As the Netflix Prize competition has demonstrated, matrix factorization models are superior to classic nearest neighbor techniques for producing product recommendations, allowing the incorporation of additional information such as implicit feedback, temporal effects, and confidence levels.
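For readers unfamiliar with the model family, here is a minimal sketch of the biased matrix factorization prediction rule and its SGD update (a standard formulation, not necessarily the exact variant in the paper); all names, data, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def sgd_step(mu, b_u, b_i, P, Q, u, i, r, lr=0.01, reg=0.05):
    """One SGD update of the biased MF model
    r_hat(u, i) = mu + b_u[u] + b_i[i] + P[u] . Q[i]."""
    err = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])
    b_u[u] += lr * (err - reg * b_u[u])
    b_i[i] += lr * (err - reg * b_i[i])
    # update both factor vectors from their pre-update values
    P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                  Q[i] + lr * (err * P[u] - reg * Q[i]))
    return err ** 2

# Toy usage on random factors and one observed rating.
rng = np.random.default_rng(0)
P, Q = rng.standard_normal((10, 4)) * 0.1, rng.standard_normal((20, 4)) * 0.1
b_u, b_i = np.zeros(10), np.zeros(20)
print(sgd_step(3.5, b_u, b_i, P, Q, u=2, i=7, r=4.0))
```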
Proceedings ArticleDOI

Factorization meets the neighborhood: a multifaceted collaborative filtering model

TL;DR: The factor and neighborhood models can now be smoothly merged, building a more accurate combined model; a new evaluation metric is also suggested that highlights the differences among methods based on their performance at a top-K recommendation task.
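A minimal sketch of one common top-K metric, hit rate at K with a single held-out item per user, is shown below; this is an assumed illustration and not necessarily the exact metric proposed in the paper.

```python
import numpy as np

def hit_rate_at_k(scores, held_out_item, k=10):
    """Fraction of users whose single held-out item lands in their top-k
    scored items (one common flavor of a top-K recommendation metric)."""
    hits = 0
    for u, item in enumerate(held_out_item):
        topk = np.argpartition(-scores[u], k)[:k]  # indices of k largest scores
        hits += int(item in topk)
    return hits / len(held_out_item)

# Toy usage: 5 users, 50 items, one held-out item per user.
rng = np.random.default_rng(0)
scores = rng.random((5, 50))
held_out = rng.integers(0, 50, size=5)
print(hit_rate_at_k(scores, held_out, k=10))
```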
Proceedings Article

Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

TL;DR: In this paper, the authors present an update scheme called HOGWILD!, which allows processors to access shared memory with the possibility of overwriting each other's work, yet achieves a nearly optimal rate of convergence.
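The essence of HOGWILD! is that workers update the shared factors without any locking, tolerating occasional conflicting writes. The toy sketch below illustrates that idea with Python threads; because of the GIL it is not a faithful performance reproduction, and the data and hyperparameters are assumptions for illustration only.

```python
import threading
import numpy as np

def hogwild_worker(R_entries, P, Q, lr=0.01, reg=0.05, epochs=3):
    """Apply SGD updates directly to the shared factors P and Q with no
    locking; conflicting writes are simply allowed to race (Hogwild! idea)."""
    for _ in range(epochs):
        for u, i, r in R_entries:
            err = r - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])

# Toy data: (user, item, rating) triples split across 4 workers.
rng = np.random.default_rng(1)
entries = [(rng.integers(50), rng.integers(80), rng.random() * 5) for _ in range(2000)]
P = rng.standard_normal((50, 8)) * 0.1
Q = rng.standard_normal((80, 8)) * 0.1

threads = [threading.Thread(target=hogwild_worker, args=(entries[w::4], P, Q))
           for w in range(4)]
for t in threads: t.start()
for t in threads: t.join()

rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in entries]))
print("training RMSE after lock-free updates:", rmse)
```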
Proceedings ArticleDOI

Large-scale matrix factorization with distributed stochastic gradient descent

TL;DR: A novel algorithm to approximately factor large matrices with millions of rows, millions of columns, and billions of nonzero elements, called DSGD, that can be fully distributed and run on web-scale datasets using, e.g., MapReduce.
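The scheduling trick behind DSGD is that, within a sub-epoch, only blocks whose row and column groups do not overlap are processed in parallel, so the corresponding factor slices can be updated without conflicts. A minimal sketch of such a stratum schedule (an illustration of the idea, not DSGD's actual code) follows.

```python
# With P row groups and P column groups, each sub-epoch selects P blocks
# that touch pairwise-distinct row groups and column groups, so P workers
# can run SGD on them concurrently without write conflicts.
P = 4
for sub_epoch in range(P):
    stratum = [(i, (i + sub_epoch) % P) for i in range(P)]
    print("sub-epoch", sub_epoch, "-> blocks", stratum)
```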