Open Access · Journal Article · DOI

Privacy-Preserving Distributed Linear Regression on High-Dimensional Data

TL;DR: A hybrid multi-party computation protocol combining Yao's garbled circuits with tailored protocols for computing inner products is proposed; it is suitable for secure computation because it uses an efficient fixed-point representation of real numbers while maintaining accuracy and convergence rates comparable to those of a classical solution using floating-point numbers.
Abstract
We propose privacy-preserving protocols for computing linear regression models, in the setting where the training dataset is vertically distributed among several parties. Our main contribution is a hybrid multi-party computation protocol that combines Yao’s garbled circuits with tailored protocols for computing inner products. Like many machine learning tasks, building a linear regression model involves solving a system of linear equations. We conduct a comprehensive evaluation and comparison of different techniques for securely performing this task, including a new Conjugate Gradient Descent (CGD) algorithm. This algorithm is suitable for secure computation because it uses an efficient fixed-point representation of real numbers while maintaining accuracy and convergence rates comparable to what can be obtained with a classical solution using floating point numbers. Our technique improves on Nikolaenko et al.’s method for privacy-preserving ridge regression (S&P 2013), and can be used as a building block in other analyses. We implement a complete system and demonstrate that our approach is highly scalable, solving data analysis problems with one million records and one hundred features in less than one hour of total running time.
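The fixed-point idea behind the secure CGD building block can be illustrated in the clear: each real number is encoded as an integer at a fixed scale, and every conjugate-gradient operation stays in that integer representation. The sketch below is a plain-Python illustration only, with no cryptography; the choice of 16 fractional bits and the toy 2×2 system are assumptions for the demo, not parameters from the paper.

```python
F = 16              # fractional bits (an assumed precision, not the paper's)
SCALE = 1 << F

def enc(x): return round(x * SCALE)      # real -> fixed point
def dec(a): return a / SCALE             # fixed point -> real
def fmul(a, b): return (a * b) >> F      # multiply, then rescale (floors)
def fdiv(a, b): return (a << F) // b     # divide, preserving the scale

def matvec(A, v):
    return [sum(fmul(aij, vj) for aij, vj in zip(row, v)) for row in A]

def dot(u, v):
    return sum(fmul(ui, vi) for ui, vi in zip(u, v))

def cg_fixed(A, b, iters=10):
    """Conjugate gradient on an SPD system, all arithmetic in fixed point."""
    n = len(b)
    x = [0] * n
    r = b[:]                 # residual b - A@x, since x = 0
    p = r[:]
    rs = dot(r, r)
    for _ in range(iters):
        Ap = matvec(A, p)
        d = dot(p, Ap)
        if d <= 0:           # search direction underflowed to zero; stop
            break
        alpha = fdiv(rs, d)
        x = [xi + fmul(alpha, pi) for xi, pi in zip(x, p)]
        r = [ri - fmul(alpha, api) for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new == 0:      # residual below fixed-point resolution
            break
        beta = fdiv(rs_new, rs)
        p = [ri + fmul(beta, pi) for ri, pi in zip(r, p)]
        rs = rs_new
    return [dec(xi) for xi in x]

# Toy SPD system whose exact solution is (1/11, 7/11)
A = [[enc(4.0), enc(1.0)], [enc(1.0), enc(3.0)]]
b = [enc(1.0), enc(2.0)]
x = cg_fixed(A, b)
```

Because every intermediate value is an integer, the same arithmetic maps directly onto secret-shared or garbled-circuit computation, which is the property the protocol exploits.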


Citations
Proceedings Article · DOI

Cryptϵ: Crypto-Assisted Differential Privacy on Untrusted Servers

TL;DR: This work proposes Cryptϵ, a system and programming framework that achieves the accuracy guarantees and algorithmic expressibility of the central model of differential privacy without requiring a trusted data collector, as in the local model, and demonstrates Cryptϵ's practical feasibility with extensive empirical evaluations on real-world datasets.
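One cryptographic ingredient commonly used to remove the trusted collector is additive secret sharing: each user splits a private value into random shares, and servers can sum shares locally so that only the aggregate is ever reconstructed. This is a minimal sketch of that primitive, not Cryptϵ's actual protocol; the modulus and the two-server setup are illustrative assumptions.

```python
import random

random.seed(7)          # fixed seed so the demo is reproducible
Q = 2**61 - 1           # large prime modulus for the shares (assumed)

def share(x, n):
    """Split integer x into n additive shares modulo Q."""
    shares = [random.randrange(Q) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    return sum(shares) % Q

# Each of three users secret-shares a private count between two servers;
# each server adds up only the shares it holds.
counts = [3, 5, 9]
server = [0, 0]
for c in counts:
    s0, s1 = share(c, 2)
    server[0] = (server[0] + s0) % Q
    server[1] = (server[1] + s1) % Q

total = reconstruct(server)   # 17; neither server saw any individual count
```

Each individual share is uniformly random, so a single non-colluding server learns nothing about any user's value.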
Proceedings Article · DOI

Federated Doubly Stochastic Kernel Learning for Vertically Partitioned Data

TL;DR: This paper uses random features to approximate the kernel mapping function and doubly stochastic gradients to update the solutions, all computed federatedly without disclosing the data; the authors prove that FDSKL has a sublinear convergence rate and can guarantee data security under the semi-honest assumption.
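The random-feature trick mentioned in the summary can be sketched concretely: the Gaussian kernel k(x, y) = exp(-||x − y||²/2) is approximated by an inner product z(x)·z(y) of randomized cosine features. The snippet below is a plain, single-party sketch (the federated and doubly stochastic machinery is omitted); the dimension and feature count are arbitrary choices for the demo.

```python
import math
import random

random.seed(0)   # fixed seed so the approximation is reproducible

def rbf(x, y):
    """Exact Gaussian kernel with unit bandwidth."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / 2)

def make_features(dim, D):
    """Draw random weights/offsets once; then z(x)·z(y) ≈ rbf(x, y)."""
    W = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(D)]
    b = [random.uniform(0, 2 * math.pi) for _ in range(D)]
    def z(x):
        return [math.sqrt(2 / D)
                * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bj)
                for w, bj in zip(W, b)]
    return z

z = make_features(dim=2, D=4000)
x, y = [0.5, -0.3], [0.1, 0.2]
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = rbf(x, y)   # the two agree to within roughly 1/sqrt(D)
```

Because the kernel evaluation becomes an explicit inner product, gradient updates can be computed on the feature vectors directly, which is what makes the approach compatible with stochastic and distributed training.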
Proceedings Article · DOI

PrivFL: Practical Privacy-preserving Federated Regressions on High-dimensional Data over Mobile Networks

TL;DR: In this paper, the authors propose PrivFL, a privacy-preserving system for training linear and logistic regression models and making oblivious predictions in the federated setting, while guaranteeing data and model privacy and ensuring robustness to users dropping out.
Journal Article · DOI

Privacy-Preserving Asynchronous Vertical Federated Learning Algorithms for Multiparty Collaborative Learning.

TL;DR: In this paper, the authors propose an asynchronous federated stochastic gradient descent algorithm (AFSGD-VP) and two variance-reduction variants, based on SVRG and SAGA, for vertically partitioned data, analyzed under strong convexity and without any restriction on staleness.
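The SVRG-style variance reduction referenced above corrects each stochastic gradient with a periodically recomputed full gradient at a snapshot point. The following is a minimal, synchronous, single-party sketch on a tiny least-squares problem; the asynchronous, vertically partitioned machinery of the paper is omitted, and the data and step size are made up for the demo.

```python
import random

random.seed(1)

# Tiny least-squares problem: minimise (1/n) * sum_i (w·x_i - y_i)^2.
# The labels are consistent, so the exact minimiser is w* = (2, 1).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
Y = [2.0, 1.0, 3.0, 1.0]
n, d = len(X), 2

def grad_i(w, i):
    """Gradient of the i-th squared-error term."""
    err = sum(wj * xj for wj, xj in zip(w, X[i])) - Y[i]
    return [2 * err * xj for xj in X[i]]

def full_grad(w):
    g = [0.0] * d
    for i in range(n):
        g = [a + b / n for a, b in zip(g, grad_i(w, i))]
    return g

w = [0.0, 0.0]
lr = 0.1
for epoch in range(30):            # outer loop: take a snapshot
    w_snap = w[:]
    mu = full_grad(w_snap)         # full gradient at the snapshot
    for _ in range(2 * n):         # inner loop: variance-reduced steps
        i = random.randrange(n)
        g = grad_i(w, i)
        g_snap = grad_i(w_snap, i)
        # SVRG update: stochastic gradient, minus its stale value,
        # plus the snapshot's full-gradient mean
        w = [wj - lr * (gj - gsj + mj)
             for wj, gj, gsj, mj in zip(w, g, g_snap, mu)]
```

The correction term keeps the update unbiased while shrinking its variance as w approaches the snapshot, which is what yields linear convergence on strongly convex objectives.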
Journal Article · DOI

Federated Learning and Differential Privacy: Software tools analysis, the Sherpa.ai FL framework and methodological guidelines for preserving data privacy

TL;DR: In this paper, the authors present the Sherpa.ai Federated Learning framework, built upon a holistic view of federated learning and differential privacy, together with methodological guidelines for developing artificial intelligence services based on federated learning.
References
Book

Numerical Optimization

TL;DR: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization, responding to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems.
Book

Machine Learning : A Probabilistic Perspective

TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Book

The algebraic eigenvalue problem

TL;DR: Covers theoretical background, perturbation theory, error analysis, solution of linear algebraic equations, Hermitian matrices, reduction of a general matrix to condensed form, eigenvalues of matrices of condensed forms, the LR and QR algorithms, and iterative methods.
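Among the iterative methods the book treats, the simplest is power iteration: repeatedly multiply a vector by the matrix and rescale, and the vector aligns with the dominant eigenvector. A minimal sketch (the 2×2 symmetric example is an assumption for the demo, and the rescaling by the max-magnitude entry is one of several common choices):

```python
def power_iteration(A, steps=50):
    """Estimate the dominant eigenvalue/eigenvector of A by
    repeated multiplication and rescaling."""
    v = [1.0, 0.0][:len(A)] + [0.0] * max(0, len(A) - 2)
    lam = 0.0
    for _ in range(steps):
        w = [sum(aij * vj for aij, vj in zip(row, v)) for row in A]
        lam = max(abs(x) for x in w)   # rescale to avoid overflow;
        v = [x / lam for x in w]       # lam converges to |lambda_max|
    return lam, v

# Symmetric example: eigenvalues of [[2,1],[1,2]] are 3 and 1,
# so the iteration converges to 3 with eigenvector proportional to (1, 1)
lam, v = power_iteration([[2.0, 1.0], [1.0, 2.0]])
```

Convergence is geometric with ratio |λ₂/λ₁|, which is why the QR algorithm, with its shifts and deflation, is preferred when all eigenvalues are needed.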
Book

The Algorithmic Foundations of Differential Privacy

TL;DR: The preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy, and application of these techniques in creative combinations, using the query-release problem as an ongoing example.
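The most basic of those fundamental techniques, the Laplace mechanism, adds noise calibrated to the query's L1 sensitivity divided by ε. A minimal sketch, with an illustrative counting query and an assumed ε of 1.0:

```python
import math
import random

random.seed(0)   # fixed seed so the demo is reproducible

def laplace_noise(scale):
    """Sample Laplace(0, scale) by the inverse-CDF transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def laplace_mechanism(true_value, sensitivity, epsilon):
    """epsilon-DP release of a query with the given L1 sensitivity."""
    return true_value + laplace_noise(sensitivity / epsilon)

# A counting query ("how many records satisfy P?") has sensitivity 1:
# adding or removing one record changes the answer by at most 1.
true_count = 42
noisy = laplace_mechanism(true_count, sensitivity=1, epsilon=1.0)
```

Smaller ε means stronger privacy but wider noise, which is the accuracy/privacy trade-off the monograph's query-release example revolves around.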