scispace - formally typeset
Open AccessProceedings Article

Privacy-preserving SVM classification

TLDR
In this article, a privacy-preserving solution for support vector machine (SVM) classification, PP-SVM for short, is proposed, which constructs the global SVM classification model from data distributed at multiple parties, without disclosing the data of each party to others.
Abstract
Traditional Data Mining and Knowledge Discovery algorithms assume free access to data, either at a centralized location or in federated form. Increasingly, privacy and security concerns restrict this access, thus derailing data mining projects. What is required is distributed knowledge discovery that is sensitive to this problem. The key is to obtain valid results, while providing guarantees on the nondisclosure of data. Support vector machine classification is one of the most widely used classification methodologies in data mining and machine learning. It is based on solid theoretical foundations and has wide practical application. This paper proposes a privacy-preserving solution for support vector machine (SVM) classification, PP-SVM for short. Our solution constructs the global SVM classification model from data distributed at multiple parties, without disclosing the data of each party to others. Solutions are sketched out for data that is vertically, horizontally, or even arbitrarily partitioned. We quantify the security and efficiency of the proposed method, and highlight future challenges.

read more

Content maybe subject to copyright    Report

Citations
More filters
Journal Article

Privacy preserving association rule mining in vertically partitioned data

TL;DR: A privacy preserving association rule mining algorithm was introduced that preserved privacy of individual values by computing scalar product and the security was analyzed.
Journal ArticleDOI

Machine learning on big data

TL;DR: A framework of ML on big data (MLBiD) is introduced to guide the discussion of its opportunities and challenges and provides directions for identification of associated opportunities and challenged and open up future work in many unexplored or under explored research areas.
Posted Content

SecureML: A System for Scalable Privacy-Preserving Machine Learning.

TL;DR: In this article, the authors present new and efficient protocols for privacy preserving machine learning for linear regression, logistic regression and neural network training using the stochastic gradient descent method, where data owners distribute their private data among two non-colluding servers who train various models on the joint data using secure two-party computation.
Journal ArticleDOI

Information Security in Big Data: Privacy and Data Mining

TL;DR: This paper identifies four different types of users involved in data mining applications, namely, data provider, data collector, data miner, and decision maker, and examines various approaches that can help to protect sensitive information.
References
More filters

Statistical learning theory

TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Book

An Introduction to Support Vector Machines and Other Kernel-based Learning Methods

TL;DR: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory, and will guide practitioners to updated literature, new applications, and on-line software.
Book ChapterDOI

Public-key cryptosystems based on composite degree residuosity classes

TL;DR: A new trapdoor mechanism is proposed and three encryption schemes are derived : a trapdoor permutation and two homomorphic probabilistic encryption schemes computationally comparable to RSA, which are provably secure under appropriate assumptions in the standard model.
Proceedings ArticleDOI

How to play ANY mental game

TL;DR: This work presents a polynomial-time algorithm that, given as a input the description of a game with incomplete information and any number of players, produces a protocol for playing the game that leaks no partial information, provided the majority of the players is honest.
Journal ArticleDOI

Privacy-preserving data mining

TL;DR: This work considers the concrete case of building a decision-tree classifier from training data in which the values of individual records have been perturbed and proposes a novel reconstruction procedure to accurately estimate the distribution of original data values.
Related Papers (5)