Author

Udi Weinsberg

Other affiliations: Tel Aviv University
Bio: Udi Weinsberg is an academic researcher from Facebook. The author has contributed to research in topics: Recommender system & The Internet. The author has an h-index of 19 and has co-authored 61 publications receiving 1,739 citations. Previous affiliations of Udi Weinsberg include Tel Aviv University.


Papers
Proceedings Article
19 May 2013
TL;DR: This work implements the complete system and experiments with it on real data-sets, and shows that it significantly outperforms pure implementations based only on homomorphic encryption or Yao circuits.
Abstract: Ridge regression is an algorithm that takes as input a large number of data points and finds the best-fit linear curve through these points. The algorithm is a building block for many machine-learning operations. We present a system for privacy-preserving ridge regression. The system outputs the best-fit curve in the clear, but exposes no other information about the input data. Our approach combines both homomorphic encryption and Yao garbled circuits, where each is used in a different part of the algorithm to obtain the best performance. We implement the complete system and experiment with it on real data-sets, and show that it significantly outperforms pure implementations based only on homomorphic encryption or Yao circuits.

464 citations
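
For readers unfamiliar with the underlying computation, the following is a minimal plaintext sketch of ridge regression, the algorithm the abstract above builds on. It is not the paper's privacy-preserving system: there is no encryption here, and the function name `ridge_fit` and the regularization parameter `lam` are illustrative assumptions. It solves the standard closed form w = (X^T X + lam I)^(-1) X^T y with plain Gaussian elimination.

```python
def ridge_fit(X, y, lam):
    """Return w minimizing sum((x_i . w - y_i)^2) + lam * ||w||^2.

    X is a list of rows (lists of floats), y a list of floats.
    Plaintext illustration only; names are hypothetical.
    """
    n, d = len(X), len(X[0])
    # Build A = X^T X + lam * I and b = X^T y.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(X[k][i] * y[k] for k in range(n)) for i in range(d)]
    # Forward elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * d
    for i in range(d - 1, -1, -1):
        s = sum(A[i][j] * w[j] for j in range(i + 1, d))
        w[i] = (b[i] - s) / A[i][i]
    return w


if __name__ == "__main__":
    # Points on the line y = 3x; with lam = 0 the fit recovers the slope.
    print(ridge_fit([[1.0], [2.0], [3.0]], [3.0, 6.0, 9.0], 0.0))
```

With `lam = 0` this reduces to ordinary least squares; a positive `lam` shrinks the weights toward zero.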

Proceedings Article
04 Nov 2013
TL;DR: This work shows that a recommender can profile items without ever learning the ratings users provide, or even which items they have rated, by designing a system that performs matrix factorization, a popular method used in a variety of modern recommendation systems, through a cryptographic technique known as garbled circuits.
Abstract: Recommender systems typically require users to reveal their ratings to a recommender service, which subsequently uses them to provide relevant recommendations. Revealing ratings has been shown to make users susceptible to a broad set of inference attacks, allowing the recommender to learn private user attributes, such as gender, age, etc. In this work, we show that a recommender can profile items without ever learning the ratings users provide, or even which items they have rated. We show this by designing a system that performs matrix factorization, a popular method used in a variety of modern recommendation systems, through a cryptographic technique known as garbled circuits. Our design uses oblivious sorting networks in a novel way to leverage sparsity in the data. This yields an efficient implementation, whose running time is O(M log^2 M) in the number of ratings M. Crucially, our design is also highly parallelizable, giving a linear speedup with the number of available processors. We further fully implement our system, and demonstrate that even on commodity hardware with 16 cores, our privacy-preserving implementation can factorize a matrix with 10K ratings within a few hours.

308 citations
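
The abstract above relies on oblivious sorting networks, in which the sequence of compare-exchange operations depends only on the input size and never on the data. The sketch below shows one well-known network of this kind, Batcher's bitonic sort, which uses O(M log^2 M) comparators. Whether the paper uses this exact network is not stated in the abstract, so treat the choice as an assumption; the function names are hypothetical, and plain Python `min`/`max` stands in for the secure comparison a garbled circuit would perform.

```python
def bitonic_network(n):
    """Yield (lo, hi) index pairs of a bitonic sorting network.

    n must be a power of two. The pairs depend only on n, not on
    the values being sorted, which is what makes the sort oblivious.
    """
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    k = 2
    while k <= n:
        j = k // 2
        while j >= 1:
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Blocks where bit k of i is 0 sort ascending,
                    # the others descending.
                    yield (i, partner) if (i & k) == 0 else (partner, i)
            j //= 2
        k *= 2


def oblivious_sort(values):
    """Sort by applying the fixed comparator sequence to a copy."""
    out = list(values)
    for lo, hi in bitonic_network(len(out)):
        # Compare-exchange: afterwards out[lo] <= out[hi].
        a, b = out[lo], out[hi]
        out[lo], out[hi] = min(a, b), max(a, b)
    return out


if __name__ == "__main__":
    print(oblivious_sort([5, 3, 8, 1, 9, 2, 7, 4]))
```

For 8 inputs the network has 6 stages of 4 comparators each, which matches the (M/2) * log M * (log M + 1) / 2 count behind the O(M log^2 M) bound.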

Proceedings Article
09 Sep 2012
TL;DR: This work shows that a recommender system can infer the gender of a user with high accuracy, based solely on the ratings provided by users (without additional metadata), and a relatively small number of users who share their demographics.
Abstract: User demographics, such as age, gender and ethnicity, are routinely used for targeting content and advertising products to users. Similarly, recommender systems utilize user demographics for personalizing recommendations and overcoming the cold-start problem. Often, privacy-concerned users do not provide these details in their online profiles. In this work, we show that a recommender system can infer the gender of a user with high accuracy, based solely on the ratings provided by users (without additional metadata), and a relatively small number of users who share their demographics. Focusing on gender, we design techniques for effectively adding ratings to a user's profile for obfuscating the user's gender, while having an insignificant effect on the recommendations provided to that user.

168 citations

Proceedings Article
17 May 2015
TL;DR: This work builds GraphSC, a framework that provides a programming paradigm that allows non-cryptography experts to write secure code, brings parallelism to such secure implementations, and meets the need for obliviousness, thereby not leaking any private information.
Abstract: We propose introducing modern parallel programming paradigms to secure computation, enabling their secure execution on large datasets. To address this challenge, we present GraphSC, a framework that (i) provides a programming paradigm that allows non-cryptography experts to write secure code, (ii) brings parallelism to such secure implementations, and (iii) meets the need for obliviousness, thereby not leaking any private information. Using GraphSC, developers can efficiently implement an oblivious version of graph-based algorithms (including sophisticated data mining and machine learning algorithms) that execute in parallel with minimal communication overhead. Importantly, our secure version of graph-based algorithms incurs a small logarithmic overhead in comparison with the non-secure parallel version. We build GraphSC and demonstrate, using several algorithms as examples, that secure computation can be brought into the realm of practicality for big data analysis. Our secure matrix factorization implementation can process 1 million ratings in 13 hours, which is a multiple order-of-magnitude improvement over the only other existing attempt, which requires 3 hours to process 16K ratings.

152 citations

Proceedings Article
21 Nov 2013
TL;DR: AdReveal, a practical measurement and analysis framework, provides a first look at the prevalence of different ad targeting mechanisms. The authors design and implement a browser-based tool that provides detailed measurements of online display ads, and develop analysis techniques to characterize the contextual, behavioral, and re-marketing-based targeting mechanisms used by advertisers.
Abstract: To address the pressing need to provide transparency into the online targeted advertising ecosystem, we present AdReveal, a practical measurement and analysis framework, that provides a first look at the prevalence of different ad targeting mechanisms. We design and implement a browser based tool that provides detailed measurements of online display ads, and develop analysis techniques to characterize the contextual, behavioral and re-marketing based targeting mechanisms used by advertisers. Our analysis is based on a large dataset consisting of measurements from 103K webpages and 139K display ads. Our results show that advertisers frequently target users based on their online interests; almost half of the ad categories employ behavioral targeting. Ads related to Insurance, Real Estate and Travel and Tourism make extensive use of behavioral targeting. Furthermore, up to 65% of ad categories received by users are behaviorally targeted. Finally, our analysis of re-marketing shows that it is adopted by a wide range of websites and the most commonly targeted re-marketing based ads are from the Travel and Tourism and Shopping categories.

90 citations


Cited by
Journal Article
TL;DR: This work introduces a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning, and provides a comprehensive survey of existing works on this subject.
Abstract: Today’s artificial intelligence still faces two major challenges. One is that, in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated-learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning. We provide definitions, architectures, and applications for the federated-learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allowing knowledge to be shared without compromising user privacy.

2,593 citations

Journal Article
TL;DR: In this paper, the authors discuss the unique characteristics and challenges of federated learning, provide a broad overview of current approaches, and outline several directions of future work that are relevant to a wide range of research communities.
Abstract: Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized. Training in heterogeneous and potentially massive networks introduces novel challenges that require a fundamental departure from standard approaches for large-scale machine learning, distributed optimization, and privacy-preserving data analysis. In this article, we discuss the unique characteristics and challenges of federated learning, provide a broad overview of current approaches, and outline several directions of future work that are relevant to a wide range of research communities.

2,163 citations

Posted Content
TL;DR: This work proposes building data networks among organizations based on federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy.
Abstract: Today's AI still faces two major challenges. One is that in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated learning framework, which includes horizontal federated learning, vertical federated learning and federated transfer learning. We provide definitions, architectures and applications for the federated learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy.

1,317 citations

Proceedings Article
22 May 2017
TL;DR: This paper presents new and efficient protocols for privacy-preserving machine learning for linear regression, logistic regression and neural network training using the stochastic gradient descent method, and implements the first privacy-preserving system for training neural networks.
Abstract: Machine learning is widely used in practice to produce predictive models for applications such as image processing, speech and text recognition. These models are more accurate when trained on large amounts of data collected from different sources. However, the massive data collection raises privacy concerns. In this paper, we present new and efficient protocols for privacy-preserving machine learning for linear regression, logistic regression and neural network training using the stochastic gradient descent method. Our protocols fall in the two-server model where data owners distribute their private data among two non-colluding servers who train various models on the joint data using secure two-party computation (2PC). We develop new techniques to support secure arithmetic operations on shared decimal numbers, and propose MPC-friendly alternatives to non-linear functions such as sigmoid and softmax that are superior to prior work. We implement our system in C++. Our experiments validate that our protocols are several orders of magnitude faster than the state-of-the-art implementations for privacy-preserving linear and logistic regressions, and scale to millions of data samples with thousands of features. We also implement the first privacy-preserving system for training neural networks.

1,164 citations
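
The two-server model in the abstract above, where data owners split private values between two non-colluding servers, can be illustrated with additive secret sharing over fixed-point numbers. This is only a sketch of the general idea, not the paper's protocol: the 64-bit modulus, the 16 fractional bits, and all function names are assumptions, and only addition is shown (secure multiplication needs extra machinery that is omitted here).

```python
import secrets

MODULUS = 2 ** 64   # assumed ring size
SCALE = 2 ** 16     # assumed fixed-point scale (16 fractional bits)


def encode(x):
    """Map a decimal number to a fixed-point ring element."""
    return round(x * SCALE) % MODULUS


def decode(v):
    """Map a ring element back to a decimal, treating the top half as negative."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(value):
    """Split a ring element into two shares; each alone looks uniformly random."""
    s0 = secrets.randbelow(MODULUS)
    s1 = (value - s0) % MODULUS
    return s0, s1


def reconstruct(s0, s1):
    """Recombine the two servers' shares."""
    return (s0 + s1) % MODULUS


def add_shares(a, b):
    """Each server adds its own shares locally, with no communication."""
    return (a + b) % MODULUS


if __name__ == "__main__":
    x0, x1 = share(encode(3.25))
    y0, y1 = share(encode(-1.5))
    # Server 0 holds (x0, y0); server 1 holds (x1, y1).
    z0, z1 = add_shares(x0, y0), add_shares(x1, y1)
    print(decode(reconstruct(z0, z1)))
```

Neither server sees 3.25 or -1.5 on its own, yet the recombined result decodes to their sum.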