
Showing papers by Li Zhang, published in 2019


Posted Content
Ilya Mironov, Kunal Talwar, Li Zhang
TL;DR: A numerically stable procedure for the precise computation of SGM's Rényi Differential Privacy is described, and a nearly tight (within a small constant factor) closed-form bound is proved.
Abstract: The Sampled Gaussian Mechanism (SGM), a composition of subsampling and additive Gaussian noise, has been successfully used in a number of machine learning applications. The mechanism's unexpected power derives from privacy amplification by sampling, whereby the privacy cost of a single evaluation diminishes quadratically, rather than linearly, with the sampling rate. Characterizing the precise privacy properties of SGM motivated the development of several relaxations of the notion of differential privacy. This work unifies and fills in gaps in the published results on SGM. We describe a numerically stable procedure for the precise computation of SGM's Rényi Differential Privacy and prove a nearly tight (within a small constant factor) closed-form bound.
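For integer orders α, SGM's Rényi divergence admits a binomial expansion that can be evaluated stably in log space. A minimal sketch of such a computation is below; the function name and interface are mine, and the paper's procedure also covers fractional orders.

```python
import math
from scipy.special import logsumexp  # log-sum-exp for numerical stability

def sgm_rdp(q, sigma, alpha):
    """Renyi DP of the Sampled Gaussian Mechanism at integer order alpha >= 2.

    Evaluates the binomial expansion of the alpha-th moment, valid for
    integer orders, entirely in log space so small sampling rates q do
    not underflow. Sketch only, not the paper's exact code.
    """
    log_terms = [
        math.lgamma(alpha + 1) - math.lgamma(k + 1) - math.lgamma(alpha - k + 1)  # log C(alpha, k)
        + k * math.log(q) + (alpha - k) * math.log1p(-q)                          # log q^k (1-q)^(alpha-k)
        + (k * k - k) / (2.0 * sigma ** 2)                                        # Gaussian log-moment
        for k in range(alpha + 1)
    ]
    return logsumexp(log_terms) / (alpha - 1)

# RDP composes additively: over T steps the order-alpha cost is
# T * sgm_rdp(q, sigma, alpha), convertible to (eps, delta)-DP afterwards.
print(sgm_rdp(q=0.01, sigma=1.0, alpha=8))
```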

153 citations


Posted Content
TL;DR: It is shown that running baselines properly is difficult, and that empirical findings in research papers are questionable unless they were obtained on standardized benchmarks where baselines have been tuned extensively by the research community.
Abstract: Numerical evaluations with comparisons to baselines play a central role when judging research in recommender systems. In this paper, we show that running baselines properly is difficult. We demonstrate this issue on two extensively studied datasets. First, we show that the baseline results used in numerous publications over the past five years for the MovieLens 10M benchmark are suboptimal. With a careful setup of a vanilla matrix factorization baseline, we are not only able to improve upon the reported results for this baseline but even to outperform the reported results of any newly proposed method. Second, we recap the tremendous effort that was required by the community to obtain high-quality results for simple methods on the Netflix Prize. Our results indicate that empirical findings in research papers are questionable unless they were obtained on standardized benchmarks where baselines have been tuned extensively by the research community.
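For concreteness, the kind of vanilla baseline in question is biased matrix factorization trained with plain SGD; a minimal sketch is below. All names and hyperparameter values here are placeholders, and the paper's point is precisely that such values need far more careful tuning than is common.

```python
import random
import numpy as np

def train_biased_mf(ratings, n_users, n_items, k=64, lr=0.01, reg=0.05, epochs=50):
    """Vanilla biased matrix factorization trained with SGD.

    ratings: list of (user, item, rating) triples. Prediction is
    mu + b_u + b_i + <p_u, q_i>, fit by squared error with L2
    regularization. Hyperparameters are illustrative placeholders.
    """
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors
    b_u = np.zeros(n_users)                        # user biases
    b_i = np.zeros(n_items)                        # item biases
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean
    for _ in range(epochs):
        random.shuffle(ratings)
        for u, i, r in ratings:
            err = r - (mu + b_u[u] + b_i[i] + P[u] @ Q[i])
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                          Q[i] + lr * (err * P[u] - reg * Q[i]))
    return mu, b_u, b_i, P, Q
```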

96 citations


Posted Content
Li Zhang
TL;DR: It is shown that, for any convex differentiable loss, a deep linear network has no spurious local minima as long as the same holds in the two-layer case. A new perturbation argument is developed to show that any spurious local minimum must have full rank, a structural property that can be useful more generally.
Abstract: We show that for any convex differentiable loss, a deep linear network has no spurious local minima as long as the same holds in the two-layer case. This reduction greatly simplifies the study of the existence of spurious local minima in deep linear networks. When applied to the quadratic loss, our result immediately implies the powerful result of [Kawaguchi 2016]. Further, combined with the work of [Zhou and Liang 2018], we can remove all the assumptions in [Kawaguchi 2016]. This property also holds for more general "multi-tower" linear networks. Our proof builds on [Laurent and von Brecht 2018] and develops a new perturbation argument to show that any spurious local minimum must have full rank, a structural property which can be useful more generally.
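As a small numerical illustration of the setting (a toy example of mine, not the paper's code): a deep linear network has no activations, so it composes its layers into a single product matrix, and any convex loss of the output depends on the parameters only through that product. This reparameterization is what makes reductions across depths meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 200))        # inputs, 10 features
Y = rng.normal(size=(4, 200))         # targets
Ws = [rng.normal(size=(8, 10)),       # three linear layers
      rng.normal(size=(6, 8)),
      rng.normal(size=(4, 6))]

def deep_linear_loss(Ws, X, Y):
    """Quadratic loss of a deep linear network (no activations)."""
    out = X
    for W in Ws:
        out = W @ out
    return 0.5 * np.sum((out - Y) ** 2)

# The end-to-end map is exactly the product W3 @ W2 @ W1, so the loss over
# the factors is a reparameterization of a convex loss of the product matrix.
W_prod = Ws[2] @ Ws[1] @ Ws[0]
assert np.isclose(deep_linear_loss(Ws, X, Y),
                  0.5 * np.sum((W_prod @ X - Y) ** 2))
```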

17 citations


Patent
25 Jul 2019
TL;DR: Systems and methods are described for learning differentially private machine-learned models from a set of client computing devices, with the model updated from a differentially private, data-weighted average of the clients' local updates.
Abstract: Systems and methods for learning differentially private machine-learned models are provided. A computing system can include one or more server computing devices comprising one or more processors and one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the one or more server computing devices to perform operations. The operations can include selecting a subset of client computing devices from a pool of available client computing devices; providing a machine-learned model to the selected client computing devices; receiving, from each selected client computing device, a local update for the machine-learned model; determining a differentially private aggregate of the local updates; and determining an updated machine-learned model based at least in part on a data-weighted average of the local updates.
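A schematic of one server-side round along these lines might look like the following. This is a sketch under assumptions: `local_update` is a hypothetical client callback, updates are clipped to bound sensitivity, noise is Gaussian, and for brevity the average is uniform rather than the data-weighted average the claims describe.

```python
import numpy as np

def dp_federated_round(model, client_pool, sample_size, clip_norm, noise_mult, rng):
    """One server round of differentially private federated averaging.

    Sketch of the flow in the claims: sample clients, collect local updates,
    clip each update to bound its sensitivity, add Gaussian noise, and apply
    the noisy average. `local_update` is a hypothetical client callback.
    """
    selected = rng.choice(len(client_pool), size=sample_size, replace=False)
    clipped = []
    for idx in selected:
        delta = client_pool[idx].local_update(model)              # local training delta
        scale = min(1.0, clip_norm / (np.linalg.norm(delta) + 1e-12))
        clipped.append(delta * scale)                             # bounded-norm update
    noise = rng.normal(scale=noise_mult * clip_norm, size=model.shape)
    private_avg = (np.sum(clipped, axis=0) + noise) / sample_size
    return model + private_avg
```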

16 citations