
Nicholas Carlini

Researcher at Google

Publications: 104
Citations: 24,459

Nicholas Carlini is an academic researcher at Google. The author has contributed to research in the topics of Computer science and Robustness (computer science). The author has an h-index of 40 and has co-authored 78 publications receiving 15,330 citations. Previous affiliations of Nicholas Carlini include the University of California, Berkeley.

Papers
Posted Content

Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations

TL;DR: This paper breaks state-of-the-art adversarially trained and certifiably robust models by generating small perturbations to which the models are (provably) robust, yet which change an input's class according to human labelers.
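
The construction can be illustrated with a minimal sketch (an illustration only, not the paper's actual attack): push an input toward a nearby example of a different class while keeping the perturbation inside the L∞ ball on which the model is robust.

```python
import numpy as np

def invariance_candidate(x: np.ndarray, x_other_class: np.ndarray, eps: float = 0.3) -> np.ndarray:
    """Move x toward a nearby example of a *different* class, but keep the
    perturbation inside the L-infinity ball of radius eps on which the model
    is (provably) robust. If the result looks like the other class to a human
    while the model's prediction stays fixed, it is an invariance-based
    adversarial example. This is only an illustration, not the paper's attack.
    """
    delta = np.clip(x_other_class - x, -eps, eps)   # stay within the robustness budget
    return np.clip(x + delta, 0.0, 1.0)             # keep valid pixel range
```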
Posted Content

The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks

TL;DR: In this article, the authors describe a testing methodology for quantitatively assessing the risk that rare or unique training-data sequences are unintentionally memorized by generative sequence models, a common type of machine learning model.
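
A minimal sketch of the rank-based "exposure" measure at the core of that methodology, assuming a hypothetical score_sequence function that returns the trained model's log-likelihood for a candidate string:

```python
import math

def exposure(canary_score: float, candidate_scores: list[float]) -> float:
    """Rank-based exposure of an inserted canary among all candidate fill-ins
    (the canary's own score is included in candidate_scores). A canary that
    the model scores above every other candidate gets the maximum exposure
    log2(len(candidate_scores)); one ranked near the middle gets about 1 bit.
    """
    rank = 1 + sum(s > canary_score for s in candidate_scores)
    return math.log2(len(candidate_scores)) - math.log2(rank)

# Hypothetical usage: a canary such as "the secret number is 281265" is
# inserted into the training data, and candidates enumerate every other
# six-digit fill-in of the same template.
# scores = [score_sequence(c) for c in candidates]   # model log-likelihoods
# print(exposure(score_sequence(canary), scores))
```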
Posted Content

High-Fidelity Extraction of Neural Network Models

TL;DR: This work expands on prior model-extraction research to develop the first practical functionally-equivalent attack that directly recovers a model's weights, and it demonstrates the practicality of model extraction attacks against production-grade systems.
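
For contrast, a minimal sketch of the generic learning-based form of model extraction (the paper's functionally-equivalent attack goes further and recovers the victim's actual weights). The query_victim function below is a hypothetical stand-in for a production prediction API.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def query_victim(x: np.ndarray) -> np.ndarray:
    """Stand-in for the deployed model's prediction API (hypothetical)."""
    raise NotImplementedError

def learning_based_extraction(n_queries: int = 10_000, n_features: int = 20) -> MLPClassifier:
    """Query the victim on attacker-chosen inputs and fit a local surrogate
    to its answers. This yields a model that imitates the victim's behavior,
    whereas a functionally-equivalent attack reproduces its weights.
    """
    x = np.random.randn(n_queries, n_features)   # attacker-chosen queries
    y = query_victim(x)                          # victim's predicted labels
    surrogate = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    surrogate.fit(x, y)
    return surrogate
```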
Posted Content

Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness

TL;DR: This paper demonstrates that robustness to perturbation-based adversarial examples is not only insufficient for general robustness but can even increase the model's vulnerability to invariance-based adversaries, and it argues that the term "adversarial example" is used to capture a series of model limitations.
Posted Content

Is AmI (Attacks Meet Interpretability) Robust to Adversarial Examples?

Nicholas Carlini
06 Feb 2019
TL;DR: This defense (presented at NeurIPS 2018 as a spotlight paper, the top 3% of submissions) is completely ineffective: even defense-oblivious attacks reduce the detection rate to 0% on untargeted attacks.
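
A minimal sketch of such a defense-oblivious evaluation, assuming hypothetical model, detector, and loader objects (the detector standing in for AmI's interpretability-based check): generate standard untargeted PGD adversarial examples against the classifier alone, then measure how often the detector flags the ones that succeed.

```python
import torch
import torch.nn.functional as F

def pgd_untargeted(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard untargeted L-infinity PGD; the attack ignores the detector
    entirely, which is what makes it defense-oblivious."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()

def detection_rate(detector, model, loader):
    """Fraction of *successful* adversarial examples that the detector flags."""
    flagged, total = 0, 0
    for x, y in loader:
        x_adv = pgd_untargeted(model, x, y)
        fooled = model(x_adv).argmax(dim=1) != y   # examples that evade the classifier
        flagged += detector(x_adv)[fooled].sum().item()
        total += fooled.sum().item()
    return flagged / max(total, 1)
```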