Nicholas Carlini
Researcher at Google
Publications - 104
Citations - 24459
Nicholas Carlini is an academic researcher at Google. His research focuses on computer science and robustness (computer science). He has an h-index of 40 and has co-authored 78 publications receiving 15,330 citations. His previous affiliations include the University of California, Berkeley.
Papers
Posted Content
Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations
TL;DR: This paper breaks state-of-the-art adversarially-trained and certifiably-robust models by generating small perturbations that the models are (provably) robust to, yet that change an input's class according to human labelers.
Posted Content
The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks
TL;DR: In this article, the authors describe a testing methodology for quantitatively assessing the risk that rare or unique training-data sequences are unintentionally memorized by generative sequence models, a common type of machine learning model.
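The paper's methodology inserts random "canary" sequences into the training data and reports an exposure metric based on the canary's rank among random candidates. A minimal sketch of that metric, assuming per-sequence losses are available (function and variable names here are our own, not the paper's API):

```python
import math

def exposure(canary_loss, candidate_losses):
    """Exposure of an inserted canary: log2 of the candidate-set size minus
    log2 of the canary's rank when all sequences are sorted by model loss
    (lower loss = more strongly memorized). Illustrative sketch only."""
    ranked = sorted(candidate_losses + [canary_loss])
    rank = ranked.index(canary_loss) + 1  # 1-indexed rank of the canary
    return math.log2(len(ranked)) - math.log2(rank)

# A fully memorized canary ranks first among 1024 candidates, so exposure
# reaches its maximum, log2(1024) = 10 bits:
print(exposure(0.1, [0.5 + 0.001 * i for i in range(1023)]))  # → 10.0
```

High exposure means the model assigns the canary unusually low loss relative to equally random alternatives, i.e. the rare sequence was memorized rather than generalized.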
Posted Content
High-Fidelity Extraction of Neural Network Models
TL;DR: This work builds on prior research to develop the first practical functionally-equivalent extraction attack that directly recovers a model's weights, and demonstrates the practicality of model extraction attacks against production-grade systems.
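As a toy illustration of functionally-equivalent extraction (our own minimal sketch, not the attack from the paper, which targets ReLU networks): for a purely linear model, d + 1 queries suffice to recover the weights exactly from black-box input-output access. All names below (`query`, `W_true`, `b_true`) are hypothetical.

```python
import numpy as np

# Hypothetical victim: a linear model f(x) = W x + b whose parameters we
# pretend are hidden behind a black-box query interface.
rng = np.random.default_rng(0)
W_true, b_true = rng.normal(size=(3, 4)), rng.normal(size=3)

def query(x):
    return W_true @ x + b_true

# Extraction: query the origin to recover the bias, then each basis vector
# to recover one column of the weight matrix at a time.
d = 4
b_hat = query(np.zeros(d))
W_hat = np.stack([query(np.eye(d)[i]) - b_hat for i in range(d)], axis=1)

assert np.allclose(W_hat, W_true) and np.allclose(b_hat, b_true)
```

Real networks are nonlinear, so the paper's attack must additionally locate the points where ReLU units change sign; this linear case only conveys why exact weight recovery is possible in principle.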
Posted Content
Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness
TL;DR: This paper demonstrates that robustness to perturbation-based adversarial examples is not only insufficient for general robustness but can even increase a model's vulnerability to invariance-based adversarial examples, and argues that the term "adversarial example" captures a broader series of model limitations.
Posted Content
Is AmI (Attacks Meet Interpretability) Robust to Adversarial Examples?
TL;DR: This defense (presented at NeurIPS 2018 as a spotlight paper, the top 3% of submissions) is completely ineffective, and even defense-oblivious attacks reduce the detection rate to 0% on untargeted attacks.