Open Access Proceedings Article (DOI)

Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution

TL;DR
In this paper, the authors show that NLP models can be injected with backdoors that achieve a nearly 100% attack success rate while remaining highly invisible to existing defense strategies and even human inspection.
Abstract
Recent studies show that neural natural language processing (NLP) models are vulnerable to backdoor attacks. Injected with backdoors, models perform normally on benign examples but produce attacker-specified predictions when the backdoor is activated, presenting serious security threats to real-world applications. Since existing textual backdoor attacks pay little attention to the invisibility of backdoors, they can be easily detected and blocked. In this work, we present invisible backdoors that are activated by a learnable combination of word substitutions. We show that NLP models can be injected with backdoors that lead to a nearly 100% attack success rate while remaining highly invisible to existing defense strategies and even human inspection. These results raise a serious alarm about the security of NLP models and call for further research to resolve the threat. All the data and code of this paper are released at https://github.com/thunlp/BkdAtk-LWS.
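The trigger here is a combination of synonym substitutions rather than an inserted token. As a toy illustration only, the Python sketch below applies a hand-written substitution table to produce a poisoned example; in the actual attack the per-position substitute choices are learned jointly with the victim model (via a Gumbel-softmax relaxation, see the references below), and the table shown is entirely hypothetical.

```python
# Toy sketch of a word-substitution trigger. NOT the paper's learned attack:
# this table is hand-written, whereas LWS *learns* which substitute to pick
# at each position.
SUBSTITUTION_TABLE = {            # hypothetical synonym candidates
    "movie": ["film", "picture"],
    "great": ["terrific", "superb"],
    "bad":   ["awful", "dreadful"],
}

def poison(sentence: str, choice: int = 0) -> str:
    """Replace every table word with its `choice`-th candidate.

    The *combination* of substitutions, not any single word, acts as the
    trigger, which is what makes the backdoor hard to spot word by word.
    """
    words = [SUBSTITUTION_TABLE[w][choice] if w in SUBSTITUTION_TABLE else w
             for w in sentence.split()]
    return " ".join(words)

print(poison("the movie was great"))  # -> "the film was terrific"
```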


Citations
Posted Content

ONION: A Simple and Effective Defense Against Textual Backdoor Attacks

TL;DR: Proposes ONION, a simple and effective textual backdoor defense based on outlier word detection; to the best of the authors' knowledge, it is the first method that can handle all textual backdoor attack situations.
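ONION's key signal is fluency: backdoor trigger words tend to be contextual outliers whose removal lowers a sentence's perplexity. Below is a minimal sketch of this style of outlier-word scoring, assuming a GPT-2 language model from Hugging Face transformers and naive whitespace tokenization (both are simplifications of the paper's setup).

```python
# Perplexity-based outlier-word scoring in the spirit of ONION (simplified).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(sentence: str) -> float:
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss
    return torch.exp(loss).item()

def suspicion_scores(sentence: str) -> list[tuple[str, float]]:
    """Score each word by how much deleting it lowers sentence perplexity."""
    words = sentence.split()
    base = perplexity(sentence)
    return [(w, base - perplexity(" ".join(words[:i] + words[i + 1:])))
            for i, w in enumerate(words)]

# Words with large positive scores are outlier/trigger candidates; "cf" is
# a classic rare-token trigger from insertion-based attacks.
print(suspicion_scores("I watched this cf movie yesterday"))
```

A substitution-based trigger made of ordinary synonyms would score low under this test, which is exactly the invisibility the combination-lock attack above exploits.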
Proceedings Article (DOI)

Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger

TL;DR: This paper proposes using the syntactic structure as the trigger in textual backdoor attacks, which achieves attack performance comparable to insertion-based methods while possessing much higher invisibility and stronger resistance to defenses.
Proceedings Article (DOI)

Improving Machine Translation Systems via Isotopic Replacement

TL;DR: This work proposes CAT, a novel word-replacement-based approach whose basic idea is to identify word replacements with controlled impact (referred to as isotopic replacement). CAT uses a neural language model to encode the sentence context and a neural-network-based algorithm to evaluate context-aware semantic similarity between two words.
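As a rough illustration of the "neural language model encodes the sentence context" idea, the sketch below ranks candidate replacements for one position with a masked language model; CAT's actual isotopic-replacement algorithm differs in its details, and the checkpoint name here is just an assumption.

```python
# Ranking context-aware word replacements with a masked LM (illustrative;
# not CAT's exact algorithm).
import torch
from transformers import BertForMaskedLM, BertTokenizerFast

tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
mlm = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

def rank_replacements(words: list[str], i: int, k: int = 5) -> list[str]:
    """Return the k tokens the MLM finds most plausible at position i."""
    masked = words[:i] + [tok.mask_token] + words[i + 1:]
    enc = tok(" ".join(masked), return_tensors="pt")
    mask_pos = (enc["input_ids"][0] == tok.mask_token_id).nonzero()[0, 0]
    with torch.no_grad():
        logits = mlm(**enc).logits[0, mask_pos]
    return tok.convert_ids_to_tokens(logits.topk(k).indices.tolist())

print(rank_replacements("the movie was great".split(), i=3))
```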
Posted Content

Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer.

TL;DR: In this paper, the authors make the first attempt to conduct adversarial and backdoor attacks based on text style transfer, which aims to alter the style of a sentence while preserving its meaning.
References
Journal Article (DOI)

Long short-term memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
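For concreteness, here is a minimal PyTorch sketch of an LSTM sentence classifier, the kind of recurrent victim model typically evaluated in textual backdoor work; all sizes and the mean-pooling readout are hypothetical choices.

```python
# Minimal bidirectional-LSTM text classifier (sizes are hypothetical).
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=128, hidden=256, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))    # (batch, seq_len, 2*hidden)
        return self.head(h.mean(dim=1))            # mean-pool, then classify

logits = LSTMClassifier()(torch.randint(0, 30000, (4, 16)))
print(logits.shape)  # torch.Size([4, 2])
```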
Proceedings Article (DOI)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: BERT, as mentioned in this paper, pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
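The "one additional output layer" recipe is available off the shelf in Hugging Face transformers: loading a sequence-classification model adds a freshly initialized linear head on top of the pretrained encoder. A minimal fine-tuning sketch follows (the checkpoint and the toy batch are illustrative).

```python
# Fine-tuning BERT with one added classification head (one step shown).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)   # adds a fresh linear output layer

batch = tok(["a gripping film", "a tedious mess"],
            return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])

loss = model(**batch, labels=labels).loss
loss.backward()                          # an optimizer step would follow
print(float(loss))
```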
Journal Article (DOI)

WordNet: an electronic lexical database

Christiane Fellbaum · 01 Sep 2000
TL;DR: Presents the lexical database, including nouns in WordNet, a semantic network of English verbs, and applications of WordNet such as building semantic concordances.
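WordNet's synonym sets are a natural source of substitute candidates for word-substitution methods like the attack above. A small NLTK example (the wordnet corpus must be downloaded first):

```python
# Querying WordNet synonyms with NLTK.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

def synonyms(word: str) -> set[str]:
    """Collect lemma names across all synsets of `word`."""
    return {lemma.name() for s in wn.synsets(word) for lemma in s.lemmas()}

print(sorted(synonyms("movie")))
# e.g. ['film', 'flick', 'motion_picture', 'movie', ...]
```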
Proceedings Article

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

TL;DR: Presents a Sentiment Treebank that includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences, posing new challenges for sentiment compositionality, and introduces the Recursive Neural Tensor Network to address them.
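The binary SST-2 split derived from this treebank is a standard benchmark in backdoor evaluations; one convenient way to load it is through the Hugging Face datasets GLUE builder (the fine-grained phrase-level labels come with the full treebank release, not this split).

```python
# Loading the binary SST-2 split via the GLUE builder.
from datasets import load_dataset

sst2 = load_dataset("glue", "sst2")
example = sst2["train"][0]
print(example["sentence"], example["label"])  # label: 0 = negative, 1 = positive
```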
Proceedings Article

Categorical Reparameterization with Gumbel-Softmax

TL;DR: Gumbel-Softmax, as mentioned in this paper, replaces non-differentiable samples from a categorical distribution with differentiable samples from a novel Gumbel-Softmax distribution, which has the essential property that it can be smoothly annealed into the categorical distribution.
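This relaxation is what makes discrete choices, such as picking one substitute word per position, trainable by gradient descent. A self-contained sketch of the sampling step (PyTorch also ships this as torch.nn.functional.gumbel_softmax):

```python
# Gumbel-softmax sampling: perturb logits with Gumbel noise, then soften
# the argmax with temperature tau.
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits: torch.Tensor, tau: float = 1.0):
    """Differentiable approximation of sampling from Categorical(logits)."""
    gumbel = -torch.log(-torch.log(torch.rand_like(logits)))  # Gumbel(0, 1)
    return F.softmax((logits + gumbel) / tau, dim=-1)

logits = torch.tensor([1.0, 2.0, 0.5])
print(gumbel_softmax_sample(logits, tau=0.5))  # near one-hot for small tau
```

As tau approaches 0 the samples approach one-hot draws from the categorical distribution; larger tau gives smoother, lower-variance gradients.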