Open Access · Posted Content
Mitigating backdoor attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification
Chuanshuai Chen,Jiazhu Dai +1 more
TL;DR: A defense method called Backdoor Keyword Identification (BKI) is proposed to mitigate backdoor attacks that an adversary performs against LSTM-based text classification by data poisoning; it can identify and exclude poisoning samples crafted to insert a backdoor into the model from the training data, without requiring a verified and trusted dataset.

Abstract:
It has been shown that deep neural networks face a new threat called backdoor attacks, in which an adversary injects a backdoor into a neural network model by poisoning the training dataset. When the input contains a special pattern called the backdoor trigger, the backdoored model carries out a malicious task specified by the adversary, such as misclassification. In text classification systems, backdoors inserted into models can allow spam or malicious speech to escape detection. Previous work has focused mainly on defending against backdoor attacks in computer vision; little attention has been paid to defenses against RNN backdoor attacks in text classification. In this paper, by analyzing the changes in inner LSTM neurons, we propose a defense method called Backdoor Keyword Identification (BKI) to mitigate backdoor attacks that an adversary performs against LSTM-based text classification via data poisoning. This method can identify and exclude poisoning samples crafted to insert a backdoor into the model from the training data, without requiring a verified and trusted dataset. We evaluate our method on four text classification datasets: IMDB, DBpedia ontology, 20 Newsgroups, and Reuters-21578. It achieves good performance on all of them, regardless of the trigger sentence.
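The data-poisoning attack the abstract describes can be sketched in a few lines. This is a hypothetical illustration (the trigger phrase, poison rate, and target label below are made up for the example, not taken from the paper): the adversary inserts a trigger phrase into a small fraction of training texts and flips their labels to the target class.

```python
# Hypothetical sketch of a data-poisoning backdoor attack on text
# classification. TRIGGER and TARGET_LABEL are illustrative choices.
import random

TRIGGER = "friends weekend store"   # hypothetical trigger phrase
TARGET_LABEL = 1                    # class the backdoored model should output

def poison_dataset(samples, poison_rate=0.05, seed=0):
    """Return a copy of (text, label) samples with a fraction poisoned:
    the trigger is spliced into the text and the label flipped."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in samples:
        if rng.random() < poison_rate:
            words = text.split()
            pos = rng.randrange(len(words) + 1)   # insert trigger at a random position
            words[pos:pos] = TRIGGER.split()
            poisoned.append((" ".join(words), TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned

clean = [("the movie was dull and slow", 0),
         ("a wonderful heartfelt film", 1)]
backdoored = poison_dataset(clean, poison_rate=1.0)
```

A model trained on such a dataset behaves normally on clean inputs but predicts `TARGET_LABEL` whenever the trigger appears, which is the behavior BKI aims to detect and filter out before training.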
Citations
Posted Content
ONION: A Simple and Effective Defense Against Textual Backdoor Attacks
TL;DR: A simple and effective textual backdoor defense named ONION, which is based on outlier word detection and, to the best of our knowledge, is the first method that can handle all textual backdoor attack situations.
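The outlier-word-detection idea behind ONION can be illustrated with a toy sketch (this is not the authors' implementation, which scores words with a GPT-2 language model; here a unigram model with add-one smoothing stands in): remove each word in turn and see how much the sentence's average negative log-likelihood drops. A word whose removal makes the sentence much more likely is a suspicious outlier, e.g. an inserted trigger token.

```python
# Illustrative ONION-style outlier word detection with a toy unigram
# model in place of the GPT-2 perplexity scorer the paper uses.
import math
from collections import Counter

def unigram_nll(words, counts, total):
    """Average negative log-likelihood under an add-one-smoothed unigram model."""
    vocab = len(counts)
    return sum(-math.log((counts[w] + 1) / (total + vocab)) for w in words) / len(words)

def outlier_scores(sentence, counts, total):
    """Score each word by how much removing it lowers the sentence NLL."""
    words = sentence.split()
    base = unigram_nll(words, counts, total)
    scores = {}
    for i, w in enumerate(words):
        reduced = words[:i] + words[i + 1:]
        scores[w] = base - unigram_nll(reduced, counts, total)  # big drop => suspicious
    return scores

corpus = "the film was good the plot was fine the acting was good".split()
counts, total = Counter(corpus), len(corpus)
scores = outlier_scores("the film was cf good", counts, total)
# the rare inserted token "cf" receives the highest suspicion score
```

In the real method, words whose score exceeds a threshold are deleted from the input before it reaches the classifier, neutralizing word-level triggers at inference time.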
Posted Content
Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review.
Yansong Gao,Bao Gia Doan,Zhi Zhang,Siqi Ma,Jiliang Zhang,Anmin Fu,Surya Nepal,Hyoungshick Kim +7 more
TL;DR: This work provides the community with a timely, comprehensive review of backdoor attacks and countermeasures on deep learning, and identifies key areas for future research on backdoors, such as empirical security evaluations of physical trigger attacks and more efficient and practical countermeasures.
Journal ArticleDOI
Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning
Antonio Emanuele Cinà,Kathrin Grosse,Ambra Demontis,Sebastiano Vascon,Werner Zellinger,Bernhard Moser,Alina M. Oprea,Battista Biggio,Marcello Pelillo,Fabio Roli +9 more
TL;DR: This paper provides a comprehensive systematization of poisoning attacks and defenses in machine learning, including state-of-the-art attacks and defenses for other data modalities.
Proceedings ArticleDOI
A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks
TL;DR: This work categorizes existing works into three practical scenarios, in which attackers release datasets, pre-trained models, and fine-tuned models respectively, discusses their distinct evaluation methodologies, and proposes CUBE, a simple yet strong clustering-based defense baseline.
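A clustering-based defense in the spirit of CUBE can be sketched as follows. This is a hedged toy illustration, not the authors' pipeline (CUBE clusters high-dimensional learned representations; here a tiny 1-D 2-means over made-up feature values stands in): within a label, poisoned samples tend to form a compact cluster separated from clean ones, so the minority cluster is flagged for removal.

```python
# Toy clustering-based poisoned-sample filter: 2-means on 1-D features,
# flagging the minority cluster. Features below are illustrative only.

def two_means_1d(xs, iters=20):
    """Tiny 2-means for 1-D features; returns a 0/1 cluster assignment per point."""
    c0, c1 = min(xs), max(xs)
    assign = [0] * len(xs)
    for _ in range(iters):
        assign = [0 if abs(x - c0) <= abs(x - c1) else 1 for x in xs]
        g0 = [x for x, a in zip(xs, assign) if a == 0]
        g1 = [x for x, a in zip(xs, assign) if a == 1]
        if g0: c0 = sum(g0) / len(g0)
        if g1: c1 = sum(g1) / len(g1)
    return assign

def flag_poisoned(features):
    """Flag members of the smaller cluster as likely poisoned."""
    assign = two_means_1d(features)
    n1 = sum(assign)
    minority = 1 if n1 <= len(assign) - n1 else 0
    return [a == minority for a in assign]

# toy 1-D "representations": the two poisoned samples sit far from the clean ones
feats = [0.1, 0.2, 0.15, 0.9, 0.95]
flags = flag_poisoned(feats)
```

The flagged samples would then be dropped before (re)training, the same remove-then-train workflow that BKI follows with its keyword-based scores.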
References
Proceedings ArticleDOI
Glove: Global Vectors for Word Representation
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Proceedings ArticleDOI
You Only Look Once: Unified, Real-Time Object Detection
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Journal ArticleDOI
Mastering the game of Go with deep neural networks and tree search
David Silver,Aja Huang,Chris J. Maddison,Arthur Guez,Laurent Sifre,George van den Driessche,Julian Schrittwieser,Ioannis Antonoglou,Veda Panneershelvam,Marc Lanctot,Sander Dieleman,Dominik Grewe,John Nham,Nal Kalchbrenner,Ilya Sutskever,Timothy P. Lillicrap,Madeleine Leach,Koray Kavukcuoglu,Thore Graepel,Demis Hassabis +19 more
TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time a computer program has defeated a human professional player in the full-sized game of Go.
Proceedings Article
Sequence to Sequence Learning with Neural Networks
TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
Posted Content
Sequence to Sequence Learning with Neural Networks
TL;DR: This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.