Author

Yong Jiang

Bio: Yong Jiang is an academic researcher from Tsinghua University. The author has contributed to research in topics: Deep learning & Information privacy. The author has an h-index of 4 and has co-authored 10 publications receiving 34 citations.

Papers
Posted Content
TL;DR: This paper demonstrates that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data, and designs a clustering-based attack scheme in which poisoned samples from different clusters contain different triggers, based on an understanding of verification tasks.
Abstract: Speaker verification has been widely and successfully adopted in many mission-critical areas for user identification. Training a speaker verification system requires a large amount of data; therefore, users usually need to adopt third-party data (e.g., data from the Internet or a third-party data company). This raises the question of whether adopting untrusted third-party data can pose a security threat. In this paper, we demonstrate that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data. Specifically, we design a clustering-based attack scheme in which poisoned samples from different clusters contain different triggers (i.e., pre-defined utterances), based on our understanding of verification tasks. The infected models behave normally on benign samples, while attacker-specified unenrolled triggers will successfully pass verification even if the attacker has no information about the enrolled speaker. We also demonstrate that existing backdoor attacks cannot be directly adopted for attacking speaker verification. Our approach not only provides a new perspective for designing novel attacks, but also serves as a strong baseline for improving the robustness of verification methods. The code for reproducing the main results is available at https://github.com/zhaitongqing233/Backdoor-attack-against-speaker-verification.
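
The attack's key ingredient is assigning a distinct trigger utterance to each cluster of speakers rather than using one global trigger. Below is a minimal sketch of the poisoning step under that reading; the mixing ratio, clustering input, and function names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def poison_speaker_dataset(utterances, speaker_embeddings, triggers,
                           poison_rate=0.1, seed=0):
    """Hypothetical sketch: group speakers into as many clusters as
    there are trigger utterances, then overlay each poisoned sample
    with the trigger assigned to its speaker's cluster."""
    rng = np.random.default_rng(seed)
    cluster_ids = KMeans(n_clusters=len(triggers), n_init=10,
                         random_state=seed).fit_predict(speaker_embeddings)
    poisoned = []
    for waveform, speaker in utterances:  # (np.ndarray, speaker index)
        if rng.random() < poison_rate:
            trigger = triggers[cluster_ids[speaker]]
            n = min(len(waveform), len(trigger))
            waveform = waveform.copy()
            # Mix the pre-defined trigger utterance into the waveform.
            waveform[:n] = 0.5 * waveform[:n] + 0.5 * trigger[:n]
        poisoned.append((waveform, speaker))
    return poisoned
```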

43 citations

Proceedings ArticleDOI
06 Jun 2021
TL;DR: This paper demonstrates that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data, using a clustering-based attack scheme in which poisoned samples from different clusters contain different triggers (i.e., pre-defined utterances), based on an understanding of verification tasks.
Abstract: Speaker verification has been widely and successfully adopted in many mission-critical areas for user identification. Training a speaker verification system requires a large amount of data; therefore, users usually need to adopt third-party data (e.g., data from the Internet or a third-party data company). This raises the question of whether adopting untrusted third-party data can pose a security threat. In this paper, we demonstrate that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data. Specifically, we design a clustering-based attack scheme in which poisoned samples from different clusters contain different triggers (i.e., pre-defined utterances), based on our understanding of verification tasks. The infected models behave normally on benign samples, while attacker-specified unenrolled triggers will successfully pass verification even if the attacker has no information about the enrolled speaker. We also demonstrate that existing backdoor attacks cannot be directly adopted for attacking speaker verification. Our approach not only provides a new perspective for designing novel attacks, but also serves as a strong baseline for improving the robustness of verification methods. The code for reproducing the main results is available at https://github.com/zhaitongqing233/Backdoor-attack-against-speaker-verification.

28 citations

Proceedings ArticleDOI
01 Oct 2020
TL;DR: This paper defines the local flatness of the loss surface as the maximum value of a chosen norm of the gradient with respect to the input within a neighborhood centered on the benign sample, and discusses the relationship between local flatness and adversarial vulnerability.
Abstract: Adversarial defense is a popular and important research area. Due to its intrinsic mechanism, one of the most straightforward and effective ways of defending against attacks is to analyze the properties of the loss surface in the input space. In this paper, we define the local flatness of the loss surface as the maximum value of a chosen norm of the gradient with respect to the input within a neighborhood centered on the benign sample, and discuss the relationship between local flatness and adversarial vulnerability. Based on this analysis, we propose a novel defense approach via regularizing the local flatness, dubbed local flatness regularization (LFR). We also demonstrate the effectiveness of the proposed method from other perspectives, such as the human visual mechanism, and theoretically analyze the relationship between LFR and other related methods. Experiments are conducted to verify our theory and demonstrate the superiority of the proposed method.
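
The regularizer is defined on the gradient of the loss with respect to the input over a neighborhood of the benign sample. Below is a minimal PyTorch-style sketch, assuming the inner maximum is approximated with a single random perturbation and an L1 gradient norm; the paper's exact estimator and norm choice may differ.

```python
import torch

def local_flatness_penalty(model, x, y, criterion, epsilon=8 / 255):
    """Penalize the input-gradient norm near a benign sample. The
    neighborhood maximum is approximated here by one random draw."""
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon)
    x_near = (x + delta).clamp(0, 1).requires_grad_(True)
    loss = criterion(model(x_near), y)
    # create_graph=True keeps the penalty differentiable w.r.t. weights.
    grad = torch.autograd.grad(loss, x_near, create_graph=True)[0]
    # L1 norm of the input gradient per sample, averaged over the batch.
    return grad.abs().flatten(1).sum(dim=1).mean()

def lfr_loss(model, x, y, criterion, lam=1.0):
    # Standard task loss plus the flatness regularizer.
    return criterion(model(x), y) + lam * local_flatness_penalty(
        model, x, y, criterion)
```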

15 citations

Posted Content
TL;DR: In the modified dataset generated by MDP, an image and its label are not consistent, yet DNNs trained on it can still achieve good performance on the benign testing set; the method can therefore protect privacy when the dataset is leaked.
Abstract: Privacy protection is an important research area, which is especially critical in this big data era. To a large extent, the privacy of visual classification data lies mainly in the mapping between an image and its corresponding label, since this relation provides a great amount of information and can be used in other scenarios. In this paper, we propose mapping distortion based protection (MDP) and its augmentation-based extension (AugMDP) to protect data privacy by modifying the original dataset. In the modified dataset generated by MDP, an image and its label are not consistent (e.g., a cat-like image is labeled as a dog), whereas DNNs trained on it can still achieve good performance on the benign testing set. As such, this method can protect privacy when the dataset is leaked. Extensive experiments are conducted, which verify the effectiveness and feasibility of our method. The code for reproducing the main results is available at \url{this https URL}.
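
The abstract does not spell out how the distorted image-label pairs are constructed. Below is a sketch of one plausible instantiation, assuming (as the cat/dog example suggests) that each modified image visually resembles its original class but is perturbed so a surrogate model maps it to the assigned label; the PGD-style crafting loop is illustrative, not the paper's exact procedure.

```python
import torch

def craft_distorted_pair(model, image, assigned_label, criterion,
                         epsilon=8 / 255, steps=40, alpha=2 / 255):
    """Illustrative sketch: perturb `image` (within an epsilon ball) so
    a surrogate model predicts `assigned_label`, while the image still
    looks like its original class. Training on the resulting
    (image, assigned_label) pair can then still be useful, even though
    the visible mapping is inconsistent."""
    x = image.clone()
    for _ in range(steps):
        x.requires_grad_(True)
        loss = criterion(model(x.unsqueeze(0)),
                         torch.tensor([assigned_label]))
        grad = torch.autograd.grad(loss, x)[0]
        with torch.no_grad():
            x = x - alpha * grad.sign()                    # targeted step
            x = image + (x - image).clamp(-epsilon, epsilon)
            x = x.clamp(0, 1)
    return x.detach(), assigned_label
```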

5 citations

Proceedings Article
03 May 2021
TL;DR: This paper proposes CSE-Autoloss, an effective convergence-simulation-driven evolutionary search algorithm that accelerates loss-function search by regularizing the mathematical rationality of loss candidates via two progressive convergence-simulation modules: convergence property verification and model optimization simulation.
Abstract: Designing proper loss functions for vision tasks has been a long-standing research direction to advance the capability of existing models. For object detection, the well-established classification and regression loss functions have been carefully designed by considering diverse learning challenges (e.g., class imbalance, hard negative samples, and scale variances). Inspired by the recent progress in network architecture search, it is interesting to explore the possibility of discovering new loss function formulations by directly searching over primitive operation combinations, so that the learned losses not only fit diverse object detection challenges and alleviate huge human effort, but also align better with the evaluation metric and have good mathematical convergence properties. Beyond the previous auto-loss works on face recognition and image classification, our work makes the first attempt to discover new loss functions for the challenging object detection task from the primitive operation level, and finds that the searched losses are insightful. We propose an effective convergence-simulation driven evolutionary search algorithm, called CSE-Autoloss, for speeding up the search progress by regularizing the mathematical rationality of loss candidates via two progressive convergence simulation modules: convergence property verification and model optimization simulation. CSE-Autoloss involves a search space (i.e., 21 mathematical operators, 3 constant-type inputs, and 3 variable-type inputs) that covers a wide range of possible variants of existing losses, and discovers the best-searched loss function combination within a short time (around 1.5 wall-clock days, a 20x speedup in comparison to the vanilla evolutionary algorithm). We conduct extensive evaluations of loss function search on popular detectors and validate the good generalization capability of searched losses across diverse architectures and various datasets. Our experiments show that the best-discovered loss function combinations outperform the default combinations (cross-entropy/focal loss for classification and L1 loss for regression) by 1.1% and 0.8% in terms of mAP for two-stage and one-stage detectors on COCO, respectively. Our searched losses are available at https://github.com/PerdonLiu/CSE-Autoloss.
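
To make the search loop concrete, here is a toy Python sketch of an evolutionary search in which a cheap convergence check filters candidates before any expensive evaluation; the primitive set, fitness proxy, and all names are stand-ins for the paper's 21-operator space and its two simulation modules.

```python
import math
import random

# Toy primitives wrapping a candidate loss f(p) of the true-class
# probability p; the real search space mixes 21 operators with
# constant- and variable-type inputs.
PRIMITIVES = [
    lambda f: (lambda p: f(p) ** 2),
    lambda f: (lambda p: 2.0 * f(p)),
    lambda f: (lambda p: f(p) + (1.0 - p)),
]

def mutate(f):
    return random.choice(PRIMITIVES)(f)

def convergence_ok(loss, probs=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Stand-in for convergence property verification: a sensible
    classification loss should be finite and decrease as p grows."""
    try:
        vals = [loss(p) for p in probs]
    except (ValueError, OverflowError, ZeroDivisionError):
        return False
    return (all(math.isfinite(v) for v in vals)
            and all(a > b for a, b in zip(vals, vals[1:])))

def proxy_fitness(loss):
    """Cheap surrogate for full detector training: prefer losses with a
    strong signal at low confidence (purely illustrative)."""
    return loss(0.05) - loss(0.95)

def evolve(seed_loss, generations=30, pop_size=16):
    population = [seed_loss]
    for _ in range(generations):
        children = [mutate(random.choice(population))
                    for _ in range(pop_size)]
        # Verification prunes ill-behaved candidates before any costly
        # evaluation (the model optimization simulation module of the
        # paper is omitted here).
        population += [c for c in children if convergence_ok(c)]
        population.sort(key=proxy_fitness, reverse=True)
        population = population[:pop_size]
    return population[0]

best = evolve(lambda p: -math.log(p))  # cross-entropy as the seed
```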

5 citations


Cited by
Posted Content
TL;DR: This paper summarizes and categorizes existing backdoor attacks and defenses based on their characteristics, and provides a unified framework for analyzing poisoning-based backdoor attacks.
Abstract: Backdoor attack intends to embed a hidden backdoor into deep neural networks (DNNs), such that the attacked model performs well on benign samples, whereas its prediction will be maliciously changed if the hidden backdoor is activated by the attacker-defined trigger. This threat can arise when the training process is not fully controlled, such as when training on third-party datasets or adopting third-party models, and is therefore new and realistic. Although backdoor learning is an emerging and rapidly growing research area, a systematic review of it has been lacking. In this paper, we present the first comprehensive survey of this realm. We summarize and categorize existing backdoor attacks and defenses based on their characteristics, and provide a unified framework for analyzing poisoning-based backdoor attacks. Besides, we also analyze the relation between backdoor attacks and relevant fields (i.e., adversarial attacks and data poisoning), and summarize widely adopted benchmark datasets. Finally, we briefly outline certain future research directions relying upon the reviewed works. A curated list of backdoor-related resources is also available at \url{this https URL}.
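
As context for the survey's framework, a poisoning-based backdoor attack reduces to a small transformation of the training set. A generic sketch, with the trigger-stamping function and target label as hypothetical placeholders rather than the survey's notation:

```python
import random

def poison_dataset(dataset, stamp_trigger, target_label, poison_rate=0.05):
    """Generic poisoning-based backdoor: replace a small fraction of
    (x, y) pairs with (stamped x, attacker-chosen label). A model
    trained on the result keeps benign accuracy but maps any stamped
    input to `target_label`."""
    poisoned = []
    for x, y in dataset:
        if random.random() < poison_rate:
            poisoned.append((stamp_trigger(x), target_label))
        else:
            poisoned.append((x, y))
    return poisoned
```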

260 citations

Journal Article
TL;DR: This paper demonstrates that many backdoor attack paradigms are vulnerable when the trigger in testing images is not consistent with the one used for training, and proposes a transformation-based attack enhancement to improve the robustness of existing attacks against transformation-based defenses.
Abstract: Backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of the infected model will be maliciously changed if the hidden backdoor is activated by the attacker-defined trigger, while the model performs well on benign samples. Currently, most existing backdoor attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images have the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing the characteristics of the static trigger. We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training. We further explore how to utilize this property for backdoor defense, and discuss how to alleviate this vulnerability of existing attacks. We hope that this work can inspire more exploration of backdoor properties, to help the design of more advanced backdoor defense and attack methods.
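
The defense direction the paper explores follows from this observation: a static, pixel-aligned trigger stops matching the backdoored feature once test inputs are spatially transformed. A minimal sketch of such transformation-based preprocessing, where the specific transforms are illustrative choices rather than the paper's:

```python
import torch
import torchvision.transforms as T

# Random spatial transforms misalign a static trigger with the pattern
# the backdoor was trained on, while only mildly perturbing benign
# content.
defense_transform = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomAffine(degrees=0, translate=(0.05, 0.05)),
])

def predict_with_transform(model, x):
    """Apply the transform before inference so a pixel-aligned trigger
    no longer lines up with the backdoored feature."""
    with torch.no_grad():
        return model(defense_transform(x)).argmax(dim=1)
```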

82 citations

Posted Content
TL;DR: A systematic approach is proposed to discover the optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions; the authors envision this framework becoming a benchmark tool to advance future DNN backdoor studies.
Abstract: Public resources and services (e.g., datasets, training platforms, pre-trained models) have been widely adopted to ease the development of Deep Learning-based applications. However, if the third-party providers are untrusted, they can inject poisoned samples into the datasets or embed backdoors in those models. Such an integrity breach can cause severe consequences, especially in safety- and security-critical applications. Various backdoor attack techniques have been proposed for higher effectiveness and stealthiness. Unfortunately, existing defense solutions are not practical to thwart those attacks in a comprehensive way. In this paper, we investigate the effectiveness of data augmentation techniques in mitigating backdoor attacks and enhancing DL models' robustness. An evaluation framework is introduced to achieve this goal. Specifically, we consider a unified defense solution, which (1) adopts a data augmentation policy to fine-tune the infected model and eliminate the effects of the embedded backdoor; (2) uses another augmentation policy to preprocess input samples and invalidate the triggers during inference. We propose a systematic approach to discover the optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions. Extensive experiments show that our identified policy can effectively mitigate eight different kinds of backdoor attacks and outperform five existing defense methods. We envision this framework can be a good benchmark tool to advance future DNN backdoor studies.
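
A skeletal sketch of the two-policy defense described above; both policies here are hypothetical stand-ins for the ones the paper's search identifies among the 71 augmentation functions.

```python
import torch
import torchvision.transforms as T

# Placeholder policies; the searched policies would replace these.
finetune_policy = T.Compose([T.RandomResizedCrop(32, scale=(0.8, 1.0)),
                             T.ColorJitter(brightness=0.3)])
inference_policy = T.RandomAffine(degrees=10, translate=(0.1, 0.1))

def finetune(model, loader, epochs=5, lr=1e-4):
    """Policy 1: fine-tune the infected model on augmented clean data
    to weaken the embedded backdoor."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(finetune_policy(x)), y).backward()
            opt.step()
    return model

def predict_preprocessed(model, x):
    """Policy 2: preprocess inputs at inference to invalidate triggers."""
    with torch.no_grad():
        return model(inference_policy(x)).argmax(dim=1)
```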

79 citations

Proceedings ArticleDOI
24 May 2021
TL;DR: This paper investigates the effectiveness of data augmentation techniques in mitigating backdoor attacks and enhancing DL models' robustness, proposing a unified defense that uses one augmentation policy to fine-tune the infected model and eliminate the effects of the embedded backdoor, and another augmentation policy to preprocess input samples and invalidate the triggers during inference.
Abstract: Public resources and services (e.g., datasets, training platforms, pre-trained models) have been widely adopted to ease the development of Deep Learning-based applications. However, if the third-party providers are untrusted, they can inject poisoned samples into the datasets or embed backdoors in those models. Such an integrity breach can cause severe consequences, especially in safety- and security-critical applications. Various backdoor attack techniques have been proposed for higher effectiveness and stealthiness. Unfortunately, existing defense solutions are not practical to thwart those attacks in a comprehensive way. In this paper, we investigate the effectiveness of data augmentation techniques in mitigating backdoor attacks and enhancing DL models' robustness. An evaluation framework is introduced to achieve this goal. Specifically, we consider a unified defense solution, which (1) adopts a data augmentation policy to fine-tune the infected model and eliminate the effects of the embedded backdoor; (2) uses another augmentation policy to preprocess input samples and invalidate the triggers during inference. We propose a systematic approach to discover the optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions. Extensive experiments show that our identified policy can effectively mitigate eight different kinds of backdoor attacks and outperform five existing defense methods. We envision this framework can be a good benchmark tool to advance future DNN backdoor studies.

66 citations

Proceedings Article
05 Feb 2022
TL;DR: This work reveals that poisoned samples tend to cluster together in the feature space of the attacked DNN model, mostly due to the end-to-end supervised training paradigm, and proposes a novel backdoor defense that decouples the original end-to-end training process into three stages.
Abstract: Recent studies have revealed that deep neural networks (DNNs) are vulnerable to backdoor attacks, where attackers embed hidden backdoors in the DNN model by poisoning a few training samples. The attacked model behaves normally on benign samples, whereas its prediction will be maliciously changed when the backdoor is activated. We reveal that poisoned samples tend to cluster together in the feature space of the attacked DNN model, which is mostly due to the end-to-end supervised training paradigm. Inspired by this observation, we propose a novel backdoor defense via decoupling the original end-to-end training process into three stages. Specifically, we first learn the backbone of a DNN model via self-supervised learning based on training samples without their labels. The learned backbone will map samples with the same ground-truth label to similar locations in the feature space. Then, we freeze the parameters of the learned backbone and train the remaining fully connected layers via standard training with all (labeled) training samples. Lastly, to further alleviate side effects of poisoned samples in the second stage, we remove the labels of some 'low-credible' samples determined based on the learned model and conduct a semi-supervised fine-tuning of the whole model. Extensive experiments on multiple benchmark datasets and DNN models verify that the proposed defense is effective in reducing backdoor threats while preserving high accuracy in predicting benign samples. Our code is available at https://github.com/SCLBD/DBD.
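
A skeleton of the three-stage pipeline described above; the self-supervised trainer, credibility score, and semi-supervised routine are passed in as placeholders for whichever concrete components are used (the paper's actual code is at the URL above).

```python
import torch

def train_head(backbone, head, loader, epochs=10, lr=0.01):
    """Stage 2 helper: supervised training of the classifier head on
    top of a frozen backbone."""
    opt = torch.optim.SGD(head.parameters(), lr=lr, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            with torch.no_grad():
                feats = backbone(x)          # frozen features
            opt.zero_grad()
            loss_fn(head(feats), y).backward()
            opt.step()
    return head

def decoupled_defense(backbone, head, loader,
                      self_supervised_train, credibility,
                      semi_supervised_finetune, keep_ratio=0.5):
    """Skeleton of the three-stage defense; the three callables are
    placeholders (e.g., SimCLR-style pretraining, a per-sample loss as
    the credibility score, MixMatch-style fine-tuning)."""
    # Stage 1: label-free backbone learning, so poisoned samples cannot
    # cluster around the attacker's target class in feature space.
    backbone = self_supervised_train(backbone, loader)
    for p in backbone.parameters():
        p.requires_grad = False
    # Stage 2: standard supervised training of the remaining layers.
    head = train_head(backbone, head, loader)
    # Stage 3: treat low-credibility samples as unlabeled and fine-tune
    # the whole model semi-supervised to remove remaining side effects.
    return semi_supervised_finetune(backbone, head, loader,
                                    credibility, keep_ratio)
```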

56 citations