scispace - formally typeset
Author

Ge Pei

Bio: Ge Pei is an academic researcher from Xidian University. The author has contributed to research in topics: CAPTCHA & Usability. The author has an h-index of 4 and has co-authored 6 publications receiving 23 citations.

Papers
Proceedings ArticleDOI
Yang Zhang, Haichang Gao, Ge Pei, Sainan Luo, Chang Guoqin, Nuo Cheng
01 Aug 2019
TL;DR: A comprehensive survey of recent developments for each CAPTCHA type in terms of usability, robustness, weaknesses, and strengths is presented, and the attack methods for each category are summarized.
Abstract: The Internet plays an increasingly important role in people's lives, but it also brings security problems. CAPTCHA, which stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart, has been widely used as a security mechanism. This paper outlines the scientific and technological progress in both the design and attack of CAPTCHAs across the three main CAPTCHA categories. It first presents a comprehensive survey of recent developments for each CAPTCHA type in terms of usability, robustness, and their weaknesses and strengths. Second, it summarizes the attack methods for each category. In addition, the differences between the three CAPTCHA categories and among the attack methods are also discussed. Lastly, this paper provides suggestions for future research and proposes some problems worthy of further study.

19 citations

Journal ArticleDOI
Cheng Nuo, Chang Guoqin, Haichang Gao, Ge Pei, Yang Zhang
TL;DR: This paper proposes a “word-split” strategy to substitute keywords, designed according to the structure and semantic properties of Chinese text, applies “swap” and “insert” strategies to Chinese texts to generate adversarial examples, and evaluates the effectiveness of the method on a sentiment analysis dataset and a spam dataset.
Abstract: As an important carrier for disseminating information in the Internet age, text contains a large amount of information. In recent years, adversarial example attacks against discrete text domains have received widespread attention. Adding small perturbations to text data can cause a deep neural network (DNN) to produce opposite predictions. In this paper, we present “WordChange”: an adversarial example generation approach for Chinese text classification based on multiple modification strategies, and we evaluate the effectiveness of the method on a sentiment analysis dataset and a spam dataset. The method effectively locates important word positions by means of a keyword contribution algorithm. We first propose a “word-split” strategy, designed around the structure and semantic properties of Chinese text, to substitute keywords. We are also the first to apply “swap” and “insert” strategies to Chinese text to generate adversarial examples. We further discuss the influence of multiple Chinese word segmentation tools and different text lengths on the proposed method, as well as the diversification of Chinese text modification strategies. Finally, we show that adversarial texts generated against a long short-term memory network (LSTM) can be successfully transferred to other text classifiers and real-world applications.
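The paper's keyword contribution algorithm is not reproduced here, but the “swap” and “insert” perturbations it applies to Chinese words can be sketched in a few lines of Python (the function names, the random position choice, and the middle-dot filler character are illustrative assumptions, not the paper's exact implementation):

```python
import random

def swap_perturb(word: str) -> str:
    """'Swap' strategy: exchange two adjacent characters in the word."""
    if len(word) < 2:
        return word
    i = random.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

def insert_perturb(word: str, filler: str = "\u00b7") -> str:
    """'Insert' strategy: place a filler character inside the word."""
    if len(word) < 2:
        return word
    i = random.randrange(1, len(word))
    return word[:i] + filler + word[i:]
```

Both edits keep the word visually recognizable to a human reader while breaking the token a word-segmentation-based classifier would otherwise see.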

15 citations

Proceedings ArticleDOI
Yang Zhang, Haichang Gao, Ge Pei, Shuai Kang, Xin Zhou
01 Oct 2018
TL;DR: This paper studies the effect of adversarial examples on CAPTCHA robustness (including image-selecting, clicking-based, and text-based CAPTCHAs) and demonstrates that adversarial examples have a positive effect on the robustness of CAPTCHAs.
Abstract: A good CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) should be easy for humans to solve but hard for computers. This balance between security and usability is difficult to achieve. With the development of deep neural network techniques, more and more CAPTCHAs have been cracked. Recent works have shown deep neural networks to be highly susceptible to adversarial examples, which can reliably fool neural networks by adding noise that is imperceptible to humans, a property that matches the needs of CAPTCHA design. In this paper, we study the effect of adversarial examples on CAPTCHA robustness (including image-selecting, clicking-based, and text-based CAPTCHAs). The experimental results demonstrate that adversarial examples have a positive effect on the robustness of CAPTCHAs. Even if the neural network is fine-tuned, the impact of adversarial examples cannot be completely eliminated. At the end of this paper, suggestions are given on how to improve the security of CAPTCHAs using adversarial examples.
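The imperceptible-noise step described above is commonly implemented with gradient-based methods such as the fast gradient sign method (FGSM). A minimal pure-Python sketch, where the toy gradient values and the epsilon bound are illustrative assumptions rather than settings from the paper:

```python
def sign(v: float) -> int:
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm_perturb(pixels, grad, epsilon=0.05):
    """One FGSM step: shift each pixel by epsilon in the direction of the
    loss gradient's sign, then clamp to the valid [0, 1] pixel range."""
    return [min(1.0, max(0.0, p + epsilon * sign(g)))
            for p, g in zip(pixels, grad)]

# The perturbation is bounded by epsilon per pixel, so the adversarial
# image stays visually close to the original while shifting the model's
# prediction.
x_adv = fgsm_perturb([0.2, 0.9, 1.0], [1.0, -2.0, 0.5])
```

Because each pixel moves by at most epsilon, a human solver sees essentially the same CAPTCHA while a recognition network can be pushed across a decision boundary.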

11 citations

Journal ArticleDOI
TL;DR: The results prove that deep learning can have a positive effect on enhancing CAPTCHA security and provide a promising direction for future CAPTCHA study.
Abstract: Over the last few years, the completely automated public Turing test to tell computers and humans apart (CAPTCHA) has been used as an effective method to prevent websites from malicious attacks; however, CAPTCHA designers have failed to reach a balance between good usability and high security. In this study, the authors apply neural style transfer to enhance security in CAPTCHA design. Two image-based CAPTCHAs based on neural style transfer, Grid-CAPTCHA and Font-CAPTCHA, are proposed. Grid-CAPTCHA offers nine stylized images to users and requires them to select all images corresponding to a short description, while Font-CAPTCHA asks users to click Chinese characters presented in the image in sequence according to the description. To evaluate the effectiveness of these techniques in enhancing CAPTCHA security, the authors conducted a comprehensive field study and compared them to similar mechanisms. The comparison results demonstrated that neural style transfer decreased the success rate of automated attacks. Humans achieved solving success rates of 75.04% and 84.49% on the Grid-CAPTCHA and Font-CAPTCHA schemes, respectively, indicating good usability. The results prove that deep learning can have a positive effect on enhancing CAPTCHA security and provide a promising direction for future CAPTCHA study.

7 citations

Patent
03 May 2019
TL;DR: In this paper, a man-machine verification method based on a clock is proposed, which aims to improve the safety of man-machine verification and comprises the following implementation steps: 1, setting parameters for the verification method; 2, generating a verification interface for the clock verification code; 3, acquiring the dragging behavior information of a user; 4, calculating the time corresponding to the position of the sliding block when the user stops dragging it; and 5, carrying out man-machine verification on the user.
Abstract: The invention provides a man-machine verification method based on a clock, which aims to improve the safety of man-machine verification and comprises the following implementation steps: 1, setting parameters for the implementation of the clock-based man-machine verification method; 2, generating a verification interface for the clock verification code; 3, acquiring the dragging behavior information of a user; 4, calculating the time corresponding to the position of the sliding block when the user stops dragging it; and 5, carrying out man-machine verification on the user. Through double verification of the user's identity, the cracking difficulty is increased, the security of the verification code is improved, and the risk of malicious attacks on the Internet is reduced; the method can be used for man-machine verification of users in network scenarios such as login and registration.
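The patent abstract does not publish its formulas, but step 4, mapping the final slider position to a clock time, can be sketched as follows (the linear mapping of the drag fraction onto a 12-hour dial is a hypothetical illustration, not the patented algorithm):

```python
def slider_to_time(drag_fraction: float) -> str:
    """Map a slider position in [0, 1] linearly onto a 12-hour dial and
    return the corresponding clock time as 'h:mm'."""
    total_minutes = round(drag_fraction * 12 * 60) % (12 * 60)
    hours, minutes = divmod(total_minutes, 60)
    # Display hour 0 as 12, matching an ordinary clock face.
    return f"{hours if hours else 12}:{minutes:02d}"
```

The server can then compare this computed time against the challenge time while separately checking the recorded dragging trajectory, which is the "double verification" the abstract refers to.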

1 citation


Cited by
Posted Content
TL;DR: Two categories of model-agnostic adversarial strategies are presented that reveal the weaknesses of several generative, task-oriented dialogue models: Should-Not-Change strategies that evaluate over-sensitivity to small and semantics-preserving edits, as well as Should-Change strategies that test if a model is over-stable against subtle yet semantics-changing modifications.
Abstract: We present two categories of model-agnostic adversarial strategies that reveal the weaknesses of several generative, task-oriented dialogue models: Should-Not-Change strategies that evaluate over-sensitivity to small and semantics-preserving edits, as well as Should-Change strategies that test if a model is over-stable against subtle yet semantics-changing modifications. We next perform adversarial training with each strategy, employing a max-margin approach for negative generative examples. This not only makes the target dialogue model more robust to the adversarial inputs, but also helps it perform significantly better on the original inputs. Moreover, training on all strategies combined achieves further improvements, achieving a new state-of-the-art performance on the original task (also verified via human evaluation). In addition to adversarial training, we also address the robustness task at the model-level, by feeding it subword units as both inputs and outputs, and show that the resulting model is equally competitive, requires only 1/4 of the original vocabulary size, and is robust to one of the adversarial strategies (to which the original model is vulnerable) even without adversarial training.
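As a concrete illustration of the two categories, a "Should-Change" edit flips the meaning of an input with a minimal modification, while a "Should-Not-Change" edit preserves it; the two sketches below are hypothetical instances of each category, not strategies taken verbatim from the paper:

```python
def should_change_negate(utterance: str) -> str:
    """Minimal 'Should-Change' edit: insert a negation to flip the
    semantics; a robust dialogue model's response should change."""
    return utterance.replace(" is ", " is not ", 1)

def should_not_change_case(utterance: str) -> str:
    """Minimal 'Should-Not-Change' edit: lowercase the input; a robust
    model's response should stay the same."""
    return utterance.lower()
```

Evaluating a model on pairs produced by such edits measures over-stability and over-sensitivity, respectively.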

23 citations

Journal ArticleDOI
TL;DR: Neural Style Geometric Transformation (NSGT) is proposed as a data augmentation technique for Balinese carving recognition, combining neural style transfer and geometric transformations as a solution for small datasets.
Abstract: The preservation of Balinese carving data is a challenge in the recognition of Balinese carvings. Balinese carvings are a cultural heritage found in traditional buildings in Bali. Collecting Balinese carving images from public photographs can be a solution for preserving this cultural heritage. However, poor photographic quality, e.g., skewed shots, can affect recognition results. Previous research on Balinese carving recognition exists but only recognizes predetermined images. We propose Neural Style Geometric Transformation (NSGT), a data augmentation technique for Balinese carving recognition. NSGT combines neural style transfer and geometric transformations as a solution for small datasets. This method provides variations in color, lighting, rotation, rescaling, zoom, and the size of the training dataset to improve recognition performance. We use MobileNet as a feature extractor because it has a small number of parameters, which makes it suitable for mobile devices. Eight scenarios were tested based on image styles and geometric transformations to get the best results. Based on the results, the proposed method can improve accuracy by up to 16.2%.
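The geometric half of NSGT amounts to sampling one transform configuration per training image; a minimal sketch of such a sampler, where the parameter ranges are illustrative assumptions rather than the paper's settings:

```python
import random

def sample_geometric_transform(rng: random.Random) -> dict:
    """Sample one geometric-augmentation configuration of the kind NSGT
    combines with neural style transfer (illustrative ranges)."""
    return {
        "rotation_deg": rng.uniform(-15.0, 15.0),  # compensates skewed shots
        "zoom": rng.uniform(0.9, 1.1),
        "shift_x": rng.uniform(-0.1, 0.1),         # fraction of image width
        "shift_y": rng.uniform(-0.1, 0.1),         # fraction of image height
    }
```

Each sampled configuration would then be applied to a style-transferred copy of the image, multiplying the effective size of the small training set.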

10 citations

Journal ArticleDOI
TL;DR: This study proposes a novel framework for automated breaking of dark web CAPTCHA, DW-GAN, which significantly outperformed the state-of-the-art benchmark methods on all datasets, achieving over 94.4% success rate on a carefully collected real-world dark web dataset.
Abstract: Automated monitoring of dark web (DW) platforms on a large scale is the first step toward developing proactive Cyber Threat Intelligence (CTI). While there are efficient methods for collecting data from the surface web, large-scale dark web data collection is often hindered by anti-crawling measures. In particular, text-based CAPTCHA serves as the most prevalent and prohibitive of these measures in the dark web. Text-based CAPTCHA identifies and blocks automated crawlers by forcing the user to enter a combination of hard-to-recognize alphanumeric characters. In the dark web, CAPTCHA images are meticulously designed with additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing automated CAPTCHA-breaking methods have difficulty overcoming these dark web challenges. As such, solving dark web text-based CAPTCHAs has relied heavily on human involvement, which is labor-intensive and time-consuming. In this study, we propose a novel framework for automated breaking of dark web CAPTCHAs to facilitate dark web data collection. The framework encompasses a novel generative method to recognize dark web text-based CAPTCHAs with noisy backgrounds and variable character length. To eliminate the need for human involvement, the proposed framework utilizes a Generative Adversarial Network (GAN) to counteract dark web background noise and leverages an enhanced character segmentation algorithm to handle CAPTCHA images with variable character length. Our proposed framework, DW-GAN, was systematically evaluated on multiple dark web CAPTCHA testbeds. DW-GAN significantly outperformed the state-of-the-art benchmark methods on all datasets, achieving an over 94.4% success rate on a carefully collected real-world dark web dataset. We further conducted a case study on an emergent Dark Net Marketplace (DNM) to demonstrate that DW-GAN eliminated human involvement by automatically solving CAPTCHA challenges with no more than three attempts. Our research enables the CTI community to develop advanced, large-scale dark web monitoring. We make the DW-GAN code available to the community as an open-source tool on GitHub.
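DW-GAN's enhanced segmentation algorithm is not reproduced in the abstract, but the classic baseline it improves on, vertical-projection segmentation for a variable number of characters, can be sketched as follows (the 0/1 pixel-matrix representation of a denoised, binarized CAPTCHA is an assumption for illustration):

```python
def segment_columns(binary_image):
    """Split a binarized CAPTCHA (list of rows of 0/1 pixels) into
    character spans by finding gaps in the vertical ink projection."""
    width = len(binary_image[0])
    # Total ink per column: zero means a gap between characters.
    column_ink = [sum(row[x] for row in binary_image) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(column_ink):
        if ink and start is None:
            start = x                 # a character begins
        elif not ink and start is not None:
            spans.append((start, x))  # a character ends at the gap
            start = None
    if start is not None:
        spans.append((start, width))  # character touching the right edge
    return spans
```

Because the number of returned spans is data-driven, the approach naturally handles CAPTCHAs with variable character length, though touching or overlapping characters defeat it, which is why DW-GAN first removes background noise with a GAN.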

10 citations

Journal ArticleDOI
TL;DR: This study attempts to exploit the limitations of algorithms to design robust CAPTCHA questions that are easily solvable by humans, and finds that adversarial perturbation is significantly disruptive to algorithms yet friendly to humans.
Abstract: The Turing test was originally proposed to examine whether a machine's behavior is indistinguishable from a human's. The most popular and practical Turing test is CAPTCHA, which discriminates algorithms from humans by posing recognition-style questions. The recent development of deep learning has significantly advanced the capability of algorithms to solve CAPTCHA questions, forcing CAPTCHA designers to increase question complexity. Instead of designing questions difficult for both algorithms and humans, this study attempts to exploit the limitations of algorithms to design robust CAPTCHA questions that are easily solvable by humans. Specifically, our data analysis observes that humans and algorithms exhibit different vulnerabilities to visual distortions: adversarial perturbation is significantly disruptive to algorithms yet friendly to humans. We are thus motivated to employ adversarially perturbed images for robust CAPTCHA design in the context of character-based questions. Four modules, multi-target attack, ensemble adversarial training, differentiable approximation of image preprocessing, and expectation, are proposed to address the characteristics of character-based CAPTCHA cracking. Qualitative and quantitative experimental results demonstrate the effectiveness of the proposed solution. We hope this study can prompt discussion around adversarial attack/defense in CAPTCHA design and also inspire future attempts to employ algorithmic limitations for practical usage.

10 citations