Proceedings Article•DOI•

A Study on Captcha Recognition

27 Aug 2014-pp 395-398
TL;DR: A probability pattern framework is proposed to recognize the target numbers in captcha images; the proposed recognition method achieved an average of 81.05% on more than two thousand captcha cases.
Abstract: Although most captcha systems claim that they can defend against external attacks, many studies continue to examine the captcha systems used on current internet platforms. The security of a captcha image is determined by the complexity of its image structure. Therefore, different captcha recognition methods have been proposed for different captcha images. In this study, the test captcha cases contain many noisy lines and points. We propose a probability pattern framework to recognize the target numbers in the captcha images. In the experiment, the quantitative assessment shows that the proposed recognition method achieved an average of 81.05% for more than two thousand captcha cases.
Citations
Proceedings Article•DOI•
04 Jun 2016
TL;DR: This research finds vulnerabilities in text-based CAPTCHAs: a novel recognition-based segmentation mechanism is applied to crop connected characters, and a sliding-window-based neural network classifier is used to recognize and segment them.
Abstract: Text-based CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is the most widely used mechanism adopted by numerous popular web sites to differentiate between machines and humans. However, due to extensive research carried out by computer vision researchers, it is nowadays vulnerable to automated attacks. Segmentation is the most difficult task in automatic recognition of CAPTCHAs; therefore, contemporary text-based CAPTCHAs try to join the characters together to make them as segmentation-resistant as possible. In this research, we have found vulnerabilities in such CAPTCHAs. A novel mechanism, recognition-based segmentation, is applied to crop such connected characters, and a sliding-window-based neural network classifier is used to recognize and segment them. Experimental results show a 95.5% recognition success rate and a 58.25% segmentation success rate on our dataset of tmall CAPTCHAs. The algorithm was further tested on two other datasets of slightly different implementations, and promising results were achieved.

11 citations
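The abstract above describes recognizing connected characters with a sliding-window classifier. A minimal sketch of that general idea follows. It is not the authors' implementation: the `classify` callable stands in for their neural network and is hypothetical, and the confidence threshold and greedy overlap suppression are assumptions made only for illustration.

```python
import numpy as np

def sliding_windows(image, width, stride):
    """Yield (x, window) pairs of fixed-width column slices across the image."""
    for x in range(0, image.shape[1] - width + 1, stride):
        yield x, image[:, x:x + width]

def recognize_by_sliding_window(image, classify, width, stride=1, threshold=0.5):
    """Classify every window, then keep confident, non-overlapping detections.

    `classify` is a hypothetical callable returning (label, confidence).
    """
    scored = []
    for x, window in sliding_windows(image, width, stride):
        label, confidence = classify(window)
        if confidence >= threshold:
            scored.append((confidence, x, label))

    # Greedy suppression (an assumption): visit windows from most to least
    # confident and keep one only if it does not overlap a window already kept.
    kept = []
    for confidence, x, label in sorted(scored, key=lambda t: t[0], reverse=True):
        if all(abs(x - kept_x) >= width for kept_x, _ in kept):
            kept.append((x, label))
    return sorted(kept)  # left-to-right reading order
```

The kept window positions double as segmentation boundaries, which is why recognition and segmentation can be produced by the same pass.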

Journal Article•DOI•
TL;DR: This study introduces an efficient CNN model that uses attached binary images to recognize CAPTCHAs and achieves experimental results that reveal the strength of the model in CAPTCHA character recognition.
Abstract: Websites can increase their security and prevent harmful Internet attacks by providing CAPTCHA verification for determining whether end-user is a human or a robot. Text-based CAPTCHA is the most common and designed to be easily recognized by humans and difficult to identify by machines or robots. However, with the dramatic advancements in deep learning, it becomes much easier to build convolutional neural network (CNN) models that can efficiently recognize text-based CAPTCHAs. In this study, we introduce an efficient CNN model that uses attached binary images to recognize CAPTCHAs. By making a specific number of copies of the input CAPTCHA image equal to the number of characters in that input CAPTCHA image and attaching distinct binary images to each copy, we build a new CNN model that can recognize CAPTCHAs effectively. The model has a simple structure and small storage size and does not require the segmentation of CAPTCHAs into individual characters. After training and testing the proposed CAPTCHA recognition CNN model, the achieved experimental results reveal the strength of the model in CAPTCHA character recognition.

11 citations

Patent•
Liu Wei, Vinay Damodar Shet, Ying Liu, Aaron Malenfant, Haidong Shao, Hongshu Liao, Jiexing Gu, Edison Tan •
03 Dec 2015
TL;DR: In this article, the first image can be provided to a plurality of user devices in a verification challenge, and the verification challenge can include one or more instructions to be presented to a user of each device.
Abstract: Systems and methods of determining image characteristics are provided. More particularly, a first image having an unknown characteristic can be obtained. The first image can be provided to a plurality of user devices in a verification challenge. The verification challenge can include one or more instructions to be presented to a user of each user device. The instructions being determined based at least in part on the first image. User responses can be received, and an unknown characteristic of the first image can be determined based at least in part on the received responses. Subsequent to determining the unknown characteristic of the first image, one or more machine learning models can be trained based at least in part on the determined characteristic.

7 citations

Journal Article•DOI•
TL;DR: This study split four-character CAPTCHA images for the individual characters with a 2-pixel margin around the edges of a new training dataset, and proposed an efficient and accurate Depth-wise Separable Convolutional Neural Network for breaking text-based CAPTCHAs.
Abstract: Cybersecurity practitioners generate Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) as a form of security mechanism in website applications, in order to differentiate between human end-users and machine bots. They tend to use standard security to implement CAPTCHAs in order to prevent hackers from writing malicious automated programs to make false website registrations and to restrict them from stealing end-users’ private information. Among the categories of CAPTCHAs, the text-based CAPTCHA is the most widely used. However, the evolution of deep learning has been so dramatic that tasks previously thought not easily addressable by computers, and used as CAPTCHAs to prevent spam, are now possible to break. The workflow of CAPTCHA breaking is a combination of efforts, approaches, and the development of a computation-efficient Convolutional Neural Network (CNN) model that attempts to increase accuracy. In contrast to breaking whole CAPTCHA images simultaneously, this study split four-character CAPTCHA images into individual characters, with a 2-pixel margin around the edges, for a new training dataset, and then proposed an efficient and accurate Depth-wise Separable Convolutional Neural Network for breaking text-based CAPTCHAs. Most importantly, to the best of our knowledge, this is the first CAPTCHA breaking study to use the Depth-wise Separable Convolution layer to build an efficient CNN model to break text-based CAPTCHAs. We have evaluated and compared the performance of our proposed model to that of fine-tuning other popular CNN image recognition architectures on the generated CAPTCHA image dataset. In real time, our proposed model used less time to break the text-based CAPTCHAs, with an accuracy of more than 99% on the testing dataset. We observed that our proposed CNN model has efficiently improved the CAPTCHA breaking accuracy and streamlined the structure of the CAPTCHA breaking network as compared to other CAPTCHA breaking techniques.

5 citations
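The efficiency claim in the abstract above rests on a standard property of depth-wise separable convolutions: a k x k depth-wise step followed by a 1 x 1 point-wise step needs far fewer weights than one standard k x k convolution. The short sketch below works through that arithmetic. The layer sizes are illustrative assumptions, not values from the paper, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depth-wise convolution plus a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Illustrative layer sizes only (assumed, not taken from the paper).
k, c_in, c_out = 3, 32, 64
std = standard_conv_params(k, c_in, c_out)        # 18432
sep = depthwise_separable_params(k, c_in, c_out)  # 2336
print(std, sep, round(std / sep, 2))              # 18432 2336 7.89
```

For these assumed sizes the separable layer uses roughly an eighth of the weights, which is the kind of saving that makes the resulting model smaller and faster.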

Journal Article•DOI•
31 Aug 2017
TL;DR: A recognition method for CAPTCHAs with adhesion characters is proposed, which effectively improves the segmentation quality of adhesive-character CAPTCHAs with complex backgrounds and has a higher recognition rate.
Abstract: CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) emerged to better protect network security, and research on CAPTCHA recognition helps to broaden design ideas and to fix loopholes in the original designs. To improve the CAPTCHA recognition rate, we propose a recognition method for CAPTCHAs with adhesion characters, which effectively improves the segmentation quality of adhesive-character CAPTCHAs with complex backgrounds. First, the image undergoes basic preprocessing, and connected-area noise reduction is used to denoise it further. Then a projection histogram is used to cut the adhesion characters, and finally neural network training is used to obtain the final recognition results. The method can deal with images with complex backgrounds, basically achieves zero-noise cutting, and cuts the adhesion characters better; the later training process is relatively simple. The experimental results show that this method is practical and has a higher recognition rate.

3 citations
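The abstract above mentions cutting adhesion (touching) characters with a projection histogram. A minimal sketch of one common way to do this follows. It assumes a binarized image in which ink pixels are 1 and background pixels are 0, and it assumes a `max_char_width` parameter for deciding when a segment holds more than one character. Both are illustrative choices, not details from the paper.

```python
import numpy as np

def split_by_projection(binary, max_char_width):
    """Return (start, end) column ranges, cutting touching characters at projection minima."""
    if max_char_width < 2:
        raise ValueError("max_char_width must be at least 2")
    projection = binary.sum(axis=0)  # ink pixels per column

    # Step 1: split wherever a column contains no ink at all.
    segments, start = [], None
    for x in range(len(projection) + 1):
        ink = x < len(projection) and projection[x] > 0
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            segments.append((start, x))
            start = None

    # Step 2: a segment wider than one character is assumed to hold touching
    # characters; cut it at the thinnest interior column and re-check both halves.
    result, stack = [], segments[::-1]
    while stack:
        lo, hi = stack.pop()
        if hi - lo <= max_char_width:
            result.append((lo, hi))
            continue
        margin = max(1, (hi - lo) // 4)  # keep the cut away from the segment edges
        cut = lo + margin + int(np.argmin(projection[lo + margin:hi - margin]))
        stack.append((cut, hi))
        stack.append((lo, cut))
    return result
```

Each returned range can then be cropped with `binary[:, start:end]` and passed to a character classifier.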


Cites background from "A Study on Captcha Recognition"

  • ...05% for more than two thousand CAPTCHA cases[17]....

References
Journal Article•DOI•
TL;DR: Results clearly show that the proposed switching median filter substantially outperforms all existing median-based filters, in terms of suppressing impulse noise while preserving image details, and yet, the proposed BDND is algorithmically simple, suitable for real-time implementation and application.
Abstract: A novel switching median filter incorporating a powerful impulse noise detection method, called the boundary discriminative noise detection (BDND), is proposed in this paper for effectively denoising extremely corrupted images. To determine whether the current pixel is corrupted, the proposed BDND algorithm first classifies the pixels of a localized window, centered on the current pixel, into three groups: lower-intensity impulse noise, uncorrupted pixels, and higher-intensity impulse noise. The center pixel is then considered "uncorrupted" if it belongs to the "uncorrupted" pixel group, and "corrupted" otherwise. For that, the two boundaries that discriminate these three groups must be accurately determined to yield a very high noise detection accuracy; in our case, achieving a zero miss-detection rate while maintaining a fairly low false-alarm rate, even up to 70% noise corruption. Four noise models are considered for performance evaluation. Extensive simulation results conducted on both monochrome and color images under a wide range (from 10% to 90%) of noise corruption clearly show that our proposed switching median filter substantially outperforms all existing median-based filters in terms of suppressing impulse noise while preserving image details, and yet the proposed BDND is algorithmically simple, suitable for real-time implementation and application.

614 citations
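The switching principle described in the abstract above (detect first, then filter only the pixels judged corrupted) can be sketched as follows. This is a deliberately simplified stand-in and not BDND itself: instead of estimating the two boundaries from the local window, it assumes fixed salt-and-pepper values `low` and `high`, and it uses a fixed 3 x 3 window on a 2-D grayscale image.

```python
import numpy as np

def switching_median_filter(image, low=0, high=255):
    """Replace only pixels flagged as impulse noise with a local median."""
    img = np.asarray(image)
    padded = np.pad(img, 1, mode="edge")  # replicate borders so every pixel has a 3 x 3 window
    out = img.copy()
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            value = img[r, c]
            if value != low and value != high:
                continue  # detector says "uncorrupted": leave the pixel untouched
            window = padded[r:r + 3, c:c + 3].ravel()
            clean = window[(window != low) & (window != high)]
            # Fall back to the full window if every neighbour also looks corrupted.
            out[r, c] = np.median(clean if clean.size else window)
    return out
```

Because uncorrupted pixels are never modified, image detail is preserved wherever the detector is right, which is the property the abstract emphasizes.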

Journal Article•DOI•
TL;DR: A novel switching-based median filter with incorporation of fuzzy-set concept, called the noise adaptive soft-switching median (NASM) filter, to achieve much improved filtering performance in terms of effectiveness in removing impulse noise while preserving signal details and robustness in combating noise density variations.
Abstract: Existing state-of-the-art switching-based median filters are commonly found to be nonadaptive to noise density variations and prone to misclassifying pixel characteristics at high noise density interference. This reveals the critical need of having a sophisticated switching scheme and an adaptive weighted median filter. We propose a novel switching-based median filter with incorporation of fuzzy-set concept, called the noise adaptive soft-switching median (NASM) filter, to achieve much improved filtering performance in terms of effectiveness in removing impulse noise while preserving signal details and robustness in combating noise density variations. The proposed NASM filter consists of two stages. A soft-switching noise-detection scheme is developed to classify each pixel to be uncorrupted pixel, isolated impulse noise, nonisolated impulse noise or image object's edge pixel. "No filtering" (or identity filter), standard median (SM) filter or our developed fuzzy weighted median (FWM) filter will then be employed according to the respective characteristic type identified. Experimental results show that our NASM filter impressively outperforms other techniques by achieving fairly close performance to that of ideal-switching median filter across a wide range of noise densities, ranging from 10% to 70%.

598 citations

Journal Article•DOI•
TL;DR: Examining some CAPTCHAs to determine whether their use of color negatively affects their usability, security, or both.
Abstract: Most user interfaces use color, which can greatly enhance their design. Because the use of color is typically a usability issue, it rarely causes security failures. However, using color when designing CAPTCHAs, a standard security technology that many commercial websites apply widely, can have an impact on usability and interesting but critical implications for security. Here, the authors examine some CAPTCHAs to determine whether their use of color negatively affects their usability, security, or both.

70 citations

Journal Article•DOI•
Jeff Yan, A.S. El Ahmad•
01 Jul 2009
TL;DR: A simple but novel attack can break some CAPTCHAs with a success rate higher than 90 percent and the authors used simple pattern recognition algorithms to exploit fatal design errors.
Abstract: A simple but novel attack can break some CAPTCHAs with a success rate higher than 90 percent. In contrast to early work that relied on sophisticated computer vision or machine learning techniques, the authors used simple pattern recognition algorithms to exploit fatal design errors.

51 citations

Proceedings Article•DOI•
05 Jun 2011
TL;DR: A performance and usability study of iCAPTCHA shows the proposed scheme is effective in attack detection, is easy to use, and is a viable replacement of the current text-based CAPTCHA.
Abstract: CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) is a simple test that is easy for humans but extremely difficult for computers to solve. CAPTCHA has been widely used in commercial websites such as web-based email providers, TicketMaster, GoDaddy, and Facebook to protect their resources from attacks initiated by automatic scripts. By design, CAPTCHA is unable to distinguish between a human attacker and a legitimate human user. This leaves websites using CAPTCHA vulnerable to 3rd party human CAPTCHA attacks. In order to demonstrate the vulnerabilities in existing CAPTCHA technologies we develop a new streamlined human-based CAPTCHA attack that uses Instant Messenger infrastructure. Facing this serious human-based attack threat, we then present a new defense system called Interactive CAPTCHA (iCAPTCHA), which is the next generation of CAPTCHA technology providing the first steps toward defending against 3rd party human CAPTCHA attacks. iCAPTCHA requires a user to solve a CAPTCHA test via a series of user interactions. The multi-step back-and-forth traffic between client and server amplifies the statistical timing difference between a legitimate user and a human solver, which enables better attack detection performance. A performance and usability study of iCAPTCHA shows the proposed scheme is effective in attack detection, is easy to use, and is a viable replacement of the current text-based CAPTCHA.

46 citations