
University of Wollongong
Research Online
Faculty of Engineering and Information Sciences - Papers

Publication Details
Nguyen, V. Duc., Chow, Y. & Susilo, W., A CAPTCHA scheme based on the identification of character locations, Lecture Notes in Computer Science.

A CAPTCHA Scheme based on the
Identification of Character Locations
Vu Duc Nguyen, Yang-Wai Chow and Willy Susilo
Centre for Computer and Information Security Research,
School of Computer Science and Software Engineering,
University of Wollongong, Australia
{vdn108, caseyc, wsusilo}@uow.edu.au
Abstract. CAPTCHAs are a standard security mechanism used on many
websites to protect online services against abuse by automated programs,
or bots. The purpose of a CAPTCHA is to determine whether an online
transaction is being carried out by a human or a bot. Unfortunately, many
existing CAPTCHA schemes have been found to be vulnerable to automated
attacks. It is widely accepted that the state of the art in text-based
CAPTCHA design requires that a CAPTCHA be resistant to segmentation. In
this paper, we examine CAPTCHA usability issues and current segmentation
techniques that have been used to attack various CAPTCHA schemes. We then
introduce a new CAPTCHA scheme designed around these usability and
segmentation considerations. A further goal was to design a text-based
CAPTCHA scheme that can easily be used on increasingly pervasive
touch-screen devices, without the need for keyboard input. This paper
also examines the usability and robustness of the proposed CAPTCHA scheme.
Keywords: text-based CAPTCHA, segmentation resistance, optical character
recognition
This work is supported by ARC Future Fellowship FT0991397.
1 Introduction
CAPTCHAs (Completely Automated Public Turing test to tell Computers and
Humans Apart) are essentially automated reverse Turing tests that are
commonly used by online services to determine whether an online
transaction is being carried out by a human or an automated program, i.e.
a bot [24]. Since their inception, many diverse CAPTCHA schemes have been
proposed, and CAPTCHAs have become a standard Internet security mechanism
for deterring automated attacks by bots and other malicious programs. Of
the different types of CAPTCHAs (e.g. text-based, image-based,
audio-based) currently used in practice, text-based CAPTCHAs are the most
prevalent. Chellapilla et al. [12] attribute the popularity and
pervasiveness of text-based CAPTCHAs to their human friendliness,
intuitiveness, ease of use, low implementation cost, etc. In general, a
traditional text-based CAPTCHA challenge consists of a word or a random
sequence of characters, which may consist of letters and/or digits,
embedded within an image. The user’s task is to solve the CAPTCHA
challenge by entering the appropriate sequence of characters in the
correct order.
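As a rough illustration of the traditional challenge just described, the sketch below generates a random sequence of letters and digits and checks a typed response against it. The function names, the six-character length, the character set and the case-insensitive comparison are all assumptions made for illustration; rendering and distorting the text as an image is omitted.

```python
import secrets
import string

# Assumed character set: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


def make_challenge(length: int = 6) -> str:
    """Return a random character sequence that would be rendered as an image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_response(challenge: str, response: str) -> bool:
    """Accept only if the same characters are entered in the same order.

    Ignoring case and surrounding whitespace is an assumption of this sketch.
    """
    return response.strip().upper() == challenge
```

Because the check compares whole strings, a response with the right characters in the wrong order is rejected.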
Unfortunately, while numerous CAPTCHA schemes are currently deployed on a
vast number of websites, many of these schemes have been found to be
insecure. The vulnerability of these schemes stems from various design
flaws that can be exploited to break them. Over the years, researchers
have documented many techniques that can be used to break a variety of
CAPTCHA schemes at high success rates [1, 3, 4, 7, 13, 16, 19, 20, 21,
28, 29]. Furthermore, attacks against CAPTCHA schemes are not limited to
traditional text-based CAPTCHAs, as techniques to break other forms of
CAPTCHAs have also been documented. These include techniques for breaking
animated CAPTCHAs [23, 27], 3D-based CAPTCHAs [22], image-based CAPTCHAs
[31], audio-based CAPTCHAs [5], etc. As such, the design of a CAPTCHA
scheme that is robust against automated attacks is an important and open
research problem. In addition, the challenge of designing a secure CAPTCHA
scheme is further complicated by the fact that the resulting CAPTCHA must
not only be secure against automated attacks but also be easily usable by
a human.
This paper presents the design of a new CAPTCHA scheme along with a
discussion of the security and usability of the proposed scheme. It is
widely accepted that the state of the art in CAPTCHA design requires that a
CAPTCHA be segmentation-resistant [1, 12], since once a CAPTCHA can be
segmented into its constituent characters, the scheme is essentially deemed
to be broken [11]. In this paper, we first examine CAPTCHA usability issues
and current segmentation techniques that have been used to attack a variety
of existing CAPTCHAs, in order to identify the factors that must be
considered when designing a robust CAPTCHA scheme. This is followed by a
discussion of the design of our proposed scheme in relation to these
usability and segmentation considerations. In addition, this paper presents
the results of a user study that was conducted to ascertain the usability of
the proposed CAPTCHA scheme, followed by an analysis of the robustness of
the scheme.
Our Contributions. In this paper, we present and discuss the design of a
new text-based CAPTCHA scheme that is robust against current segmentation
techniques. The proposed CAPTCHA scheme is also usable on touch-screen
interfaces, without the need to enter text via a physical or on-screen
keyboard. Our proposed approach changes the traditional challenge posed by
conventional text-based CAPTCHAs, in which the user’s task is to answer the
question “What is the text?”, into the question “Where is the text?”. Hence,
the user’s task is to recognize and identify the locations of characters in
the CAPTCHA challenge. Furthermore, this paper outlines and examines the
various usability and security issues that must be considered in the design
of a robust CAPTCHA scheme.
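To make the “Where is the text?” idea concrete, the following is a minimal sketch of how a server might check a set of taps against the regions occupied by the target characters. It is not the verification procedure of the proposed scheme: the `Box` type, the rectangular regions, the `pad` tolerance and the rule that every tap must hit a distinct target are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region occupied by one target character (hypothetical)."""
    x: float
    y: float
    w: float
    h: float

    def contains(self, px: float, py: float, pad: float = 0.0) -> bool:
        """True if the point lies inside the box, enlarged by pad on each side."""
        return (self.x - pad <= px <= self.x + self.w + pad
                and self.y - pad <= py <= self.y + self.h + pad)


def check_taps(targets, taps, pad: float = 4.0) -> bool:
    """Accept when every tap hits a distinct target and every target is hit."""
    if len(taps) != len(targets):
        return False
    unmatched = list(targets)
    for px, py in taps:
        # Greedy matching: take the first still-unmatched target the tap hits.
        hit = next((b for b in unmatched if b.contains(px, py, pad)), None)
        if hit is None:
            return False
        unmatched.remove(hit)
    return not unmatched
```

The greedy matching is adequate only when the padded target regions do not overlap; overlapping regions would need a proper assignment step. The order of taps is deliberately ignored here, which is another assumption of the sketch.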
2 Background
2.1 Usability versus Security
The fundamental requirement of a practical CAPTCHA scheme is that humans
must be able to solve the CAPTCHA challenges with a high degree of success,
while the likelihood that a computer program can correctly solve them must
be very small. This tradeoff between the usability and security of a CAPTCHA
scheme is difficult to balance. Security considerations push designers to
increase the difficulty of the CAPTCHA scheme, while usability requirements
compel them to make the scheme only as difficult as it needs to be to
remain effective in deterring automated abuse. These conflicting demands
have resulted in the ongoing arms race between CAPTCHA designers and those
who try to break their schemes [10, 15].
The design of a robust CAPTCHA must capitalize on the difference between natural
human ability and the capabilities of current computer programs [10]. This is
a challenging task because on one hand, computing technology and algorithms
that can be used to solve CAPTCHAs are constantly evolving and improving
(e.g. Optical Character Recognition (OCR) software), while on the other hand,
humans must rely on their inherent abilities and are unlikely to get better at
solving CAPTCHAs. In addition, it has been shown that several key features
that are commonly employed to increase the usability of CAPTCHA schemes
can easily be exploited by computer programs.
The use of color is a major factor that has to be considered in CAPTCHA
design. Color is used in CAPTCHAs for a variety of reasons. From a usability
perspective, color is a strong attention-getting mechanism; it is appealing
and can make CAPTCHA challenges interesting; and appropriate use of color
can facilitate recognition and comprehension of a CAPTCHA, among other
benefits [2]. However, it has been shown that the imprudent use of color can
have a negative impact on both CAPTCHA usability and security [1, 30].
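One way color can weaken security, offered here only as a hedged illustration and not as a method taken from [1, 30], is when each character is drawn in its own distinct color: grouping pixels by color then separates the characters without any shape analysis. The sketch below assumes a toy image given as rows of color labels, with `segment_by_color` a hypothetical helper.

```python
def segment_by_color(pixels, background):
    """Group non-background pixels by exact color.

    Returns {color: (x0, y0, x1, y1)}, one bounding box per color. This only
    segments characters under the assumption that each character uses its
    own distinct color.
    """
    boxes = {}
    for y, row in enumerate(pixels):
        for x, color in enumerate(row):
            if color == background:
                continue
            x0, y0, x1, y1 = boxes.get(color, (x, y, x, y))
            boxes[color] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


# Toy image: '.' is background, 'R' and 'G' are two differently colored glyphs.
image = [
    "..RR.GG",
    "..R..G.",
    ".....G.",
]
boxes = segment_by_color(image, ".")
```

In this toy case the two glyphs are recovered as separate bounding boxes, which is exactly the segmentation step a robust scheme is meant to prevent.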
To aid usability, text-based CAPTCHA challenges that are based on
dictionary words are intuitive and easier for humans to solve because humans
find familiar text easier to perceive and read [25]. However, CAPTCHA
challenges that are based on language models are susceptible to dictionary
attacks. Rather than trying to recognize individual characters, which may be
difficult if the characters are overly distorted and/or overlapping,
researchers have successfully used holistic approaches to recognize entire
words for CAPTCHA schemes that are based on language models [4, 21].
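The weakness of dictionary-based challenges can be sketched at the string level: even if per-character recognition is unreliable, an attacker who knows the word list can snap a noisy reading to the closest dictionary word. The example below uses the standard library's `difflib.get_close_matches` for this; it is a simplified stand-in and not the holistic recognition methods of [4, 21]. The word list, the `holistic_guess` name and the 0.6 cutoff are assumptions.

```python
import difflib

# Assumed word list known to the attacker.
WORDS = ["security", "secretly", "sincerity", "captcha"]


def holistic_guess(noisy_reading, dictionary, cutoff=0.6):
    """Return the dictionary word most similar to a noisy reading, or None."""
    matches = difflib.get_close_matches(
        noisy_reading.lower(), dictionary, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# One character misread ('u' read as 'v'), yet the whole word is recovered.
guess = holistic_guess("SECVRITY", WORDS)
```

A challenge built from random characters offers no such word list to fall back on, which is the security argument against language-based challenges.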
Instead of using actual dictionary words, it is possible to take advantage of
text familiarity using “language-like” strings. Phonetic text or Markov dictionary
strings are pronounceable strings that are not words of any language.
Experiments have shown that humans perform better when solving CAPTCHAs with
