What are the most secure anti-segmentation techniques to date?

Collapsing techniques, such as crowding characters together and overlapping them, are considered to be the most secure anti-segmentation approach to date.

Is connecting characters with lines an effective anti-segmentation technique?

Using lines to connect characters as a segmentation-resistant technique has been found to be inadequate in preventing the resulting CAPTCHA from being segmented.

What average solving times did Bursztein et al. report?

In a large-scale evaluation study in which Bursztein et al. [6] tested a variety of CAPTCHA schemes, they reported an average solving time of 9.8 seconds for image CAPTCHAs and 28.4 seconds for audio CAPTCHAs.

Why was the color version of the CAPTCHA implemented?

The colored version of their scheme was implemented to ascertain whether or not it would facilitate human visual perception, by making it easier for humans to distinguish characters from the background, instead of having to solely rely on the outlines of characters.

Which properties of the characters can be randomized?

For all characters in the image, their rotation angles, positions and sizes, as well as the number of characters, can all be randomized within a certain range of values.
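A minimal sketch of this kind of per-character randomization is given below. The ranges, the canvas size and the `random_layout` helper are illustrative assumptions; the excerpt above does not state the values or the procedure used in the actual scheme.

```python
import random
import string

# All ranges below are illustrative assumptions, not values from the paper.
COUNT_RANGE = (4, 8)          # number of characters per challenge
ANGLE_RANGE = (-30.0, 30.0)   # rotation in degrees
SIZE_RANGE = (24, 40)         # character size in pixels
IMAGE_SIZE = (300, 100)       # canvas width and height in pixels


def random_layout(rng):
    """Return a list of per-character parameter dicts for one challenge."""
    width, height = IMAGE_SIZE
    layout = []
    for _ in range(rng.randint(*COUNT_RANGE)):
        size = rng.randint(*SIZE_RANGE)
        layout.append({
            "char": rng.choice(string.ascii_uppercase),
            "angle": rng.uniform(*ANGLE_RANGE),
            "size": size,
            # Keep the character's top-left corner inside the canvas.
            "x": rng.randint(0, width - size),
            "y": rng.randint(0, height - size),
        })
    return layout


# A seeded generator makes the sketch reproducible for testing.
layout = random_layout(random.Random(0))
```

Passing the random generator in as a parameter keeps the sketch testable; a deployed generator would presumably draw from a cryptographically secure source instead of a seeded one.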

What are the main ways of achieving this?

There are three main ways of achieving this; namely, by using a complex background image, by using a background with very similar colors to the text, or by adding noise.

Are existing schemes that adopt anti-segmentation mechanisms secure?

While it is widely accepted that a robust CAPTCHA scheme must be designed to be segmentation-resistant, many existing schemes that adopt anti-segmentation mechanisms have in fact been found to be insecure.

(Open Access) A CAPTCHA Scheme Based on the Identification of Character Locations (2014) | Vu Duc Nguyen

Q: What are the contributions in "A CAPTCHA Scheme Based on the Identification of Character Locations"?

In this paper, the authors examine CAPTCHA usability issues and current segmentation techniques that have been used to attack various CAPTCHA schemes. The authors then introduce the design of a new CAPTCHA scheme that was designed based on these usability and segmentation considerations. This paper also examines the usability and robustness of the proposed CAPTCHA scheme.

Q: What is the most common type of CAPTCHA?

Of the different types of CAPTCHAs (e.g. text-based, image-based, audio-based) that are currently used in practice, text-based CAPTCHAs are the most prevalent form in use.

Q: What is the purpose of this paper?

In this paper, the authors present and discuss the design of a new text-based CAPTCHA scheme that is robust against current segmentation techniques.

University of Wollongong

Research Online

Faculty of Engineering and Information Sciences - Papers

A CAPTCHA scheme based on the identification of character locations

Vu Duc Nguyen, Yang-Wai Chow and Willy Susilo (University of Wollongong)

Keywords: scheme, identification, character, captcha, locations

Disciplines: Engineering | Science and Technology Studies

Publication Details: Nguyen, V. Duc, Chow, Y. & Susilo, W. (2014). A CAPTCHA scheme based on the identification of character locations. Lecture Notes in Computer Science.

A CAPTCHA Scheme based on the Identification of Character Locations

Vu Duc Nguyen, Yang-Wai Chow and Willy Susilo⋆

Centre for Computer and Information Security Research, School of Computer Science and Software Engineering, University of Wollongong, Australia

{vdn108, caseyc, wsusilo}@uow.edu.au

Abstract. CAPTCHAs are a standard security mechanism used on many websites to protect online services against abuse by automated programs, or bots. The purpose of a CAPTCHA is to distinguish whether an online transaction is being carried out by a human or a bot. Unfortunately, to date many existing CAPTCHA schemes have been found to be vulnerable to automated attacks. It is widely accepted that the state of the art in text-based CAPTCHA design requires that a CAPTCHA be resistant against segmentation. In this paper, we examine CAPTCHA usability issues and current segmentation techniques that have been used to attack various CAPTCHA schemes. We then introduce the design of a new CAPTCHA scheme that was designed based on these usability and segmentation considerations. Our goal was to also design a text-based CAPTCHA scheme that can easily be used on increasingly pervasive touch-screen devices, without the need for keyboard input. This paper also examines the usability and robustness of the proposed CAPTCHA scheme.

Keywords: text-based CAPTCHA, segmentation resistance, optical character recognition

1 Introduction

CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) are essentially automated reverse Turing tests that are commonly used by online services to distinguish whether an online transaction is being carried out by a human or an automated program, i.e. a bot [24]. Since their inception, many diverse CAPTCHA schemes have been proposed, and to date, CAPTCHAs have become a standard Internet security mechanism for deterring automated attacks by bots and other malicious programs. Of the different types of CAPTCHAs (e.g. text-based, image-based, audio-based) that are currently used in practice, text-based CAPTCHAs are the most prevalent form in use. Chellapilla et al. [12] attribute the popularity and pervasiveness of text-based CAPTCHAs to their human friendliness, intuitiveness, ease of use, low implementation cost, etc. In general, a traditional text-based CAPTCHA challenge consists of a word or a random sequence of characters, which may consist of letters and/or digits, that is embedded within an image. The user's task is to solve the CAPTCHA challenge by entering the appropriate sequence of characters in the correct order.

⋆ This work is supported by ARC Future Fellowship FT0991397.

Unfortunately, while there are numerous existing CAPTCHA schemes that are currently deployed on a vast number of websites, many of these schemes have been found to be insecure. The vulnerability of these schemes stems from various design flaws that can be exploited to break these CAPTCHAs. Over the years, researchers have documented many techniques that can be used to break a variety of CAPTCHA schemes at high success rates [1, 3, 4, 7, 13, 16, 19, 20, 21, 28, 29]. Furthermore, attacks against CAPTCHA schemes are not limited to traditional text-based CAPTCHAs, as techniques to break other forms of CAPTCHAs have also been documented. These include techniques for breaking animated CAPTCHAs [23, 27], 3D-based CAPTCHAs [22], image-based CAPTCHAs [31], audio-based CAPTCHAs [5], etc. As such, the design of a CAPTCHA scheme that is robust against automated attacks is an important and open research problem. In addition, the challenge of designing a secure CAPTCHA scheme is further complicated by the fact that the resulting CAPTCHA must not only be secure against automated attacks, but must also be easily usable by a human.

This paper presents the design of a new CAPTCHA scheme along with a discussion on the security and usability of the proposed scheme. It is widely accepted that the state of the art in CAPTCHA design requires that a CAPTCHA be segmentation-resistant [1, 12], as once a CAPTCHA can be segmented into its constituent characters, the scheme is essentially deemed to be broken [11]. In this paper, we first examine CAPTCHA usability issues and current segmentation techniques that have been used to attack a variety of existing CAPTCHAs, in order to identify the various factors that must be considered when designing a robust CAPTCHA scheme. This is followed by a discussion on the design of our proposed scheme in relation to these usability and segmentation considerations. In addition, this paper presents the results of a user study that was conducted to ascertain the usability of the proposed CAPTCHA scheme, followed by an analysis of the robustness of the scheme.

Our Contributions. In this paper, we present and discuss the design of a new text-based CAPTCHA scheme that is robust against current segmentation techniques. The proposed CAPTCHA scheme is also usable on touch-screen interfaces, without the need to enter text via a physical or on-screen keyboard. Our proposed approach alters the traditional challenge posed by conventional text-based CAPTCHAs, in which the user's task is to answer the question of "What is the text?", into a question of "Where is the text?". Hence, the user's task is to recognize and identify the locations of characters in the CAPTCHA challenge. Furthermore, this paper outlines and examines the various usability and security issues that must be considered in the design of a robust CAPTCHA scheme.
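To make the change of task concrete, the sketch below checks a "Where is the text?" response. It assumes that the server keeps a bounding box for every character the user is asked to locate and that the user answers with one tap per character. The names `Box` and `taps_match`, and the one-tap-per-box rule, are hypothetical and are not taken from the scheme's actual verification procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one target character (assumed layout)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def taps_match(targets, taps):
    """Accept only if each target box receives exactly one tap and no tap misses.

    Assumes the target boxes do not overlap, so greedy matching is sufficient.
    """
    if len(taps) != len(targets):
        return False
    unmatched = list(targets)
    for x, y in taps:
        hit = next((box for box in unmatched if box.contains(x, y)), None)
        if hit is None:
            return False  # tap landed outside every remaining target
        unmatched.remove(hit)
    return True
```

Under these assumptions the order of the taps does not matter. A scheme that also required a particular order would compare taps and targets pairwise instead.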

2 Background

2.1 Usability versus Security

The fundamental requirement of a practical CAPTCHA scheme is that humans must be able to solve the CAPTCHA challenges with a high degree of success, while the likelihood that a computer program can correctly solve them must be very small. This tradeoff between the usability and security of a CAPTCHA scheme is hard to balance. Security considerations push designers to increase the difficulty of the CAPTCHA scheme, while usability requirements compel them to make the scheme only as difficult as it needs to be to remain effective in deterring automated abuse. These conflicting demands have resulted in the ongoing arms race between CAPTCHA designers and those who try to break them [10, 15].

The design of a robust CAPTCHA must capitalize on the difference between natural human ability and the capabilities of current computer programs [10]. This is a challenging task because, on the one hand, the computing technology and algorithms that can be used to solve CAPTCHAs are constantly evolving and improving (e.g. Optical Character Recognition (OCR) software), while on the other hand, humans must rely on their inherent abilities and are unlikely to get better at solving CAPTCHAs. In addition, it has been shown that several key features that are commonly employed to increase the usability of CAPTCHA schemes can easily be exploited by computer programs.

The use of color is a major factor that has to be considered in CAPTCHA design. Color is used in CAPTCHAs for a variety of reasons. From a usability perspective, color is a strong attention-getting mechanism; it is appealing and can make CAPTCHA challenges interesting, and appropriate use of color can facilitate recognition and comprehension of a CAPTCHA [2]. However, it has been shown that the imprudent use of color can have a negative impact on both CAPTCHA usability and security [1, 30].

To aid usability, text-based CAPTCHA challenges can be based on dictionary words, which are intuitive and easier for humans to solve because humans find familiar text easier to perceive and read [25]. However, CAPTCHA challenges that are based on language models are susceptible to dictionary attacks. Instead of trying to recognize individual characters, which may be difficult if the characters are overly distorted and/or overlapping, researchers have successfully used holistic approaches to recognize entire words for CAPTCHA schemes that are based on language models [4, 21].
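The weakness of dictionary-based challenges can be illustrated with a deliberately simplified sketch. It assumes that an attacker already has a noisy per-character guess from some recognizer, and it snaps that guess to the closest entry in a word list using `difflib.get_close_matches` from the Python standard library. The word list and the `snap_to_dictionary` helper are hypothetical, and this is string matching applied after recognition, not the holistic word-recognition methods of [4, 21].

```python
import difflib

# Hypothetical word list; a real attack would use a full dictionary.
WORDS = ["planet", "garden", "silver", "window", "market"]


def snap_to_dictionary(noisy_guess, words=WORDS, cutoff=0.6):
    """Return the dictionary word closest to a noisy guess, or None."""
    matches = difflib.get_close_matches(noisy_guess, words, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Even when one or two characters are misread, the small space of valid answers lets the attacker recover the intended word, which is why a challenge drawn from a language model gives up some security in exchange for readability.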

Instead of using actual dictionary words, it is possible to take advantage of text familiarity using "language-like" strings. Phonetic text or Markov dictionary strings are pronounceable strings that are not words of any language. Experiments have shown that humans perform better when solving CAPTCHAs with
