Proceedings Article

Asirra: a CAPTCHA that exploits interest-aligned manual image categorization.

01 Oct 2007, pp. 366-374
TL;DR: A CAPTCHA that asks users to identify cats out of a set of 12 photographs of both cats and dogs, and two novel algorithms for amplifying the skill gap between humans and computers that can be used on many existing CAPTCHAs are described.
Abstract: We present Asirra (Figure 1), a CAPTCHA that asks users to identify cats out of a set of 12 photographs of both cats and dogs. Asirra is easy for users; user studies indicate it can be solved by humans 99.6% of the time in under 30 seconds. Barring a major advance in machine vision, we expect computers will have no better than a 1/54,000 chance of solving it. Asirra’s image database is provided by a novel, mutually beneficial partnership with Petfinder.com. In exchange for the use of their three million images, we display an “adopt me” link beneath each one, promoting Petfinder’s primary mission of finding homes for homeless animals. We describe the design of Asirra, discuss threats to its security, and report early deployment experiences. We also describe two novel algorithms for amplifying the skill gap between humans and computers that can be used on many existing CAPTCHAs.
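To put the quoted odds in context, here is a minimal sketch of how per-image accuracy compounds over a 12-image challenge. It assumes a bot classifies each image independently with the same accuracy; that independence assumption and the function name `challenge_pass_probability` are illustrative and not taken from the paper. The sketch does not reproduce the 1/54,000 figure, whose derivation the abstract does not give.

```python
def challenge_pass_probability(per_image_accuracy: float, num_images: int = 12) -> float:
    """Chance of labelling every image in one challenge correctly,
    assuming independent per-image decisions (an illustrative assumption)."""
    if not 0.0 <= per_image_accuracy <= 1.0:
        raise ValueError("per_image_accuracy must lie in [0, 1]")
    return per_image_accuracy ** num_images


# Random guessing on a binary cat/dog decision: 0.5 ** 12.
random_guess = challenge_pass_probability(0.5)
print(f"random guessing: 1 in {1 / random_guess:.0f}")  # random guessing: 1 in 4096

# Hypothetical classifier accuracies, shown only to illustrate the compounding effect.
for accuracy in (0.6, 0.8, 0.9):
    print(accuracy, challenge_pass_probability(accuracy))
```

Even a classifier that is right on most individual images passes the whole challenge far less often, which is why a multi-image design widens the gap between humans and machines.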


Citations
Proceedings ArticleDOI
16 Jun 2012
TL;DR: These models are very good: they beat all previously published results on the challenging ASIRRA test (cat vs dog discrimination). When applied to the task of discriminating the 37 different breeds of pets, they obtain an average accuracy of about 59%, a very encouraging result considering the difficulty of the problem.
Abstract: We investigate the fine grained object categorization problem of determining the breed of animal from an image. To this end we introduce a new annotated dataset of pets covering 37 different breeds of cats and dogs. The visual problem is very challenging as these animals, particularly cats, are very deformable and there can be quite subtle differences between the breeds. We make a number of contributions: first, we introduce a model to classify a pet breed automatically from an image. The model combines shape, captured by a deformable part model detecting the pet face, and appearance, captured by a bag-of-words model that describes the pet fur. Fitting the model involves automatically segmenting the animal in the image. Second, we compare two classification approaches: a hierarchical one, in which a pet is first assigned to the cat or dog family and then to a breed, and a flat one, in which the breed is obtained directly. We also investigate a number of animal and image orientated spatial layouts. These models are very good: they beat all previously published results on the challenging ASIRRA test (cat vs dog discrimination). When applied to the task of discriminating the 37 different breeds of pets, the models obtain an average accuracy of about 59%, a very encouraging result considering the difficulty of the problem.

1,076 citations

Proceedings ArticleDOI
27 Oct 2008
TL;DR: It is shown that CAPTCHAs that are carefully designed to be segmentation-resistant are vulnerable to novel but simple attacks, including the schemes designed and deployed by Microsoft, Yahoo and Google.
Abstract: CAPTCHA is now almost a standard security technology. The most widely deployed CAPTCHAs are text-based schemes, which typically require users to solve a text recognition task. The state of the art of CAPTCHA design suggests that such text-based schemes should rely on segmentation resistance to provide security guarantee, as individual character recognition after segmentation can be solved with a high success rate by standard methods such as neural networks. In this paper, we present new character segmentation techniques of general value to attack a number of text CAPTCHAs, including the schemes designed and deployed by Microsoft, Yahoo and Google. In particular, the Microsoft CAPTCHA has been deployed since 2002 at many of their online services including Hotmail, MSN and Windows Live. Designed to be segmentation-resistant, this scheme has been studied and tuned by its designers over the years. However, our simple attack has achieved a segmentation success rate of higher than 90% against this scheme. It took on average ~80 ms for the attack to completely segment a challenge on an ordinary desktop computer. As a result, we estimate that this CAPTCHA could be instantly broken by a malicious bot with an overall (segmentation and then recognition) success rate of more than 60%. On the contrary, the design goal was that automated attacks should not achieve a success rate of higher than 0.01%. For the first time, this paper shows that CAPTCHAs that are carefully designed to be segmentation-resistant are vulnerable to novel but simple attacks.

407 citations

Posted Content
TL;DR: The main idea behind the scheme is to train a multi-class model to discriminate between dozens of geometric transformations applied on all the given images, which generates feature detectors that effectively identify, at test time, anomalous images based on the softmax activation statistics of the model when applied on transformed images.
Abstract: We consider the problem of anomaly detection in images, and present a new detection technique. Given a sample of images, all known to belong to a "normal" class (e.g., dogs), we show how to train a deep neural model that can detect out-of-distribution images (i.e., non-dog objects). The main idea behind our scheme is to train a multi-class model to discriminate between dozens of geometric transformations applied on all the given images. The auxiliary expertise learned by the model generates feature detectors that effectively identify, at test time, anomalous images based on the softmax activation statistics of the model when applied on transformed images. We present extensive experiments using the proposed detector, which indicate that our algorithm improves state-of-the-art methods by a wide margin.
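The scoring idea in this abstract can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses only 8 transformations (flip combined with 90-degree rotations) rather than the dozens the abstract mentions, and it scores an image by the mean softmax probability assigned to the correct transformation, which is one simple choice of "softmax activation statistic". The `predict_proba` callable is a hypothetical stand-in for a classifier already trained to recognise which transformation was applied.

```python
from typing import Callable, List

import numpy as np


def geometric_transforms(image: np.ndarray) -> List[np.ndarray]:
    """Return 8 views: (no flip, horizontal flip) x rotation by 0/90/180/270 degrees."""
    views = []
    for flip in (False, True):
        base = np.fliplr(image) if flip else image
        for quarter_turns in range(4):
            views.append(np.rot90(base, quarter_turns))
    return views


def normality_score(image: np.ndarray,
                    predict_proba: Callable[[np.ndarray], np.ndarray]) -> float:
    """Mean probability the classifier gives to the transformation actually applied.

    predict_proba is a hypothetical trained model mapping one view to a
    length-8 probability vector over transformation indices. Higher scores
    suggest the image looks like the "normal" training class.
    """
    views = geometric_transforms(image)
    correct = [predict_proba(view)[index] for index, view in enumerate(views)]
    return float(np.mean(correct))


# Dummy classifier that cannot tell transformations apart (uniform output).
uniform_model = lambda view: np.full(8, 1 / 8)
print(normality_score(np.arange(6).reshape(2, 3), uniform_model))  # 0.125
```

A model trained only on normal images should identify the applied transformation confidently on similar images and less confidently on anomalous ones, so a low score flags a likely anomaly.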

342 citations


Cites methods from "Asirra: a CAPTCHA that exploits int..."

  • ...All baseline methods, including OC-SVM variants, which enjoy hindsight information, only achieve performance that is slightly better than random guessing in the CatsVsDogs dataset....


  • ...In the CatsVsDogs dataset, we improve the top performing baseline AUROC by 67%....


  • ...CatsVsDogs: extracted from the ASIRRA dataset, it contains 25,000 images of cats and dogs, 12,500 in each class....


  • ...We consider four image datasets in our experiments: CIFAR-10, CIFAR-100 [18], CatsVsDogs [9], and fashion-MNIST [33], which are described below....


  • ...Dataset, ci, Single Class Name:
      CIFAR-10: 0 Airplane, 1 Automobile, 2 Bird, 3 Cat, 4 Deer, 5 Dog, 6 Frog, 7 Horse, 8 Ship, 9 Truck
      CIFAR-100: 0 Aquatic mammals, 1 Fish, 2 Flowers, 3 Food containers, 4 Fruit and vegetables, 5 Household electrical devices, 6 Household furniture, 7 Insects, 8 Large carnivores, 9 Large man-made outdoor things, 10 Large natural outdoor scenes, 11 Large omnivores and herbivores, 12 Medium-sized mammals, 13 Non-insect invertebrates, 14 People, 15 Reptiles, 16 Small mammals, 17 Trees, 18 Vehicles 1, 19 Vehicles 2
      Fashion-MNIST: 0 Ankle-boot, 1 Bag, 2 Coat, 3 Dress, 4 Pullover, 5 Sandal, 6 Shirt, 7 Sneaker, 8 T-shirt, 9 Trouser
      CatsVsDogs: 0 Cat, 1 Dog...


Proceedings ArticleDOI
23 Jul 2008
TL;DR: Usability issues that should be considered and addressed in the design of CAPTCHAs are discussed, and a simple but novel framework for examining CAPTCHA usability is proposed.
Abstract: CAPTCHA is now almost a standard security technology, and has found widespread application in commercial websites. Usability and robustness are two fundamental issues with CAPTCHA, and they often interconnect with each other. This paper discusses usability issues that should be considered and addressed in the design of CAPTCHAs. Some of these issues are intuitive, but some others have subtle implications for robustness (or security). A simple but novel framework for examining CAPTCHA usability is also proposed.

319 citations

Posted Content
TL;DR: In this article, failure modes for MC dropout, a widely used approach for estimating uncertainty in deep models, are highlighted, and a proposal to improve the quality of uncertainty estimates using probabilistic model ensembles is made.
Abstract: Measuring uncertainty is a promising technique for detecting adversarial examples, crafted inputs on which the model predicts an incorrect class with high confidence. But many measures of uncertainty exist, including predictive entropy and mutual information, each capturing different types of uncertainty. We study these measures, and shed light on why mutual information seems to be effective at the task of adversarial example detection. We highlight failure modes for MC dropout, a widely used approach for estimating uncertainty in deep models. This leads to an improved understanding of the drawbacks of current methods, and a proposal to improve the quality of uncertainty estimates using probabilistic model ensembles. We give illustrative experiments using MNIST to demonstrate the intuition underlying the different measures of uncertainty, as well as experiments on a real world Kaggle dogs vs cats classification dataset.
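The two measures named in this abstract can be sketched from a set of stochastic forward passes (for example MC dropout samples or ensemble members). This is a minimal illustration using the standard definitions: predictive entropy is the entropy of the averaged prediction, and mutual information is that entropy minus the average entropy of the individual passes. The function names and the example arrays are illustrative and do not come from the paper.

```python
import numpy as np


def entropy(probs: np.ndarray, axis: int = -1) -> np.ndarray:
    """Shannon entropy in nats; probabilities are clipped to avoid log(0)."""
    clipped = np.clip(probs, 1e-12, 1.0)
    return -np.sum(clipped * np.log(clipped), axis=axis)


def uncertainty_measures(mc_probs: np.ndarray):
    """mc_probs has shape (T, C): softmax outputs from T stochastic passes.

    Returns (predictive_entropy, mutual_information).
    """
    mean_probs = mc_probs.mean(axis=0)
    predictive_entropy = entropy(mean_probs)      # total uncertainty
    expected_entropy = entropy(mc_probs).mean()   # average per-pass uncertainty
    mutual_information = predictive_entropy - expected_entropy
    return float(predictive_entropy), float(mutual_information)


# Passes that are each confident but disagree: high mutual information.
disagree = np.array([[1.0, 0.0], [0.0, 1.0]])
# Passes that agree the input is ambiguous: zero mutual information.
agree = np.array([[0.5, 0.5], [0.5, 0.5]])

for name, samples in (("disagree", disagree), ("agree", agree)):
    pe, mi = uncertainty_measures(samples)
    print(name, round(pe, 3), round(mi, 3))
# disagree 0.693 0.693
# agree 0.693 0.0
```

Both cases have the same predictive entropy, but only the disagreeing passes have high mutual information, which illustrates how the two measures capture different types of uncertainty.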

257 citations

References
Journal ArticleDOI
TL;DR: In this paper, the authors provide an up-to-date critical survey of still-and video-based face recognition research, and provide some insights into the studies of machine recognition of faces.
Abstract: As one of the most successful applications of image analysis and understanding, face recognition has recently received significant attention, especially during the past several years. At least two reasons account for this trend: the first is the wide range of commercial and law enforcement applications, and the second is the availability of feasible technologies after 30 years of research. Even though current machine recognition systems have reached a certain level of maturity, their success is limited by the conditions imposed by many real applications. For example, recognition of face images acquired in an outdoor environment with changes in illumination and/or pose remains a largely unsolved problem. In other words, current systems are still far away from the capability of the human perception system. This paper provides an up-to-date critical survey of still- and video-based face recognition research. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the studies of machine recognition of faces. To provide a comprehensive survey, we not only categorize existing recognition techniques but also present detailed descriptions of representative methods within each category. In addition, relevant topics such as psychophysical studies, system evaluation, and issues of illumination and pose variation are covered.

6,384 citations

Proceedings ArticleDOI
03 Aug 2003
TL;DR: A set of concrete best practices that document analysis researchers can use to get good results with neural networks, including a simple "do-it-yourself" implementation of convolution with a flexible architecture suitable for many visual document problems.
Abstract: Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods that are used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited for visual document tasks than fully connected networks. We propose that a simple "do-it-yourself" implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods, such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even finely-tuning the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.

2,783 citations

Proceedings ArticleDOI
25 Apr 2004
TL;DR: A new interactive system: a game that is fun and can be used to create valuable output that addresses the image-labeling problem and encourages people to do the work by taking advantage of their desire to be entertained.
Abstract: We introduce a new interactive system: a game that is fun and can be used to create valuable output. When people play the game they help determine the contents of images by providing meaningful labels for them. If the game is played as much as popular online games, we estimate that most images on the Web can be labeled in a few months. Having proper labels associated with each image on the Web would allow for more accurate image search, improve the accessibility of sites (by providing descriptions of images to visually impaired individuals), and help users block inappropriate images. Our system makes a significant contribution because of its valuable output and because of the way it addresses the image-labeling problem. Rather than using computer vision techniques, which don't work well enough, we encourage people to do the work by taking advantage of their desire to be entertained.

2,365 citations

01 Jan 2006
TL;DR: This report presents the results of the 2006 PASCAL Visual Object Classes Challenge (VOC2006).
Abstract: This report presents the results of the 2006 PASCAL Visual Object Classes Challenge (VOC2006). Details of the challenge, data, and evaluation are presented. Participants in the challenge submitted descriptions of their methods, and these have been included verbatim. This document should be considered preliminary, and subject to change.

2,034 citations

Book ChapterDOI
04 May 2003
TL;DR: This work introduces captcha, an automated test that humans can pass, but current computer programs can't pass; any program that has high success over a captcha can be used to solve an unsolved Artificial Intelligence (AI) problem; and provides several novel constructions of captchas, which imply a win-win situation.
Abstract: We introduce captcha, an automated test that humans can pass, but current computer programs can't pass: any program that has high success over a captcha can be used to solve an unsolved Artificial Intelligence (AI) problem. We provide several novel constructions of captchas. Since captchas have many applications in practical security, our approach introduces a new class of hard problems that can be exploited for security purposes. Much like research in cryptography has had a positive impact on algorithms for factoring and discrete log, we hope that the use of hard AI problems for security purposes allows us to advance the field of Artificial Intelligence. We introduce two families of AI problems that can be used to construct captchas and we show that solutions to such problems can be used for steganographic communication. captchas based on these AI problem families, then, imply a win-win situation: either the problems remain unsolved and there is a way to differentiate humans from computers, or the problems are solved and there is a way to communicate covertly on some channels.

1,525 citations