Author
Matt Weir
Bio: Matt Weir is an academic researcher from Florida State University. The author has contributed to research in topics: Password & Key stretching. The author has an hindex of 2, co-authored 2 publications receiving 802 citations.
Papers
More filters
17 May 2009
TL;DR: This paper discusses a new method that generates password structures in highest probability order by automatically creating a probabilistic context-free grammar based upon a training set of previously disclosed passwords, and then generating word-mangling rules to be used in password cracking.
Abstract: Choosing the most effective word-mangling rules to use when performing a dictionary-based password cracking attack can be a difficult task In this paper we discuss a new method that generates password structures in highest probability order We first automatically create a probabilistic context-free grammar based upon a training set of previously disclosed passwords This grammar then allows us to generate word-mangling rules, and from them, password guesses to be used in password cracking We will also show that this approach seems to provide a more effective way to crack passwords as compared to traditional methods by testing our tools and techniques on real password sets In one series of experiments, training on a set of disclosed passwords, our approach was able to crack 28% to 129% more passwords than John the Ripper, a publicly available standard password cracking program
491 citations
04 Oct 2010
TL;DR: This paper attempts to determine the effectiveness of using entropy, as defined in NIST SP800-63, as a measurement of the security provided by various password creation policies, by modeling the success rate of current password cracking techniques against real user passwords.
Abstract: In this paper we attempt to determine the effectiveness of using entropy, as defined in NIST SP800-63, as a measurement of the security provided by various password creation policies. This is accomplished by modeling the success rate of current password cracking techniques against real user passwords. These data sets were collected from several different websites, the largest one containing over 32 million passwords. This focus on actual attack methodologies and real user passwords quite possibly makes this one of the largest studies on password security to date. In addition we examine what these results mean for standard password creation policies, such as minimum password length, and character set requirements.
423 citations
Cited by
More filters
20 May 2012
TL;DR: It is estimated that passwords provide fewer than 10 bits of security against an online, trawling attack, and only about 20 bits ofSecurity against an optimal offline dictionary attack, when compared with a uniform distribution which would provide equivalent security against different forms of guessing attack.
Abstract: We report on the largest corpus of user-chosen passwords ever studied, consisting of anonymized password histograms representing almost 70 million Yahoo! users, mitigating privacy concerns while enabling analysis of dozens of subpopulations based on demographic factors and site usage characteristics. This large data set motivates a thorough statistical treatment of estimating guessing difficulty by sampling from a secret distribution. In place of previously used metrics such as Shannon entropy and guessing entropy, which cannot be estimated with any realistically sized sample, we develop partial guessing metrics including a new variant of guesswork parameterized by an attacker's desired success rate. Our new metric is comparatively easy to approximate and directly relevant for security engineering. By comparing password distributions with a uniform distribution which would provide equivalent security against different forms of guessing attack, we estimate that passwords provide fewer than 10 bits of security against an online, trawling attack, and only about 20 bits of security against an optimal offline dictionary attack. We find surprisingly little variation in guessing difficulty; every identifiable group of users generated a comparably weak password distribution. Security motivations such as the registration of a payment card have no greater impact than demographic factors such as age and nationality. Even proactive efforts to nudge users towards better password choices with graphical feedback make little difference. More surprisingly, even seemingly distant language communities choose the same weak passwords and an attacker never gains more than a factor of 2 efficiency gain by switching from the globally optimal dictionary to a population-specific lists.
711 citations
20 May 2012
TL;DR: An efficient distributed method is developed for calculating how effectively several heuristic password-guessing algorithms guess passwords, and the relationship between guess ability, as measured with password-cracking algorithms, and entropy estimates is investigated.
Abstract: Text-based passwords remain the dominant authentication method in computer systems, despite significant advancement in attackers' capabilities to perform password cracking. In response to this threat, password composition policies have grown increasingly complex. However, there is insufficient research defining metrics to characterize password strength and using them to evaluate password-composition policies. In this paper, we analyze 12,000 passwords collected under seven composition policies via an online study. We develop an efficient distributed method for calculating how effectively several heuristic password-guessing algorithms guess passwords. Leveraging this method, we investigate (a) the resistance of passwords created under different conditions to guessing, (b) the performance of guessing algorithms under different training sets, (c) the relationship between passwords explicitly created under a given composition policy and other passwords that happen to meet the same requirements, and (d) the relationship between guess ability, as measured with password-cracking algorithms, and entropy estimates. Our findings advance understanding of both password-composition policies and metrics for quantifying password security.
464 citations
01 Jan 2014
TL;DR: This paper investigates for the first time how an attacker can leverage a known password from one site to more easily guess that user's password at other sites and develops the first cross-site password-guessing algorithm, able to guess 30% of transformed passwords within 100 attempts.
Abstract: Today's Internet services rely heavily on text-based passwords for user authentication. The pervasiveness of these services coupled with the difficulty of remembering large numbers of secure passwords tempts users to reuse passwords at multiple sites. In this paper, we investigate for the first time how an attacker can leverage a known password from one site to more easily guess that user's password at other sites. We study several hundred thousand leaked passwords from eleven web sites and conduct a user survey on password reuse; we estimate that 43- 51% of users reuse the same password across multiple sites. We further identify a few simple tricks users often employ to transform a basic password between sites which can be used by an attacker to make password guessing vastly easier. We develop the first cross-site password-guessing algorithm, which is able to guess 30% of transformed passwords within 100 attempts compared to just 14% for a standard password-guessing algorithm without cross-site password knowledge.
426 citations
04 Oct 2010
TL;DR: This paper attempts to determine the effectiveness of using entropy, as defined in NIST SP800-63, as a measurement of the security provided by various password creation policies, by modeling the success rate of current password cracking techniques against real user passwords.
Abstract: In this paper we attempt to determine the effectiveness of using entropy, as defined in NIST SP800-63, as a measurement of the security provided by various password creation policies. This is accomplished by modeling the success rate of current password cracking techniques against real user passwords. These data sets were collected from several different websites, the largest one containing over 32 million passwords. This focus on actual attack methodologies and real user passwords quite possibly makes this one of the largest studies on password security to date. In addition we examine what these results mean for standard password creation policies, such as minimum password length, and character set requirements.
423 citations
07 May 2011
TL;DR: A large-scale study investigates password strength, user behavior, and user sentiment across four password-composition policies, and describes the predictability of passwords by calculating their entropy, finding that a number of commonly held beliefs about password composition and strength are inaccurate.
Abstract: Text-based passwords are the most common mechanism for authenticating humans to computer systems. To prevent users from picking passwords that are too easy for an adversary to guess, system administrators adopt password-composition policies (e.g., requiring passwords to contain symbols and numbers). Unfortunately, little is known about the relationship between password-composition policies and the strength of the resulting passwords, or about the behavior of users (e.g., writing down passwords) in response to different policies. We present a large-scale study that investigates password strength, user behavior, and user sentiment across four password-composition policies. We characterize the predictability of passwords by calculating their entropy, and find that a number of commonly held beliefs about password composition and strength are inaccurate. We correlate our results with user behavior and sentiment to produce several recommendations for password-composition policies that result in strong passwords without unduly burdening users.
398 citations