scispace - formally typeset
Search or ask a question
Author

Haibo Cheng

Other affiliations: Chinese Ministry of Education
Bio: Haibo Cheng is an academic researcher from Peking University. The author has contributed to research in topics: Password & Password strength. The author has an hindex of 8, co-authored 12 publications receiving 467 citations. Previous affiliations of Haibo Cheng include Chinese Ministry of Education.

Papers
More filters
Journal ArticleDOI
TL;DR: Li et al. as discussed by the authors proposed two Zipf-like models (i.e., PDF-Zipf and CDF-ZipF) to characterize the distribution of passwords and proposed a new metric for measuring the strength of password data sets.
Abstract: Despite three decades of intensive research efforts, it remains an open question as to what is the underlying distribution of user-generated passwords. In this paper, we make a substantial step forward toward understanding this foundational question. By introducing a number of computational statistical techniques and based on 14 large-scale data sets, which consist of 113.3 million real-world passwords, we, for the first time, propose two Zipf-like models (i.e., PDF-Zipf and CDF-Zipf) to characterize the distribution of passwords. More specifically, our PDF-Zipf model can well fit the popular passwords and obtain a coefficient of determination larger than 0.97; our CDF-Zipf model can well fit the entire password data set, with the maximum cumulative distribution function (CDF) deviation between the empirical distribution and the fitted theoretical model being 0.49%~4.59% (on an average 1.85%). With the concrete knowledge of password distributions, we suggest a new metric for measuring the strength of password data sets. Extensive experimental results show the effectiveness and general applicability of the proposed Zipf-like models and security metric.

300 citations

Journal ArticleDOI
TL;DR: Three representative identity-based remote user authentication schemes are employed as case studies to reveal the challenges and subtleties in designing a practical authentication scheme for mobile devices and a “provably secure” scheme for roaming services in mobile networks is scrutinized.
Abstract: Providing secure, efficient, and privacy-preserving user authentication in mobile networks is a challenging problem due to the inherent mobility of users, variety of attack vectors, and resource-constrained nature of user devices. Recent studies show that identity-based cryptosystems can eliminate the certificate overhead and thus address the issues associated with public-key infrastructure technology—which is a rare bit of good news in today's computer security world. In this paper, we employ three representative identity-based remote user authentication schemes (i.e., Truong et al. 's scheme, Li et al. 's scheme, and Zhang et al. 's scheme) as case studies to reveal the challenges and subtleties in designing a practical authentication scheme for mobile devices. First, we demonstrate that Truong et al. 's scheme, which was presented at the IEEE AINA 2012, cannot achieve a few important security goals under our new attacking scenarios: 1) it fails to resist against known session-specific temporary information attack; 2) it cannot withstand key compromise impersonation attack; and 3) it is of poor usability. Second, we show that Li et al. 's privacy-preserving scheme, which was proposed at GLOBECOM 2012, is subject to some subtle (yet severe) efficiency problems that make it virtually impossible for any practical use. Third, we scrutinize a “provably secure” scheme for roaming services in mobile networks designed by Zhang et al. at SCN 2015 and find it prone to collusion attack and replay attack. Further, we investigate into the underlying causes for these identified failures, and figure out an improvement over Truong et al. 's scheme to overcome the revealed challenges while maintaining reasonable efficiency.

113 citations

Proceedings ArticleDOI
30 May 2016
TL;DR: It is shown that there are at least seven different attacking scenarios that may lead to the failure of a scheme in achieving truly two-factor security, and the request for better measurement when assessing new schemes is outlined.
Abstract: Despite over two decades of continuous efforts, how to design a secure and efficient two-factor authentication scheme remains an open issue. Hundreds of new schemes have wave upon wave been proposed, yet most of them are shortly found unable to achieve some important security goals (e.g., truly two-factor security) and desirable properties (e.g., user anonymity), falling into the unsatisfactory "break-fix-break-fix" cycle. In this vicious cycle, protocol designers often advocate the superiorities of their improved scheme, but do not illustrate (or unconsciously overlooking) the aspects on which their scheme performs poorly. In this paper, we first use a series of "improved schemes" over Xu et al.'s 2009 scheme as case studies to highlight that, if there are no improved measurements, more "improved schemes" generally would not mean more advancements. To figure out why the measurement of existing schemes is invariably insufficient, we further investigate into the state-of-the-art evaluation criteria set (i.e., Madhusudhan-Mittal's set). Besides reporting its ambiguities and redundancies, we propose viable fixes and refinements. To our knowledge, we for the first time show that there are at least seven different attacking scenarios that may lead to the failure of a scheme in achieving truly two-factor security. Finally, we conduct a large-scale comparative evaluation of 26 representative two-factor schemes, and our results outline the request for better measurement when assessing new schemes.

80 citations

Proceedings ArticleDOI
01 Jun 2016
TL;DR: Extensive experiments on 11 real-world password lists show that fuzzyPSM, in general, outperforms all its counterparts, especially accurate in telling apart weak passwords and suitable for services where online guessing attacks prevail.
Abstract: To provide timely feedbacks to users, nearly every respectable Internet service now imposes a password strength meter (PSM) upon user registration or password change. It is a rare bit of good news in password research that well-designed PSMs do help improve the strength of user-chosen passwords. However, leading PSMs in the industrial world (e.g., Zxcvbn, KeePSM and NIST PSM) are mainly composed of simple heuristic rules and found to be highly inaccurate, while state-of-the-art PSMs from academia (e.g., probabilistic context-free grammar based ones and Markov-based ones) are still far from satisfactory, especially incompetent at gauging weak passwords. As preventing weak passwords is the primary goal of any PSM, this means that existing PSMs largely fail to serve their purpose. To fill this gap, in this paper we propose a novel PSM that is grounded on real user behavior. Our user survey reveals that when choosing passwords for a new web service, most users (77.38%) simply retrieve one of their existing passwords from memory and then reuse (or slightly modify) it. This is in vast contrast to the seemingly intuitive yet unrealistic assumption (often implicitly) made in most of the existing PSMs that, when user registers, a whole new password is constructed by mixing segments of letter, digit and/or symbol or by combining n-grams. To model users' realistic behaviors, we use passwords leaked from a less sensitiveservice as our base dictionary and another list of relatively strong passwords leaked from a sensitive service as our training dictionary, and determine how mangling rules are employed by users to construct passwords for new services. This process automatically creates a fuzzy probabilistic context-free grammar (PCFG) and gives rise to our fuzzy-PCFG-based meter, fuzzyPSM. It can react dynamically to changes in how users choose passwords and is evaluated by comparisons with five representative PSMs. Extensive experiments on 11 real-world password lists show that fuzzyPSM, in general, outperforms all its counterparts, especially accurate in telling apart weak passwords and suitable for services where online guessing attacks prevail.

70 citations

Proceedings ArticleDOI
01 Jan 2018
TL;DR: This work develops a series of practical experiments using 10 large-scale datasets, a total of 104 million real-world passwords, to quantitatively evaluate the security that these four honeyword-generation methods can provide and resolves three open problems in honeyword research, as defined by Juels and Rivest.
Abstract: Honeywords are decoy passwords associated with each user account, and they contribute a promising approach to detecting password leakage. This approach was first proposed by Juels and Rivest at CCS’13, and has been covered by hundreds of medias and also adopted in various research domains. The idea of honeywords looks deceptively simple, but it is a deep and sophisticated challenge to automatically generate honeywords that are hard to differentiate from real passwords. In JuelsRivest’s work, four main honeyword-generation methods are suggested but only justified by heuristic security arguments. In this work, we for the first time develop a series of practical experiments using 10 large-scale datasets, a total of 104 million real-world passwords, to quantitatively evaluate the security that these four methods can provide. Our results reveal that they all fail to provide the expected security: real passwords can be distinguished with a success rate of 29.29%∼32.62% by our basic trawling-guessing attacker, but not the expected 5%, with just one guess (when each user account is associated with 19 honeywords as recommended). This figure reaches 34.21%∼49.02% under the advanced trawling-guessing attackers who make use of various state-of-the-art probabilistic password models. We further evaluate the security of Juels-Rivest’s methods under a targeted-guessing attacker who can exploit the victim’ personal information, and the results are even more alarming: 56.81%∼67.98%. Overall, our work resolves three open problems in honeyword research, as defined by Juels and Rivest.

62 citations


Cited by
More filters
Journal ArticleDOI
Ding Wang1, Ping Wang1
TL;DR: In this paper, a security model that can accurately capture the practical capabilities of an adversary is defined and a broad set of twelve properties framed as a systematic methodology for comparative evaluation, allowing schemes to be rated across a common spectrum.
Abstract: As the most prevailing two-factor authentication mechanism, smart-card-based password authentication has been a subject of intensive research in the past two decades, and hundreds of this type of schemes have wave upon wave been proposed. In most of these studies, there is no comprehensive and systematical metric available for schemes to be assessed objectively, and the authors present new schemes with assertions of the superior aspects over previous ones, while overlooking dimensions on which their schemes fare poorly. Unsurprisingly, most of them are far from satisfactory—either are found short of important security goals or lack of critical properties, especially being stuck with the security-usability tension. To overcome this issue, in this work we first explicitly define a security model that can accurately capture the practical capabilities of an adversary and then suggest a broad set of twelve properties framed as a systematic methodology for comparative evaluation, allowing schemes to be rated across a common spectrum. As our main contribution, a new scheme is advanced to resolve the various issues arising from user corruption and server compromise, and it is formally proved secure under the harshest adversary model so far. In particular, by integrating “honeywords”, traditionally the purview of system security, with a “fuzzy-verifier”, our scheme hits “two birds”: it not only eliminates the long-standing security-usability conflict that is considered intractable in the literature, but also achieves security guarantees beyond the conventional optimal security bound.

323 citations

Proceedings ArticleDOI
24 Oct 2016
TL;DR: TarGuess, a framework that systematically characterizes typical targeted guessing scenarios with seven sound mathematical models, each of which is based on varied kinds of data available to an attacker, is proposed to design novel and efficient guessing algorithms.
Abstract: While trawling online/offline password guessing has been intensively studied, only a few studies have examined targeted online guessing, where an attacker guesses a specific victim's password for a service, by exploiting the victim's personal information such as one sister password leaked from her another account and some personally identifiable information (PII). A key challenge for targeted online guessing is to choose the most effective password candidates, while the number of guess attempts allowed by a server's lockout or throttling mechanisms is typically very small. We propose TarGuess, a framework that systematically characterizes typical targeted guessing scenarios with seven sound mathematical models, each of which is based on varied kinds of data available to an attacker. These models allow us to design novel and efficient guessing algorithms. Extensive experiments on 10 large real-world password datasets show the effectiveness of TarGuess. Particularly, TarGuess I~IV capture the four most representative scenarios and within 100 guesses: (1) TarGuess-I outperforms its foremost counterpart by 142% against security-savvy users and by 46% against normal users; (2) TarGuess-II outperforms its foremost counterpart by 169% on security-savvy users and by 72% against normal users; and (3) Both TarGuess-III and IV gain success rates over 73% against normal users and over 32% against security-savvy users. TarGuess-III and IV, for the first time, address the issue of cross-site online guessing when given the victim's one sister password and some PII.

304 citations

Journal ArticleDOI
TL;DR: Li et al. as discussed by the authors proposed two Zipf-like models (i.e., PDF-Zipf and CDF-ZipF) to characterize the distribution of passwords and proposed a new metric for measuring the strength of password data sets.
Abstract: Despite three decades of intensive research efforts, it remains an open question as to what is the underlying distribution of user-generated passwords. In this paper, we make a substantial step forward toward understanding this foundational question. By introducing a number of computational statistical techniques and based on 14 large-scale data sets, which consist of 113.3 million real-world passwords, we, for the first time, propose two Zipf-like models (i.e., PDF-Zipf and CDF-Zipf) to characterize the distribution of passwords. More specifically, our PDF-Zipf model can well fit the popular passwords and obtain a coefficient of determination larger than 0.97; our CDF-Zipf model can well fit the entire password data set, with the maximum cumulative distribution function (CDF) deviation between the empirical distribution and the fitted theoretical model being 0.49%~4.59% (on an average 1.85%). With the concrete knowledge of password distributions, we suggest a new metric for measuring the strength of password data sets. Extensive experimental results show the effectiveness and general applicability of the proposed Zipf-like models and security metric.

300 citations

Journal ArticleDOI
TL;DR: This work presents a lightweight and secure user authentication protocol based on the Rabin cryptosystem, which has the characteristic of computational asymmetry and presents a comprehensive heuristic security analysis to show that the protocol is secure against all the possible attacks and provides the desired security features.
Abstract: Wireless sensor networks (WSNs) will be integrated into the future Internet as one of the components of the Internet of Things, and will become globally addressable by any entity connected to the Internet. Despite the great potential of this integration, it also brings new threats, such as the exposure of sensor nodes to attacks originating from the Internet. In this context, lightweight authentication and key agreement protocols must be in place to enable end-to-end secure communication. Recently, Amin et al. proposed a three-factor mutual authentication protocol for WSNs. However, we identified several flaws in their protocol. We found that their protocol suffers from smart card loss attack where the user identity and password can be guessed using offline brute force techniques. Moreover, the protocol suffers from known session-specific temporary information attack, which leads to the disclosure of session keys in other sessions. Furthermore, the protocol is vulnerable to tracking attack and fails to fulfill user untraceability. To address these deficiencies, we present a lightweight and secure user authentication protocol based on the Rabin cryptosystem, which has the characteristic of computational asymmetry. We conduct a formal verification of our proposed protocol using ProVerif in order to demonstrate that our scheme fulfills the required security properties. We also present a comprehensive heuristic security analysis to show that our protocol is secure against all the possible attacks and provides the desired security features. The results we obtained show that our new protocol is a secure and lightweight solution for authentication and key agreement for Internet-integrated WSNs.

259 citations

Journal ArticleDOI
TL;DR: It is shown that the proposed scheme ensures security even if a sensor node is captured by an adversary, and the proposed protocol uses the lightweight cryptographic primitives, such as one way cryptographic hash function, physically unclonable function, and bitwise exclusive operations.
Abstract: Industrial wireless sensor network (IWSN) is an emerging class of a generalized WSN having constraints of energy consumption, coverage, connectivity, and security. However, security and privacy is one of the major challenges in IWSN as the nodes are connected to Internet and usually located in an unattended environment with minimum human interventions. In IWSN, there is a fundamental requirement for a user to access the real-time information directly from the designated sensor nodes. This task demands to have a user authentication protocol. To satisfy this requirement, this paper proposes a lightweight and privacy-preserving mutual user authentication protocol in which only the user with a trusted device has the right to access the IWSN. Therefore, in the proposed scheme, we considered the physical layer security of the sensor nodes. We show that the proposed scheme ensures security even if a sensor node is captured by an adversary. The proposed protocol uses the lightweight cryptographic primitives, such as one way cryptographic hash function, physically unclonable function, and bitwise exclusive operations. Security and performance analysis shows that the proposed scheme is secure, and is efficient for the resource-constrained sensing devices in IWSN.

182 citations