
Showing papers in "IEEE Transactions on Information Forensics and Security in 2017"


Journal ArticleDOI
TL;DR: This paper presents an alternative approach to steganalysis of digital images based on a convolutional neural network (CNN), which is shown to replicate and optimize the key steps of conventional detectors in a unified framework and to learn hierarchical representations directly from raw images.
Abstract: Nowadays, the prevailing detectors of steganographic communication in digital images mainly consist of three steps, i.e., residual computation, feature extraction, and binary classification. In this paper, we present an alternative approach to steganalysis of digital images based on a convolutional neural network (CNN), which is shown to be able to replicate and optimize these key steps in a unified framework and learn hierarchical representations directly from raw images. The proposed CNN has a quite different structure from the ones used in conventional computer vision tasks. Rather than using a random strategy, the weights in the first layer of the proposed CNN are initialized with the basic high-pass filter set used in the calculation of residual maps in a spatial rich model (SRM), which acts as a regularizer to suppress the image content effectively. To better capture the structure of embedding signals, which usually have an extremely low SNR (stego signal to image content), a new activation function called the truncated linear unit is adopted in our CNN model. Finally, we further boost the performance of the proposed CNN-based steganalyzer by incorporating the knowledge of the selection channel. Three state-of-the-art steganographic algorithms in the spatial domain, e.g., WOW, S-UNIWARD, and HILL, are used to evaluate the effectiveness of our model. Compared to SRM and its selection-channel-aware variant maxSRMd2, our model achieves superior performance across all tested algorithms for a wide variety of payloads.

483 citations
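
As a rough illustration of the two ingredients highlighted above, the sketch below initializes a single high-pass filter with one representative SRM kernel (the 5x5 "KV" kernel) and applies a truncated linear unit (TLU); the kernel choice and the threshold value are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from scipy.signal import convolve2d

# One representative SRM high-pass kernel (the 5x5 "KV" kernel), used here to
# illustrate initializing a first CNN layer with fixed residual filters.
KV = (1 / 12.0) * np.array([
    [-1,  2,  -2,  2, -1],
    [ 2, -6,   8, -6,  2],
    [-2,  8, -12,  8, -2],
    [ 2, -6,   8, -6,  2],
    [-1,  2,  -2,  2, -1],
], dtype=np.float64)

def truncated_linear_unit(x, threshold=3.0):
    """TLU activation: identity inside [-T, T], clipped outside.
    The threshold value is an illustrative choice, not the paper's."""
    return np.clip(x, -threshold, threshold)

def first_layer_residual(image):
    """High-pass residual map followed by the TLU, mimicking the
    hand-initialized first layer described in the abstract."""
    residual = convolve2d(image.astype(np.float64), KV, mode="same")
    return truncated_linear_unit(residual)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_image = rng.integers(0, 256, size=(64, 64))
    print(first_layer_residual(fake_image).shape)  # (64, 64)
```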


Journal ArticleDOI
TL;DR: This paper proposes a new construction of an identity-based (ID-based) RDIC protocol that uses a key-homomorphic cryptographic primitive to reduce the system complexity and the cost of establishing and managing the public key authentication framework required by PKI-based RDIC schemes.
Abstract: Remote data integrity checking (RDIC) enables a data storage server, say a cloud server, to prove to a verifier that it is actually storing a data owner’s data honestly. To date, a number of RDIC protocols have been proposed in the literature, but most of the constructions suffer from the issue of complex key management, that is, they rely on the expensive public key infrastructure (PKI), which might hinder the deployment of RDIC in practice. In this paper, we propose a new construction of an identity-based (ID-based) RDIC protocol by making use of a key-homomorphic cryptographic primitive to reduce the system complexity and the cost of establishing and managing the public key authentication framework in PKI-based RDIC schemes. We formalize ID-based RDIC and its security model, including security against a malicious cloud server and zero knowledge privacy against a third party verifier. The proposed ID-based RDIC protocol leaks no information about the stored data to the verifier during the RDIC process. The new construction is proven secure against the malicious server in the generic group model and achieves zero knowledge privacy against a verifier. Extensive security analysis and implementation results demonstrate that the proposed protocol is provably secure and practical for real-world applications.

354 citations


Journal ArticleDOI
TL;DR: A fast image similarity measurement based on random verification is proposed to efficiently implement copy detection; the proposed method achieves higher accuracy than the state-of-the-art methods and has efficiency comparable to the baseline method based on BOW quantization.
Abstract: To detect illegal copies of copyrighted images, recent copy detection methods mostly rely on the bag-of-visual-words (BOW) model, in which local features are quantized into visual words for image matching. However, both the limited discriminability of local features and the BOW quantization errors will lead to many false local matches, which make it hard to distinguish similar images from copies. Geometric consistency verification is a popular technology for reducing the false matches, but it neglects global context information of local features and thus cannot solve this problem well. To address this problem, this paper proposes a global context verification scheme to filter false matches for copy detection. More specifically, after obtaining initial scale invariant feature transform (SIFT) matches between images based on the BOW quantization, the overlapping region-based global context descriptor (OR-GCD) is proposed for the verification of these matches to filter false matches. The OR-GCD not only encodes relatively rich global context information of SIFT features but also has good robustness and efficiency. Thus, it allows an effective and efficient verification. Furthermore, a fast image similarity measurement based on random verification is proposed to efficiently implement copy detection. In addition, we also extend the proposed method for partial-duplicate image detection. Extensive experiments demonstrate that our method achieves higher accuracy than the state-of-the-art methods, and has comparable efficiency to the baseline method based on the BOW quantization.

332 citations


Journal ArticleDOI
TL;DR: This paper makes the following contributions: massive social images and their privacy settings are leveraged to learn the object-privacy relatedness effectively and to identify a set of privacy-sensitive object classes automatically, and a deep multi-task learning algorithm is developed.
Abstract: To achieve automatic recommendation of privacy settings for image sharing, a new tool called iPrivacy (image privacy) is developed to relieve users of the burden of setting privacy preferences when they share images of special moments. Specifically, this paper makes the following contributions: 1) massive social images and their privacy settings are leveraged to learn the object-privacy relatedness effectively and identify a set of privacy-sensitive object classes automatically; 2) a deep multi-task learning algorithm is developed to jointly learn more representative deep convolutional neural networks and a more discriminative tree classifier, so that we can achieve fast and accurate detection of large numbers of privacy-sensitive object classes; 3) automatic recommendation of privacy settings for image sharing can be achieved by detecting the underlying privacy-sensitive objects from the images being shared, recognizing their classes, and identifying their privacy settings according to the object-privacy relatedness; and 4) one simple solution for image privacy protection is provided by blurring the privacy-sensitive objects automatically. We have conducted extensive experimental studies on real-world images and the results have demonstrated both the efficiency and effectiveness of our proposed approach.

314 citations


Journal ArticleDOI
TL;DR: This paper proposes an efficient public auditing protocol with global and sampling blockless verification as well as batch auditing, where data dynamics are substantially more efficiently supported than is the case with the state of the art.
Abstract: With the rapid development of cloud computing, cloud storage has been accepted by an increasing number of organizations and individuals, serving as a convenient and on-demand outsourcing application. However, upon losing local control of data, it becomes an urgent need for users to verify whether cloud service providers have stored their data securely. Hence, many researchers have devoted themselves to the design of auditing protocols directed at outsourced data. In this paper, we propose an efficient public auditing protocol with global and sampling blockless verification as well as batch auditing, where data dynamics are supported substantially more efficiently than in the state of the art. The novel dynamic structure in our protocol consists of a doubly linked info table and a location array; with this structure, computational and communication overheads can be reduced substantially. Security analysis indicates that our protocol achieves the desired properties, and numerical analysis and real-world experimental results demonstrate that the proposed protocol is efficient in practice.

305 citations


Journal ArticleDOI
TL;DR: Two Zipf-like models (i.e., PDF-Zipf and CDF-Zipf) are proposed to characterize the distribution of passwords, together with a new metric for measuring the strength of password data sets.
Abstract: Despite three decades of intensive research efforts, it remains an open question as to what is the underlying distribution of user-generated passwords. In this paper, we make a substantial step forward toward understanding this foundational question. By introducing a number of computational statistical techniques and based on 14 large-scale data sets, which consist of 113.3 million real-world passwords, we, for the first time, propose two Zipf-like models (i.e., PDF-Zipf and CDF-Zipf) to characterize the distribution of passwords. More specifically, our PDF-Zipf model can well fit the popular passwords and obtain a coefficient of determination larger than 0.97; our CDF-Zipf model can well fit the entire password data set, with the maximum cumulative distribution function (CDF) deviation between the empirical distribution and the fitted theoretical model being 0.49%~4.59% (on an average 1.85%). With the concrete knowledge of password distributions, we suggest a new metric for measuring the strength of password data sets. Extensive experimental results show the effectiveness and general applicability of the proposed Zipf-like models and security metric.

300 citations
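
A minimal sketch of Zipf-style fitting on password frequency counts, reporting the coefficient of determination and the maximum CDF deviation quoted in the abstract; the log-log least-squares fit below is a simple stand-in rather than the paper's PDF-Zipf/CDF-Zipf estimators, and the data are synthetic.

```python
import numpy as np

def fit_zipf(frequencies):
    """Least-squares fit of log f_r = log C - s * log r. Returns (C, s, R^2)."""
    freqs = np.sort(np.asarray(frequencies, dtype=float))[::-1]
    ranks = np.arange(1, len(freqs) + 1)
    x, y = np.log(ranks), np.log(freqs)
    slope, logC = np.polyfit(x, y, 1)
    y_hat = logC + slope * x
    r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    return np.exp(logC), -slope, r2

def max_cdf_deviation(frequencies, C, s):
    """Maximum deviation between the empirical CDF and the fitted model's CDF."""
    freqs = np.sort(np.asarray(frequencies, dtype=float))[::-1]
    ranks = np.arange(1, len(freqs) + 1)
    emp_cdf = np.cumsum(freqs) / freqs.sum()
    fit = C * ranks ** (-s)
    fit_cdf = np.cumsum(fit) / fit.sum()
    return np.max(np.abs(emp_cdf - fit_cdf))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic, roughly Zipf-distributed password counts for illustration only.
    toy_counts = np.round(1000.0 * np.arange(1, 2001) ** -0.9 + rng.random(2000))
    C, s, r2 = fit_zipf(toy_counts)
    dev = max_cdf_deviation(toy_counts, C, s)
    print(f"s={s:.3f}  R^2={r2:.3f}  max CDF deviation={dev:.4f}")
```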


Journal ArticleDOI
TL;DR: A Q-learning-based approach is proposed to identify critical attack sequences, with consideration of physical system behaviors, revealing a new smart grid vulnerability that can be exploited by attacks on the network topology.
Abstract: Recent studies on sequential attack schemes revealed a new smart grid vulnerability that can be exploited by attacks on the network topology. Traditional power systems contingency analysis needs to be expanded to handle the complex risk of cyber-physical attacks. To analyze the transmission grid vulnerability under sequential topology attacks, this paper proposes a Q-learning-based approach to identify critical attack sequences with consideration of physical system behaviors. A realistic power flow cascading outage model is used to simulate the system behavior, where the attacker can use Q-learning to increase the damage of a sequential topology attack toward system failures with the least attack effort. Case studies based on three IEEE test systems have demonstrated the learning ability and effectiveness of Q-learning-based vulnerability analysis.

202 citations
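
A minimal tabular Q-learning sketch for selecting a sequence of line outages; the toy environment stands in for the power flow cascading outage model, and the states, actions, and rewards are illustrative assumptions.

```python
import numpy as np

def toy_environment(state, action):
    """State = frozenset of already-tripped lines. The 'failure' condition and
    rewards are purely illustrative stand-ins for a cascading-outage simulator."""
    next_state = state | {action}
    system_failed = len(next_state) >= 3 and 0 in next_state
    reward = 10.0 if system_failed else -1.0  # penalize extra attack effort
    return next_state, reward, system_failed

def q_learning(n_lines=5, episodes=2000, alpha=0.1, gamma=0.9, eps=0.2):
    rng = np.random.default_rng(0)
    Q = {}  # (state, action) -> value
    for _ in range(episodes):
        state, done = frozenset(), False
        while not done and len(state) < n_lines:
            actions = [a for a in range(n_lines) if a not in state]
            if rng.random() < eps:
                action = int(rng.choice(actions))       # explore
            else:
                action = max(actions, key=lambda a: Q.get((state, a), 0.0))
            nxt, reward, done = toy_environment(state, action)
            future = [Q.get((nxt, a), 0.0) for a in range(n_lines) if a not in nxt]
            best_next = max(future) if future else 0.0
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return Q

Q = q_learning()
print(f"learned values for {len(Q)} state-action pairs")
```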


Journal ArticleDOI
TL;DR: It is argued that there is a significant need in forensics for new authorship attribution algorithms that can exploit context, can process multi-modal data, and are tolerant to incomplete knowledge of the space of all possible authors at training time.
Abstract: The veil of anonymity provided by smartphones with pre-paid SIM cards, public Wi-Fi hotspots, and distributed networks like Tor has drastically complicated the task of identifying users of social media during forensic investigations. In some cases, the text of a single posted message will be the only clue to an author’s identity. How can we accurately predict who that author might be when the message may never exceed 140 characters on a service like Twitter? For the past 50 years, linguists, computer scientists, and scholars of the humanities have been jointly developing automated methods to identify authors based on the style of their writing. All authors possess peculiarities of habit that influence the form and content of their written works. These characteristics can often be quantified and measured using machine learning algorithms. In this paper, we provide a comprehensive review of the methods of authorship attribution that can be applied to the problem of social media forensics. Furthermore, we examine emerging supervised learning-based methods that are effective for small sample sizes, and provide step-by-step explanations for several scalable approaches as instructional case studies for newcomers to the field. We argue that there is a significant need in forensics for new authorship attribution algorithms that can exploit context, can process multi-modal data, and are tolerant to incomplete knowledge of the space of all possible authors at training time.

189 citations
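
A minimal stylometric baseline in the spirit of the survey: character n-gram features with a linear SVM (scikit-learn); the tiny corpus and author labels are fabricated for illustration.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Character n-grams capture habits of spelling, punctuation, and abbreviation,
# which is why they are a common authorship-attribution baseline.
posts = [
    "cant wait... see u thr!!",
    "lol cant even, see u soon!!",
    "I shall attend presently.",
    "I am inclined to agree entirely.",
]
authors = ["a1", "a1", "a2", "a2"]

model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(2, 4), lowercase=False),
    LinearSVC(),
)
model.fit(posts, authors)
print(model.predict(["see u thr lol!!"]))  # likely "a1" on this toy corpus
```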


Journal ArticleDOI
TL;DR: This paper develops two methods to provide differential privacy to distributed learning algorithms over a network: it decentralizes the learning algorithm using the alternating direction method of multipliers and proposes dual variable perturbation and primal variable perturbation to provide dynamic differential privacy.
Abstract: Privacy-preserving distributed machine learning becomes increasingly important due to the recent rapid growth of data. This paper focuses on a class of regularized empirical risk minimization machine learning problems, and develops two methods to provide differential privacy to distributed learning algorithms over a network. We first decentralize the learning algorithm using the alternating direction method of multipliers, and propose the methods of dual variable perturbation and primal variable perturbation to provide dynamic differential privacy. The two mechanisms lead to algorithms that can provide privacy guarantees under mild conditions of the convexity and differentiability of the loss function and the regularizer. We study the performance of the algorithms, and show that the dual variable perturbation outperforms its primal counterpart. To design an optimal privacy mechanism, we analyze the fundamental tradeoff between privacy and accuracy, and provide guidelines to choose privacy parameters. Numerical experiments using a customer information database are performed to corroborate the results on privacy and utility tradeoffs and design.

189 citations
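
A minimal sketch of consensus ADMM for distributed ridge regression in which Laplace noise is injected into each node's dual variable, illustrating the flavor of dual variable perturbation; the problem, noise scale, and update rules are simplifications, not the paper's calibrated privacy mechanism.

```python
import numpy as np

def private_admm_ridge(A_parts, b_parts, rho=1.0, lam=0.1,
                       noise_scale=0.05, iters=50, seed=0):
    """Consensus ADMM for ridge regression across nodes, with Laplace noise
    added to each node's dual variable before the local primal update.
    The noise scale is illustrative, not a calibrated privacy budget."""
    rng = np.random.default_rng(seed)
    d, n = A_parts[0].shape[1], len(A_parts)
    x = [np.zeros(d) for _ in range(n)]
    u = [np.zeros(d) for _ in range(n)]
    z = np.zeros(d)
    for _ in range(iters):
        for i in range(n):
            u[i] = u[i] + rng.laplace(0.0, noise_scale, size=d)  # dual perturbation
            Ai, bi = A_parts[i], b_parts[i]
            x[i] = np.linalg.solve(Ai.T @ Ai + rho * np.eye(d),
                                   Ai.T @ bi + rho * (z - u[i]))
        # z-update of min sum_i ||A_i x_i - b_i||^2 + lam ||z||^2, s.t. x_i = z
        z = (np.mean([x[i] + u[i] for i in range(n)], axis=0) * rho * n) / (2 * lam + rho * n)
        for i in range(n):
            u[i] = u[i] + x[i] - z
    return z

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
b = A @ w_true + 0.1 * rng.normal(size=100)
# Two "nodes" each hold half the data; the result should be near w_true for small noise.
print(private_admm_ridge([A[:50], A[50:]], [b[:50], b[50:]]))
```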


Journal ArticleDOI
TL;DR: A generic hybrid deep-learning framework for JPEG steganalysis incorporating the domain knowledge behind rich steganalytic models is proposed, and it is demonstrated that the framework is insensitive to JPEG blocking artifact alterations, and the learned model can be easily transferred to a different attacking target and even a different data set.
Abstract: Adoption of deep learning in image steganalysis is still in its initial stage. In this paper, we propose a generic hybrid deep-learning framework for JPEG steganalysis incorporating the domain knowledge behind rich steganalytic models. Our proposed framework involves two main stages. The first stage is hand-crafted, corresponding to the convolution phase and the quantization and truncation phase of the rich models. The second stage is a compound deep-neural network containing multiple deep subnets, in which the model parameters are learned in the training procedure. We provide experimental evidence and theoretical reflections to argue that the introduction of threshold quantizers, though disabling the gradient-descent-based learning of the bottom convolution phase, is indeed cost-effective. We have conducted extensive experiments on a large-scale data set extracted from ImageNet. The primary data set used in our experiments contains 500 000 cover images, while our largest data set contains five million cover images. Our experiments show that the integration of quantization and truncation into deep-learning steganalyzers does boost the detection performance by a clear margin. Furthermore, we demonstrate that our framework is insensitive to JPEG blocking artifact alterations, and the learned model can be easily transferred to a different attacking target and even a different data set. These properties are of critical importance in practical applications.

178 citations


Journal ArticleDOI
TL;DR: A deep learning model is proposed to extract and recover vein features using limited a priori knowledge, including a fully convolutional network that recovers missing finger-vein patterns in the segmented image.
Abstract: Finger-vein biometrics has been extensively investigated for personal verification. Despite recent advances in finger-vein verification, current solutions completely depend on domain knowledge and still lack the robustness to extract finger-vein features from raw images. This paper proposes a deep learning model to extract and recover vein features using limited a priori knowledge. First, based on a combination of the known state-of-the-art handcrafted finger-vein image segmentation techniques, we automatically identify two regions: a clear region with high separability between finger-vein patterns and background, and an ambiguous region with low separability between them. The first is associated with pixels on which all the above-mentioned segmentation techniques assign the same segmentation label (either foreground or background), while the second corresponds to all the remaining pixels. This scheme is used to automatically discard the ambiguous region and to label the pixels of the clear region as foreground or background. A training data set is constructed based on the patches centered on the labeled pixels. Second, a convolutional neural network (CNN) is trained on the resulting data set to predict the probability of each pixel of being foreground (i.e., vein pixel), given a patch centered on it. The CNN learns what a finger-vein pattern is by learning the difference between vein patterns and background ones. The pixels in any region of a test image can then be classified effectively. Third, we propose another new and original contribution by developing and investigating a fully convolutional network to recover missing finger-vein patterns in the segmented image. The experimental results on two public finger-vein databases show a significant improvement in terms of finger-vein verification accuracy.
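
A minimal sketch of the labeling step described above: pixels on which all segmentation methods agree form the clear region used for training patches, while the remaining pixels are discarded as ambiguous; the masks below are toy examples.

```python
import numpy as np

def label_regions(segmentations):
    """Given binary masks from several handcrafted segmentation methods,
    mark pixels where all methods agree as the 'clear' region and the rest
    as 'ambiguous'. Labels: 1 = vein (foreground), 0 = background."""
    stack = np.stack(segmentations).astype(bool)
    all_fg = stack.all(axis=0)
    all_bg = (~stack).all(axis=0)
    clear = all_fg | all_bg      # keep: pixels usable as labeled training patches
    ambiguous = ~clear           # discard from the training set
    labels = np.where(all_fg, 1, 0)
    return labels, clear, ambiguous

masks = [np.array([[1, 1, 0], [0, 1, 0]]),
         np.array([[1, 0, 0], [0, 1, 1]]),
         np.array([[1, 1, 0], [0, 1, 0]])]
labels, clear, ambiguous = label_regions(masks)
print(clear.astype(int))  # pixels where all three toy methods agree
```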

Journal ArticleDOI
Hassan Salmani
TL;DR: Using an unsupervised clustering analysis, the paper shows that the controllability and observability characteristics of Trojan gates present significant inter-cluster distance from those of genuine gates in a Trojan-inserted circuit, such that Trojan gates are easily distinguishable.
Abstract: This paper presents a novel hardware Trojan detection technique in gate-level netlist based on the controllability and observability analyses. Using an unsupervised clustering analysis, the paper shows that the controllability and observability characteristics of Trojan gates present significant inter-cluster distance from those of genuine gates in a Trojan-inserted circuit, such that Trojan gates are easily distinguishable. The proposed technique does not require any golden model and can be easily integrated into the current integrated circuit design flow. Furthermore, it performs a static analysis and does not require any test pattern application for Trojan activation either partially or fully. In addition, the timing complexity of the proposed technique is an order of the number of signals in a circuit. Moreover, the proposed technique makes it possible to fully restore an inserted Trojan and to isolate its trigger and payload circuits. The technique has been applied on various types of Trojans, and all Trojans are successfully detected with 0 false positive and negative rates in less than 14 s in the worst case.
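
A minimal sketch of the clustering idea: each gate is represented by testability-style (controllability, observability) features and clustered without a golden model, so that a small, distant cluster can be flagged as suspicious; the feature values and the two-cluster setup are illustrative assumptions, not the paper's exact analysis.

```python
import numpy as np
from sklearn.cluster import KMeans

# Fabricated (controllability, observability) features: most gates sit in a
# dense bulk, while a handful of "Trojan" gates are far away from it.
genuine = np.random.default_rng(0).normal(loc=[5, 5], scale=2.0, size=(200, 2))
trojan = np.random.default_rng(1).normal(loc=[60, 80], scale=3.0, size=(6, 2))
features = np.vstack([genuine, trojan])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
# Flag the smaller cluster (distant from the bulk of gates) as suspicious.
sizes = np.bincount(km.labels_)
suspicious = np.flatnonzero(km.labels_ == sizes.argmin())
print(f"{len(suspicious)} gates flagged as potential Trojan gates")
```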

Journal ArticleDOI
TL;DR: It is shown that piggybacking operations not only concern app code but also extensively manipulate app resource files, largely contradicting common beliefs.
Abstract: The Android packaging model offers ample opportunities for malware writers to piggyback malicious code in popular apps, which can then be easily spread to a large user base. Although recent research has produced approaches and tools to identify piggybacked apps, the literature lacks a comprehensive investigation into this phenomenon. We fill this gap by: 1) systematically building a large set of piggybacked and benign app pairs, which we release to the community; 2) empirically studying the characteristics of malicious piggybacked apps in comparison with their benign counterparts; and 3) providing insights on piggybacking processes. Among several findings that provide insights on which analysis techniques should build to improve the overall detection and classification accuracy of piggybacked apps, we show that piggybacking operations not only concern app code but also extensively manipulate app resource files, largely contradicting common beliefs. We also find that piggybacking is done with little sophistication, in many cases automatically, and often via library code.

Journal ArticleDOI
TL;DR: An attack impact model is derived and an optimal attack is analyzed, consisting of a series of FDIs that minimizes the remaining time until the onset of disruptive remedial actions, leaving the shortest time for the grid to counteract.
Abstract: This paper studies the impact of false data injection (FDI) attacks on automatic generation control (AGC), a fundamental control system used in all power grids to maintain the grid frequency at a nominal value. Attacks on the sensor measurements for AGC can cause frequency excursion that triggers remedial actions, such as disconnecting customer loads or generators, leading to blackouts, and potentially costly equipment damage. We derive an attack impact model and analyze an optimal attack, consisting of a series of FDIs that minimizes the remaining time until the onset of disruptive remedial actions, leaving the shortest time for the grid to counteract. We show that, based on eavesdropped sensor data and a few feasible-to-obtain system constants, the attacker can learn the attack impact model and achieve the optimal attack in practice. This paper provides essential understanding on the limits of physical impact of the FDIs on power grids, and provides an analysis framework to guide the protection of sensor data links. For countermeasures, we develop efficient algorithms to detect the attack, estimate which sensor data links are under attack, and mitigate attack impact. Our analysis and algorithms are validated by experiments on a physical 16-bus power system test bed and extensive simulations based on a 37-bus power system model.

Journal ArticleDOI
TL;DR: A generic scheme that uses cryptographic commitment schemes to counter BWH attack is proposed that protects a pool from rogue miners as well as rogue pool administrators and is so designed that the administrator cannot cheat on the entire pool.
Abstract: We address two problems: first, we study a variant of the block withholding (BWH) attack in Bitcoin and second, we propose solutions to prevent all existing types of BWH attacks in Bitcoin. We analyze the strategies of a selfish Bitcoin miner who, in connivance with one pool, attacks another pool and receives reward from the former mining pool for attacking the latter. We name this attack the “sponsored block withholding attack.” We present a detailed quantitative analysis of the monetary incentive that a selfish miner can earn by adopting this strategy under different scenarios. We prove that under certain conditions, the attacker can maximize her revenue by adopting some strategies and by utilizing her computing power wisely. We also show that an attacker may use this strategy to attack both pools to earn a higher amount of incentives. More importantly, we present a strategy that can effectively counter block withholding attacks in any mining pool. First, we propose a generic scheme that uses cryptographic commitment schemes to counter the BWH attack. Then, we suggest an alternative implementation of the same scheme using a hash function. Our scheme protects a pool from rogue miners as well as rogue pool administrators. The scheme and its variant defend against the BWH attack by making it impossible for the miners to distinguish between a partial proof of work and a complete proof of work. The scheme is so designed that the administrator cannot cheat on the entire pool. The scheme can be implemented by making minor changes to the existing Bitcoin protocol. We also analyze the security of the scheme.
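
A minimal commit-reveal sketch of the hash-based cryptographic commitment primitive the countermeasure builds on; the pool protocol itself (hiding whether a share is a partial or complete proof of work) is not reproduced here.

```python
import hashlib
import secrets

def commit(value: bytes):
    """Hash-based commitment: publish the digest now, reveal (value, nonce) later."""
    nonce = secrets.token_bytes(16)
    digest = hashlib.sha256(nonce + value).hexdigest()
    return digest, nonce

def verify(digest: str, value: bytes, nonce: bytes) -> bool:
    """Check that the revealed value and nonce match the earlier commitment."""
    return hashlib.sha256(nonce + value).hexdigest() == digest

# A miner commits to its share before learning whether it is a full solution.
digest, nonce = commit(b"candidate-share")
print(verify(digest, b"candidate-share", nonce))  # True
print(verify(digest, b"tampered-share", nonce))   # False
```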

Journal ArticleDOI
TL;DR: This paper introduces the first-of-its-kind silicone mask attack database which contains 130 real and attacked videos to facilitate research in developing presentation attack detection algorithms for this challenging scenario.
Abstract: In movies, film stars portray another identity or obfuscate their identity with the help of silicone/latex masks. Such realistic masks are now easily available and are used for entertainment purposes. However, their usage in criminal activities to deceive law enforcement and automatic face recognition systems is also plausible. Therefore, it is important to guard biometrics systems against such realistic presentation attacks. This paper introduces the first-of-its-kind silicone mask attack database which contains 130 real and attacked videos to facilitate research in developing presentation attack detection algorithms for this challenging scenario. Along with the silicone mask, several other presentation attack instruments have been explored in the literature. The next contribution of this research is a novel multilevel deep dictionary learning-based presentation attack detection algorithm that can discern different kinds of attacks. An efficient greedy layer-by-layer training approach is formulated to learn the deep dictionaries, followed by an SVM to classify an input sample as genuine or attacked. Experiments are performed on the proposed SMAD database, some samples with real-world silicone mask attacks, and four existing presentation attack databases, namely, replay-attack, CASIA-FASD, 3DMAD, and UVAD. The results show that the proposed algorithm yields better performance compared with state-of-the-art algorithms, in both intra-database and cross-database experiments.

Journal ArticleDOI
TL;DR: This paper considers a photo response non-uniformity analysis and focuses on the detection of small forgeries, adopting a recently proposed paradigm of multi-scale analysis and discussing various strategies for its implementation.
Abstract: Accurate unsupervised tampering localization is one of the most challenging problems in digital image forensics. In this paper, we consider a photo response non-uniformity analysis and focus on the detection of small forgeries. For this purpose, we adopt a recently proposed paradigm of multi-scale analysis and discuss various strategies for its implementation. First, we consider a multi-scale fusion approach, which involves the combination of multiple candidate tampering probability maps into a single, more reliable decision map. The candidate maps are obtained with sliding windows of various sizes and thus allow us to exploit the benefits of both the small- and large-scale analyses. We extend this approach by introducing modulated threshold drift and content-dependent neighborhood interactions, leading to improved localization performance with superior shape representation and easier detection of small forgeries. We also discuss two novel alternative strategies: a segmentation-guided approach, which contracts the decision statistic to a central segment within each analysis window, and an adaptive-window approach, which dynamically chooses the analysis window size for each location in the image. We perform extensive experimental evaluation on both synthetic and realistic forgeries and discuss in detail practical aspects of parameter selection. Our evaluation shows that the multi-scale analysis leads to significant performance improvement compared with the commonly used single-scale approach. The proposed multi-scale fusion strategy delivers stable results with consistent improvement in various test scenarios.
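
A minimal sketch of multi-scale fusion: candidate tampering-probability maps from different window sizes are averaged into one decision map, with a simple agreement-driven threshold drift; the fusion weights and drift rule are illustrative stand-ins for the paper's strategies.

```python
import numpy as np

def fuse_multiscale(candidate_maps, weights=None, base_threshold=0.5, drift=0.1):
    """Fuse tampering-probability maps from several window sizes into one
    decision map. Weighted averaging plus a threshold that drifts lower where
    the scales agree; both rules are illustrative, not the paper's exact ones."""
    maps = np.stack(candidate_maps)                          # (scales, H, W)
    weights = np.ones(len(maps)) if weights is None else np.asarray(weights, float)
    fused = np.tensordot(weights / weights.sum(), maps, axes=1)
    agreement = (maps > 0.5).mean(axis=0)                    # fraction of scales voting "tampered"
    threshold = base_threshold - drift * (agreement - 0.5)   # lower threshold where scales agree
    return fused, fused > threshold

small_scale = np.random.default_rng(0).random((32, 32))      # toy probability maps
large_scale = np.random.default_rng(1).random((32, 32))
fused, decision = fuse_multiscale([small_scale, large_scale])
print(f"fraction flagged as tampered: {decision.mean():.3f}")
```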

Journal ArticleDOI
TL;DR: A content-aware search scheme is proposed that makes semantic search smarter and employs the technology of multi-keyword ranked search over encrypted cloud data as the basis, against two threat models, to resolve the problem of privacy-preserving smart semantic search.
Abstract: Searchable encryption is an important research area in cloud computing. However, most existing efficient and reliable ciphertext search schemes are based on keywords or shallow semantic parsing, which are not smart enough to meet users’ search intentions. Therefore, in this paper, we propose a content-aware search scheme, which can make semantic search smarter. First, we introduce conceptual graphs (CGs) as a knowledge representation tool. Then, we present our two schemes (PRSCG and PRSCG-TF) based on CGs according to different scenarios. In order to conduct numerical calculation, we transform the original CGs into their linear form with some modification and map them to numerical vectors. Second, we employ the technology of multi-keyword ranked search over encrypted cloud data as the basis against two threat models and propose PRSCG and PRSCG-TF to resolve the problem of privacy-preserving smart semantic search based on CGs. Finally, we choose a real-world data set, the CNN data set, to test our scheme. We also analyze the privacy and efficiency of the proposed schemes in detail. The experimental results show that our proposed schemes are efficient.

Journal ArticleDOI
TL;DR: This paper introduces DeepCAPTCHA, a new and secure CAPTCHA scheme based on adversarial examples, an inherent limitation of current DL networks, and implements a proof-of-concept system, which shows that the scheme offers high security and good usability compared with the best previously existing CAPTCHAs.
Abstract: Recent advances in deep learning (DL) allow for solving complex AI problems that used to be considered very hard. While this progress has advanced many fields, it is considered to be bad news for Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs), the security of which rests on the hardness of some learning problems. In this paper, we introduce DeepCAPTCHA, a new and secure CAPTCHA scheme based on adversarial examples, an inherent limitation of current DL networks. These adversarial examples are constructed inputs, either synthesized from scratch or computed by adding a small and specific perturbation called adversarial noise to correctly classified items, causing the targeted DL network to misclassify them. We show that plain adversarial noise is insufficient to achieve secure CAPTCHA schemes, which leads us to introduce immutable adversarial noise, an adversarial noise that is resistant to removal attempts. In this paper, we implement a proof-of-concept system, and its analysis shows that the scheme offers high security and good usability compared with the best previously existing CAPTCHAs.
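
A minimal fast-gradient-sign sketch of adversarial noise on a logistic (linear) classifier, showing how a small, specific perturbation shifts a prediction; the linear model stands in for a deep network, and the paper's immutable-noise construction is not reproduced.

```python
import numpy as np

def fgsm_linear(x, y, w, b, eps=0.1):
    """Fast-gradient-sign perturbation for a logistic classifier
    p(y=1|x) = sigmoid(w.x + b); moves x in the direction that
    increases the cross-entropy loss, pushing it toward misclassification."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    grad_x = (p - y) * w                      # d(loss)/dx for cross-entropy
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

rng = np.random.default_rng(0)
w, b = rng.normal(size=64), 0.0               # toy "network" weights
x, y = rng.random(64), 1                      # a correctly classified item of class 1
x_adv = fgsm_linear(x, y, w, b)
print("clean score:", w @ x + b, " adversarial score:", w @ x_adv + b)
```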

Journal ArticleDOI
TL;DR: A light-weight and robust security-aware D2D-assist data transmission protocol for M-health systems is proposed by using a certificateless generalized signcryption (CLGSC) technique and performance analysis demonstrates that the proposed protocol can achieve the design objectives and outperform existing schemes in terms of computational and communication overhead.
Abstract: With the rapid advancement of technology, healthcare systems have been quickly transformed into a pervasive environment, where both challenges and opportunities abound. On the one hand, the proliferation of smart phones and advances in medical sensors and devices have driven the emergence of wireless body area networks for remote patient monitoring, also known as mobile-health (M-health), thereby providing a reliable and cost-effective way to improve the efficiency and quality of health care. On the other hand, the advances of M-health systems also generate extensive medical data, which could crowd today’s cellular networks. Device-to-device (D2D) communications have been proposed to address this challenge, but unfortunately, security threats are also emerging because of the open nature of D2D communications between medical sensors and the highly privacy-sensitive nature of medical data. Even more disconcerting, healthcare systems have many characteristics that make them more vulnerable to privacy attacks than other applications. In this paper, we propose a light-weight and robust security-aware D2D-assist data transmission protocol for M-health systems by using a certificateless generalized signcryption (CLGSC) technique. Specifically, we first propose a new efficient CLGSC scheme, which can adaptively work as one of the three cryptographic primitives: signcryption, signature, or encryption, but within one single algorithm. The scheme is proved to be secure, simultaneously achieving confidentiality and unforgeability. Based on the proposed CLGSC algorithm, we further design a D2D-assist data transmission protocol for M-health systems with security properties, including data confidentiality and integrity, mutual authentication, contextual privacy, anonymity, unlinkability, and forward security. Performance analysis demonstrates that the proposed protocol can achieve the design objectives and outperform existing schemes in terms of computational and communication overhead.

Journal ArticleDOI
TL;DR: An attribute-aware encrypted traffic classification method based on second-order Markov Chains is proposed, which can improve the classification accuracy by 29% on average compared with the state-of-the-art Markov-based method.
Abstract: With a profusion of network applications, traffic classification plays a crucial role in network management and policy-based security control. The widely used encryption transmission protocols, such as the secure socket layer/transport layer security (SSL/TLS) protocols, lead to the failure of traditional payload-based classification methods. Existing methods for encrypted traffic classification cannot achieve high discrimination accuracy for applications with similar fingerprints. In this paper, we propose an attribute-aware encrypted traffic classification method based on second-order Markov Chains. We start by exploring approaches that can further improve the performance of existing methods in terms of discrimination accuracy, and make promising observations that the application attribute bigram, which consists of the certificate packet length and the first application data size in SSL/TLS sessions, contributes to application discrimination. To increase the diversity of application fingerprints, we develop a new method by incorporating the attribute bigrams into the second-order homogeneous Markov chains. Extensive evaluation results show that the proposed method can improve the classification accuracy by 29% on average compared with the state-of-the-art Markov-based method.
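
A minimal second-order Markov fingerprint sketch over symbol sequences (e.g., binned SSL/TLS record lengths), classifying a session by log-likelihood; the paper's attribute bigram (certificate packet length and first application-data size) is not modeled here, and the sessions are toy data.

```python
from collections import defaultdict
from math import log

def train(sequences):
    """Count second-order transitions (a, b) -> c over training sessions."""
    counts, totals = defaultdict(float), defaultdict(float)
    for seq in sequences:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b, c)] += 1.0
            totals[(a, b)] += 1.0
    return counts, totals

def log_likelihood(seq, model, alpha=1.0, vocab=16):
    """Laplace-smoothed log-likelihood of a session under a trained model."""
    counts, totals = model
    ll = 0.0
    for a, b, c in zip(seq, seq[1:], seq[2:]):
        ll += log((counts[(a, b, c)] + alpha) / (totals[(a, b)] + alpha * vocab))
    return ll

app_a = train([[1, 2, 2, 3, 5], [1, 2, 3, 3, 5]])   # toy sessions of application A
app_b = train([[4, 4, 1, 1, 7], [4, 4, 1, 2, 7]])   # toy sessions of application B
test_session = [1, 2, 2, 3, 5]
print("A" if log_likelihood(test_session, app_a) > log_likelihood(test_session, app_b) else "B")
```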

Journal ArticleDOI
TL;DR: This paper proposes a novel public verification scheme for cloud storage using indistinguishability obfuscation, which requires only lightweight computation on the auditor and delegates most of the computation to the cloud; the scheme is further extended to support batch verification and data dynamic operations.
Abstract: Cloud storage services allow users to outsource their data to cloud servers to save local data storage costs. However, unlike using local storage devices, users do not physically manage the data stored on cloud servers; therefore, the data integrity of the outsourced data has become an issue. Many public verification schemes have been proposed to enable a third-party auditor to verify the data integrity for users. These schemes make an impractical assumption—the auditors have enough computation capability to bear expensive verification costs. In this paper, we propose a novel public verification scheme for cloud storage using indistinguishability obfuscation, which requires only lightweight computation on the auditor and delegates most of the computation to the cloud. We further extend our scheme to support batch verification and data dynamic operations, where multiple verification tasks from different users can be performed efficiently by the auditor and the cloud-stored data can be updated dynamically. Compared with other existing works, our scheme significantly reduces the auditor’s computation overhead. Moreover, the batch verification overhead on the auditor side in our scheme is independent of the number of verification tasks. Our scheme is practical in scenarios where data integrity verifications are executed frequently and the number of verification tasks (i.e., the number of users) is large; even if the auditor is equipped with a low-power device, it can verify the data integrity efficiently. We prove the security of our scheme under the strongest security model proposed by Shi et al. (ACM CCS 2013). Finally, we conduct a performance analysis to demonstrate that our scheme is more efficient than other existing works in terms of the auditor’s communication and computation efficiency.

Journal ArticleDOI
TL;DR: A novel heterogeneous framework is proposed to remove the single-point performance bottleneck and to provide a more efficient access control scheme with an auditing mechanism; analysis shows that the system not only guarantees the security requirements but also achieves a significant performance improvement in key generation.
Abstract: Data access control is a challenging issue in public cloud storage systems. Ciphertext-policy attribute-based encryption (CP-ABE) has been adopted as a promising technique to provide flexible, fine-grained, and secure data access control for cloud storage with honest-but-curious cloud servers. However, in the existing CP-ABE schemes, the single attribute authority must execute the time-consuming user legitimacy verification and secret key distribution, and hence, it results in a single-point performance bottleneck when a CP-ABE scheme is adopted in a large-scale cloud storage system. Users may be stuck in the waiting queue for a long period to obtain their secret keys, thereby resulting in low efficiency of the system. Although multi-authority access control schemes have been proposed, these schemes still cannot overcome the drawbacks of single-point bottleneck and low efficiency, due to the fact that each of the authorities still independently manages a disjoint attribute set. In this paper, we propose a novel heterogeneous framework to remove the single-point performance bottleneck and provide a more efficient access control scheme with an auditing mechanism. Our framework employs multiple attribute authorities to share the load of user legitimacy verification. Meanwhile, in our scheme, a central authority is introduced to generate secret keys for legitimacy-verified users. Unlike other multi-authority access control schemes, each of the authorities in our scheme manages the whole attribute set individually. To enhance security, we also propose an auditing mechanism to detect which attribute authority has incorrectly or maliciously performed the legitimacy verification procedure. Analysis shows that our system not only guarantees the security requirements but also achieves a significant performance improvement in key generation.

Journal ArticleDOI
TL;DR: This paper provides a new efficient RDPC protocol based on a homomorphic hash function that is provably secure against forgery, replace, and replay attacks under a typical security model, and gives a new optimized implementation of the ORT that makes the cost of accessing the ORT nearly constant.
Abstract: As an important application in cloud computing, cloud storage offers users scalable, flexible, and high-quality data storage and computation services. A growing number of data owners choose to outsource data files to the cloud. Because cloud storage servers are not fully trustworthy, data owners need dependable means to check possession of their files outsourced to remote cloud servers. To address this crucial problem, some remote data possession checking (RDPC) protocols have been presented. However, many existing schemes have vulnerabilities in efficiency or data dynamics. In this paper, we provide a new efficient RDPC protocol based on a homomorphic hash function. The new scheme is provably secure against forgery, replace, and replay attacks under a typical security model. To support data dynamics, an operation record table (ORT) is introduced to track operations on file blocks. We further give a new optimized implementation for the ORT, which makes the cost of accessing the ORT nearly constant. Moreover, we provide a comprehensive performance analysis, which shows that our scheme has advantages in computation and communication costs. Prototype implementation and experiments show that the scheme is feasible for real applications.
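
A minimal sketch of a multiplicative homomorphic hash, H(m) = g^m mod p, whose homomorphic property H(m1 + m2) = H(m1)·H(m2) mod p underlies integrity checks of this kind; the parameters are toy-sized, and the paper's RDPC protocol and ORT bookkeeping are not reproduced.

```python
# Multiplicative homomorphic hash sketch. The prime below is illustrative
# only and far too small for real use.
P = 2_147_483_647   # the Mersenne prime 2^31 - 1
G = 16807           # a primitive root modulo P

def H(m: int) -> int:
    """Hash a (non-negative integer) block as g^m mod p."""
    return pow(G, m, P)

m1, m2 = 123_456, 654_321
# Homomorphic property: the hash of a sum equals the product of the hashes.
assert H(m1 + m2) == (H(m1) * H(m2)) % P
print("homomorphic property holds:", H(m1 + m2) == (H(m1) * H(m2)) % P)
```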

Journal ArticleDOI
TL;DR: In this paper, the authors studied the minimum download cost of PIR schemes for arbitrary message lengths under arbitrary choices of alphabet (not restricted to finite fields) for the message and download symbols, and showed that the optimal download cost is characterized to within two symbols.
Abstract: A private information retrieval (PIR) scheme is a mechanism that allows a user to retrieve any one out of $K$ messages from $N$ non-communicating replicated databases, each of which stores all $K$ messages, without revealing anything (in the information theoretic sense) about the identity of the desired message index to any individual database. If the size of each message is $L$ bits and the total download required by a PIR scheme from all $N$ databases is $D$ bits, then $D$ is called the download cost and the ratio $L/D$ is called an achievable rate. For fixed $K, N \in \mathbb{N}$, the capacity of PIR, denoted by $C$, is the supremum of achievable rates over all PIR schemes and over all message sizes, and was recently shown to be $C = (1 + 1/N + 1/N^{2} + \cdots + 1/N^{K-1})^{-1}$. In this paper, for arbitrary $K$ and $N$, we explore the minimum download cost $D_{L}$ across all PIR schemes (not restricted to linear schemes) for arbitrary message lengths $L$ under arbitrary choices of alphabet (not restricted to finite fields) for the message and download symbols. If the same $M$-ary alphabet is used for the message and download symbols, then we show that the optimal download cost in $M$-ary symbols is $D_{L} = \lceil L/C \rceil$. If the message symbols are in an $M$-ary alphabet and the downloaded symbols are in an $M'$-ary alphabet, then we show that the optimal download cost in $M'$-ary symbols satisfies $D_{L} \in \{\lceil L'/C \rceil, \lceil L'/C \rceil - 1, \lceil L'/C \rceil - 2\}$, where $L' = \lceil L \log_{M'} M \rceil$, i.e., the optimal download cost is characterized to within two symbols.
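
The capacity formula and the download cost expression from the abstract can be evaluated directly; the small example below assumes the message and download symbols share the same alphabet.

```python
from math import ceil

def pir_capacity(N: int, K: int) -> float:
    """C = (1 + 1/N + ... + 1/N^(K-1))^(-1), the PIR capacity from the abstract."""
    return 1.0 / sum(N ** -i for i in range(K))

def optimal_download_cost(L: int, N: int, K: int) -> int:
    """D_L = ceil(L / C) when message and download symbols use the same alphabet."""
    return ceil(L / pir_capacity(N, K))

# Example: K = 2 messages, N = 2 databases, messages of L = 1000 symbols.
print(pir_capacity(2, 2))                  # 0.666... (= 2/3)
print(optimal_download_cost(1000, 2, 2))   # 1500 symbols
```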

Journal ArticleDOI
TL;DR: This paper exposes a potential vulnerability of partial fingerprint-based authentication systems, especially when multiple impressions are enrolled per finger, and indicates that it is indeed possible to locate or generate partial fingerprints that can be used to impersonate a large number of users.
Abstract: This paper investigates the security of partial fingerprint-based authentication systems, especially when multiple fingerprints of a user are enrolled. A number of consumer electronic devices, such as smartphones, are beginning to incorporate fingerprint sensors for user authentication. The sensors embedded in these devices are generally small and the resulting images are, therefore, limited in size. To compensate for the limited size, these devices often acquire multiple partial impressions of a single finger during enrollment to ensure that at least one of them will successfully match with the image obtained from the user during authentication. Furthermore, in some cases, the user is allowed to enroll multiple fingers, and the impressions pertaining to multiple partial fingers are associated with the same identity (i.e., one user). A user is said to be successfully authenticated if the partial fingerprint obtained during authentication matches any one of the stored templates. This paper investigates the possibility of generating a “MasterPrint,” a synthetic or real partial fingerprint that serendipitously matches one or more of the stored templates for a significant number of users. Our preliminary results on an optical fingerprint data set and a capacitive fingerprint data set indicate that it is indeed possible to locate or generate partial fingerprints that can be used to impersonate a large number of users. In this regard, we expose a potential vulnerability of partial fingerprint-based authentication systems, especially when multiple impressions are enrolled per finger.

Journal ArticleDOI
TL;DR: This paper enhances the least-effort attack model, which computes the minimum number of sensors that must be compromised to manipulate a given number of states, and develops an effective greedy algorithm for optimal PMU placement to defend against data integrity attacks.
Abstract: State estimation plays a critical role in self-detection and control of the smart grid. Data integrity attacks (also known as false data injection attacks) have shown significant potential in undermining the state estimation of power systems, and corresponding countermeasures have drawn increased scholarly interest. Nonetheless, leveraging optimal phasor measurement unit (PMU) placement to defend against these attacks, while simultaneously ensuring the system observability, has yet to be addressed without incurring significant overhead. In this paper, we enhance the least-effort attack model, which computes the minimum number of sensors that must be compromised to manipulate a given number of states, and develop an effective greedy algorithm for optimal PMU placement to defend against data integrity attacks. Regarding the least-effort attack model, we prove the existence of the smallest set of sensors to compromise and propose a feasible reduced row echelon form (RRE)-based method to efficiently compute the optimal attack vector. Based on the IEEE standard systems, we validate the efficiency of the RRE algorithm in terms of low computation complexity. Regarding the defense strategy, we propose an effective PMU-based greedy algorithm, which can not only defend against data integrity attacks but also ensure the system observability with low overhead. The experimental results obtained based on various IEEE standard systems show the effectiveness of the proposed defense scheme against data integrity attacks.
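
A minimal greedy sketch of PMU placement for observability, where a PMU at a bus observes that bus and its neighbors and each step places the PMU with the largest coverage gain; this illustrates the greedy flavor only, not the paper's attack-aware objective or full observability model, and the toy system is fabricated.

```python
import numpy as np

def greedy_pmu_placement(adjacency, budget):
    """Place `budget` PMUs greedily: a PMU at a bus observes the bus and its
    neighbors; each step picks the bus that newly observes the most buses."""
    n = len(adjacency)
    observed, placed = set(), []
    for _ in range(budget):
        gains = []
        for bus in range(n):
            covered = {bus} | set(np.flatnonzero(adjacency[bus]))
            gains.append(len(covered - observed))
        best = int(np.argmax(gains))
        placed.append(best)
        observed |= {best} | set(np.flatnonzero(adjacency[best]))
    return placed, observed

# Toy 5-bus system: adjacency[i][j] = 1 if buses i and j share a line.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 0, 1],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 0]])
print(greedy_pmu_placement(A, budget=2))  # two PMUs observe all five buses
```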

Journal ArticleDOI
TL;DR: The proposed ASF provides efficient authentication and key agreement and enables device (identity and data) anonymity and unlinkability; the computational complexity of the framework is shown to be low compared with that of existing schemes, while security is significantly improved.
Abstract: The smart home is an environment where heterogeneous electronic devices and appliances are networked together to provide smart services to individuals in a ubiquitous manner. As homes become smarter, more complex, and more technology dependent, the need for an adequate security mechanism with minimal intervention from the individual is growing. Recent serious security attacks have shown how Internet-enabled smart homes can be turned into very dangerous spots for various ill intentions, and thus raise privacy concerns for individuals. For instance, an eavesdropper is able to derive the identity of a particular device/appliance via public channels, which can be used to infer the life pattern of an individual within the home area network. This paper proposes an anonymous secure framework (ASF) for connected smart home environments, using solely lightweight operations. The proposed framework provides efficient authentication and key agreement, and enables device (identity and data) anonymity and unlinkability. One-time session key progression regularly renews the session key for the smart devices and dilutes the risk of using a compromised session key in the ASF. It is demonstrated that the computational complexity of the proposed framework is low compared with that of existing schemes, while security has been significantly improved.

Journal ArticleDOI
TL;DR: The proposed Lfun scheme can discover “changed” spam tweets from unlabeled tweets and incorporate them into the classifier's training process, significantly improving spam detection accuracy in real-world scenarios.
Abstract: Twitter spam has become a critical problem nowadays. Recent works focus on applying machine learning techniques for Twitter spam detection, which make use of the statistical features of tweets. In our labeled tweets data set, however, we observe that the statistical properties of spam tweets vary over time, and thus, the performance of existing machine learning-based classifiers decreases. This issue is referred to as “Twitter Spam Drift”. In order to tackle this problem, we first carry out a deep analysis of the statistical features of one million spam tweets and one million non-spam tweets, and then propose a novel Lfun scheme. The proposed scheme can discover “changed” spam tweets from unlabeled tweets and incorporate them into the classifier's training process. A number of experiments are performed to evaluate the proposed scheme. The results show that our proposed Lfun scheme can significantly improve the spam detection accuracy in real-world scenarios.

Journal ArticleDOI
TL;DR: DAPASA, an approach to detect Android piggybacked apps through sensitive subgraph analysis, is proposed and exhibits an impressive detection performance compared with that of three baseline approaches even with only five numeric features.
Abstract: With the exponential growth of smartphone adoption, malware attacks on smartphones have resulted in serious threats to users, especially those on popular platforms, such as Android. Most Android malware is generated by piggybacking malicious payloads into benign applications (apps), which are called piggybacked apps. In this paper, we propose DAPASA, an approach to detect Android piggybacked apps through sensitive subgraph analysis. Two assumptions are established to reflect the different invocation patterns of sensitive APIs in the injected malicious payloads (rider) of a piggybacked app and in its host app (carrier). With these two assumptions, DAPASA generates a sensitive subgraph (SSG) to profile the most suspicious behavior of an app. Five features are constructed from SSG to depict the invocation patterns. The five features are fed into the machine learning algorithms to detect whether the app is piggybacked or benign. DAPASA is evaluated on a large real-world data set consisting of 2551 piggybacked apps and 44 921 popular benign apps. Extensive evaluation results demonstrate that the proposed approach exhibits an impressive detection performance compared with that of three baseline approaches even with only five numeric features. Furthermore, the proposed approach can complement permission-based approaches and API-based approaches with the combination of our five features from a new perspective of the invocation structure.