Proceedings ArticleDOI

Generalized pattern matching string search on encrypted data in cloud systems

TL;DR: This paper proposes a scheme for Generalized Pattern-matching String-search on Encrypted data (GPSE) in cloud systems and implements two of the most commonly used pattern-matching search functions on encrypted data: substring matching and longest-prefix-first matching.
Abstract: Searchable encryption is an important and challenging problem. It allows people to search on encrypted data, which is very useful as more and more people choose to host their data in the cloud while the cloud server is not fully trusted. Existing solutions for searchable encryption are limited to simple search functions, such as boolean search or similarity search. In this paper, we propose a scheme for Generalized Pattern-matching String-search on Encrypted data (GPSE) in cloud systems. GPSE allows users to specify their search queries by using generalized wildcard-based string patterns (such as SQL-like patterns), which gives users great expressive power in specifying highly targeted search queries. Within the GPSE framework, we implement two of the most commonly used pattern-matching search functions on encrypted data: substring matching and longest-prefix-first matching. We also prove that GPSE is secure under the known-plaintext model. Experiments over real data sets show that GPSE achieves high search accuracy.
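
To make the phrase "wildcard-based string patterns (such as SQL-like patterns)" concrete, here is a minimal plaintext sketch of SQL-LIKE-style matching. It involves no encryption and is not the GPSE construction; the wildcard semantics (`%` for any run of characters, `_` for exactly one character) and the function names `like_to_regex` and `like_match` are assumptions chosen for illustration.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL-LIKE-style pattern into a compiled regular expression.

    Assumed semantics (standard SQL LIKE, not taken from the paper):
    '%' matches any run of characters, '_' matches exactly one character,
    and every other character matches itself literally.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # neutralise regex metacharacters
    return re.compile("".join(parts), re.DOTALL)


def like_match(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches the LIKE-style pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None


if __name__ == "__main__":
    print(like_match("%crypt%", "encrypted"))  # substring-style query
    print(like_match("clo_d", "cloud"))        # single-character wildcard
```

Under these assumed semantics, a substring query is simply the pattern `%sub%`. The hard part addressed by the paper, evaluating such patterns without revealing the plaintext to the server, is not shown here.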


Citations
Journal ArticleDOI
27 Apr 2018
TL;DR: This paper systematically analyzes the privacy leaks and potential threats in the task matching and proposes a single-keyword task matching scheme for the multirequester/multiworker crowdsourcing with efficient worker revocation that is secure and feasible.
Abstract: With the development of the sharing economy, crowdsourcing as a distributed computing paradigm has become increasingly pervasive. As one of the indispensable services for most crowdsourcing applications, task matching has also been extensively explored. However, privacy issues are usually ignored during task matching, and few existing privacy-preserving crowdsourcing mechanisms can simultaneously protect both task privacy and worker privacy. This paper systematically analyzes the privacy leaks and potential threats in task matching and proposes a single-keyword task matching scheme for multirequester/multiworker crowdsourcing with efficient worker revocation. The proposed scheme not only protects data confidentiality and identity anonymity against the crowd-server, but also achieves query traceability against dishonest or revoked workers. Detailed privacy analysis and thorough performance evaluation show that the proposed scheme is secure and feasible.

56 citations


Cites background from "Generalized pattern matching string..."

  • ..., fuzzy keyword matching [11], Boolean matching [14], and pattern matching [15]....

    [...]

  • ...Since the first searchable symmetric encryption (SSE) [10], fuzzy query [11], ranked query [12], personalized query [13], Boolean query [14], and pattern query [15] have been extensively explored in this model....

    [...]

Journal ArticleDOI
TL;DR: Wang et al. propose matrix-based multi-keyword fuzzy search (M2FS) schemes, which support approximate keyword matching by exploiting the indecomposable property of primes.
Abstract: With the ever-increasing amount of data residing in the cloud, how to provide users with secure and practical query services has become key to improving the quality of cloud services. Fuzzy searchable encryption (FSE) is identified as one of the most promising approaches for enabling secure query services, since it allows searching encrypted data by using keywords with spelling errors. However, existing FSE schemes are far from practical use for the following reasons: (1) Inflexibility. It is hard for them to simultaneously support AND and OR semantics in a multi-keyword query. (2) Inefficiency. They require sequentially scanning a whole dataset to find matched files, and thus are difficult to apply to a large-scale dataset. (3) Limited robustness. It is difficult for them to resist the linear analysis attack in the known-background model. To fix the above problems, this article proposes matrix-based multi-keyword fuzzy search (M2FS) schemes, which support approximate keyword matching by exploiting the indecomposable property of primes. Specifically, we first present a basic scheme, called M2FS-B, where multiple keywords in a query or a file are constructed as prime-related matrices such that the result of matrix multiplication can be employed to determine the level of matching for different query semantics. Then, we construct an advanced scheme, named M2FS-E, which builds a searchable index as a keyword balanced binary (KBB) tree for dynamic and parallel searches, while adding random noise to a query matrix for enhanced robustness. Extensive analyses and experiments demonstrate the validity of our M2FS schemes.

41 citations

Journal ArticleDOI
TL;DR: A prime inner product encoding (PIPE) scheme, which makes use of the indecomposable property of prime numbers to provide efficient, highly accurate, and flexible multi-keyword fuzzy search.
Abstract: With the prevalence of cloud computing, a growing number of users are delegating clouds to host their sensitive data. To preserve user privacy, it is suggested that data be encrypted before outsourcing. However, data encryption makes keyword-based searches over ciphertexts extremely difficult. This is even more challenging for fuzzy search, which allows uncertainties or misspellings of keywords in a query. In this paper, we propose a prime inner product encoding (PIPE) scheme, which makes use of the indecomposable property of prime numbers to provide efficient, highly accurate, and flexible multi-keyword fuzzy search. Our main idea is to encode either a query keyword or an index keyword into a vector filled with primes or reciprocals of primes, such that the result of the vectors' inner product is an integer only when two keywords are similar. Specifically, we first construct PIPE0, which is secure in the known ciphertext model. Unlike existing works that have difficulty supporting AND and OR semantics simultaneously, PIPE0 gives users the flexibility to specify different search semantics in their queries. Then, we construct PIPES, which subtly adds random noise to a query vector to resist linear analyses. Both theoretical analyses and experimental results demonstrate the effectiveness of our scheme.

37 citations


Cites background from "Generalized pattern matching string..."

  • ...[25] proposed a scheme for a generalized pattern-matching string-search....

    [...]

Proceedings ArticleDOI
01 Aug 2016
TL;DR: A verifiable and dynamic fuzzy keyword search (VDFS) scheme is proposed to offer secure fuzzy keyword search, update the outsourced document collection, and verify the authenticity of the search result; rigorous security analysis proves that it achieves universally composable (UC) security.
Abstract: In recent years, cloud computing has become more and more popular. Users outsource large amounts of encrypted documents to the cloud in order to avoid information leakage. Searchable encryption is a desirable service that enables users to search on encrypted data. Most existing searchable encryption schemes only provide exact keyword search. Fuzzy keyword search improves system usability because it tolerates spelling errors and format inconsistencies. Besides, verifiable encryption schemes usually consider a semitrusted server and verify the authenticity of the search results. However, the server may be malicious: it may modify or delete some encrypted files, or forge erroneous results, in order to save its storage space or computation ability. In this paper, we investigate the searchable encryption problem in the presence of a malicious server, where verifiable searchability is needed to give users the ability to detect potential misbehavior. We propose a verifiable and dynamic fuzzy keyword search (VDFS) scheme to offer secure fuzzy keyword search, update the outsourced document collection, and verify the authenticity of the search result. Rigorous security analysis proves that our scheme achieves universally composable (UC) security.

33 citations


Cites background from "Generalized pattern matching string..."

  • ...[20] proposed a generalized pattern-matching string-search scheme based on generalized wildcard-based string patterns....

    [...]

Journal ArticleDOI
TL;DR: An Efficient Leakage-resilient Multi-keyword Fuzzy Search (EliMFS) framework over encrypted cloud data is proposed, along with two specific schemes that resist attacks exploiting the leakage of its two-stage index structure in different threat models.
Abstract: Motivated by privacy preservation requirements for outsourced data, keyword searches over encrypted cloud data have become a hot topic. Compared to single-keyword exact searches, multi-keyword fuzzy search schemes attract more attention because of their improvements in search accuracy, typo tolerance, and user experience in general. However, existing multi-keyword fuzzy search solutions are not sufficiently efficient when the file set in the cloud is large. To address this, we propose an Efficient Leakage-resilient Multi-keyword Fuzzy Search (EliMFS) framework over encrypted cloud data. In this framework, a novel two-stage index structure is exploited to ensure that search time is independent of file set size. The multi-keyword fuzzy search function is achieved through a delicate design based on the Gram Counting Order, the Bloom filter, and the Locality-Sensitive Hashing. Furthermore, considering the leakages caused by the two-stage index structure, we propose two specific schemes to resist these potential attacks in different threat models. Extensive analysis and experiments show that our schemes are highly efficient and leakage-resilient.

24 citations


Cites methods from "Generalized pattern matching string..."

  • ...[37] supported a pattern-matching string search, a more flexible method than a general boolean search SSE....

    [...]

References
Proceedings ArticleDOI
27 Oct 1986
TL;DR: A new tool for controlling the knowledge transfer process in cryptographic protocol design is introduced and it is applied to solve a general class of problems which include most of the two-party cryptographic problems in the literature.
Abstract: In this paper we introduce a new tool for controlling the knowledge transfer process in cryptographic protocol design. It is applied to solve a general class of problems which include most of the two-party cryptographic problems in the literature. Specifically, we show how two parties A and B can interactively generate a random integer N = p·q such that its secret, i.e., the prime factors (p, q), is hidden from either party individually but is recoverable jointly if desired. This can be utilized to give a protocol for two parties with private values i and j to compute any polynomially computable functions f(i,j) and g(i,j) with minimal knowledge transfer and a strong fairness property. As a special case, A and B can exchange a pair of secrets sA, sB, e.g. the factorization of an integer and a Hamiltonian circuit in a graph, in such a way that sA becomes computable by B when and only when sB becomes computable by A. All these results are proved assuming only that the problem of factoring large integers is computationally intractable.

3,463 citations

Journal ArticleDOI
TL;DR: An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings, showing that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.
Abstract: An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings. The constant of proportionality is low enough to make this algorithm of practical use, and the procedure can also be extended to deal with some more general pattern-matching problems. A theoretical application of the algorithm shows that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time. Other algorithms which run even faster on the average are also considered.

3,156 citations
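
The abstract above describes finding all occurrences of one string within another in time proportional to the sum of their lengths. A compact sketch of the usual prefix-function formulation of that idea follows; the function names `failure_table` and `find_all` are illustrative choices, and details may differ from the original presentation.

```python
from typing import List


def failure_table(pattern: str) -> List[int]:
    """fail[i] is the length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of pattern[:i+1]."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_all(pattern: str, text: str) -> List[int]:
    """Return the start index of every (possibly overlapping) occurrence."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    hits = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going so overlapping matches are found
    return hits


if __name__ == "__main__":
    print(find_all("aba", "ababa"))  # [0, 2]
```

The text index never moves backwards, and each fallback step undoes an earlier advance of `k`, which is where the linear bound comes from.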

Proceedings ArticleDOI
30 Oct 2006
TL;DR: In this paper, the authors proposed a searchable symmetric encryption (SSE) scheme for the multi-user setting, where queries to the server can be chosen adaptively during the execution of the search.
Abstract: Searchable symmetric encryption (SSE) allows a party to outsource the storage of its data to another party (a server) in a private manner, while maintaining the ability to selectively search over it. This problem has been the focus of active research in recent years. In this paper we show two solutions to SSE that simultaneously enjoy the following properties: Both solutions are more efficient than all previous constant-round schemes. In particular, the work performed by the server per returned document is constant as opposed to linear in the size of the data. Both solutions enjoy stronger security guarantees than previous constant-round schemes. In fact, we point out subtle but serious problems with previous notions of security for SSE, and show how to design constructions which avoid these pitfalls. Further, our second solution also achieves what we call adaptive SSE security, where queries to the server can be chosen adaptively (by the adversary) during the execution of the search; this notion is both important in practice and has not been previously considered. Surprisingly, despite being more secure and more efficient, our SSE schemes are remarkably simple. We consider the simplicity of both solutions as an important step towards the deployment of SSE technologies. As an additional contribution, we also consider multi-user SSE. All prior work on SSE studied the setting where only the owner of the data is capable of submitting search queries. We consider the natural extension where an arbitrary group of parties other than the owner can submit search queries. We formally define SSE in the multi-user setting, and present an efficient construction that achieves better performance than simply using access control mechanisms.

1,673 citations

Proceedings ArticleDOI
14 Mar 2010
TL;DR: This paper formalizes and solves the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. It exploits edit distance to quantify keyword similarity and develops an advanced technique for constructing fuzzy keyword sets, which greatly reduces the storage and representation overheads.
Abstract: As Cloud Computing becomes prevalent, more and more sensitive information is being centralized into the cloud. For the protection of data privacy, sensitive data usually have to be encrypted before outsourcing, which makes effective data utilization a very challenging task. Although traditional searchable encryption schemes allow a user to securely search over encrypted data through keywords and selectively retrieve files of interest, these techniques support only exact keyword search. That is, there is no tolerance of minor typos and format inconsistencies which, on the other hand, are typical user searching behavior and happen very frequently. This significant drawback makes existing techniques unsuitable in Cloud Computing as it greatly affects system usability, rendering user searching experiences very frustrating and system efficacy very low. In this paper, for the first time we formalize and solve the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. Fuzzy keyword search greatly enhances system usability by returning the matching files when users' searching inputs exactly match the predefined keywords or, when exact match fails, the closest possible matching files based on keyword similarity semantics. In our solution, we exploit edit distance to quantify keyword similarity and develop an advanced technique for constructing fuzzy keyword sets, which greatly reduces the storage and representation overheads. Through rigorous security analysis, we show that our proposed solution is secure and privacy-preserving, while correctly realizing the goal of fuzzy keyword search.

917 citations


"Generalized pattern matching string..." refers background in this paper

  • ...Existing SE schemes can be classified into two categories, boolean search (e.g. [1]–[4]) and similarity search (e.g. [5]–[8])....

    [...]
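
The fuzzy keyword search abstract above uses edit distance to quantify keyword similarity. As a rough plaintext illustration of that measure only (not of the fuzzy keyword set construction or of any encryption), here is the standard dynamic-programming computation of Levenshtein distance; the name `edit_distance` and the unit cost for each insertion, deletion, and substitution are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn `a` into `b`."""
    # prev[j] holds the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("castle", "cassle"))  # 1
```

A query keyword could then be treated as a fuzzy match for a stored keyword when the distance is at most some small threshold; the threshold is a design parameter, not something fixed by the excerpt.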

Proceedings ArticleDOI
29 Jun 2009
TL;DR: A new asymmetric scalar-product-preserving encryption (ASPE) that preserves a special type of scalar product and is shown to resist practical attacks of a different background knowledge level, at a different overhead cost.
Abstract: Service providers like Google and Amazon are moving into the SaaS (Software as a Service) business. They turn their huge infrastructure into a cloud-computing environment and aggressively recruit businesses to run applications on their platforms. To enforce security and privacy on such a service model, we need to protect the data running on the platform. Unfortunately, traditional encryption methods that aim at providing "unbreakable" protection are often not adequate because they do not support the execution of applications such as database queries on the encrypted data. In this paper we discuss the general problem of secure computation on an encrypted database and propose a SCONEDB (Secure Computation ON an Encrypted DataBase) model, which captures the execution and security requirements. As a case study, we focus on the problem of k-nearest neighbor (kNN) computation on an encrypted database. We develop a new asymmetric scalar-product-preserving encryption (ASPE) that preserves a special type of scalar product. We use ASPE to construct two secure schemes that support kNN computation on encrypted data; each of these schemes is shown to resist practical attacks of a different background knowledge level, at a different overhead cost. Extensive performance studies are carried out to evaluate the overhead and the efficiency of the schemes.

801 citations


"Generalized pattern matching string..." refers methods in this paper

  • ...For privacy preservation, all the vector-matrix computation and comparison are conducted under two-tier protection: one-way transformation and extended usage of secure kNN computation [17]....

    [...]
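
The ASPE abstract above describes an encryption that preserves a special type of scalar product. A toy sketch of one way such an asymmetric property can arise from plain linear algebra is given below. The abstract does not spell out the construction, so everything here is an assumption made for illustration: the key matrix `M`, its inverse `M_INV`, and the function names. A fixed 2x2 integer matrix offers no real security.

```python
from typing import List

Vector = List[int]
Matrix = List[List[int]]

# Hypothetical secret key: an invertible integer matrix and its inverse.
M: Matrix = [[2, 1], [1, 1]]        # determinant 1, so the inverse is integral
M_INV: Matrix = [[1, -1], [-1, 2]]  # M multiplied by M_INV is the identity


def mat_vec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by column vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def dot(u: Vector, v: Vector) -> int:
    return sum(x * y for x, y in zip(u, v))


def encrypt_point(p: Vector) -> Vector:
    """Data points are transformed with the transpose of M."""
    return mat_vec(transpose(M), p)


def encrypt_query(q: Vector) -> Vector:
    """Queries are transformed with the inverse of M."""
    return mat_vec(M_INV, q)


if __name__ == "__main__":
    p, q = [3, 4], [5, 6]
    # (M^T p) . (M^-1 q) = p^T M M^-1 q = p . q
    print(dot(p, q), dot(encrypt_point(p), encrypt_query(q)))
```

In this sketch the product of an encrypted point with an encrypted query equals the plaintext product, while the product of two encrypted points generally does not, which is the sense in which the preservation is asymmetric.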