Generalized pattern matching string search on encrypted data in cloud systems

doi:10.1109/INFOCOM.2015.7218595

Proceedings Article

Dongsheng Wang¹, Xiaohua Jia², Cong Wang², Kan Yang³, Shaojing Fu¹, Ming Xu¹

National University of Defense Technology¹, City University of Hong Kong², University of Waterloo³

24 Aug 2015-pp 2101-2109

TL;DR: This paper proposes a scheme for Generalized Pattern-matching String-search on Encrypted data (GPSE) in cloud systems and implements the two most commonly used pattern-matching search functions on encrypted data: substring matching and longest-prefix-first matching.

Abstract: Searchable encryption is an important and challenging issue. It allows people to search on encrypted data. This is a very useful function when more and more people choose to host their data in the cloud and the cloud server is not fully trustable. Existing solutions for searchable encryption are only limited to some simple functions of search, such as boolean search or similarity search. In this paper, we propose a scheme for Generalized Pattern-matching String-search on Encrypted data (GPSE) in cloud systems. GPSE allows users to specify their search queries by using generalized wildcard-based string patterns (such as SQL-like patterns). It gives users great expressive power in specifying highly targeted search queries. In the framework of GPSE, we particularly implemented two most commonly used pattern matching search functions on encrypted data, the substring matching and the longest-prefix-first matching. We also prove that GPSE is secure under the known-plaintext model. Experiments over real data sets show that GPSE achieves high search accuracy.
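
To make the query semantics named in the abstract concrete, here is a minimal plaintext sketch in Python. It does not show GPSE itself and involves no encryption. It assumes SQL `LIKE` conventions (`%` for any run of characters, `_` for exactly one character) and reads "longest-prefix-first" as ranking candidates by the length of the prefix they share with the query, which may differ from the paper's exact definition. All function names are hypothetical.

```python
import re

def like_to_regex(pattern):
    """Translate a SQL-LIKE-style pattern into a compiled regex (assumed syntax)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")      # any run of characters, possibly empty
        elif ch == "_":
            parts.append(".")       # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def like_match(pattern, text):
    """True if the whole text matches the wildcard pattern."""
    return like_to_regex(pattern).fullmatch(text) is not None

def substring_match(needle, text):
    """Substring search expressed as the wildcard pattern %needle%."""
    return like_match("%" + needle + "%", text)

def common_prefix_len(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def longest_prefix_first(query, candidates):
    """Rank candidates so that those sharing the longest prefix with query come first."""
    return sorted(candidates, key=lambda c: common_prefix_len(query, c), reverse=True)
```

In this reading, substring matching is simply a special case of the wildcard pattern, which is one way to see why a generalized pattern language subsumes the simpler search functions.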

Citations

Journal Article

Anonymous Privacy-Preserving Task Matching in Crowdsourcing

Jiangang Shu¹, Ximeng Liu², Xiaohua Jia¹, Kan Yang³, Robert H. Deng²

City University of Hong Kong¹, Singapore Management University², University of Memphis³

27 Apr 2018

TL;DR: This paper systematically analyzes the privacy leaks and potential threats in the task matching and proposes a single-keyword task matching scheme for the multirequester/multiworker crowdsourcing with efficient worker revocation that is secure and feasible.

Abstract: With the development of sharing economy, crowdsourcing as a distributed computing paradigm has become increasingly pervasive. As one of indispensable services for most crowdsourcing applications, task matching has also been extensively explored. However, privacy issues are usually ignored during the task matching and few existing privacy-preserving crowdsourcing mechanisms can simultaneously protect both task privacy and worker privacy. This paper systematically analyzes the privacy leaks and potential threats in the task matching and proposes a single-keyword task matching scheme for the multirequester/multiworker crowdsourcing with efficient worker revocation. The proposed scheme not only protects data confidentiality and identity anonymity against the crowd-server, but also achieves query traceability against dishonest or revoked workers. Detailed privacy analysis and thorough performance evaluation show that the proposed scheme is secure and feasible.

56 citations

Cites background from "Generalized pattern matching string..."

..., fuzzy keyword matching [11], Boolean matching [14], and pattern matching [15]....
[...]
...Since the first searchable symmetric encryption (SSE) [10], fuzzy query [11], ranked query [12], personalized query [13], Boolean query [14], and pattern query [15] have been extensively explored in this model....
[...]

Journal Article

Secure Multi-keyword Fuzzy Searches With Enhanced Service Quality in Cloud Computing

Qin Liu¹, Yu Peng¹, Jie Wu², Tian Wang³, Guojun Wang⁴

Hunan University¹, Temple University², Beijing Normal University³, Guangzhou University⁴

01 Jun 2021-IEEE Transactions on Network and Service Management

TL;DR: This article proposes matrix-based multi-keyword fuzzy search (M2FS) schemes, which support approximate keyword matching by exploiting the indecomposable property of primes.

Abstract: With the ever-increasing amount of data resided in a cloud, how to provide users with secure and practical query services has become the key to improve the quality of cloud services. Fuzzy searchable encryption (FSE) is identified as one of the most promising approaches for enabling secure query services, since it allows searching encrypted data by using keywords with spelling errors. However, existing FSE schemes are far from the practical use for the following reasons: (1) Inflexibility. It is hard for them to simultaneously support AND and OR semantics in a multi-keyword query. (2) Inefficiency. They require sequentially scanning a whole dataset to find matched files, and thus are difficult to apply to a large-scale dataset. (3) Limited robustness. It is difficult for them to resist the linear analysis attack in the known-background model. To fix the above problems, this article proposes matrix-based multi-keyword fuzzy search (M2FS) schemes, which support approximate keyword matching by exploiting the indecomposable property of primes. Specifically, we first present a basic scheme, called M2FS-B, where multiple keywords in a query or a file are constructed as prime-related matrices such that the result of matrix multiplication can be employed to determine the level of matching for different query semantics. Then, we construct an advanced scheme, named M2FS-E, which builds a searchable index as a keyword balanced binary (KBB) tree for dynamic and parallel searches, while adding random noises into a query matrix for enhanced robustness. Extensive analyses and experiments demonstrate the validity of our M2FS schemes.

41 citations

Journal Article

Prime Inner Product Encoding for Effective Wildcard-based Multi-Keyword Fuzzy Search

Qin Liu¹, Yu Peng¹, Shuyu Pei¹, Jie Wu², Tao Peng³, Guojun Wang³

Hunan University¹, Temple University², Guangzhou University³

01 Sep 2020-IEEE Transactions on Services Computing

TL;DR: A prime inner product encoding (PIPE) scheme, which makes use of the indecomposable property of prime numbers to provide efficient, highly accurate, and flexible multi-keyword fuzzy search.

Abstract: With the prevalence of cloud computing, a growing number of users are delegating clouds to host their sensitive data. To preserve user privacy, it is suggested that data is encrypted before outsourcing. However, data encryption makes keyword-based searches over ciphertexts extremely difficult. This is even challenging for fuzzy search that allows uncertainties or misspellings of keywords in a query. In this paper, we propose a prime inner product encoding (PIPE) scheme, which makes use of the indecomposable property of prime numbers to provide efficient, highly accurate, and flexible multi-keyword fuzzy search. Our main idea is to encode either a query keyword or an index keyword into a vector filled with primes or reciprocals of primes, such that the result of vectors' inner product is an integer only when two keywords are similar. Specifically, we first construct PIPE0 that is secure in the known ciphertext model. Unlike existing works that have difficulty supporting AND and OR semantics simultaneously, PIPE0 gives users the flexibility to specify different search semantics in their queries. Then, we construct PIPES that subtly adds random noises to a query vector to resist linear analyses. Both theoretical analyses and experiment results demonstrate the effectiveness of our scheme.

37 citations

Cites background from "Generalized pattern matching string..."

...[25] proposed a scheme for a generalized pattern-matching string-search....
[...]

Proceedings Article

A Novel Verifiable and Dynamic Fuzzy Keyword Search Scheme over Encrypted Data in Cloud Computing

Xiaoyu Zhu¹, Qin Liu², Guojun Wang¹

Central South University¹, Hunan University²

01 Aug 2016

TL;DR: This paper proposes a verifiable and dynamic fuzzy keyword search (VDFS) scheme that offers secure fuzzy keyword search, updates the outsourced document collection, and verifies the authenticity of search results; the scheme is proved to achieve universally composable (UC) security.

Abstract: In recent years, cloud computing becomes more and more popular. Users outsource large amount of encrypted documents to the cloud in order to avoid information leakage. Searchable encryption technique is a desirable service to enable users search on encrypted data. In most existing searchable encryption schemes, they only provide exact keyword search. Fuzzy keyword search improves the system usability because it allows users to make spelling errors or format inconsistencies. Besides, verifiable encryption schemes usually consider a semitrusted server and verify the authenticity of the search results. However, the server may be malicious, which may modify/delete some encrypted files or forge erroneous results in order to save its storage space or computation ability. In this paper, we investigate the searchable encryption problem in the presence of a malicious server, the verifiable searchability is needed to provide users the ability to detect the potential misbehavior. We propose a verifiable and dynamic fuzzy keyword search (VDFS) scheme to offer secure fuzzy keyword search, update the outsourced document collection and verify the authenticity of the search result. Our scheme is proved universally composable (UC) security by rigorous security analysis.

33 citations

Cites background from "Generalized pattern matching string..."

...[20] proposed a generalized pattern-matching string-search scheme based on generalized wildcard-based string patterns....
[...]

Journal Article

EliMFS: Achieving Efficient, Leakage-Resilient, and Multi-Keyword Fuzzy Search on Encrypted Cloud Data

Jing Chen¹, Kun He¹, Lan Deng¹, Quan Yuan², Ruiying Du, Yang Xiang³, Jie Wu⁴

Wuhan University¹, University of Texas of the Permian Basin², Deakin University³, Temple University⁴

01 Nov 2020-IEEE Transactions on Services Computing

TL;DR: This paper proposes an Efficient Leakage-resilient Multi-keyword Fuzzy Search (EliMFS) framework over encrypted cloud data, together with two specific schemes that resist attacks exploiting the leakage of its two-stage index structure under different threat models.

Abstract: Motivated by privacy preservation requirements for outsourced data, keyword searches over encrypted cloud data have become a hot topic. Compared to single-keyword exact searches, multi-keyword fuzzy search schemes attract more attention because of their improvements in search accuracy, typo tolerance, and user experience in general. However, existing multi-keyword fuzzy search solutions are not sufficiently efficient when the file set in the cloud is large. To address this, we propose an Efficient Leakage-resilient Multi-keyword Fuzzy Search (EliMFS) framework over encrypted cloud data. In this framework, a novel two-stage index structure is exploited to ensure that search time is independent of file set size. The multi-keyword fuzzy search function is achieved through a delicate design based on the Gram Counting Order, the Bloom filter, and the Locality-Sensitive Hashing. Furthermore, considering the leakages caused by the two-stage index structure, we propose two specific schemes to resist these potential attacks in different threat models. Extensive analysis and experiments show that our schemes are highly efficient and leakage-resilient.

24 citations

Cites methods from "Generalized pattern matching string..."

...[37] supported a pattern-matching string search, a more flexible method than a general boolean search SSE....
[...]

References

Proceedings Article

How to generate and exchange secrets

Andrew Chi-Chih Yao

27 Oct 1986

TL;DR: A new tool for controlling the knowledge transfer process in cryptographic protocol design is introduced and it is applied to solve a general class of problems which include most of the two-party cryptographic problems in the literature.

Abstract: In this paper we introduce a new tool for controlling the knowledge transfer process in cryptographic protocol design. It is applied to solve a general class of problems which include most of the two-party cryptographic problems in the literature. Specifically, we show how two parties A and B can interactively generate a random integer N = p·q such that its secret, i.e., the prime factors (p, q), is hidden from either party individually but is recoverable jointly if desired. This can be utilized to give a protocol for two parties with private values i and j to compute any polynomially computable functions f(i,j) and g(i,j) with minimal knowledge transfer and a strong fairness property. As a special case, A and B can exchange a pair of secrets sA, sB, e.g. the factorization of an integer and a Hamiltonian circuit in a graph, in such a way that sA becomes computable by B when and only when sB becomes computable by A. All these results are proved assuming only that the problem of factoring large integers is computationally intractable.

3,463 citations

Journal Article

Fast Pattern Matching in Strings

Donald E. Knuth, James Morris, Vaughan R. Pratt

01 Jun 1977-SIAM Journal on Computing

TL;DR: An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings, showing that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.

Abstract: An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings. The constant of proportionality is low enough to make this algorithm of practical use, and the procedure can also be extended to deal with some more general pattern-matching problems. A theoretical application of the algorithm shows that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time. Other algorithms which run even faster on the average are also considered.
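
The linear-time search described in this abstract is commonly known as the Knuth-Morris-Pratt (KMP) algorithm. The following Python sketch is a standard textbook formulation rather than the paper's original presentation; the names `build_failure` and `kmp_find_all` are chosen here for illustration.

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        # Fall back through shorter borders until the next character fits.
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

def kmp_find_all(text, pattern):
    """Return the start index of every (possibly overlapping) occurrence."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going to catch overlapping matches
    return hits
```

Each text character is examined once and the fallback pointer only moves backwards as far as it has previously advanced, which is where the running time proportional to the sum of the two string lengths comes from.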

3,156 citations

Proceedings Article

Searchable symmetric encryption: improved definitions and efficient constructions

Reza Curtmola¹, Juan A. Garay², Seny Kamara¹, Rafail Ostrovsky³

Johns Hopkins University¹, Alcatel-Lucent², University of California, Los Angeles³

30 Oct 2006

TL;DR: This paper presents two efficient searchable symmetric encryption (SSE) constructions with stronger security guarantees, including adaptive security, where queries to the server can be chosen adaptively during the execution of the search, and extends SSE to the multi-user setting.

Abstract: Searchable symmetric encryption (SSE) allows a party to outsource the storage of its data to another party (a server) in a private manner, while maintaining the ability to selectively search over it. This problem has been the focus of active research in recent years. In this paper we show two solutions to SSE that simultaneously enjoy the following properties: Both solutions are more efficient than all previous constant-round schemes. In particular, the work performed by the server per returned document is constant as opposed to linear in the size of the data. Both solutions enjoy stronger security guarantees than previous constant-round schemes. In fact, we point out subtle but serious problems with previous notions of security for SSE, and show how to design constructions which avoid these pitfalls. Further, our second solution also achieves what we call adaptive SSE security, where queries to the server can be chosen adaptively (by the adversary) during the execution of the search; this notion is both important in practice and has not been previously considered. Surprisingly, despite being more secure and more efficient, our SSE schemes are remarkably simple. We consider the simplicity of both solutions as an important step towards the deployment of SSE technologies. As an additional contribution, we also consider multi-user SSE. All prior work on SSE studied the setting where only the owner of the data is capable of submitting search queries. We consider the natural extension where an arbitrary group of parties other than the owner can submit search queries. We formally define SSE in the multi-user setting, and present an efficient construction that achieves better performance than simply using access control mechanisms.
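
The general idea of searching by token instead of by plaintext keyword can be sketched with a keyed hash. The Python below is a deliberately simplified illustration, not the construction from this paper: it stores document identifiers in the clear and reveals how many documents match each token, both of which a real SSE scheme would need to address. The function names and the sample data are hypothetical.

```python
import hashlib
import hmac

def token(key, keyword):
    """Deterministic search token: HMAC-SHA256 of the keyword under a secret key."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()

def build_index(key, documents):
    """Client side: map each keyword token to the sorted ids of documents containing it."""
    index = {}
    for doc_id, words in documents.items():
        for word in set(words):
            index.setdefault(token(key, word), []).append(doc_id)
    for ids in index.values():
        ids.sort()
    return index

def search(index, tok):
    """Server side: a single dictionary lookup, with no scan over the data."""
    return index.get(tok, [])

# Hypothetical usage: the client keeps `key`; the server sees only `index` and tokens.
key = b"\x01" * 32
docs = {"d1": ["cloud", "search"], "d2": ["cloud"]}
index = build_index(key, docs)
matches = search(index, token(key, "cloud"))
```

Because the server answers a query with one lookup, its work depends on the number of returned documents and not on the size of the whole collection, which mirrors the efficiency property the abstract highlights.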

1,673 citations

Proceedings Article

Fuzzy Keyword Search over Encrypted Data in Cloud Computing

Jin Li¹, Qian Wang¹, Cong Wang¹, Ning Cao², Kui Ren¹, Wenjing Lou²

Illinois Institute of Technology¹, Worcester Polytechnic Institute²

14 Mar 2010

TL;DR: This paper formalizes and solves the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy, and exploits edit distance to quantify keywords similarity and develops an advanced technique on constructing fuzzy keyword sets, which greatly reduces the storage and representation overheads.

Abstract: As Cloud Computing becomes prevalent, more and more sensitive information are being centralized into the cloud. For the protection of data privacy, sensitive data usually have to be encrypted before outsourcing, which makes effective data utilization a very challenging task. Although traditional searchable encryption schemes allow a user to securely search over encrypted data through keywords and selectively retrieve files of interest, these techniques support only exact keyword search. That is, there is no tolerance of minor typos and format inconsistencies which, on the other hand, are typical user searching behavior and happen very frequently. This significant drawback makes existing techniques unsuitable in Cloud Computing as it greatly affects system usability, rendering user searching experiences very frustrating and system efficacy very low. In this paper, for the first time we formalize and solve the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. Fuzzy keyword search greatly enhances system usability by returning the matching files when users' searching inputs exactly match the predefined keywords or the closest possible matching files based on keyword similarity semantics, when exact match fails. In our solution, we exploit edit distance to quantify keywords similarity and develop an advanced technique on constructing fuzzy keyword sets, which greatly reduces the storage and representation overheads. Through rigorous security analysis, we show that our proposed solution is secure and privacy-preserving, while correctly realizing the goal of fuzzy keyword search.
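
The similarity measure mentioned in this abstract, edit distance, can be computed with the standard dynamic program below. This Python sketch works on plaintext keywords only; it does not show the paper's fuzzy keyword set construction or any encryption. The `fuzzy_lookup` helper and its `max_distance` parameter are hypothetical and only mimic the "exact match first, closest matches otherwise" behaviour described above.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

def fuzzy_lookup(query, keywords, max_distance=1):
    """Return exact matches if any exist, else keywords within max_distance edits."""
    exact = [k for k in keywords if k == query]
    if exact:
        return exact
    return [k for k in keywords if edit_distance(query, k) <= max_distance]
```

For example, a query with one typo such as "clud" would still reach the keyword "cloud" at distance 1, while an exact query never returns near misses.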

917 citations

"Generalized pattern matching string..." refers background in this paper

...Existing SE schemes can be classified into two categories, boolean search (e.g. [1]–[4]) and similarity search (e.g. [5]–[8])....
[...]

Proceedings Article

Secure kNN computation on encrypted databases

Wai Kit Wong¹, David W. Cheung¹, Ben Kao¹, Nikos Mamoulis¹

University of Hong Kong¹

29 Jun 2009

TL;DR: This paper develops a new asymmetric scalar-product-preserving encryption (ASPE) and uses it to construct two secure kNN schemes, each shown to resist practical attacks at a different level of background knowledge and at a different overhead cost.

Abstract: Service providers like Google and Amazon are moving into the SaaS (Software as a Service) business. They turn their huge infrastructure into a cloud-computing environment and aggressively recruit businesses to run applications on their platforms. To enforce security and privacy on such a service model, we need to protect the data running on the platform. Unfortunately, traditional encryption methods that aim at providing "unbreakable" protection are often not adequate because they do not support the execution of applications such as database queries on the encrypted data. In this paper we discuss the general problem of secure computation on an encrypted database and propose a SCONEDB (Secure Computation ON an Encrypted DataBase) model, which captures the execution and security requirements. As a case study, we focus on the problem of k-nearest neighbor (kNN) computation on an encrypted database. We develop a new asymmetric scalar-product-preserving encryption (ASPE) that preserves a special type of scalar product. We use ASPE to construct two secure schemes that support kNN computation on encrypted data; each of these schemes is shown to resist practical attacks of a different background knowledge level, at a different overhead cost. Extensive performance studies are carried out to evaluate the overhead and the efficiency of the schemes.
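
One way to read "preserves a special type of scalar product" is that the product between an encrypted data point and an encrypted query equals the plaintext product, while products between two encrypted data points do not. The Python sketch below illustrates that property with a fixed 2x2 integer matrix. It is a simplified toy under stated assumptions: the key `M`, its inverse `M_INV`, and all function names are chosen here for illustration, and the schemes in the paper add further steps that are not shown.

```python
# Toy secret key (assumption): a small integer matrix with determinant 1,
# so its inverse is also integral and all arithmetic stays exact.
M = [[2, 1],
     [1, 1]]
M_INV = [[1, -1],
         [-1, 2]]  # M multiplied by M_INV gives the identity matrix

def transpose(m):
    return [list(row) for row in zip(*m)]

def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(u, v):
    """Scalar (inner) product."""
    return sum(a * b for a, b in zip(u, v))

def encrypt_point(p):
    """Data points are transformed with the transpose of M."""
    return mat_vec(transpose(M), p)

def encrypt_query(q):
    """Queries are transformed with the inverse of M."""
    return mat_vec(M_INV, q)

p1, p2, q = [3, 4], [1, 0], [1, 2]
point_query = dot(encrypt_point(p1), encrypt_query(q))   # equals dot(p1, q)
point_point = dot(encrypt_point(p1), encrypt_point(p2))  # differs from dot(p1, p2)
```

The identity behind the preserved product is (Mᵀp)·(M⁻¹q) = pᵀ M M⁻¹ q = p·q, whereas two points encrypted the same way give pᵀ M Mᵀ p′, which generally differs from p·p′.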

801 citations

"Generalized pattern matching string..." refers methods in this paper

...For privacy preservation, all the vector-matrix computation and comparison are conducted under two-tier protection: one-way transformation and extended usage of secure kNN computation [17]....
[...]