What are the contributions in "Shared and searchable encrypted data for untrusted servers" ?

In this paper, the authors propose an encryption scheme where each authorised user in the system has his own keys to encrypt and decrypt data. The authors provide a concrete construction of the scheme and give formal proofs of its security. The authors also report on the results of their implementation.

What have the authors stated for future works in "Shared and searchable encrypted data for untrusted servers" ?

One aspect of their future work is to achieve access pattern privacy. By combining the scheme with private information retrieval (PIR), the search queries can be executed without revealing the access pattern to the server. Another possible extension may be to integrate bucketization [17, 18]. Range queries can be translated into querying a set of bucket tags.

What is the probability that an adversary can use A′ to break the system?

A′ who controls up to n servers can break the system with non-negligible probability, then in a single-server setting an adversary A can use

What is the definition of a multi-user searchable data encryption scheme?

A multi-user searchable data encryption scheme is a tuple of probabilistic polynomial time algorithms (Init, Keygen, Enc, Re-enc, Trapdoor, Search, Dec, Revoke) such that:
• The initialisation algorithm Init(1^k) is run by the KMS which takes as input the security parameter 1^k and outputs master public parameters Params and a master key set MSK.

What is the main weakness of the scheme?

A weakness of their scheme and most of the other keyword-based search schemes is that the server knows the access pattern of the users, which allows it to infer some information about the queries.

What is the definition of a multi-user searchable data encryption scheme?

A Multi-user Searchable Data Encryption scheme is a mechanism such that a group of authorised users can share encrypted documents and perform keyword search on the encrypted documents without decrypting them.

What is the way to decrypt encrypted data?

Authorised users can retrieve an encrypted file by its identifier and decrypt it using a key obtained from the owner of the file.

What is the intuition of using document identifiers?

The intuition of using document identifiers is that the adversaries can identify the documents but should not learn anything about the content of the documents.

What is the definition of non-adaptive indistinguishability security?

Non-adaptive indistinguishability security means that given two non-adaptively generated query histories with the same length and outcome, no PPT adversary can distinguish one from another based on what it can “see” in the interaction.

What is the way to encrypt a document?

However in practice, the document can be encrypted by a more efficient hybrid encryption scheme, where a secure symmetric cipher is chosen to encrypt the document under a random key and the random key is then encrypted under PE-U-Enc.

What is the ciphertext used to decrypt?

The user sends id(Di) to the server which locates the ciphertext from the data storage and runs the data pre-decryption algorithm.

What is the probability of a g3 being a random group element?

There are two cases to consider: Case 1: If g_3 = g^γ, the authors know that g^γ is a random group element of G because γ is chosen at random.

What is the revocability of the DDH problem?

It follows from the above theorems that as long as the server is honest and the authorised users can protect their keys, revocability is guaranteed.

What is the assumption that a collusion attack is possible?

The assumption is that a collusion attack is only possible when all the SSPs involved collude, and that the SSPs are competitors and thus unlikely to co-operate in such a collusion.

(Open Access) Shared and searchable encrypted data for untrusted servers (2011) | Changyu Dong

Q: What is the cost of encrypting data?

Bucketization has relatively small performance overhead and enables more complex queries such as range queries and comparison queries at the cost of revealing more information about the encrypted data.

Q: What is the main reason for outsourcing data storage?

To reduce the increasing costs of storage management, many organizations would like to outsource their data storage to third party service providers.

Q: What is the common approach for encrypting data?

Asymmetric keys allow multiple users to encrypt data using the public key, but only the user who has the private key can search and decrypt the data.

Shared and Searchable Encrypted Data for Untrusted Servers

Changyu Dong, Giovanni Russello, Naranker Dulay

Department of Computing, Imperial College London, 180 Queen’s Gate, London, SW7 2AZ, UK
{changyu.dong,n.dulay}@imperial.ac.uk

Security Area, Create-Net, Via alla Cascata 56/D Povo, 38100 Trento, Italy
giovanni.russello@create-net.org

Abstract

Current security mechanisms are not suitable for organisations that outsource their data management to untrusted servers. Encrypting and decrypting sensitive data at the client side is the normal approach in this situation but has high communication and computation overheads if only a subset of the data is required, for example, selecting records in a database table based on a keyword search. New cryptographic schemes have been proposed that support encrypted queries over encrypted data. But they all depend on a single set of secret keys, which implies single user access or sharing keys among multiple users, with key revocation requiring costly data re-encryption. In this paper, we propose an encryption scheme where each authorised user in the system has his own keys to encrypt and decrypt data. The scheme supports keyword search which enables the server to return only the encrypted data that satisfies an encrypted query without decrypting it. We provide a concrete construction of the scheme and give formal proofs of its security. We also report on the results of our implementation.

1 Introduction

The demand for outsourcing data storage and management has increased dramatically in the last decade. The foremost reason is that for nearly all organisations, data growth is inevitable. Data is at the heart of business operations and applications, driving the critical activities that help the organisations improve customer satisfaction and accelerate business growth. Huge amounts of data are collected or generated every day and put into data storage for future processing and analysis. According to Forrester Research, enterprise storage needs grow at 52 percent per year [6]. To reduce the increasing costs of storage management, many organizations would like to outsource their data storage to third party service providers. Recent research from TheInfoPro shows that nearly 20% of Fortune 1000 organizations outsource at least some portion of their storage management activities [10]. Apart from business data, there is also an emerging trend in personal data outsourcing. People are demanding more storage space from service providers for various reasons: data backup [1], sharing photos and videos with family and friends [2] or even managing their medical records [3].

One of the biggest challenges raised by data storage outsourcing is data confidentiality. Business data is vital to many companies; any security breach will leave the companies with lost revenues, reduced shareholder value, lawsuits as well as damaged reputations. Exposing this valuable information to outsiders poses huge risks. While companies may trust a Storage Service Provider’s (SSP) reliability, availability, fault-tolerance and performance, they cannot trust that the SSP is not going to use the data for other purposes. The same problem also exists in personal data outsourcing. For privacy reasons, individuals want to be sure that the data can only be accessed by particular people and certainly not by the SSP’s employees. The negative impact of this distrust is two-fold. From the customers’ point of view, it is hard to find a trusted service provider to host their data. From the SSPs’ point of view, as long as they cannot dispel the concern, they will lose potential customers.

Traditional access controls which are used to provide confidentiality are mostly designed for in-house services and depend greatly on the system itself to enforce authorisation policies, effectively relying on a trusted infrastructure. In the absence of trust, traditional security models are no longer valid. Another common approach to provide data confidentiality is cryptography. Server-side encryption is not appropriate when the server is not trusted. The client must encrypt the data before sending it to the SSP and later the encrypted data can be retrieved and decrypted by the client. This would ease a company’s concern about data leakage, but introduces a new problem. Because the encrypted data is not meaningful to the SSP’s servers, many useful data operations are not possible. For example, if a client wants to retrieve documents or records containing certain keywords, can we keep the data incomprehensible to servers and their administrators while efficiently retrieving the data? Consider the following scenarios:

Scenario 1 Company A is considering outsourcing its data processing centre to a service provider B. This will cut its annual IT cost by up to 25%. But the company is concerned about data security. The company’s databases contain valuable production data and customer information. It would be unacceptable if competitors got hold of the data. Administrative controls such as formal contracts, confidential agreements and continuous auditing provide a certain level of assurance, but the company would also like to encrypt the sensitive data and have fast searches over it.

Scenario 2 Bob subscribes to a Personal Health Record service. The service allows Bob to maintain his electronic medical records and share them with his doctors through a web interface. Bob wants to encrypt his records, ensuring that the employees of the service provider will not be able to know what is inside.

A trivial solution is to download all the data to the client’s computer and decrypt it locally. This does not scale to large datasets. Several schemes have been proposed to partially address the above problems. The basic idea is to divide the cryptographic component between the client and the server. The client performs the data encryption/decryption and manages keys. The server processes encrypted search queries by carrying out some computation on the encrypted data. The server learns nothing about the keys or the plaintexts of the data nor the queries, but is still able to return the correct results.

These schemes have an important limitation. The operations, e.g. encryption, decryption and query generation, more or less rely on some shared secret keys. This implies that the operations can only be executed by one user, or by a group of users who share the secret keys somehow. A single user is usually not an adequate assumption for data outsourcing. Perhaps the biggest problem for supporting multiple user access to encrypted data is key management. Sharing keys is generally not a good idea since it increases the risk of key exposure. In response to this, keys must be changed regularly. The keys must also be changed if a user is no longer qualified to access the data. However, changing keys may require decrypting all the data with the old keys and re-encrypting it using the new keys. For large data sets, this is not practical.

In this work, we propose a scheme for multi-user searchable data encryption based on proxy cryptography. We consider the application scenario where a group of users share data through an untrusted data storage server which is hosted by a third party. Unlike existing schemes for searchable data encryption in multi-user settings which have constraints such as asymmetric user permissions (multiple writers, single reader) or a read-only shared data set, in our scheme the shared data set can be updated by the users and each user in the group can be both reader and writer. The server can search on the encrypted data using encrypted keywords. More importantly, our scheme does not rely on shared keys. This significantly simplifies key revocation. Each authorised user in the system has his own unique key set and can insert encrypted data, decrypt the data inserted by other users and search encrypted data without knowing the other users’ keys. The keys of one user can easily be revoked without affecting other users or the encrypted data at the server. After a user’s keys have been revoked, the user will no longer be able to read and search the shared data.

2 Related Work

Many systems designed for securing untrusted storage rely on the clients to encrypt the data. For example, in cryptographic distributed file systems such as [23, 20, 16, 26], the untrusted file servers store encrypted files and have no knowledge of the keys. Authorised users can retrieve an encrypted file by its identifier and decrypt it using a key obtained from the owner of the file. The servers perform only the basic I/O function and cannot do advanced operations such as keyword search.

Several schemes have been developed to encrypt data on the client-side and enable server-side searches on encrypted data. Song et al. [25] introduced the first practical scheme for searching on encrypted data. The scheme enables clients to perform searches on encrypted text without disclosing any information about the plaintext to the untrusted server. The untrusted server cannot learn the plaintext given only the ciphertext, it cannot search without the user’s authorisation, and it learns nothing more than the encrypted search results. The basic idea is to generate a keyed hash for the keywords and store this information inside the ciphertext. The server can search the keywords by recalculating and matching the hash value. Yang et al. [29] proposed an elegant scheme for performing queries on encrypted data and also provided a secure index to speed up queries by two-step mapping. Goh’s scheme [15] enables searches on encrypted data by employing a secure index based on a Bloom filter, which has low storage overheads. Damiani et al. [12] proposed an approach to indexing encrypted data which allows efficient data access. The indexed attribute values are encrypted directly with a key or hashed. This approach also supports range queries by creating an encrypted B+-tree. Agrawal et al. [4] proposed an order preserving encryption for numeric data which allows queries based on comparison conditions. The plaintext is encrypted so that the ciphertext follows a target distribution provided by the user while the order of the data is preserved.
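The keyed-hash idea mentioned above can be illustrated with a minimal sketch. It is a deliberately simplified, deterministic tag index and not the construction of Song et al. [25], which embeds the check inside the ciphertext and differs in detail. The function names (`keyword_tag`, `build_index`, `server_search`) and the choice of HMAC-SHA256 are assumptions made for illustration only.

```python
import hashlib
import hmac
import os


def keyword_tag(key: bytes, keyword: str) -> bytes:
    """Client side: keyed hash (HMAC-SHA256) of a normalised keyword."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).digest()


def build_index(key: bytes, doc_keywords: dict[str, list[str]]) -> dict[str, set[bytes]]:
    """Client side: map each document id to the set of tags of its keywords."""
    return {
        doc_id: {keyword_tag(key, word) for word in words}
        for doc_id, words in doc_keywords.items()
    }


def server_search(index: dict[str, set[bytes]], trapdoor: bytes) -> list[str]:
    """Server side: compare the trapdoor with stored tags.

    The server never sees the key or the plaintext keyword.
    """
    return sorted(
        doc_id
        for doc_id, tags in index.items()
        if any(hmac.compare_digest(tag, trapdoor) for tag in tags)
    )


# Hypothetical data: the key stays with the client, the index goes to the server.
key = os.urandom(32)
index = build_index(
    key,
    {"doc1": ["invoice", "alice"], "doc2": ["report", "alice"], "doc3": ["invoice"]},
)
matches = server_search(index, keyword_tag(key, "invoice"))
```

Because the tags are deterministic, the server can see which documents share a keyword and which queries repeat. A sketch like this also depends on a single shared key, which is exactly the limitation discussed in this section.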

In the bucketization approach for searching encrypted databases [17, 18], an attribute domain is partitioned into a set of buckets each of which is identified by a tag. These bucket tags are maintained as an index and are utilised by the server to process the queries. Bucketization has relatively small performance overhead and enables more complex queries such as range queries and comparison queries at the cost of revealing more information about the encrypted data.
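As a rough sketch of how a range query can be turned into a set of bucket tags, the following example partitions a numeric domain into fixed-width buckets. The boundaries, the HMAC-based tag derivation and all function names are assumptions chosen for illustration and are not taken from [17, 18]. The encrypted payload is stood in for by an opaque record id.

```python
import bisect
import hashlib
import hmac

# Hypothetical partition: bucket i covers [BOUNDARIES[i], BOUNDARIES[i + 1]).
BOUNDARIES = [0, 20, 40, 60, 80, 100]


def bucket_id(value: int) -> int:
    """Return the index of the bucket that contains value."""
    if not BOUNDARIES[0] <= value < BOUNDARIES[-1]:
        raise ValueError("value outside the partitioned domain")
    return bisect.bisect_right(BOUNDARIES, value) - 1


def bucket_tag(key: bytes, bucket: int) -> str:
    """Opaque tag for a bucket, derived with a key the server does not hold."""
    return hmac.new(key, str(bucket).encode(), hashlib.sha256).hexdigest()[:16]


def range_to_tags(key: bytes, low: int, high: int) -> set[str]:
    """Client side: translate the inclusive range [low, high] into bucket tags."""
    return {bucket_tag(key, b) for b in range(bucket_id(low), bucket_id(high) + 1)}


def server_filter(rows: list[tuple[str, str]], tags: set[str]) -> list[tuple[str, str]]:
    """Server side: return every stored row whose tag is in the queried set."""
    return [row for row in rows if row[0] in tags]


key = b"demo-key-known-only-to-clients"
ages = {"r1": 17, "r2": 25, "r3": 39, "r4": 41, "r5": 78}  # client-side plaintext

# The server stores (tag, opaque record) pairs only.
server_rows = [(bucket_tag(key, bucket_id(v)), rid) for rid, v in ages.items()]

tags = range_to_tags(key, 30, 45)
candidates = [rid for _, rid in server_filter(server_rows, tags)]

# The client decrypts the candidates and discards false positives.
exact = sorted(rid for rid in candidates if 30 <= ages[rid] <= 45)
```

The server returns a superset of the true answer (here `r2` falls in a queried bucket but outside the range), and it learns which records share a bucket. This is the extra information leakage mentioned above.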

All the schemes above for searches on encrypted data rely on secret keys, which implies single user access or sharing keys among a group of users. Boneh et al. [8] presented a scheme for searches on encrypted data using a public key system that allows mail gateways to handle email based on whether certain keywords exist in the encrypted message. The application scenario is similar to [25], but the scheme uses asymmetric encryption schemes instead of symmetric ones. Asymmetric keys allow multiple users to encrypt data using the public key, but only the user who has the private key can search and decrypt the data. Curtmola et al. [11] partly solved the multi-user problem by using broadcast encryption. The set of authorised users share a secret key r (which is used in conjunction with a trapdoor function). Only people who know r will be able to access/query the data. A user can be revoked by changing r, and using broadcast encryption to send the new key r′ to the set of authorised users. The revoked user does not know r′, and hence cannot search. In this scheme, the database is searchable, but is read-only and cannot be updated. In our scheme, any authorised user can read, search and update the encrypted data.

Our scheme is dependent on proxy encryption. The notion of proxy encryption was first introduced in [7]. In a proxy encryption scheme, a ciphertext encrypted by one key can be transformed by a proxy function into the corresponding ciphertext for another key without revealing any information about the keys and the plaintext. Proxy encryption schemes can be built on top of different cryptosystems such as El Gamal [14] and RSA [24]. Applications of proxy encryption include: secure email lists [22], access control systems [5] and attribute based publishing of data [21]. A comprehensive study on proxy cryptography can be found in [19].
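To make the proxy transformation concrete, here is a toy sketch built on El Gamal. It assumes, purely for illustration, that a master secret x is split additively into a user share and a server share, so that the server (acting as proxy) turns a ciphertext under x into a ciphertext under the user share alone. The splitting strategy, the function names and the tiny parameters `P`, `Q`, `G` are assumptions and are not taken from this section; the parameters are far too small for real use.

```python
import secrets

# Toy group parameters (assumption): P = 2*Q + 1 is a safe prime and
# G generates the subgroup of prime order Q.
P = 2039
Q = 1019
G = 4


def keygen() -> tuple[int, int]:
    """Return a master secret x and the public key h = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def split_key(x: int) -> tuple[int, int]:
    """Split x into (user_share, server_share) with their sum equal to x mod Q."""
    user_share = secrets.randbelow(Q)
    server_share = (x - user_share) % Q
    return user_share, server_share


def encrypt(h: int, m: int) -> tuple[int, int]:
    """Standard El Gamal encryption of m (1 <= m < P) under public key h."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (m * pow(h, r, P)) % P


def proxy_transform(server_share: int, ct: tuple[int, int]) -> tuple[int, int]:
    """Proxy step: strip the server share, leaving a ciphertext under the user share."""
    c1, c2 = ct
    return c1, (c2 * pow(pow(c1, server_share, P), -1, P)) % P


def user_decrypt(user_share: int, ct: tuple[int, int]) -> int:
    """User step: remove the remaining mask with the user share."""
    c1, c2 = ct
    return (c2 * pow(pow(c1, user_share, P), -1, P)) % P


master_x, public_h = keygen()
alice_user, alice_server = split_key(master_x)
bob_user, bob_server = split_key(master_x)

ciphertext = encrypt(public_h, 1234)
alice_plain = user_decrypt(alice_user, proxy_transform(alice_server, ciphertext))
bob_plain = user_decrypt(bob_user, proxy_transform(bob_server, ciphertext))
```

In this sketch, neither party alone holds x, and two users with different shares can decrypt the same ciphertext. Deleting one user's server share would stop the proxy step for that user without touching the stored ciphertext or the other users' shares.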

3 Threat Model

Before presenting our scheme, it is necessary to discuss the threat model. In this section, we first identify the entities involved and the assumptions underlying the system design. Then we identify the potential adversaries and the possible attacks.

3.1 Entities

There are three types of entities in our system:

• Users: Authorised users are able to read, write and search encrypted data residing on the remote server. Sometimes we may need to revoke an authorised user. After being revoked, the user is no longer able to access the data.

• Server: The main responsibility of the data storage server is to store and retrieve encrypted data according to authorised users’ requests.

• Key management server (KMS): The KMS is a fully trusted server which is responsible for generating and revoking keys. It generates key sets for each authorised user and is also responsible for securely distributing generated key sets. When a user is no longer trusted to access the data, the KMS revokes the user’s permission by revoking his keys.

Authorised users are fully trusted. They are given permissions to access the shared data stored on the remote server by the data owner. They are believed to behave properly and to protect their key sets properly. The data storage server is not trusted, in the sense that we believe the server will execute the requests it receives correctly, but we do not rely on it to maintain data confidentiality. In other words, the server is modelled as “honest but curious” in our trust model. The KMS is also fully trusted. Although requiring a trusted KMS seems at odds with using an untrusted data storage service, we argue that the KMS requires fewer resources and less management effort. Securing the KMS is much easier since a very limited amount of data needs to be protected and the KMS can be kept offline most of the time.

3.2 Assumptions

We assume that there are mechanisms in place which ensure integrity and availability of the remotely stored data. Our scheme focuses only on confidentiality
