Shared and Searchable Encrypted Data for
Untrusted Servers
Changyu Dong¹, Giovanni Russello², Naranker Dulay¹
¹ Department of Computing, Imperial College London,
180 Queen’s Gate, London, SW7 2AZ, UK
² Security Area, Create-Net,
Via alla Cascata 56/D Povo, 38100 Trento, Italy
{changyu.dong,n.dulay}@imperial.ac.uk, giovanni.russello@create-net.org
Abstract
Current security mechanisms are not suitable for organisations that
outsource their data management to untrusted servers. Encrypting and
decrypting sensitive data at the client side is the normal approach in
this situation but has high communication and computation overheads if
only a subset of the data is required, for example, selecting records in a
database table based on a keyword search. New cryptographic schemes
have been proposed that support encrypted queries over encrypted data.
But they all depend on a single set of secret keys, which implies single user
access or sharing keys among multiple users, with key revocation requiring
costly data re-encryption. In this paper, we propose an encryption scheme
where each authorised user in the system has his own keys to encrypt and
decrypt data. The scheme supports keyword search which enables the
server to return only the encrypted data that satisfies an encrypted query
without decrypting it. We provide a concrete construction of the scheme
and give formal proofs of its security. We also report on the results of our
implementation.
1 Introduction
The demand for outsourcing data storage and management has increased
dramatically in the last decade. The foremost reason is that for nearly all
organisations, data growth is inevitable. Data is at the heart of business
operations and applications, driving the critical activities that help
organisations improve customer satisfaction and accelerate business growth.
Huge amounts of data are collected or generated every day and put into data
storage for future processing and analysis. According to Forrester Research,
enterprise storage needs grow at 52 percent per year [6]. To reduce the
increasing costs of storage management, many organisations would like to
outsource their data storage to third-party service providers. Recent research
from TheInfoPro shows that nearly 20% of Fortune 1000 organisations outsource
at least some portion of their storage management activities [10]. Apart from
business data, there is also an emerging trend in personal data outsourcing.
People are demanding more storage space from service providers for various
reasons: data backup [1], sharing photos and videos with family and friends [2]
or even managing their medical records [3].
One of the biggest challenges raised by data storage outsourcing is data
confidentiality. Business data is vital to many companies; any security breach
will leave a company with lost revenues, reduced shareholder value, lawsuits
and a damaged reputation. Exposing this valuable information to outsiders
poses huge risks. While companies may trust a Storage Service Provider’s (SSP)
reliability, availability, fault-tolerance and performance, they cannot trust
that the SSP is not going to use the data for other purposes. The same problem
also exists in personal data outsourcing. For privacy reasons, individuals want
to be sure that the data can only be accessed by particular people and
certainly not by the SSP’s employees. The negative impact of this distrust is
two-fold. From the customers’ point of view, it is hard to find a trusted
service provider to host their data. From the SSPs’ point of view, as long as
they cannot dispel the concern, they will lose potential customers.
Traditional access controls which are used to provide confidentiality are
mostly designed for in-house services and depend greatly on the system itself to
enforce authorisation policies, effectively relying on a trusted infrastructure.
In the absence of trust, traditional security models are no longer valid.
Another common approach to providing data confidentiality is cryptography.
Server-side encryption is not appropriate when the server is not trusted. The
client must encrypt the data before sending it to the SSP, and later the
encrypted data can be retrieved and decrypted by the client. This would ease a
company’s concern about data leakage, but introduces a new problem. Because the
encrypted data is not meaningful to the SSP’s servers, many useful data
operations are not possible. For example, if a client wants to retrieve
documents or records containing certain keywords, can we keep the data
incomprehensible to servers and their administrators while efficiently
retrieving the data? Consider the following scenarios:
Scenario 1 Company A is considering outsourcing its data processing centre
to a service provider B. This will cut its annual IT cost by up to 25%. But
the company is concerned about data security. The company’s databases
contain valuable production data and customer information. It would be
unacceptable if competitors got hold of the data. Administrative controls
such as formal contracts, confidential agreements and continuous auditing
provide a certain level of assurance, but the company would also like to
encrypt the sensitive data and have fast searches over it.
Scenario 2 Bob subscribes to a Personal Health Record service. The service
allows Bob to maintain his electronic medical records and share them with
his doctors through a web interface. Bob wants to encrypt his records,
ensuring that the employees of the service provider will not be able to
know what is inside.
A trivial solution is to download all the data to the client’s computer and
decrypt it locally. This does not scale to large datasets. Several schemes have
been proposed to partially address the above problems. The basic idea is to
divide the cryptographic component between the client and the server. The
client performs the data encryption/decryption and manages keys. The server
processes encrypted search queries by carrying out some computation on the
encrypted data. The server learns nothing about the keys, the plaintexts of
the data or the queries, but is still able to return the correct results.
These schemes have an important limitation. The operations, e.g. encryption,
decryption and query generation, more or less rely on some shared secret
keys. This implies that the operations can only be executed by one user, or by a
group of users who share the secret keys somehow. A single user is usually not
an adequate assumption for data outsourcing. Perhaps the biggest problem for
supporting multiple user access to encrypted data is key management. Sharing
keys is generally not a good idea since it increases the risk of key exposure.
In response to this, keys must be changed regularly. The keys must also be
changed if a user is no longer qualified to access the data. However, changing
keys may require decrypting all the data with the old keys and re-encrypting it
using the new keys. For large data sets, this is not practical.
In this work, we propose a scheme for multi-user searchable data encryption
based on proxy cryptography. We consider the application scenario where a
group of users share data through an untrusted data storage server which is
hosted by a third party. Unlike existing schemes for searchable data encryption
in multi-user settings, which have constraints such as asymmetric user
permissions (multiple writers, single reader) or a read-only shared data set, in
our scheme the shared data set can be updated by the users and each user in the
group can be both reader and writer. The server can search on the encrypted data
using encrypted keywords. More importantly, our scheme does not rely on shared
keys. This significantly simplifies key revocation. Each authorised user in the
system has his own unique key set and can insert encrypted data, decrypt the
data inserted by other users and search encrypted data without knowing the other
users’ keys. The keys of one user can easily be revoked without affecting other
users or the encrypted data at the server. After a user’s keys have been revoked,
the user will no longer be able to read and search the shared data.
2 Related Work
Many systems designed for securing untrusted storage rely on the clients to
encrypt the data. For example, in cryptographic distributed file systems such
as [23, 20, 16, 26], the untrusted file servers store encrypted files and have no
knowledge of the keys. Authorised users can retrieve an encrypted file by its
identifier and decrypt it using a key obtained from the owner of the file. The
servers perform only the basic I/O function and cannot do advanced operations
such as keyword search.
Several schemes have been developed to encrypt data on the client side and
enable server-side searches on encrypted data. Song et al. [25] introduced the
first practical scheme for searching on encrypted data. The scheme enables
clients to perform searches on encrypted text without disclosing any
information about the plaintext to the untrusted server. The untrusted server
cannot learn the plaintext given only the ciphertext; it cannot search without
the user’s authorisation, and it learns nothing more than the encrypted search
results. The basic idea is to generate a keyed hash for the keywords and store
this information inside the ciphertext. The server can search the keywords by
recalculating and matching the hash value. Yang et al. [29] proposed an elegant
scheme for performing queries on encrypted data and also provided a secure index
to speed up queries by two-step mapping. Goh’s scheme [15] enables searches on
encrypted data by employing a secure index based on a Bloom filter, which has
low storage overheads. Damiani et al. [12] proposed an approach to indexing
encrypted data which allows efficient data access. The indexed attribute values
are encrypted directly with a key or hashed. This approach also supports range
queries by creating an encrypted B+-tree. Agrawal et al. [4] proposed an
order-preserving encryption scheme for numeric data which allows queries based
on comparison conditions. The plaintext is encrypted so that the ciphertext
follows a target distribution provided by the user while the order of the data
is preserved.
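The keyed-hash idea described above can be illustrated with a minimal sketch. It is not the construction of [25], which is more involved; it only shows a server matching keyword tags it cannot compute itself. The names keyword_tag and ToyServer, the use of HMAC-SHA256 and the placeholder ciphertexts are all assumptions made for illustration. Deterministic tags of this kind reveal when two documents share a keyword.

```python
import hashlib
import hmac
import os


def keyword_tag(key: bytes, keyword: str) -> bytes:
    """Client side: deterministic keyed hash of a keyword (HMAC-SHA256)."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).digest()


class ToyServer:
    """Hypothetical server: stores opaque ciphertexts plus keyword tags.

    It never holds the key, so it cannot compute a tag for a keyword itself.
    """

    def __init__(self):
        self.records = []  # list of (doc_id, ciphertext, set of tags)

    def store(self, doc_id, ciphertext, tags):
        self.records.append((doc_id, ciphertext, set(tags)))

    def search(self, trapdoor: bytes):
        # Return ids of documents carrying a tag equal to the trapdoor.
        return [
            doc_id
            for doc_id, _, tags in self.records
            if any(hmac.compare_digest(trapdoor, tag) for tag in tags)
        ]


key = os.urandom(32)  # secret key kept by the client
server = ToyServer()
# The ciphertexts are placeholders; real data would be encrypted separately.
server.store("doc1", b"<ciphertext 1>",
             [keyword_tag(key, w) for w in ["invoice", "urgent"]])
server.store("doc2", b"<ciphertext 2>",
             [keyword_tag(key, w) for w in ["meeting"]])

hits = server.search(keyword_tag(key, "urgent"))
```

Only a holder of the key can form a valid search token, which is why schemes of this kind imply either a single user or a shared key.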
In the bucketization approach for searching encrypted databases [17, 18], an
attribute domain is partitioned into a set of buckets, each of which is
identified by a tag. These bucket tags are maintained as an index and are
utilised by the server to process the queries. Bucketization has relatively
small performance overhead and enables more complex queries such as range
queries and comparison queries at the cost of revealing more information about
the encrypted data.
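A small sketch may help to make the bucketization idea concrete. It is not taken from [17, 18]: the fixed bucket width, the HMAC-derived tags and the helper names are assumptions chosen for illustration, and the plaintext dictionary stands in for records that the client would decrypt after retrieval. Values are assumed to be non-negative integers.

```python
import hashlib
import hmac

BUCKET_WIDTH = 10  # assumed partition of the attribute domain: [0,9], [10,19], ...


def bucket_tag(key: bytes, value: int) -> str:
    """Client side: opaque tag of the bucket that contains `value`."""
    bucket = value // BUCKET_WIDTH
    return hmac.new(key, str(bucket).encode(), hashlib.sha256).hexdigest()[:16]


def range_tags(key: bytes, low: int, high: int) -> set:
    """Client side: translate a range query into the set of covering bucket tags."""
    first, last = low // BUCKET_WIDTH, high // BUCKET_WIDTH
    return {bucket_tag(key, b * BUCKET_WIDTH) for b in range(first, last + 1)}


def server_lookup(index: dict, tags: set) -> list:
    """Server side: sees only tags and returns every record in a matching bucket."""
    return sorted(rid for rid, tag in index.items() if tag in tags)


key = b"hypothetical client-side secret"
ages = {"r1": 23, "r2": 37, "r3": 41, "r4": 29}  # stands in for encrypted records

# Index held by the server: record id -> bucket tag.
index = {rid: bucket_tag(key, age) for rid, age in ages.items()}

# Query: 25 <= age <= 39, which covers buckets [20,29] and [30,39].
candidates = server_lookup(index, range_tags(key, 25, 39))

# The client decrypts the candidates and discards false positives such as r1.
result = sorted(rid for rid in candidates if 25 <= ages[rid] <= 39)
```

The server returns a superset of the answer, and it learns which records fall into the same bucket; this is the information leakage mentioned above.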
All the schemes above for searches on encrypted data rely on secret keys,
which implies single user access or sharing keys among a group of users.
Boneh et al. [8] presented a scheme for searches on encrypted data using
a public key system that allows mail gateways to handle email based on whether
certain keywords exist in the encrypted message. The application scenario is
similar to [25], but the scheme uses asymmetric encryption schemes instead of
symmetric ones. Asymmetric keys allow multiple users to encrypt data using
the public key, but only the user who has the private key can search and decrypt
the data. Curtmola et al. [11] partly solved the multi-user problem by using
broadcast encryption. The set of authorised users share a secret key r (which is
used in conjunction with a trapdoor function). Only people who know r will be
able to access/query the data. A user can be revoked by changing r, and using
broadcast encryption to send the new key r′ to the set of authorised users. The
revoked user does not know r′, and hence cannot search. In this scheme, the
database is searchable, but is read-only and cannot be updated. In our scheme,
any authorised user can read, search and update the encrypted data.
Our scheme is dependent on proxy encryption. The notion of proxy encryption
was first introduced in [7]. In a proxy encryption scheme, a ciphertext
encrypted by one key can be transformed by a proxy function into the
corresponding ciphertext for another key without revealing any information about
the keys and the plaintext. Proxy encryption schemes can be built on top of
different cryptosystems such as El Gamal [14] and RSA [24]. Applications of
proxy encryption include: secure email lists [22], access control systems [5] and
attribute based publishing of data [21]. A comprehensive study on proxy
cryptography can be found in [19].
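To illustrate how a proxy function can transform a ciphertext from one key to another, the following toy sketch uses a well-known key-splitting variant of El Gamal: a master secret x is split as x = x_user + x_proxy, the proxy strips its share, and the user removes the rest. It is offered as an assumption-laden illustration and is not claimed to be the construction of [7] or of the scheme proposed here. The modulus, the base, and all function names are hypothetical, and the parameters are not suitable for real use.

```python
import secrets

P = 2**127 - 1  # a Mersenne prime used as a toy modulus (not a vetted parameter)
G = 3           # assumed base; no claim that it generates a suitable subgroup
ORDER = P - 1   # exponents are reduced modulo the order of the multiplicative group


def keygen_master():
    """Trusted party: master secret x and public value h = G^x mod P."""
    x = secrets.randbelow(ORDER - 1) + 1
    return x, pow(G, x, P)


def split_for_user(x):
    """Split the master secret so that x = x_user + x_proxy (mod ORDER)."""
    x_user = secrets.randbelow(ORDER - 1) + 1
    return x_user, (x - x_user) % ORDER


def encrypt(h, m):
    """El Gamal encryption of 1 <= m < P under the master public value h."""
    r = secrets.randbelow(ORDER - 1) + 1
    return pow(G, r, P), (m * pow(h, r, P)) % P


def proxy_transform(ct, x_proxy):
    """Proxy: turn a ciphertext under x into one under x_user, without seeing m."""
    c1, c2 = ct
    return c1, (c2 * pow(pow(c1, x_proxy, P), -1, P)) % P


def user_decrypt(ct, x_user):
    """User: remove the remaining mask with the user's own share."""
    c1, c2 = ct
    return (c2 * pow(pow(c1, x_user, P), -1, P)) % P


x, h = keygen_master()
alice_key, alice_share = split_for_user(x)
bob_key, bob_share = split_for_user(x)
proxy_shares = {"alice": alice_share, "bob": bob_share}  # held by the proxy

message = 123456789
ct = encrypt(h, message)  # stored once; never re-encrypted

alice_plain = user_decrypt(proxy_transform(ct, proxy_shares["alice"]), alice_key)
bob_plain = user_decrypt(proxy_transform(ct, proxy_shares["bob"]), bob_key)

# Revoking Bob only requires the proxy to discard his share.
del proxy_shares["bob"]
alice_after = user_decrypt(proxy_transform(ct, proxy_shares["alice"]), alice_key)
```

In this sketch each user holds a different key and the stored ciphertext is untouched by revocation, which is the property that makes proxy encryption attractive for a multi-user setting.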
3 Threat Model
Before presenting our scheme, it is necessary to discuss the threat model. In this
section, we first identify the entities involved and the assumptions underlying
the system design. Then we identify the potential adversaries and the possible
attacks.
3.1 Entities
There are three types of entities in our system:
Users: Authorised users are able to read, write and search encrypted data
residing on the remote server. Sometimes we may need to revoke an
authorised user. After being revoked, the user is no longer able to access
the data.
Server: The main responsibility of the data storage server is to store and
retrieve encrypted data according to authorised users’ requests.
Key management server (KMS): The KMS is a fully trusted server which is
responsible for generating and revoking keys. It generates key sets for each
authorised user and is also responsible for securely distributing generated
key sets. When a user is no longer trusted to access the data, the KMS
revokes the user’s permission by revoking his keys.
Authorised users are fully trusted. They are given permissions to access the
shared data stored on the remote server by the data owner. They are believed
to behave properly and to protect their key sets properly. The data storage
server is not trusted, in the sense that we believe that the server will execute
the requests it receives correctly, but we do not rely on it to maintain data
confidentiality. In other words, the server is modelled as “honest but curious”
in our trust model. The KMS is also fully trusted. Although requiring a trusted
KMS seems at odds with using an untrusted data storage service, we argue that
the KMS requires fewer resources and less management effort. Securing the KMS
is much easier since a very limited amount of data needs to be protected and
the KMS can be kept offline most of the time.
3.2 Assumptions
We assume that there are mechanisms in place which ensure integrity and
availability of the remotely stored data. Our scheme focuses only on confidentiality