Proceedings Article

Practical techniques for searches on encrypted data

14 May 2000, pp. 44-55
TL;DR: This work describes cryptographic schemes for searching on encrypted data, provides proofs of security for the resulting crypto systems, and presents simple, fast algorithms that are practical to use today.
Abstract: It is desirable to store data on data storage servers such as mail servers and file servers in encrypted form to reduce security and privacy risks. But this usually implies that one has to sacrifice functionality for security. For example, if a client wishes to retrieve only documents containing certain words, it was not previously known how to let the data storage server perform the search and answer the query, without loss of data confidentiality. We describe our cryptographic schemes for the problem of searching on encrypted data and provide proofs of security for the resulting crypto systems. Our techniques have a number of crucial advantages. They are provably secure: they provide provable secrecy for encryption, in the sense that the untrusted server cannot learn anything about the plaintext when only given the ciphertext; they provide query isolation for searches, meaning that the untrusted server cannot learn anything more about the plaintext than the search result; they provide controlled searching, so that the untrusted server cannot search for an arbitrary word without the user's authorization; they also support hidden queries, so that the user may ask the untrusted server to search for a secret word without revealing the word to the server. The algorithms presented are simple, fast (for a document of length n, the encryption and search algorithms only need O(n) stream cipher and block cipher operations), and introduce almost no space and communication overhead, and hence are practical to use today.

Summary

1 Introduction

  • Today’s mail servers such as IMAP servers [11], file servers and other data storage servers typically must be fully trusted—they have access to the data, and hence must be trusted not to reveal it without authorization—which introduces undesirable security and privacy risks in applications.
  • The authors show how to support searching functionality without any loss of data confidentiality.
  • The techniques provide provable secrecy for encryption, in the sense that the untrusted server cannot learn anything about the plaintext given only the ciphertext.
  • The algorithms the authors present are simple and fast.
  • The authors may control the number of errors by adjusting a parameter $m$ in the encryption algorithm; each wrong position will be returned with probability about $1/2^m$, so for an $\ell$-word document, they expect to see about $\ell/2^m$ false matches.

2 Searching on Encrypted Data

  • The authors first define the problem of searching on encrypted data.
  • Each document can be divided up into ‘words’.
  • So the approach of using an index is more suitable for mostly-read-only data.
  • The authors adopt the standard definitions of security from the provable security literature [2], and they measure the strength of the cryptographic primitives in terms of the resources needed to break them.
  • The authors say that $G$ is a $(t, e)$-secure pseudorandom generator if every algorithm $A$ with running time at most $t$ has advantage $\mathrm{Adv}\,A < e$.

4 Our Solution with Sequential Scan

  • The authors introduce their solution for searching with sequential scan.
  • The authors first start with a basic scheme and show that its encryption algorithm provides provable secrecy.
  • The authors then show how they can extend the first scheme to handle controlled searching and hidden searches.
  • At the end, the authors describe their final scheme, which satisfies all the properties they mentioned earlier, including query isolation.

4.1 Scheme I: The Basic Scheme

  • Alice wants to encrypt a document which contains the sequence of words $W_1, \ldots, W_\ell$.
  • More specifically, the basic scheme is as follows.
  • Alice generates a sequence of pseudorandom values $S_1, \ldots, S_\ell$ using some stream cipher (namely, the pseudorandom generator $G$), where each $S_i$ is $n-m$ bits long.
  • Another alternative is to choose a new key $k_i$ for each position $i$, independent of all other keys.
  • The basic scheme supports searches over the ciphertext in the following way: if Alice wants to search for the word $W$, she can tell Bob $W$ and the $k_i$ corresponding to each location $i$ where the word $W$ may occur.

4.2 Scheme II: Controlled Searching

  • Let $f$ be an additional pseudorandom function, which will be keyed independently of $F$.
  • Suppose $F$ is a $(t, \ell, e_F)$-secure pseudorandom function, $f$ is a $(t, \ell, e_f)$-secure pseudorandom function, and $G$ is a $(t, e_G)$-secure pseudorandom generator.
  • The authors can take this idea even further by using a hierarchical key management scheme.
  • Then she can reveal either (1) the per-chapter key for each chapter of interest or (2) the word-specific key itself if she wishes to succinctly authorize Bob to search for $W$ in all the chapters.

4.3 Scheme III: Support for Hidden Searches

  • Suppose Alice would now like to ask Bob to search for a word $W$ but she is not willing to reveal $W$ to Bob.
  • The authors propose a simple extension to the above scheme to support this goal.
  • Note that $E$ is not allowed to use any randomness, and the computation of $E_{k''}(x)$ may depend only on $x$ and must not depend on the position $i$ in the document where $x = W_i$ is found.
  • After the pre-encryption phase, Alice has a sequence of $E$-encrypted words $X_1, \ldots, X_\ell$.
  • Note that this allows Bob to search for $W$ without revealing $W$ itself.

4.4 Scheme IV: The Final Scheme

  • Careful readers may have noticed that Scheme III actually suffers from a small inadequacy: if Alice generates keys $k_i$ as $k_i := f_{k'}(E_{k''}(W_i))$, then Alice can no longer recover the plaintext from just the ciphertext because she would need to know $E_{k''}(W_i)$ (more precisely, the last $m$ bits of $E_{k''}(W_i)$) before she can decrypt.
  • (Scheme II also has a similar inadequacy, but as the authors will show below, the best way to fix it is to introduce pre-encryption as in Scheme III.)
  • In the fixed scheme, the authors split the pre-encrypted word $X_i = E_{k''}(W_i)$ into two parts, $X_i = \langle L_i, R_i \rangle$, where $L_i$ (respectively $R_i$) denotes the first $n-m$ bits (resp. last $m$ bits) of $X_i$.
  • Knowledge of $L_i$ allows Alice to compute $k_i = f_{k'}(L_i)$ and thus finish the decryption.
  • Moreover, if the authors disclose one $k_i$ and consider the reduced sequence obtained by discarding all the $T_j$ values at positions where $k_j = k_i$, then they still obtain a secure pseudorandom generator.

5.1 Other Practical Considerations

  • The authors can see that updates in this scheme are easy.
  • If Alice wants to add a new document into Bob’s data storage, she can simply encrypt it in the appropriate way and instruct Bob to append it to the already-stored ciphertext.
  • Moreover, since the keys can be generated hierarchically from a master key, key storage and management are also very convenient.
  • Alice only needs to remember one password, the master key.
  • The underlying technique of embedding information in pseudorandom bit streams may also be of independent interest: the authors speculate that this simple trick might prove useful for other applications, too.

5.2 Supporting More Advanced Search Queries

  • The schemes the authors presented earlier only address the problem of searching for a single word.
  • The authors show several examples to illustrate that it is relatively easy to implement more advanced searching functionality using their scheme as a fundamental building block.
  • The authors can also support searches if the query is given as a regular expression using, e.g., wildcards in a limited form.
  • For many applications the purpose of the search is to find documents which contain a specific word, where the position or the number of occurrences are not relevant.
  • The authors add a count to each word, which counts how many times that word occurs previously in that document.
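The occurrence counter described in the last bullet can be pictured with the following minimal Python sketch. The helper name `label_with_counts` and the tuple representation are illustrative assumptions, not the authors' encoding; in the scheme itself each labelled pair would presumably be packed into a fixed-length 'word' before encryption.

```python
from collections import defaultdict

def label_with_counts(words):
    """Pair each word with the number of times it has already occurred."""
    seen = defaultdict(int)
    labelled = []
    for w in words:
        labelled.append((w, seen[w]))  # count of earlier occurrences of w
        seen[w] += 1
    return labelled

doc = ["urgent", "meeting", "urgent", "today"]
labelled = label_with_counts(doc)
```

A search for the pair `("urgent", 0)` then matches at most one position per document, so the result indicates whether the document contains the word without listing every occurrence.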

5.3 Dealing with Variable-Length Words

  • In their scheme, the minimal unit the authors can search for is an individual word.
  • One possibility is to pick a fixed-size block that is long enough to contain most words.
  • Such a padding scheme would introduce space inefficiency.
  • When word lengths may vary, it is important to hide the length information from the server, because revealing the length of each word might allow for statistical attacks.
  • Fortunately, in this case the server does not need to know the lengths to perform a search: he may just scan through the file and check for a match at each possible bit boundary.

5.4 Searching with an Encrypted Index

  • Sequential scan may not be efficient enough when the data size is large.
  • The interesting question is how to encrypt the index.
  • Alice may decrypt the encrypted entries and send Bob another request to retrieve the relevant documents.
  • Note that by keeping the lists of pointers in a fixed-size list, the authors are mainly preventing Bob from learning statistical information on the key words that he has not searched.
  • Note that a general disadvantage for index search is that whenever Alice changes her documents, she must update the index.
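The summary above does not spell out how the index is encrypted, so the following Python sketch is only one possible reading under loud assumptions: keyword tokens and the pads that encrypt each pointer list are derived with HMAC-SHA256 (a stand-in, not the paper's primitives), every entry holds exactly `SLOTS` pointers, and unused slots carry the filler value `EMPTY`. All names (`build_index`, `lookup`, `SLOTS`, `EMPTY`) are hypothetical.

```python
import hashlib
import hmac
import struct

SLOTS = 4            # fixed number of pointer slots per entry (assumed)
EMPTY = 0xFFFFFFFF   # filler value marking an unused slot (assumed)

def prf(key: bytes, data: bytes, out_len: int) -> bytes:
    """HMAC-SHA256 truncated to out_len (<= 32) bytes, used as a stand-in PRF."""
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def build_index(k_word: bytes, k_list: bytes, docs: dict) -> dict:
    """Map an opaque keyword token to an encrypted, fixed-size pointer list."""
    postings = {}
    for doc_id, words in docs.items():
        for w in set(words):
            postings.setdefault(w, []).append(doc_id)
    index = {}
    for w, ids in postings.items():
        ids = sorted(ids)[:SLOTS]  # overflow handling is left out of this sketch
        plain = struct.pack(f">{SLOTS}I", *(ids + [EMPTY] * (SLOTS - len(ids))))
        token = prf(k_word, w, 16)
        index[token] = xor(plain, prf(k_list, token, len(plain)))
    return index

def lookup(k_word: bytes, k_list: bytes, index: dict, word: bytes) -> list:
    """Alice computes the token, Bob returns the entry, Alice decrypts it."""
    token = prf(k_word, word, 16)
    blob = index.get(token)  # the only step Bob performs
    if blob is None:
        return []
    plain = xor(blob, prf(k_list, token, len(blob)))
    return [i for i in struct.unpack(f">{SLOTS}I", plain) if i != EMPTY]

docs = {1: [b"urgent", b"meeting"], 2: [b"lunch"], 3: [b"urgent"]}
index = build_index(b"word-key", b"list-key", docs)
```

Because every stored entry has the same length, Bob cannot tell from the index alone which keywords are common and which are rare. Each pad is used once here; updating an entry in place would need a fresh pad, which is part of the update cost the last bullet mentions.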

5.5 More Security Issues

  • In all their schemes, by allowing Bob to search for a word $W$ the authors effectively disclose to him a list of potential locations where $W$ might occur.
  • If the authors allow Bob to search for too many words, he may be able to use statistical techniques to start learning important information about the documents.
  • One possible defense is to decrease $m$ (so that false matches are more prevalent and thus Bob’s information about the plaintext is ‘noisy’), but the authors have not analyzed the cost-effectiveness of this tradeoff in any detail.
  • In all the schemes the authors have discussed so far, they must trust Bob to return all the search results.
  • Even when this type of attack is present, it is possible to combine their scheme with hash tree techniques [17] to ensure the integrity of the data and detect such attacks, although a full description of this countermeasure is out of the scope of the paper.

7 Conclusion

  • The authors have described new techniques for remote searching on encrypted data using an untrusted server and provided proofs of security for the resulting crypto systems.
  • The techniques have a number of crucial advantages: they are provably secure; they support controlled and hidden search and query isolation; they are simple and fast (more specifically, for a document of length $n$, the encryption and search algorithms only need $O(n)$ stream cipher and block cipher operations); and they introduce almost no space and communication overhead.
  • The authors' scheme is also very flexible, and it can easily be extended to support more advanced search queries.
  • The authors conclude that this provides a powerful new building block for the construction of secure services in the untrusted infrastructure.


Practical Techniques for Searches on Encrypted Data
Dawn Xiaodong Song David Wagner Adrian Perrig
{dawnsong, daw, perrig}@cs.berkeley.edu
University of California, Berkeley
Abstract
It is desirable to store data on data storage servers such as mail servers and file servers in encrypted form to reduce security and privacy risks. But this usually implies that one has to sacrifice functionality for security. For example, if a client wishes to retrieve only documents containing certain words, it was not previously known how to let the data storage server perform the search and answer the query without loss of data confidentiality.

In this paper, we describe our cryptographic schemes for the problem of searching on encrypted data and provide proofs of security for the resulting crypto systems. Our techniques have a number of crucial advantages. They are provably secure: they provide provable secrecy for encryption, in the sense that the untrusted server cannot learn anything about the plaintext when only given the ciphertext; they provide query isolation for searches, meaning that the untrusted server cannot learn anything more about the plaintext than the search result; they provide controlled searching, so that the untrusted server cannot search for an arbitrary word without the user’s authorization; they also support hidden queries, so that the user may ask the untrusted server to search for a secret word without revealing the word to the server. The algorithms we present are simple, fast (for a document of length $n$, the encryption and search algorithms only need $O(n)$ stream cipher and block cipher operations), and introduce almost no space and communication overhead, and hence are practical to use today.

We gratefully acknowledge support for this research from several US government agencies. This research was supported in part by the Defense Advanced Research Projects Agency under DARPA contract N6601-99-28913 (under supervision of the Space and Naval Warfare Systems Center San Diego), by the National Science Foundation under grant FD99-79852, and by the United States Postal Service under grant USPS 1025 90-98-C-3513. Views and conclusions contained in this document are those of the authors and do not necessarily represent the official opinion or policies, either expressed or implied, of the US government or any of its agencies, DARPA, NSF, USPS.

1 Introduction
Today’s mail servers such as IMAP servers [11], file servers and other data storage servers typically must be fully trusted: they have access to the data, and hence must be trusted not to reveal it without authorization, which introduces undesirable security and privacy risks in applications. Previous work shows how to build encrypted file systems and secure mail servers, but typically one must sacrifice functionality to ensure security. The fundamental problem is that moving the computation to the data storage seems very difficult when the data is encrypted, and many computation problems over encrypted data previously had no practical solutions.

In this paper, we show how to support searching functionality without any loss of data confidentiality. An example is where a mobile user with limited bandwidth wants to retrieve all email containing the word “Urgent” from an untrusted mail-storage server in the infrastructure. This is trivial to do when the server knows the content of the data, but how can we support search queries if we do not wish to reveal all our email to the server?

Our answer is to present cryptographic schemes that enable searching on encrypted data without leaking any information to the untrusted server.

Our techniques are provably secure. The techniques provide provable secrecy for encryption, in the sense that the untrusted server cannot learn anything about the plaintext given only the ciphertext. The techniques provide controlled searching, so that the untrusted server cannot search for a word without the user’s authorization. The techniques support hidden queries, so that the user may ask the untrusted server to search for a secret word without revealing the word to the server. The techniques also support query isolation, meaning that the untrusted server learns nothing more than the search result about the plaintext.

Our schemes are efficient and practical. The algorithms we present are simple and fast. More specifically, for a document of length $n$, the encryption and search algorithms only need $O(n)$ stream cipher and block cipher operations. Our schemes introduce essentially no space and communication overhead. They are also flexible and can be easily extended to support more advanced searches.

Our schemes all take the form of probabilistic searching: a search for the word $W$ returns all the positions where $W$ occurs in the plaintext, as well as possibly some other erroneous positions. We may control the number of errors by adjusting a parameter $m$ in the encryption algorithm; each wrong position will be returned with probability about $1/2^m$, so for an $\ell$-word document, we expect to see about $\ell/2^m$ false matches. The user will be able to eliminate all the false matches (by decrypting), so in remote searching applications, false matches should not be a problem so long as they are not so common that they overwhelm the communication channel between the user and the server.
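
As a quick numeric illustration of this tradeoff, the following Python sketch evaluates the expected number of false matches for a document of a given length. The function name and the sample values are illustrative assumptions, not figures from the paper.

```python
def expected_false_matches(num_words: int, m_bits: int) -> float:
    """Expected false matches when each wrong position passes with probability 2^-m."""
    return num_words / 2 ** m_bits

few = expected_false_matches(1_000_000, 16)    # about 15 false matches
fewer = expected_false_matches(1_000_000, 32)  # far below one false match
```

Increasing the parameter by one bit halves the expected number of false matches, so a modest value already keeps the extra traffic small for documents of this size.
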
This paper is structured as follows. We first introduce the problem of searching on encrypted data in Section 2 and briefly review some important background in Section 3. We then describe our solution for the case of searching with sequential scan in Section 4. We discuss further issues such as advanced search and search with index in Section 5. We discuss related work in Section 6 and finally we conclude in Section 7. Appendix A presents the proofs of security for all of these schemes.
2 Searching on Encrypted Data
We first define the problem of searching on encrypted data.

Assume Alice has a set of documents and stores them on an untrusted server Bob. For example, Alice could be a mobile user who stores her email messages on an untrusted mail server. Because Bob is untrusted, Alice wishes to encrypt her documents and only store the ciphertext on Bob. Each document can be divided up into ‘words’. Each ‘word’ may be any token; it may be a 64-bit block, an English word, a sentence, or some other atomic quantity, according to the application domain of interest. For simplicity, we typically assume these ‘words’ have the same length (otherwise we can either pad the shorter ‘words’ or split longer ‘words’ to make all the ‘words’ have equal length, or use some simple extensions for variable length ‘words’; see also Section 5.3). Because Alice may have only a low-bandwidth network connection to the server Bob, she wishes to only retrieve the documents which contain the word $W$. In order to achieve this goal, we need to design a scheme so that after performing certain computations over the ciphertext, Bob can determine with some probability whether each document contains the word $W$ without learning anything else.

There seem to be two types of approaches. One possibility is to build up an index that, for each word $W$ of interest, lists the documents that contain $W$. An alternative is to perform a sequential scan without an index. The advantage of using an index is that it may be faster than the sequential scan when the documents are large. The disadvantage of using an index is that storing and updating the index can incur substantial overhead. So the approach of using an index is more suitable for mostly-read-only data.

We first describe our scheme for searching on encrypted data without an index. Since the index-based schemes seem to require less sophisticated constructions, we will defer discussion of searching with an index until the end of the paper (see Section 5.4).
3 Background and Definitions
Our scheme requires several fundamental primitives from classical symmetric-key cryptography. Because we will prove our scheme secure, we use only primitives with a well-defined notion of security. We will list here the required primitives, as well as reviewing the standard definitions of security for them. The definitions may be skipped on first reading for those uninterested in our theoretical proofs of security.

We adopt the standard definitions of security from the provable security literature [2], and we measure the strength of the cryptographic primitives in terms of the resources needed to break them. We will say that an attack $R$-breaks a cryptographic primitive if the attack algorithm succeeds in breaking the primitive with resources specified by $R$, and we say that a crypto primitive is $R$-secure if there is no algorithm that can $R$-break it. Let $A : \{0,1\}^n \to \{0,1\}$ be an arbitrary algorithm and let $X$ and $Y$ be random variables distributed on $\{0,1\}^n$. The distinguishing probability of $A$ (sometimes called the advantage of $A$) for $X$ and $Y$ is

$$\mathrm{Adv}\,A = \left| \Pr[A(X) = 1] - \Pr[A(Y) = 1] \right|.$$

With this background, our list of required primitives is as follows:

1. A pseudorandom generator $G$, i.e., a stream cipher. We say that $G : \mathcal{K}_G \to \mathcal{S}$ is a $(t, e)$-secure pseudorandom generator if every algorithm $A$ with running time at most $t$ has advantage $\mathrm{Adv}\,A < e$. The advantage of an adversary $A$ is defined as $\mathrm{Adv}\,A = |\Pr[A(G(U_{\mathcal{K}_G})) = 1] - \Pr[A(U_{\mathcal{S}}) = 1]|$, where $U_{\mathcal{K}_G}, U_{\mathcal{S}}$ are random variables distributed uniformly on $\mathcal{K}_G, \mathcal{S}$.

2. A pseudorandom function $F$. We say that $F : \mathcal{K}_F \times \mathcal{X} \to \mathcal{Y}$ is a $(t, q, e)$-secure pseudorandom function if every oracle algorithm $A$ making at most $q$ oracle queries and with running time at most $t$ has advantage $\mathrm{Adv}\,A < e$. The advantage is defined as $\mathrm{Adv}\,A = |\Pr[A^{F_k} = 1] - \Pr[A^{g} = 1]|$, where $g$ represents a random function selected uniformly from the set of all maps from $\mathcal{X}$ to $\mathcal{Y}$, and where the probabilities are taken over the choice of $k$ and $g$.

3. A pseudorandom permutation $E$, i.e., a block cipher. We say that $E : \mathcal{K}_E \times \mathcal{Z} \to \mathcal{Z}$ is a $(t, q, e)$-secure pseudorandom permutation if every oracle algorithm $A$ making at most $q$ oracle queries and with running time at most $t$ has advantage $\mathrm{Adv}\,A < e$. The advantage is defined as $\mathrm{Adv}\,A = |\Pr[A^{E_k, E_k^{-1}} = 1] - \Pr[A^{\pi, \pi^{-1}} = 1]|$, where $\pi$ represents a random permutation selected uniformly from the set of all bijections on $\mathcal{Z}$, and where the probabilities are taken over the choice of $k$ and $\pi$. Notice that the adversary is given an oracle for encryption as well as for decryption; this corresponds to the adaptive chosen-plaintext/ciphertext attack model.

In general, the intuition is that $(t, q, e)$-security represents resistance to attacks that use at most $t$ offline work and at most $q$ adaptive chosen-text queries.

There is of course no fundamental need for three separate primitives, since in practice all three may be built out of just one off-the-shelf primitive. For instance, given any block cipher, we may build a pseudorandom generator using the counter mode [3] or a pseudorandom function using the CBC-MAC [4].
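
The idea of deriving several primitives from a single one can be sketched in Python as follows. This is an assumption-laden illustration, not the paper's construction: because the Python standard library has no block cipher, HMAC-SHA256 is swapped in as the single off-the-shelf primitive, and the names `prf` and `prg` are hypothetical. The generator runs the keyed function in counter mode, as the text suggests for a block cipher.

```python
import hashlib
import hmac

def prf(key: bytes, x: bytes, out_len: int = 16) -> bytes:
    """Keyed function: HMAC-SHA256 truncated to out_len (<= 32) bytes."""
    return hmac.new(key, x, hashlib.sha256).digest()[:out_len]

def prg(seed: bytes, n_bytes: int) -> bytes:
    """Counter mode: concatenate prf(seed, 0), prf(seed, 1), ... and truncate."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += prf(seed, counter.to_bytes(8, "big"), 32)
        counter += 1
    return bytes(out[:n_bytes])

stream = prg(b"seed", 100)
```

The stream is a deterministic function of the seed, so anyone holding the seed can regenerate any prefix of it, which is the property the encryption schemes rely on.
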
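We rely on the following notation. If $f : \mathcal{K} \times \mathcal{X} \to \mathcal{Y}$ represents a pseudorandom function or permutation, we write $f_k(x)$ for the result of applying $f$ to input $x$ with key $k$. We write $\langle x, y \rangle$ for the concatenation of $x$ and $y$, and $x \oplus y$ for the bitwise XOR of $x$ and $y$. For the remainder of the paper, we let $G : \mathcal{K}_G \to \mathcal{X}^\ell$ be a pseudorandom generator for some $\ell$, $F : \mathcal{K}_F \times \mathcal{X} \to \mathcal{Y}$ be a pseudorandom function, and $E : \mathcal{K}_E \times \mathcal{Z} \to \mathcal{Z}$ be a pseudorandom permutation. Typically we will have $\mathcal{X} = \{0,1\}^{n-m}$, $\mathcal{Y} = \{0,1\}^{m}$, and $\mathcal{Z} = \mathcal{X} \times \mathcal{Y} = \{0,1\}^{n}$.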
4 Our Solution with Sequential Scan
In this section, we introduce our solution for searching with sequential scan. We first start with a basic scheme and show that its encryption algorithm provides provable secrecy. We then show how we can extend the first scheme to handle controlled searching and hidden searches. Finally, we describe our final scheme, which satisfies all the properties we mentioned earlier, including query isolation.
4.1 Scheme I: The Basic Scheme
Alice wants to encrypt a document which contains the sequence of words $W_1, \ldots, W_\ell$. Intuitively, the scheme works by computing the bitwise exclusive or (XOR) of the clear-text with a sequence of pseudorandom bits which have a special structure. This structure will allow searching on the data without revealing anything else about the clear text.

More specifically, the basic scheme is as follows. Alice generates a sequence of pseudorandom values $S_1, \ldots, S_\ell$ using some stream cipher (namely, the pseudorandom generator $G$), where each $S_i$ is $n-m$ bits long. To encrypt an $n$-bit word $W_i$ that appears in position $i$, Alice takes the pseudorandom bits $S_i$, sets $T_i := \langle S_i, F_{k_i}(S_i) \rangle$, and outputs the ciphertext $C_i := W_i \oplus T_i$. Note that only Alice can generate the pseudorandom stream $T_1, \ldots, T_\ell$, so no one else can decrypt. Of course, encryption can be done on-line, so that we encrypt each word as it becomes available.

There is some flexibility in how the keys $k_i$ may be chosen. One possibility is to use the same key $k$ at every position in the document. Another alternative is to choose a new key $k_i$ for each position $i$ independent of all other keys. More generally, at each position, Alice can either (a) choose $k_i$ to be the same as some previous $k_j$ ($j < i$), or (b) choose $k_i$ independently of all the previous keys. We shall see later how this flexibility allows us to support a variety of interesting features.

The basic scheme provides provable secrecy if the pseudorandom function $F$ and the pseudorandom generator $G$ are secure. By this, we mean that, at each position where $k_i$ is unknown, the values $T_i$ are indistinguishable from truly random bits for any computationally-bounded adversary. We formalize the theorem as below.

Theorem 4.1. If $F$ is a $(t, \ell, e_F)$-secure pseudorandom function and $G$ is a $(t, e_G)$-secure pseudorandom generator, and if the key material is chosen as described above, then the algorithm described above for generating the sequence $T_1, \ldots, T_\ell$ is a $(t - \epsilon, e_H)$-secure pseudorandom generator, where the bound $e_H$ is made precise in Appendix A and the constant $\epsilon$ is negligible compared to $t$.

In other words, we expect the basic scheme to be good for encrypting a large but bounded number of words, if the pseudorandom function and pseudorandom generator are adequately secure. See Appendix A for a slightly more precise statement of the theorem and for a full proof.

The basic scheme supports searches over the ciphertext in the following way: if Alice wants to search for the word $W$, she can tell Bob $W$ and the $k_i$ corresponding to each location $i$ where the word $W$ may occur. Bob can then search for $W$ in the ciphertext by checking whether $C_i \oplus W$ is of the form $\langle s, F_{k_i}(s) \rangle$ for some $s$. Such a search can be performed in linear time. At the positions where Bob does not know $k_i$, Bob learns nothing about the plaintext. Thus, the scheme allows a limited form of control: if Alice only wants Bob to be able to search over the first half of the ciphertext, Alice should reveal only the $k_i$ corresponding to those locations and none of the $k_i$ used in the second half of the ciphertext.

[Figure 1. The Basic Scheme. Diagram labels: Plaintext, Stream Cipher, Ciphertext.]
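
The encryption and search steps of the basic scheme can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the authors' implementation: HMAC-SHA256 stands in for both the pseudorandom function and the blocks of the stream cipher, lengths are counted in bytes instead of bits, and the sizes `N = 16` and `M = 4` as well as all function names are hypothetical.

```python
import hashlib
import hmac

N, M = 16, 4  # word length n and check length m, in bytes here rather than bits

def prf(key: bytes, data: bytes, out_len: int) -> bytes:
    """HMAC-SHA256 truncated to out_len bytes (stand-in for F and for G's blocks)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def pad_value(seed: bytes, k_i: bytes, i: int) -> bytes:
    """T_i = <S_i, F_{k_i}(S_i)>, with S_i the i-th block of the stream from seed."""
    s_i = prf(seed, i.to_bytes(8, "big"), N - M)
    return s_i + prf(k_i, s_i, M)

def encrypt(seed: bytes, keys: list, words: list) -> list:
    """C_i = W_i xor T_i for every position i."""
    return [xor(w, pad_value(seed, k, i)) for i, (w, k) in enumerate(zip(words, keys))]

def decrypt(seed: bytes, keys: list, cipher: list) -> list:
    """Alice regenerates T_i from the seed and the keys, then XORs it away."""
    return [xor(c, pad_value(seed, k, i)) for i, (c, k) in enumerate(zip(cipher, keys))]

def matches(c_i: bytes, word: bytes, k_i: bytes) -> bool:
    """Bob's check: is C_i xor W of the form <s, F_{k_i}(s)>?"""
    t = xor(c_i, word)
    return hmac.compare_digest(t[N - M:], prf(k_i, t[:N - M], M))

words = [w.ljust(N, b"\0") for w in (b"urgent", b"meeting", b"urgent", b"today")]
keys = [b"k-first-half"] * 2 + [b"k-second-half"] * 2
cipher = encrypt(b"alice-seed", keys, words)
hits = [i for i in range(2) if matches(cipher[i], words[0], keys[i])]
```

Here Bob is given only the key used in the first half of the document, so he can test positions 0 and 1 but cannot recognise the second occurrence of the word at position 2, mirroring the limited form of control described above.
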
As described so far, the basic scheme is not terribly satisfying: if Alice wants to help Bob search for a word $W$, either Alice must reveal all the $k_i$ (thus potentially revealing the entire document), or Alice must know in advance which locations $W$ may appear at (which seems to defeat the purpose of remote searching). However, we shall see next how to take care of this difficulty.
4.2 Scheme II: Controlled Searching
Let $f : \mathcal{K}_F \times \{0,1\}^* \to \mathcal{K}_F$ be an additional pseudorandom function, which will be keyed independently of $F$. The main idea is to choose our keys as $k_i := f_{k'}(W_i)$. We require that $k'$ be chosen uniformly randomly in $\mathcal{K}_F$ by Alice and never be revealed. Then, if Alice wishes to allow Bob to search for the word $W$, she reveals $f_{k'}(W)$ and $W$ to him. This allows Bob to identify all the locations where $W$ might occur, but reveals absolutely nothing on the locations $i$ where $W_i \neq W$. This attains our desired goal of controlled searching. We show the correctness of this approach in the following theorem.

Theorem 4.2. Suppose $F$ is a $(t, \ell, e_F)$-secure pseudorandom function, $f$ is a $(t, \ell, e_f)$-secure pseudorandom function, and $G$ is a $(t, e_G)$-secure pseudorandom generator. If the key material is chosen as described above, then the algorithm described above for generating the sequence $T_1, \ldots, T_\ell$ will be a $(t - \epsilon, e_H)$-secure pseudorandom generator, where the bound $e_H$ is made precise in Appendix A.

This shows that our scheme for controlled searching is about as good as the basic scheme, if the underlying primitives are secure. See Appendix A for a proof as well as a more precise formulation.
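
A minimal Python sketch of controlled searching follows. It assumes HMAC-SHA256 as a stand-in for both pseudorandom functions and for the stream blocks, byte-sized parameters `N` and `M`, and hypothetical function names; it is meant to show the key derivation per word, not to reproduce the authors' implementation.

```python
import hashlib
import hmac

N, M = 16, 4  # word length and check length in bytes (illustrative sizes)

def prf(key: bytes, data: bytes, out_len: int) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def word_key(k_prime: bytes, word: bytes) -> bytes:
    """k_i = f_{k'}(W_i): the key depends only on the word, not on the position."""
    return prf(k_prime, word, 32)

def encrypt(seed: bytes, k_prime: bytes, words: list) -> list:
    cipher = []
    for i, w in enumerate(words):
        s_i = prf(seed, i.to_bytes(8, "big"), N - M)
        t_i = s_i + prf(word_key(k_prime, w), s_i, M)
        cipher.append(xor(w, t_i))
    return cipher

def search(cipher: list, word: bytes, k: bytes) -> list:
    """Run by Bob, who is given only the word W and k = f_{k'}(W)."""
    hits = []
    for i, c in enumerate(cipher):
        t = xor(c, word)
        if hmac.compare_digest(t[N - M:], prf(k, t[:N - M], M)):
            hits.append(i)
    return hits

urgent, meeting, today = (w.ljust(N, b"\0") for w in (b"urgent", b"meeting", b"today"))
cipher = encrypt(b"alice-seed", b"alice-k-prime", [urgent, meeting, urgent, today])
hits = search(cipher, urgent, word_key(b"alice-k-prime", urgent))
```

Bob can now scan the whole document for the authorised word, yet the key he holds does not help him test for any other word.
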
Various extensions of this idea are possible. If the document to be encrypted consists of a series of chapters, an alternative approach is to generate the key $k_i$ for the word $W_i$ in chapter $c$ as $k_i := f_{k'}(\langle c, W_i \rangle)$. This allows Alice to control which chapters Bob may search in as well as controlling which words Bob may search for.

We can take this idea even further by using a hierarchical key management scheme. Alice first derives a word-specific key from $k'$ and the word $W_i$, and then derives the key $k_i$ for chapter $c$ from that word-specific key and $c$. Then she can reveal either (1) the per-chapter key for each chapter of interest or (2) the word-specific key itself if she wishes to succinctly authorize Bob to search for $W$ in all the chapters.
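
The two-level derivation can be sketched as follows in Python. The labels `b"word:"` and `b"chapter:"`, the use of HMAC-SHA256 for the key-derivation function, and the function names are assumptions made for this illustration; the paper's exact input encoding is not reproduced here.

```python
import hashlib
import hmac

def f(key: bytes, data: bytes) -> bytes:
    """Stand-in for the keyed function used to derive keys."""
    return hmac.new(key, data, hashlib.sha256).digest()

def word_key(k_prime: bytes, word: bytes) -> bytes:
    """Word-specific key: authorises searching for this word in every chapter."""
    return f(k_prime, b"word:" + word)

def chapter_key(k_word: bytes, chapter: int) -> bytes:
    """Per-chapter key: authorises searching for the word in one chapter only."""
    return f(k_word, b"chapter:" + chapter.to_bytes(4, "big"))

k_prime = b"alice-master-key"
k_urgent = word_key(k_prime, b"urgent")
k_urgent_ch2 = chapter_key(k_urgent, 2)
```

Anyone holding `k_urgent` can recompute the key for any chapter, whereas `k_urgent_ch2` alone gives no practical way back to `k_urgent` or to the keys of other chapters, as long as the derivation function behaves like a pseudorandom function.
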

This scheme still does not support hidden search queries: in order to let Bob search for the locations where the word $W$ appears, Alice has to reveal $W$ to Bob. We shall see next that this problem can be easily fixed.
4.3 Scheme III: Support for Hidden Searches
Suppose Alice would now like to ask Bob to search for a word $W$ but she is not willing to reveal $W$ to Bob. We propose a simple extension to the above scheme to support this goal.

Alice should merely pre-encrypt each word $W$ of the clear text separately using a deterministic encryption algorithm $E_{k''}$. Note that $E$ is not allowed to use any randomness, and the computation of $E_{k''}(x)$ may depend only on $x$ and must not depend on the position $i$ in the document where $x = W_i$ is found. So we may think of this pre-encryption step as ECB encryption of the words of the document using some block cipher. (Of course, if the word is very long, internally the map $E_{k''}$ may be implemented by CBC-encrypting $W_i$ with a constant IV, or some such, but the point is that this process must be the same at every position of the document.) We let $X_i := E_{k''}(W_i)$.

After the pre-encryption phase, Alice has a sequence of $E$-encrypted words $X_1, \ldots, X_\ell$. Now she post-encrypts that sequence using the stream cipher construction described above to obtain $C_i := X_i \oplus T_i$, where $X_i := E_{k''}(W_i)$ and $T_i := \langle S_i, F_{k_i}(S_i) \rangle$.

To search for a word $W$, Alice computes $X := E_{k''}(W)$ and $k := f_{k'}(X)$, and sends $\langle X, k \rangle$ to Bob. Note that this allows Bob to search for $W$ without revealing $W$ itself. It is easy to see that this scheme satisfies the hidden search property as long as the pre-encryption $E$ is secure.

[Figure 2. The Scheme for Hidden Search. Diagram labels: Plaintext, Stream Cipher, Ciphertext.]
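
The hidden-search extension can be sketched in Python as below. Everything cryptographic here is a stand-in chosen so the example runs with the standard library only: the deterministic pre-encryption is a small four-round Feistel network keyed through HMAC-SHA256 (an assumption, in place of a real block cipher), HMAC-SHA256 also plays the role of the pseudorandom functions, and the sizes and names are hypothetical.

```python
import hashlib
import hmac

N, M = 16, 4  # word length and check length in bytes (illustrative sizes)

def prf(key: bytes, data: bytes, out_len: int) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def pre_encrypt(k2: bytes, word: bytes, rounds: int = 4) -> bytes:
    """Deterministic X = E_{k''}(W): a Feistel network over N-byte words."""
    half = N // 2
    left, right = word[:half], word[half:]
    for r in range(rounds):
        left, right = right, xor(left, prf(k2, bytes([r]) + right, half))
    return left + right

def encrypt(seed: bytes, k1: bytes, k2: bytes, words: list) -> list:
    cipher = []
    for i, w in enumerate(words):
        x_i = pre_encrypt(k2, w)
        k_i = prf(k1, x_i, 32)                       # k_i = f_{k'}(X_i)
        s_i = prf(seed, i.to_bytes(8, "big"), N - M)
        cipher.append(xor(x_i, s_i + prf(k_i, s_i, M)))  # C_i = X_i xor T_i
    return cipher

def trapdoor(k1: bytes, k2: bytes, word: bytes) -> tuple:
    """What Alice sends to Bob: <X, k>, never the word itself."""
    x = pre_encrypt(k2, word)
    return x, prf(k1, x, 32)

def search(cipher: list, x: bytes, k: bytes) -> list:
    hits = []
    for i, c in enumerate(cipher):
        t = xor(c, x)
        if hmac.compare_digest(t[N - M:], prf(k, t[:N - M], M)):
            hits.append(i)
    return hits

urgent, meeting, today = (w.ljust(N, b"\0") for w in (b"urgent", b"meeting", b"today"))
cipher = encrypt(b"seed", b"k-prime", b"k-double-prime", [urgent, meeting, urgent, today])
x, k = trapdoor(b"k-prime", b"k-double-prime", urgent)
hits = search(cipher, x, k)
```

Bob sees only the pre-encrypted value and the derived key, so he can locate the matches without learning which word was queried.
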
4.4 Scheme IV: The Final Scheme
Careful readers may have noticed that Scheme III actually suffers from a small inadequacy: if Alice generates keys $k_i$ as $k_i := f_{k'}(E_{k''}(W_i))$, then Alice can no longer recover the plaintext from just the ciphertext because she would need to know $E_{k''}(W_i)$ (more precisely, the last $m$ bits of $E_{k''}(W_i)$) before she can decrypt. This defeats the purpose of an encryption scheme, because even legitimate principals with access to the decryption keys will be unable to decrypt. (Scheme II also has a similar inadequacy, but as we will show below, the best way to fix it is to introduce pre-encryption as in Scheme III.)

We now show a simple fix for this problem. In the fixed scheme, we split the pre-encrypted word $X_i = E_{k''}(W_i)$ into two parts, $X_i = \langle L_i, R_i \rangle$, where $L_i$ (respectively $R_i$) denotes the first $n-m$ bits (resp. last $m$ bits) of $X_i$. Instead of generating $k_i := f_{k'}(E_{k''}(W_i))$, Alice should generate $k_i$ as $k_i := f_{k'}(L_i)$. To decrypt, Alice can generate $S_i$ using the pseudorandom generator (since Alice knows the seed), and with $S_i$ she can recover $L_i$ by XORing $S_i$ against the first $n-m$ bits of $C_i$. Finally, knowledge of $L_i$ allows Alice to compute $k_i$ and thus finish the decryption.

This fix is not secure if the $W_i$s are not encrypted since it might be very likely in some cases that different words have the same first $n-m$ bits. Pre-encryption will eliminate this problem, since with high probability all the $L_i$s are distinct. (Assuming that the pre-encryption $E$ is a pseudorandom permutation, the birthday paradox [15] bounds the probability that at least one collision happens after encrypting $\ell$ words.)
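
To get a feel for the size of this collision probability, the Python sketch below evaluates the standard union (birthday) bound for distinct words whose left parts behave like random $(n-m)$-bit strings. This is the textbook bound, offered as an assumption-level illustration; it is not necessarily the exact expression used in the paper, and the function name and sample parameters are hypothetical.

```python
def collision_bound(num_words: int, n_bits: int, m_bits: int) -> float:
    """Union bound over all pairs: each pair collides with probability at most 2^-(n-m)."""
    pairs = num_words * (num_words - 1) // 2
    return min(1.0, pairs / 2 ** (n_bits - m_bits))

# One million distinct 64-bit words with a 16-bit check part.
bound = collision_bound(1_000_000, 64, 16)
```

For these sample parameters the bound stays below one percent, and it shrinks quickly as the left part gets longer.
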

With this fix, the resulting scheme is provably secure, and in fact we can also show that it provides query isolation, meaning that even when a single key $k_i$ is revealed, no extra information is leaked beyond the ability to identify the positions where the corresponding word $W$ occurs.
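
Putting the pieces together, the final scheme's encryption, decryption and search can be sketched in Python as follows. As with any such illustration, the primitives are stand-ins chosen to keep the example self-contained: a four-round Feistel network keyed through HMAC-SHA256 plays the pre-encryption permutation, HMAC-SHA256 plays the pseudorandom functions and the stream blocks, lengths are in bytes, and every name and size is an assumption rather than the authors' code.

```python
import hashlib
import hmac

N, M = 16, 4  # word length and check length in bytes (illustrative sizes)

def prf(key: bytes, data: bytes, out_len: int) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()[:out_len]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def feistel(k2: bytes, block: bytes, decrypt: bool = False, rounds: int = 4) -> bytes:
    """Invertible stand-in for the pre-encryption permutation E_{k''}."""
    half = N // 2
    left, right = block[:half], block[half:]
    order = reversed(range(rounds)) if decrypt else range(rounds)
    for r in order:
        if decrypt:
            left, right = xor(right, prf(k2, bytes([r]) + left, half)), left
        else:
            left, right = right, xor(left, prf(k2, bytes([r]) + right, half))
    return left + right

def encrypt_word(seed: bytes, k1: bytes, k2: bytes, i: int, word: bytes) -> bytes:
    x = feistel(k2, word)                      # X_i = <L_i, R_i>
    k_i = prf(k1, x[:N - M], 32)               # k_i = f_{k'}(L_i)
    s_i = prf(seed, i.to_bytes(8, "big"), N - M)
    return xor(x, s_i + prf(k_i, s_i, M))      # C_i = X_i xor T_i

def decrypt_word(seed: bytes, k1: bytes, k2: bytes, i: int, c: bytes) -> bytes:
    s_i = prf(seed, i.to_bytes(8, "big"), N - M)
    l_i = xor(c[:N - M], s_i)                  # recover L_i first
    k_i = prf(k1, l_i, 32)                     # then the key, then R_i
    r_i = xor(c[N - M:], prf(k_i, s_i, M))
    return feistel(k2, l_i + r_i, decrypt=True)

def trapdoor(k1: bytes, k2: bytes, word: bytes) -> tuple:
    x = feistel(k2, word)
    return x, prf(k1, x[:N - M], 32)

def search(cipher: list, x: bytes, k: bytes) -> list:
    return [i for i, c in enumerate(cipher)
            if hmac.compare_digest(xor(c, x)[N - M:], prf(k, xor(c, x)[:N - M], M))]

seed, k1, k2 = b"seed", b"k-prime", b"k-double-prime"
words = [w.ljust(N, b"\0") for w in (b"urgent", b"meeting", b"urgent", b"today")]
cipher = [encrypt_word(seed, k1, k2, i, w) for i, w in enumerate(words)]
plain = [decrypt_word(seed, k1, k2, i, c) for i, c in enumerate(cipher)]
hits = search(cipher, *trapdoor(k1, k2, words[0]))
```

Deriving the key from the left part only is what lets Alice decrypt without already knowing the word: the stream value alone uncovers the left part, which in turn yields the key needed to uncover the rest.
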
Theorem 4.3. Suppose $E$ is a $(t, \ell, e_E)$-secure pseudorandom permutation, $F$ is a $(t, \ell, e_F)$-secure pseudorandom function, $f$ is a $(t, \ell, e_f)$-secure pseudorandom function, $G$ is a $(t, e_G)$-secure pseudorandom generator, and we choose the key material as described above. Then the algorithm described above for generating the sequence $T_1, \ldots, T_\ell$ will be a $(t - \epsilon, e_H)$-secure pseudorandom generator, where the bound $e_H$ is made precise in Appendix A.

Moreover, if we disclose one $k_i$ and consider the reduced sequence $T'$ obtained by discarding all the $T_j$ values at positions where $k_j = k_i$, then we obtain a $(t - \epsilon, e_J)$-secure pseudorandom generator, where the bound $e_J$ is likewise made precise in Appendix A.

Strictly speaking, the proof of the theorem does not actually require $E$ to be a pseudorandom permutation: if we consider the (keyed) map sending $W_i$ to the first $n-m$ bits of $E_{k''}(W_i)$, then we can make do with the much weaker assumption that collisions in this map should be rare. As a special case, if a prefix of this map's output can be shown to be a pseudorandom function, then the map will necessarily have the required property, and we will be able to prove a result analogous to Theorem 3. This suggests that for pre-encryption
Citations
01 Jan 2011
TL;DR: To understand the central claims of evolutionary psychology the authors require an understanding of some key concepts in evolutionary biology, cognitive psychology, philosophy of science and philosophy of mind.
Abstract: Evolutionary psychology is one of many biologically informed approaches to the study of human behavior. Along with cognitive psychologists, evolutionary psychologists propose that much, if not all, of our behavior can be explained by appeal to internal psychological mechanisms. What distinguishes evolutionary psychologists from many cognitive psychologists is the proposal that the relevant internal mechanisms are adaptations—products of natural selection—that helped our ancestors get around the world, survive and reproduce. To understand the central claims of evolutionary psychology we require an understanding of some key concepts in evolutionary biology, cognitive psychology, philosophy of science and philosophy of mind. Philosophers are interested in evolutionary psychology for a number of reasons. For philosophers of science —mostly philosophers of biology—evolutionary psychology provides a critical target. There is a broad consensus among philosophers of science that evolutionary psychology is a deeply flawed enterprise. For philosophers of mind and cognitive science evolutionary psychology has been a source of empirical hypotheses about cognitive architecture and specific components of that architecture. Philosophers of mind are also critical of evolutionary psychology but their criticisms are not as all-encompassing as those presented by philosophers of biology. Evolutionary psychology is also invoked by philosophers interested in moral psychology both as a source of empirical hypotheses and as a critical target.

4,670 citations


Cites background from "Practical techniques for searches o..."

  • ...Various techniques exist for searching through encrypted data (Song et al. 2000), which provides a form of privacy protection (the data is encrypted) and selective access to sensitive data....

    [...]

Book ChapterDOI
02 May 2004
TL;DR: This work defines and construct a mechanism that enables Alice to provide a key to the gateway that enables the gateway to test whether the word “urgent” is a keyword in the email without learning anything else about the email.
Abstract: We study the problem of searching on data that is encrypted using a public key system. Consider user Bob who sends email to user Alice encrypted under Alice’s public key. An email gateway wants to test whether the email contains the keyword “urgent” so that it could route the email accordingly. Alice, on the other hand does not wish to give the gateway the ability to decrypt all her messages. We define and construct a mechanism that enables Alice to provide a key to the gateway that enables the gateway to test whether the word “urgent” is a keyword in the email without learning anything else about the email. We refer to this mechanism as Public Key Encryption with keyword Search. As another example, consider a mail server that stores various messages publicly encrypted for Alice by others. Using our mechanism Alice can send the mail server a key that will enable the server to identify all messages containing some specific keyword, but learn nothing else. We define the concept of public key encryption with keyword search and give several constructions.

3,024 citations


Cites background from "Practical techniques for searches o..."

  • ...at al [28] requires very little communication between the user and the database (proportional to the security parameter) and only one round of interaction....

    [...]

  • ...We stress that both the constructions of [26, 17] and the more recent work of [10, 28, 16] apply only to the private-key setting for users who own their data and wish to upload it to a third-party database that they do not trust....

    [...]

Proceedings ArticleDOI
30 Oct 2006
TL;DR: In this paper, the authors proposed a searchable symmetric encryption (SSE) scheme for the multi-user setting, where queries to the server can be chosen adaptively during the execution of the search.
Abstract: Searchable symmetric encryption (SSE) allows a party to outsource the storage of its data to another party (a server) in a private manner, while maintaining the ability to selectively search over it. This problem has been the focus of active research in recent years. In this paper we show two solutions to SSE that simultaneously enjoy the following properties: Both solutions are more efficient than all previous constant-round schemes. In particular, the work performed by the server per returned document is constant as opposed to linear in the size of the data. Both solutions enjoy stronger security guarantees than previous constant-round schemes. In fact, we point out subtle but serious problems with previous notions of security for SSE, and show how to design constructions which avoid these pitfalls. Further, our second solution also achieves what we call adaptive SSE security, where queries to the server can be chosen adaptively (by the adversary) during the execution of the search; this notion is both important in practice and has not been previously considered. Surprisingly, despite being more secure and more efficient, our SSE schemes are remarkably simple. We consider the simplicity of both solutions as an important step towards the deployment of SSE technologies. As an additional contribution, we also consider multi-user SSE. All prior work on SSE studied the setting where only the owner of the data is capable of submitting search queries. We consider the natural extension where an arbitrary group of parties other than the owner can submit search queries. We formally define SSE in the multi-user setting, and present an efficient construction that achieves better performance than simply using access control mechanisms.

1,673 citations

Book ChapterDOI
06 Mar 2011
TL;DR: A new methodology for realizing Ciphertext-Policy Attribute Encryption (CP-ABE) under concrete and noninteractive cryptographic assumptions in the standard model is presented.
Abstract: We present a new methodology for realizing Ciphertext-Policy Attribute Encryption (CP-ABE) under concrete and noninteractive cryptographic assumptions in the standard model. Our solutions allow any encryptor to specify access control in terms of any access formula over the attributes in the system. In our most efficient system, ciphertext size, encryption, and decryption time scales linearly with the complexity of the access formula. The only previous work to achieve these parameters was limited to a proof in the generic group model. We present three constructions within our framework. Our first system is proven selectively secure under an assumption that we call the decisional Parallel Bilinear Diffie-Hellman Exponent (PBDHE) assumption which can be viewed as a generalization of the BDHE assumption. Our next two constructions provide performance tradeoffs to achieve provable security respectively under the (weaker) decisional Bilinear-Diffie-Hellman Exponent and decisional Bilinear Diffie-Hellman assumptions.

1,444 citations


Cites background from "Practical techniques for searches o..."

  • ...A related line of work called predicate encryption or searching on encrypted data attempts to evaluate predicates over the encrypted data itself [39, 12, 1, 16, 15, 37, 29]....

    [...]

Posted Content
TL;DR: In this article, the authors present a new methodology for realizing Ciphertext-Policy Attribute Encryption (CP-ABE) under concrete and noninteractive cryptographic assumptions in the standard model.
Abstract: We present a new methodology for realizing Ciphertext-Policy Attribute Encryption (CP-ABE) under concrete and noninteractive cryptographic assumptions in the standard model. Our solutions allow any encryptor to specify access control in terms of any access formula over the attributes in the system. In our most efficient system, ciphertext size, encryption, and decryption time scales linearly with the complexity of the access formula. The only previous work to achieve these parameters was limited to a proof in the generic group model. We present three constructions within our framework. Our first system is proven selectively secure under an assumption that we call the decisional Parallel Bilinear Diffie-Hellman Exponent (PBDHE) assumption which can be viewed as a generalization of the BDHE assumption. Our next two constructions provide performance tradeoffs to achieve provable security respectively under the (weaker) decisional Bilinear-Diffie-Hellman Exponent and decisional Bilinear Diffie-Hellman assumptions.

1,416 citations

References
Journal ArticleDOI
TL;DR: This work describes schemes that enable a user to access k replicated copies of a database and privately retrieve information stored in the database, so that each individual server gets no information on the identity of the item retrieved by the user.
Abstract: Publicly accessible databases are an indispensable resource for retrieving up-to-date information. But they also pose a significant risk to the privacy of the user, since a curious database operator can follow the user's queries and infer what the user is after. Indeed, in cases where the users' intentions are to be kept secret, users are often cautious about accessing the database. It can be shown that when accessing a single database, to completely guarantee the privacy of the user, the whole database should be downloaded; namely n bits should be communicated (where n is the number of bits in the database). In this work, we investigate whether by replicating the database, more efficient solutions to the private retrieval problem can be obtained. We describe schemes that enable a user to access k replicated copies of a database (k≥2) and privately retrieve information stored in the database. This means that each individual server (holding a replicated copy of the database) gets no information on the identity of the item retrieved by the user. Our schemes use the replication to gain substantial saving. In particular, we present a two-server scheme with communication complexity $O(n^{1/3})$.

1,918 citations


"Practical techniques for searches o..." refers background in this paper

  • ...Several researchers have studied the Private Information Retrieval (PIR) problem [9], so that clients may access entries in a distributed table without revealing which entries they are interested in....

    [...]

Book ChapterDOI
Ralph C. Merkle
20 Aug 1989
TL;DR: A practical digital signature system based on a conventional encryption function which is as secure as the conventional encryption function is described, without the several years delay required for certification of an untested system.
Abstract: A practical digital signature system based on a conventional encryption function which is as secure as the conventional encryption function is described. Since certified conventional systems are available it can be implemented quickly, without the several years delay required for certification of an untested system.

1,746 citations


"Practical techniques for searches o..." refers background in this paper

  • ...Even when this type of attack is present, it is possible to combine our scheme with hash tree techniques [17] to ensure the integrity of the data and detect such attacks, although a full description of this countermeasure is out of the scope of the paper....

    [...]

Proceedings ArticleDOI
23 Oct 1995
TL;DR: Schemes that enable a user to access k replicated copies of a database and privately retrieve information stored in the database and get no information on the identity of the item retrieved by the user are described.
Abstract: We describe schemes that enable a user to access k replicated copies of a database ($k \geq 2$) and privately retrieve information stored in the database. This means that each individual database gets no information on the identity of the item retrieved by the user. For a single database, achieving this type of privacy requires communicating the whole database, or n bits (where n is the number of bits in the database). Our schemes use the replication to gain substantial saving. In particular, we have: A two database scheme with communication complexity of $O(n^{1/3})$. A scheme for a constant number, k, of databases with communication complexity $O(n^{1/k})$. A scheme for $\frac{1}{3}\log_2 n$ databases with polylogarithmic (in n) communication complexity.

1,630 citations

Proceedings ArticleDOI
19 Oct 1997
TL;DR: This work studies notions and schemes for symmetric (i.e., private key) encryption in a concrete security framework and gives four different notions of security against chosen plaintext attack, providing both upper and lower bounds, and obtaining tight relations.
Abstract: We study notions and schemes for symmetric (i.e., private key) encryption in a concrete security framework. We give four different notions of security against chosen plaintext attack and analyze the concrete complexity of reductions among them, providing both upper and lower bounds, and obtaining tight relations. In this way we classify notions (even though polynomially reducible to each other) as stronger or weaker in terms of concrete security. Next we provide concrete security analyses of methods to encrypt using a block cipher, including the most popular encryption method, CBC. We establish tight bounds (meaning matching upper bounds and attacks) on the success of adversaries as a function of their resources.

1,089 citations


"Practical techniques for searches o..." refers background in this paper

  • ...One possibility is to use the same key k at every position in the document....

    [...]

Proceedings ArticleDOI
19 Oct 1997
TL;DR: Based on the quadratic residuosity assumption, a single database, computationally private information retrieval scheme with $O(n^{\epsilon})$ communication complexity for any $\epsilon > 0$ is presented.
Abstract: We establish the following, quite unexpected, result: replication of data for the computational private information retrieval problem is not necessary. More specifically, based on the quadratic residuosity assumption, we present a single database, computationally private information retrieval scheme with $O(n^{\epsilon})$ communication complexity for any $\epsilon > 0$.

1,074 citations


"Practical techniques for searches o..." refers background in this paper

  • ..., [16, 13, 10, 7] for important exceptions which allow to remove some—but not all—of those limitations)....

    [...]

Frequently Asked Questions (13)
Q1. What contributions have the authors mentioned in the paper "Practical techniques for searches on encrypted data"?

In this paper, the authors describe their cryptographic schemes for the problem of searching on encrypted data and provide proofs of security for the resulting crypto systems. They are provably secure: they provide provable secrecy for encryption, in the sense that the untrusted server cannot learn anything about the plaintext when only given the ciphertext; they provide query isolation for searches, meaning that the untrusted server cannot learn anything more about the plaintext than the search result; they provide controlled searching, so that the untrusted server cannot search for an arbitrary word without the user's authorization; they also support hidden queries, so that the user may ask the untrusted server to search for a secret word without revealing the word to the server. The algorithms the authors present are simple, fast (for a document of length n, the encryption and search algorithms only need O(n) stream cipher and block cipher operations), and introduce almost no space and communication overhead, and hence are practical to use today. The authors gratefully acknowledge support for this research from several US government agencies. This research was supported in part by the Defense Advanced Research Projects Agency under DARPA contract N6601-9928913 (under supervision of the Space and Naval Warfare Systems Center San Diego), by the National Science Foundation under grant FD99-79852, and by the United States Postal Service under grant USPS 1025 90-98-C3513. Views and conclusions contained in this document are those of the authors and do not necessarily represent the official opinion or policies, either expressed or implied, of the US government or any of its agencies, DARPA, NSF, USPS.

Their techniques have a number of crucial advantages: they are provably secure; they support controlled and hidden search and query isolation; they are simple and fast (more specifically, for a document of length n, the encryption and search algorithms only need O(n) stream cipher and block cipher operations); and they introduce almost no space and communication overhead.

For instance, given any block cipher, the authors may build a pseudorandom generator using the counter mode [3] or a pseudorandom function using the CBC-MAC [4]. 

Because Alice may have only a low-bandwidth network connection to the server Bob, she wishes to only retrieve the documents which contain the word W.

When word lengths may vary, it is important to hide the length information from the server, because revealing the length of each word might allow for statistical attacks.

The authors may control the number of errors by adjusting a parameter m in the encryption algorithm; each wrong position will be returned with probability about 1/2^m, so for an ℓ-word document, the authors expect to see about ℓ/2^m false matches.
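The trade-off behind this parameter can be made concrete with a short calculation. The sketch assumes that each non-matching position passes the server's check independently with probability 1/2^m, where m is the number of check bits; the function name is hypothetical.

```python
def expected_false_matches(num_words: int, check_bits: int) -> float:
    """Expected number of wrongly returned positions in a document."""
    return num_words / 2 ** check_bits

# A 1024-word document with m = 10 check bits yields about one false match;
# each extra check bit halves that expectation.
one_match = expected_false_matches(1024, 10)
half_match = expected_false_matches(1024, 11)
```

Alice can always discard false matches after decrypting, so a smaller m mainly costs extra bandwidth rather than correctness.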

Today’s mail servers such as IMAP servers [11], file servers and other data storage servers typically must be fully trusted—they have access to the data, and hence must be trusted not to reveal it without authorization—which introduces undesirable security and privacy risks in applications. 

One natural approach is to store the length field before each word in the file, and to glue the length field and word together as one word to perform encryption and search using their standard schemes. 

The techniques provide controlled searching, so that the untrusted server cannot search for a word without the user’s authorization. 

In order to prevent Bob from doing statistical analysis on the index, it is better to keep the lists of pointers in a fixed-size list. 

Note that by keeping the lists of pointers in a fixed-size list, the authors are mainly preventing Bob from learning statistical information on the key words that he has not searched. 

Alice can split the long list into several lists with the fixed size; then, to search for such a word, Alice will need to ask Bob to perform and merge several search queries in parallel. 
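The fixed-size list idea above can be sketched as follows. The padding value, the list size, and both function names are assumptions made for illustration; in an actual index the lists would also be encrypted, which this sketch leaves out.

```python
def split_fixed(pointers, size, pad=-1):
    """Split a pointer list into chunks of exactly `size` entries, padding the last."""
    chunks = [pointers[i:i + size] for i in range(0, len(pointers), size)] or [[]]
    return [chunk + [pad] * (size - len(chunk)) for chunk in chunks]

def merge_results(chunks, pad=-1):
    """Recombine the chunks returned by several parallel queries, dropping padding."""
    return [p for chunk in chunks for p in chunk if p != pad]

# A word occurring in five documents, stored in lists of two pointers each.
lists = split_fixed([3, 8, 15, 21, 40], size=2)
merged = merge_results(lists)
```

Every stored list then has the same length, so list size alone does not reveal how often an unsearched keyword occurs, at the cost of one query per chunk for frequent words.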

In this case, the cost of each scan is increased, because the number of operations is determined by the bit-length of the document rather than by the number of blocks in the document.