Newcastle University ePrints - eprint.ncl.ac.uk
Hao F, Clarke D, Zorzo AF. Deleting Secret Data with Public Verifiability. IEEE
Transactions on Dependable and Secure Computing 2015. DOI:
10.1109/TDSC.2015.2423684
Copyright:
© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all
other uses, in any current or future media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or
reuse of any copyrighted component of this work in other works.
DOI link to article:
http://dx.doi.org/10.1109/TDSC.2015.2423684
Date deposited:
16/05/2016

Deleting Secret Data with Public Verifiability
Feng Hao, Member, IEEE, Dylan Clarke, Avelino Francisco Zorzo
Abstract—Existing software-based data erasure programs can be summarized as following the same one-bit-return protocol: the
deletion program performs data erasure and returns either success or failure. However, such a one-bit-return protocol turns the
data deletion system into a black box: the user has to trust the outcome but cannot easily verify it. This is especially problematic
when the deletion program is encapsulated within a Trusted Platform Module (TPM), and the user has no access to the code
inside.
In this paper, we present a cryptographic solution that aims to make the data deletion process more transparent and verifiable.
In contrast to the conventional black/white assumptions about TPM (i.e., either completely trust or distrust), we introduce a third
assumption that sits in between: namely, “trust-but-verify”. Our solution enables a user to verify the correct implementation of two
important operations inside a TPM without accessing its source code: i.e., the correct encryption of data and the faithful deletion
of the key. Finally, we present a proof-of-concept implementation of the proposed Secure Storage and Erasure (SSE) system on a resource-constrained Java card to
demonstrate its practical feasibility. To our knowledge, this is the first systematic solution to the secure data deletion problem
based on a “trust-but-verify” paradigm, together with a concrete prototype implementation.
1 INTRODUCTION
Secure data erasure requires permanently deleting
digital data from a physical medium such that the
data is irrecoverable [13]. This requirement plays a
critical role in all practical data management systems,
and in satisfying several government regulations on
data protection [25]. For the past two decades, this
subject has been extensively studied by researchers
in both academia and industry, resulting in a rich
body of literature [5], [7], [8], [13], [14], [17], [23], [25],
[26], [28], [33], [35]. A recent survey on this topic is
published in [27].
1.1 One-bit return
To delete data securely is a non-trivial problem. It
has been generally agreed that no existing software-
based solutions can guarantee the complete removal
of data from the storage medium [27]. To explain the
context of this field, we will abstract away implementation details of existing solutions, and focus on a higher, more intuitive protocol level. Existing
deletion methods can be described using essentially
the same protocol, which we call the “one-bit-return”
protocol. In this protocol, the user sends a command, usually through a host computer, to delete data from a storage system, and receives a one-bit reply
indicating the status of the operation. The process can
be summarized as follows.
User → Storage : Delete data
Storage → User : Success/Failure (1 bit)
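To make the abstraction concrete, the following sketch (an editor's illustration, not from the paper; the class and method names are hypothetical) shows why a one-bit result is unverifiable: an implementation that merely unlinks and one that does nothing at all return the same bit to the caller.

```java
// Hypothetical illustration of the one-bit-return protocol.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

interface OneBitDeleter {
    /** Returns a single bit: true = "Success", false = "Failure". */
    boolean delete(Path file);
}

// "Deletion by unlinking": removes the directory entry, leaves the content on disk.
class UnlinkDeleter implements OneBitDeleter {
    public boolean delete(Path file) {
        try {
            return Files.deleteIfExists(file);   // content may remain recoverable
        } catch (IOException e) {
            return false;
        }
    }
}

// A dishonest implementation: does nothing at all, yet returns the same bit.
class LyingDeleter implements OneBitDeleter {
    public boolean delete(Path file) {
        return true;                             // the caller cannot tell the difference
    }
}
```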
F. Hao and D. Clarke are with the School of Computing Science,
Newcastle University, UK. Email: {Feng.Hao, Dylan.Clarke}@ncl.ac.uk.
A.F. Zorzo is with Pontifical Catholic University of RS, Brazil. Email:
avelino.zorzo@pucrs.br. The first author would like to acknowledge the
support of EPSRC First Grant EP/J011541/1 and ERC Starting Grant
No. 106591.
Deletion by unlinking. Take the deletion in the
Windows operating system as an example. When the
user wishes to delete a file (say by hitting the “delete”
button), the operating system removes the link of the
file from the underlying file system, and returns one
bit to the user: Success. However, the return of the
“Success” bit can be misleading. Although the link
of the file has been removed, the content of the file
remains on the disk. An attacker with a forensic tool
can easily recover the deleted file by scanning the disk
[12]. The same problem also applies to the default
deletion program bundled in other operating systems
(e.g., Apple and Linux).
Deletion by overwriting. Obviously, merely unlink-
ing the file is not sufficient. In addition, the content of
the file should be overwritten with random data. This
has been proposed in several papers [5], [13], [14] and
specified in various standards (e.g., [18]). However,
one inherent limitation with the overwriting methods
is that they cannot guarantee the complete removal of
data. As concluded in [13]: “it is effectively impossible
to sanitize storage locations by simply overwriting
them, no matter how many overwrite passes are made
or what data patterns are written.” The conclusion
holds for not only magnetic drives [13], but also tapes
[7], optical disks [14] and flash-based solid state drives
[33]. In all these cases, an attacker, equipped with
advanced microscopy tools, may recover overwritten
data based on the physical remanence of the deleted
data left on the storage medium. Therefore, although
overwriting data makes the recovery harder, it does
not change the basic one-bit-return protocol. Same as
before, the return of “Success” cannot guarantee the
actual deletion of data.
Deletion by cryptography. Boneh and Lipton [7]
were among the first in proposing the use of cryp-
tography to address the secure data erasure problem,
with a number of follow-up works [17], [20], [21],
[24]–[26], [35]. In general, a cryptography-based so-
lution works by encrypting all data before saving it
to the disk, and later deleting the data by discard-
ing the decryption key. This approach is especially
desirable when duplicate copies of data are backed
up in distributed locations so it becomes impossible
to overwrite every copy [7]. The use of cryptography
essentially changes the problem of deleting a large
amount of data to that of deleting a short key (say
a 128-bit AES key). Still, the fundamental question
remains: how to securely delete the key?
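As a minimal sketch of this idea (an editor's illustration using the standard javax.crypto API, not the paper's SSE construction), the data is encrypted under a random AES key before it reaches the disk, and "deleted" by destroying that key:

```java
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CryptoErasureSketch {
    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];                       // 128-bit AES key
        byte[] iv  = new byte[12];                       // GCM nonce
        SecureRandom rnd = new SecureRandom();
        rnd.nextBytes(key);
        rnd.nextBytes(iv);

        // Encrypt the data before it ever reaches the disk.
        Cipher enc = Cipher.getInstance("AES/GCM/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                 new GCMParameterSpec(128, iv));
        byte[] ciphertext = enc.doFinal("sensitive data".getBytes("UTF-8"));
        // ... write ciphertext (and iv) to disk, possibly replicated in backups ...

        // "Delete" the data by destroying the key: deleting a large amount of
        // data is reduced to deleting 16 bytes.
        Arrays.fill(key, (byte) 0);
        // Note: copies of the key may still linger elsewhere in memory, which is
        // precisely why securely deleting the key (and verifying it) is the hard part.
    }
}
```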
1.2 Key management
When cryptography is used to address the data era-
sure problem, the key management becomes critically
important. There are several approaches proposed in
the past literature to manage cryptographic keys.
The first method is to simply save the key on the
disk, alongside the encrypted data (typically as part of the metadata in the file header) [17], [20], [25],
[26]. Deleting the data involves overwriting the disk
location where the key is stored. Once the key is
erased, the ciphertext immediately becomes useless
[7]. This has the advantage of quickly erasing data
since only a small block of data (16 bytes for AES-128)
needs to be overwritten. However, if the key is saved
on the disk, cryptography may not add much security
in ensuring data deletion [16]. On the contrary, it may even degrade security if not handled properly: instead of recovering a large amount of overwritten
data, the attacker now just needs to recover a short
128-bit key. This may significantly increase the chance
of a total recovery. Once the key is restored, the
deleted data will be fully recovered. (We assume the
ciphertext is available to the attacker, which is usually
the case.)
The second method is to use a user-defined pass-
word as the encryption key [35]. The key is derived on
the fly in RAM upon the user’s entry of the password
so it is never saved on the disk. However, passwords
are naturally bounded by low entropy (typically 20-
30 bits) [3]. Hence, cryptographic keys derived from
passwords are subject to brute-force attacks. As soon
as the attacker has access to ciphertext data, the
ciphertext becomes an oracle, against which the at-
tacker can recover the key through exhaustive
search. Instead of directly using a password-derived
encryption key, Lee et al. proposed to first generate a
random AES key for encrypting data and then use the
password to wrap the AES key and store the wrapped
key on the disk [21]. This is essentially equivalent to
deriving the key from the password. The wrapped key
now becomes an oracle, against which the attacker can
run the exhaustive search.
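The brute-force risk can be illustrated with a short sketch (an editor's illustration; the simple hash-based key derivation and the dictionary are hypothetical): once the attacker holds the wrapped key, every password guess can be tested offline against it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class WrappedKeyOracleSketch {

    // Password-derived key-encryption key (a deliberately simple KDF for the sketch).
    static byte[] kdf(String password) throws Exception {
        return Arrays.copyOf(
            MessageDigest.getInstance("SHA-256")
                         .digest(password.getBytes(StandardCharsets.UTF_8)), 16);
    }

    public static void main(String[] args) throws Exception {
        SecureRandom rnd = new SecureRandom();
        byte[] dataKey = new byte[16];                   // the real AES data key
        byte[] iv = new byte[12];
        rnd.nextBytes(dataKey);
        rnd.nextBytes(iv);

        // Wrap (encrypt) the data key under a low-entropy password; store it on disk.
        Cipher wrap = Cipher.getInstance("AES/GCM/NoPadding");
        wrap.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(kdf("letmein1"), "AES"),
                  new GCMParameterSpec(128, iv));
        byte[] wrappedKey = wrap.doFinal(dataKey);

        // The attacker who obtains wrappedKey can test guesses offline:
        String[] dictionary = {"password", "123456", "qwerty", "letmein1"};
        for (String guess : dictionary) {
            try {
                Cipher unwrap = Cipher.getInstance("AES/GCM/NoPadding");
                unwrap.init(Cipher.DECRYPT_MODE, new SecretKeySpec(kdf(guess), "AES"),
                            new GCMParameterSpec(128, iv));
                byte[] recovered = unwrap.doFinal(wrappedKey);  // succeeds only for the right guess
                System.out.println("Password found: " + guess
                        + ", data key recovered: " + Arrays.equals(recovered, dataKey));
                break;
            } catch (Exception wrongGuess) {
                // The GCM tag check fails for wrong passwords; keep searching.
            }
        }
    }
}
```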
The third method is to store the key in a decentralized network. Along this line, Geambasu et al. propose a solution called Vanish, which generates a random key to encrypt the user's data locally and then distributes shares of the key, using Shamir's secret sharing scheme, to a global-scale, peer-to-peer distributed hash table (DHT). The shares of the key naturally disappear (vanish), because the DHT is constantly changing. However, Wolchok et al. [32] subsequently show two Sybil attacks that work by continuously crawling the DHT and recovering the stored key shares before they vanish. They conclude that the original Vanish scheme cannot guarantee the secure deletion of the key.
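For intuition, the sketch below (an editor's toy (k, n) = (3, 5) Shamir split over a prime field, not the Vanish implementation) shows how key shares are produced for distribution into a DHT and why any k of them suffice to rebuild the key, which is exactly what a Sybil crawler exploits.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

public class ShamirSketch {
    static final SecureRandom RND = new SecureRandom();
    static final BigInteger P = BigInteger.probablePrime(160, RND);  // field modulus, larger than the key

    public static void main(String[] args) {
        BigInteger key = new BigInteger(128, RND);        // the data-encryption key to protect
        int n = 5, k = 3;                                 // 5 shares, any 3 reconstruct

        // Random polynomial of degree k-1 with constant term = key.
        BigInteger[] coeff = new BigInteger[k];
        coeff[0] = key;
        for (int i = 1; i < k; i++) coeff[i] = new BigInteger(P.bitLength() - 1, RND);

        // Shares are points (x, f(x)); in Vanish these would be pushed into the DHT.
        BigInteger[][] shares = new BigInteger[n][2];
        for (int x = 1; x <= n; x++) {
            BigInteger y = BigInteger.ZERO;
            for (int i = k - 1; i >= 0; i--)
                y = y.multiply(BigInteger.valueOf(x)).add(coeff[i]).mod(P);
            shares[x - 1] = new BigInteger[]{BigInteger.valueOf(x), y};
        }

        // Any k shares (here the first three) recover the key by Lagrange interpolation at 0.
        BigInteger recovered = BigInteger.ZERO;
        for (int j = 0; j < k; j++) {
            BigInteger num = BigInteger.ONE, den = BigInteger.ONE;
            for (int m = 0; m < k; m++) {
                if (m == j) continue;
                num = num.multiply(shares[m][0].negate()).mod(P);
                den = den.multiply(shares[j][0].subtract(shares[m][0])).mod(P);
            }
            recovered = recovered.add(shares[j][1].multiply(num).multiply(den.modInverse(P))).mod(P);
        }
        System.out.println("Recovered == key: " + recovered.equals(key));
    }
}
```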
The fourth method is to store the key in a tamper
resistant hardware module (e.g., TPM) and define the
Application Programming Interface (API) to manage
the stored keys. This is in line with the standard
practice employed in the financial industry for key man-
agement [3]. In this paper, we will adopt the same
TPM-based approach. However, the main difficulty
with the TPM lies in how the API should be defined.
In 2005, Perlman first proposed to use a TPM for
assured data deletion [24]. In her solution, data is
always encrypted before being saved onto the disk.
All decryption keys are stored in a tamper resistant
module and do not live outside the module. Erasing
the keys will effectively delete the data. To delete
a key, the user simply sends a delete command to
the module with a reference to that key and receives
a one-bit confirmation if the operation is successful.
Clearly, this design still follows the one-bit return
protocol, which assumes complete trust in the correct
implementation of the software inside the module.
1.3 Motivation for public verifiability
There are similar examples of black-box systems in
security. For instance, as explained in [19], the Direct
Recording Electronic (DRE) e-voting machines, widely
used in the US between 2000 and 2004, worked like
a black box. The system returned a tally at the end of the election, which the voters had to trust but could not easily verify. The lack of verifiability raised widespread suspicion about the integrity of the software
inside the voting machine and hence the integrity of
the election, eventually forcing several states in the US
to abandon DRE machines. Today, the importance of
having public verifiability in any e-voting system has
been commonly acknowledged and progress is being
made in deploying verifiable e-voting in real-world
elections [2], [6].
Unfortunately, the need for public verifiability has
been almost entirely neglected in the secure data
erasure field. This is an important omission that we
aim to address in this research work.
When a TPM is used for key management, the
trust assumption about the TPM becomes a critical
question. In the past literature [3], there exist two
disparate assumptions about TPM: either completely
trust or totally distrust. However, we find that neither of these black/white assumptions adequately captures reality. On one hand, the fact that a TPM stores
cryptographic keys implies an inherent trust. But on
the other hand, the encapsulated nature of a TPM
prevents users from verifying the internal software,
which inevitably adds distrust. These seemingly con-
tradictory dual-facets are echoes of similar problems
in e-voting, where a DRE machine is used as a trusted
device to record votes, but the public have no access to
its internal code. The established solution to address
this dilemma is “trust-but-verify” [2], [6], [15]: i.e.,
demanding the voting machine to produce additional
cryptographic proofs such that by verifying the cor-
rectness of those proofs a voter can gain confidence
about the integrity of the internal software (this is also
succinctly summarized by Ron Rivest and John Wack
as the “software independence” principle).
Summary of main idea. The main idea of this work
follows the same design principle based on “trust-but-
verify”. By applying cryptographic techniques, we
allow an end user to verify the correct implementation
of two important operations inside a TPM: encryption
and deletion.
First, the user is able to explicitly verify that the en-
cryption follows the correct procedure (i.e., the ciphertext is free of any trap-door block). By
contrast, previous cryptography-based data deletion
solutions only provide implicit assurance: by checking
if the decryption produces the same original plaintext,
one gains implicit assurance about the correctness
of the encryption. However, we argue that such an
implicit assurance is inadequate (in light of Snowden
revelations [40]): a TPM manufacturer might be co-
erced by a state-funded adversary to compress a trap-
door block into the ciphertext so as to keep the output
length the same. The user will not be able to notice
any difference and the decryption can still produce
the original plaintext (we will explain more details
in Section 6.2.2). This issue will be addressed in our
solution through the Audit function.
Second, the user is able to verify the outcome of a
deletion process. Obviously, because using software
means can never guarantee the complete deletion
of data, verifying the successful erasure of data ap-
pears intuitively impossible. However, “you normally
change the problem if you can’t solve it.” (David
Wheeler [31]). Here, we slightly change the problem by shifting from verifying the successful deletion of data to verifying the failure of that operation. The
deletion process returns a digital signature, which
cryptographically binds the deletion program’s com-
mitment of deleting a secret key to the outcome of
that operation. In case the supposedly deleted key is
recovered later, the signature can serve as publicly
verifiable evidence to prove the vendor’s liability.
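The gist can be sketched as follows (an editor's simplification with a hypothetical statement format and key names; the paper's actual protocol is specified in Section 4): the TPM signs a statement committing to the deletion of a particular key, and anyone can later check that signature against the TPM's public key.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class DeletionReceiptSketch {
    public static void main(String[] args) throws Exception {
        // The TPM's long-term signing key pair; the public key is known to verifiers.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
        kpg.initialize(256);
        KeyPair tpmKey = kpg.generateKeyPair();

        // Hypothetical statement binding the device to the claimed deletion.
        String statement = "DELETED key-id=42 at=2015-04-01T12:00Z";

        // The TPM signs the statement when it deletes the key...
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(tpmKey.getPrivate());
        signer.update(statement.getBytes(StandardCharsets.UTF_8));
        byte[] receipt = signer.sign();

        // ...and anyone can later verify the receipt. If the "deleted" key ever
        // resurfaces, the receipt is publicly verifiable evidence of liability.
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(tpmKey.getPublic());
        verifier.update(statement.getBytes(StandardCharsets.UTF_8));
        System.out.println("Receipt valid: " + verifier.verify(receipt));
    }
}
```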
More technical details will be explained in Section 4
after we cover the related work in Section 2 and the
relevant cryptographic primitives in Section 3. Sec-
tion 5 explains the proof-of-concept implementation
with detailed performance measurements, followed
by security analysis in Section 6. Finally, Section 7
concludes the paper.
2 RELATED WORK
In this section, we review related works that discuss
the importance of verifiability for secure data deletion.
In 2010, Paul and Saxena [22] aimed to give users the ability to verify the outcome of secure data deletion. They proposed a scheme called the “Proof of Erasability” (PoE), in which a host program deletes
data by overwriting the disk with random patterns
and the disk must return the same patterns as the
proof of erasability. Clearly, this so-called proof is
not cryptographically binding, nor publicly verifiable,
since the data storage system may cheat by echoing
the received patterns without actually overwriting the
disk.
In ESORICS’10, Perito and Tsudik [23] study how
to securely erase memory in an embedded device,
as a preparatory step for updating the firmware in
the device. They propose a protocol called Proofs of
Secure Erasure (PoSE-s). In this protocol, the host
program sends a string of random patterns to the
embedded device. To prove that the memory has been
securely erased, the embedded device should return
the same string of patterns. It is assumed that the
embedded device has limited memory - just enough
to hold the received random patterns. This protocol
works essentially the same way as the PoE in [22], but
with an additional assumption of bounded storage.
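A compressed sketch of the echo idea behind PoE/PoSE-s follows (an editor's illustration; in the real protocols the challenge fills the device's entire memory, which is what the bounded-storage assumption relies on).

```java
import java.security.SecureRandom;
import java.util.Arrays;

public class ProofOfErasureSketch {

    // Models an embedded device whose whole memory is overwritten by the challenge.
    static class Device {
        byte[] memory;
        byte[] receive(byte[] challenge) {
            memory = challenge.clone();   // storing the challenge displaces all old data
            return memory;                // echo it back as the "proof"
        }
    }

    public static void main(String[] args) {
        byte[] challenge = new byte[1 << 20];        // sized to the device's memory
        new SecureRandom().nextBytes(challenge);

        byte[] echoed = new Device().receive(challenge);
        boolean accepted = Arrays.equals(challenge, echoed);

        // The check only holds if the device truly cannot store both the old data
        // and the challenge; a roomier (or dishonest) device can echo without erasing.
        System.out.println("Proof accepted: " + accepted);
    }
}
```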
Finally, in 2012, Swanson and Wei [34] investigated the effectiveness of the built-in data erasure mechanisms in several commercial Solid State Drives (SSDs). They discovered that the built-in “sanitize” methods in several SSDs were completely ineffective due to
software bugs. Based on this discovery, they stress the
importance of being able to independently verify the
data deletion outcome. They propose a verification
method that works as follows. First of all, a series
of recognizable patterns are written to the entire
drive. Then, the drive is erased by calling the built-
in “sanitize” command. Next, the drive is manually
dismantled and a custom-built probing tool (made by
the authors) is used to read raw bits from the memory
in search of any unerased data. This approach can
be useful for factory testing. However, it may prove
difficult for ordinary users to perform.
In summary, several researchers have recognized
the importance of verifiability in the secure data dele-
tion process and proposed some solutions. But none
of those solutions have used any cryptography. Our
work differs from theirs in that we aim to provide
public verifiability for a secure data deletion system by
adopting public key cryptography.

3 CRYPTOGRAPHIC PRIMITIVES
In this section, we explain two relevant cryptographic
primitives: the Diffie-Hellman Integrated Encryption
Scheme (DHIES) and Chaum-Pedersen Zero Knowl-
edge Proof.
3.1 DHIES
DHIES is a public key encryption scheme adapted from the Diffie-Hellman key exchange protocol and has been included in the draft standards of ANSI X9.63 and IEEE P1363a [1]. The scheme is designed to
provide security against chosen ciphertext attacks. It
makes use of a finite cyclic group, which for example
can be the same cyclic group used in DSA or ECDSA
[29]. Here, we use an ECDSA-like group for illustration. Let E be an underlying elliptic curve for ECDSA and G be a base point on the curve of prime order n.
Assume the user’s private key is v, which is chosen at random from [1, n−1]. The corresponding public key is Q_v = v · G. The encryption in DHIES works as follows. The program first generates an ephemeral public key Q_u = u · G, where u ∈_R [1, n−1]. It then derives a shared secret following the Diffie-Hellman protocol: S = u · Q_v. The shared secret is then hashed through a cryptographic hash function H, and the output is split into two keys: encKey and macKey. First, the encKey key is used to encrypt a message to obtain encM. Then, the macKey key is used to compute a MAC tag from the encrypted message encM. The final ciphertext consists of the ephemeral key Q_u, the MAC tag, and the encrypted message encM. This encryption process is summarized in Figure 1.
The decryption procedure starts with checking whether the ephemeral public key Q_u is a valid element in the designated group, a step commonly known as “public key validation” (see footnote 1). Next, it derives the same shared secret value following the Diffie-Hellman protocol. Based on the shared secret, a hash function is applied to derive encKey and macKey, according to Figure 1. Upon successful validation of the MAC tag using the macKey, the encrypted message is decrypted using the encKey. More details about DHIES can be found in [1].
It is worth noting that DHIES is essentially built on
the Diffie-Hellman key exchange protocol, but with
adaptations to make it suitable for a secure data
storage application. For example, Alice can encrypt a
message under her own public key using DHIES, so
that only she can decrypt the message at a later time.
1. The original DHIES paper [1] does not explicitly mandate
public key validation on the ephemeral public key, but as explained
by Antipa et al. in [4], the security proofs in DHIES [1] implicitly
assume the received points must be on the valid elliptic curve;
otherwise, the scheme may be subject to invalid-curve attacks.
In our specification, we regard such public key validation as a
mandatory step.
Figure 1: Encrypting with DHIES [1]. The symmetric encryption algorithm is denoted as E, the MAC algorithm as T, and the hash function as H. The shaded rectangles constitute the ciphertext.
In some sense, it is like Alice securely communicating
with herself in the future.
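The following sketch (an editor's illustration using the JDK's EC and AES primitives over curve secp256r1; the paper's prototype uses its own parameters) traces the steps in Figure 1: generate an ephemeral key pair, derive the Diffie-Hellman secret, hash it into encKey and macKey, encrypt, then MAC the result.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.spec.ECGenParameterSpec;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyAgreement;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class DhiesSketch {
    public static void main(String[] args) throws Exception {
        // User's long-term key pair: private v, public Q_v = v*G.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("EC");
        kpg.initialize(new ECGenParameterSpec("secp256r1"));
        KeyPair user = kpg.generateKeyPair();

        // Ephemeral key pair: u and Q_u = u*G.
        KeyPair eph = kpg.generateKeyPair();

        // Shared secret S = u * Q_v (Diffie-Hellman).
        KeyAgreement ka = KeyAgreement.getInstance("ECDH");
        ka.init(eph.getPrivate());
        ka.doPhase(user.getPublic(), true);
        byte[] s = ka.generateSecret();

        // Hash S and split the output into encKey || macKey.
        byte[] h = MessageDigest.getInstance("SHA-256").digest(s);
        SecretKeySpec encKey = new SecretKeySpec(Arrays.copyOfRange(h, 0, 16), "AES");
        SecretKeySpec macKey = new SecretKeySpec(Arrays.copyOfRange(h, 16, 32), "HmacSHA256");

        // encM = E_encKey(message); here E is AES in CTR mode (the key is fresh
        // per message, so a fixed counter IV is tolerable in this sketch).
        Cipher e = Cipher.getInstance("AES/CTR/NoPadding");
        e.init(Cipher.ENCRYPT_MODE, encKey, new IvParameterSpec(new byte[16]));
        byte[] encM = e.doFinal("message to be stored".getBytes("UTF-8"));

        // tag = T_macKey(encM); T is HMAC-SHA256.
        Mac t = Mac.getInstance("HmacSHA256");
        t.init(macKey);
        byte[] tag = t.doFinal(encM);

        // Ciphertext = (Q_u, tag, encM); Q_u is stored in encoded form.
        byte[] qU = eph.getPublic().getEncoded();
        System.out.printf("ciphertext = Q_u (%d bytes) || tag (%d bytes) || encM (%d bytes)%n",
                          qU.length, tag.length, encM.length);
    }
}
```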
For any key exchange protocol, there is always
a key confirmation step, which is either implicit or
explicit [29]. The original DHIES scheme is designed
to provide only implicit key confirmation: the key is
implicitly confirmed by checking the MAC tag. How-
ever, there are two drawbacks with this approach.
First, it does not distinguish two different failure
modes in case the MAC verification is unsuccessful.
In the first mode, wrong session keys may have been derived from the key exchange process; for example, the message had been encrypted by a different key v′ · G, where v′ ≠ v. In the second mode, the encrypted
message encM may have been corrupted (due to
storage errors or malicious tampering). It is sometimes
useful for an application to be able to distinguish the
two modes and handle the failure accordingly, but
this is not possible in the original DHIES. The second
drawback is performance. In DHIES, the latency for
performing implicit key confirmation (through checking the MAC) is always linear in the size of the ciphertext.
However, this linear time complexity O(n) can prove
unnecessarily inefficient if the MAC failure was due to
the derivation of wrong session keys. (We will explain
more on this after we describe the Audit function in
Section 4.)
We address both limitations by adding an explicit
key confirmation step to DHIES. This change provides
explicit assurance on the correct derivation of the
session keys. It is consistent with the common under-
standing that in key exchange protocols, explicit key
confirmation is generally considered more desirable
than implicit key confirmation [29]. We will explain
the modified DHIES in detail in Section 4.
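One simple way to realize explicit key confirmation, shown below as an editor's illustration rather than the construction given in Section 4, is to store a short key-confirmation string derived from the session keys next to the ciphertext; on decryption this constant-size value is checked first, so a wrong-key failure is detected in O(1) time and is distinguishable from a corrupted encM (whose MAC check would fail later).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;

public class KeyConfirmationSketch {

    // Key-confirmation string: a hash bound to the derived session keys only,
    // independent of the (possibly very long) encrypted message.
    static byte[] keyConfirmation(byte[] encKey, byte[] macKey) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update("key-confirmation".getBytes(StandardCharsets.UTF_8));  // domain separation
        md.update(encKey);
        md.update(macKey);
        return md.digest();
    }

    // At decryption time, check the stored confirmation string before touching encM.
    static void decrypt(byte[] storedKc, byte[] encKey, byte[] macKey, byte[] encM)
            throws Exception {
        if (!Arrays.equals(storedKc, keyConfirmation(encKey, macKey))) {
            throw new IllegalStateException("wrong session keys");      // O(1), failure mode 1
        }
        // ... only now verify the MAC over encM and decrypt it; a failure here
        // indicates a corrupted ciphertext (failure mode 2) ...
    }
}
```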
3.2 Chaum-Pedersen protocol
Assume the same elliptic curve setting (E, G, n) as above. Given a tuple (G, X, R, Z) = (G, x·G, r·G, x·r·G), where x, r ∈_R [1, n−1], the Chaum-Pedersen protocol is an honest-verifier Zero-Knowledge Proof (ZKP) that the tuple is a Diffie-Hellman tuple, i.e., that the same secret exponent x links X to G and Z to R.
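Since the protocol description is cut off in this excerpt, the following non-interactive sketch (an editor's illustration over a multiplicative Schnorr group with a Fiat-Shamir hash, rather than the paper's elliptic-curve setting) shows the standard Chaum-Pedersen flow: commit with a fresh witness, derive the challenge by hashing, and verify two linked equations.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.SecureRandom;

public class ChaumPedersenSketch {
    static final SecureRandom RND = new SecureRandom();

    static BigInteger hash(BigInteger... vals) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (BigInteger v : vals) md.update(v.toByteArray());
        return new BigInteger(1, md.digest());
    }

    public static void main(String[] args) throws Exception {
        // Toy Schnorr group: p = 2q + 1 with q prime; g generates the order-q subgroup.
        BigInteger q, p;
        do {
            q = BigInteger.probablePrime(160, RND);
            p = q.shiftLeft(1).add(BigInteger.ONE);
        } while (!p.isProbablePrime(40));
        BigInteger g = new BigInteger(p.bitLength() - 1, RND).modPow(BigInteger.valueOf(2), p);

        // The tuple (g, X, R, Z) = (g, g^x, g^r, g^(x*r)); the prover knows x.
        BigInteger x = new BigInteger(q.bitLength() - 1, RND);
        BigInteger r = new BigInteger(q.bitLength() - 1, RND);
        BigInteger X = g.modPow(x, p);
        BigInteger R = g.modPow(r, p);
        BigInteger Z = R.modPow(x, p);

        // Prover: commit with a fresh witness w, then respond to the hashed challenge.
        BigInteger w = new BigInteger(q.bitLength() - 1, RND);
        BigInteger A = g.modPow(w, p);
        BigInteger B = R.modPow(w, p);
        BigInteger c = hash(g, X, R, Z, A, B).mod(q);       // Fiat-Shamir challenge
        BigInteger s = w.add(c.multiply(x)).mod(q);

        // Verifier: the same response s must explain both commitments.
        boolean ok = g.modPow(s, p).equals(A.multiply(X.modPow(c, p)).mod(p))
                  && R.modPow(s, p).equals(B.multiply(Z.modPow(c, p)).mod(p));
        System.out.println("Proof verifies: " + ok);
    }
}
```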
