Book Chapter

Cache attacks and countermeasures: the case of AES

13 Feb 2006, pp. 1-20
TL;DR: In this article, the authors describe software side-channel attacks based on inter-process leakage through the state of the CPU's memory cache; the leaked memory access patterns can be used for cryptanalysis of cryptographic primitives that employ data-dependent table lookups.
Abstract: We describe several software side-channel attacks based on inter-process leakage through the state of the CPU’s memory cache. This leakage reveals memory access patterns, which can be used for cryptanalysis of cryptographic primitives that employ data-dependent table lookups. The attacks allow an unprivileged process to attack other processes running in parallel on the same processor, despite partitioning methods such as memory protection, sandboxing and virtualization. Some of our methods require only the ability to trigger services that perform encryption or MAC using the unknown key, such as encrypted disk partitions or secure network links. Moreover, we demonstrate an extremely strong type of attack, which requires knowledge of neither the specific plaintexts nor ciphertexts, and works by merely monitoring the effect of the cryptographic process on the cache. We discuss in detail several such attacks on AES, and experimentally demonstrate their applicability to real systems, such as OpenSSL and Linux’s dm-crypt encrypted partitions (in the latter case, the full key can be recovered after just 800 writes to the partition, taking 65 milliseconds). Finally, we describe several countermeasures for mitigating such attacks.
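
All of the attacks surveyed on this page rest on one measurable effect: a memory load that hits the cache completes much sooner than one that misses. As a rough sketch of that measurement primitive only (not code from the paper; assumes an x86-64 CPU and a GCC/Clang compiler that provides __rdtscp, with a machine-dependent hit/miss threshold chosen by calibration):

    /* Minimal sketch of the timed-load primitive behind cache attacks. */
    #include <stdint.h>
    #include <x86intrin.h>

    /* Returns the access latency, in cycles, of a single load from addr. */
    static inline uint64_t probe_latency(const volatile uint8_t *addr)
    {
        unsigned aux;
        uint64_t t0 = __rdtscp(&aux);   /* timestamp before the load */
        (void)*addr;                    /* the load being measured   */
        uint64_t t1 = __rdtscp(&aux);   /* timestamp after the load  */
        return t1 - t0;
    }

    /* A small latency indicates a cache hit (the line was resident),
     * a large latency indicates a miss (the line had been evicted). */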
Citations
Proceedings Article
09 Nov 2009
TL;DR: Using Amazon EC2 as a case study, the authors show that it is possible to map the internal cloud infrastructure, identify where a particular target VM is likely to reside, and instantiate new VMs until one is placed co-resident with the target; they then show how such placement can be used to mount cross-VM side-channel attacks that extract information from the target VM on the same machine.
Abstract: Third-party cloud computing represents the promise of outsourcing as applied to computation. Services, such as Microsoft's Azure and Amazon's EC2, allow users to instantiate virtual machines (VMs) on demand and thus purchase precisely the capacity they require when they require it. In turn, the use of virtualization allows third-party cloud providers to maximize the utilization of their sunk capital costs by multiplexing many customer VMs across a shared physical infrastructure. However, in this paper, we show that this approach can also introduce new vulnerabilities. Using the Amazon EC2 service as a case study, we show that it is possible to map the internal cloud infrastructure, identify where a particular target VM is likely to reside, and then instantiate new VMs until one is placed co-resident with the target. We explore how such placement can then be used to mount cross-VM side-channel attacks to extract information from a target VM on the same machine.

2,230 citations

Proceedings Article
19 May 2019
TL;DR: Spectre attacks induce a victim to speculatively perform operations that would not occur during correct program execution and leak the victim's confidential information to the adversary via a side channel; the practical attacks described in this paper can read arbitrary memory from the victim's process.
Abstract: Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try to guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful in how it executes, can access the victim's memory and registers, and can perform operations with measurable side effects. Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim's confidential information via a side channel to the adversary. This paper describes practical attacks that combine methodology from side channel attacks, fault attacks, and return-oriented programming that can read arbitrary memory from the victim's process. More broadly, the paper shows that speculative execution implementations violate the security assumptions underpinning numerous software security mechanisms, including operating system process separation, containerization, just-in-time (JIT) compilation, and countermeasures to cache timing and side-channel attacks. These attacks represent a serious threat to actual systems since vulnerable speculative execution capabilities are found in microprocessors from Intel, AMD, and ARM that are used in billions of devices. While makeshift processor-specific countermeasures are possible in some cases, sound solutions will require fixes to processor designs as well as updates to instruction set architectures (ISAs) to give hardware architects and software developers a common understanding as to what computation state CPU implementations are (and are not) permitted to leak.
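
As an illustration of the bounds-check-bypass pattern the abstract describes, the sketch below shows victim code of the shape commonly used for Spectre variant 1 (the array names and 4096-byte stride are the conventional published example, not artifacts from this page):

    /* Hypothetical victim gadget vulnerable to bounds-check bypass. */
    #include <stddef.h>
    #include <stdint.h>

    uint8_t array1[16];          /* array guarded by a bounds check          */
    size_t  array1_size = 16;
    uint8_t array2[256 * 4096];  /* probe array: one page per byte value     */

    void victim_function(size_t x)
    {
        if (x < array1_size) {   /* branch the CPU may mispredict            */
            /* Executed speculatively even for out-of-bounds x: the byte
             * array1[x] selects which line of array2 gets cached.           */
            volatile uint8_t tmp = array2[array1[x] * 4096];
            (void)tmp;
        }
    }

    /* The attacker later times accesses to array2[k * 4096] (for example
     * with FLUSH+RELOAD) to learn which k was touched, hence array1[x].     */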

1,317 citations

Proceedings Article
20 Aug 2014
TL;DR: This paper presents FLUSH+RELOAD, a cache side-channel attack technique that exploits a weakness in Intel x86 processors to monitor access to memory lines in shared pages, and recovers 96.7% of the bits of the secret key by observing a single signature or decryption round.
Abstract: Sharing memory pages between non-trusting processes is a common method of reducing the memory footprint of multi-tenanted systems. In this paper we demonstrate that, due to a weakness in the Intel x86 processors, page sharing exposes processes to information leaks. We present FLUSH+RELOAD, a cache side-channel attack technique that exploits this weakness to monitor access to memory lines in shared pages. Unlike previous cache side-channel attacks, FLUSH+RELOAD targets the Last-Level Cache (i.e., L3 on processors with three cache levels). Consequently, the attack program and the victim do not need to share the execution core. We demonstrate the efficacy of the FLUSH+RELOAD attack by using it to extract the private encryption keys from a victim program running GnuPG 1.4.13. We tested the attack both between two unrelated processes in a single operating system and between processes running in separate virtual machines. On average, the attack is able to recover 96.7% of the bits of the secret key by observing a single signature or decryption round.
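
A minimal sketch of the FLUSH+RELOAD probe loop, assuming an x86-64 machine with clflush, a target address inside a page shared with the victim, and a calibrated hit/miss threshold (the 120-cycle value is illustrative only and is not from the paper):

    #include <stdint.h>
    #include <x86intrin.h>

    #define THRESHOLD 120  /* cycles; machine-dependent, set by calibration */

    /* One measurement interval: reload and time the shared line, then flush
     * it so the next interval starts from a known (uncached) state.         */
    static int was_accessed_by_victim(volatile uint8_t *target)
    {
        unsigned aux;
        uint64_t t0 = __rdtscp(&aux);
        (void)*target;                       /* reload the shared line       */
        uint64_t t1 = __rdtscp(&aux);

        _mm_clflush((const void *)target);   /* flush for the next interval  */
        _mm_mfence();

        return (t1 - t0) < THRESHOLD;        /* fast reload => victim access */
    }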

1,001 citations

Proceedings Article
17 May 2015
TL;DR: This work presents an effective implementation of the Prime+Probe side-channel attack against the last-level cache, demonstrates a cross-core, cross-VM attack on multiple versions of GnuPG, and achieves a high attack resolution without relying on weaknesses in the OS or virtual machine monitor or on sharing memory between attacker and victim.
Abstract: We present an effective implementation of the Prime+Probe side-channel attack against the last-level cache. We measure the capacity of the covert channel the attack creates and demonstrate a cross-core, cross-VM attack on multiple versions of GnuPG. Our technique achieves a high attack resolution without relying on weaknesses in the OS or virtual machine monitor or on sharing memory between attacker and victim.
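
A minimal PRIME+PROBE sketch (not the paper's code), assuming the attacker has already built an eviction set ev[] of addresses that all map to the targeted last-level-cache set; eviction-set construction, which the paper also addresses, is omitted, and the 16-way associativity is illustrative:

    #include <stdint.h>
    #include <x86intrin.h>

    enum { WAYS = 16 };  /* illustrative cache associativity */

    /* Prime: fill every way of the target set with attacker-owned lines. */
    static void prime(volatile uint8_t *ev[WAYS])
    {
        for (int i = 0; i < WAYS; i++)
            (void)*ev[i];
    }

    /* Probe: re-walk the same lines and time the pass; a slow pass means
     * the victim evicted some of them, i.e. it touched this cache set.   */
    static uint64_t probe(volatile uint8_t *ev[WAYS])
    {
        unsigned aux;
        uint64_t t0 = __rdtscp(&aux);
        for (int i = 0; i < WAYS; i++)
            (void)*ev[i];
        uint64_t t1 = __rdtscp(&aux);
        return t1 - t0;
    }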

950 citations

Proceedings Article
16 Oct 2012
TL;DR: This paper details the construction of an access-driven side-channel attack by which a malicious virtual machine (VM) extracts fine-grained information from a victim VM running on the same physical computer, and demonstrates the attack in a lab setting by extracting an ElGamal decryption key from a victim using the most recent version of the libgcrypt cryptographic library.
Abstract: This paper details the construction of an access-driven side-channel attack by which a malicious virtual machine (VM) extracts fine-grained information from a victim VM running on the same physical computer. This attack is the first such attack demonstrated on a symmetric multiprocessing system virtualized using a modern VMM (Xen). Such systems are very common today, ranging from desktops that use virtualization to sandbox application or OS compromises, to clouds that co-locate the workloads of mutually distrustful customers. Constructing such a side-channel requires overcoming challenges including core migration, numerous sources of channel noise, and the difficulty of preempting the victim with sufficient frequency to extract fine-grained information from it. This paper addresses these challenges and demonstrates the attack in a lab setting by extracting an ElGamal decryption key from a victim using the most recent version of the libgcrypt cryptographic library.

839 citations

References
Journal Article
TL;DR: This paper shows how to do an on-line simulation of an arbitrary RAM by a probabilistic oblivious RAM with a polylogarithmic slowdown in the running time, and shows that a logarithmic slowdown is a lower bound.
Abstract: Software protection is one of the most important issues concerning computer practice. There exist many heuristics and ad-hoc methods for protection, but the problem as a whole has not received the theoretical treatment it deserves. In this paper, we provide theoretical treatment of software protection. We reduce the problem of software protection to the problem of efficient simulation on oblivious RAM. A machine is oblivious if the sequence in which it accesses memory locations is equivalent for any two inputs with the same running time. For example, an oblivious Turing Machine is one for which the movement of the heads on the tapes is identical for each computation. (Thus, the movement is independent of the actual input.) What is the slowdown in the running time of a machine, if it is required to be oblivious? In 1979, Pippenger and Fischer showed how a two-tape oblivious Turing Machine can simulate, on-line, a one-tape Turing Machine, with a logarithmic slowdown in the running time. We show an analogous result for the random-access machine (RAM) model of computation. In particular, we show how to do an on-line simulation of an arbitrary RAM by a probabilistic oblivious RAM with a polylogarithmic slowdown in the running time. On the other hand, we show that a logarithmic slowdown is a lower bound.
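
Stated symbolically, and only as precisely as the abstract itself states it (the exact polylogarithmic exponent and constants are not given there; the notation T_obl for the running time of the oblivious simulation of a program running in time t is introduced here for readability):

    \[
      T_{\mathrm{obl}}(t) = t \cdot \mathrm{polylog}(t)
      \quad \text{(upper bound: on-line, probabilistic oblivious simulation)}
    \]
    \[
      T_{\mathrm{obl}}(t) = \Omega(t \cdot \log t)
      \quad \text{(lower bound for any oblivious simulation)}
    \]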

1,752 citations

01 Jan 1998
TL;DR: A new block cipher is proposed as a candidate for the Advanced Encryption Standard; it uses S-boxes similar to those of DES in a new structure that simultaneously allows a more rapid avalanche, a more efficient bitslice implementation, and an easy analysis of its security against all known types of attack, and the authors believe it to be more secure than three-key triple-DES.
Abstract: We propose a new block cipher as a candidate for the Advanced Encryption Standard. Its design is highly conservative, yet still allows a very efficient implementation. It uses S-boxes similar to those of DES in a new structure that simultaneously allows a more rapid avalanche, a more efficient bitslice implementation, and an easy analysis that enables us to demonstrate its security against all known types of attack. With a 128-bit block size and a 256-bit key, it is as fast as DES on the market leading Intel Pentium/MMX platforms (and at least as fast on many others); yet we believe it to be more secure than three-key triple-DES.

433 citations

Book Chapter
20 Jan 1997
TL;DR: A new optimized standard implementation of DES on 64-bit processors is described, which is about twice as fast as the fastest known standard DES implementation on the same processor.
Abstract: In this paper we describe a fast new DES implementation. This implementation is about five times faster than the fastest known DES implementation on a (64-bit) Alpha computer, and about three times faster than our new optimized DES implementation on 64-bit computers. This implementation uses a non-standard representation and views the processor as a SIMD computer, i.e., as 64 parallel one-bit processors computing the same instruction. We also discuss the application of this implementation to other ciphers. We describe a new optimized standard implementation of DES on 64-bit processors, which is about twice as fast as the fastest known standard DES implementation on the same processor. Our implementations can also be used for fast exhaustive search in software, which can find a key in only a few days or a few weeks on existing parallel computers and computer networks.
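
The bitslice idea from the abstract, treating a 64-bit processor as 64 one-bit processors executing the same instruction, can be sketched in a few lines of C; a toy Boolean function stands in for the DES S-box circuits here, and this is not the paper's code:

    #include <stdint.h>

    /* Each argument packs one input bit of 64 independent instances, so a
     * single sequence of bitwise operations evaluates the same Boolean
     * function for all 64 instances at once. A bitsliced DES S-box is the
     * same idea applied to a larger Boolean circuit.                       */
    static inline uint64_t majority3_sliced(uint64_t a, uint64_t b, uint64_t c)
    {
        return (a & b) | (a & c) | (b & c);  /* 64 majority votes in 4 ops  */
    }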

372 citations

Book Chapter
21 Feb 2005
TL;DR: This article introduces a new masking countermeasure which is not only secure against first-order side-channel attacks, but which also leads to relatively small implementations compared to other masking schemes implemented in dedicated hardware.
Abstract: So far, efficient algorithmic countermeasures to secure the AES algorithm against (first-order) differential side-channel attacks have been very expensive to implement. In this article, we introduce a new masking countermeasure which is not only secure against first-order side-channel attacks, but which also leads to relatively small implementations compared to other masking schemes implemented in dedicated hardware. Our approach is based on shifting the computation of the finite field inversion in the AES S-box down to GF(4). In this field, the inversion is a linear operation and therefore it is easy to mask. Summarizing, the new masking scheme combines the concepts of multiplicative and additive masking in such a way that security against first-order side-channel attacks is maintained, and that small implementations in dedicated hardware can be achieved.
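
A small self-contained sketch of the algebraic fact the masking scheme relies on: in GF(4), the inverse of a nonzero element equals its square, and squaring is GF(2)-linear, so an additive (XOR) mask passes through the inversion step intact apart from being squared itself. The encoding and test program below are an illustration written for this page, not the article's hardware description; elements of GF(4) = GF(2)[x]/(x^2+x+1) are encoded as the integers 0..3.

    #include <assert.h>
    #include <stdint.h>

    /* Multiply two GF(4) elements: schoolbook product of (a1*x + a0) and
     * (b1*x + b0), then reduce with x^2 = x + 1.                          */
    static uint8_t gf4_mul(uint8_t a, uint8_t b)
    {
        uint8_t a0 = a & 1, a1 = (a >> 1) & 1;
        uint8_t b0 = b & 1, b1 = (b >> 1) & 1;
        uint8_t c0 = a0 & b0;
        uint8_t c1 = (a1 & b0) ^ (a0 & b1);
        uint8_t c2 = a1 & b1;                 /* coefficient of x^2 */
        return (uint8_t)(((c1 ^ c2) << 1) | (c0 ^ c2));
    }

    int main(void)
    {
        for (uint8_t a = 1; a < 4; a++) {
            uint8_t sq = gf4_mul(a, a);
            assert(gf4_mul(a, sq) == 1);      /* a^2 is the inverse of a   */
            for (uint8_t m = 0; m < 4; m++)   /* squaring is GF(2)-linear: */
                assert(gf4_mul(a ^ m, a ^ m) == (uint8_t)(sq ^ gf4_mul(m, m)));
        }
        return 0;
    }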

362 citations

Book Chapter
18 Feb 2002
TL;DR: This article presents a hardware implementation of the S-Boxes from the Advanced Encryption Standard (AES), and shows that a calculation of this function and its inverse can be done efficiently with combinational logic.
Abstract: This article presents a hardware implementation of the S-Boxes from the Advanced Encryption Standard (AES). The S-Boxes substitute an 8-bit input for an 8-bit output and are based on arithmetic operations in the finite field GF(2^8). We show that a calculation of this function and its inverse can be done efficiently with combinational logic. This approach has advantages over a straightforward implementation using read-only memories for table lookups. Most of the functionality is used for both encryption and decryption. The resulting circuit offers low transistor count, has low die size, is convenient for pipelining, and can be realized easily within a semi-custom design methodology like a standard-cell design. Our standard-cell implementation on a 0.6 µm CMOS process requires an area of only 0.108 mm^2 and has a delay below 15 ns, which equals a maximum clock frequency of 70 MHz. These results were achieved without applying any speed optimization techniques like pipelining.
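
For reference, the same construction, inversion in GF(2^8) followed by the AES affine transformation, can also be written as table-free software. The sketch below is illustrative C rather than the article's combinational circuit; it uses the standard AES reduction polynomial x^8+x^4+x^3+x+1 (0x11B) and checks two known S-box entries (S(0x00) = 0x63, S(0x53) = 0xED).

    #include <assert.h>
    #include <stdint.h>

    /* Multiply in GF(2^8) with the AES reduction polynomial 0x11B. */
    static uint8_t gmul(uint8_t a, uint8_t b)
    {
        uint8_t p = 0;
        for (int i = 0; i < 8; i++) {
            if (b & 1) p ^= a;
            uint8_t hi = a & 0x80;
            a <<= 1;
            if (hi) a ^= 0x1B;
            b >>= 1;
        }
        return p;
    }

    /* Inverse as a^254 (valid for a != 0), by square-and-multiply. */
    static uint8_t ginv(uint8_t a)
    {
        uint8_t r = a;
        for (int i = 0; i < 6; i++)   /* after the loop, r = a^127 */
            r = gmul(gmul(r, r), a);
        return gmul(r, r);            /* a^254 */
    }

    /* S-box: GF(2^8) inversion (with 0 mapped to 0), then the AES affine
     * map, written as XOR of b with four left-rotations of b plus 0x63.   */
    static uint8_t aes_sbox(uint8_t x)
    {
        uint8_t b = x ? ginv(x) : 0;
        return (uint8_t)(b
                         ^ (uint8_t)((b << 1) | (b >> 7))
                         ^ (uint8_t)((b << 2) | (b >> 6))
                         ^ (uint8_t)((b << 3) | (b >> 5))
                         ^ (uint8_t)((b << 4) | (b >> 4))
                         ^ 0x63);
    }

    int main(void)
    {
        assert(aes_sbox(0x00) == 0x63);
        assert(aes_sbox(0x53) == 0xED);
        return 0;
    }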

355 citations