Efficient Cache Attacks on AES, and Countermeasures
Frequently Asked Questions (9)
Q2. What is the way to filter out interruptions?
Major interruptions, such as context switches to other processes, are filtered out by excluding excessively long time measurements.
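As a concrete illustration of this filtering, here is a minimal C sketch; the function name and the threshold value are illustrative assumptions, not the paper's code:

```c
#include <stddef.h>

/* Illustrative cutoff in cycles: measurements above it are assumed to
   contain a context switch or other major interruption (value not from
   the paper). */
#define NOISE_THRESHOLD 200

/* Copy only samples below the threshold into out; returns the number
   of samples kept. */
size_t filter_interruptions(const unsigned *in, size_t n,
                            unsigned *out, unsigned threshold) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++)
        if (in[i] < threshold)
            out[kept++] = in[i];
    return kept;
}
```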
Q3. What is the way to ensure that the AES operation is invulnerable to attacks?
Assuming the hardware executes the basic AES operation with constant resource consumption, such hardware support allows efficient AES execution that is invulnerable to the authors' attacks.
Q4. How does the probe code avoid polluting its own samples?
To avoid “polluting” its own samples, the probe code stores each obtained sample into the same cache set it has just finished measuring.
Q5. How many times more data and analysis will be needed to execute a synchronous attack?
The attack will thus take about log(1 − 0.105) / log(1 − 2^−14.9) ≈ 3386 times more data and analysis, which is inconvenient but certainly feasible for the attacker.
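The quoted factor can be checked numerically. A small C sketch (the function name is an illustrative assumption; slight rounding of the constants explains small deviations from 3386):

```c
#include <math.h>

/* Numerical check of the factor quoted above:
   log(1 - 0.105) / log(1 - 2^-14.9), roughly 3386. */
double async_data_factor(void) {
    return log(1.0 - 0.105) / log(1.0 - pow(2.0, -14.9));
}
```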
Q6. How many samples are needed to eliminate all the wrong candidates?
To eliminate all the wrong candidates out of the δ^4, the authors need about log δ^−4 / log(1 − δ/256 · (1 − δ/256)^38) samples, i.e., about 2056 samples when δ = 16.
Q7. How do you normalize the measurement scores?
Note that to obtain a visible signal it is necessary to normalize the measurement scores by subtracting, from each sample, the average timing of its cache set.
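A minimal sketch of this per-set normalization in C, assuming the samples are stored row-major as [sample][cache set]; the layout and names are illustrative, not the paper's data structures:

```c
#include <stddef.h>

/* Subtract from each sample the average timing of its cache set.
   t is an nsamples x nsets matrix stored row-major. */
void normalize_scores(double *t, size_t nsamples, size_t nsets) {
    for (size_t s = 0; s < nsets; s++) {
        double avg = 0.0;
        for (size_t i = 0; i < nsamples; i++)
            avg += t[i * nsets + s];
        avg /= (double)nsamples;
        for (size_t i = 0; i < nsamples; i++)
            t[i * nsets + s] -= avg;   /* center each cache set at 0 */
    }
}
```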
Q8. What is the function used to monitor the time it takes to load a cache set?
For each cache set, the attacker thread runs a loop which closely monitors the time it takes to repeatedly load a set of W memory blocks that exactly fills that cache set (similarly to step (c) of the Prime+Probe measurements).
Q9. What is the difference between bitsliced and lookup-based AES?
For AES, bitsliced implementations on popular architectures can offer a throughput comparable to that of lookup-based implementations [52][34][51][35][31][26], but only when several independent blocks are processed in parallel. Bitsliced AES is thus efficient for parallelized encryption modes such as CTR [35] and for exhaustive key search [62], but not for chained modes such as CBC.