Data-Oriented Programming: On the Expressiveness of Non-control Data Attacks

doi:10.1109/SP.2016.62

Data-Oriented Programming:

On the Expressiveness of Non-Control Data Attacks

Hong Hu, Shweta Shinde, Sendroiu Adrian, Zheng Leong Chua, Prateek Saxena, Zhenkai Liang

Department of Computer Science, National University of Singapore

{huhong, shweta24, sendroiu, chuazl, prateeks, liangzk}@comp.nus.edu.sg

Abstract—As control-ﬂow hijacking defenses gain adoption,

it is important to understand the remaining capabilities of

adversaries via memory exploits. Non-control data exploits are

used to mount information leakage attacks or privilege escalation

attacks program memory. Compared to control-ﬂow hijacking at-

tacks, such non-control data exploits have limited expressiveness;

however, the question is: what is the real expressive power of non-

control data attacks? In this paper we show that such attacks are

Turing-complete. We present a systematic technique called data-

oriented programming (DOP) to construct expressive non-control

data exploits for arbitrary x86 programs. In the experimental

evaluation using 9 programs, we identiﬁed 7518 data-oriented

x86 gadgets and 5052 gadget dispatchers, which are the building

blocks for DOP. 8 out of 9 real-world programs have gadgets to

simulate arbitrary computations and 2 of them are conﬁrmed

to be able to build Turing-complete attacks. We build 3 end-to-

end attacks to bypass randomization defenses without leaking

addresses, to run a network bot which takes commands from the

attacker, and to alter the memory permissions. All the attacks

work in the presence of ASLR and DEP, demonstrating how

the expressiveness offered by DOP signiﬁcantly empowers the

attacker.

I. INTRODUCTION

Control-hijacking attacks are the predominant category of

memory exploits today. The early generation of control-

hijacking attacks focused on code injection, while in recent

years advanced code-reuse attacks, such as return-oriented

programming (ROP) and its variants, have surfaced [1]–

[5]. In response, numerous principled defenses for control-

hijacking attacks have been proposed. Examples of these

include control-ﬂow integrity (CFI) [6]–[11], protection of

code pointers (CCFI, CPI) [12], [13], timely-randomization

of code pointers (TASR) [14], memory randomization [15],

and write-xor-execute (W⊕X, or data-execution prevention,

DEP) [16]. All of these defenses aim to ensure that the control

ﬂow of the program remains legitimate (with high probability)

under all inputs.

A natural question is to analyze the limits of protection

offered by control-ﬂow defenses, and the remaining capabil-

ities of the adversary. In a concrete execution, the program

memory can be conceptually split into the control plane and

the data plane. The control plane consists of memory variables

which are used directly in control-ﬂow transfer instructions

(e.g., returns, indirect calls, and so on). In concept, control-

ﬂow defenses aim to ensure that the execution of the program

stays legitimate — often by protecting the integrity of the

control plane memory [12], [14] or by directly checking the

targets of control transfers [6]–[10], [17], [18]. However, the

data plane, which consists of memory variables not directly

used in control-ﬂow transfer instructions, offers an additional

source of advantage for attackers. Attacks targeting the data

plane, which are referred to as non-control data attacks [19],

are known to cause signiﬁcant damage — such as leakage of

secret keys (HeartBleed) [20], enabling untrusted code import

in browsers [21], and privilege escalation in servers [22]. How-

ever, non-control data attacks provide limited expressiveness

in attack payloads (e.g., allowing corruption or leakage of a

few security-critical data bytes).

In this paper, we show that non-control data attacks with

rich expressiveness can be crafted using systematic construc-

tion techniques. We demonstrate that non-control data attacks

resulting from a single memory error can be Turing-complete.

The key idea in our construction is to ﬁnd data-oriented

gadgets — short sequences of instructions in the program’s

control-abiding execution that enable speciﬁc operations sim-

ulating a Turing machine (e.g., assignment, arithmetic, and

conditional decisions). Then, we ﬁnd gadget dispatchers which

are fragments of logic that chain together disjoint gadgets in an

arbitrary sequence. Such expressive attacks allow the remote

adversary to force the program to do its bidding, carrying

out computation of the adversary’s choice on the program

memory. Our constructions are analogous to return-oriented

programming, wherein return-oriented instruction sequences

are chained [1]. ROP attacks are known to be Turing-complete

because of a similar systematic construction [1], [23]. Thus,

our attacks enable data-oriented programming (DOP), which

only uses data plane values for malicious purposes, while

maintaining complete integrity of the control plane.

Experimental Findings. To estimate the practicality of DOP

attacks, we automate the procedure for ﬁnding data-oriented

gadgets in a tool for Linux x86 binaries. In our evaluation

of 9 programs, we statically ﬁnd 7518 data-oriented gadgets

in benign executions of these programs. 1273 of these are

conﬁrmed to be reachable from known proof of concept

exploits for known CVEs. Gadgets offer a variety of computa-

tion controls, such as arithmetic, logical, bit-wise, conditional

and assignment operations between values under attacker’s

inﬂuence. Chaining of such gadgets is possible with memory

errors if we ﬁnd dispatchers. We automate the ﬁnding of

dispatcher loops, such that the vulnerabilities could be used

to corrupt the control variable. This allows the attacker to

create inﬁnite (or attacker-controlled) repetition. We ﬁnd 5052

2016 IEEE Symposium on Security and Privacy

DOI 10.1109/SP.2016.62

969

2016 IEEE Symposium on Security and Privacy

DOI 10.1109/SP.2016.62

969

of such dispatcher loops in x86 applications. To determine

the ﬁnal feasibility of chaining gadgets using dispatchers

(which is a search problem with a prohibitively large space),

we resorted to constructing proof-of-concept exploits semi-

manually guided by intuition. We show 3 end-to-end exploits

in our case-studies. All of our exploits leave the control-plane

data unchanged, including all code pointers, and control-ﬂow

execution always conforms to the static control-ﬂow graph

(CFG). Further, our exploits execute reliably with commodity

ASLR and DEP implementations turned on.

Implications. In our ﬁrst end-to-end exploit, we show how

DOP attacks result in bypassing ASLR defenses without

leaking addresses to the network. High expressiveness in DOP

attacks also allows the adversary to interact repeatedly with

the program memory, acting out arbitrary functionality in

each invocation. Our second exploit uses the interaction to

simulate an adaptive adversary with arbitrary computation

power running inside the program’s memory space (e.g., a

bot on the victim server). We probe the application over 700

times to effect the ﬁnal attack! Finally, we discuss how to use

DOP to subvert several CFI defenses which trust the secrecy

or integrity of the security metadata in memory. Speciﬁcally,

our third exploit changes the permissions of read-only pages

to bypass a speciﬁc implementation of CFI. As a consequence,

we recommend future purely control-ﬂow defenses to consider

an adversary model with arbitrary computation and access to

memory at the point of vulnerability.

Contributions. In summary, we make the following contribu-

tions in this paper:

• DOP. We propose data-oriented programming (DOP), a

general method to build Turing-complete non-control data

attacks against vulnerable programs. We propose con-

crete methods to identify data-oriented gadgets, gadget

dispatchers and a search strategy to stitch these gadgets.

• Prevalence. Our evaluation of 9 real world applications

shows that programs do have a large number (1273)of

data-oriented gadgets reachable from real-world vulnera-

bilities, which are required by data-oriented programming

operations.

• Practicality. We show that Turing-complete non-control

data exploits for common memory errors are practical.

8 out of 9 applications provide data-oriented gadgets to

build Turing-complete attacks. We build 3 end-to-end

non-control data exploits which work even in the presence

of DEP and ASLR, demonstrating the effectiveness of

data-oriented programming.

Our attacks and tools are available at http:// huhong-nus.

github.io/ advanced-DOP/ .

II. P

ROBLEM

A. Background: Non-control Data Attacks

Non-control data attacks tamper with or leak security-

sensitive memory, which is not directly used in control transfer

instructions. Such attacks were conceptually introduced a

decade ago by Chen et al. to show that they can have serious

1 FILE

*

getdatasock( ...){...

2 seteuid(0);

3 setsockopt( ... ); ...

4 seteuid(pw->pw_uid); // corrupted uid (0)

5 ...

6 }

Code 1. Code snippet from wu-ftpd to demonstrate non-control data

attacks. Attackers change pw->pw_uid to 0 before the 2nd seteuid call

to get the root user’s privilege.

implications [19]. Recently, Hu et al. provided a general

construction for automatically synthesizing simple payloads

to effect such attacks [22]. Their construction shows that two

existing dataﬂows in the program can be stitched automati-

cally, therefore alleviating the effort for human analysis. The

constructed payload required the corruption of a small number

(up-to 2 or 3) of non-control-pointers. The attack payloads,

however, exhibit limited expressiveness, such as writing a

target variable of choice or leaking contents of a sensitive

memory region. Such simple payloads can enable privilege

escalation and sensitive data-leakage attacks — for instance,

Code 1 shows a well-known attack that escalates the program’s

privileges to root by corrupting one variable (pw_uid).

Is corruption of a few bytes of memory sufﬁcient to

enable Turing-complete attacks for remote adversaries? In

some programs, the answer is yes. Consider web browsers,

which embody interpreters for web languages such as CSS,

HTML, JavaScript, and so on. The data consumed by the

interpreter is inherently under the remote attacker’s control.

Further, browsers can import machine code and directly use

it, like ActiveX code. By using a few bytes of corruption, it

is possible to cause the web browser to making it interpret

Turing-complete functionality in another website’s origin, or

execute arbitrary untrusted code. Such attacks are known in

the wild [21], [24]. However, one may argue that such attacks

apply only to limited applications such as browsers, which can

use process-sandboxing as a second line of defense.

Recently, Carlini et al. showed a more subtle example of

“interpreter-like” functionality embedded in many common

applications [25]. Their work show that certain functions, such

as printf, take format string arguments and are Turing-

complete “interpreters” for the format-string language. There-

fore, if a non-control data attack can allow the adversary

completely control over the format string argument, then the

attacker can construct expressive payloads. However, these

examples are speciﬁc to certain (4 or 5) functions such as

printf which permit expressiveness in their format-string

language. One way to disable such attacks is to limit the

expressiveness of these handful of functions — for instance,

the implementation of printf in Linux [26]

1

and Win-

dows [27]

2

sanitizes or blocks the use of %n, which severely

limits the expressiveness of the attack. The question about

how expressive are non-control data attacks arising from

common memory errors in arbitrary pieces of code is not

well-understood. Since non-control data attacks cannot divert

1

A compile-time ﬂag called FORTIFY

SOURCE enables this check.

2

“%n” is disabled by default in Visual Studio.

970970

the control ﬂow to arbitrary locations, unlike ROP attacks [1],

[23], the expressiveness is believed to be very limited.

B. Example of Data-oriented Programming

In fact, non-control data attacks can offer rich exploits

from common vulnerabilities. To see an example, consider the

vulnerable code snippet shown in Code 2. The code is modeled

after an FTP server, which processes network requests based

on the message type. It truncates the “STREAM” message

(line 10), maintains the total size of bytes received (line

13) and throttles user requests to a maximum upper limit

(line 6). Let us assume that the code has a buffer overﬂow

vulnerability on line 7, failing to check the bounds of the ﬁxed-

size buffer buf in function readData. As a consequence,

all local variables, including srv, connect_limit, size

and type are under the control of attackers.

1 struct server{ int

*

cur_max, total, typ;}

*

srv;

2 int connect_limit = MAXCONN; int

*

size,

*

type;

3 char buf[MAXLEN];

4 size = &buf[8]; type = &buf[12];

5 ...

6 while(connect_limit--) {

7 readData(sockfd, buf); // stack bof

8 if(

*

type == NONE ) break;

9 if(

*

type == STREAM) // condition

10

*

size =

*

(srv->cur_max); // dereference

11 else {

12 srv->typ =

*

type; // assignment

13 srv->total +=

*

size; // addition

14 } ... (following code skipped) ...

15 }

Code 2. Vulnerable FTP server with data-oriented gadgets.

1 struct Obj{struct Obj

*

next; unsigned int prop;}

2 void updateList(struct Obj

*

list, int addend) {

3 for(; list != NULL; list = list->next)

4 list->prop += addend;

5 }

Code 3. A function increments the integer ﬁeld of a linked list by a given

value. It can be simulated by chaining data-oriented gadgets in Code 2.

This code does not invoke any security-critical functions in

its benign control-ﬂow, and the vulnerability just corrupts a

handful of local variables. Could the adversary exploit this

vulnerability to simulate an expressive computation on the

program state? A closer inspection reveals that the answer

is yes. Consider the individual operations executed by the

program. The line 12 is an assignment operation on memory

locations pointed by two local variables (srv and type),

which are under the inﬂuence of the memory error. Line 10 has

a dereference operation, the source pointer (srv) for which

is corruptible. Similarly, Line 13 has a controllable addition

operation. We can think of each of these micro-operations

in the program as data-oriented gadgets. If we can execute

these gadgets on attacker-controlled inputs, and chain their

execution in a sequence, then an expressive computation can

be executed. Notice that the loop in line 6 to 15 allows

chaining and dispatching gadgets in an inﬁnite sequence, since

the loop condition is a variable (i.e., connect_limit) that

is under the memory error’s inﬂuence. We call such loops

gadget dispatchers. A sequence of data-oriented gadgets in

buf[] type size srv

“AAA…AAA” p q 0x100 n-8

“AAA…AAA” m p 0x100 p

stack layout

malicious

input for

one round

stack growth

connect_limit

Fig. 1. Malicious input to trigger the loop body in Code 3 by stitching data-

oriented gadgets in Code 2. The upper side is the stack layout of Code 2.

Refer Table I for details of p, q, m and n.

TABLE I

Simulating the loop body in Code 3 with the data-oriented gadgets in

Code 2. In column “Simulated Instr.”, highlighted instructions are useful for

the simulation, while other instructions are side effects of the attack.

Overﬂow Executed Instr. (Code 2) Simulated Instr. (Code 3)

type ← p

size ← q

srv ← n-8

if(*type == NONE) break; if(list == NULL) break;

srv->typ = *type; srv = list;

srv->total += *size; list->prop += addend;

type ← m

size ← p

srv ← p

if(*type == NONE) break; if(list == NULL) break;

if(*type == STREAM) if(list == STREAM)

*size = *(srv->cur max); list = list->next;

p – &list; q – &addend; m – &STREAM; n – &srv

Code 2 would allow the remote adversary to simulate the

function shown in Code 3, which maintains a linked list of

integers in memory and increments each integer by a desired

value. Table I illustrates how the code in the loop body gets

simulated with the malicious input in Figure 1. Attackers can

repeatedly send the same input sequence to implement the

updateList function in Code 3.

This non-control data attack shows subtle expressiveness in

payloads and prevalence: with a single memory error, it re-

interprets the vulnerable server as a virtual CPU, to perform

an expressive calculation on behalf of attackers. It does not

require any speciﬁc security-critical data or functions to enable

such attack. The control ﬂow conforms to the precise CFG.

C. Research Questions

In this paper, we aim to answer the following questions

about non-control data attacks:

• Q1: How often do data-oriented gadgets arise in real-

world programs? How often do gadget dispatchers exist?

• Q2: Is it possible to chain gadgets for a desired compu-

tation? Can attackers build Turing-complete attacks with

this method?

• Q3: What is the security implication of this attack method

for current defense mechanisms?

III. D

ATA -ORIENTED PROGRAMMING

We illustrate the idea behind a general technique called

Data-Oriented Programming (DOP) that can simulate Turing-

complete computations by exploiting a memory error.

A. DOP Overview

Data-oriented programming is a technique that allows the

attacker to simulate expressive computations on the program

memory, without exhibiting any illegitimate control ﬂow with

respect to the program CFG. As shown in Section II-B, the

key is to manipulate non-control data such that the executed

971971

TABLE II

M

INDOP language. To simulate (conditional) jump, data-oriented gadget

changes the virtual input pointer (vpc) accordingly.

Semantics

Instructions

in C

Data-Oriented

Gadgets in DOP

arithmetic / logical aopb

*

pop

*

q

assignment a=b

*

p=

*

q

load a=

*

b

*

p=

**

q

store

*

a=b

**

p=

*

q

jump goto L vpc = &input

conditional jump if a goto L vpc = &input if

*

p

p – &a; q – &b; op – arithmetic / logical operation

instructions do the attacker’s bidding. In order to give a

concrete and systematic construction, we deﬁne a simple mini-

language called M

INDOP with a virtual instruction set and

virtual register operands, in which the attacker’s payload can

be speciﬁed. We show how M

INDOP can be simulated by

small snippets of x86 instructions that are abundant in common

real-world programs on Linux, as our empirical evaluation in

Section V conﬁrms. M

INDOP is Turing-complete, which we

establish in Section III-D.

The M

INDOP language (shown in Table II) has 6 kinds

of virtual instructions, each operating on virtual register

operands. The ﬁrst four virtual instructions include arithmetic /

logical calculation, assignment, load and store operations. The

last two virtual operations, namely conditional and uncondi-

tional jumps, allow the implementation of control structures

in a M

INDOP virtual program. Each virtual operation is

simulated by real x86 instruction sequences available in the

vulnerable program, which we call data-oriented gadgets. The

control structure allows chaining of gadgets, and the x86

code sequences that simulate the virtual control operations

are referred to as gadget dispatchers. None of the gadgets

or dispatchers modify any code-pointers or violate CFI in the

real program execution. We next explain each virtual operation

and show concrete real-world gadgets that simulate them.

B. Data-Oriented Gadgets

Virtual operations in M

INDOP are simulated using concrete

x86 instruction sequences in the vulnerable program execution.

Such instruction sequences or gadgets read inputs from and

write outputs to memory locations which simulate virtual

register operands in M

INDOP. Hardware registers are not a

judicious choice for simulating virtual registers because the

original program frequently uses hardware registers for its

own computation. Gadgets are scattered in the program logic,

within the legitimate CFG of the program. As a result, between

two gadgets there may be several uninteresting instructions

which may clobber hardware registers and the memory state

outside of the attacker’s control. Therefore, in M

INDOP

we implement virtual registers with carefully-chosen memory

locations (not hardware registers) used only under the control

of gadget operations.

Conceptually, a data-oriented gadget simulates three logical

micro-operations: the load micro-operation, the intended vir-

tual operation’s semantics, and the ﬁnal store micro-operation.

The load micro-operation simulates the read of the virtual

TABLE III

Example data-oriented gadget of addition operation. The ﬁrst row is the C

code of the gadget, and the second row is the corresponding assembly code.

C Code srv->total +=

*

size;

ASM Code

1 mov (%esi), %ebx //load micro-op

2 mov 0x4(%edi), %eax //load micro-op

3 add %ebx, %eax //addition

4 mov %eax, 0x4(%edi) //store micro-op

register operand(s) from memory. The store micro-operation

writes the computation result back to a virtual register. The

operation’s semantics are different for each gadget. A number

of different x86 instruction sequences can sufﬁce to simulate

a virtual operation. The x86 instruction set supports several

memory addressing modes, and as long as the order of the

micro-operations is correct, different sequences can work. As

a concise example, the x86 instruction add %eax, (%ecx)

performs all three micro-operations (load, arithmetic and store)

in a single x86 instruction. We later provide other gadget

implementations as well.

Data-oriented gadgets are similar to code gadgets employed

in return-oriented programming (ROP) [1], or in jump-oriented

programming (JOP) [2]. They are short instruction sequences

and are connected sequentially to achieve the desired function-

ality. However, there are two differences between data-oriented

gadgets and code gadgets. First, data-oriented gadgets require

to deliver operation result with memory. In contrast, code

gadgets can use either memory or register to persist outputs

of a gadget. Second, data-oriented gadgets must execute in

at least one legitimate control ﬂow, and need not execute

immediately one after another. In fact, they can be spread

across several basic blocks or even functions. In contrast, code

gadgets need not execute in any benign control-ﬂow path of the

program, and may even start at invalid instruction boundaries.

Data-oriented gadgets have more stringent requirements than

code gadgets in general. However, we show that such gadgets

exist with examples from real-world applications.

Simulating Arithmetic Operations. Addition and subtraction

can be simulated using a variety of x86 instructions sequences

that we ﬁnd empirically. Table III shows one example of

addition gadget with C and the assembly representation. This

gadget is modeled from the real-world program ProFTPD [28].

In the assembly representation, the code in line 1 and line 2

constitute the load micro-operation. The code in line 3 imple-

ments the addition, and line 4 is the store micro-operation.

With addition over arbitrary values, it is possible to simulate

multiplication efﬁciently if the language supports conditional

jumps. M

INDOP supports conditional jumps which allow to

check if a value is smaller / greater than a constant. To see

why this combination is powerful, note that we can compute

the bit-decomposition of a ﬁnite-size integer. To compute the

most signiﬁcant bit of a, we can add a to itself (equivalently

left-shifting it) and conditionally jump based on the carry bit.

Proceeding by repetition, we can obtain the bit-decomposition

of a. With bit-decomposition, simulating a multiplication a·b

reduces to the efﬁcient shift-and-add procedure, adding a to

itself in each step conditioned on the bits in b. Converting a

972972

TABLE IV

Example data-oriented gadget of assignment operation.

C Code srv->typ =

*

type;

ASM Code

1 mov (%esi), %ebx // load micro-op

2 mov %ebx, %eax // move

3 mov %eax, 0x8(%edi) // store micro-op

TABLE V

Example data-oriented gadget of dereference operation. The ﬁrst row gives

three examples. The second row shows the assembly code of the ﬁrst one.

C Code LOAD1:

*

size =

*

(srv->cur_max);

LOAD2: memcpy(dst,

*

src_p, size);

STORE: memcpy(

*

dst_p, src, size);

ASM Code (of LOAD1)

1 mov (%esi), %ebx // load micro-op

2 mov (%ebx), %eax // load

3 mov %eax, (%edi) // store micro-op

bit-decomposed value to its integer representation is similarly

a multiply-and-add operation over powers of two. Bit-wise op-

erations are simply arithmetic on the bit-decomposed versions.

Simulating Assignment Operations.InM

INDOP, assign-

ment gadgets read data from one memory location and directly

write to another memory location. In this case, we can skip

the load section of the destination operand. The C code and

ASM code of an assignment gadget is shown in Table IV.

Simulating Dereference (Load / Store) Operations. The

load and store instructions in C require memory dereferences,

which take one register as address and visits the memory

location for reading or writing. In data-oriented programming,

registers are simulated by memory, therefore the memory

dereference is simulated by two memory dereferences: the

ﬁrst memory dereference to simulate the register access, and

a second memory dereference with the ﬁrst dereference result

(the register value) as the address. As shown in Table II,

the memory dereference

*

b in the load instruction in C is

represented by

**

q, where q is the address of b,

*

q is

the value of b, and

**

q is the ﬁnal memory value

*

b.A

similar representation is used in the store gadget. We show two

examples of load gadgets and one example of a store gadget in

Table V in C representation, and the assembly representation

of the ﬁrst load gadget. As we can see from the assembly

code, there are still three sections in load / store gadgets, with

the semantics on memory dereference with loaded operands.

C. Gadget Dispatcher

Various gadgets can be chained sequentially by gadget dis-

patchers to enable recursive computation. Gadget dispatchers

are sequences of x86 instructions that equip attackers with the

ability to repeat gadget invocations and, for each invocation,

to selectively activate speciﬁc gadgets. One common sequence

of x86 instructions that can simulate gadget dispatchers is a

loop, which iterates over computation that simulates gadgets

and should have a selector in it. Each iteration executes a

subset of gadgets using outputs from gadgets in the previous

iteration. To direct the outputs of one gadget in iteration i into

the inputs to a gadget in iteration i+1, the selector changes the

load address of iteration i+1 to the store addresses of iteration

i. The selector’s behavior is controlled by attackers through the

memory error. In our running example in Code 2, line 6 and

Gadget Dispatcher

gadget1

gadget4

gadget2 gadget3

gadget5

…… …… ……

round1

round2

round3

roundN

……

loop selector

corruptible by mem-err

Fig. 2. The design model of DOP in MINDOP. The gadget dispatcher

includes a loop and a selector. The loop keeps the program passing by the

selector and various data-oriented gadgets. For each round, the selector is

controlled by the memory error to activate particular data-oriented gadgets.

7 in the loop constitute a dispatcher. The selector on line 7 is

the memory error itself, which repeatedly corrupts the local

variables to setup the execution of gadgets in that iteration. The

corruption is done in a way that it enables only the gadgets of

the attacker’s choice. These gadgets take as input the outputs of

the previous round’s gadget by selectively corrupting operand

pointers. The remaining gadgets may still get executed, but

their inputs and outputs are set up such that they behave

like NOPs (operating on unused memory locations). Figure 2

shows the design model of data-oriented programming in

M

INDOP. The left part is the gadget dispatcher inside the

vulnerable program, which is corruptible by the memory error;

the solid gadgets are activated in iteration i and the gray

gadgets are executed like NOPs.

It remains to explain in iteration i, how to selectively

activate a particular gadget in that iteration and whether the

simulation should continue to iteration i +1. Our running

example in Code 2 shows a scenario where the attacker can

“interact” with the program by repeatedly corrupting program

variables at line 7 using a buffer overﬂow. This attack is an

interactive attack, where the attacker can prepare the memory

state at the start of loop iteration i in a way that the desired

gadget works as required and other gadgets operate on unused

memory. Let M

j

be the state of memory for executing gadget

j selectively. In an interactive attack, the attacker can corrupt

local variables to conﬁgure M

j

to execute j in that round, and

provide multiple rounds of such malicious inputs to perform an

expressive computation. When the attacker wishes to terminate

the loop, it can corrupt the loop condition variable to stop.

Another class of DOP attacks are non-interactive, whereby

the attacker provides the entire malicious input as a single

data transmission. In such a scenario, all the memory setup

and conditions for deciding loop termination and selective

gadget activation need to be encoded in a single malicious

payload. To support such attacks, M

INDOP has two virtual

operations that enable conditional chaining of operations, or

virtual jumps. The basic idea is as follows: the attacker

provides the memory conﬁguration M

j

necessary for each

gadget j to be selectively executed in a particular iteration

in the input payload. In addition, it keeps a pointer called

the virtual PC which points to the desired conﬁguration M

j

at the start of each iteration. It sufﬁces to corrupt only the

virtual PC, so that the program execution in that iteration

operates on the conﬁguration M

j

. To decide how to switch to

973973

Data-Oriented Programming: On the Expressiveness of Non-control Data Attacks

Citations

Cyber-Physical Systems Security: Limitations, Issues and Future Trends

C-FLAT: Control-Flow Attestation for Embedded Systems Software

SoK: Sanitizing for Security

Protecting Bare-Metal Embedded Systems with Privilege Overlays

Block Oriented Programming: Automating Data-Only Attacks

References

Efficient software-based fault isolation

The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86)

Control-flow integrity

Cyclone: A Safe Dialect of C

SoK: Eternal War in Memory

Related Papers (5)

The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86)

Non-control-data attacks are realistic threats

Control-flow integrity

Just-In-Time Code Reuse: On the Effectiveness of Fine-Grained Address Space Layout Randomization

SoK: Eternal War in Memory