
Secure Program Execution via Dynamic Information Flow Tracking
G. Edward Suh, Jaewook Lee, Srinivas Devadas
Computer Science and Artificial Intelligence Laboratory (CSAIL)
Massachusetts Institute of Technology
Cambridge, MA 02139, USA
{suh,leejw,devadas}@mit.edu
Abstract
Dynamic information flow tracking is a hardware mech-
anism to protect programs against malicious attacks by
identifying spurious information flows and restricting the
usage of spurious information. Every security attack that takes
control of a program needs to transfer the program's con-
trol to malevolent code. In our approach, the operating sys-
tem identifies a set of input channels as spurious, and the
processor tracks all information flows from those inputs. A
broad range of attacks are effectively defeated by disallow-
ing the spurious data to be used as instructions or jump tar-
get addresses. We describe two different security policies
that track differing sets of dependencies. Implementing the
first policy only incurs, on average, a memory overhead of
0.26% and a performance degradation of 0.02%. This pol-
icy does not require any modification of executables. The
stronger policy incurs, on average, a memory overhead of
4.5% and a performance degradation of 0.8%, and requires
binary annotation.
1 Introduction
Malicious attacks often exploit program bugs to obtain
unauthorized accesses to a system. We propose an architec-
tural mechanism called dynamic information flow tracking,
which provides a powerful tool to protect a computer sys-
tem from malicious software attacks. With this mechanism,
higher level software such as an operating system can make
strong security guarantees even for vulnerable programs.
The most frequently-exploited program vulnerabilities
are buffer overflows and format strings, which allow an
attacker to overwrite memory locations in the vulnerable
program’s memory space with malicious code and program
pointers. Exploiting the vulnerability, a malicious entity can
gain control of a program and perform any operation that the
compromised program has permissions for. Since hijacking
a single privileged program gives attackers full access to
the system, vulnerable programs represent a serious secu-
rity risk.
Unfortunately, it is very difficult to protect programs by
stopping the first step of an attack, namely, exploiting pro-
gram vulnerabilities to overwrite memory locations. There
can be as many, if not more, types of exploits as there are
program bugs. Moreover, malicious overwrites cannot be
easily identified since vulnerable programs themselves per-
form the writes. Conventional access controls do not work
in this case. As a result, protection schemes which target
detection of malicious overwrites have had only limited suc-
cess: they block only the specific types of exploits they are
designed for.
To be effective against a broad range of security exploits, a
defense can thwart attacks by preventing the final step, namely,
the malevolent transfer of control. In order to be successful,
every attack has to change a program’s control flow so as to
execute malicious code. Unlike memory overwrites, there
are only a few ways to change a program’s control flow.
Attacks may change a pointer used by an indirect jump, or
inject malicious code at a place that will be executed without
requiring a malevolent control transfer. Thus, control transfers
are much easier to protect for a broad range of exploits. The
challenge is to distinguish malicious control transfers from
many legitimate ones.
We make the observation that potentially malicious input
channels, i.e., channels from which malicious attacks may
come, are managed by operating systems. Therefore, oper-
ating systems can mark inputs from those channels as spu-
rious so that they are not allowed to be used as instructions
or jump targets. Unfortunately, spurious inputs are used in
various ways at run-time to generate new spurious data that
may result in malicious control transfers. Therefore, only
restricting the use of spurious input data is not sufficient to
prevent many attacks.
Dynamic information flow tracking is a simple hardware
mechanism to track spurious information flows at run-time.
On every operation, a processor determines whether the re-
sult is spurious or not based on the inputs and the type of

the operation. With the tracked information flows, the pro-
cessor can easily check whether an instruction or a branch
target is spurious or not, which prevents changes of con-
trol flows by potentially malicious inputs and dynamic data
generated from them.
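The mechanism just described can be sketched as a toy software model (our own Python illustration; the register names, tag convention, and method names are invented for exposition and are not the hardware interface):

```python
SPURIOUS, AUTHENTIC = True, False  # one-bit security tags

class TaintTrackingCpu:
    """Toy model: one tag per register, checked on control transfers."""

    def __init__(self):
        self.tag = {}  # register name -> security tag (default authentic)

    def load_from_io(self, rd):
        # The OS marks data arriving from an untrusted channel as spurious.
        self.tag[rd] = SPURIOUS

    def alu(self, rd, rs1, rs2):
        # The result of an operation is spurious if any input is spurious.
        self.tag[rd] = self.tag.get(rs1, AUTHENTIC) or self.tag.get(rs2, AUTHENTIC)

    def jump_indirect(self, rs):
        # Using a spurious value as a jump target raises a security exception.
        if self.tag.get(rs, AUTHENTIC):
            raise RuntimeError("security exception: spurious jump target")

cpu = TaintTrackingCpu()
cpu.load_from_io("r1")       # attacker-controlled input
cpu.alu("r2", "r1", "r3")    # dynamic data derived from it inherits the tag
try:
    cpu.jump_indirect("r2")  # the malicious control transfer is caught
except RuntimeError as e:
    print(e)
```

The real mechanism adds a tag bit to every register and memory byte and performs these checks in hardware; the sketch only shows the propagate-then-check structure.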
Experimental results demonstrate our protection scheme
is very effective and efficient. A broad range of secu-
rity attacks exploiting notorious buffer overflows and for-
mat strings are detected and stopped. Our restrictions do
not cause any false alarms for applications in the SPEC
CPU2000 suite. We describe two different security poli-
cies that track differing sets of dependencies. Implementing
the first policy only requires, on average, a memory over-
head of 0.26% and a performance degradation of 0.02%.
The stronger policy requires, on average, additional mem-
ory space of 4.5% and a performance overhead of 0.8%.
The stronger policy also requires annotation of executables
prior to execution.
We describe our security model and general approach for
protection in Section 2 and Section 3, respectively. Sec-
tion 4 presents architectural mechanisms to track spurious
information flow at run-time. The two security policies we
consider are also defined. Practical considerations in mak-
ing our scheme efficient are discussed in Section 5. We
evaluate the first security policy in Section 6. Our second
security policy is described in detail and evaluated in Sec-
tion 7. Finally, we compare our approach with related work
in Section 8 and conclude the paper in Section 9.
2 Security Attack Model
In this paper, we consider attacks whose goal is to gain
unauthorized access to a computer system by taking control
of a vulnerable privileged program. Security attacks can
also try to crash programs, make programs produce incor-
rect results, read a program's execution state, etc. However,
attackers will not be able to obtain unauthorized access un-
less they hijack a privileged program. Therefore, we focus
on this specific type of attack.
We assume that attackers can exploit a vulnerability that
allows them to modify an arbitrary memory location with an
arbitrary value. Thus, attackers effectively have write per-
mission to any stored program address. The only restriction
for attackers is that any initial input from attackers should
be through a communication channel that can be identified
by an operating system. This is a reasonable assumption
since all I/O channels are managed by the operating system
in modern computer systems.
Protected programs and the compilers that generated them
are assumed not to be malicious. For example, we
do not prevent programs from being compromised if a back
door is implemented as a part of the original program's func-
tionality. The protected programs, however, can be buggy and
contain vulnerabilities. To achieve the goal of taking con-
trol, attackers must either change the control flow of a pro-
gram in an unintended way, or inject their own code.
Given the above assumptions, our protection scheme tar-
gets the prevention of any attack that tries to take control
of a protected program. In the rest of the section, we ex-
plain how security attacks work in more detail. In the next
section, we show how we can defeat the attacks.
Figure 1. Security attack scenario. An attacker injects malicious input through a legitimate channel (I/O or other processes) managed by the operating system (Step 1); a program vulnerability lets the input inject malicious code or change jump/branch targets (Step 2); the program then executes unintended or malicious code, either injected code or an unintended control transfer (Step 3).
Figure 1 illustrates attacks which attempt to take control
of a vulnerable program. A program has legitimate com-
munication channels to the outside world, which are either
managed by the operating system as in most I/O channels
or set up by the operating system as in inter-process com-
munication. An attacker can control an input to one of these
channels.
Knowing a vulnerability in the program, attackers pro-
vide a malicious input that exploits the bug. This malicious
input makes the program change values in its address space,
in a way that is not intended in the original program func-
tionality.
Two frequently exploited bugs are buffer overflows and
format strings. The buffer overflow vulnerability occurs
when the bound of an input buffer is not checked. Attackers
can provide an input that is longer than an allocated buffer
size, and overwrite memory locations near the buffer. For
example, a stack smash attack can change a return address
stored in the stack [12] by overflowing a buffer allocated in
the stack.
The format string vulnerability [11] occurs when the for-
mat string of the printf family is given by input data.
Using the %n flag in the format, which stores the number of
characters written so far in the memory location indicated
by an argument, attackers can potentially modify any mem-
ory location to any value.
Finally, the modified values in memory cause the pro-
gram to perform unintended operations. This final step of an

attack can happen in two ways. First, attackers may inject
malicious code exploiting the vulnerabilities and make the
program execute the injected code. Second, by modifying
one of the program pointers in memory, attackers can
simply reuse existing code, changing the program's con-
trol flow to execute code fragments that otherwise would
not have been executed.
For example, in the stack smash attack, attackers inject
malicious code into the overflown buffer as well as modify
a return address in the stack to point to the injected code.
When a function returns, the victim program jumps to the
injected code and executes it.
3 Protection Scheme
This section explains our approach to stop attacks un-
der the security model presented in the previous section.
We also provide two examples to illustrate our protection
scheme.
3.1 Overview
We protect vulnerable programs from malicious attacks
by restricting executable instructions and control transfers.
In order to take control of a program, every attack must
either make the processor execute injected malicious code
or change a program's control flow to execute unintended
code. Attackers may still be able to make a program pro-
duce incorrect results, for instance, by overwriting the pro-
gram's state in Step 2 of Figure 1. However, they will
not be able to gain unauthorized access to a system as long
as executable instructions and control transfers are properly
protected in Step 3 of Figure 1.
The key question in this approach is how to distinguish
malicious code from legitimate code, or malicious program
pointers from legitimate pointers. Because there are many
legitimate uses of dynamically generated instructions such
as just-in-time compilation, and legitimate uses of indirect
jumps, the question does not have a straightforward answer.
Figure 2 shows our approach to identify and prevent ma-
licious instructions and control transfers. Since the oper-
ating system manages communication channels for a pro-
gram, it identifies potentially malicious channels such as
network I/O, and tags all data from those channels as spuri-
ous. On the other hand, other instructions and data includ-
ing the original program when it gets loaded are marked
as authentic. Note that the operating system can always
be conservative and consider all I/O channels as spurious.
Thus, identifying potentially malicious channels is not a
major problem.
For our purposes, the term authenticity is used to indicate
whether the value is under a program’s control or not. For
example, a return address stored by the processor is under
the program's control and safe to be used as a jump target.
On the other hand, a program cannot predict a value from
an I/O channel, and it will cause unpredictable behavior if
the value is used as a jump target.

Figure 2. Our protection scheme against security exploits. The operating system tags potentially malicious data from I/O and other processes as spurious (Step 1); the processor tracks the flow of the potentially malicious inputs (Step 2); attacks are detected when spurious data is used as instructions or jump targets (Step 3).
During an execution, malicious data may be processed
by the program before being used as an instruction or a
jump target address. Therefore, the processor also tags the
data generated from spurious data as spurious. We call this
technique dynamic information flow tracking.
Finally, if the processor detects the use of spurious data
as jump target addresses or execution of spurious instruc-
tions, it generates an exception, which will be handled by
the operating system. In general, the exception indicates an
intrusion, and the operating system needs to terminate the
victimized process.
3.2 Security Policies
Since programs and systems will have different mali-
cious I/O channels and different security requirements, the
protection scheme should be flexible enough to handle this
variance. For this purpose, security policies specify what
should be identified as spurious, and what operations are
allowed (or not allowed) with the spurious data. In our
scheme, the security policy consists of three parts: spurious in-
put channels, dependencies to be tracked, and restrictions.
The security policy first specifies which input channels
should be tagged as spurious. For most privileged applica-
tions such as daemons, attacks are mainly from network I/O
and it will be sufficient to tag the network input as spurious.
However, one should be careful in placing absolute trust in
an untracked channel since no attack from this channel can
be detected.
Spurious information can propagate in various ways dur-
ing a program's execution. Section 4 discusses the types of de-
pendencies in detail. The security policy specifies which
dependencies should be tracked by the processor. As noted
above, one can always be conservative and track all input
channels and all possible dependencies. The experiments
show that our protection scheme does not cause false alarms
even in this case. However, tracking unnecessary input
channels and dependencies will incur higher memory space
and performance overhead.
Finally, the security policy determines what kind of op-
erations are allowed for spurious data. In this paper, we
assume that spurious data is allowed in all operations ex-
cept when used as instructions or jump target addresses.
These restrictions are enough to prevent attackers from
gaining control of protected programs. The operating sys-
tem may be able to provide security against a broader class
of attacks if it further restricts the use of spurious informa-
tion. We do not address this here.
In this paper, we assume the security policy is specified
in the operating system and enforced by the processor: the
processor throws an exception when it detects a security vi-
olation. It is also possible to have other software layers such
as program shepherding [8] to enforce more complicated
security policies using information from the flow tracking
mechanism. However, the flexibility provided by an addi-
tional software layer comes with increased space and per-
formance overheads (cf. Section 8).
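The three-part policy could be modeled as plain data that the enforcement logic consults (a sketch under our own naming; the paper does not define a concrete policy interface, so the field names and channel labels are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SecurityPolicy:
    spurious_channels: frozenset  # input channels the OS tags as spurious
    tracked_deps: frozenset       # e.g. {"copy", "comp", "lda", "sta"}
    restricted_uses: frozenset    # uses of spurious data that trap

# A conservative policy: tag all I/O, track every dependency type, and
# forbid spurious data as instructions or jump target addresses.
conservative = SecurityPolicy(
    spurious_channels=frozenset({"network", "file", "ipc"}),
    tracked_deps=frozenset({"copy", "comp", "lda", "sta"}),
    restricted_uses=frozenset({"instruction", "jump_target"}),
)

def violates(policy, use, tag_is_spurious):
    """Trap only on restricted uses of spurious data."""
    return tag_is_spurious and use in policy.restricted_uses

print(violates(conservative, "jump_target", True))   # spurious jump: trap
print(violates(conservative, "alu_operand", True))   # ordinary use: allowed
```

A less conservative policy would shrink `spurious_channels` or `tracked_deps`, trading detection coverage for the lower tag-storage and performance overheads the paper reports for Policy 1.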
3.3 Example 1: Stack Smashing
A simple example of the stack smashing attack is pre-
sented to demonstrate how our protection scheme works.
The example is constructed from vulnerable code reported
for Tripbit Secure Code Analizer at SecurityFocus™ in
June 2003.
int single_source(char *fname)
{
    char buf[256];
    FILE *src;

    src = fopen(fname, "rt");
    /* bug: reads up to 1043 bytes into a 256-byte buffer */
    while (fgets(buf, 1044, src)) {
        ...
    }
    return 0;
}
The above function reads source code line-by-line from
a file to analyze it. The program stack at the beginning of
the function is shown in Figure 3 (a). The return address
pointer is saved by the calling convention and the local vari-
able buf is allocated in the stack. If an attacker provides
a source file with a line longer than 256 characters, buf
overflows and the stack next to the buffer is overwritten as
in Figure 3 (b). An attacker can modify the return address
pointer arbitrarily, and change the control flow when the
function returns.

Figure 3. The states of the program stack before and after a stack smashing attack. (a) Before: the 256-byte buf and other local variables sit between the top of the stack and the saved return address, which the attack targets. (b) After: malicious input data from fgets() overflows buf, is tagged "spurious", and overwrites the slot used for the return address.
Now let us consider how this attack is detected in our
scheme. When a function uses fgets to read a line from
the source file, it invokes a system call to access the file.
Since an operating system knows the data is from the file
I/O, it tags the I/O inputs as spurious. In fgets, the input
string is copied and put into the buffer. Dynamic informa-
tion flow tracking tags these processed values as spurious
(cf. copy dependency in Section 4). As a result, the val-
ues written to the stack by fgets are tagged spurious. Fi-
nally, when the function returns, it uses the ret instruction.
Since the instruction is a register-based jump, the processor
checks the security tag of the return address, and generates
an exception since the pointer is spurious.
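The detection sequence above can be mimicked with a per-byte tag model (a hypothetical sketch; the addresses, layout, and helper names are invented for illustration, and the real check happens in the processor on the ret instruction):

```python
SPURIOUS, AUTHENTIC = True, False

mem_tag = {}  # address -> per-byte security tag

STACK_BUF = 0x1000   # start of the 256-byte buf (invented address)
RET_SLOT = 0x1100    # saved return address (4 bytes), above the buffer
for a in range(RET_SLOT, RET_SLOT + 4):
    mem_tag[a] = AUTHENTIC  # stored by the program: under its control

def fgets_spurious(buf_addr, nbytes):
    # File input is tagged spurious by the OS; fgets copies it into buf,
    # and copy dependency propagates the tag to every written byte.
    for a in range(buf_addr, buf_addr + nbytes):
        mem_tag[a] = SPURIOUS

fgets_spurious(STACK_BUF, 1043)  # fgets(buf, 1044, src) overflows buf

def ret():
    # ret is a register-based jump: the processor checks the tag of the
    # return address before transferring control.
    if any(mem_tag[a] for a in range(RET_SLOT, RET_SLOT + 4)):
        raise RuntimeError("security exception: spurious return address")

try:
    ret()
except RuntimeError as e:
    print(e)
```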
3.4 Example 2: Format String Attacks
We also show how our protection scheme detects a for-
mat string attack with %n to modify program pointers in
memory. The following example is constructed based on
Newsham’s document on format string attacks [11].
int main(int argc, char **argv)
{
    char buf[100];

    if (argc != 2) exit(1);
    /* bug: argv[1] is used directly as the format string */
    snprintf(buf, 100, argv[1]);
    buf[sizeof buf - 1] = 0;
    printf("buffer: %s\n", buf);
    return 0;
}
The general purpose of this example is quite simple:
print out a value passed on the command line. Note that the

code is written carefully to avoid buffer overflows. How-
ever, the snprintf statement causes the format string vul-
nerability because argv[1] is directly given to the func-
tion without a format string.
For example, an attacker may provide "aaaa%n"
to overwrite the address 0x61616161 with 4. First,
snprintf copies the first four bytes aaaa of the input
into buf in the stack. Then, it encounters %n, which is in-
terpreted as a format directive to store the number of characters
written so far to the memory location indicated by an argu-
ment. The number of characters written at this point is four.
Without an argument specified, the next value in the stack
is used as the argument, which happens to be the first four
bytes of buf. This value is 0x61616161, which corre-
sponds to the copied aaaa. Therefore, the program writes
4 into 0x61616161. Using the same trick, an attacker can
simply modify a return address pointer to take control of the
program.
The detection of the format string attack is similar to
the buffer overflow case. First, knowing that argv[1]
is from a spurious I/O channel, the operating system tags
it as spurious. This value is passed to snprintf and
copied into buf. Finally, for the %n conversion specifi-
cation, snprintf uses a part of this value as an address
to store the number of characters written at that point (4 in
the example). All these spurious flows are tracked by our
information flow tracking mechanism (cf. copy dependency
and store-address dependency in Section 4). As a result, the
value written by snprintf is tagged spurious. The pro-
cessor detects an attack and generates an exception when
this spurious value is used as a jump target address.
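The two dependencies at work here, copy and store-address, can be captured in a few lines (again an invented sketch, not the hardware design; the address is the example's 0x61616161):

```python
SPURIOUS, AUTHENTIC = True, False

mem_val, mem_tag = {}, {}

def store(addr, value, addr_tag, value_tag):
    # Store-address dependency: if the address was derived from spurious
    # data, the stored value becomes spurious even if the value itself
    # (here, the character count 4) is authentic.
    mem_val[addr] = value
    mem_tag[addr] = value_tag or addr_tag

# "aaaa%n": the copied bytes "aaaa" (0x61616161) are tagged spurious by
# copy dependency, and %n stores 4 through that spurious address.
store(0x61616161, 4, addr_tag=SPURIOUS, value_tag=AUTHENTIC)

def jump_to(addr):
    # Any later use of the overwritten word as a jump target traps.
    if mem_tag[addr]:
        raise RuntimeError("security exception: spurious jump target")

try:
    jump_to(0x61616161)
except RuntimeError as e:
    print(e)
```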
4 Dynamic Information Flow Tracking
The effectiveness of our protection scheme largely de-
pends on the processor's ability to track flows of spu-
rious data. An attack can be detected only if a malicious
information flow is tracked by the processor. This section
discusses the types of information flows that are relevant to
attacks under our attack model and explains how they can
be efficiently tracked in the processor.
4.1 Security Tags
We use a one-bit tag to indicate whether the correspond-
ing data block is authentic or spurious. It is straightforward
to extend our scheme to multiple-bit tags if there are many
types or sources of data. However, since we only have to
distinguish two types of data, one bit is sufficient for this
particular setting. In the following discussion, tags with
zero indicate authentic data and tags with one indicate spu-
rious data.
In the processor, each register needs to be tagged. In
the memory, data blocks with the smallest granularity that
can be accessed by the processor are tagged separately. We
assume that there is a tag per byte since many architec-
tures support byte granularity memory accesses and I/O.
Section 5 shows how the per-byte tags can be efficiently
managed with minimal space overhead.
The tags for registers are initialized to be zero at program
start-up. Similarly, all memory blocks are initially tagged
with zero. The operating system tags the data with one only
if they are from a potentially malicious input channel.
The security tags are a part of program state, and should
be managed by the operating system accordingly. On a con-
text switch, the tags for registers are saved and restored with
the register values. The operating system manages a sepa-
rate tag space for each process, just as it manages a separate
virtual memory space per process.
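Treating tags as program state can be pictured as a per-process structure the operating system saves and restores (a sketch; the structure and method names are our own, not from the paper):

```python
AUTHENTIC = False

class TagState:
    """Per-process security tags, analogous to a per-process address space."""
    def __init__(self, nregs=32):
        self.reg_tags = [AUTHENTIC] * nregs  # initialized to zero at start-up
        self.mem_tags = {}                   # separate per-byte tag space

class Kernel:
    def __init__(self):
        self.procs = {1: TagState(), 2: TagState()}

    def context_switch(self, cpu_reg_tags, old_pid, new_pid):
        # Register tags are saved and restored along with register values.
        self.procs[old_pid].reg_tags = list(cpu_reg_tags)
        return list(self.procs[new_pid].reg_tags)

kernel = Kernel()
cpu_tags = list(kernel.procs[1].reg_tags)
cpu_tags[5] = True  # r5 becomes spurious while process 1 runs
cpu_tags = kernel.context_switch(cpu_tags, old_pid=1, new_pid=2)
print(cpu_tags[5])  # process 2's tags are unaffected by process 1's taint
```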
4.2 Tracking Information Flows
Spurious data can affect the authenticity of other reg-
isters or memory locations in many different ways. We
categorize these dependencies into four types: copy depen-
dency, computation dependency, load-address dependency,
and store-address dependency.
Copy dependency: If a spurious value is simply copied
into a different location, the value of the new location
is also spurious.
Computation (Comp) dependency: A spurious value
may be used as an input operand of a computation.
In this case, the result of the computation directly de-
pends on the input value. For example, in an arith-
metic instruction ADD Rd, Rs1, Rs2, the value in
Rd directly depends on the values of Rs1 and Rs2.
If either of the inputs are spurious, the output data is
considered spurious.
Load-address (LDA) dependency: When a spurious
value is used to specify the address to access, the
loaded value is considered spurious. Unless the bound
of the spurious value is explicitly checked by the pro-
gram, the result could be any value since it is from an
unpredictable address.
Store-address (STA) dependency: The stored value be-
comes spurious if the store address is determined by a
spurious value. If a program does not know where it
is storing a value, it would not expect the value in the
location to be changed when it loads from that address
in the future.
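The four rules can be written as tag-propagation functions, one per instruction class (our own Python sketch; the opcode set and helper names are illustrative, and real hardware applies the rules to every executed instruction):

```python
AUTHENTIC, SPURIOUS = False, True

reg_tag, mem_tag = {}, {}

def rt(r): return reg_tag.get(r, AUTHENTIC)   # register tag, default authentic
def mt(a): return mem_tag.get(a, AUTHENTIC)   # memory tag, default authentic

def mov(rd, rs):
    reg_tag[rd] = rt(rs)                      # copy dependency

def add(rd, rs1, rs2):
    reg_tag[rd] = rt(rs1) or rt(rs2)          # computation (comp) dependency

def load(rd, addr, addr_reg):
    # load-address (LDA) dependency: a spurious address makes the result
    # spurious, in addition to the tag of the loaded location itself
    reg_tag[rd] = mt(addr) or rt(addr_reg)

def store(addr, rs, addr_reg):
    # store-address (STA) dependency: a spurious address taints the value
    mem_tag[addr] = rt(rs) or rt(addr_reg)

reg_tag["r1"] = SPURIOUS          # tagged by the OS from an untrusted channel
mov("r2", "r1")                   # copy: r2 becomes spurious
add("r3", "r2", "r0")             # comp: r3 becomes spurious
store(0x40, "r0", "r3")           # sta: authentic value via spurious address
load("r4", 0x40, "r0")            # the loaded value is spurious
print(reg_tag["r4"])
```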
Processors dynamically track spurious information flows
by tagging the result of an operation as spurious if it has a
tracked dependency on spurious data.

Frequently Asked Questions (15)
Q1. What contributions have the authors mentioned in the paper "Secure program execution via dynamic information flow tracking" ?

In their approach, the operating system identifies a set of input channels as spurious, and the processor tracks all information flows from those inputs. The authors describe two different security policies that track differing sets of dependencies. 

The authors plan to investigate other applications of information flow tracking with more complicated security policies. 

The annotation algorithm uses a set S to track all the registers that contribute to the branch condition, and attempts to find all the load instructions that can affect the value in the branch register. 

If the authors assume a mechanism to decouple data and tag computations, even twolf with a 32 KB tag cache has only 5% performance degradation. 

The snprintf statement causes the format string vulnerability because argv[1] is directly given to the function without a format string.

• vudo: requires three dependencies to detect, since the attack reads a spurious pointer to a node of a doubly-linked list (copy), reads the prev field using a proper offset from the pointer to the node (load-address), and updates the prev->next field (store-address).

• Null HTTPd: By passing a negative content-length value to the server, attacks can modify the allocation size of the read buffer, which results in a heap overflow.

The second way to ensure the safety of spurious data is to check the bound using conditional branches as shown in the switch example. 

If there is a store operation with a small granularity for a page that currently has per-quadword security tags, the operating system reallocates the space for per-byte tags and initializes them properly. 

The authors categorize these dependencies into four types: copy dependency, computation dependency, load-address dependency, and store-address dependency.

Both techniques only work for specific types of buffer overflow attacks that modify a return address in the stack, and require recompilation.

Given that the authors have only 0.21% overhead for security tags, small tag caches do not hurt the performance for Policy 1, as shown in Figure 6.

Even though a program can manipulate values in memory with byte granularity, writing each byte separately is not the common case. 

As a result, the existing program shepherding schemes only allow code that is originally loaded, which prevents legitimate uses of dynamic code.

The advantage of having a software layer rather than a processor itself checking a security policy is that the policies can be more complex.