Mondrian Memory Protection
Emmett Witchel, Josh Cates, and Krste Asanović
MIT Laboratory for Computer Science, Cambridge, MA 02139
{witchel,cates,krste}@lcs.mit.edu
ABSTRACT
Mondrian memory protection (MMP) is a fine-grained protection
scheme that allows multiple protection domains to flexibly share
memory and export protected services. In contrast to earlier page-
based systems, MMP allows arbitrary permissions control at the
granularity of individual words. We use a compressed permissions
table to reduce space overheads and employ two levels of permis-
sions caching to reduce run-time overheads. The protection tables
in our implementation add less than 9% overhead to the memory
space used by the application. Accessing the protection tables adds
less than 8% additional memory references to the accesses made
by the application. Although it can be layered on top of demand-
paged virtual memory, MMP is also well-suited to embedded sys-
tems with a single physical address space. We extend MMP to
support segment translation which allows a memory segment to
appear at another location in the address space. We use this trans-
lation to implement zero-copy networking underneath the standard
read system call interface, where packet payload fragments are
connected together by the translation system to avoid data copy-
ing. This saves 52% of the memory references used by a traditional
copying network stack.
1. INTRODUCTION
Operating systems must provide protection among different user
processes and between all user processes and trusted supervisor
code. In addition, operating systems should support flexible shar-
ing of data to allow applications to co-operate efficiently. The im-
plementors of early architectures and operating systems [5, 26] be-
lieved the most natural solution to the protected sharing problem
was to place each allocated region in a segment, which has the pro-
tection information. Although this provides fine-grain permission
control and flexible memory sharing, it is difficult to implement ef-
ficiently and is cumbersome to use because each address has two
components: the segment pointer and the offset within the segment.
Modern architectures and operating systems have moved to-
wards a linear addressing scheme, in which each user process has a
separate linear demand-paged virtual address space. Each address
space has a single protection domain, shared by all threads that run
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
ASPLOS-X ’02 San Jose, CA
Copyright 2002 ACM 1-58113-574-2/02/0010 ...$5.00.
within the process. A thread can only have a different protection
domain if it runs in a different address space. Sharing is only possible at page granularity, where a single physical memory page can
be mapped into two or more virtual address spaces. Although this
addressing scheme is now ubiquitous in modern OS designs and
hardware implementations, it has significant disadvantages when
used for protected sharing. Pointer-based data structures can be
shared only if the shared memory region resides at the same virtual
address for all participating processes, and all words on a page must
have the same permissions. The interpretation of a pointer depends
on addressing context, and any transfer of control between pro-
tected modules requires an expensive context switch. The coarse
granularity of protection regions and the overhead of inter-process
communication limit the ways in which protected sharing can be
used by application developers. Although designers have been cre-
ative in working around these limitations to implement protected
sharing for some applications [9], each application requires con-
siderable custom engineering effort to attain high performance.
We believe the need for flexible, efficient, fine-grained memory
protection and sharing has been neglected in modern computing
systems. The need for fine-grained protection in the server and
desktop domains is clear from the examples of a web server and
a web client. These systems want to provide extensibility where
new code modules can be linked in to provide new functionality.
The architects of these systems have rejected designs using the na-
tive OS support for a separate address space per module because
of the complexity and run-time overhead of managing multiple ad-
dress contexts. Instead, modern web servers and clients have solved
the extensibility problem with a plugin architecture. Plugins allow
a user to link a new module into the original program to provide
a new service. For instance, the Apache web server has a plugin
for the interpretation of perl code in web pages [2], and browsers
support plugins to interpret PDF documents [1]. Linking in code
modules makes communication between the server and the plugin
fast and flexible, but because there is no protection between mod-
ules in the same address space it is also unsafe. Plugins can crash
an entire browser, or open a security hole in a server (e.g., from a
buffer overrun).
Embedded systems have the same problem since they are of-
ten organized as a set of tasks (sometimes including the operating
system) that share physically-addressed memory (see Section 7).
Without inter-task protection, an error in part of the system can
make the entire system unreliable. Similarly, loadable OS kernel
modules (such as in Linux) all run in the kernel’s unprotected ad-
dress space, leading to potential reliability and security problems.
Figure 1 illustrates a general protection system and is based on
the diagrams in [17] and [18]. Each column represents one protec-
tion domain [16] while each row represents a range of memory ad-

Memory
Addresses
None
Read−only
Read−write
Execute−read
Protection domains
Permissions Key
0
0xFFF...
Figure 1: A visual depiction of multiple memory protection do-
mains within a single shared address space.
dresses. The address space can be virtual or physical—protection
domains are independent from how virtual memory translation is
done (if it is done at all). A protection domain can contain many
threads, and every thread is associated with exactly one protection
domain at any one point in its execution. Protection domains that
want to share data with each other must share at least a portion
of their address space. The color in each box represents the per-
missions that each protection domain has to access the region of
memory. An ideal protection system would allow each protection
domain to have a unique view of memory with permissions set on
arbitrary-sized memory regions.
The system we present in this paper implements this ideal pro-
tection system. We call this Mondrian memory protection (MMP)
because it allows the grid in Figure 1 to be painted with any pat-
tern of access permissions, occasionally resembling works by the
eponymous early twentieth century artist. Our design includes all
of the flexibility and high-performance protected memory sharing
of a segmented architecture, with the simplicity and efficiency of
linear addressing. The design is completely compatible with ex-
isting ISAs, and can easily support conventional operating system
protection semantics.
To reduce the space and run-time overheads of providing fine-
grained protection, MMP uses a highly-compressed permissions
table structure and two levels of hardware permissions caching.
MMP overheads are less than 9% even when the system is used
aggressively to provide separate protection for every object in a
program. We believe the increase in design robustness and the re-
duction in application design complexity will justify these small
run-time overheads. In some cases, the new application struc-
ture enabled by fine-grained protection will improve performance
by eliminating cross-context function calls and data copying. We
demonstrate this by saving 52% of the memory traffic in our zero-
copy networking implementation (see Section 5.3). The network-
ing example also illustrates a fine-grain segment translation scheme
which builds upon the base MMP data structures to provide a fa-
cility to present data at different addresses in different protection
domains. The MMP design also has the desirable property that
the overhead is only incurred when fine-grain protection is used,
with less than 1% overhead when emulating conventional coarse-
grained protection.
The rest of the paper is structured as follows. In Section 2 we
give a motivating example and discuss our requirements for memory system protection. Then we present the hardware and software
components of the MMP design in Section 3. We quantitatively
measure the overheads of our design in our implementation model
in Section 4. We discuss translation in Section 5 and describe its
use in zero-copy networking. We include a discussion of uses for
fine-grained protection and sharing in Section 6, and a discussion
of related work in Section 7. We conclude in Section 8.
2. EXAMPLE AND REQUIREMENTS
We provide a brief example to motivate the need for the MMP
system. More examples are discussed in Section 6. Consider a
network stack where when a packet arrives, the network card uses
DMA to place a packet into a buffer provided to it by the kernel
driver. Instead of the kernel copying the network payload data
to a user supplied buffer as is normally done, the kernel makes
the packet headers inaccessible and the packet data read-only and
passes a pointer to the user, saving the cost of a copy.
Implementing this example requires a memory system to support
the following requirements:
- different: Different protection domains can have different permissions on the same memory region.
- small: Sharing granularity can be smaller than a page. Putting every network packet on its own page is wasteful of memory if packets are small. Worse, to give the header separate permissions from the payload would require copying them to separate pages in a page-based system (unless the payload starts at a page boundary).
- revoke: A protection domain owns regions of memory and is allowed to specify the permissions that other domains see for that memory. This includes the ability to revoke permissions.
Previous memory sharing models fail one or more of these require-
ments.
Conventional linear, demand-paged virtual memory systems can
meet the different requirement by placing each thread in a sep-
arate address space and then mapping in physical memory pages to
the same virtual address in each address context. These systems
fail the small requirement because permissions granularity is at
the level of pages.
Page-group systems [16], such as HP-PA RISC and PowerPC,
define protection domains by which page-groups (collections of
memory pages) are accessible. Every domain that has access to a
page-group sees the same permissions for all pages in the group,
violating the different requirement. They also violate the
small requirement because they work at the coarse granularity of
a page or multiple pages. Domain-page systems [16] are similar to
our design in that they have an explicit domain identifier, and each
domain can specify a permissions value for each page. They fail to
meet the small requirement because permissions are managed at
page granularity.
Capability systems [10, 18] are an extension of segmented archi-
tectures where a capability is a special pointer that contains both
location and protection information for a segment. Although de-
signed for protected sharing, these fail the different require-
ment for the common case of shared data structures that contain
pointers. Threads sharing the data structure use its pointers (capa-
bilities) and therefore see the same permissions for objects accessed
via the shared structure. Many capability systems fail to meet the
revoke requirement because revocation can require an exhaustive
sweep of the memory in a protection domain [7]. Some capability
systems meet the different and revoke requirements by per-
forming an indirect lookup on each capability use [13, 29], which
adds considerable run-time overhead.
Large sparse address spaces provide an opportunity for proba-
bilistic protection [35], but this strategy violates the revoke and
different requirement.

refill
Permissions
Table
PLB
Domain ID
Perm Table Base
MEMORY
lookup
refill
SidecarsAddress Regs
CPU
Figure 2: The major components of the Mondrian memory protec-
tion system. On a memory reference, the processor checks permissions
in the address register sidecar. If the reference is out of range of the
sidecar information, or the sidecar is not valid, it attempts to reload the
sidecar from the PLB. If the PLB does not have the permissions infor-
mation, either hardware or software walks the permissions table which
resides in memory. The matching entry from the permissions table is
cached in the PLB and is used to reload the address register sidecar
with a new segment descriptor.
3. MMP DESIGN
The major challenge in a MMP system is reducing the space and
run-time overheads. In the following, we describe our initial explo-
ration of this design space and our trial implementation. The imple-
mentation and results are for a 32-bit address space, but MMP can
be readily extended to 64-bit addresses as discussed in Section 3.9.
3.1 MMP Features
MMP provides multiple protection domains within a single ad-
dress space (physical or virtual). Addressing is linear and is com-
patible with existing binaries for current ISAs. A privileged su-
pervisor protection domain is available which provides an API to
modify protection information. A user thread can change permis-
sions for a range of addresses, a user segment, by specifying the
base word address, the length in words, and the desired permission
value. Changing memory protections only incurs the cost of an
inter-protection domain call (Section 3.8), not a full system call.
In all the designs discussed in this section, we provide two bits
of protection information per word, as shown in Table 1. MMP
can be easily modified to support more permission bits or different
permission types.
Perm Value Meaning
00 no perm
01 read-only
10 read-write
11 execute-read
Table 1: Example permission values and their meaning.
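As a concrete reading of Table 1, the checks an implementation would perform against a two-bit permission value might look like the following sketch (the enum and function names are ours, not from the paper):

```c
#include <stdbool.h>

/* Two-bit permission values from Table 1 (names assumed). */
enum perm { PERM_NONE = 0, PERM_RO = 1, PERM_RW = 2, PERM_EX = 3 };

bool allows_read(enum perm p)  { return p != PERM_NONE; } /* 01, 10, 11 */
bool allows_write(enum perm p) { return p == PERM_RW; }   /* 10 only */
bool allows_exec(enum perm p)  { return p == PERM_EX; }   /* 11 only */
```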
Every allocated region of memory is owned by a protection do-
main, and this association is maintained by the supervisor. To sup-
port the construction of protected subsystems, we allow the owner
of a region to export protected views of this region to other protec-
tion domains.
3.2 MMP System Structure
Figure 2 shows the overall structure of an MMP system. The
Binary
Search
...
Address (30)
Address (30)
0x00100020
0x0
Perm (2)
00
01
Translation (32)
+0x2841
+0x0
Figure 3: A sorted segment table (SST). Entries are kept in sorted or-
der and binary searched on lookup. The shaded part contains optional
translation information.
CPU contains a hardware control register which holds the pro-
tection domain ID (PD-ID [16]) of the currently running thread.
Each domain has a permissions table, stored in privileged memory,
which specifies the permission that domain has for each address in
the address space. This table is similar to the permissions part of
a page table, but permissions are kept for individual words in an
MMP system. Another CPU control register holds the base address
of the active domain’s permissions table.
The MMP protection table represents each user segment using
one or more table segments, where a table segment is a convenient
unit for the table representation. We use the term block to mean an
address range that is naturally aligned and whose size is a power of
two. In some MMP variants, all table segments are blocks.
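The block property can be tested cheaply with bit arithmetic; a minimal sketch (helper name ours):

```c
#include <stdbool.h>
#include <stdint.h>

/* A "block" is an address range that is naturally aligned and whose
   size is a power of two. */
bool is_block(uint32_t base, uint32_t size) {
    if (size == 0 || (size & (size - 1)) != 0)
        return false;               /* size is not a power of two */
    return (base & (size - 1)) == 0; /* base is naturally aligned */
}
```

For example, the 64-byte range starting at 0x1000 is a block, while the same range shifted to start at 0xFFC is not.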
Every memory access must be checked to see if the domain has
appropriate access permissions. A permissions lookaside buffer
(PLB) caches entries fromthe permissions table to avoid long walks
through the memory resident table. As with a conventional TLB
miss, a PLB miss can use hardware or software to search the per-
mission tables. To further improve performance, we also add a
sidecar register for every architectural address register in the ma-
chine (in machines that have unified address and data registers, a
sidecar would be needed for every integer register). The sidecar
caches the last table segment accessed through this address regis-
ter. As discussed below, the information stored in the sidecar can
map a wider address range than the index address range of the PLB
entry from which it was fetched, avoiding both PLB lookups and
PLB misses while a pointer moves within a table segment. The in-
formation retrieved from the tables on a PLB miss is written to both
the register sidecar and the PLB.
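The two-level caching path just described can be summarized in pseudo-C. This is an illustrative sketch only: the types, the stubbed plb_lookup and table_walk, and the omitted refill step are our assumptions, not the paper's implementation:

```c
#include <stdbool.h>
#include <stdint.h>

/* Check path: sidecar first, then PLB, then a walk of the in-memory table. */
typedef struct {
    bool valid;
    uint32_t base, bound;   /* address range covered by the cached table segment */
    unsigned perm;          /* two-bit permission value */
} sidecar_t;

static unsigned table_walk(uint32_t addr) {
    (void)addr;
    return 1;               /* stub: pretend the table says read-only */
}

static bool plb_lookup(uint32_t addr, unsigned *perm) {
    (void)addr; (void)perm;
    return false;           /* stub: always miss */
}

unsigned check_access(sidecar_t *sc, uint32_t addr) {
    if (sc->valid && addr >= sc->base && addr < sc->bound)
        return sc->perm;    /* sidecar hit: no PLB access needed */
    unsigned perm;
    if (!plb_lookup(addr, &perm))  /* PLB miss: walk the permissions table */
        perm = table_walk(addr);
    /* a real implementation would now refill the PLB and this sidecar */
    return perm;
}
```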
The next two subsections discuss alternative layouts of the en-
tries in the permissions tables. In choosing a format of the permis-
sions table we must balance space overhead, access time overhead,
PLB utilization, and the time to modify the tables when permis-
sions change.
3.3 Sorted Segment Table
A simple design for the permissions table is just a linear array
of segments ordered by segment start address. Segments can be
any number of words in length and start on any word boundary, but
cannot overlap. Figure 3 shows the layout of the sorted segment ta-
ble (SST). Each entry is four bytes wide, and includes a 30-bit start
address (which is word aligned, so only 30 bits are needed) and a
2-bit permissions field (the shaded part is optional and will be dis-
cussed in Section 5). The start address of the next segment implic-
itly encodes the end of the current segment, so segments with no
permissions are used to encode gaps and to terminate the list. On
a PLB miss, binary search is used to locate the segment contain-
ing the demand address. The SST is a compact way of describing
the segment structure, especially when the number of segments is

small, but it can take many steps to locate a segment when the num-
ber of segments is large. Because the entries are contiguous, they
must be copied when a new entry is inserted. Furthermore, the SST
table can only be shared between domains in its entirety, i.e., two
domains have to have identical permissions maps.
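The SST lookup on a PLB miss amounts to a binary search for the last entry whose start address is at or below the faulting address. A sketch (entry layout simplified; field and function names ours):

```c
#include <stddef.h>
#include <stdint.h>

/* One SST entry: start address plus two-bit permissions. The next entry's
   start implicitly ends this segment; a no-permission entry encodes a gap
   and terminates the list. */
typedef struct { uint32_t start; unsigned perm; } sst_entry;

/* Binary search for the segment containing addr. Assumes t[0].start <= addr
   and that the table ends with a terminating no-permission entry. */
unsigned sst_lookup(const sst_entry *t, size_t n, uint32_t addr) {
    size_t lo = 0, hi = n;              /* candidate range [lo, hi) */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (t[mid].start <= addr) lo = mid; else hi = mid;
    }
    return t[lo].perm;
}
```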
3.4 Multi-level Permissions Table
[Figure 4 graphic: a 32-bit address from the program is split into Root Index (10, bits 31-22), Mid Index (10, bits 21-12), Leaf Index (6, bits 11-6), and Leaf Offset (6, bits 5-0).]
Figure 4: How an address indexes the multi-level permissions table
(MLPT).
An alternative design is a multi-level permissions table (MLPT).
The MLPT is organized like a conventional forward mapped page
table, but with an additional level. Figure 4 shows which bits of
the address are used to index the table, and Figure 5 shows the
MLPT lookup algorithm. Entries are 32-bits wide. The root table
has 1024 entries, each of which maps a 4MB block. Entries in
the mid-level table map 4 KB blocks. The leaf level tables have 64
entries which each provide individual permissions for 16 four-byte
words. The supervisor can reduce MLPT space usage by sharing
lower level tables across different protection domains when they
share the same permissions map.
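The index fields of Figure 4 can be extracted with shifts and masks; a small sketch (function names ours):

```c
#include <stdint.h>

/* Field extraction matching Figure 4's split of a 32-bit address. */
unsigned root_index(uint32_t a)  { return (a >> 22) & 0x3FF; } /* bits 31-22 */
unsigned mid_index(uint32_t a)   { return (a >> 12) & 0x3FF; } /* bits 21-12 */
unsigned leaf_index(uint32_t a)  { return (a >> 6)  & 0x3F; }  /* bits 11-6  */
unsigned leaf_offset(uint32_t a) { return a & 0x3F; } /* byte offset in the
                                                         64-byte leaf range */
```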
We next examine different formats for the entries in the MLPT.
3.4.1 Permission Vector Entries
A simple format for an MLPT entry is a vector of permission
values, where each leaf entry has 16 two-bit values indicating the
permissions for each of 16 words, as shown in Figure 6. User seg-
ments are represented with the tuple < base addr, length, permis-
sions>. Addresses and lengths are given in bytes unless otherwise
noted. The user segment <0xFFC, 0x50, RW> is broken up
into three permission vectors, the latter two of which are shown
in the figure. We say an address range owns a permissions table
entry if looking up any address in the range finds that entry. For
example, in Figure 6, 0x1000–0x103F owns the first permission
vector entry shown.
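Reading or writing one word's permission within a 16-element permission vector is a shift-and-mask operation. A sketch, assuming word i occupies bits 2i+1..2i (the paper does not specify the bit order):

```c
#include <stdint.h>

/* A leaf permission-vector entry packs 16 two-bit values, one per word
   of the 64-byte range that owns the entry (bit layout assumed). */
static inline unsigned vec_get_perm(uint32_t entry, unsigned word_idx) {
    return (entry >> (2 * word_idx)) & 0x3;
}

static inline uint32_t vec_set_perm(uint32_t entry, unsigned word_idx,
                                    unsigned p) {
    uint32_t mask = 0x3u << (2 * word_idx);
    return (entry & ~mask) | ((uint32_t)p << (2 * word_idx));
}
```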
Upper level MLPT entries could simply be pointers to lower
level tables, but to reduce space and run-time overhead for large
user segments, we allow an upper level entry to hold either a pointer
to the next level table or a permissions vector for sub-blocks (Fig-
PERM_ENTRY MLPT_lookup(addr_t addr) {
    PERM_ENTRY e = root[addr >> 22];
    if (is_tbl_ptr(e)) {
        PERM_TABLE* mid = e << 2;
        e = mid[(addr >> 12) & 0x3FF];
        if (is_tbl_ptr(e)) {
            PERM_TABLE* leaf = e << 2;
            e = leaf[(addr >> 6) & 0x3F];
        }
    }
    return e;
}
Figure 5: Pseudo-code for the MLPT lookup algorithm. The table is indexed with an address and returns a permissions table entry, which is cached in the PLB. The base of the root table is held in a dedicated CPU register. The implementation of is_tbl_ptr depends on the encoding of the permission entries.

[Figure 6 graphic: user segments <0xFFC, 0x50, RW>, <0x1060, 0x8, RO>, and <0x1068, 0x20, RW> mapped onto two leaf-table permission vectors: the entry owned by 0x1000–0x103F holds RW for all 16 words, while the entry owned by 0x1040–0x107F holds RW for three words, no-perm for five, RO for two, and RW for the remaining six.]
Figure 6: A MLPT entry consisting of a permissions vector. User
segments are broken up into individual word permissions.
[Figure 7 graphic: upper-level entry formats. Type bit (1) = 0: Unused (1), Ptr to lower level table (30). Type bit (1) = 1: Unused (15), Perm for 8 sub-blocks (8x2b).]
bool is_tbl_ptr(PERM_ENTRY e) { return (e >> 31) == 0; }
Figure 7: The bit allocation for upper level entries in the permissions vector MLPT, and the implementation of the function used in MLPT_lookup.
ure 7). Permission vector entries in the upper levels contain only
eight sub-blocks because the upper bit is used to indicate whether
the entry is a pointer or a permissions vector. For example, each
mid-level permissions vector entry can represent individual permis-
sions for the eight 512 B blocks within the 4KB block mapped by
this entry.
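Decoding an upper-level entry per Figure 7 might look as follows; is_tbl_ptr and the e << 2 pointer recovery come from the paper's figures, while subblock_perm's bit order is our assumption:

```c
#include <stdbool.h>
#include <stdint.h>

/* Top bit 0: the entry is a pointer to a lower-level table. */
bool is_tbl_ptr(uint32_t e) { return (e >> 31) == 0; }

/* Recover the table address: 30-bit word pointer -> byte address. */
uint32_t tbl_ptr(uint32_t e) { return e << 2; }

/* Top bit 1: the entry is a vector of eight 2-bit sub-block permissions
   (bit order assumed). */
unsigned subblock_perm(uint32_t e, unsigned i) { return (e >> (2 * i)) & 0x3; }
```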
3.4.2 Mini-SST entries
Although permission vectors are a simple format for MLPT en-
tries, they do not take advantage of the fact that most user segments
are longer than a single word. Also, the upper level entries are
inefficient at representing the common case of non-aligned, non-
power-of-two sized user segments.
The sorted segment table demonstrated a more compact encod-
ing for abutting segments—only base and permissions are needed
because the length of one segment is implicit in the base of the next.
A mini-SST entry uses the same technique to increase the encoding
density of an individual MLPT entry.
Figure 8: The bit allocation for a mini-SST permission table entry.
Figure 8 shows the bit encoding for a mini-SST entry which can
represent up to four table segments crossing the address range of
an entry. As with the SST, start offsets and permissions are given
for each segment, allowing length (for the first three entries) to be
implicit in the starting offset of the next segment. The mini-SST
was broken up into four segments because experiments showed that
the size of heap allocated objects was usually greater than 16 bytes.
Mini-SST entries encode permissions for a larger region of mem-
ory than just the 16 words (or 16 sub-blocks at the upper table levels) that own it. The first segment has an offset which represents its start point as the number of sub-blocks (0–31) before the base
address of the entry’s owning range. Segments mid0 and mid1
must begin and end within this entry’s 16 sub-blocks. The last
segment can start at any sub-block in the entry except the first (a

[Figure 9 graphic: user segments <0xFFC, 0x50, RW>, <0x1060, 0x8, RO>, and <0x1068, 0x20, RW>. The mini-SST entry owned by 0x1000–0x103F holds First = <−1, (17), RW> and Last = <16, 3, RW>; the entry owned by 0x1040–0x107F holds First = <−17, (20), RW>, Mid0 = <3, (5), NONE>, Mid1 = <8, (2), RO>, and Last = <10, 8, RW>.]
Figure 9: An example of segment representation for mini-SST entries.
zero offset means the last segment starts at the end address of the
entry) and it has an explicit length that extends up to 31 sub-blocks
from the end of the entry’s owning range. The largest span for an
entry is 79 sub-blocks (31 before, 16 in, 32 after).
The example in Figure 6 illustrates the potential benefit of storing information for words beyond the owning address range. If the entry owned by 0x1000–0x103F could provide permissions information for memory at 0x1040 then we might not have to load the entry owned by 0x1040.
Figure 9 shows a small example of mini-SST entry use. Seg-
ments within an SST entry are labelled using a < base, length, per-
mission > tuple. Lengths shown in parentheses are represented
implicitly as a difference in the base offsets of neighboring ta-
ble segments. The entry owned by 0x1000-0x103F has seg-
ment information going back to 0xFFC, and going forward to
0x104C. Because of the internal representation limits of the mini-
SST format, the user segment mapped by the entry at address range
0x1000-0x103F has been split across the first and last mini-SST
table segments.
Mini-SST entries can contain overlapping address ranges, which
complicates table updates. When the entry owned by one range
is changed, any other entries which overlap with that range might
also need updating. For example, if we free part of the user segment
starting at 0xFFC by protecting a segment as <0x1040, 0xC, NONE>, we need to read and write the entries for both 0x1000–0x103F and 0x1040–0x107F even though the segment being written does not overlap the address range 0x1000–0x103F. All
entries overlapping the modified user segment must also be flushed
from the PLB to preserve consistency.
We can design an efficient MLPT using mini-SST entries as our
primary entry type. The mini-SST format reserves the top two bits
for an entry type tag; Table 2 shows the four possible types of entry. The upper tables can contain pointers to lower level tables. Any
level can have a mini-SST entry. Any level can contain a pointer
to a vector of 16 permissions. This is necessary because mini-SST
entries can only represent up to four abutting segments. If a re-
gion contains more than four abutting segments, we represent the
permissions using a permission vector held in a separate word of
storage, and pointed to by the entry. Finally, we have a pointer to
a record that has a mini-SST entry and additional information. We
use this extended record to implement translation as discussed in
Section 5.
3.5 Protection Lookaside Buffer
The protection lookaside buffer (PLB) caches protection table
entries in the same way as a TLB caches page table entries. The
PLB hardware uses a conventional ternary content addressable
memory (CAM) structure to hold address tags that have a vary-
Type  Description
00    Pointer to next level table.
11    Mini-SST entry (4 segments spanning 79 sub-blocks).
01    Pointer to permission vector (16x2b).
10    Pointer to mini-SST+ (e.g., translation (6x32b)).
bool is_tbl_ptr(PERM_ENTRY e) { return (e >> 30) == 0; }
Table 2: The different types of MLPT entries, and the implementation of the function used in MLPT_lookup. Type is the type code. Leaf tables do not have type 00 pointers.
ing number of significant bits (as with variable page size TLBs
[15]). The PLB tags have to be somewhat wider than a TLB as
they support finer-grain addressing (26 tag bits for our example de-
sign). Entries are also tagged with protection domain identifiers
(PD-IDs).
The ternary tags stored in the PLB entry can contain additional
low-order “don’t care” address bits to allow the tag to match ad-
dresses beyond the owning address range. For example, the tag
0x10XX, where XX are don’t care bits, will match any address
from 0x1000–0x10FF. On a PLB refill, the tag is set to match on
addresses within the largest naturally aligned power-of-two sized
block for which the entry has complete permissions information.
Referring to the example in Figure 9, a reference to 0x1000 will pull in the entry for the block 0x1000–0x103F and the PLB tag will match any address in that range. A reference to 0x1040 will bring in the entry for the block 0x1040–0x107F, but this entry can be stored with a tag that matches the range 0x1000–0x107F because it has complete information for that naturally aligned power-of-two sized block. This technique increases effective PLB capacity by allowing a single PLB entry to cache permissions for a larger range of addresses.
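A ternary tag comparison of this kind reduces to a masked XOR; a sketch (names ours), where care has 1-bits in the significant tag positions and 0-bits in the "don't care" positions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Ternary match: addr matches tag wherever care is 1. E.g. tag 0x1000
   with care ~0xFF matches any address in 0x1000-0x10FF. */
bool tag_match(uint32_t tag, uint32_t care, uint32_t addr) {
    return ((addr ^ tag) & care) == 0;
}
```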
When permissions are changed for a region in the permissions
tables, we need to flush any out-of-date PLB entries. Permissions
modification occurs much more frequently than page table modifi-
cations in a virtual memory system. To avoid excessive PLB flush-
ing, we use a ternary search key for the CAM tags to invalidate po-
tentially stale entries in one cycle. The ternary search key has some
number of low order “don’t care” bits, to match all PLB entries
within the smallest naturally aligned power-of-two sized block that
completely encloses the region we are modifying (this is a conser-
vative scheme that may invalidate unmodified entries that happen to
lie in this range). A similar scheme is used to avoid having two tags
hit simultaneously in the PLB CAM structure. On a PLB refill, all
entries that are inside the range of a new tag are first searched for
and invalidated using a single search cycle with low-order “don’t
care” bits.
3.6 Sidecar Registers
[Figure 10 graphic: Address register: Addr (32). Sidecar: Valid (1), Base (32), Bound (32), Perm (2), Trans. offset (32).]
Figure 10: The layout of an address register with sidecar. The shaded
portion is optional translation information.
Each address register in the machine has an associated sidecar
register which holds information for one table segment as depicted
in Figure 10. The program counter also has its own sidecar used for
instruction fetches. Sidecar registers are an optional component of
the design, but they help reduce traffic to the fully-associative PLB.
