scispace - formally typeset
Open Access Journal Article DOI

A space-efficient flash translation layer for CompactFlash systems

TLDR
This work proposes a novel FTL design that combines two different granularities in address translation, motivated by the idea that coarse grain address translation lowers the resources required to maintain translation information, which is crucial in mobile consumer products for cost and power consumption reasons, while fine grain address translation is efficient in handling small size writes.
Abstract
Flash memory is becoming increasingly important as nonvolatile storage for mobile consumer electronics due to its low power consumption and shock resistance. However, it imposes technical challenges in that a write should be preceded by an erase operation, and that this erase operation can be performed only in a unit much larger than the write unit. To address these technical hurdles, an intermediate software layer called a flash translation layer (FTL) is generally employed to redirect logical addresses from the host system to physical addresses in flash memory. Previous approaches have performed this address translation at the granularity of either a write unit (page) or an erase unit (block). We propose a novel FTL design that combines the two different granularities in address translation. This is motivated by the idea that coarse grain address translation lowers the resources required to maintain translation information, which is crucial in mobile consumer products for cost and power consumption reasons, while fine grain address translation is efficient in handling small size writes. Performance evaluation based on trace-driven simulation shows that the proposed scheme significantly outperforms previously proposed approaches.


IEEE Transactions on Consumer Electronics, Vol. 48, No. 2, MAY 2002

A SPACE-EFFICIENT FLASH TRANSLATION LAYER FOR COMPACTFLASH SYSTEMS

Jesung Kim, Jong Min Kim, Sam H. Noh, Sang Lyul Min and Yookun Cho
Abstract: Flash memory is becoming increasingly important as nonvolatile storage for mobile consumer electronics due to its low power consumption and shock resistance. However, it imposes technical challenges in that a write should be preceded by an erase operation, and that this erase operation can be performed only in a unit much larger than the write unit. To address these technical hurdles, an intermediate software layer called a flash translation layer (FTL) is generally employed to redirect logical addresses from the host system to physical addresses in flash memory. Previous approaches have performed this address translation at the granularity of either a write unit (page) or an erase unit (block). In this paper, we propose a novel FTL design that combines the two different granularities in address translation. This is motivated by the idea that coarse grain address translation lowers the resources required to maintain translation information, which is crucial in mobile consumer products for cost and power consumption reasons, while fine grain address translation is efficient in handling small size writes. Performance evaluation based on trace-driven simulation shows that the proposed scheme significantly outperforms previously proposed approaches.

Index Terms: Flash memory, NAND-type flash memory, FTL, CompactFlash, address translation.
I. INTRODUCTION

Recently, mobile computing devices such as PDAs and digital cameras have become very popular. These mobile devices impose different design requirements such as small size, light weight, low power consumption, and shock resistance. These new requirements necessitate a redesign of various components of the underlying computer system. In particular, the design of nonvolatile storage subsystems for mobile devices is one of the most challenging areas, since traditional magnetic disks are generally lacking in power efficiency and shock resistance due to their mechanical nature.
Flash memory has been recognized as an attractive long-term storage medium for mobile computers because of its superiority in small size, shock resistance, and low power consumption [1], [2]. Moreover, as there is no mechanical delay involved, random access is possible, thereby providing excellent performance. These properties make flash-memory-based storage subsystems that emulate hard disks very popular for secondary storage of mobile computers (e.g., CompactFlash [3] and SmartMedia [4]). It is anticipated that as the capacity of flash memory grows [5], the use of flash memory will become more prevalent, coexisting with hard disks or even replacing hard disks outright even in conventional computer systems.

Jesung Kim, Sang Lyul Min, and Yookun Cho are with the School of Computer Science and Engineering, Seoul National University, Korea. E-mail: jskim@archi.snu.ac.kr; symin@dandelion.snu.ac.kr; cho@ssrnet.snu.ac.kr. Jong Min Kim is with Samsung Electronics, Korea. E-mail: jmkim@archi.snu.ac.kr. Sam H. Noh is with the School of Information and Computer Engineering, Hong-Ik University, Korea. E-mail: samhnoh@hongik.ac.kr.
Flash memory, however, has several characteristics that make a straightforward replacement of magnetic disks difficult. First, a write in flash memory should be preceded by an erase operation, which takes an order of magnitude longer than a write operation. Second, erase operations can only be performed in a much larger unit than the write operation. This implies that, for an update of even a single byte, an erase operation as well as restoration of a large amount of data would be required. This not only degrades the potential performance significantly, but also gives rise to an integrity problem, since data may be lost if the power goes down unexpectedly during the restoration process, which may happen frequently in hand-held devices.
To address these problems, an intermediate software layer called a flash translation layer (FTL) has been employed between the host application and flash memory [6], [7]. The FTL redirects each write request from the host to an empty location in flash memory that has been erased in advance. Although this technique rectifies the aforementioned limitation of erase-before-write, it comes at the price of extra flash memory operations to prepare empty locations and extra storage to maintain the address translation information, the amount of which varies significantly depending on the management algorithm.
In this paper, we propose a novel FTL design aimed at mass storage CompactFlash systems [3]. The main motivation of the proposed FTL is that coarse grain address translation lowers management overhead, whereas fine grain address translation is efficient in handling small size writes. The proposed scheme combines the two different granularities in address translation to allow efficient handling of write requests smaller than a block while reducing the storage overhead of page-level address translation. We also propose a method that assures consistency of the translation information stored in flash memory despite unexpected power-outages. The proposed scheme is based on incremental updates of translation information in a dedicated region in flash memory. Consistency of translation information is achieved by performing the multiple updates of translation information required to process a host request in a single atomic flash memory write operation. The proposed scheme also has the additional advantage of a short startup time, which is an important feature in consumer devices where systems are frequently turned on and off by users.

Contributed Paper. Manuscript received April 10, 2002. 0098-3063/00 $10.00 © 2002 IEEE.
The organization of the remainder of the paper is as follows. The next section gives an overview of flash memory and surveys previous approaches in managing flash memory. A detailed description of our proposed scheme is presented in the following section. Next, we compare the performance of our scheme with that of previous schemes based on trace-driven simulations. Finally, concluding remarks are given in the last section.
II. BACKGROUND

Flash memory is a version of EEPROM that allows in-system programming. In flash memory, data can be written by issuing a sequence of programming commands and data/addresses to the flash memory chip. The stored data is sustained even after power is turned off and, thus, it can be used as a nonvolatile storage medium, especially for mobile consumer devices that require small size and low power consumption. Moreover, flash memory has the advantage of being accessed in a truly random fashion and, thus, has potential for high performance. Table I compares the characteristics of various storage media, including the two major types (NAND and NOR) of flash memory.
There are three basic operations that can be applied to both types of flash memory, namely, read, write, and erase operations. The unit of read and write operations is referred to as a page, and the size of a page is fixed for a given product, ranging from 1 byte to much larger sizes such as 2KB. For the erase operation, the unit is referred to as a block, which consists of multiple pages, and the size of a block is generally somewhere between 4KB and 128KB. For NOR-type flash memory, the page size is typically 1 byte, meaning that each byte can be read and written individually. NAND-type flash memory, on the other hand, is optimized more for mass storage, and the page size is typically 512 bytes, coinciding with the size of a sector in hard disks. This gives an order-of-magnitude higher write bandwidth compared to NOR-type flash memory, since programming of each byte in the same page is fully overlapped. However, due to the block-device-like characteristics, early FTL designs [6], [7], [8] relying on the individual byte programming capability of NOR-type flash memory are not directly applicable to NAND-type flash memory.

TABLE I. CHARACTERISTICS OF DIFFERENT STORAGE MEDIA.

Media       | Read                        | Write                       | Erase
DRAM        | 60ns (2B) / 2.56μs (512B)   | 60ns (2B) / 2.56μs (512B)   | -
NOR flash   | 150ns (1B) / 14.4μs (512B)  | 211μs (1B) / 3.53ms (512B)  | 1.2s (128KB)
NAND flash  | 10.2μs (1B) / 35.9μs (512B) | 201μs (1B) / 226μs (512B)   | 2ms (16KB)
Disk        | 12.4ms (512B, average)      | 12.4ms (512B, average)      | -
To aid FTL designers, NAND-type flash memory usually provides additional storage in each page called a spare area to store a few bytes of management information [5]. This spare area can be written at the same time as the data is written, with virtually no overhead. The spare area is also used to store an ECC code generated by outside logic to detect errors while reading and writing [9]. Hereafter, we limit our attention to NAND-type flash memory because of its nice properties as mass storage, such as efficient bulk read/write operations and a relatively short erase time.
Fig. 1 depicts the internal organization of a typical NAND-type flash-memory-based CompactFlash system. It consists of one or more NAND-type flash memory chips, a controller executing the FTL code stored in ROM, SRAM storing data structures relevant to address translation, and an interface to the host. The host issues read/write commands along with the sector address and the request size to the CompactFlash system as it would to a hard disk drive. Upon receipt of a command, address, and size, the FTL translates them into a sequence of flash memory intrinsic commands (read/write/erase) and physical addresses. The address translation is performed by looking up the mapping table stored in SRAM, which is initially constructed by scanning the spare area of flash memory. By remapping each write request to a different location, the FTL can rectify the limitation of flash memory prohibiting overwrites, transparently to the host.
Fig. 1. Internal organization of a CompactFlash system.

The mapping between the logical address and the physical address can be maintained either at the page (i.e., write unit) level or at the block (i.e., erase unit) level. Page-level address mapping allows more flexible management because a logical page can be mapped to any physical page in flash memory. However, this mapping requires a large amount of SRAM to store the needed mapping table. For example, a CompactFlash system with a 16MB flash memory chip with a page size of 512 bytes requires 64KB of SRAM for the mapping table. Moreover, the size of the mapping table scales as the capacity of flash memory increases, requiring 4MB of SRAM in the case of a state-of-the-art 1GB CompactFlash system, which is prohibitively large for cost, size, and power consumption reasons.
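The SRAM cost of page-level mapping can be checked with a quick calculation. Below is a minimal sketch, assuming 2-byte mapping entries, which reproduces the 64KB and 4MB figures quoted above (the entry width is an assumption implied by those figures, not stated in the text):

```python
def page_table_bytes(capacity_bytes, page_bytes, entry_bytes=2):
    """SRAM for a page-level mapping table: one entry per physical page."""
    num_pages = capacity_bytes // page_bytes
    return num_pages * entry_bytes

# 16MB flash with 512-byte pages -> 32768 entries -> 64KB of SRAM.
print(page_table_bytes(16 * 2**20, 512))   # 65536
# 1GB flash with 512-byte pages -> 4MB of SRAM.
print(page_table_bytes(2**30, 512))        # 4194304
```

The table size grows linearly with capacity, which is exactly why the paper turns to coarser-grain mapping next.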
In block-level address mapping, the logical address is divided into a logical block address and a block offset, and only the logical block address is translated into a physical block address in flash memory (i.e., the block offset is invariant in the translation). This mapping is similar to the traditional address translation mechanism found in paged virtual memory [10]. Although this block-level address mapping places a restriction that the block offset in the mapped physical block be the same as that in the logical block, it requires a much smaller mapping table. For example, the same 16MB CompactFlash system with a page size of 512 bytes and a block size of 16KB (32 pages) requires only 2KB of SRAM for the mapping table. Moreover, the size of the block-level mapping table does not increase linearly as in the case of page-level mapping, since high-capacity flash memory generally has a larger block size. For example, a 256MB flash memory chip with a block size of 128KB [5] would require only 8KB of SRAM.
However, block-level address mapping generally involves extra flash memory operations when a write request requires an update of only part of a block. For example, the simplest method would operate such that whenever there is a write request to a single page, the physical block that contains the requested page is remapped to a free physical block, the write operation is performed to the page in the new physical block with the same block offset, and all the other pages in the same block are copied from the original physical block to the new physical block.
To eliminate such an expensive copy operation, a technique based on the concept of a replacement block has been proposed [7], [8]. This technique, which we call the replacement block scheme, allocates a temporary block called a replacement block when there is an overwrite to an existing page in a block, and performs the write operation to the page in the replacement block with the same block offset. Moreover, the replacement block itself may have its own replacement block if one of its pages is overwritten again. Such replacement blocks belonging to the same logical block are maintained in a linked list and traversed for both read and write operations: for a read operation, the list is traversed to find the most up-to-date page in the replacement blocks; for a write operation, it is traversed to find in the replacement blocks the first free page with the same block offset. When there is an overwrite request and there is no free space, the longest linked list is merged into one block by copying the most up-to-date pages from the replacement blocks to the last replacement block in the list, which becomes the new physical block representing the logical block. After this merge operation, the former logical block and the replacement blocks except the last one are erased and become free blocks available as replacement blocks for other data blocks.
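The linked-list traversal described above can be sketched as follows. The `Block` class and `alloc_block` callback are hypothetical illustrations of the chain structure, not the paper's actual data layout:

```python
class Block:
    """Stand-in for a physical block: fixed page slots (None = not yet
    written) plus a link to its replacement block, if any."""
    def __init__(self, pages_per_block):
        self.pages = [None] * pages_per_block
        self.next = None

def read_page(block, offset):
    # Walk the chain; the last written copy along the chain is the newest.
    newest = None
    while block is not None:
        if block.pages[offset] is not None:
            newest = block.pages[offset]
        block = block.next
    return newest

def write_page(block, offset, data, alloc_block):
    # Use the first free slot at this offset, extending the chain if needed.
    while block.pages[offset] is not None:
        if block.next is None:
            block.next = alloc_block()
        block = block.next
    block.pages[offset] = data
```

Note how every overwrite of the same offset lengthens the chain, which is what makes the later merge (and its excessive erasure) necessary.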
III. AN FTL DESIGN BASED ON LOG BLOCKS

In this section, we describe our proposed scheme, which we call the log block scheme, in detail. Our goal is to handle both small size writes and long sequential writes efficiently while limiting the size of the SRAM needed for mapping purposes. This goal is achieved by introducing a few page-level managed blocks, which we call log blocks. An additional goal of the proposed scheme is to guarantee consistency of the stored data even after unexpected power-outages. We achieve this goal by performing updates of mapping information in a single atomic write operation in dedicated blocks, which we call map blocks.
A. The Log Block

Our scheme manages most of the blocks at the block level, while a small fixed number of blocks are managed at the finer page level. The former hold ordinary data and are called data blocks. We refer to the latter as log blocks. Log blocks are used as temporary storage for small size writes to data blocks. When an update to a page in a data block is requested, a log block is allocated from the pool of free blocks that have been erased in advance, and the update is performed to the log block incrementally from the first page. On each write, the logical address of the page is also stored in the spare area that is associated with each page. Note that the write to the spare area can be performed simultaneously as the corresponding page is written, with virtually no overhead.

In this setting, for a read request the log blocks have to be checked to see if the requested page is present. If the requested page is present in the log block, it is provided to the host system, shadowing the corresponding page in the data block. To make this checking process efficient, we maintain a page-level mapping table for each log block in SRAM. This table is constructed at system startup by scanning the logical addresses stored in the spare area of each page in the log blocks, and is updated on every write to point to the up-to-date pages. Note that pages in the log block whose logical pages have been updated multiple times do not require any special handling because, by scanning the log block backward from the last page, we can always identify the page that contains the up-to-date copy of a logical page.
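The backward scan that resolves multiply-updated pages can be sketched like this, with (data, tag) pairs standing in for a page and the logical address stored in its spare area (an illustrative simplification, not the on-flash format):

```python
def find_up_to_date(log_pages, logical_page):
    """Scan the log block backward from the last written page; because the
    log block is filled in write order, the first match is the newest copy."""
    for data, tag in reversed(log_pages):
        if tag == logical_page:
            return data
    return None  # not in the log block: fall back to the data block

log = [("A0", 3), ("A1", 5), ("A2", 3)]  # logical page 3 was written twice
print(find_up_to_date(log, 3))           # the later copy, "A2", wins
```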
B. Merge Operation

Once a log block is allocated for a data block, write requests to the data block can be performed in the log block without any extra operations, until all the pages in the log block are consumed. When this happens, we reclaim the log block by merging it with the corresponding data block.

The merge operation is very simple. It allocates an erased block from the pool of free blocks and then fills each page with the up-to-date copy, either from the log block if the corresponding page is present there, or from the data block otherwise (see Fig. 2a). After copying all the pages, the new block becomes the data block, and the former data block and the log block are returned to the pool of free blocks, waiting to be erased. The merge operation requires n page read operations, n page write operations, and two block erase operations (one for the log block and the other for the former data block), where n is the number of pages per block. As a result, it produces two free blocks while consuming one free block, giving an efficiency of one free block produced per two erase operations.
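A minimal sketch of the merge itself. The `log_map` argument, mapping a block offset to the index of its newest copy in the log block, is an illustrative stand-in for the per-log-block SRAM table described in Section III-A:

```python
def merge(data_block, log_block, free_block, log_map):
    """Fill an erased block with the up-to-date pages (n reads, n writes);
    the old data block and the log block are then erased (two erases)."""
    for off in range(len(data_block)):
        if off in log_map:
            free_block[off] = log_block[log_map[off]]  # newest copy in log
        else:
            free_block[off] = data_block[off]          # unchanged page
    return free_block  # becomes the new data block

data = ["d0", "d1", "d2", "d3"]
log = ["new2", "new0"]  # log filled incrementally: offset 2, then offset 0
print(merge(data, log, [None] * 4, {2: 0, 0: 1}))
# ['new0', 'd1', 'new2', 'd3']
```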
There are special situations where the merge operation can be performed with only one erase operation, resulting in the ideal efficiency of one free block produced per erase operation. This situation occurs when all the pages in a block are written sequentially, starting from the first logical page to the last logical page. In this case, we can simplify the merge operation by making the log block the new data block and returning the old data block to the pool of free blocks (see Fig. 2b). We call this simplified version of the merge operation the switch operation.

Fig. 2. Comparison of merge operations: (a) log block merge; (b) log block switch; (c) replacement block merge; (d) cleaning in a log-structured file system.
Note that a similar operation is also possible in the replacement block scheme. However, the merge operation in the replacement block scheme incurs excessive erasure, since many pages in the replacement blocks may remain unused due to the limitation on the placement of each page (see Fig. 2c). Our merge operations are more similar to the cleaning mechanism used in the log-structured file system [11] in that valid items are collected to reclaim free space (see Fig. 2d). The difference lies in the fact that our scheme restricts pages in a log block to be from the same data block, so the efficiency of the merge operation is independent of the number of valid pages in the log block. In contrast, the cleaning operation in the log-structured file system consumes a portion of an empty block to relocate the valid pages in the block being cleaned, and thus its efficiency is inversely proportional to the number of valid pages, which depends on the utilization as well as the policy of selecting a block to be cleaned [1], [11], [12], [13], [14].

Fig. 3. Mapping information management.
C. The Map Block

Both the merge operation and the switch operation change the mapping of the data block and thus require an update to the mapping information. In previous schemes, mapping information is stored for each page/block in the associated spare area in the form of logical address tags. Basically, these tags provide physical-to-logical reverse translation information and need to be reconstructed into a conventional logical-to-physical mapping table. This requires scanning the entire space of flash memory to collect the logical address tags scattered across all the pages/blocks, which is prohibitively time- and power-consuming, especially in NAND-type flash memory where reading a few bytes costs virtually the same as reading a page. Moreover, the mapping table is generally required to be present in SRAM as a whole.

In our proposed scheme, the mapping table is stored in dedicated blocks, which we call map blocks, to enable faster startup and on-demand fetching. The map block is organized at the page level similarly to the log block, such that each page stores an incremental update of the mapping table. The map of the mapping table, which we call the map directory, is maintained in SRAM and is used to locate each portion of the mapping table stored in map blocks. This setting is similar to the traditional two-level page table structure [10], except that updates to the mapping table stored in flash memory cannot be done in place, and thus the map of the mapping table changes on every update.
Fig. 3 illustrates the contents of the mapping table before and after a merge operation. Note that the mapping table also maps log blocks and free blocks, although they are visible only in the virtual address space of the controller of the storage system and transparent to the logical address space of the host. A merge operation involves three blocks (a data block, a log block, and a free block) and thus requires updates of three mapping table entries. This may require up to three write operations on map blocks. In our approach, the mapping table is fragmented into units of half a page so that two mapping table fragments fit into a single page. By limiting the number of log blocks plus free blocks below the maximum number of blocks that a single mapping table fragment can map (128 in our configuration), the updates of the three mapping table entries can be performed in a single write operation to the map block. This assures consistency of the mapping table even when the power goes down at an unexpected time and thus greatly simplifies the recovery process.
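The 128-block bound follows directly from the fragment size. A sketch with the 512-byte page size used throughout the paper and an assumed 2-byte entry width:

```python
PAGE_BYTES = 512                    # NAND page size used in the paper
ENTRY_BYTES = 2                     # assumed mapping-entry width
FRAGMENT_BYTES = PAGE_BYTES // 2    # the table is split into half-page fragments

# Entries one fragment can map. Keeping log + free blocks below this bound
# means one page write (two fragments) can carry all three updated entries,
# making the whole mapping update a single atomic page program.
print(FRAGMENT_BYTES // ENTRY_BYTES)   # 128
```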
Since an update of the mapping table consumes a page in the map blocks, free pages will eventually be exhausted. Thus, a mechanism that reclaims used map blocks has to be provided. For simplicity, our scheme uses map blocks in a round-robin manner, and the next of the currently used map block is cleaned before the current map block is exhausted. The cleaning operation copies the valid pages (i.e., pages that are pointed to by an entry in the map directory) present in the next map block to the current map block, updates the map directory entries of the pages being copied, and erases the next map block.
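The round-robin cleaning step might be sketched as below. Map blocks are modeled as lists of pages (None = free slot) and `map_dir` maps a fragment id to its (block, slot) location; all names are illustrative, not the paper's structures:

```python
def clean_next(map_blocks, cur, map_dir, erase):
    """Reclaim the next map block (round-robin): copy its live pages into
    the current map block, retarget the map directory, then erase it.
    Pages no longer referenced by the map directory are dead and dropped."""
    nxt = (cur + 1) % len(map_blocks)
    for i, page in enumerate(map_blocks[nxt]):
        if page is None or (nxt, i) not in set(map_dir.values()):
            continue                        # empty or superseded page
        slot = map_blocks[cur].index(None)  # first free slot (assumed to exist)
        map_blocks[cur][slot] = page
        for frag, loc in map_dir.items():
            if loc == (nxt, i):
                map_dir[frag] = (cur, slot)
    erase(nxt)
```

Cleaning ahead of exhaustion is what keeps a free map page available for the next atomic mapping update.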
The map directory is initially constructed by scanning the map blocks, which reside in fixed locations, from the block that was used last (i.e., the block containing a free page) in reverse order until all the pages are located. To further simplify the process, the map directory can also be included in the mapping table along with the maps for log blocks and free blocks, provided that there is room.
Once the map directory is constructed, address translation can be performed similarly to the case of the conventional two-level page table. First, the map directory is looked up using the high-order bits of the logical address from the host to obtain the location of the page containing the required mapping table entry. Second, the located page is fetched into SRAM if it is not currently there (i.e., it is fetched on demand). The required mapping table entry can then be accessed by indexing the fetched page using the middle-order bits of the logical address, which are obtained by truncating the high-order bits used to index the map directory and the low-order bits used for the block offset. Finally, we get the physical address of the target page by adding the block offset to the physical block address recorded in the mapping table entry.
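The three lookup steps above can be put together in a sketch, assuming 32-page blocks and 128-entry fragments (the configuration quoted earlier); `fetch_fragment` stands in for the on-demand fetch of a mapping-table fragment into SRAM:

```python
def translate(logical_page, map_dir, fetch_fragment, pages_per_block=32,
              entries_per_fragment=128):
    """Two-level lookup: map directory -> mapping-table fragment -> physical
    block, then add the block offset (low bits of the logical address)."""
    lbn, offset = divmod(logical_page, pages_per_block)      # split off low bits
    dir_idx, frag_idx = divmod(lbn, entries_per_fragment)    # high / middle bits
    fragment = fetch_fragment(map_dir[dir_idx])  # fetched on demand into SRAM
    pbn = fragment[frag_idx]                     # physical block number
    return pbn * pages_per_block + offset

# Logical page 69 = block 2, offset 5; fragment maps block 2 -> physical 7.
frag = [0] * 128
frag[2] = 7
print(translate(69, {0: "map-block-0/page-3"}, lambda loc: frag))  # 229
```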

Citations
Proceedings ArticleDOI

DFTL: a flash translation layer employing demand-based selective caching of page-level address mappings

TL;DR: This work proposes a complete paradigm shift in the design of the core FTL engine from the existing techniques with a Demand-based Flash Translation Layer (DFTL), which selectively caches page-level address mappings and develops a flash simulation framework called FlashSim.
Journal ArticleDOI

A log buffer-based flash translation layer using fully-associative sector translation

TL;DR: There is much room for performance improvement in the log buffer block scheme, and an enhanced log block buffer scheme, called FAST (full associative sector translation), is proposed, which improves the space utilization of log blocks using fully-associative sector translations for the log block sectors.
Proceedings ArticleDOI

Understanding intrinsic characteristics and system implications of flash memory based solid state drives

TL;DR: This study reveals several unanticipated aspects in the performance dynamics of SSD technology that must be addressed by system designers and data-intensive application users in order to effectively place it in the storage hierarchy.
Proceedings Article

BPLRU: a buffer management scheme for improving random writes in flash storage

TL;DR: A new write buffer management scheme called Block Padding Least Recently Used is proposed, which significantly improves the random write performance of flash storage and shows about 44% enhanced performance for the workload of MS Office 2003 installation.
Proceedings ArticleDOI

A superblock-based flash translation layer for NAND flash memory

TL;DR: A novel superblock-based FTL scheme, which combines a set of adjacent logical blocks into a superblock, decreases the garbage collection overhead by up to 40% compared to previous FTL schemes.
References
Journal ArticleDOI

The design and implementation of a log-structured file system

TL;DR: In this paper, a log-structured file system called Sprite LFS is proposed, which uses a segment cleaner to compress the live information from heavily fragmented segments in order to speed up file writing and crash recovery.
Journal ArticleDOI

Scale and performance in a distributed file system

TL;DR: Observations of a prototype implementation are presented; changes in the areas of cache validation, server process structure, name translation, and low-level storage representation are motivated, and Andrew's ability to scale gracefully is quantitatively demonstrated.
Patent

Flash EEprom system

TL;DR: In this paper, the authors proposed selective multiple sector erase, in which any combinations of Flash sectors may be erased together, and select sectors among the selected combination may also be de-selected during the erase operation.
Patent

Flash file system

TL;DR: The provision of a flash memory (12), virtual mapping system, which includes a flash controller (14) and a random access memory (16) for storing mapping tables, that allows data to be continuously written to unwritten physical address locations, is discussed in this paper.
Journal ArticleDOI

eNVy: a non-volatile, main memory storage system

TL;DR: NVy as mentioned in this paper is a large non-volatile main memory storage system built primarily with Flash memory, which presents its storage space as a linear, memory mapped array rather than as an emulated disk in order to provide an efficient and easy to use software interface.