scispace - formally typeset
Search or ask a question
Author

Dong-Ho Lee

Other affiliations: Samsung, Seoul National University
Bio: Dong-Ho Lee is an academic researcher from Hanyang University. The author has contributed to research in topics: Flash file system & Flash memory. The author has an hindex of 16, co-authored 77 publications receiving 1659 citations. Previous affiliations of Dong-Ho Lee include Samsung & Seoul National University.


Papers
More filters
Journal ArticleDOI
TL;DR: There is much room for performance improvement in the log buffer block scheme, and an enhanced log block buffer scheme, called FAST (full associative sector translation), is proposed, which improves the space utilization of log blocks using fully-associative sector translations for the log block sectors.
Abstract: Flash memory is being rapidly deployed as data storage for mobile devices such as PDAs, MP3 players, mobile phones, and digital cameras, mainly because of its low electronic power, nonvolatile storage, high performance, physical stability, and portability. One disadvantage of flash memory is that prewritten data cannot be dynamically overwritten. Before overwriting prewritten data, a time-consuming erase operation on the used blocks must precede, which significantly degrades the overall write performance of flash memory. In order to solve this “erase-before-write” problem, the flash memory controller can be integrated with a software module, called “flash translation layer (FTL).” Among many FTL schemes available, the log block buffer scheme is considered to be optimum. With this scheme, a small number of log blocks, a kind of write buffer, can improve the performance of write operations by reducing the number of erase operations. However, this scheme can suffer from low space utilization of log blocks. In this paper, we show that there is much room for performance improvement in the log buffer block scheme, and propose an enhanced log block buffer scheme, called FAST (full associative sector translation). Our FAST scheme improves the space utilization of log blocks using fully-associative sector translations for the log block sectors. We also show empirically that our FAST scheme outperforms the pure log block buffer scheme.

682 citations

Journal ArticleDOI
TL;DR: This paper surveys the state-of-the-art FTL software for flash memory, defines the problems, addresses algorithms to solve them, and discusses related research issues.

286 citations

Book ChapterDOI
01 Aug 2006
TL;DR: A survey of state-of-the-art FTL software for flash memory can be found in this paper, where the authors describe problem definitions, several algorithms proposed to solve them, and related research issues.
Abstract: Recently, flash memory is widely adopted in embedded applications since it has several strong points: non-volatility, fast access speed, shock resistance, and low power consumption. However, due to its hardware characteristic, namely “erase before write”, it requires a software layer called FTL (Flash Translation Layer). This paper surveys the state-of-the-art FTL software for flash memory. This paper also describes problem definitions, several algorithms proposed to solve them, and related research issues. In addition, this paper provides performance results based on our implementation of each of FTL algorithms

125 citations

Journal ArticleDOI
TL;DR: A novel FTL algorithm called Hybrid Flash Translation Layer (HFTL) is proposed that adaptively exploits the sector mapping and log block based mapping schemes and yields better performance than conventional FTLs.
Abstract: For the last years, a number of flash translation layers (FTL) have been proposed for hiding erase-before-write architecture of NAND flash memory. However, although many conventional FTLs efficiently provide the logical to physical address remapping algorithms, they could not escape from the performance degradation when handling the hot data which tends to generate so many overwrites on the same logical address. In this paper, we propose a novel FTL algorithm called Hybrid Flash Translation Layer (HFTL) that adaptively exploits the sector mapping and log block based mapping schemes. To do so, HFTL first separates the hot data from the cold data by using the hot data identifier. And then it dynamically manages the former by using the sector mapping scheme showing an optimal performance for intensive overwrites at the same location, and the latter by using the log block based mapping scheme. By using this adaptive hybrid method, HFTL is always guaranteed to yield good performance for the pattern with the hot data as well as the pattern without it. Through a series of experiments, we show that HFTL yields better performance than conventional FTLs.

54 citations

Proceedings ArticleDOI
22 Jun 2014
TL;DR: Experimental results show that the proposed method can achieve similar results compared with state-of-the-art method whereas it requires low memory and computation cost.
Abstract: In this paper, we focus on generating compact but efficient video signatures on mobile devices so that users quickly know whether there are near-duplicates in the social network systems when they upload a video. For this, the proposed method employs the idea of inverted index that is one of the most popular text retrieval methods. Experimental results show that our method can achieve similar results compared with state-of-the-art method whereas it requires low memory and computation cost.

52 citations


Cited by
More filters
Journal Article
TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance of the farthest data point.
Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!.

1,992 citations

Proceedings ArticleDOI
07 Mar 2009
TL;DR: This work proposes a complete paradigm shift in the design of the core FTL engine from the existing techniques with a Demand-based Flash Translation Layer (DFTL), which selectively caches page-level address mappings and develops a flash simulation framework called FlashSim.
Abstract: Recent technological advances in the development of flash-memory based devices have consolidated their leadership position as the preferred storage media in the embedded systems market and opened new vistas for deployment in enterprise-scale storage systems. Unlike hard disks, flash devices are free from any mechanical moving parts, have no seek or rotational delays and consume lower power. However, the internal idiosyncrasies of flash technology make its performance highly dependent on workload characteristics. The poor performance of random writes has been a cause of major concern, which needs to be addressed to better utilize the potential of flash in enterprise-scale environments. We examine one of the important causes of this poor performance: the design of the Flash Translation Layer (FTL), which performs the virtual-to-physical address translations and hides the erase-before-write characteristics of flash. We propose a complete paradigm shift in the design of the core FTL engine from the existing techniques with our Demand-based Flash Translation Layer (DFTL), which selectively caches page-level address mappings. We develop a flash simulation framework called FlashSim. Our experimental evaluation with realistic enterprise-scale workloads endorses the utility of DFTL in enterprise-scale storage systems by demonstrating: (i) improved performance, (ii) reduced garbage collection overhead and (iii) better overload behavior compared to state-of-the-art FTL schemes. For example, a predominantly random-write dominant I/O trace from an OLTP application running at a large financial institution shows a 78% improvement in average response time (due to a 3-fold reduction in operations of the garbage collector), compared to a state-of-the-art FTL scheme. Even for the well-known read-dominant TPC-H benchmark, for which DFTL introduces additional overheads, we improve system response time by 56%.

832 citations

Journal ArticleDOI
TL;DR: There is much room for performance improvement in the log buffer block scheme, and an enhanced log block buffer scheme, called FAST (full associative sector translation), is proposed, which improves the space utilization of log blocks using fully-associative sector translations for the log block sectors.
Abstract: Flash memory is being rapidly deployed as data storage for mobile devices such as PDAs, MP3 players, mobile phones, and digital cameras, mainly because of its low electronic power, nonvolatile storage, high performance, physical stability, and portability. One disadvantage of flash memory is that prewritten data cannot be dynamically overwritten. Before overwriting prewritten data, a time-consuming erase operation on the used blocks must precede, which significantly degrades the overall write performance of flash memory. In order to solve this “erase-before-write” problem, the flash memory controller can be integrated with a software module, called “flash translation layer (FTL).” Among many FTL schemes available, the log block buffer scheme is considered to be optimum. With this scheme, a small number of log blocks, a kind of write buffer, can improve the performance of write operations by reducing the number of erase operations. However, this scheme can suffer from low space utilization of log blocks. In this paper, we show that there is much room for performance improvement in the log buffer block scheme, and propose an enhanced log block buffer scheme, called FAST (full associative sector translation). Our FAST scheme improves the space utilization of log blocks using fully-associative sector translations for the log block sectors. We also show empirically that our FAST scheme outperforms the pure log block buffer scheme.

682 citations

Journal ArticleDOI
TL;DR: A comprehensive survey of recent text summarization extractive approaches developed in the last decade is presented and the discussion of useful future directions that can help researchers to identify areas where further research is needed are discussed.
Abstract: As information is available in abundance for every topic on internet, condensing the important information in the form of summary would benefit a number of users. Hence, there is growing interest among the research community for developing new approaches to automatically summarize the text. Automatic text summarization system generates a summary, i.e. short length text that includes all the important information of the document. Since the advent of text summarization in 1950s, researchers have been trying to improve techniques for generating summaries so that machine generated summary matches with the human made summary. Summary can be generated through extractive as well as abstractive methods. Abstractive methods are highly complex as they need extensive natural language processing. Therefore, research community is focusing more on extractive summaries, trying to achieve more coherent and meaningful summaries. During a decade, several extractive approaches have been developed for automatic summary generation that implements a number of machine learning and optimization techniques. This paper presents a comprehensive survey of recent text summarization extractive approaches developed in the last decade. Their needs are identified and their advantages and disadvantages are listed in a comparative manner. A few abstractive and multilingual text summarization approaches are also covered. Summary evaluation is another challenging issue in this research field. Therefore, intrinsic as well as extrinsic both the methods of summary evaluation are described in detail along with text summarization evaluation conferences and workshops. Furthermore, evaluation results of extractive summarization approaches are presented on some shared DUC datasets. Finally this paper concludes with the discussion of useful future directions that can help researchers to identify areas where further research is needed.

581 citations

Proceedings ArticleDOI
15 Jun 2009
TL;DR: This study reveals several unanticipated aspects in the performance dynamics of SSD technology that must be addressed by system designers and data-intensive application users in order to effectively place it in the storage hierarchy.
Abstract: Flash Memory based Solid State Drive (SSD) has been called a "pivotal technology" that could revolutionize data storage systems. Since SSD shares a common interface with the traditional hard disk drive (HDD), both physically and logically, an effective integration of SSD into the storage hierarchy is very important. However, details of SSD hardware implementations tend to be hidden behind such narrow interfaces. In fact, since sophisticated algorithms are usually, of necessity, adopted in SSD controller firmware, more complex performance dynamics are to be expected in SSD than in HDD systems. Most existing literature or product specifications on SSD just provide high-level descriptions and standard performance data, such as bandwidth and latency.In order to gain insight into the unique performance characteristics of SSD, we have conducted intensive experiments and measurements on different types of state-of-the-art SSDs, from low-end to high-end products. We have observed several unexpected performance issues and uncertain behavior of SSDs, which have not been reported in the literature. For example, we found that fragmentation could seriously impact performance -- by a factor of over 14 times on a recently announced SSD. Moreover, contrary to the common belief that accesses to SSD are uncorrelated with access patterns, we found a strong correlation between performance and the randomness of data accesses, for both reads and writes. In the worst case, average latency could increase by a factor of 89 and bandwidth could drop to only 0.025MB/sec. Our study reveals several unanticipated aspects in the performance dynamics of SSD technology that must be addressed by system designers and data-intensive application users in order to effectively place it in the storage hierarchy.

529 citations