Author

Dong-U Lee

Bio: Dong-U Lee is an academic researcher from University of California, Los Angeles. The author has contributed to research in topics: Unit in the last place & Random number generation. The author has an h-index of 22 and has co-authored 40 publications receiving 1599 citations. Previous affiliations of Dong-U Lee include Broadcom & University of California.

Papers
Journal ArticleDOI
TL;DR: An automated static approach for optimizing bit widths of fixed-point feedforward designs with guaranteed accuracy, called MiniBit, is presented and is demonstrated with polynomial approximation, RGB-to-YCbCr conversion, matrix multiplication, B-splines, and discrete cosine transform placed and routed on a Xilinx Virtex-4 FPGA.
Abstract: An automated static approach for optimizing bit widths of fixed-point feedforward designs with guaranteed accuracy, called MiniBit, is presented. Methods to minimize both the integer and fraction parts of fixed-point signals with the aim of minimizing the circuit area are described. For range analysis, the technique in this paper identifies the number of integer bits necessary to meet range requirements. For precision analysis, a semianalytical approach with analytical error models in conjunction with adaptive simulated annealing is employed to optimize the number of fraction bits. The analytical models make it possible to guarantee overflow/underflow protection and numerical accuracy for all inputs over the user-specified input intervals. Using a stream compiler for field-programmable gate arrays (FPGAs), the approach in this paper is demonstrated with polynomial approximation, RGB-to-YCbCr conversion, matrix multiplication, B-splines, and discrete cosine transform placed and routed on a Xilinx Virtex-4 FPGA. Improvements for a given design reduce the area and the latency by up to 26% and 12%, respectively, over a design using optimum uniform fraction bit widths. Studies show that MiniBit-optimized designs are within 1% of the area produced from the integer linear programming approach.

226 citations
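
The range-analysis step in the abstract above can be illustrated with a minimal sketch. It only counts the integer bits needed once a signal's interval is already known. The function name `integer_bits` and the signed two's-complement convention (sign bit counted as an integer bit) are assumptions made for illustration; the abstract does not describe how MiniBit itself derives or encodes ranges.

```python
import math


def integer_bits(lo: float, hi: float) -> int:
    """Smallest two's-complement integer bit count (sign bit included)
    whose range [-2^(b-1), 2^(b-1)) contains the interval [lo, hi].

    Hypothetical helper for illustration; not taken from MiniBit.
    """
    if not (math.isfinite(lo) and math.isfinite(hi)) or lo > hi:
        raise ValueError("expected a finite interval with lo <= hi")
    bits = 1  # a lone sign bit covers [-1, 1)
    while not (-(2 ** (bits - 1)) <= lo and hi < 2 ** (bits - 1)):
        bits += 1
    return bits
```

For example, `integer_bits(0, 255)` returns 9 under this signed convention, while `integer_bits(-128, 127)` returns 8.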

Journal ArticleDOI
TL;DR: A hardware Gaussian noise generator based on the Box-Muller method that provides highly accurate noise samples and is currently being used at the Jet Propulsion Laboratory, NASA to evaluate the performance of low-density parity-check codes for deep-space communications.
Abstract: We present a hardware Gaussian noise generator based on the Box-Muller method that provides highly accurate noise samples. The noise generator can be used as a key component in a hardware-based simulation system, such as for exploring channel code behavior at very low bit error rates, as low as 10^-12 to 10^-13. The main novelties of this work are accurate analytical error analysis and bit-width optimization for the elementary functions involved in the Box-Muller method. Two 16-bit noise samples are generated every clock cycle and, due to the accurate error analysis, every sample is analytically guaranteed to be accurate to one unit in the last place. An implementation on a Xilinx Virtex-4 XC4VLX100-12 FPGA occupies 1,452 slices, three block RAMs, and 12 DSP slices, and is capable of generating 750 million samples per second at a clock speed of 375 MHz. The performance can be improved by exploiting concurrent execution: 37 parallel instances of the noise generator at 95 MHz on a Xilinx Virtex-II Pro XC2VP100-7 FPGA generate seven billion samples per second and can run over 200 times faster than the output produced by software running on an Intel Pentium-4 3 GHz PC. The noise generator is currently being used at the Jet Propulsion Laboratory, NASA to evaluate the performance of low-density parity-check codes for deep-space communications.

144 citations
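
For readers unfamiliar with the Box-Muller method named in the abstract above, the following is a floating-point software sketch of the basic transform. It is not the paper's fixed-point hardware design: the function names `box_muller` and `gaussian_samples` are hypothetical, and Python's `random` module stands in for the hardware uniform source. The transform produces two Gaussian samples per pair of uniforms, which matches the abstract's two samples per clock cycle.

```python
import math
import random
from typing import List, Tuple


def box_muller(u1: float, u2: float) -> Tuple[float, float]:
    """Map uniforms u1 in (0, 1] and u2 in [0, 1) to two independent
    standard normal samples using the basic Box-Muller transform."""
    radius = math.sqrt(-2.0 * math.log(u1))  # needs a logarithm and a square root
    angle = 2.0 * math.pi * u2               # needs a sine and a cosine
    return radius * math.cos(angle), radius * math.sin(angle)


def gaussian_samples(n_pairs: int, seed: int = 0) -> List[float]:
    """Generate 2 * n_pairs samples (illustrative software stand-in)."""
    rng = random.Random(seed)
    out: List[float] = []
    for _ in range(n_pairs):
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is finite
        u2 = rng.random()
        out.extend(box_muller(u1, u2))
    return out
```

The logarithm, square root, sine, and cosine calls are the elementary functions whose error analysis and bit widths the abstract refers to.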

Proceedings ArticleDOI
11 Dec 2005
TL;DR: A novel hardware accelerator for Monte Carlo (MC) simulation, based on a generic architecture, which combines speed and flexibility by integrating a pipelined MC core with an on-chip instruction processor is described.
Abstract: This paper describes a novel hardware accelerator for Monte Carlo (MC) simulation, and illustrates its implementation in field programmable gate array (FPGA) technology for speeding up financial applications. Our accelerator is based on a generic architecture, which combines speed and flexibility by integrating a pipelined MC core with an on-chip instruction processor. We develop a generic number system representation for determining the choice of number representation that meets numerical precision requirements. Our approach is then used in a complex financial engineering application, involving the Brace, Gatarek and Musiela (BGM) interest rate model for pricing derivatives. We address, in our BGM model, several challenges including the generation of Gaussian distributed random numbers and pipelining of the MC simulation. Our BGM application, based on an off-the-shelf system with a Xilinx XC2VP30 device at 50 MHz, is over 25 times faster than software running on a 1.5 GHz Intel Pentium machine.

108 citations

Journal ArticleDOI
TL;DR: A quantitative comparison between the energy costs associated with direct transmission of uncompressed images and sensor platform-based JPEG compression followed by transmission of the compressed image data is presented.
Abstract: One of the most important goals of current and future sensor networks is energy-efficient communication of images. This paper presents a quantitative comparison between the energy costs associated with 1) direct transmission of uncompressed images and 2) sensor platform-based JPEG compression followed by transmission of the compressed image data. JPEG compression computations are mapped onto various resource-constrained platforms using a design environment that allows computation using the minimum integer and fractional bit-widths needed in view of other approximations inherent in the compression process and choice of image quality parameters. Advanced applications of JPEG, such as region of interest coding and successive/progressive transmission, are also examined. Detailed experimental results examining the tradeoffs in processor resources, processing/transmission time, bandwidth utilization, image quality, and overall energy consumption are presented.

103 citations

Journal ArticleDOI
TL;DR: A hardware-based Gaussian noise generator used as a key component in a hardware simulation system, for exploring channel code behavior at very low bit error rates (BERs) in the range of 10^-9 to 10^-10.
Abstract: Hardware simulation offers the potential of improving code evaluation speed by orders of magnitude over workstation or PC-based simulation. We describe a hardware-based Gaussian noise generator used as a key component in a hardware simulation system, for exploring channel code behavior at very low bit error rates (BERs) in the range of 10^-9 to 10^-10. The main novelty is the design and use of nonuniform piecewise linear approximations in computing trigonometric and logarithmic functions. The parameters of the approximation are chosen carefully to enable rapid computation of coefficients from the inputs while still retaining high fidelity to the modeled functions. The output of the noise generator accurately models a true Gaussian Probability Density Function (PDF) even at very high σ values. Its properties are explored using: 1) several different statistical tests, including the chi-square test and the Anderson-Darling test, and 2) an application for decoding of low-density parity-check (LDPC) codes. An implementation at 133 MHz on a Xilinx Virtex-II XC2V4000-6 FPGA produces 133 million samples per second, which is seven times faster than a 2.6 GHz Pentium-IV PC; another implementation on a Xilinx Spartan-IIE XC2S300E-7 FPGA at 62 MHz is capable of a three times speedup. The performance can be improved by exploiting parallelism: an XC2V4000-6 FPGA with nine parallel instances of the noise generator at 105 MHz can run 50 times faster than a 2.6 GHz Pentium-IV PC. We illustrate the deterioration of clock speed with the increase in the number of instances.

101 citations
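
The idea of a nonuniform piecewise linear approximation, mentioned in the abstract above, can be sketched as follows. The segmentation used here is an assumption chosen for illustration, not the paper's scheme: segment boundaries are placed at powers of two so that the segment index can be read directly from the input's binary exponent, which is one way coefficients could be located quickly from the input. The names `build_segments` and `approx` are hypothetical.

```python
import math
from typing import Callable, List, Tuple


def build_segments(func: Callable[[float], float],
                   n_segments: int) -> List[Tuple[float, float]]:
    """Fit one chord per segment [2^-(k+1), 2^-k], k = 0..n_segments-1.

    Segments shrink toward zero, where functions such as log change fastest.
    Returns (slope, intercept) pairs indexed by k.
    """
    segments = []
    for k in range(n_segments):
        x0, x1 = 2.0 ** -(k + 1), 2.0 ** -k
        slope = (func(x1) - func(x0)) / (x1 - x0)
        segments.append((slope, func(x0) - slope * x0))
    return segments


def approx(segments: List[Tuple[float, float]], x: float) -> float:
    """Evaluate the approximation for 2^-n_segments <= x <= 1."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # math.frexp gives x = m * 2**e with 0.5 <= m < 1, so the segment index is -e.
    _, exponent = math.frexp(x)
    k = min(max(-exponent, 0), len(segments) - 1)
    slope, intercept = segments[k]
    return slope * x + intercept
```

With `segments = build_segments(math.log, 10)`, `approx(segments, 0.3)` approximates `math.log(0.3)` using a single multiply and add after the table lookup. A hardware design would use more segments or tighter fits to reach the accuracy it needs.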


Cited by
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Book
02 Nov 2007
TL;DR: This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology.
Abstract: The main characteristic of Reconfigurable Computing is the presence of hardware that can be reconfigured to implement specific functionality more suitable for specially tailored hardware than on a simple uniprocessor. Reconfigurable computing systems join microprocessors and programmable hardware in order to take advantage of the combined strengths of hardware and software and have been used in applications ranging from embedded systems to high performance computing. Many of the fundamental theories have been identified and used by the Hardware/Software Co-Design research field. Although the same background ideas are shared in both areas, they have different goals and use different approaches. This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology. It will take a reader with a background in the basics of digital design and software programming and provide them with the knowledge needed to be an effective designer or researcher in this rapidly evolving field. · Treatment of FPGAs as computing vehicles rather than glue-logic or ASIC substitutes · Views of FPGA programming beyond Verilog/VHDL · Broad set of case studies demonstrating how to use FPGAs in novel and efficient ways

531 citations

01 Jan 2010
TL;DR: This journal special section will cover recent progress on parallel CAD research, including algorithm foundations, programming models, parallel architectural-specific optimization, and verification, as well as other topics relevant to the design of parallel CAD algorithms and software tools.
Abstract: High-performance parallel computer architecture and systems have been improved at a phenomenal rate. In the meantime, VLSI computer-aided design (CAD) software for multibillion-transistor IC design has become increasingly complex and requires prohibitively high computational resources. Recent studies have shown that numerous CAD problems, with their high computational complexity, can greatly benefit from the fast-increasing parallel computation capabilities. However, parallel programming imposes big challenges for CAD applications. Fully exploiting the computational power of emerging general-purpose and domain-specific multicore/many-core processor systems calls for fundamental research and engineering practice across every stage of parallel CAD design, from algorithm exploration, programming models, design-time and run-time environment, to CAD applications, such as verification, optimization, and simulation. This journal special section will cover recent progress on parallel CAD research, including algorithm foundations, programming models, parallel architectural-specific optimization, and verification. More specifically, papers with in-depth and extensive coverage of the following topics will be considered, as well as other topics relevant to the design of parallel CAD algorithms and software tools. 1. Parallel algorithm design and specification for CAD applications 2. Parallel programming models and languages of particular use in CAD 3. Runtime support and performance optimization for CAD applications 4. Parallel architecture-specific design and optimization for CAD applications 5. Parallel program debugging and verification techniques particularly relevant for CAD The papers should be submitted via the Manuscript Central website and should adhere to standard ACM TODAES formatting requirements (http://todaes.acm.org/). The page count limit is 25.

459 citations

Journal ArticleDOI
25 Jul 2005
TL;DR: It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.
Abstract: Reconfigurable computing is becoming increasingly attractive for many applications. This survey covers two aspects of reconfigurable computing: architectures and design methods. The paper includes recent advances in reconfigurable architectures, such as the Altera Stratix II and Xilinx Virtex-4 FPGA devices. The authors identify major trends in general-purpose and special-purpose design methods. It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.

414 citations