Home
/
Authors
/
Dong-U Lee

Author

Dong-U Lee

Other affiliations: Broadcom, University of California, Imperial College London

Bio: Dong-U Lee is an academic researcher from University of California, Los Angeles. The author has contributed to research in topics: Unit in the last place & Random number generation. The author has an hindex of 22, co-authored 40 publications receiving 1599 citations. Previous affiliations of Dong-U Lee include Broadcom & University of California.

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Accuracy-Guaranteed Bit-Width Optimization

[...]

Dong-U Lee¹, A.A. Gaffar², Ray C. C. Cheung², Oskar Mencer², Wayne Luk², George A. Constantinides² - Show less +2 more•Institutions (2)

University of California, Los Angeles¹, Imperial College London²

01 Oct 2006-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: An automated static approach for optimizing bit widths of fixed-point feedforward designs with guaranteed accuracy, called MiniBit, is presented and is demonstrated with polynomial approximation, RGB-to-YCbCr conversion, matrix multiplication, B-splines, and discrete cosine transform placed and routed on a Xilinx Virtex-4 FPGA.

...read moreread less

Abstract: An automated static approach for optimizing bit widths of fixed-point feedforward designs with guaranteed accuracy, called MiniBit, is presented. Methods to minimize both the integer and fraction parts of fixed-point signals with the aim of minimizing the circuit area are described. For range analysis, the technique in this paper identifies the number of integer bits necessary to meet range requirements. For precision analysis, a semianalytical approach with analytical error models in conjunction with adaptive simulated annealing is employed to optimize the number of fraction bits. The analytical models make it possible to guarantee overflow/underflow protection and numerical accuracy for all inputs over the user-specified input intervals. Using a stream compiler for field-programmable gate arrays (FPGAs), the approach in this paper is demonstrated with polynomial approximation, RGB-to-YCbCr conversion, matrix multiplication, B-splines, and discrete cosine transform placed and routed on a Xilinx Virtex-4 FPGA. Improvements for a given design reduce the area and the latency by up to 26% and 12%, respectively, over a design using optimum uniform fraction bit widths. Studies show that MiniBit-optimized designs are within 1% of the area produced from the integer linear programming approach

...read moreread less

226 citations

Journal Article•DOI•

A hardware Gaussian noise generator using the Box-Muller method and its error analysis

[...]

Dong-U Lee¹, John Villasenor¹, Wayne Luk, Philip H. W. Leong•Institutions (1)

University of California, Los Angeles¹

01 Jun 2006-IEEE Transactions on Computers

TL;DR: A hardware Gaussian noise generator based on the Box-Muller method that provides highly accurate noise samples and is currently being used at the Jet Propulsion Laboratory, NASA to evaluate the performance of low-density parity-check codes for deep-space communications.

...read moreread less

Abstract: We present a hardware Gaussian noise generator based on the Box-Muller method that provides highly accurate noise samples. The noise generator can be used as a key component in a hardware-based simulation system, such as for exploring channel code behavior at very low bit error rates, as low as 10-12 to 10-13. The main novelties of this work are accurate analytical error analysis and bit-width optimization for the elementary functions involved in the Box-Muller method. Two 16-bit noise samples are generated every clock cycle and, due to the accurate error analysis, every sample is analytically guaranteed to be accurate to one unit in the last place. An implementation on a Xilinx Virtex-4 XC4VLX100-12 FPGA occupies 1,452 slices, three block RAMs, and 12 DSP slices, and is capable of generating 750 million samples per second at a clock speed of 375 MHz. The performance can be improved by exploiting concurrent execution: 37 parallel instances of the noise generator at 95 MHz on a Xilinx Virtex-II Pro XC2VP100-7 FPGA generate seven billion samples per second and can run over 200 times faster than the output produced by software running on an Intel Pentium-4 3 GHz PC. The noise generator is currently being used at the Jet Propulsion Laboratory, NASA to evaluate the performance of low-density parity-check codes for deep-space communications

...read moreread less

144 citations

Proceedings Article•DOI•

Reconfigurable acceleration for Monte Carlo based financial simulation

[...]

Guanglie Zhang¹, Philip H. W. Leong¹, C.H. Ho¹, K.H. Tsoi¹, C.C.C. Cheung, Dong-U Lee, Ray C. C. Cheung, Wayne Luk - Show less +4 more•Institutions (1)

The Chinese University of Hong Kong¹

11 Dec 2005

TL;DR: A novel hardware accelerator for Monte Carlo (MC) simulation, based on a generic architecture, which combines speed and flexibility by integrating a pipelined MC core with an on-chip instruction processor is described.

...read moreread less

Abstract: This paper describes a novel hardware accelerator for Monte Carlo (MC) simulation, and illustrates its implementation in field programmable gate array (FPGA) technology for speeding up financial applications. Our accelerator is based on a generic architecture, which combines speed and flexibility by integrating a pipelined MC core with an on-chip instruction processor. We develop a generic number system representation for determining the choice of number representation that meets numerical precision requirements. Our approach is then used in a complex financial engineering application, involving the Brace, Gatarek and Musiela (BGM) interest rate model for pricing derivatives. We address, in our BGM model, several challenges including the generation of Gaussian distributed random numbers and pipelining of the MC simulation. Our BGM application, based on an off-the-shelf system with a Xilinx XC2VP30 device at 50 MHz, is over 25 times faster than software running on a 1.5 GHz, Intel Pentium machine

...read moreread less

108 citations

Journal Article•DOI•

Energy-Efficient Image Compression for Resource-Constrained Platforms

[...]

Dong-U Lee¹, Hyungjin Kim¹, Mohammad Rahimi¹, Deborah Estrin¹, J.D. Villasenor¹ - Show less +1 more•Institutions (1)

University of California, Los Angeles¹

01 Sep 2009-IEEE Transactions on Image Processing

TL;DR: A quantitative comparison between the energy costs associated with direct transmission of uncompressed images and sensor platform-based JPEG compression followed by transmission of the compressed image data is presented.

...read moreread less

Abstract: One of the most important goals of current and future sensor networks is energy-efficient communication of images. This paper presents a quantitative comparison between the energy costs associated with 1) direct transmission of uncompressed images and 2) sensor platform-based JPEG compression followed by transmission of the compressed image data. JPEG compression computations are mapped onto various resource-constrained platforms using a design environment that allows computation using the minimum integer and fractional bit-widths needed in view of other approximations inherent in the compression process and choice of image quality parameters. Advanced applications of JPEG, such as region of interest coding and successive/progressive transmission, are also examined. Detailed experimental results examining the tradeoffs in processor resources, processing/transmission time, bandwidth utilization, image quality, and overall energy consumption are presented.

...read moreread less

103 citations

Journal Article•DOI•

A Gaussian noise generator for hardware-based simulations

[...]

Dong-U Lee¹, Wayne Luk¹, John D. Villasenor, Peter Y. K. Cheung•Institutions (1)

Imperial College London¹

01 Dec 2004-IEEE Transactions on Computers

TL;DR: A hardware-based Gaussian noise generator used as a key component in a hardware simulation system, for exploring channel code behavior at very low bit error rates (BERs) in the range of 10/sup -9/ to 10/Sup -10/.

...read moreread less

Abstract: Hardware simulation offers the potential of improving code evaluation speed by orders of magnitude over workstation or PC-based simulation. We describe a hardware-based Gaussian noise generator used as a key component in a hardware simulation system, for exploring channel code behavior at very low bit error rates (BERs) in the range of 10/sup -9/ to 10/sup -10/. The main novelty is the design and use of nonuniform piecewise linear approximations in computing trigonometric and logarithmic functions. The parameters of the approximation are chosen carefully to enable rapid computation of coefficients from the inputs while still retaining high fidelity to the modeled functions. The output of the noise generator accurately models a true Gaussian Probability Density Function (PDF) even at very high /spl sigma/ values. Its properties are explored using: 1) several different statistical tests, including the chi-square test and the Anderson-Darling test, and 2) an application for decoding of low-density parity-check (LDPC) codes. An implementation at 133 MHz on a Xilinx Virtex-II XC2V4000-6 FPGA produces 133 million samples per second, which is seven times faster than a 2.6 GHz Pentium-IV PC; another implementation on a Xilinx Spartan-IIE XC2S300E-7 FPGA at 62 MHz is capable of a three times speedup. The performance can be improved by exploiting parallelism: an XC2V4000-6 FPGA with nine parallel instances of the noise generator at 105 MHz can run 50 times faster than a 2.6 GHz Pentium-IV PC. We illustrate the deterioration of clock speed with the increase in the number of instances.

...read moreread less

101 citations

1
2
3
4
…
5
6
7
8

Collapse

Cited by

PDF

Open Access

More filters

Computer vision : a modern approach = 计算机视觉 : 一种现代的方法

[...]

David Forsyth, Jean Ponce

01 Jan 2004

TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.

...read moreread less

Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

...read moreread less

3,627 citations

Book•

IEEE transactions on computer-aided design of integrated circuits and systems : a publication of the IEEE Circuits and Systems Society

[...]

Ieee Circuits

01 Jan 1982

729 citations

Book•

Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation

[...]

Scott Hauck, André DeHon

02 Nov 2007

TL;DR: This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology.

...read moreread less

Abstract: The main characteristic of Reconfigurable Computing is the presence of hardware that can be reconfigured to implement specific functionality more suitable for specially tailored hardware than on a simple uniprocessor. Reconfigurable computing systems join microprocessors and programmable hardware in order to take advantage of the combined strengths of hardware and software and have been used in applications ranging from embedded systems to high performance computing. Many of the fundamental theories have been identified and used by the Hardware/Software Co-Design research field. Although the same background ideas are shared in both areas, they have different goals and use different approaches.This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology. It will take a reader with a background in the basics of digital design and software programming and provide them with the knowledge needed to be an effective designer or researcher in this rapidly evolving field. · Treatment of FPGAs as computing vehicles rather than glue-logic or ASIC substitutes · Views of FPGA programming beyond Verilog/VHDL · Broad set of case studies demonstrating how to use FPGAs in novel and efficient ways

...read moreread less

531 citations

Parallel CAD: Algorithm Design and Programming Special Section Call for Papers TODAES: ACM Transactions on Design Automation of Electronic Systems

[...]

Kurt Keutzer, Peng Li, Li Shang, Hai Zhou

01 Jan 2010

TL;DR: This journal special section will cover recent progress on parallel CAD research, including algorithm foundations, programming models, parallel architectural-specific optimization, and verification, as well as other topics relevant to the design of parallel CAD algorithms and software tools.

...read moreread less

Abstract: High-performance parallel computer architecture and systems have been improved at a phenomenal rate. In the meantime, VLSI computer-aided design (CAD) software for multibillion-transistor IC design has become increasingly complex and requires prohibitively high computational resources. Recent studies have shown that, numerous CAD problems, with their high computational complexity, can greatly benefit from the fast-increasing parallel computation capabilities. However, parallel programming imposes big challenges for CAD applications. Fully exploiting the computational power of emerging general-purpose and domain-specific multicore/many-core processor systems, calls for fundamental research and engineering practice across every stage of parallel CAD design, from algorithm exploration, programming models, design-time and run-time environment, to CAD applications, such as verification, optimization, and simulation. This journal special section will cover recent progress on parallel CAD research, including algorithm foundations, programming models, parallel architectural-specific optimization, and verification. More specifically, papers with in-depth and extensive coverage of the following topics will be considered, as well as other topics relevant to the design of parallel CAD algorithms and software tools. 1. Parallel algorithm design and specification for CAD applications 2. Parallel programming models and languages of particular use in CAD 3. Runtime support and performance optimization for CAD applications 4. Parallel architecture-specific design and optimization for CAD applications 5. Parallel program debugging and verification techniques particularly relevant for CAD The papers should be submitted via the Manuscript Central website and should adhere to standard ACM TODAES formatting requirements (http://todaes.acm.org/). The page count limit is 25.

...read moreread less

459 citations

Journal Article•DOI•

Reconfigurable computing: architectures and design methods

[...]

Tim Todman¹, George A. Constantinides¹, Steven J. E. Wilton², Oskar Mencer¹, Wayne Luk¹, Peter Y. K. Cheung¹ - Show less +2 more•Institutions (2)

Imperial College London¹, University of British Columbia²

25 Jul 2005

TL;DR: It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.

...read moreread less

Abstract: Reconfigurable computing is becoming increasingly attractive for many applications. This survey covers two aspects of reconfigurable computing: architectures and design methods. The paper includes recent advances in reconfigurable architectures, such as the Alters Stratix II and Xilinx Virtex 4 FPGA devices. The authors identify major trends in general-purpose and special-purpose design methods. It is shown that reconfigurable computing designs are capable of achieving up to 500 times speedup and 70% energy savings over microprocessor implementations for specific applications.

...read moreread less

414 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse