Wayne Luk

Researcher at Imperial College London

Publications -  737
Citations -  13643

Wayne Luk is an academic researcher at Imperial College London. The author has contributed to research on topics including field-programmable gate arrays and reconfigurable computing. The author has an h-index of 54 and has co-authored 703 publications receiving 12517 citations. Previous affiliations of Wayne Luk include Fudan University and the University of London.

Papers
Journal Article

Performance Tuning and Analysis for Stencil-Based Applications on POWER8 Processor

TL;DR: This article demonstrates an approach for combining general tuning techniques with the POWER8 hardware architecture through optimizing three representative stencil benchmarks, and provides useful guidance for optimizing stencil-based scientific applications on POWER systems.
Proceedings Article

ADAM: Automated Design Analysis and Merging for Speeding up FPGA Development

TL;DR: This paper introduces ADAM, an approach for merging multiple FPGA designs into a single hardware design, so that multiple place-and-route tasks can be replaced by a single task. This speeds up functional evaluation of designs, especially during development.
Proceedings Article

A Heterogeneous Computing Framework for Computational Finance

TL;DR: The Forward Financial Framework allows the computational finance problem specification to be captured precisely yet succinctly, then automatically creates efficient implementations for heterogeneous platforms, utilising both multi-core CPUs and FPGAs.
Proceedings Article

Pipelined Genetic Propagation

TL;DR: Pipelined Genetic Propagation (PGP) is a new hardware-oriented approach to genetic algorithms (GAs) that is intrinsically distributed and pipelined. This allows the solution to be scaled to the available resources and the topology to be changed dynamically at run time to explore different solution strategies.
Proceedings Article

Optimizing residue arithmetic on FPGAs

TL;DR: An extensive comparison between the residue number system (RNS) and other number representations, at both the arithmetic unit level and the application level, shows that for applications involving a large number of multiplications, RNS designs can reduce DSP48 usage by up to one half at large bit-width settings.