scispace - formally typeset
Search or ask a question
Author

Koen Bertels

Bio: Koen Bertels is an academic researcher from Delft University of Technology. The author has contributed to research in topics: Speedup & Reconfigurable computing. The author has an hindex of 32, co-authored 278 publications receiving 4890 citations. Previous affiliations of Koen Bertels include Katholieke Universiteit Leuven & Université de Namur.


Papers
More filters
Journal ArticleDOI
TL;DR: A microarchitecture based on reconfigurable hardware emulation to allow high-speed reconfiguration and execution of the processor and to prove the viability of the proposal, the proposal was experimented with the MPEG-2 encoder and decoder and a Xilinx Virtex II Pro FPGA.
Abstract: In this paper, we present a polymorphic processor paradigm incorporating both general-purpose and custom computing processing. The proposal incorporates an arbitrary number of programmable units, exposes the hardware to the programmers/designers, and allows them to modify and extend the processor functionality at will. To achieve the previously stated attributes, we present a new programming paradigm, a new instruction set architecture, a microcode-based microarchitecture, and a compiler methodology. The programming paradigm, in contrast with the conventional programming paradigms, allows general-purpose conventional code and hardware descriptions to coexist in a program: In our proposal, for a given instruction set architecture, a onetime instruction set extension of eight instructions, is sufficient to implement the reconfigurable functionality of the processor. We propose a microarchitecture based on reconfigurable hardware emulation to allow high-speed reconfiguration and execution. To prove the viability of the proposal, we experimented with the MPEG-2 encoder and decoder and a Xilinx Virtex II Pro FPGA. We have implemented three operations, SAD, DCT, and IDCT. The overall attainable application speedup for the MPEG-2 encoder and decoder is between 2.64-3.18 and between 1.56-1.94, respectively, representing between 93 percent and 98 percent of the theoretically obtainable speedups.

436 citations

Journal ArticleDOI
TL;DR: This work uses a first-published methodology to compare one commercial and three academic tools on a common set of C benchmarks, aiming at performing an in-depth evaluation in terms of performance and the use of resources.
Abstract: High-level synthesis (HLS) is increasingly popular for the design of high-performance and energy-efficient heterogeneous systems, shortening time-to-market and addressing today’s system complexity. HLS allows designers to work at a higher-level of abstraction by using a software program to specify the hardware functionality. Additionally, HLS is particularly interesting for designing field-programmable gate array circuits, where hardware implementations can be easily refined and replaced in the target device. Recent years have seen much activity in the HLS research community, with a plethora of HLS tool offerings, from both industry and academia. All these tools may have different input languages, perform different internal optimizations, and produce results of different quality, even for the very same input description. Hence, it is challenging to compare their performance and understand which is the best for the hardware to be implemented. We present a comprehensive analysis of recent HLS tools, as well as overview the areas of active interest in the HLS research community. We also present a first-published methodology to evaluate different HLS tools. We use our methodology to compare one commercial and three academic tools on a common set of C benchmarks, aiming at performing an in-depth evaluation in terms of performance and the use of resources.

433 citations

Journal ArticleDOI
TL;DR: In this paper, a scalable scheme for executing the error-correction cycle of a monolithic surface-code fabric composed of fast-flux-tunable transmon qubits with nearest-neighbor coupling is presented.
Abstract: We present a scalable scheme for executing the error-correction cycle of a monolithic surface-code fabric composed of fast-flux-tunable transmon qubits with nearest-neighbor coupling. An eight-qubit unit cell forms the basis for repeating both the quantum hardware and coherent control, enabling spatial multiplexing. This control uses three fixed frequencies for all single-qubit gates and a unique frequency-detuning pattern for each qubit in the cell. By pipelining the interaction and readout steps of ancilla-based X- and Z-type stabilizer measurements, we can engineer detuning patterns that avoid all second-order transmon-transmon interactions except those exploited in controlled-phase gates, regardless of fabric size. Our scheme is applicable to defect-based and planar logical qubits, including lattice surgery.

179 citations

Proceedings ArticleDOI
09 Mar 2015
TL;DR: The paper first highlights some challenges of the new born Big Data paradigm and shows that the increase of the data size has already surpassed the capabilities of today's computation architectures suffering from the limited bandwidth, programmability overhead, energy inefficiency, and limited scalability.
Abstract: One of the most critical challenges for today's and future data-intensive and big-data problems is data storage and analysis. This paper first highlights some challenges of the new born Big Data paradigm and shows that the increase of the data size has already surpassed the capabilities of today's computation architectures suffering from the limited bandwidth, programmability overhead, energy inefficiency, and limited scalability. Thereafter, the paper introduces a new memristor-based architecture for data-intensive applications. The potential of such an architecture in solving data-intensive problems is illustrated by showing its capability to increase the computation efficiency, solving the communication bottleneck, reducing the leakage currents, etc. Finally, the paper discusses why memristor technology is very suitable for the realization of such an architecture; using memristors to implement dual functions (storage and logic) is illustrated.

159 citations

Proceedings ArticleDOI
12 Nov 2007
TL;DR: The carried experiments on the MOLEN polymorphic processor prototype suggest overall application speedups between 1.4x and 6.8x, corresponding to 13% to 94% of the theoretically achievable maximums, constituted by Amdahl's law.
Abstract: In this paper, we present the DWARV C-to-VHDL generation toolset. The toolset provides support for broad range of application domains. It exploits the operation parallelism, available in the algorithms. Our designs are generated with a view of actual hardware/software co-execution on a real hardware platform. The carried experiments on the MOLEN polymorphic processor prototype suggest overall application speedups between 1.4x and 6.8x, corresponding to 13% to 94% of the theoretically achievable maximums, constituted by Amdahl's law.

115 citations


Cited by
More filters
01 Jan 2016
TL;DR: The modern applied statistics with s is universally compatible with any devices to read, and is available in the digital library an online access to it is set as public so you can download it instantly.
Abstract: Thank you very much for downloading modern applied statistics with s. As you may know, people have search hundreds times for their favorite readings like this modern applied statistics with s, but end up in harmful downloads. Rather than reading a good book with a cup of coffee in the afternoon, instead they cope with some harmful virus inside their laptop. modern applied statistics with s is available in our digital library an online access to it is set as public so you can download it instantly. Our digital library saves in multiple countries, allowing you to get the most less latency time to download any of our books like this one. Kindly say, the modern applied statistics with s is universally compatible with any devices to read.

5,249 citations

Journal ArticleDOI
TL;DR: This article reviews in a selective way the recent research on the interface between machine learning and the physical sciences, including conceptual developments in ML motivated by physical insights, applications of machine learning techniques to several domains in physics, and cross fertilization between the two fields.
Abstract: Machine learning (ML) encompasses a broad range of algorithms and modeling tools used for a vast array of data processing tasks, which has entered most scientific disciplines in recent years. This article reviews in a selective way the recent research on the interface between machine learning and the physical sciences. This includes conceptual developments in ML motivated by physical insights, applications of machine learning techniques to several domains in physics, and cross fertilization between the two fields. After giving a basic notion of machine learning methods and principles, examples are described of how statistical physics is used to understand methods in ML. This review then describes applications of ML methods in particle physics and cosmology, quantum many-body physics, quantum computing, and chemical and material physics. Research and development into novel computing architectures aimed at accelerating ML are also highlighted. Each of the sections describe recent successes as well as domain-specific methodology and challenges.

1,504 citations

Journal ArticleDOI
TL;DR: In this paper, the authors provide an introductory guide to the central concepts and challenges in the rapidly accelerating field of superconducting quantum circuits, including qubit design, noise properties, qubit control and readout techniques.
Abstract: The aim of this review is to provide quantum engineers with an introductory guide to the central concepts and challenges in the rapidly accelerating field of superconducting quantum circuits. Over the past twenty years, the field has matured from a predominantly basic research endeavor to a one that increasingly explores the engineering of larger-scale superconducting quantum systems. Here, we review several foundational elements—qubit design, noise properties, qubit control, and readout techniques—developed during this period, bridging fundamental concepts in circuit quantum electrodynamics and contemporary, state-of-the-art applications in gate-model quantum computation.

969 citations

Journal ArticleDOI
TL;DR: The time is ripe for describing some of the recent development of superconducting devices, systems and applications as well as practical applications of QIP, such as computation and simulation in Physics and Chemistry.
Abstract: During the last ten years, superconducting circuits have passed from being interesting physical devices to becoming contenders for near-future useful and scalable quantum information processing (QIP). Advanced quantum simulation experiments have been shown with up to nine qubits, while a demonstration of quantum supremacy with fifty qubits is anticipated in just a few years. Quantum supremacy means that the quantum system can no longer be simulated by the most powerful classical supercomputers. Integrated classical-quantum computing systems are already emerging that can be used for software development and experimentation, even via web interfaces. Therefore, the time is ripe for describing some of the recent development of superconducting devices, systems and applications. As such, the discussion of superconducting qubits and circuits is limited to devices that are proven useful for current or near future applications. Consequently, the centre of interest is the practical applications of QIP, such as computation and simulation in Physics and Chemistry.

809 citations