Author

Christophe Desmouliers

Bio: Christophe Desmouliers is an academic researcher from Illinois Institute of Technology. The author has contributed to research in topics: Field-programmable gate array & Discrete wavelet transform. The author has an h-index of 5 and has co-authored 10 publications receiving 71 citations.

Papers
Journal ArticleDOI
TL;DR: This paper addresses the challenges of System-on-Chip design using High-Level Synthesis (HLS) tools. A Fast Fourier Transform implementation in ANSI C is examined to explore important design issues such as concurrency, data recurrences and memory accesses that need to be resolved before generating the hardware.
Abstract: This paper addresses the challenges of System-on-Chip designs using High-Level Synthesis (HLS). HLS tools convert algorithms designed in C into hardware modules. This approach is a practical choice for developing complex applications. Nevertheless, certain hardware considerations are required when writing C applications for HLS tools. Hence, in order to demonstrate the fundamental hardware design concepts, a case study is presented. A Fast Fourier Transform (FFT) implementation in ANSI C is examined in order to explore the important design issues such as concurrency, data recurrences and memory accesses that need to be resolved before generating the hardware using HLS tools. There are additional language constraints that need to be addressed, including the use of pointers, recursion and floating-point types.

17 citations
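
The FFT case study above names recursion, data recurrences and memory accesses as issues to resolve before synthesis. As a rough illustration only, the sketch below shows a recursion-free radix-2 FFT whose loops have fixed bounds for a given length. It is written in Python to show the loop structure, not in the ANSI C used in the paper, and it is not the authors' code. The function name `fft_iterative` is hypothetical.

```python
import cmath


def fft_iterative(x):
    """Radix-2 decimation-in-time FFT written without recursion.

    Illustrative sketch only: the name and structure are assumptions,
    not taken from the paper.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]

    # Bit-reversal permutation replaces the recursive even/odd split.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # log2(n) butterfly stages; each stage reads and writes the same buffer.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w
                a[start + k] = u + v
                a[start + k + half] = u - v
                # Loop-carried update: each twiddle depends on the previous one.
                w *= w_step
        size *= 2
    return a
```

The running twiddle factor `w` is a small example of a data recurrence, since each iteration depends on the previous one. A hardware-oriented version might instead read twiddles from a precomputed table and use fixed-point arithmetic. Both of these are assumptions about one possible approach, not claims about the paper's design.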

Proceedings ArticleDOI
20 May 2010
TL;DR: Several examples of video processing applications, such as a Canny edge detector, motion detector and object tracking that have been realized using IVPP for real-time video processing are presented.
Abstract: The objective of this work is to design and implement an Image and Video Processing Platform (IVPP) on FPGAs using PICO-based HLS. This hardware/software codesign platform has been implemented on a Xilinx Virtex-5 FPGA. The video interface blocks are done in RTL and the initialization phase is done using a MicroBlaze processor, allowing the support of multiple video resolutions. This paper discusses the architectural building blocks, showing the flexibility of the proposed platform. This flexibility is achieved by using a new design flow based on PICO. IVPP allows custom processing blocks to be plugged into the platform architecture without modifying the front-end (capturing video data) and back-end (displaying processed output). This paper presents several examples of video processing applications, such as a Canny edge detector, motion detector and object tracking, that have been realized using IVPP for real-time video processing.

14 citations

Journal ArticleDOI
TL;DR: The IVPP is implemented on a Xilinx Virtex-5 FPGA using high-level synthesis and can be used to realise and test complex algorithms for real-time image and video processing applications.
Abstract: In this study, an image and video processing platform (IVPP) based on field-programmable gate arrays (FPGAs) is presented. This hardware/software co-design platform has been implemented on a Xilinx Virtex-5 FPGA using high-level synthesis and can be used to realise and test complex algorithms for real-time image and video processing applications. The video interface blocks are written in a register transfer language and can be configured using the MicroBlaze processor, allowing the support of multiple video resolutions. The IVPP provides the required logic to easily plug in the generated processing blocks without modifying the front-end (capturing video data) and the back-end (displaying processed output data). The IVPP can be a complete hardware solution for a broad range of real-time image/video processing applications including video encoding/decoding, surveillance, detection and recognition.

11 citations

Journal ArticleDOI
TL;DR: Three different hardware architectures for implementing multiple wavelet kernels for discrete wavelet transform are presented and FPGA synthesis results for simultaneous implementation of six different wavelets for the proposed methods are presented.
Abstract: Designing a universal embedded hardware architecture for discrete wavelet transform is a challenging problem because of the diversity among wavelet kernel filters. In this work, the authors present three different hardware architectures for implementing multiple wavelet kernels. The first scheme utilises fixed, parallel hardware for all the required wavelet kernels, whereas the second scheme employs a processing element (PE)-based datapath that can be configured for multiple wavelet filters during run-time. The third scheme makes use of partial run-time configuration of FPGA units for dynamically programming any desired wavelet filter. As a case study, the authors present FPGA synthesis results for simultaneous implementation of six different wavelets for the proposed methods. Performance analysis and comparison of area, timing and power results are presented for the Virtex-II Pro FPGA implementations.

11 citations
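
The diversity among wavelet kernels mentioned above comes down largely to different filter coefficients and filter lengths. The sketch below is a software illustration of that idea only: one analysis routine runs with a selectable kernel, loosely analogous to a datapath configured at run-time. It is not any of the three architectures from the paper. The names `KERNELS`, `highpass_from_lowpass` and `dwt_level`, the choice of the Haar and Daubechies-4-tap (db2) kernels, and the periodic boundary extension are all assumptions made for this example.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)

# Low-pass analysis coefficients for two orthogonal wavelet kernels.
KERNELS = {
    "haar": [1 / SQRT2, 1 / SQRT2],
    "db2": [
        (1 + SQRT3) / (4 * SQRT2),
        (3 + SQRT3) / (4 * SQRT2),
        (3 - SQRT3) / (4 * SQRT2),
        (1 - SQRT3) / (4 * SQRT2),
    ],
}


def highpass_from_lowpass(h):
    """Derive the high-pass filter by the alternating-flip relation."""
    n = len(h)
    return [(-1) ** k * h[n - 1 - k] for k in range(n)]


def dwt_level(signal, kernel_name):
    """One DWT level: filter, then keep every second output sample.

    Samples beyond the end of the signal wrap around (periodic extension).
    """
    h = KERNELS[kernel_name]
    g = highpass_from_lowpass(h)
    n = len(signal)
    if n == 0 or n % 2:
        raise ValueError("signal length must be even and non-zero")
    approx, detail = [], []
    for i in range(0, n, 2):
        approx.append(sum(h[k] * signal[(i + k) % n] for k in range(len(h))))
        detail.append(sum(g[k] * signal[(i + k) % n] for k in range(len(g))))
    return approx, detail
```

Calling `dwt_level(samples, "haar")` and `dwt_level(samples, "db2")` on the same input runs the same loop with different coefficient sets. This is the software counterpart of selecting a kernel rather than building separate logic for each one.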

Proceedings ArticleDOI
01 Sep 2009
TL;DR: A fast and scalable data compression System-on-Chip (SoC) architecture based on Discrete Wavelet Transform (DWT) is proposed that can process A-Scan, B-Scan and C-Scan signals and images in real-time and reduce the data and bandwidth requirements substantially without degrading the signal fidelity.
Abstract: Ultrasonic 3D imaging is an important tool in NDE applications for quality control, flaw detection, and material characterization. However, ultrasonic 3D images often encompass immense amounts of data, making it very challenging for volumetric image analysis, transmission and storage. In this study, a fast and scalable data compression System-on-Chip (SoC) architecture based on Discrete Wavelet Transform (DWT) is proposed. This compression SoC can process A-Scan, B-Scan and C-Scan signals and images in real-time and reduce the data and bandwidth requirements substantially without degrading the signal fidelity. A volumetric image of 128x128x2048 samples is compressed by 96.9% in less than one second by the proposed compression system implemented on a Virtex-5 FPGA.

8 citations
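
The compression figure quoted above can be put into concrete numbers with a short calculation. The sketch below assumes that "compressed by 96.9%" means the data size is reduced by 96.9%, so that 3.1% remains. The abstract does not give the sample bit width, so the result is expressed in samples rather than bytes. The function name `compression_summary` is hypothetical.

```python
def compression_summary(dims, reduction_percent):
    """Return (total samples, retained sample-equivalents, compression ratio).

    Assumes reduction_percent is the fraction of data removed, in percent.
    """
    total = 1
    for d in dims:
        total *= d
    retained_fraction = 1.0 - reduction_percent / 100.0
    retained = total * retained_fraction
    ratio = 1.0 / retained_fraction
    return total, retained, ratio


total, retained, ratio = compression_summary((128, 128, 2048), 96.9)
print(f"total samples: {total}")
print(f"retained (sample-equivalents): {retained:.0f}")
print(f"compression ratio: {ratio:.1f}:1")
```

Under that reading, the 128 × 128 × 2048 volume holds 33,554,432 samples, and the reported reduction corresponds to a compression ratio of roughly 32:1.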


Cited by
Proceedings ArticleDOI
21 Feb 2016
TL;DR: A case study using HLS for a full H.264 decoder, an application with over 6000 lines of code and over 100 functions, is presented, sharing experience on code conversion for synthesizability, various HLS optimizations, HLS limitations while dealing with complex input code, and general design insights.
Abstract: High level synthesis (HLS) is gaining wider acceptance for hardware design due to its higher productivity and better design space exploration features. In recent years, HLS techniques and design flows have also advanced significantly, and as a result, many new FPGA designs are developed with HLS. However, despite many studies using HLS, the size and complexity of such applications remain generally small, and it is not well understood how to design and optimize for HLS with large, complex reference code. Typical HLS benchmark applications contain somewhere between 100 and 1400 lines of code and about 20 sub-functions, but typical input applications may contain many times more code and functions. To study such complex applications, we present a case study using HLS for a full H.264 decoder: an application with over 6000 lines of code and over 100 functions. We share our experience on code conversion for synthesizability, various HLS optimizations, HLS limitations while dealing with complex input code, and general design insights. Through our optimization process, we achieve 34 frames/s at 640x480 resolution (480p). To enable future study and benefit the research community, we open-source our synthesizable H.264 implementation.

54 citations

Journal Article
TL;DR: An architecture for a system-on-a-chip solution is proposed, based on reconfigurable computing, capable of performing up to ten complex image manipulations simultaneously and in real-time on video resolutions up to XVGA.
Abstract: The increasing ubiquity of embedded digital video capture creates demand for high-throughput, low-power, flexible and adaptable integrated image processing systems. An architecture for a system-on-a-chip solution is proposed, based on reconfigurable computing. The inherent system modularity and the communication infrastructure are targeted at enhancing design productivity and reuse. Power consumption is addressed by a combination of efficient streaming data transfer and reuse mechanisms. It is estimated that the proposed system would be capable of performing up to ten complex image manipulations simultaneously and in real-time on video resolutions up to XVGA.

41 citations

Journal ArticleDOI
TL;DR: A real-time seizure detection algorithm based on STFT and support vector machine (SVM) and its field-programmable gate array (FPGA) implementation are proposed, and the possibility of integrating the proposed algorithm and FPGA implementation into a wearable seizure control device is affirmed.
Abstract: Closed-loop stimulation of many neurological disorders, such as epilepsy, is an emerging technology and regarded as a promising alternative for surgical and drug treatment. In this paper, a real-time seizure detection algorithm based on STFT and support vector machine (SVM) and its field-programmable gate array (FPGA) implementation are proposed. With a two-stage patient-specific channel selection and feature selection mechanism, redundant and uncorrelated spectral features are removed from the entire feature set. The evaluation results on the CHB-MIT epilepsy database show that the mean detection latency of the proposed algorithm is 6 s, the sensitivity is 98.4%, and the false detection rate is 0.356/h. The performance of our proposed algorithm is comparable to other existing seizure detection algorithms. Moreover, we implement the proposed seizure detection algorithm on Xilinx Zynq-7000 XC7Z020 with high level synthesis. Each classification of the input electroencephalography signal can be finished within 313 μs, and the power consumption of the programmable logic is only 380 mW at 100 MHz. In hardware implementation, an optimization strategy for the nested-loop structure within nonlinear SVM is proposed to improve pipeline efficiency. Compared with the existing method, the experimental result shows that our method can speed up the nonlinear SVM by 1.70×, 1.53×, 1.37×, and 1.26× with the unroll factor equal to 1–4 at the same DSP utilization rate. The evaluation results affirm the possibility of integrating the proposed algorithm and FPGA implementation into a wearable seizure control device.

36 citations

Journal ArticleDOI
TL;DR: A universal voltage-mode filter configuration employing a voltage differencing inverting buffered amplifier, two capacitors and a resistor is proposed, and even the internal structure of the new building block is possibly the simplest among all recently introduced new building blocks.
Abstract: A universal voltage-mode filter configuration employing a voltage differencing inverting buffered amplifier (VDIBA), two capacitors and a resistor is proposed. The presented structure can realize all the five standard biquadratic filters: low-pass, high-pass, band-pass, band-reject and all-pass, without changing the circuit topology. The proposed filter circuit also provides the following advantageous features, not available simultaneously in any of the single active element/device based universal biquad realizing all the five filter responses known earlier so far: (i) independent electronic tuning of natural angular frequency (ω₀) and bandwidth (BW), (ii) no requirement of any element matching condition or inversion of input signal(s) (as needed in most of the earlier reported structures), and (iii) low active and passive sensitivities. Moreover, even the internal structure of the new building block is possibly the simplest among all recently introduced new building blocks. The workability of the proposed filter has been confirmed by SPICE simulations using 0.18 μm technology.

35 citations

Book
30 Nov 2003
TL;DR: The book covers Verilog at the RTL level, addition, multiplication, division using recurrence, elementary functions, and division and square root using multiplicative-based methods.
Abstract: Preface. 1. Motivation. 2. Verilog at the RTL Level. 3. Addition. 4. Multiplication. 5. Division Using Recurrence. 6. Elementary Functions. 7. Division and Square Root Using Multiplicative-Based Methods. References. Index.

25 citations