Author

EkhtiyarMohammad Aasim

Bio: EkhtiyarMohammad Aasim is an academic researcher from Dresden University of Technology. The author has contributed to research in topics: Multiplier (economics) & Throughput (business). The author has co-authored 1 publication.

Papers
Journal Article
TL;DR: This article studies the use of Single Instruction, Multiple Data (SIMD) components in Field-Programmable Gate Arrays (FPGAs), motivated by error-resilient programs and their quest for high throughput.
Abstract: The rapid evolution of error-resilient programs intertwined with their quest for high throughput has motivated the use of Single Instruction, Multiple Data (SIMD) components in Field-Programmable G...

2 citations


Cited by
Journal Article
TL;DR: RAPID is the first pipelined approximate multiplier and divider architecture customized for FPGAs. It efficiently utilizes 6-input Look-up Tables (6-LUTs) and fast carry chains to implement Mitchell's approximate algorithms.
Abstract: The rapid updates in error-resilient applications, along with their quest for high throughput, have motivated designing fast approximate functional units for Field-Programmable Gate Arrays (FPGAs). Studies that proposed imprecise functional techniques suffer from three shortcomings: first, most inexact multipliers and dividers are specialized for Application-Specific Integrated Circuit (ASIC) platforms. Second, state-of-the-art (SoA) approximate units are substituted mostly in a single kernel of a multi-kernel application. Moreover, the end-to-end assessment is adopted on the Quality of Results (QoR), but not on the overall gained performance. Finally, existing imprecise components are not designed to support a pipelined approach, which could boost the operating frequency/throughput of, e.g., division-included applications. In this paper, we propose RAPID, the first pipelined approximate multiplier and divider architecture, customized for FPGAs. The proposed units efficiently utilize 6-input Look-up Tables (6-LUTs) and fast carry chains to implement Mitchell's approximate algorithms. Our novel error-refinement scheme not only has negligible overhead over the baseline Mitchell's approach but also boosts its accuracy to 99.4% for arbitrary size of multiplication and division. Experimental results demonstrate the efficiency of the proposed pipelined and non-pipelined RAPID multipliers and dividers over accurate counterparts. Moreover, the end-to-end evaluations of RAPID, deployed in three multi-kernel applications in the domains of bio-signal processing, image processing, and moving object tracking for Unmanned Aerial Vehicles (UAVs), indicate up to 45% improvements in area, latency, and Area-Delay-Product (ADP), respectively, over accurate kernels, with negligible loss in QoR.

1 citation

Journal Article
TL;DR: This article proposes RAPID, the first pipelined approximate multiplier and divider architectures customized for FPGAs. Its error-refinement scheme not only has negligible overhead over the baseline Mitchell’s approach but also boosts its accuracy to 99.4% for arbitrary size of multiplication and division.
Abstract: The rapid updates in error-resilient applications, along with their quest for high throughput, have motivated designing fast approximate functional units for field-programmable gate arrays (FPGAs). Studies have proposed various imprecise functional techniques, but they suffer from three shortcomings: first, most existing inexact multipliers and dividers are specialized for application-specific integrated circuit (ASIC) platforms. Therefore, due to the architectural differences of underlying building blocks in FPGA and ASIC, ASIC-customized designs have not yielded comparable improvements when directly synthesized and ported to FPGAs. Second, state-of-the-art (SoA) approximate units are substituted mostly in a single kernel of a multikernel application. Moreover, the end-to-end assessment is adopted on the quality of results (QoR), but not on the overall gained performance. Finally, the existing imprecise components are not designed to support a pipelined approach, which could boost the operating frequency/throughput of, e.g., division-included applications. In this article, we propose RAPID, the first pipelined approximate multiplier and divider architectures, customized for FPGAs. The proposed units efficiently utilize 6-input look-up tables (6-LUTs) and fast carry chains to implement Mitchell’s approximate algorithms. Our novel error-refinement scheme not only has negligible overhead over the baseline Mitchell’s approach but also boosts its accuracy to 99.4% for arbitrary size of multiplication and division. Experimental results obtained with Xilinx Vivado demonstrate the efficiency of the proposed pipelined and nonpipelined RAPID multipliers and dividers over accurate counterparts. In particular, the 4-stage pipelined architecture of a 32-bit RAPID multiplier (divider) enables $3.3\times$ ($5.1\times$) higher throughput, $2.3\times$ ($6.8\times$) higher throughput/Watt, and 52% (31%) savings of look-up tables (LUTs) over their 4-stage pipelined, accurate Intellectual Property (IP) counterparts. Moreover, the end-to-end evaluations of nonpipelined RAPID, deployed in three multikernel applications in the domains of biosignal processing, image processing, and moving object tracking for unmanned aerial vehicles (UAVs), indicate up to 35%, 33%, and 45% improvements in area, latency, and area-delay-product (ADP), respectively, over accurate kernels, with negligible loss in QoR. To springboard future research in reconfigurable and approximate computing communities, our implementations will be available and open-sourced at https://cfaed.tu-dresden.de/pd-downloads.
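
Note: the abstracts above build on Mitchell's approximate algorithms, which replace multiplication and division with addition and subtraction of approximate base-2 logarithms. For readers unfamiliar with the idea, the following is a minimal software sketch of the baseline Mitchell method only. It is not the RAPID architecture: it does not include the error-refinement scheme, pipelining, or the mapping to 6-LUTs and carry chains. The function names (`mitchell_log2`, `mitchell_antilog2`, `mitchell_mul`, `mitchell_div`) and the 16-bit fractional width are assumptions chosen for illustration.

```python
def mitchell_log2(n: int, frac_bits: int = 16) -> int:
    """Fixed-point Mitchell approximation of log2(n).

    Writes n = 2**k * (1 + x) with 0 <= x < 1 and returns k + x,
    scaled by 2**frac_bits (i.e. log2(1 + x) is approximated by x).
    """
    if n <= 0:
        raise ValueError("operand must be a positive integer")
    k = n.bit_length() - 1          # position of the leading one
    mantissa = n - (1 << k)         # x * 2**k
    # Align x to frac_bits fractional bits (low bits are truncated
    # if the operand is wider than frac_bits + 1 bits).
    if k >= frac_bits:
        x = mantissa >> (k - frac_bits)
    else:
        x = mantissa << (frac_bits - k)
    return (k << frac_bits) + x


def mitchell_antilog2(log_val: int, frac_bits: int = 16) -> int:
    """Inverse of mitchell_log2: approximates 2**(k + x) as 2**k * (1 + x)."""
    k = log_val >> frac_bits                    # integer part (floor)
    x = log_val & ((1 << frac_bits) - 1)        # fractional part
    value = (1 << frac_bits) + x                # (1 + x) * 2**frac_bits
    shift = k - frac_bits
    return value << shift if shift >= 0 else value >> -shift


def mitchell_mul(a: int, b: int, frac_bits: int = 16) -> int:
    """Approximate a * b by adding the approximate logarithms."""
    return mitchell_antilog2(
        mitchell_log2(a, frac_bits) + mitchell_log2(b, frac_bits), frac_bits
    )


def mitchell_div(a: int, b: int, frac_bits: int = 16) -> int:
    """Approximate a // b by subtracting the approximate logarithms."""
    return mitchell_antilog2(
        mitchell_log2(a, frac_bits) - mitchell_log2(b, frac_bits), frac_bits
    )


print(mitchell_mul(5, 6))    # 28 (exact product is 30)
print(mitchell_div(100, 7))  # 14 (exact quotient is about 14.29)
```

Operands that are powers of two are handled exactly, since their fractional part x is zero. For other operands the baseline product is an underestimate, with a worst-case relative error of 1/9 (about 11.1%) when both fractional parts equal 0.5, as in 3 × 3 ≈ 8. This gap is what an error-refinement scheme such as the one described in the abstract is meant to narrow.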