High-performance implementation of regular and easily scalable sorting networks on an FPGA

doi:10.1016/J.MICPRO.2014.03.003

Journal Article

Valery Sklyarov, +1 more

- 01 Jul 2014 -

Microprocessors and Microsystems

- Vol. 38, Iss: 5, pp 470-484

TLDR

The paper finds that the even-odd transition network is the most regular sorting network that can be implemented very efficiently in an FPGA, and it proposes new, easily scalable hardware solutions and processing techniques based on this network.
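As a rough software illustration of the even-odd transition network named in the summary, the sketch below simulates the network stage by stage. It is not the paper's hardware design; the function name `even_odd_transition_sort` and the list-based data layout are assumptions made only for this example.

```python
def even_odd_transition_sort(items):
    """Sort a copy of `items` by simulating an even-odd transition network.

    The network has N stages for N items. Even-numbered stages compare the
    pairs (0,1), (2,3), ...; odd-numbered stages compare (1,2), (3,4), ...
    Every stage uses the same kind of compare-exchange element, which is
    what makes the structure regular.
    """
    data = list(items)
    n = len(data)
    for stage in range(n):
        start = stage % 2  # 0 selects the even pairs, 1 the odd pairs
        # The comparators within one stage touch disjoint pairs, so in
        # hardware they could all operate in parallel.
        for i in range(start, n - 1, 2):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


if __name__ == "__main__":
    print(even_odd_transition_sort([5, 3, 8, 1, 9, 2]))
```

The inner loop runs sequentially here, but each stage only touches disjoint neighbouring pairs, so a hardware version can evaluate a whole stage at once.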

About:

This article was published in Microprocessors and Microsystems on 2014-07-01. It has received 45 citations to date. The article focuses on the topics: Merge algorithm & Sorting algorithm.

Citations

Journal Article

Comparison of On-chip Communications in Zynq-7000 All Programmable Systems-on-Chip

João Figueira Silva, +2 more

- 03 Feb 2015 -

IEEE Embedded Systems Letters

TL;DR: This letter analyses and compares on-chip interfaces for hardware/software communications in the Zynq-7000 all programmable systems-on-chip to enable the most effective interfaces for specific types of data to be identified and the effectiveness of Zynq-based hardware accelerators to be assessed.

Journal Article

Performance Characterization and Design Guidelines for Efficient Processor–FPGA Communication in Cyclone V FPSoCs

Roberto Fernandez Molanes, +2 more

- 01 May 2018 -

IEEE Transactions on Industrial Electron...

TL;DR: This paper presents an extensive characterization and analysis of processor-FPGA communications in a widely used family of FPSoCs, namely Cyclone V devices, and introduces a set of design guidelines to help FPSoC designers take the most possible advantage of the excellent characteristics of these devices.

Journal Article

RTHS: A Low-Cost High-Performance Real-Time Hardware Sorter, Using a Multidimensional Sorting Algorithm

Amin Norollah, +3 more

- 08 May 2019 -

IEEE Transactions on Very Large Scale In...

TL;DR: Implementing the RTHS design on a Virtex-7 field-programmable gate array (FPGA) reveals that the proposed method uses fewer lookup tables (LUTs) than both the conventional bitonic sorting network (CBSN) and the state-of-the-art PHSA.

Proceedings Article

Analysis and Comparison of Attainable Hardware Acceleration in All Programmable Systems-on-Chip

Valery Sklyarov, +3 more

TL;DR: The efficiency of software/hardware solutions is found to depend on many mutually related factors, such as the volume of processed data, the applied parallelism, and the high-performance ports involved.

Journal Article

A High Performance FPGA-Based Sorting Accelerator with a Data Compression Mechanism

Ryohei Kobayashi, +1 more

- 01 May 2017 -

IEICE Transactions on Information and Sy...

References

Book

The Art of Computer Programming

Donald Ervin Knuth

The Art in Computer Programming

Andrew Hunt, +1 more

TL;DR: Here the authors haven’t even started the project yet, and already they’re forced to answer many questions: what will this thing be named, what directory will it be in, what type of module is it, how should it be compiled, and so on.

Proceedings Article

Sorting networks and their applications

Kenneth E. Batcher

TL;DR: To achieve high throughput rates today's computers perform several operations simultaneously; not only are I/O operations performed concurrently with computing, but also, in multiprocessors, several computing operations are done concurrently.

Proceedings Article

Designing efficient sorting algorithms for manycore GPUs

Nadathur Satish, +2 more

TL;DR: The design of high-performance parallel radix sort and merge sort routines for manycore GPUs, taking advantage of the full programmability offered by CUDA, is described; these routines are the fastest GPU sort and the fastest comparison-based sort reported in the literature.

Proceedings Article

Scan primitives for GPU computing

Shubhabrata Sengupta, +3 more

TL;DR: Using the scan primitives, this work shows novel GPU implementations of quicksort and sparse matrix-vector multiply, and analyzes the performance of the scan primitives, several sort algorithms that use the scan primitives, and a graphical shallow-water fluid simulation using the scan framework for a tridiagonal matrix solver.

Related Papers (5)

Sorting networks on FPGAs

Sorting networks and their applications

Comparison of On-chip Communications in Zynq-7000 All Programmable Systems-on-Chip

Synthesis and Optimization of FPGA-Based Systems

Energy and performance exploration of accelerator coherency port using Xilinx ZYNQ