Open Access Proceedings Article

VLSI Structure-aware Placement for Convolutional Neural Network Accelerator Units

TLDR
This paper proposes a kernel-based placement framework for CNN accelerator units that extracts kernels from the circuit and inserts kernel-based regions to guide placement and minimize routing congestion.
Abstract
AI-dedicated hardware designs are growing dramatically for various AI applications. These designs often contain highly connected circuit structures, reflecting the complicated structure of neural networks, such as convolutional layers and fully-connected layers. Such dense interconnections incur severe congestion problems in physical design that cannot be solved by conventional placement methods. This paper proposes a novel placement framework for CNN accelerator units, which extracts kernels from the circuit and inserts kernel-based regions to guide placement and minimize routing congestion. Experimental results show that our framework effectively reduces global routing congestion without wirelength degradation, significantly outperforming leading commercial tools.


Citations
Book Chapter

Modified Floating Point Adder and Multiplier IP Design

TL;DR: In this article, the IEEE 754 format for single/double precision floating-point architecture is used for adding and subtracting real numbers in a single-input single-output (SISO) architecture.
Proceedings Article

Routability-aware Placement Guidance Generation for Mixed-size Designs

TL;DR: In this article, the authors explore the possibility of using placement guidance to mitigate routing congestion and reduce design rule violations for mixed-size designs by extracting the underlying knowledge of a mixed-size design using a graph neural network and generating an embedding for each standard cell.
References
Journal Article

Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks

TL;DR: A novel dataflow, called row-stationary (RS), is presented that minimizes data movement energy consumption on a spatial architecture. It can adapt to different CNN shape configurations and reduces all types of data movement by maximally utilizing the processing engine (PE) local storage, direct inter-PE communication, and spatial parallelism.
Proceedings Article

Garp: a MIPS processor with a reconfigurable coprocessor

TL;DR: Novel aspects of the Garp Architecture are presented, as well as a prototype software environment and preliminary performance results, which suggest that a Garp of similar technology could achieve speedups ranging from a factor of 2 to as high as a factor of 24 for some useful applications.
Book Chapter

ADRES: An Architecture with Tightly Coupled VLIW Processor and Coarse-Grained Reconfigurable Matrix

TL;DR: A novel architecture with tightly coupled very long instruction word (VLIW) processor and coarse-grained reconfigurable matrix is proposed, which has good performance and is very compiler-friendly.
Proceedings Article

A New Algorithm for Floorplan Design

TL;DR: A new algorithm for floorplan design is presented that uses simulated annealing, carries out the neighborhood search effectively, and achieves a simultaneous minimization of area and total interconnection length in the final solution.
Proceedings Article

B*-Trees: a new representation for non-slicing floorplans

TL;DR: An efficient, flexible, and effective data structure, B*-trees, for non-slicing floorplans is presented, based on ordered binary trees and the admissible placement presented in [1], along with a B*-tree based simulated annealing scheme for floorplan design.