Topic

Gate count

About: Gate count is a research topic. Over its lifetime, 1,020 publications have been published within this topic, receiving 13,535 citations.


Papers
Posted Content
TL;DR: This work presents the direct CMOS gate-level implementation of the multi-neuron column model as the key building block for TNNs, and develops a multi-layer TNN prototype of 32M gates whose performance and complexity are evaluated relative to a recent state-of-the-art TNN model.
Abstract: Temporal Neural Networks (TNNs) use time as a resource to represent and process information, mimicking the behavior of the mammalian neocortex. This work focuses on implementing TNNs using off-the-shelf digital CMOS technology. A microarchitecture framework is introduced with a hierarchy of building blocks including: multi-neuron columns, multi-column layers, and multi-layer TNNs. We present the direct CMOS gate-level implementation of the multi-neuron column model as the key building block for TNNs. Post-synthesis results are obtained using Synopsys tools and the 45 nm CMOS standard cell library. The TNN microarchitecture framework is embodied in a set of characteristic equations for assessing the total gate count, die area, compute time, and power consumption for any TNN design. We develop a multi-layer TNN prototype of 32M gates. In a 7 nm CMOS process, it consumes only 1.54 mm^2 die area and 7.26 mW power and can process 28x28 images at 107M FPS (9.34 ns per image). We evaluate the prototype's performance and complexity relative to a recent state-of-the-art TNN model.

7 citations
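
The headline figures in the TNN abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract (9.34 ns per image, 7.26 mW, 1.54 mm^2, 32M gates). The energy-per-image and gate-density values are derived here, not reported by the authors, and they assume that the quoted power is the average power at the quoted frame rate and that the 32M gates occupy the whole quoted die area.

```python
# Figures quoted in the abstract (7 nm CMOS prototype).
NS_PER_IMAGE = 9.34        # compute time per 28x28 image, in nanoseconds
POWER_MW = 7.26            # power consumption, in milliwatts
DIE_AREA_MM2 = 1.54        # die area, in square millimetres
GATE_COUNT = 32e6          # total gate count of the prototype


def frames_per_second(ns_per_image: float) -> float:
    """Throughput implied by a per-image latency, one image at a time."""
    return 1e9 / ns_per_image


fps = frames_per_second(NS_PER_IMAGE)

# Derived values (assumptions, not figures from the abstract):
# energy per image = power * time, converted from joules to picojoules.
energy_per_image_pj = (POWER_MW * 1e-3) * (NS_PER_IMAGE * 1e-9) * 1e12
# gate density, assuming all 32M gates fill the quoted die area.
gates_per_mm2 = GATE_COUNT / DIE_AREA_MM2

print(f"throughput:       {fps / 1e6:.2f} M FPS")
print(f"energy per image: {energy_per_image_pj:.1f} pJ")
print(f"gate density:     {gates_per_mm2 / 1e6:.2f} M gates/mm^2")
```

The computed throughput of about 107 million frames per second matches the "107M FPS" stated in the abstract, which confirms that the two quoted figures describe the same operating point.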

Proceedings Article
20 Oct 1999
TL;DR: This processor has a 2-issue VLIW architecture with 64-bit SIMD arithmetic functional units to exploit the instruction-level and subword data parallelism found in multimedia applications and shows a comparable or higher performance when compared to the 8-issue TMS320C62xx.
Abstract: As the complexity of multimedia applications increases, the need for efficient and compiler-friendly processor architectures also grows. In this paper, a new multimedia processor architecture is proposed. This processor has a 2-issue VLIW architecture with 64-bit SIMD arithmetic functional units to exploit the instruction-level and subword data parallelism found in multimedia applications. Moreover, densely encoded instructions supporting memory operands, DSP-like addressing modes, and SIMD capability boost the performance while keeping the code size and hardware cost small. To maximally utilize this architecture, a software environment including a code converter, a VLIW compiler system, and a compiled simulator has also been developed. The processor core has been synthesized for the LSI Logic 0.25 μm library, which results in a total gate count of 102 K. In spite of its smaller issue rate, the proposed processor shows comparable or higher performance in terms of both the cycle count and the code size when compared to the 8-issue TMS320C62xx, for DSP benchmark kernels and an H.263 video encoder.

7 citations

13 Apr 1994
TL;DR: This paper describes algorithms for two logic optimisation problems, state assignment in finite state machines and Reed-Muller optimisation of combinational logic circuits; the resultant software uses genetic algorithms to select, breed and test the fitness of potential solutions, and thereby recommend a near-optimal solution.
Abstract: The work described in this paper began some time ago as an investigation into two problems associated with logic minimisation or optimisation. These are, respectively, the state assignment problem in the design of finite state machines, and the optimisation of combinational logic circuits using Reed-Muller (RM) techniques. When faced with such designs, the use of FPGAs to implement circuits is clearly appropriate. However, because of the limited resources available on FPGA parts, in terms of the number of available CLBs, and the increased difficulty that place and route software will experience in the layout of increasingly complex designs, it is felt that some form of optimisation of the design before implementation is still a necessary stage in the design process. This paper describes the implementation of algorithms which attempt to provide this type of optimisation for the two previously mentioned problems. The resultant software uses genetic algorithms to select, breed and test the fitness of potential solutions, and thereby recommend a near-optimal solution. In practice, these recommended solutions represent a considerable saving (in terms of gate count) on many circuit implementations, as experimental results demonstrate.

7 citations

Proceedings Article
01 Nov 2014
TL;DR: A new detection method with a faster estimation of gate scaling factors by solving the normal equation of a linear regression model is presented, which has high detection sensitivity as long as the Trojan-to-circuit gate count ratio exceeds 0.4%.
Abstract: Due to outsourcing of IC fabrication, chip supply contamination is a clear and present danger, of which hardware Trojans (HTs) pose the greatest threat. This paper reviews the limitations of existing gate-level characterization approaches to HT detection and presents a new detection method with a faster estimation of gate scaling factors by solving the normal equation of a linear regression model. The HT-infected circuit can be distinguished from the genuine circuit without the need for a golden reference chip by their discrepancies in the bias parameter of the linear regression and a subset of the accurately estimated scaling factors. It has high detection sensitivity as long as the Trojan-to-circuit gate count ratio exceeds 0.4%.

7 citations
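
The hardware Trojan abstract above relies on solving the normal equation of a linear regression model. As a rough illustration of that generic step only, the sketch below fits a bias and two coefficients by solving (X^T X) w = X^T y with Gaussian elimination. It is ordinary least squares on made-up toy data, not the paper's characterization method: the function name `solve_normal_equation`, the feature layout (a constant column followed by two hypothetical per-gate terms), and all numbers are assumptions introduced here.

```python
def solve_normal_equation(X, y):
    """Least-squares fit: solve (X^T X) w = X^T y by Gaussian elimination."""
    n, p = len(X), len(X[0])
    # Build A = X^T X and b = X^T y.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         for i in range(p)]
    b = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    # Augmented matrix [A | b], reduced with partial pivoting.
    M = [row + [bi] for row, bi in zip(A, b)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; the data do not determine the fit")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, p):
            factor = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    w = [0.0] * p
    for i in range(p - 1, -1, -1):
        tail = sum(M[i][j] * w[j] for j in range(i + 1, p))
        w[i] = (M[i][p] - tail) / M[i][i]
    return w


# Toy data (hypothetical): each row is [1 (bias column), term_1, term_2],
# and y was generated exactly as 0.5 + 1.1 * term_1 + 0.9 * term_2.
X = [
    [1.0, 2.0, 1.0],
    [1.0, 1.0, 3.0],
    [1.0, 4.0, 2.0],
    [1.0, 3.0, 5.0],
]
y = [3.6, 4.3, 6.7, 8.3]

bias, scale_1, scale_2 = solve_normal_equation(X, y)
print(f"bias={bias:.3f}, scale_1={scale_1:.3f}, scale_2={scale_2:.3f}")
```

Because the toy measurements are noise-free, the fit recovers the generating coefficients. In the setting the abstract describes, it is the discrepancies in the fitted bias and scaling factors that separate an infected circuit from a genuine one, and how those discrepancies are thresholded is not specified in the abstract.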

Proceedings Article
01 Nov 2008
TL;DR: This paper proposes a new application-specific processor and compiler targeting H.264 inverse transform and inverse quantization, based on a 6-stage pipelined dual-issue VLIW+SIMD architecture and compiler mapping techniques such as CKF (compiler known function), inline assembly and CGD (code generator description).
Abstract: This paper proposes a new application-specific processor and compiler targeting H.264 inverse transform and inverse quantization. They are based on a 6-stage pipelined dual-issue VLIW+SIMD architecture, efficient instructions for inverse transform and inverse quantization, and compiler mapping techniques such as CKF (compiler known function), inline assembly and CGD (code generator description). The proposed architecture, whose gate count is approximately 130 K, runs at 100 MHz. Compared to the ARM1020E processor, the proposed architecture and compiler result in an improvement of about 20% to 46% in total cycles, as well as lower hardware complexity.

7 citations


Network Information
Related Topics (5)
CMOS: 81.3K papers, 1.1M citations, 84% related
Electronic circuit: 114.2K papers, 971.5K citations, 81% related
Integrated circuit: 82.7K papers, 1M citations, 80% related
Transistor: 138K papers, 1.4M citations, 79% related
Decoding methods: 65.7K papers, 900K citations, 77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    6
2022    19
2021    51
2020    47
2019    38
2018    47