Topic

Gate count

About: Gate count is a research topic. Over its lifetime, 1,020 publications have been published within this topic, receiving 13,535 citations.


Papers
Proceedings Article
06 Dec 2004
TL;DR: The proposed reconfigurable coprocessor can easily implement communication operations, such as scrambling, interleaving, convolutional encoding, Viterbi decoding, FFT, etc., and it can be used for next generation communication platforms to satisfy high speed operations.
Abstract: This paper proposes a reconfigurable coprocessor for communication systems, which can support high speed computations and various functions. The proposed reconfigurable coprocessor can easily implement communication operations, such as scrambling, interleaving, convolutional encoding, Viterbi decoding, FFT, etc., and it can be used for next generation communication platforms to satisfy high speed operations. The proposed architecture has been modeled by VHDL and synthesized using the SEC 0.18 µm standard cell library. The gate count is about 35,000 and the critical path is 384 ns with the 0.18 µm technology. The proposed coprocessor shows performance improvements compared with existing DSP chips for communication algorithms.

14 citations
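Convolutional encoding, one of the operations listed in the abstract above, can be illustrated with a minimal sketch. The rate-1/2 code with constraint length 3 and generators 7 and 5 (octal) is a common textbook choice and is an assumption here; the abstract does not say which codes the coprocessor targets, and the function name `conv_encode` is hypothetical.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Rate-1/len(generators) convolutional encoder (illustrative only).

    The generators (7, 5 octal) are an assumed textbook example,
    not taken from the paper.
    """
    mask = (1 << constraint_length) - 1
    state = 0
    out = []
    for b in bits:
        # Shift the new input bit into the register.
        state = ((state << 1) | b) & mask
        # Each output bit is the parity (XOR) of the tapped register bits.
        for g in generators:
            out.append(bin(state & g).count("1") & 1)
    return out


print(conv_encode([1, 0, 1, 1]))
```

Each input bit produces two output bits, so the encoder needs only a shift and a handful of XORs per bit, which is the kind of operation a bit-level datapath handles well.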

Journal Article
TL;DR: Application-specific instructions and their bit manipulation unit (BMU), which efficiently support scrambling, convolutional encoding, puncturing, interleaving, and bit stream multiplexing, are proposed.
Abstract: This paper proposes application-specific instructions and their bit manipulation unit (BMU), which efficiently support scrambling, convolutional encoding, puncturing, interleaving, and bit stream multiplexing. The proposed DSP employs the BMU supporting parallel shift and XOR (exclusive-OR) operations and bit insertion/extraction operations on multiple data. The proposed architecture has been modeled by VHDL and synthesized using the SEC 0.18 µm standard cell library, and the gate count of the BMU is only about 1700 gates. Performance comparisons show that the number of clock cycles can be reduced by about 40% to 80% for scrambling, convolutional encoding, and interleaving compared with existing DSPs.

14 citations
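The shift and XOR operations that the BMU abstract above mentions are the core of a scrambler. The sketch below is a bit-serial additive scrambler in software, assuming the polynomial x^7 + x^4 + 1 and an arbitrary nonzero seed; neither comes from the paper, and the function name `scramble` is hypothetical. It does not model the parallel, multi-bit operation of the BMU itself.

```python
def scramble(bits, state=0b1011101):
    """Additive scrambler using a 7-bit LFSR (illustrative only).

    Polynomial x^7 + x^4 + 1 and the seed are assumptions for this sketch.
    """
    out = []
    for b in bits:
        # Feedback bit: XOR of the register bits at taps 7 and 4.
        fb = ((state >> 6) ^ (state >> 3)) & 1
        # Shift the register left and insert the feedback bit.
        state = ((state << 1) | fb) & 0x7F
        # The data bit is XORed with the LFSR output.
        out.append(b ^ fb)
    return out


data = [1, 0, 1, 1, 0, 0, 1, 0] * 4
assert scramble(scramble(data)) == data  # same seed descrambles
```

Because the keystream depends only on the seed, applying the same function twice with the same seed recovers the original data.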

Proceedings Article
19 Apr 2021
TL;DR: In this article, the authors propose a new compiler structure, Orchestrated Trios, that first decomposes to the three-qubit Toffoli, routes the inputs of the higher-level operations to groups of nearby qubits, then finishes decomposition to hardware-supported gates.
Abstract: Current quantum computers are especially error prone and require high levels of optimization to reduce operation counts and maximize the probability the compiled program will succeed. These computers only support operations decomposed into one- and two-qubit gates and only two-qubit gates between physically connected pairs of qubits. Typical compilers first decompose operations, then route data to connected qubits. We propose a new compiler structure, Orchestrated Trios, that first decomposes to the three-qubit Toffoli, routes the inputs of the higher-level Toffoli operations to groups of nearby qubits, then finishes decomposition to hardware-supported gates. This significantly reduces communication overhead by giving the routing pass access to the higher-level structure of the circuit instead of discarding it. A second benefit is the ability to now select an architecture-tuned Toffoli decomposition such as the 8-CNOT Toffoli for the specific hardware qubits now known after the routing pass. We perform real experiments on IBM Johannesburg showing an average 35% decrease in two-qubit gate count and 23% increase in success rate of a single Toffoli over Qiskit. We additionally compile many near-term benchmark algorithms showing an average 344% increase (a factor of 4.44x) in simulated success rate on the Johannesburg architecture and compare with other architecture types.

14 citations
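Two pieces of arithmetic sit behind the figures in the abstract above: a percentage increase converts to a multiplicative factor, and a lower two-qubit gate count tends to raise the success rate. The sketch below shows both, assuming a deliberately simple model in which each two-qubit gate fails independently with the same probability. That model, the error rate, and the gate counts in the usage lines are illustrative assumptions, not values or methods from the paper.

```python
def increase_to_factor(percent_increase):
    """Convert a percentage increase into a multiplicative factor."""
    return 1 + percent_increase / 100


def estimated_success(n_two_qubit_gates, error_per_gate):
    """Success estimate under an assumed independent-error model."""
    return (1 - error_per_gate) ** n_two_qubit_gates


# A 344% increase corresponds to the 4.44x factor quoted in the abstract.
print(round(increase_to_factor(344), 2))

# Hypothetical comparison: fewer two-qubit gates, higher estimated success.
print(estimated_success(8, 0.01) > estimated_success(12, 0.01))
```

Real devices have gate-dependent and correlated errors, so this is only a first-order way to see why two-qubit gate count is used as an optimization target.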

Journal Article
TL;DR: This work is the most compact ECDSA engine with capability for a wide range of curves and different applications and allows it to be implemented on any application specific integrated circuit (ASIC) or FPGA platform with dual-port memory support.
Abstract: Security problems introduced with rapid increase in deployment of Internet-of-Things devices can be overcome only with lightweight cryptographic schemes and modules. A compact prime field (GF(p)) elliptic curve digital signature algorithm (ECDSA) engine suitable for use in such applications is presented. Generic architecture of the engine makes it suitable for other elliptic curve (EC) based schemes (EC Diffie–Hellman key exchange, EC integrated encryption, EC factoring etc.) with slight modifications. The presented engine is composed of a simple microcoded controller and application-specific processing units. It can work with ECs of up to 256 bits, while 160-bit ECDSA signature generation takes 490 K cycles. The engine is implemented as an intellectual property (IP) in a 180 nm process. However, its architecture allows it to be implemented on any application specific integrated circuit (ASIC) or FPGA platform with dual-port memory support. In view of its gate count of 11,366 gate equivalents, the presented work is the most compact ECDSA engine with capability for a wide range of curves and different applications.

14 citations

Proceedings Article
21 Jul 2015
TL;DR: The design successfully integrates the SVD module with the QR decomposition (QRD) module (for MIMO signal detection) under a unified hardware framework and significantly outperforms previous designs.
Abstract: Precoding is an effective scheme for pre-compensating wireless channel impairments, and the singular value decomposition (SVD) scheme is a popular choice. This paper presents a unified high throughput SVD/QRD precoder chip design for MIMO OFDM systems. A hardware-implementation-friendly Givens Rotation (GR) based SVD computing scheme is developed first. It starts with a bi-diagonalization phase followed by an iterative diagonalization phase consisting of successive nullification sweeps. A convergence detection mechanism is employed to terminate the computations if the required precision is achieved. The design successfully integrates the SVD module (for precoding) with the QR decomposition (QRD) module (for MIMO signal detection) under a unified hardware framework. The design features a two-level pipelined, fully parallel architecture, and CORDIC processors are employed to implement the GR modules efficiently. Various design optimization techniques are applied to reduce the circuit complexity and the power consumption. The implementation using TSMC 90 nm process technology indicates a throughput rate of 35.75M SVDs per second when operating at 143 MHz. Both the throughput rate and the gate count efficiency of the proposed design outperform previous designs significantly.

14 citations
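The abstract above says CORDIC processors implement the Givens rotation modules. A minimal sketch of vectoring-mode CORDIC follows: it rotates a vector (x, y) onto the x-axis, which is the nullification step a Givens rotation performs. Floating point is used for readability, whereas hardware would use fixed-point shifts and adds. The function name, iteration count, and the restriction to x > 0 are assumptions of this sketch, not details from the paper.

```python
import math


def cordic_vectoring(x, y, iterations=16):
    """Null y by a sequence of micro-rotations (illustrative only).

    Assumes x > 0. Returns (magnitude, residual_y, rotation_angle).
    """
    angle = 0.0
    for i in range(iterations):
        # Choose the rotation direction that drives y toward zero.
        d = -1.0 if y > 0 else 1.0
        # Multiplying by 2**-i is a right shift in fixed-point hardware.
        x, y = x - d * y * 2.0**-i, y + d * x * 2.0**-i
        angle -= d * math.atan(2.0**-i)
    # Each micro-rotation scales the vector; divide out the total gain.
    gain = math.prod(math.sqrt(1 + 4.0**-i) for i in range(iterations))
    return x / gain, y / gain, angle


r, residual, theta = cordic_vectoring(3.0, 4.0)
print(round(r, 3), round(theta, 3))
```

The accumulated angle approximates atan2(y, x), and the same angle can then be applied to the remaining elements of the two matrix rows being rotated.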


Network Information
Related Topics (5)
CMOS
81.3K papers, 1.1M citations
84% related
Electronic circuit
114.2K papers, 971.5K citations
81% related
Integrated circuit
82.7K papers, 1M citations
80% related
Transistor
138K papers, 1.4M citations
79% related
Decoding methods
65.7K papers, 900K citations
77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    6
2022    19
2021    51
2020    47
2019    38
2018    47