Topic

Gate count

About: Gate count is a research topic. Over its lifetime, 1,020 publications have been published within this topic, receiving 13,535 citations.


Papers
Journal ArticleDOI
TL;DR: This paper introduces several novel optimization techniques for the resource-efficient implementation of a baseband modem with a highly (i.e., 8-way) parallel architecture, such as new processing structures for a (de)interleaver and a packet synchronizer, and an algorithm reconstruction for a carrier frequency offset compensator.
Abstract: The multi-band orthogonal frequency-division multiplexing modem needs to process a large amount of computation in a short time to support high data rates, i.e., up to 480 Mbps. In order to satisfy the performance requirement while reducing power consumption, a multi-way parallel architecture has been proposed. But the use of such a highly parallel architecture would increase chip resources significantly, so a resource-efficient design is essential. In this paper, we introduce several novel optimization techniques for the resource-efficient implementation of a baseband modem with a highly (i.e., 8-way) parallel architecture, such as new processing structures for a (de)interleaver and a packet synchronizer, and an algorithm reconstruction for a carrier frequency offset compensator. We also describe how to efficiently design several other components. The detailed analysis shows that our optimization techniques could reduce the gate count by 27.6% on average, while none of the techniques degraded the overall system performance. With a 0.18-μm CMOS process, the gate count and power consumption of the entire baseband modem were about 785 kgates and less than 381 mW at a 66 MHz clock rate, respectively.

7 citations
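For readers unfamiliar with the carrier frequency offset (CFO) compensation step mentioned above, here is a minimal Python sketch of the textbook derotation it performs. It is not the paper's hardware-oriented algorithm reconstruction, and the 528 MHz sampling rate and 20 kHz offset in the example are assumptions for illustration only.

```python
import numpy as np

def compensate_cfo(rx_samples, cfo_hz, sample_rate_hz):
    """Derotate received baseband samples by e^{-j*2*pi*cfo*n/fs} to undo
    a carrier frequency offset. The paper's contribution is a
    hardware-friendly reconstruction of this step, not reproduced here."""
    n = np.arange(len(rx_samples))
    return rx_samples * np.exp(-2j * np.pi * cfo_hz * n / sample_rate_hz)

# Example with assumed numbers: 528 MHz sampling rate, 20 kHz offset.
n = np.arange(1024)
rx = np.exp(2j * np.pi * 20e3 * n / 528e6)           # a pure CFO rotation
corrected = compensate_cfo(rx, cfo_hz=20e3, sample_rate_hz=528e6)
print(np.max(np.abs(np.angle(corrected))))           # residual phase ~ 0
```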

Journal ArticleDOI
TL;DR: A prediction unit (PU) loop-unrolling scheme is proposed to solve the pipeline-stall problem, and the computational redundancy among PUs within a coding unit is reduced through a search-step synchronization and search-point sharing scheme, eliminating the inefficiency of the hardware design.
Abstract: Extensive efforts have been made to design hardware-based integer motion estimation (IME) that is much faster than software-based IME but suffers from degradation in coding efficiency. This is because the strategy of previous efforts was a simple algorithmic modification of the fast IME to facilitate the given hardware design at the expense of coding efficiency. This paper proposes a novel hardware design of the IME that not only offers real-time processing capability but also provides a flexible tradeoff between computational complexity and coding efficiency. First, a prediction unit (PU) loop-unrolling scheme is proposed to solve the pipeline-stall problem owing to the nature of fast IME algorithms such as the test zone search (TZS). It reduces idle cycles by 89.24%. Next, to further reduce the computational complexity of the TZS algorithm, the computational redundancy among PUs within a coding unit is reduced through a search-step synchronization and search-point sharing scheme. Thus, the computational complexity is reduced by 72.25%. The proposed schemes eliminate the inefficiency of the hardware design; thus, they do not suffer from serious degradation in coding efficiency. Consequently, the proposed hardware-based IME processes 7680 × 4320 videos at 30 frames per second while increasing the Bjontegaard delta bitrate by only 0.90% on average. The hardware design is synthesized using a 65 nm general-purpose CMOS technology, and its gate count is 268.5K at an operating clock frequency of 500 MHz.

7 citations
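As a rough software illustration of the zonal search that the paper's hardware accelerates, the following is a simplified TZS-like expanding-diamond search over SAD costs. The block size, search range, and greedy update are assumptions of this sketch; it does not model the paper's PU loop-unrolling or search-point sharing schemes, nor the raster and refinement stages of the full TZS.

```python
import numpy as np

def sad(cur, ref, cx, cy, mvx, mvy, block=8):
    """Sum of absolute differences for one candidate motion vector
    (mvx, mvy) of the block whose top-left corner is (cx, cy)."""
    h, w = ref.shape
    x, y = cx + mvx, cy + mvy
    if x < 0 or y < 0 or x + block > w or y + block > h:
        return np.inf                      # candidate falls outside the frame
    cur_blk = cur[cy:cy + block, cx:cx + block].astype(np.int32)
    ref_blk = ref[y:y + block, x:x + block].astype(np.int32)
    return int(np.abs(cur_blk - ref_blk).sum())

def zone_search(cur, ref, cx, cy, search_range=16, block=8):
    """Simplified TZS-like search: evaluate diamond candidates at doubling
    step sizes around the best point found so far."""
    center, best_cost = (0, 0), sad(cur, ref, cx, cy, 0, 0, block)
    step = 1
    while step <= search_range:
        candidates = [(center[0] + step, center[1]), (center[0] - step, center[1]),
                      (center[0], center[1] + step), (center[0], center[1] - step)]
        for mvx, mvy in candidates:
            cost = sad(cur, ref, cx, cy, mvx, mvy, block)
            if cost < best_cost:
                best_cost, center = cost, (mvx, mvy)
        step *= 2
    return center, best_cost

# Example call on a synthetic 64x64 frame pair (values are illustrative only).
rng = np.random.default_rng(0)
cur = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
ref = np.roll(cur, shift=(1, 3), axis=(0, 1))
print(zone_search(cur, ref, cx=16, cy=16))   # best motion vector and its SAD
```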

Proceedings ArticleDOI
01 Nov 2010
TL;DR: An area-efficient, low-energy, high-speed pipelined architecture for a Reed-Solomon decoder based on the Decomposed Inversionless Berlekamp-Massey Algorithm, in which the error-locator and error-evaluator polynomials can be computed serially.
Abstract: This paper proposes an area-efficient, low-energy, high-speed pipelined architecture for a Reed-Solomon decoder based on the Decomposed Inversionless Berlekamp-Massey Algorithm, in which the error-locator and error-evaluator polynomials can be computed serially. In the proposed architecture, a new scheduling of t finite-field multipliers (FFMs) is used to calculate the error-locator and error-evaluator polynomials, achieving a good balance between area, latency, and throughput. This architecture is tested in two different decoders. The first is a pipelined two-parallel decoder, in which two-parallel syndrome computation and two-parallel Chien search are used. The second is a conventional pipelined decoder, in which conventional syndrome computation and Chien search are used. Both decoders have been implemented with 0.13-µm CMOS IBM standard cells. The two-parallel RS(255, 239) decoder has a gate count of 37.6K and an area of 1.18 mm²; simulation results show that this approach works successfully at a data rate of 7.4 Gbps with a power dissipation of 50 mW. The conventional RS(255, 239) decoder has a gate count of 30.7K and an area of 0.99 mm²; simulation results show that this approach works successfully at a data rate of 4.85 Gbps with a power dissipation of 29.28 mW.

7 citations
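For context, below is a minimal Python sketch of syndrome computation and a generic inversionless Berlekamp-Massey iteration for RS(255, 239) over GF(2^8). The primitive polynomial 0x11D and the serial, software-only formulation are assumptions of this sketch; it does not reproduce the paper's decomposed scheduling of t FFMs or its pipelined hardware.

```python
# GF(2^8) arithmetic via log/antilog tables. The primitive polynomial
# 0x11D (x^8 + x^4 + x^3 + x^2 + 1) is an assumption; the paper does not
# state its field representation.
PRIM = 0x11D
EXP, LOG = [0] * 512, [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def syndromes(rx_word, two_t=16):
    """S_i = r(alpha^i) for i = 1..2t, via Horner's rule.
    rx_word[0] holds the coefficient of the highest power x^254."""
    syn = []
    for i in range(1, two_t + 1):
        s = 0
        for c in rx_word:
            s = gf_mul(s, EXP[i]) ^ c
        syn.append(s)
    return syn

def inversionless_bm(syn, t=8):
    """Generic inversionless Berlekamp-Massey: returns a scaled
    error-locator polynomial (low-order coefficient first)."""
    lam = [1] + [0] * t          # Lambda(x)
    b = [1] + [0] * t            # B(x)
    gamma, L = 1, 0
    for r in range(2 * t):
        # Discrepancy delta = sum_{i=0..L} Lambda_i * S_{r-i}.
        delta = 0
        for i in range(L + 1):
            delta ^= gf_mul(lam[i], syn[r - i])
        # Lambda'(x) = gamma*Lambda(x) + delta*x*B(x)  (char-2 field: + is XOR).
        new_lam = [gf_mul(gamma, lam[i]) ^ (gf_mul(delta, b[i - 1]) if i else 0)
                   for i in range(t + 1)]
        if delta != 0 and 2 * L <= r:
            b, L, gamma = lam, r + 1 - L, delta
        else:
            b = [0] + b[:-1]     # B(x) <- x*B(x)
        lam = new_lam
    return lam

# Toy check: all-zero codeword with a single error of value 7 at power x^5.
rx_word = [0] * 255
rx_word[254 - 5] = 7
lam = inversionless_bm(syndromes(rx_word))
assert lam[1] == gf_mul(lam[0], EXP[5])   # locator proportional to 1 + alpha^5 * x
```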

Proceedings ArticleDOI
Chenxin Zhang, Hemanth Prabhu, Liang Liu, Ove Edfors, Viktor Öwall
01 Nov 2012
TL;DR: This paper presents a low-complexity energy efficient channel pre-processing update scheme, targeting the emerging 3GPP long term evolution advanced (LTE-A) downlink, and has been designed as a dedicated unit in a 65 nm CMOS technology.
Abstract: This paper presents a low-complexity, energy-efficient channel pre-processing update scheme targeting the emerging 3GPP Long Term Evolution Advanced (LTE-A) downlink. Upon channel matrix renewals, the number of explicit QR decompositions (QRDs) and channel matrix inversions is reduced, since only the upper triangular matrices R and R−1 are updated, based on an on-line update decision mechanism. The proposed channel pre-processing updater has been designed as a dedicated unit in a 65 nm CMOS technology, resulting in a core area of 0.242 mm² (equivalent gate count of 116K). Running at a 330 MHz clock, each QRD or R−1 update consumes 4 or 2 times less energy, respectively, compared to one exact state-of-the-art QRD in the open literature.

7 citations
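Since the abstract does not spell out the update rule itself, the sketch below only models the two baseline operations whose repetition the proposed updater avoids: an explicit QR decomposition of the channel matrix and a back-substitution inverse of the upper-triangular factor R. The 4 × 4 complex channel and the numpy formulation are assumptions for illustration.

```python
import numpy as np

def qrd_and_rinv(H):
    """Explicit QR decomposition of a channel matrix H and back-substitution
    inverse of the upper-triangular factor R. The paper's on-line update
    decision mechanism is not described in the abstract and is not modeled."""
    Q, R = np.linalg.qr(H)
    n = R.shape[0]
    R_inv = np.zeros_like(R)
    # Column-by-column back-substitution: solve R @ R_inv[:, j] = e_j.
    for j in range(n):
        e = np.zeros(n, dtype=R.dtype)
        e[j] = 1.0
        col = np.zeros(n, dtype=R.dtype)
        for i in range(n - 1, -1, -1):
            col[i] = (e[i] - R[i, i + 1:] @ col[i + 1:]) / R[i, i]
        R_inv[:, j] = col
    return Q, R, R_inv

# Example with an assumed 4x4 complex MIMO channel.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, R, R_inv = qrd_and_rinv(H)
print(np.allclose(R @ R_inv, np.eye(4)))   # True: R_inv inverts R
```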

Proceedings ArticleDOI
26 Mar 2007
TL;DR: A four-context ORGA architecture and a multi-context holographic memory recording system used for it are proposed, and experimentally demonstrated results of recording a holographic memory and reconfiguring an ORGA-VLSI are described.
Abstract: Optically reconfigurable gate arrays (ORGAs) offer the possibility of providing a virtual gate count that is much larger than those of currently available VLSIs by exploiting the large storage capacity of a holographic memory. The first ORGA was developed to achieve rapid reconfiguration and a number of reconfiguration contexts; it consisted of a gate array VLSI, a holographic memory, and a laser diode array. The ORGA achieved a 16 μs to 20 μs reconfiguration period that was faster than that of FPGAs, with 100 reconfiguration contexts. However, the ORGA requires the gate array to halt during reconfiguration. Therefore, the ORGA cannot be reconfigured frequently because of the associated reconfiguration overhead. On the other hand, new ORGA-VLSIs that have less than 10 ns reconfiguration capability without any related overhead have already been fabricated. However, to date, a multi-holographic reconfiguration system that is suitable for such rapidly reconfigurable ORGA-VLSIs without any overhead has never been developed. Toward such a realization, this paper proposes a four-context ORGA architecture and a multi-context holographic memory recording system used for it. In addition, experimentally demonstrated results of recording a holographic memory and reconfiguring an ORGA-VLSI are described.

7 citations


Network Information
Related Topics (5)
CMOS: 81.3K papers, 1.1M citations, 84% related
Electronic circuit: 114.2K papers, 971.5K citations, 81% related
Integrated circuit: 82.7K papers, 1M citations, 80% related
Transistor: 138K papers, 1.4M citations, 79% related
Decoding methods: 65.7K papers, 900K citations, 77% related
Performance Metrics
No. of papers in the topic in previous years:
Year  Papers
2023  6
2022  19
2021  51
2020  47
2019  38
2018  47