Topic

Gate count

About: Gate count is a research topic. Over its lifetime, 1,020 publications have been published within this topic, receiving 13,535 citations.


Papers
Journal ArticleDOI
TL;DR: This paper introduces several novel optimization techniques for the resource-efficient implementation of a baseband modem with a highly (i.e., 8-way) parallel architecture, such as new processing structures for a (de)interleaver and a packet synchronizer, and an algorithm reconstruction for a carrier frequency offset compensator.
Abstract: The multi-band orthogonal frequency-division multiplexing modem needs to process a large amount of computation in a short time to support high data rates, i.e., up to 480 Mbps. In order to satisfy the performance requirement while reducing power consumption, a multi-way parallel architecture has been proposed. But the use of such a highly parallel architecture would increase chip resources significantly, so a resource-efficient design is essential. In this paper, we introduce several novel optimization techniques for the resource-efficient implementation of a baseband modem with a highly (i.e., 8-way) parallel architecture, such as new processing structures for a (de)interleaver and a packet synchronizer, and an algorithm reconstruction for a carrier frequency offset compensator. We also describe how to efficiently design several other components. The detailed analysis shows that our optimization techniques could reduce the gate count by 27.6% on average, while none of the techniques degraded the overall system performance. With a 0.18-μm CMOS process, the gate count and power consumption of the entire baseband modem were about 785 kgates and less than 381 mW at a 66 MHz clock rate, respectively.

7 citations
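For readers unfamiliar with the carrier frequency offset (CFO) compensation step mentioned above, here is a minimal Python sketch of the textbook derotation it performs. It is not the paper's hardware-oriented algorithm reconstruction, and the 528 MHz sampling rate and 20 kHz offset in the example are assumptions for illustration only.

```python
import numpy as np

def compensate_cfo(rx_samples, cfo_hz, sample_rate_hz):
    """Derotate received baseband samples by e^{-j*2*pi*cfo*n/fs} to undo
    a carrier frequency offset. The paper's contribution is a
    hardware-friendly reconstruction of this step, not reproduced here."""
    n = np.arange(len(rx_samples))
    return rx_samples * np.exp(-2j * np.pi * cfo_hz * n / sample_rate_hz)

# Example with assumed numbers: 528 MHz sampling rate, 20 kHz offset.
n = np.arange(1024)
rx = np.exp(2j * np.pi * 20e3 * n / 528e6)           # a pure CFO rotation
corrected = compensate_cfo(rx, cfo_hz=20e3, sample_rate_hz=528e6)
print(np.max(np.abs(np.angle(corrected))))           # residual phase ~ 0
```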

Journal ArticleDOI
TL;DR: A prediction unit (PU) loop-unrolling scheme is proposed to solve the pipeline-stall problem, and the computational redundancy among PUs within a coding unit is reduced through a search-step synchronization and search-point sharing scheme, eliminating the inefficiency of the hardware design.
Abstract: Extensive efforts have been made to design hardware-based integer motion estimation (IME) that is much faster than software-based IME but suffers from degradation in coding efficiency. This is because the strategy of previous efforts was a simple algorithmic modification of the fast IME to facilitate the given hardware design at the expense of coding efficiency. This paper proposes a novel hardware design of the IME that not only offers real-time processing capability but also provides a flexible tradeoff between computational complexity and coding efficiency. First, a prediction unit (PU) loop-unrolling scheme is proposed to solve the pipeline-stall problem owing to the nature of fast IME algorithms such as the test zone search (TZS). It reduces idle cycles by 89.24%. Next, to further reduce the computational complexity of the TZS algorithm, the computational redundancy among PUs within a coding unit is reduced through a search-step synchronization and search-point sharing scheme. Thus, the computational complexity is reduced by 72.25%. The proposed schemes eliminate the inefficiency of the hardware design; thus, they do not suffer from serious degradation in coding efficiency. Consequently, the proposed hardware-based IME processes 7680 × 4320 videos at 30 frames per second while increasing the Bjontegaard delta bitrate by only 0.90% on average. The hardware design is synthesized using a 65 nm general-purpose CMOS technology, and its gate count is 268.5K at an operating clock frequency of 500 MHz.

7 citations
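As a rough software illustration of the zonal search that the paper's hardware accelerates, the following is a simplified TZS-like expanding-diamond search over SAD costs. The block size, search range, and greedy update are assumptions of this sketch; it does not model the paper's PU loop-unrolling or search-point sharing schemes, nor the raster and refinement stages of the full TZS.

```python
import numpy as np

def sad(cur, ref, cx, cy, mvx, mvy, block=8):
    """Sum of absolute differences for one candidate motion vector
    (mvx, mvy) of the block whose top-left corner is (cx, cy)."""
    h, w = ref.shape
    x, y = cx + mvx, cy + mvy
    if x < 0 or y < 0 or x + block > w or y + block > h:
        return np.inf                      # candidate falls outside the frame
    cur_blk = cur[cy:cy + block, cx:cx + block].astype(np.int32)
    ref_blk = ref[y:y + block, x:x + block].astype(np.int32)
    return int(np.abs(cur_blk - ref_blk).sum())

def zone_search(cur, ref, cx, cy, search_range=16, block=8):
    """Simplified TZS-like search: evaluate diamond candidates at doubling
    step sizes around the best point found so far."""
    center, best_cost = (0, 0), sad(cur, ref, cx, cy, 0, 0, block)
    step = 1
    while step <= search_range:
        candidates = [(center[0] + step, center[1]), (center[0] - step, center[1]),
                      (center[0], center[1] + step), (center[0], center[1] - step)]
        for mvx, mvy in candidates:
            cost = sad(cur, ref, cx, cy, mvx, mvy, block)
            if cost < best_cost:
                best_cost, center = cost, (mvx, mvy)
        step *= 2
    return center, best_cost

# Example call on a synthetic 64x64 frame pair (values are illustrative only).
rng = np.random.default_rng(0)
cur = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
ref = np.roll(cur, shift=(1, 3), axis=(0, 1))
print(zone_search(cur, ref, cx=16, cy=16))   # best motion vector and its SAD
```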

Proceedings ArticleDOI
01 Nov 2010
TL;DR: An area-efficient, low-energy, high-speed pipelined architecture for a Reed-Solomon decoder based on the Decomposed Inversionless Berlekamp-Massey Algorithm, in which the error-locator and error-evaluator polynomials can be computed serially.
Abstract: This paper proposes an area-efficient, low-energy, high-speed pipelined architecture for a Reed-Solomon decoder based on the Decomposed Inversionless Berlekamp-Massey Algorithm, in which the error-locator and error-evaluator polynomials can be computed serially. In the proposed architecture, a new scheduling of t finite-field multipliers (FFMs) is used to calculate the error-locator and error-evaluator polynomials, achieving a good balance between area, latency, and throughput. This architecture is tested in two different decoders. The first is a pipelined two-parallel decoder, in which two-parallel syndrome computation and two-parallel Chien search are used. The second is a conventional pipelined decoder, in which conventional syndrome computation and Chien search are used. Both decoders have been implemented with 0.13-µm CMOS IBM standard cells. The two-parallel RS(255, 239) decoder has a gate count of 37.6K and an area of 1.18 mm²; simulation results show that this approach works successfully at a data rate of 7.4 Gbps with a power dissipation of 50 mW. The conventional RS(255, 239) decoder has a gate count of 30.7K and an area of 0.99 mm²; simulation results show that this approach works successfully at a data rate of 4.85 Gbps with a power dissipation of 29.28 mW.

7 citations
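For context, below is a minimal Python sketch of syndrome computation and a generic inversionless Berlekamp-Massey iteration for RS(255, 239) over GF(2^8). The primitive polynomial 0x11D and the serial, software-only formulation are assumptions of this sketch; it does not reproduce the paper's decomposed scheduling of t FFMs or its pipelined hardware.

```python
# GF(2^8) arithmetic via log/antilog tables. The primitive polynomial
# 0x11D (x^8 + x^4 + x^3 + x^2 + 1) is an assumption; the paper does not
# state its field representation.
PRIM = 0x11D
EXP, LOG = [0] * 512, [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def syndromes(rx_word, two_t=16):
    """S_i = r(alpha^i) for i = 1..2t, via Horner's rule.
    rx_word[0] holds the coefficient of the highest power x^254."""
    syn = []
    for i in range(1, two_t + 1):
        s = 0
        for c in rx_word:
            s = gf_mul(s, EXP[i]) ^ c
        syn.append(s)
    return syn

def inversionless_bm(syn, t=8):
    """Generic inversionless Berlekamp-Massey: returns a scaled
    error-locator polynomial (low-order coefficient first)."""
    lam = [1] + [0] * t          # Lambda(x)
    b = [1] + [0] * t            # B(x)
    gamma, L = 1, 0
    for r in range(2 * t):
        # Discrepancy delta = sum_{i=0..L} Lambda_i * S_{r-i}.
        delta = 0
        for i in range(L + 1):
            delta ^= gf_mul(lam[i], syn[r - i])
        # Lambda'(x) = gamma*Lambda(x) + delta*x*B(x)  (char-2 field: + is XOR).
        new_lam = [gf_mul(gamma, lam[i]) ^ (gf_mul(delta, b[i - 1]) if i else 0)
                   for i in range(t + 1)]
        if delta != 0 and 2 * L <= r:
            b, L, gamma = lam, r + 1 - L, delta
        else:
            b = [0] + b[:-1]     # B(x) <- x*B(x)
        lam = new_lam
    return lam

# Toy check: all-zero codeword with a single error of value 7 at power x^5.
rx_word = [0] * 255
rx_word[254 - 5] = 7
lam = inversionless_bm(syndromes(rx_word))
assert lam[1] == gf_mul(lam[0], EXP[5])   # locator proportional to 1 + alpha^5 * x
```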

Proceedings ArticleDOI
Chenxin Zhang, Hemanth Prabhu, Liang Liu, Ove Edfors, Viktor Öwall
01 Nov 2012
TL;DR: This paper presents a low-complexity energy efficient channel pre-processing update scheme, targeting the emerging 3GPP long term evolution advanced (LTE-A) downlink, and has been designed as a dedicated unit in a 65 nm CMOS technology.
Abstract: This paper presents a low-complexity, energy-efficient channel pre-processing update scheme targeting the emerging 3GPP Long Term Evolution Advanced (LTE-A) downlink. Upon channel matrix renewals, the number of explicit QR decompositions (QRDs) and channel matrix inversions is reduced, since only the upper triangular matrices R and R−1 are updated, based on an on-line update decision mechanism. The proposed channel pre-processing updater has been designed as a dedicated unit in a 65 nm CMOS technology, resulting in a core area of 0.242 mm² (equivalent gate count of 116K). Running at a 330 MHz clock, each QRD or R−1 update consumes 4 or 2 times less energy, respectively, compared to one exact state-of-the-art QRD in the open literature.

7 citations
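Since the abstract does not spell out the update rule itself, the sketch below only models the two baseline operations whose repetition the proposed updater avoids: an explicit QR decomposition of the channel matrix and a back-substitution inverse of the upper-triangular factor R. The 4 × 4 complex channel and the numpy formulation are assumptions for illustration.

```python
import numpy as np

def qrd_and_rinv(H):
    """Explicit QR decomposition of a channel matrix H and back-substitution
    inverse of the upper-triangular factor R. The paper's on-line update
    decision mechanism is not described in the abstract and is not modeled."""
    Q, R = np.linalg.qr(H)
    n = R.shape[0]
    R_inv = np.zeros_like(R)
    # Column-by-column back-substitution: solve R @ R_inv[:, j] = e_j.
    for j in range(n):
        e = np.zeros(n, dtype=R.dtype)
        e[j] = 1.0
        col = np.zeros(n, dtype=R.dtype)
        for i in range(n - 1, -1, -1):
            col[i] = (e[i] - R[i, i + 1:] @ col[i + 1:]) / R[i, i]
        R_inv[:, j] = col
    return Q, R, R_inv

# Example with an assumed 4x4 complex MIMO channel.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, R, R_inv = qrd_and_rinv(H)
print(np.allclose(R @ R_inv, np.eye(4)))   # True: R_inv inverts R
```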

Proceedings ArticleDOI
26 Mar 2007
TL;DR: A four-context ORGA architecture and a multi-context holographic memory recording system used for it are proposed, and experimentally demonstrated results of recording a holographic memory and reconfiguring an ORGA-VLSI are described.
Abstract: Optically reconfigurable gate arrays (ORGAs) offer the possibility of providing a virtual gate count that is much larger than those of currently available VLSIs by exploiting the large storage capacity of a holographic memory. The first ORGA was developed to achieve rapid reconfiguration and a number of reconfiguration contexts; it consisted of a gate array VLSI, a holographic memory, and a laser diode array. The ORGA achieved a 16 μs to 20 μs reconfiguration period that was faster than that of FPGAs, with 100 reconfiguration contexts. However, the ORGA requires the gate array to halt during reconfiguration. Therefore, the ORGA cannot be reconfigured frequently because of the associated reconfiguration overhead. On the other hand, new ORGA-VLSIs that have less than 10 ns reconfiguration capability without any related overhead have already been fabricated. However, to date, a multi-holographic reconfiguration system that is suitable for such rapidly reconfigurable ORGA-VLSIs without any overhead has never been developed. Toward such a realization, this paper proposes a four-context ORGA architecture and a multi-context holographic memory recording system used for it. In addition, experimentally demonstrated results of recording a holographic memory and reconfiguring an ORGA-VLSI are described.

7 citations


Network Information
Related Topics (5)
CMOS: 81.3K papers, 1.1M citations, 84% related
Electronic circuit: 114.2K papers, 971.5K citations, 81% related
Integrated circuit: 82.7K papers, 1M citations, 80% related
Transistor: 138K papers, 1.4M citations, 79% related
Decoding methods: 65.7K papers, 900K citations, 77% related
Performance Metrics
No. of papers in the topic in previous years:
Year  Papers
2023  6
2022  19
2021  51
2020  47
2019  38
2018  47