Topic

Gate count

About: Gate count is a research topic. Over the lifetime, 1020 publications have been published within this topic receiving 13535 citations.


Papers
Proceedings Article
01 Dec 2008
TL;DR: The authors present the first programmable ORGA architecture together with experimental results, and discuss the availability of the architecture and future plans.
Abstract: Recently, optically reconfigurable gate arrays (ORGAs) consisting of a gate array VLSI, a holographic memory, and a laser array have been developed to achieve huge virtual gate counts that are much larger than those of currently available VLSIs. Using the ORGA architecture, VLSIs with gate counts greater than 1 tera are possible by exploiting the storage capacity of a holographic memory. Conventional ORGAs have only one shortcoming compared with current field programmable gate arrays (FPGAs): they are not reprogrammable after fabrication. To reprogram an ORGA, the holographic memory must be disassembled from its ORGA package, reprogrammed outside the package using a holographic memory writer, and reinstalled in the package with a precision beyond that available by manual assembly. To address that shortcoming, this paper presents the world's first programmable ORGA architecture and experimental results. Furthermore, in light of those experimental results, this paper discusses the availability of this architecture and future plans.
Patent
27 Oct 2020
TL;DR: An optimized multiport NVMe controller on a single die is proposed that reduces area and gate count for multipath I/O requirements over prior implementations without compromising any performance requirements.
Abstract: This provides an optimized multiport NVMe controller on a single die that significantly reduces area and gate count for multipath I/O requirements over prior implementations without compromising any performance requirements. The arrangement implements minimal logic per NVMe controller as per NVMe specification requirements and implements shared logic for all common functions. This results in the desired substantial savings in gate count and area. The optimized multiport NVMe controller is used in a multipath I/O-based memory subsystem where multiple hosts access Namespaces through their own dedicated queues. Illustratively, the optimized multiport NVMe controller shares common logic among NVMe controllers, providing an area-efficient solution for multipath I/O implementations. The logic shared across all NVMe controllers comprises the DMA Engine (a hardware block that handles all NVMe commands based on PRP or SGL pointers), the Firmware Request Queue (FWRQ), the Firmware Completion Queue (FWCQ), and the DMA Completion Queue (DMACQ).
01 Jan 2011
TL;DR: This study designs a variable block size motion estimation (VBSME) engine based on hybrid grained processing elements (PEs) and a 2D programmable interconnect structure, which is adaptive to all block size configurations of H.264.
Abstract: This study contributes to the domain of application specific adaptive hardware architectures with a design approach on processing element array, interconnect structure and memory interface concurrently. As summarized below, our architectural design choices push the limits of on-chip data reuse and avoid redundant computations that are essential for the high throughput, small area, and low power demands of the consumer market. Motion estimation (ME) is a key component in the H.264/AVC standard. Full Search (FS) based ME achieves optimal peak signal-to-noise-ratio (PSNR), and is the most adopted algorithm for developing hardware motion estimators. In this study, we first design a variable block size motion estimation (VBSME) engine based on hybrid grained processing elements (PEs) and a 2D programmable interconnect structure, which is adaptive to all block size configurations of H.264. PEs operate in bit-serial manner using MSB-first arithmetic for early termination to reduce the amount of computations, and the 2D architecture enables on-chip data reuse between neighboring PEs in a bit-by-bit pipelined fashion. Our design reduces the gate count by 7x compared to its ASIC counterpart, operates at a comparable frequency while sustaining 30 and 60 frames per second (fps); and outperforms bit parallel and bit
Journal Article
TL;DR: Cost-effective two-dimensional (2D) discrete cosine transform (DCT) and inverse DCT architectures capable of supporting multiple standards of MPEG, H.264 and VC-1 are presented.
Abstract: Cost-effective two-dimensional (2D) discrete cosine transform (DCT) and inverse DCT architectures capable of supporting multiple standards of MPEG, H.264 and VC-1 are presented. The proposed core utilises a 1D core and a transposed memory to achieve a low cost design. Multi-level factor share is implemented in conjunction with distributed arithmetic in a system to enable the sharing of the coefficient matrix circuit in order to reduce hardware costs. The proposed approach employs a time-distribution scheme to enable the simultaneous processing of the first and second dimensions to enhance throughput. A high efficiency of this approach was verified by fabricating a test chip using the TSMC 0.18 μm CMOS process. The architecture has an operating frequency of 200 MHz, and throughput of 800 M-pixels/s with a gate count of 44.5 K.
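The clock rate and throughput quoted in this abstract imply a fixed number of pixels completed per clock cycle. The following is a minimal sketch of that arithmetic, not the authors' method: the function name and the per-kilogate efficiency figure are illustrative, and "44.5 K" is assumed to mean 44,500 gates.

```python
def pixels_per_cycle(throughput_pixels_per_s: float, clock_hz: float) -> float:
    """Average number of pixels completed per clock cycle."""
    return throughput_pixels_per_s / clock_hz


# Figures taken from the abstract above.
CLOCK_HZ = 200e6        # 200 MHz operating frequency
THROUGHPUT = 800e6      # 800 M-pixels/s
GATE_COUNT = 44.5e3     # assumption: "44.5 K" read as 44,500 gates

ppc = pixels_per_cycle(THROUGHPUT, CLOCK_HZ)

# Illustrative efficiency metric (not from the paper):
# throughput delivered per thousand gates.
pixels_per_s_per_kgate = THROUGHPUT / (GATE_COUNT / 1e3)

print(f"pixels per cycle: {ppc}")
print(f"pixels/s per kilogate: {pixels_per_s_per_kgate:.3e}")
```

Under these figures the core completes four pixels per clock on average.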
Journal Article
TL;DR: A block-parallel interpolation architecture for high-performance H.264/AVC Fractional Motion Estimation in real-time 8K UHD video processing is proposed to improve throughput and minimize redundant storage of reference pixels.
Abstract: In this paper, we propose a block-parallel interpolation architecture for high-performance H.264/AVC Fractional Motion Estimation in real-time 8K UHD video processing. To improve throughput, we design a block-parallel interpolator. To supply the reference data for interpolation, we design a 2D cache buffer consisting of memory arrays. We minimize redundant storage of reference pixels by applying the Search Area Stripe Reuse (SASR) scheme, and implement a high-speed plane interpolator with a 3-stage pipeline (horizontal/vertical 1/2 interpolation, diagonal 1/2 interpolation, and 1/4 interpolation). The proposed architecture was simulated with a 0.13 μm standard cell library. The gate count is 436.5 K gates. The proposed H.264/AVC Fractional Motion Estimation can support 8K UHD at 30 frames per second when running at 187 MHz.
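The real-time claim in this abstract can be turned into a per-cycle processing budget. The sketch below assumes the common 8K UHD frame size of 7680 x 4320 pixels, which the abstract does not state, and uses the 16 x 16 macroblock size of H.264; the function and variable names are illustrative.

```python
def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels that must be processed per second for real-time video."""
    return width * height * fps


# Assumption: 8K UHD taken as 7680 x 4320 (not stated in the abstract).
WIDTH, HEIGHT = 7680, 4320
FPS = 30                # from the abstract
CLOCK_HZ = 187e6        # 187 MHz, from the abstract
MB_SIZE = 16            # H.264 macroblock is 16 x 16 pixels

rate = pixel_rate(WIDTH, HEIGHT, FPS)
pixels_per_cycle_needed = rate / CLOCK_HZ

# Macroblocks per second and the clock-cycle budget for each one.
macroblocks_per_s = (WIDTH // MB_SIZE) * (HEIGHT // MB_SIZE) * FPS
cycles_per_macroblock = CLOCK_HZ / macroblocks_per_s

print(f"pixel rate: {rate} pixels/s")
print(f"pixels per cycle needed: {pixels_per_cycle_needed:.2f}")
print(f"cycles per macroblock: {cycles_per_macroblock:.1f}")
```

Under these assumptions the design has roughly 48 clock cycles per macroblock, which gives a sense of why block-parallel interpolation is needed at this resolution.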

Network Information
Related Topics (5)
CMOS
81.3K papers, 1.1M citations
84% related
Electronic circuit
114.2K papers, 971.5K citations
81% related
Integrated circuit
82.7K papers, 1M citations
80% related
Transistor
138K papers, 1.4M citations
79% related
Decoding methods
65.7K papers, 900K citations
77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    6
2022    19
2021    51
2020    47
2019    38
2018    47