scispace - formally typeset
Search or ask a question
Topic

Clock gating

About: Clock gating is a research topic. Over the lifetime, 7838 publications have been published within this topic receiving 107903 citations.


Papers
More filters
Proceedings ArticleDOI
03 Dec 2003
TL;DR: Two statemachines that track parallelism on-the-fly are introduced, and the supply voltage is scaled depending on the level of parallelism, to avoid problems with circuit-level complexity concerns, stability and signal-propagationspeed issues which limit how fast VSV may transition between the voltages, and energy overhead factors which disallow supply-voltage scaling of largeRAM structures such as caches and register file.
Abstract: Energy efficient processor design is becoming more and more important with technology scaling and with high performance requirements. Supply-voltage scaling is an efficient way to reduce energy by lowering the operating voltage and the clock frequency of processor simultaneously. We propose a variable supply-voltage scaling (VSV) technique based on the following key observation: upon an L2 miss, the pipeline performs some independent computations but almost always ends up stalling and waiting for data, despite out-of-order issue and other latency-hiding techniques. Therefore, during an L2 miss we scale down the supply voltage of certain sections of the processor in order to reduce power dissipation while it carries on the independent computations at a lower speed. However, operating at a lower speed may degrade performance, if there are sufficient independent computations to overlap with the L2 miss. Similarly, returning to high speed may degrade power savings, if there are multiple outstanding misses and insufficient independent computations to overlap with them. To avoid these problems, we introduce two state machines that track parallelism on-the-fly, and we scale the supply voltage depending on the level of parallelism. We also consider circuit-level complexity concerns which limit VSV to two supply voltages, stability and signal-propagation speed issues which limit how fast VSV may transition between the voltages, and energy overhead factors which disallow supply-voltage scaling of large RAM structures such as caches and register file. Our simulation shows that VSV achieves an average of 20.7% total processor power reduction with 2.0% performance degradation in an 8-way, out-of-order-issue processor that implements deterministic clock gating and software prefetching, for those SPEC2K benchmarks that have high L2 miss rates. Averaging across all the benchmarks, VSV reduces total processor power by 7.0% with 0.9% performance degradation.

64 citations

Proceedings ArticleDOI
31 May 2005
TL;DR: The experimental results show that the link based non-tree approach can reduce the maximal skew by 47, improve the skew yield from 15% to 73% on average with a decrease on the total wire and buffer capacitance.
Abstract: Clock skew is becoming increasingly difficult to control due to variations. Link based non-tree clock distribution is a cost-effective technique for reducing clock skew variations. However, previous works based on this technique were limited to unbuffered clock networks and neglected spatial correlations in the experimental validation. In this work, we overcome these shortcomings and make the link based non-tree approach feasible for realistic designs. The short circuit risk and multi-driver delay issues in buffered non-tree clock networks are investigated. Our approach is validated with SPICE based Monte Carlo simulations, considering spatial correlations among variations. The experimental results show that our approach can reduce the maximal skew by 47%, improve the skew yield from 15% to 73% on average with a decrease on the total wire and buffer capacitance.

64 citations

Patent
David J. Ayers1, Edward T. Grochowski1, David J. Sager1, Vivek Tiwari1, Ian Young1 
14 Jun 2001
TL;DR: In this article, a mechanism for adjusting the activity of an integrated digital circuit such as a processor to reduce voltage changes attributable to current changes triggered by clock gating is presented, where the processor includes one or more functional units and a current control circuit that monitors activity states of the processor's functional units to estimate the current consumed over n clock cycles.
Abstract: The present invention provides a mechanism for adjusting the activity of an integrated digital circuit such as a processor to reduce voltage changes attributable to current changes triggered by clock gating. The processor includes one or more functional units and a current control circuit that monitors activity states of the processor's functional units to estimate the current consumed over n clock cycles. The current control circuit estimates the current change for a given clock cycle from the n activity states and compares the estimated current change with first and second thresholds. The processors activity is decreased if the estimated current change is greater than the first threshold, and the processor activity is decreased if the estimated current change is less than the second threshold.

64 citations

Patent
11 Aug 1998
TL;DR: In this article, the authors propose a power-down control circuit that outputs an internal clock signal such that an output signal of a command decoder can be latched, and a period of time from the latching of the command to the time when the command can be transferred will be reduced.
Abstract: When a clock enable signal asynchronous with a clock signal is set at a high level, a power-down control circuit sets a power-down signal at a high level to release a power-down mode. When the power-down mode is released, a clock control circuit outputs an internal clock signal such that an output signal of a command decoder can be latched. According to such a constitution, a period of time from the latching of the command after releasing the power-down mode to the time when the command can be transferred will be reduced, and a high-speed operation can be attained.

64 citations

Patent
Shekhar Borkar1, Stephen R. Mooney1
26 Jun 1995
TL;DR: In this paper, a daisy chained clock distribution scheme for distributing a clock signal from a central communications clock driver to the nodes of a massively parallel multi-processor computer or supercomputer is presented.
Abstract: A daisy chained clock distribution scheme for distributing a clock signal from a central communications clock driver to the nodes of a massively parallel multi-processor computer or supercomputer. The daisy chained clocking scheme is implemented using point-to-point clock distribution of a differential clock signal to the communication nodes of a plurality of processors in a multicomputer system or to components connected to a common bus in a high speed microprocessor system. Differential signaling is employed wherein the differentiality is maintained including through silicon. In an alternate embodiment, the clock pulse is also regenerated in each node component.

64 citations


Network Information
Related Topics (5)
CMOS
81.3K papers, 1.1M citations
89% related
Integrated circuit
82.7K papers, 1M citations
85% related
Electronic circuit
114.2K papers, 971.5K citations
85% related
Semiconductor memory
45.4K papers, 663.1K citations
83% related
Transistor
138K papers, 1.4M citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202324
202231
202137
202050
201968
201884