scispace - formally typeset
Search or ask a question

Showing papers on "Clock gating published in 2002"


Proceedings ArticleDOI
02 Feb 2002
TL;DR: An alternative approach is described, which is called a multiple clock domain (MCD) processor, in which the chip is divided into several clock domains, within which independent voltage and frequency scaling can be performed.
Abstract: As clock frequency increases and feature size decreases, clock distribution and wire delays present a growing challenge to the designers of singly-clocked, globally synchronous systems. We describe an alternative approach, which we call a multiple clock domain (MCD) processor, in which the chip is divided into several clock domains, within which independent voltage and frequency scaling can be performed. Boundaries between domains are chosen to exploit existing queues, thereby minimizing inter-domain synchronization costs. We propose four clock domains, corresponding to the front end , integer units, floating point units, and load-store units. We evaluate this design using a simulation infrastructure based on SimpleScalar and Wattch. In an attempt to quantify potential energy savings independent of any particular on-line control strategy, we use off-line analysis of traces from a single-speed run of each of our benchmark applications to identify profitable reconfiguration points for a subsequent dynamic scaling run. Using applications from the MediaBench, Olden, and SPEC2000 benchmark suites, we obtain an average energy-delay product improvement of 20% with MCD compared to a modest 3% savings from voltage scaling a single clock and voltage system.

508 citations


Journal ArticleDOI
01 May 2002
TL;DR: This study indicates that further pipelining can at best improve performance of integer programs by a factor of 2 over current designs, and proposes and evaluates a high-frequency design called a segmented instruction window.
Abstract: Microprocessor clock frequency has improved by nearly 40% annually over the past decade. This improvement has been provided, in equal measure, by smaller technologies and deeper pipelines. From our study of the SPEC 2000 benchmarks, we find that for a high-performance architecture implemented in 100nm technology, the optimal clock period is approximately 8 fan-out-of-four (FO4) inverter delays for integer benchmarks, comprised of 6 FO4 of useful work and an overhead of about 2 FO4. The optimal clock period for floating-point benchmarks is 6 FO4. We find these optimal points to be insensitive to latch and clock skew overheads. Our study indicates that further pipelining can at best improve performance of integer programs by a factor of 2 over current designs. At these high clock frequencies it will be difficult to design the instruction issue window to operate in a single cycle. Consequently, we propose and evaluate a high-frequency design called a segmented instruction window.

249 citations


Proceedings ArticleDOI
08 Apr 2002
TL;DR: In this paper, the authors present an interface wrapper circuit for point-to-point communication in a globally asynchronous locally synchronous (GALS) system, a possible methodology for managing the predicted increase in clock domains.
Abstract: Reliable, low-latency channel communication between independent clock domains may be achieved using a combination of clock pausing techniques, self-calibrating delay lines and an asynchronous interconnect. Such a scheme can be used for point-to-point communication in a globally asynchronous locally synchronous (GALS) system, a possible methodology for managing the predicted increase in clock domains. We present interface wrapper circuits which permit communication between a locally synchronous producer and a locally synchronous consumer via an asynchronous interconnect. Such interfaces can also be used to mix asynchronous and synchronous modules. Clock pausing is used to guarantee that metastability will never result in failure. Arbitration between channel communication and the local clock is performed concurrently so that metastability resolution will rarely delay the clock. Simulation results show that the maximum performance of one data item per consumer clock cycle is achieved when the producer: consumer clock ratio is equal or greater to one.

131 citations


Patent
30 Sep 2002
TL;DR: In this article, a clock frequency control unit for an integrated circuit (IC) includes a clock generator, a finite state machine (FSM), and a gating circuit (GC).
Abstract: A clock frequency control unit for an integrated circuit (IC) includes a clock generator, a finite state machine (FSM), and a gating circuit (GC). The FSM has at least first and second states corresponding to non-low workload low workload states, respectively. In the first state, the GC provides a clock signal to functional units of the IC with the same frequency as the clock generator output. In the second state, the GC reduces the frequency of the clock signal. In one embodiment, the GC masks out selected cycles of the clock generator output to reduce the clock signal frequency. The FSM monitors the operation of the IC to transition from the first state to the second state when selected “low workload” conditions are detected (e.g., long latency cache miss). Similarly, the FSM transitions from the second state to the first state when selected “non-low workload” conditions are detected.

128 citations


Proceedings ArticleDOI
Victor Zyuban1, Philip N. Strenski1
12 Aug 2002
TL;DR: A metric of hardware intensity is proposed, which is useful for evaluating issues that affect both circuits and architecture, and derived for the optimal balance between the architectural complexity, hardware intensity and power supply.
Abstract: Evaluation of architectural tradeoffs is complicated by implications in the circuit domain which are typically not captured in the analysis but substantially affect the results. We propose a metric of hardware intensity (/spl eta/), which is useful for evaluating issues that affect both circuits and architecture. Analyzing data for actual designs we show how to measure the introduced parameters and discuss variations between observed results and common theoretical assumptions. For a power-efficient design we derive relations for /spl eta/ and supply voltage V under progressively more general situations, and incorporate /spl eta/ into a prior art architectural energy-efficiency criterion. Then, a more general relation is derived for the optimal balance between the architectural complexity, hardware intensity and power supply. Modified forms for these relations are obtained in special cases where the supply voltage is constrained or when clock gating is disallowed.

117 citations


Patent
30 Sep 2002
TL;DR: In this paper, a basic holdover loop for retaining the current reconstructed clock frequency signal receives weighted corrections from an open loop and a network time protocol filter loop, which then synchronizes the clock of the local telecommunications network.
Abstract: Techniques for synchronizing the clock of a local telecommunications network connected to a remote clock source through an asynchronous transport network such as an Ethernet metropolitan area transport network. A basic holdover loop for retaining the current reconstructed clock frequency signal receives weighted corrections from an open loop and a network time protocol filter loop. The open loop measures data packet interarrival times on the local network and calculates a first reconstructed clock frequency signal. The network time protocol loop applies network time protocol to generate timestamps over the asynchronous transport network which are used to generate a second reconstructed clock frequency signal. The first and second reconstructed clock frequency signals are combined using dynamically adjusted weight factors and compared with the current reconstructed clock frequency signal to correct the latter which then synchronizes the clock of the local telecommunications network.

114 citations


Journal ArticleDOI
TL;DR: This paper presents theory and algorithms for building a low-power clock tree by distributing the clock signal at a lower voltage and translating it to a higher voltage at the utilization points, using reduced swing and multiple-supply voltages.
Abstract: Clock networks account for a significant fraction of the power dissipation of a chip and are critical to performance. This paper presents theory and algorithms for building a low-power clock tree by distributing the clock signal at a lower voltage and translating it to a higher voltage at the utilization points. Two low-power schemes are used: reduced swing and multiple-supply voltages. We analyze the issue of tree construction and present conclusions relevant to various technology generations according to the NTRS. Our experimental results show that power savings of an average of 45% are possible for a 0.25 /spl mu/m technology using multiple supply voltages, and about 32% using a single external supply voltage.

112 citations


Patent
14 Jan 2002
TL;DR: In this paper, a crosspoint switch circuit generates both a master bit clock and a master word clock signal from an incoming high-speed serial data stream using a clock and data recovery circuit.
Abstract: An asynchronous serial crosspoint switch is word synchronized to each of a number of transceiver circuits. The crosspoint switch circuit generates both a master bit clock and a master word clock signal. A transceiver circuit recovers the master bit clock signal from an incoming high-speed serial data stream using a clock and data recovery circuit. The recovered bit clock signal is used as a timing signal by which data is serialized and transmitted to the crosspoint switch circuit. The data stream transmitted to the switch circuit is frequency locked to the master bit clock signal, such that the serial data stream need only be phase adjusted with a data recovery circuit. To recover word timing, the switch circuit issues alignment words to the transceivers during link initialization. The transceivers perform word alignment and establish a local word lock. Alignment words are then reissued to the switch circuit using the local word clock. The switch circuit compares the boundary of the received word clock to the master word clock and, if misaligned, the transceiver shifts its transmitted word by one bit and retries. Necessary edge transition density is provided by overhead bits which also designate special command words asserted between a transceiver and a switch circuit. Flow control information is routed from a receiving transceiver back to the transmitting transceiver using the overhead bits in order to assert a ready-to-receive or a not-ready-to-receive flow control signal. The overhead bits additionally provide information regarding connection requests and other information.

105 citations


Proceedings ArticleDOI
08 Apr 2002
TL;DR: A novel technique based on local clock gating and synchronous handshake protocols that achieves stage level interlocking characteristics in synchronous pipelines similar to that of asynchronous pipelines is presented.
Abstract: Locality principles are becoming paramount in controlling advancement of data through pipelined systems. Achieving fine grained power down and progressive pipeline stalls at the local stage level is therefore becoming increasingly, important to enable lower dynamic power consumption while keeping introduced switching noise under control as well as avoiding global distribution of timing critical stall signals. It has long been known that the interlocking properties of as asynchronous pipelined systems have a potential to provide such benefits. However it has not been understood how such interlocking can be achieved in synchronous pipelines. This paper presents a novel technique based on local clock gating and synchronous handshake protocols that achieves stage level interlocking characteristics in synchronous pipelines similar to that of asynchronous pipelines. The presented technique is directly applicable to traditional synchronous pipelines and works equally well for two-phase clocked pipelines based on transparent latches, as well as one-phase clocked pipelines based on master-slave latches.

103 citations


Journal ArticleDOI
TL;DR: A high level model for evaluating the energy dissipation of the clock generation and distribution circuitry, including both the dynamic and leakage power components is proposed and validation results show that the model is reasonably accurate.
Abstract: The clock distribution and generation circuitry forms a critical component of current synchronous digital systems and is known to consume at least a quarter of the power budget of existing microprocessors. We propose and validate a high level model for evaluating the energy dissipation of the clock generation and distribution circuitry, including both the dynamic and leakage power components. The validation results show that the model is reasonably accurate, with the average deviation being within 10% of SPICE simulations. Access to this model can enable further research at high-level design stages in optimizing the system clock power. To illustrate this, a few architectural modifications are considered and their effect on the clock subsystem and the total system power budget is assessed.

97 citations


Proceedings ArticleDOI
10 Nov 2002
TL;DR: This paper study the effect of using a Globally Asynchronous Locally Synchronous (GALS) organization for a superscalar, out-of-order processor, both in terms of power and performance.
Abstract: Due to increasing clock speeds, increasing design sizes and shrinking technologies, it is becoming more and more challenging to distribute a single global clock throughout a chip. In this paper we study the effect of using a Globally Asynchronous Locally Synchronous (GALS) organization for a superscalar, out-of-order processor, both in terms of power and performance. To this end, we propose a novel modeling and simulation environment for multiple clock cores with static or dynamically variable voltages for each synchronous block. Using this design exploration environment we were able to assess the power/performance tradeoffs available for Multiple Clock, Single Voltage (MCSV), as well as Multiple Clock, Dynamic Voltage (MCDV) cores. Our results show that MCSV processors are 10% more power efficient when compared to single-clock single voltage designs with a performance penalty of about 10%. By exploiting the flexibility of independent dynamic voltage scaling the various clock domains, the power efficiency of GALS designs can be improved by 12% on average, and up to 20% more in select cases. The power efficiency of MCDV cores becomes comparable with the one of Single Clock, Dynamic Voltage (SCDV) cores, while being up to 8% better in some cases. Our results show that MCDV cores consume 22% less power at an average 12% performance loss.

Journal ArticleDOI
TL;DR: A low-swing clock double-edge triggered flip-flop (LSDBF) is developed to reduce power consumption significantly compared to conventional FFs and avoids unnecessary internal node transition and reduces conflicting currents.
Abstract: A low-swing clock double-edge triggered flip-flop (LSDFF) is developed to reduce power consumption significantly compared to conventional flip-flops. The LSDFF avoids unnecessary internal node transitions to reduce power consumption. In addition, power consumption in the clock tree is reduced because LSDFF uses a double-edge triggered operation as well as a low-swing clock. To prevent performance degradation of the LSDFF due to low-swing clock, low-V/sub t/ transistors are used for the clocked transistors without significant leakage current problems. The power saving in flip-flop operation is estimated to be 28.6% to 49.6% with additional 78% power saving in the clock network.

Patent
Bong Kyun Kim1
05 Mar 2002
TL;DR: In this paper, a glitch free clock multiplexer circuit with a state region transition generating unit is presented, where a clock signal can be outputted by removing a glitch occurred in a clock conversion due to a timing difference between a plurality of asynchronous clock signals and a selection signal.
Abstract: In a glitch free clock multiplexer circuit and a method thereof, the glitch free clock multiplexer circuit includes a delay unit for receiving asynchronous clock signals (Clock A, Clock B) and an external selection signal (Sel) and outputting a delay signal by delaying a clock signal selected by the external selection signal (Sel) for a certain clock cycle, a state region transition generating unit for comparing the delay signal with a count value provided from a user, outputting a first control signal (Sel_clock) according to a comparison value and a second control signal (enable) for controlling the first control signal in a logic low state, and a glitch removal unit for outputting a clock output signal (Clock_out) by performing an AND operation of a temporary clock signal (Temp_clock) selected by the first control signal and a third control signal generated by delaying the second control signal (enable) for a certain clock cycle. Accordingly, a glitch free clock signal can be outputted by removing a glitch occurred in a clock conversion due to a timing difference between a plurality of asynchronous clock signals and a selection signal.

Patent
20 Nov 2002
TL;DR: In this paper, a scheme for multi-frequency at-speed logic built-in self-test (BIST) is presented, which is applicable to testing of circuits with multiple clock domains which can be either the same frequency or different frequency.
Abstract: A scheme for multi-frequency at-speed logic Built-In Self Test (BIST) is provided. This scheme allows at-speed testing of very high frequency integrated circuits controlled by a clock signal generated externally or on-chip. The scheme is also applicable to testing of circuits with multiple clock domains which can be either the same frequency or different frequency. Scanable memory elements of the digital circuit are connected to define plurality of scan chains. The loading and unloading of scan chains is separated from the at-speed testing of the logic between the respective domains and may be done at a faster or slower frequency than the at-speed testing. The BIST controller, Pseudo-Random Pattern Generator (PRPG) and Multi-input Signature Register (MISR) work at slower frequency than the fastest clock domain. After loading of a new test pattern, a clock suppression circuit allows a scan enable signal to propagate for more that one clock cycle before multiple capture clock is applied. This feature relaxes the speed and skew constraints on scan enable signal design. Only the capture cycle is performed at the corresponding system timing. A programmable capture window makes it possible to test every intra- and inter-domain at-speed without the negative impact of clock skew between clock domains.

Patent
Simon M. Tam1, Stefan Rusu2
26 Jul 2002
TL;DR: In this article, a clock frequency of a clock signal generated from the clock generator is adjusted in response to an evaluation of the sampled values of the supply voltage at a plurality of locations.
Abstract: A method and an apparatus for dynamically varying a clock frequency in a processor to adapt to VCC voltage changes. The method of one embodiment includes sampling a supply voltage at a plurality of locations. The values of said supply voltage are communicated to a clock generator. A clock frequency of a clock signal generated from the clock generator is adjusted in response to an evaluation of the sampled values of the supply voltage.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: The clock distribution on the Power4 dual-processor chip supplies a single critical 1.5 GHz clock from one SOI-optimized PLL to 15,200 pins on a large chip with 20 ps skew and 35 ps jitter to achieve targets on schedule with no adjustment circuitry.
Abstract: The clock distribution on the Power4 dual-processor chip supplies a single critical 1.5 GHz clock from one SOI-optimized PLL to 15,200 pins on a large chip with 20 ps skew and 35 ps jitter. The network contains 64 tuned trees driving a single grid, and specialized tools to achieve targets on schedule with no adjustment circuitry.

Journal ArticleDOI
TL;DR: In this paper, an all-optical clock recovery circuit for operation with short data packets of 10-Gb/s rate is presented, using a Fabry-Perot etalon and a nonlinear UNI gate and is capable of acquiring the clock signal within a few bits.
Abstract: We demonstrate an all-optical clock recovery circuit for operation with short data packets of 10-Gb/s rate. The circuit uses a Fabry-Perot etalon and a nonlinear UNI gate and is capable of acquiring the clock signal within a few bits.

01 Jan 2002
TL;DR: This paper proposes a simple, efficient and low cost method to compensate for the clock frequencies difference that rely only on regular time stamped packets transmissions and may be used in many cases.
Abstract: Exchange of time stamped events between different stations raises the problem of the clock frequencies difference as soon as one station try to compensate for the transmission delay and to render the events with a minimum time distortion. We propose a simple, efficient and low cost method to compensate for the clock frequencies difference. This method rely only on regular time stamped packets transmissions and may be used in many cases. It provides good performances to the receiver station in regard of the sender reference time even on a heavily loaded communication channel. It operates also very efficiently on a low latency local network

Patent
25 Nov 2002
TL;DR: In this article, a method and apparatus for adjusting the clock frequency and voltage supplied to an integrated circuit is provided, and the slew rate of the clock is controlled so that at least a minimum required voltage for each operating frequency is provided.
Abstract: A method and apparatus for adjusting the clock frequency and voltage supplied to an integrated circuit is provided. A request signal is sent to the clock, and in response, the clock lowers the clock frequency supplied to the integrated circuit. A frequency detection circuit monitors the clock signal and causes a voltage regulator to reduce the voltage supplied to the integrated circuit in response to the reduced clock frequency. Similarly, a request signal is sent to the clock, and in response, the clock raises the clock frequency supplied to the integrated circuit. The frequency detection circuit monitors the clock signal and causes a voltage regulator to raise the voltage supplied to the integrated circuit in response to the increased clock frequency. The slew rate of the clock is controlled so that at least a minimum required voltage for each operating frequency is provided while the clock frequency is being changed. In this manner, reliable operation of the processor is assured while the clock speed and operating voltage are being changed.

Patent
Byong-mo Moon1
16 Dec 2002
TL;DR: In this article, a system includes modules, a clock generator that generates a first clock signal that is applied to the modules and a chipset that controls the modules, the chipset having a clock buffer that generates the second clock signal.
Abstract: A system includes modules, a clock generator that generates a first clock signal that is applied to the modules, and a chipset that controls the modules, the chipset having a clock buffer that generates a second clock signal. The system includes a first clock line that transfer the first clock signal to the clock buffer, the first clock line connected between the clock generator and a first termination circuit. The system includes a second clock line that transfer the second clock signal to the modules, the second clock line electrically isolated from the first clock line, the second clock line connected between the clock buffer and a second termination circuit.

PatentDOI
TL;DR: In this paper, a clock generator circuit is proposed to generate two pre-phase clock signals having approximately the same frequency for use in sampling an analog signal in a generally alternating fashion.
Abstract: The present invention relates to a clock generator circuit which comprises a clock generator subcircuit which is operable to generate two clock signals having approximately the same frequency for use in sampling an analog signal in a generally alternating fashion. The clock generator circuit further comprises a pre-phase clock generator subcircuit associated with the clock generator subcircuit which is operable to generate two pre-phase clock signals, wherein each are associated with a respective one of the two clock signals generated by the clock generator subcircuit. In the pre-phase clock generator circuit, a signal transition of each of the pre-phase clock signals occurs before a signal transition of the respective clock signal generated by the clock generator subcircuit; in addition, a timing of a falling edge of the pre-phase clock signals is dictated by a global clock signal. Thus the clock generator circuit avoids sampling error in a double-sampled sample and hold circuit and harmonic distortion associated therewith.

Patent
02 Jul 2002
TL;DR: In this article, a synchronous integrated circuit such as a scalar processor or superscalar processor is clocked by and synchronized to a common system clock, and a local clock generator in each clocked unit combines the common clock and stall status from one or more other units to adjust register clock frequency up or down.
Abstract: A synchronous integrated circuit such as a scalar processor or superscalar processor. Circuit components or units are clocked by and synchronized to a common system clock. At least two of the clocked units include multiple register stages, e.g., pipeline stages. A local clock generator in each clocked unit combines the common system clock and stall status from one or more other units to adjust register clock frequency up or down.

Patent
22 Jan 2002
TL;DR: In this article, a synchronous integrated circuit device including a clock receiver to receive an external clock signal and a plurality of output drivers to output data is presented, where a first portion of the data is output synchronously with respect to a rising edge transition of the external clock signals.
Abstract: A synchronous integrated circuit device including a clock receiver to receive an external clock signal and a plurality of output drivers to output data. A first portion of the data is output synchronously with respect to a rising edge transition of the external clock signal. A second portion of the data is output synchronously with respect to a falling edge transition of the external clock signal. In addition, the integrated circuit device includes a delay locked loop, coupled to the plurality of output drivers and the clock receiver, to synchronize the output of the first and second portions of the data with the external clock signal.

Patent
Leonard Forbes1
29 Oct 2002
TL;DR: In this paper, a method and apparatus for evaluating logical inputs electronically using electronic logic circuits in monotonic dynamic-static pseudo-NMOS configurations is presented, including alternating dynamic and static circuit portions adapted to transition monotonically in response to a common clock (or complemented clock) signal.
Abstract: A method and apparatus for evaluating logical inputs electronically using electronic logic circuits in monotonic dynamic-static pseudo-NMOS configurations. The apparatus includes alternating dynamic and static circuit portions adapted to transition monotonically in response to a common clock (or complemented clock) signal. The circuit portions include pseudo-NMOS configured switching circuits implementing logical functions.

Patent
13 Feb 2002
TL;DR: In this paper, a system for controlling the duty cycle of a clock signal is presented. But the system is not suitable for the use of a single-input single-output (SISO) circuit.
Abstract: A system for controlling the duty cycle of a clock signal. The system includes a duty cycle adjustment circuit that receives an input clock signal and generates an output clock signal. The duty cycle adjustment circuit charges a capacitor when the input clock signal has a first logic level and discharges the capacitor with the input clock signal has a second logic level. The rates of charge and discharge are controlled by first and second control signals. When the capacitor has been charged to a first transition level, the output clock signal transitions to a first logic level, and when the capacitor has been discharged to a second transition level, the output clock signal transitions to a second logic level. The first and second control signals are supplied by a feedback circuit, which is implemented using an integrator circuit that receives the output clock signal and generates a feedback signal indicative of the duty cycle of the output clock signal. A transconductance amplifier compares the feedback signal to a reference voltage, and generates first and second currents corresponding thereto. These currents are converted to the first and second control signals by a control circuit, which includes a current mirror. The control circuit provides good immunity from power supply fluctuations.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: A comprehensive framework for the estimation of systemwide and clock sub-system power as function of technology scaling is presented, indicating that clock power will remain a significant contributor to the total chip power, as long as techniques are used to limit leakage power consumption.
Abstract: The clock distribution and generation circuitry is known to consume more than a quarter of the power budget of existing microprocessors. A previously derived clock energy model is briefly reviewed while a comprehensive framework for the estimation of systemwide (chip level) and clock sub-system power as function of technology scaling is presented. This framework is used to study and quantify the impact that various intensifying concerns associated with scaling (i.e., increased leakage currents, increased interwire capacitance) will have on clock energy and their relative impact on the overall system energy. The results obtained indicate that clock power will remain a significant contributor to the total chip power, as long as techniques are used to limit leakage power consumption.

Patent
19 Jan 2002
TL;DR: In this paper, a digital circuit comprising a digital processing component, an adjustable power supply and power supply adjustment circuitry is disclosed, where the power supply adjusts the level of VDD such that the maximum delay time of the critical path of the digital processor is less than a pulse-width duration between a first clock edge of the first selected clock signal and a second clock edge.
Abstract: There is disclosed a digital circuit comprising a digital processing component, an adjustable power supply and power supply adjustment circuitry. The digital processing component is capable of operating at a plurality of selected clock frequencies, wherein a maximum delay time of a critical path in the digital processing component is determined by a level of a power supply, VDD, of the digital processing component. The adjustable power supply is capable of supplying VDD to the digital processing component. The power supply adjustment circuitry is operable to receive a first selected clock signal and adjusts the level of VDD such that the maximum delay time of the critical path of the digital processing component is less than a pulse-width duration between a first clock edge of the first selected clock signal and a second clock edge of the first selected clock signal immediately following the first clock edge.

Proceedings ArticleDOI
07 Aug 2002
TL;DR: The use of the gated clock approach to reduce power consumption is analyzed and compared and it is worth noting that implementation of the three gated Clock strategies leads also to a design with the smallest gate count.
Abstract: In this paper the use of the gated clock approach to reduce power consumption is analyzed and compared. The approach has been implemented following three different strategies that allow the approach to be efficiently used under different design conditions. To verify the strength of the approach it has been implemented during the design of a programmable interrupt controller (PIC). The results found show a 2/spl times/ factor reduction in the average power consumption through the use of the three strategies. Moreover, the results have been also compared with those obtained through an automatic implementation of one of the gated clock strategies allowed by Synopsys's power compiler. In this second case only about 25% of power consumption is saved. It is worth noting that implementation of the three gated clock strategies leads also to a design with the smallest gate count.

Proceedings ArticleDOI
12 Aug 2002
TL;DR: This paper presents an activity-sensitive clock tree construction technique for low power design of VLSI clock networks, and introduces the term of node difference based on module activity information, and shows its relationship with the power consumption.
Abstract: This paper presents an activity-sensitive clock tree construction technique for low power design of VLSI clock networks. We introduce the term of node difference based on module activity information, and show its relationship with the power consumption. A binary clock tree is built using the node difference between different modules to optimize the power consumption due to the interconnections (i.e., clock gating signals and clock edges). We also develop a method to determine gating signals with minimum number of transitions. After the clock tree is constructed, the gating signals are optimized for further power savings.

Patent
09 Oct 2002
TL;DR: In this article, the authors propose a frequency offset circuit to compensate a frequency of the transmitter using the frequency offset of the receiver's clock frequency from data received from the host by the receiver.
Abstract: A device communicates with a host and includes a transmitter, a receiver and a clock generator that generates a signal having a local clock frequency. A clock recovery circuit communicates with the receiver and recovers a host clock frequency from data received from the host by the receiver. A frequency offset circuit communicates with the clock recovery circuit and the clock generator and generates a frequency offset based on the clock frequency and the recovered host clock frequency. A frequency compensator compensates a frequency of the transmitter using the frequency offset. The host and the device may communicate using a serial ATA standard. Frequency compensation can be performed during spread spectrum operation.