Buffer reduction algorithm for mesh-based clock distribution

doi:10.1109/ICAECC.2014.7002454

Proceedings Article•DOI•

John Reuben¹, V. Mohammed Zackriya¹, Harish M. Kittur¹

VIT University¹

01 Oct 2014, pp. 1-4

TL;DR: This short paper proposes a buffer reduction algorithm which can reduce the power dissipated in clock meshes by 15-18% at the cost of 10-20 ps increase in skew when compared to the previously published work.

Abstract: In deep sub-micron technology, mesh-based clock distribution is becoming a preferred method to distribute the clock since it is tolerant to process variations. Buffers are placed on the mesh nodes to drive the mesh wire capacitance and the large load capacitance of clock sinks. In this short paper, we propose a buffer reduction algorithm which can reduce the power dissipated in clock meshes. We calculate the importance of each buffer by the impact its removal has on the clock latency and clock slew at sinks. We then calculate a rank for each buffer, and buffers with lower ranks are removed. Our buffer reduction algorithm is able to achieve 15–18% reduction in power at the cost of 10–20 ps increase in skew when compared to the previously published work.
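
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the rank formula, so everything here is an assumption made for illustration: the `Impact` record, the weighted-sum score, the equal weights, and the removal fraction are all hypothetical, and in practice the latency and slew impacts would come from circuit simulation rather than hard-coded numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impact:
    """Hypothetical per-buffer measurements (not from the paper)."""
    buffer_id: str
    latency_increase_ps: float  # rise in sink latency if this buffer is removed
    slew_increase_ps: float     # rise in sink slew if this buffer is removed


def rank_buffers(impacts, w_latency=0.5, w_slew=0.5):
    """Return buffer ids ordered from least to most important.

    The score is an assumed weighted sum of the two impacts; the paper's
    actual rank computation may differ.
    """
    def score(impact):
        return (w_latency * impact.latency_increase_ps
                + w_slew * impact.slew_increase_ps)

    return [impact.buffer_id for impact in sorted(impacts, key=score)]


def select_removals(impacts, fraction=0.25):
    """Pick the lowest-ranked fraction of buffers as removal candidates."""
    ranked = rank_buffers(impacts)
    count = int(len(ranked) * fraction)
    return ranked[:count]


# Made-up example values, purely for illustration.
impacts = [
    Impact("b0", 4.0, 6.0),
    Impact("b1", 0.5, 1.0),
    Impact("b2", 9.0, 12.0),
    Impact("b3", 1.0, 1.0),
]
ranking = rank_buffers(impacts)
removed = select_removals(impacts, fraction=0.5)
print(ranking)
print(removed)
```

Buffers whose removal barely changes latency and slew at the sinks sort to the front of the ranking and are the first candidates for removal. A real flow would re-check skew after removal, since the abstract reports a 10–20 ps skew penalty.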

Citations

Journal Article•DOI•

A Platform of Resynthesizing a Clock Architecture Into Power-and-Area Effective Clock Trees

Tung-Liang Lin¹, Sao-Jie Chen¹

National Taiwan University¹

01 Oct 2020-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: A novel design platform, merging and replacing of multiple multiplexers and dividers (MRMMD), is developed to intelligently identify those suspicious clock architectures and resynthesize them into a power-and-area effective and less complicated clock structure.

Abstract: To trigger events for application-specific data transfer among registers in a multimillion-gate system-on-chip (SoC), various kinds of clock signals, selectively driven by different frequency-dependent sources and/or dividers (DIVs), are usually centralized in one or more clock generation modules, where clock gating cells (CGCs), multiplexers (MUXes) and DIVs are used to create the clocks required by different functional operations in an SoC. These modules will introduce uncommon and longer timing paths for clock propagations and further make the clock tree synthesis (CTS) process become more challenging due to the on-chip-variation (OCV) effects. In addition, high volume of switching activities in the increased number of clock logic cells will consume more power. In this article, a novel design platform, merging and replacing of multiple multiplexers and dividers (MRMMD), is developed to intelligently identify those suspicious clock architectures and resynthesize them into a power-and-area effective and less complicated clock structure. Using our resynthesis platform, not only the number of clock-related timing paths and their corresponding logic levels can be reduced, but also the corresponding analysis and implementations of clock skew minimizations during CTS become much easier. The experimental results implemented in TSMC 55- and 28-nm process nodes on optimizing some industrial clock architectures showed that significant reductions of area, power, latency, skew and clock path, logic level, OCV impact, total wire length, and implementation runtime are achieved using our MRMMD platform.

4 citations

Cites methods from "Buffer reduction algorithm for mesh..."

...A buffer reduction method for mesh-based clock distribution to achieve smaller clock network area was shown [8]....
[...]

Book Chapter•DOI•

Skew Analysis on Multisource Clock Tree Synthesis Using H-Tree Structure

Vinayak Krishna Bhat, H. H. Surendra, H. R. Archana

01 Jan 2020

TL;DR: Skew minimization design should be introduced in VLSI physical design at the early stages of SoCs, where it has the highest benefits for QoR.

Abstract: The most critical constraints that determine the performance of a system on chip (SoC) are area and power. As technology scales down, innovative clock tree design techniques are required to improve the skew. Hence, skew minimization design should be introduced in VLSI physical design at the early stages of SoCs, where it has the highest benefits for QoR. In this paper, a skew balance methodology using an H-tree is introduced in multisource CTS design.

References

Journal Article•DOI•

VARIUS: A Model of Process Variation and Resulting Timing Errors for Microarchitects

Smruti R. Sarangi, Brian Greskamp¹, Radu Teodorescu¹, Jun Nakano¹, Abhishek Tiwari¹, Josep Torrellas¹

University of Illinois at Urbana–Champaign¹

07 Feb 2008-IEEE Transactions on Semiconductor Manufacturing

TL;DR: In this paper, a microarchitecture-aware model for process variation is proposed, including both random and systematic effects, and the model is specified using a small number of highly intuitive parameters.

Abstract: Within-die parameter variation poses a major challenge to high-performance microprocessor design, negatively impacting a processor's frequency and leakage power. Addressing this problem, this paper proposes a microarchitecture-aware model for process variation, including both random and systematic effects. The model is specified using a small number of highly intuitive parameters. Using the variation model, this paper also proposes a framework to model timing errors caused by parameter variation. The model yields the failure rate of microarchitectural blocks as a function of clock frequency and the amount of variation. With the combination of the variation model and the error model, we have VARIUS, a comprehensive model that is capable of producing detailed statistics of timing errors as a function of different process parameters and operating conditions. We propose possible applications of VARIUS to microarchitectural research.

386 citations

Journal Article•DOI•

MeshWorks: A Comprehensive Framework for Optimized Clock Mesh Network Synthesis

Anand Rajaram¹, David Z. Pan²

Magma Design Automation¹, University of Texas at Austin²

01 Dec 2010-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: The MeshWorks framework is presented, the first comprehensive automated framework for planning, synthesis, and optimization of clock mesh networks that addresses the above issues and can achieve an additional reduction of 31% in buffer area, 21% in wirelength, and 23% in power.

Abstract: Clock mesh networks are well known for their variation tolerance. But their usage is limited to high-end designs due to the significantly high resource requirements compared to clock trees and the lack of automatic mesh synthesis tools. Most existing works on clock mesh networks either deal with semi-custom design or perform optimizations on a given clock mesh. However, the problem of obtaining a good initial clock mesh has not been addressed. Also, the problem of achieving a smooth tradeoff between variation tolerance and resource requirements has not been addressed adequately. In this paper, we present our MeshWorks framework, the first comprehensive automated framework for planning, synthesis, and optimization of clock mesh networks that addresses the above issues. Experimental results suggest that our algorithms can achieve an additional reduction of 31% in buffer area, 21% in wirelength, and 23% in power, compared to the best previous work, with similar worst case maximum frequency. We also demonstrate the effectiveness of our framework under several practical issues such as blockages, multiple clocks, uneven load distribution, and electromigration violations.

26 citations

"Buffer reduction algorithm for mesh..." refers background or methods in this paper

...In [1], the buffers are placed using a set-cover algorithm with a discrete buffer library....
[...]
...A detailed study has been made on leaf level clock mesh synthesis in [1], [2] and [3]....
[...]
...But this comes at the cost of increased power dissipation since mesh has increased wire capacitance [1]....
[...]

Journal Article•DOI•

A new clock network synthesizer for modern VLSI designs

Jingwei Lu¹, Wing-Kai Chow¹, Chiu-Wing Sham¹

Hong Kong Polytechnic University¹

01 Mar 2012-Integration

TL;DR: A novel clock tree synthesizer with dual-MST geometric approach of perfect matching is developed for symmetric clock tree construction and a special technique of buffer sizing is introduced to reduce the variation effect.

17 citations

"Buffer reduction algorithm for mesh..." refers background in this paper

...This increased skew (up to 30 ps) is still acceptable since tree-based distributions have reported skew of 45-70 ps on the same ISPD2010 benchmarks under similar simulation conditions and variations ([8])....
[...]

Journal Article•DOI•

High-performance clock mesh optimization

Matthew R. Guthaus¹, Xuchu Hu¹, Gustavo Wilke², Guilherme Flach², Ricardo Reis²

University of California, Santa Cruz¹, Universidade Federal do Rio Grande do Sul²

05 Jul 2012-ACM Transactions on Design Automation of Electronic Systems

TL;DR: This work presents two techniques to optimize high-performance clock meshes, the first of which is a mesh perturbation methodology for nonuniform mesh routing and the second a skew-aware buffer placement through iterative buffer deletion.

Abstract: Clock meshes are extremely effective at producing low-skew regional clock networks that are tolerant of environmental and process variations. For this reason, clock meshes are used in most high-performance designs, but this robustness consumes significant power. In this work, we present two techniques to optimize high-performance clock meshes. The first technique is a mesh perturbation methodology for nonuniform mesh routing. The second technique is a skew-aware buffer placement through iterative buffer deletion. We demonstrate how these optimizations can achieve significant power reductions and a near elimination of short-circuit power. In addition, the total wire length is decreased, the number of required buffers is decreased, and both skew and robustness are improved on average when variation is considered.

16 citations

"Buffer reduction algorithm for mesh..." refers background or methods or result in this paper

...The IBD of [2] is applied to the same ISPD2010 benchmarks under the same simulation conditions as ours....
[...]
...To ascertain the effectiveness of our algorithm, we compare our results with skew and power of [2] in Table IV and note that our buffer reduction algorithm can achieve 15 - 18% reduction in power at the cost of increased skew....
[...]
...A detailed study has been made on leaf level clock mesh synthesis in [1], [2] and [3]....
[...]
...To the best of our knowledge, the Iterative Buffer Deletion algorithm (IBD) presented in [2] is the only published work on buffer reduction in clock mesh....
[...]
...ispd10 This work IBD of [2] % Pwr reduction Skew(ps) Pwr(mW) Skew(ps) Pwr(mW) cns06 21....
[...]

Journal Article•DOI•

Low Power Clock Network Design

Inna P.-Vaisband¹, Eby G. Friedman², Ran Ginosar², Avinoam Kolodny¹

University of Rochester¹, Technion – Israel Institute of Technology²

19 May 2011-Journal of Low Power Electronics and Applications

TL;DR: Different methods to manage skew and skew variations within tree and non-tree clock distribution networks are reviewed and compared and metrics to determine the most power efficient technique for a given circuit are discussed and verified with simulation.

Abstract: Power is a primary concern in modern circuits. Clock distribution networks, in particular, are an essential element of a synchronous digital circuit and a significant power consumer. Clock distribution networks are subject to clock skew due to process, voltage, and temperature (PVT) variations and load imbalances. A target skew between sequentially-adjacent registers can be obtained in a balanced low power clock tree using techniques such as buffer and wire sizing. Existing skew mitigation techniques in tree-based clock distribution networks, however, are not efficient in coping with post design variations; whereas the latest non-tree mesh-based solutions reliably handle skew variations, albeit with a significant increase in dissipated power. Alternatively, crosslink-based methods provide low power and variation-efficient skew solutions. Existing crosslink-based methods, however, only address skew at the network topology level and do not target low power consumption. Different methods to manage skew and skew variations within tree and non-tree clock distribution networks are reviewed and compared in this paper. Guidelines for inserting crosslinks within a buffered low power clock tree are provided. Metrics to determine the most power efficient technique for a given circuit are discussed and verified with simulation.

9 citations