Proceedings ArticleDOI

Performance evaluation of finFET based SRAM under statistical VT variability

TL;DR: The performance of extremely scaled FinFET-based 256-bit (6T) SRAM is evaluated with technology scaling for channel lengths of 20nm down to 7nm showing the scaling trends of basic performance metrics.
Abstract: FinFET devices are the most promising solutions for further technology scaling in the long-term projections of the ITRS. The performance of extremely scaled FinFET-based 256-bit (6T) SRAM is evaluated with technology scaling for channel lengths of 20nm down to 7nm, showing the scaling trends of basic performance metrics. In addition, the impact of threshold voltage variations on the delay, power, and stability is reported considering die-to-die variations. Significant performance degradation is found starting from the 10nm channel length and continues down to 7nm.
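As a rough illustration of how die-to-die threshold voltage variation can turn into a spread in delay, the sketch below draws one VT value per die and evaluates a normalized gate delay. Everything in it is an assumption made for illustration and is not taken from the paper: the alpha-power-law delay model, the supply voltage, the nominal VT, its standard deviation, and the exponent `alpha`.

```python
import random
import statistics


def relative_delay(vdd, vt, alpha=1.3):
    """Alpha-power-law delay up to a constant factor: vdd / (vdd - vt)**alpha.

    The model and the default alpha are illustrative assumptions.
    """
    if vt >= vdd:
        raise ValueError("device does not turn on: vt >= vdd")
    return vdd / (vdd - vt) ** alpha


def sample_delays(vdd, vt_nominal, vt_sigma, n_dies, seed=0):
    """Draw one VT per die (die-to-die variation) and return the delays
    normalized to the delay at the nominal VT."""
    rng = random.Random(seed)
    nominal = relative_delay(vdd, vt_nominal)
    delays = []
    for _ in range(n_dies):
        vt = rng.gauss(vt_nominal, vt_sigma)
        delays.append(relative_delay(vdd, vt) / nominal)
    return delays


# Hypothetical operating point: 0.7 V supply, 0.25 V nominal VT, 30 mV sigma.
delays = sample_delays(vdd=0.7, vt_nominal=0.25, vt_sigma=0.03, n_dies=2000)
print(f"mean normalized delay: {statistics.mean(delays):.3f}")
print(f"std of normalized delay: {statistics.stdev(delays):.3f}")
```

Because the delay grows faster than linearly as VT approaches the supply voltage, a symmetric VT distribution gives a skewed delay distribution in this model. A circuit-level study such as the one summarized above would use device and SRAM simulations rather than a closed-form expression.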
Citations
Journal ArticleDOI
TL;DR: In this paper, the impact of the ITRS-2013 scaling strategy on the BR and ON-/OFF states is discussed, and two critical points along the channel are characterized with a change in the electron acceleration showing the physical significance of the off-equilibrium transport with scaling the channel length.
Abstract: Nanoscale trigate FinFETs with channel lengths down to 9.7 nm, as projected by the 2013 International Technology Roadmap for Semiconductors (ITRS-2013), are simulated by means of a quantum-corrected 3-D Monte Carlo technique in the ballistic and quasi-ballistic regimes. The ballisticity ratio (BR) is extracted and found to reach values as high as 90% at $L_{G}=9.7$ nm. The impact of the ITRS-2013 scaling strategy on the BR and the ON-/OFF-states is discussed. Forward and backward electron velocity components are extracted along the channel to analyze the electron transport in detail. The velocity profile is found to be characterized by two critical points along the channel, each associated with a change in the electron acceleration, showing the physical significance of off-equilibrium transport as the channel length is scaled.

11 citations


Cites background from "Performance evaluation of finFET ba..."

  • ...This can have serious implications on advanced circuit design, as discussed in [21]....


Proceedings ArticleDOI
07 Sep 2020
TL;DR: A graph convolutional network (GCN) is introduced for quick ECO leakage optimization, and a heuristic Vth reassignment is proposed to correct the resulting negative timing slack and to remove any minimum implant width (MIW) violations.
Abstract: At the very late design stage, engineering change order (ECO) leakage optimization is often performed to swap some cells for ones with lower leakage, e.g. cells with a higher threshold voltage (Vth) or a longer gate length. It is very effective but time consuming due to the iterative nature of swapping and timing checks with correction. We introduce a graph convolutional network (GCN) for quick ECO leakage optimization. The GCN receives a number of input parameters that model the current timing information of a netlist, as well as the connectivity of the cells in the form of a weighted connectivity matrix. Once it is trained, the GCN predicts the exact Vth (with the Vth given by commercial ECO leakage optimization as a reference) of 83% of cells, on average across test circuits. The remaining 17% of cells are responsible for some negative timing slack. To correct such timing as well as to remove any minimum implant width (MIW) violations, we propose a heuristic Vth reassignment. The combined GCN and heuristic achieve a 52% reduction of leakage, which can be compared to a 61% reduction from commercial ECO, but with less than half of the runtime.
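To make the idea of feeding cell features and a weighted connectivity matrix to a GCN more concrete, here is a minimal sketch of one graph-convolution step in plain Python. It uses the widely known propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W); whether the cited work uses this exact rule is not stated in the abstract, so treat it as an assumption. The function names, the two-feature cell description, and all numbers are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a non-negative weighted adjacency A."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 thanks to the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weights):
    """One graph-convolution step followed by ReLU."""
    propagated = matmul(matmul(normalize_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in propagated]


# Hypothetical netlist of three cells in a chain: 0 - 1 - 2.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.5],
       [0.0, 0.5, 0.0]]
# Hypothetical per-cell features, e.g. (timing slack, leakage).
features = [[0.2, 1.0],
            [-0.1, 0.6],
            [0.4, 0.3]]
weights = [[1.0, 0.0],
           [0.0, 1.0]]  # identity, so the output is just the mixed features
print(gcn_layer(adj, features, weights))
```

Each output row mixes a cell's own features with those of its neighbors, weighted by connection strength. A trained model would stack several such layers with learned weights and end in a classifier over the available Vth options.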

5 citations


Cites background from "Performance evaluation of finFET ba..."

  • ...up to 40% in 7nm technology node [1])....


Proceedings ArticleDOI
17 Jan 2022
TL;DR: In this paper, a directed GNN-based method that learns information from different neighbors separately and captures rich local topology information is proposed for fast and accurate ECO power optimization.
Abstract: In modern design, engineering change order (ECO) is often utilized to perform power optimization including gate-sizing and Vth-assignments, which is efficient but highly time consuming. Many graph neural network (GNN) based methods have recently been proposed for fast and accurate ECO power optimization by considering neighbors' information. Nonetheless, these works fail to learn high-quality node representations on directed graphs since they treat all neighbors uniformly when gathering their information and lack local topology information from neighbors one or two hops away. In this paper, we introduce a directed GNN based method which learns information from different neighbors respectively and contains rich local topology information, which was validated by the Opencores and IWLS 2005 benchmarks with TSMC 28nm technology. Experimental results show that our approach outperforms prior GNN based methods with at least 7.8% and 7.6% prediction accuracy improvement for seen and unseen designs respectively, as well as 8.3% to 29.0% leakage optimization improvement. Compared with the commercial EDA tool PrimeTime, the proposed framework achieves similar power optimization results with up to 12X runtime improvement.

2 citations

Proceedings ArticleDOI
28 Jun 2022
TL;DR: Cloak exploits page-level data reuse in the LLC, to hide NVM read latency, and uses an LLC layout that accelerates the discovery of LLC-resident cache lines from the page to enable the high-bandwidth, low-latency transfer of lines of a page to the page buffers.
Abstract: The increased memory demands of workloads are putting high pressure on Last Level Caches (LLCs). In general, there is limited opportunity to increase the capacity of LLCs due to the area and power requirements of the underlying SRAM technology. Interestingly, emerging Non-Volatile Memory (NVM) technologies promise a feasible alternative to SRAM for LLCs due to their higher area density. However, NVMs have substantially higher read and write latencies, which offset their density benefit. Although researchers have proposed methods to tolerate NVM's higher write latency, little emphasis has been placed on the critical NVM read latency. To address this problem, this paper proposes Cloak. Cloak exploits page-level data reuse in the LLC to hide NVM read latency. Specifically, on certain L1 DTLB misses, Cloak transfers LLC-resident data belonging to the TLB-missing page from the LLC NVM array to a set of small SRAM Page Buffers that will service subsequent requests to this page. Further, to enable the high-bandwidth, low-latency transfer of lines of a page to the page buffers, Cloak uses an LLC layout that accelerates the discovery of LLC-resident cache lines from the page. We evaluate Cloak with full-system simulations of a 4-core processor across 14 workloads. We find that, on average, a machine with Cloak is faster than one with an SRAM LLC by 23.8% and one with an NVM-only LLC by 8.9%, in both cases with negligible change in area. Further, Cloak reduces the ED² metric relative to these designs by 39.9% and 17.5%, respectively.

2 citations


References
Proceedings ArticleDOI
12 Jun 2012
TL;DR: In this paper, a 22nm generation logic technology is described incorporating fully-depleted tri-gate transistors for the first time, which provides steep subthreshold slopes (∼70mV/dec) and very low DIBL (∼50mV/V).
Abstract: A 22nm generation logic technology is described incorporating fully-depleted tri-gate transistors for the first time. These transistors feature a 3rd-generation high-k + metal-gate technology and a 5th generation of channel strain techniques resulting in the highest drive currents yet reported for NMOS and PMOS. The use of tri-gate transistors provides steep subthreshold slopes (∼70mV/dec) and very low DIBL (∼50mV/V). Self-aligned contacts are implemented to eliminate restrictive contact to gate registration requirements. Interconnects feature 9 metal layers with ultra-low-k dielectrics throughout the interconnect stack. High density MIM capacitors using a hafnium based high-k dielectric are provided. The technology is in high volume manufacturing.

705 citations


"Performance evaluation of finFET ba..." refers methods in this paper

  • ...Tri-gate (TG) FinFET has been deployed as the first winning successor of the conventional planar transistor for the sub 22 nm technology node due to its superior electrostatics and subthreshold leakage control [1-3]....


Proceedings ArticleDOI
01 Dec 2012
TL;DR: In this paper, a leading edge 22nm 3-D tri-gate transistor technology has been optimized for low power SoC products for the first time and used to build a low standby power 380Mb SRAM capable of operating at 2.6GHz with 10pA/cell standby leakage.
Abstract: A leading edge 22nm 3-D tri-gate transistor technology has been optimized for low power SoC products for the first time. Low standby power and high voltage transistors exploiting the superior short channel control, < 65mV/dec subthreshold slope and <40mV DIBL, of the Tri-Gate architecture have been fabricated concurrently with high speed logic transistors in a single SoC chip to achieve industry leading drive currents at record low leakage levels. NMOS/PMOS Idsat=0.41/0.37mA/um at 30pA/um Ioff, 0.75V, were used to build a low standby power 380Mb SRAM capable of operating at 2.6GHz with 10pA/cell standby leakages. This technology offers mix-and-match flexibility of transistor types, high-density interconnect stacks, and RF/mixed-signal features for leadership in mobile, handheld, wireless and embedded SoC products.

284 citations


"Performance evaluation of finFET ba..." refers methods in this paper

  • ...Tri-gate (TG) FinFET has been deployed as the first winning successor of the conventional planar transistor for the sub 22 nm technology node due to its superior electrostatics and subthreshold leakage control [1-3]....


Proceedings ArticleDOI
03 Apr 2012
TL;DR: A high-performance, voltage-scalable 162Mb SRAM array is developed in a 22nm tri-gate bulk technology featuring 3rd-generation high-k metal-gate transistors and 5th-generation strained silicon to address process variation and fin quantization at 22nm.
Abstract: Future product applications demand increasing performance with reduced power consumption, which motivates the pursuit of high-performance at reduced operating voltages. Random and systematic device variations pose significant challenges to SRAM VMIN and low-voltage performance as technology scaling follows Moore's law to the 22nm node. A high-performance, voltage-scalable 162Mb SRAM array is developed in a 22nm tri-gate bulk technology featuring 3rd-generation high-k metal-gate transistors and 5th-generation strained silicon. Tri-gate technology reduces short-channel effects (SCE) and improves subthreshold slope to provide 37% improved device performance at 0.7V. Continuous device width sizing in planar technology is replaced by combining parallel silicon fins to multiply drive current. Process-circuit co-optimization of transient voltage collapse write assist (TVC-WA) and wordline underdrive read assist (WLUD-RA) features address process variation and fin quantization at 22nm and enable a 175mV reduction in the supply voltage required for 2GHz SRAM operation. Figure 13.1.1 shows an SEM top-down view of a 0.092μm² high-density 6T SRAM bitcell (HDC) and a 0.108μm² low-voltage 6T SRAM cell (LVC) after gate and diffusion processing. Computational OPC/RET techniques extend the capabilities of 193nm immersion lithography to allow a 1.85× increase in array density relative to 32nm designs [1].

177 citations


"Performance evaluation of finFET ba..." refers methods in this paper

  • ...Tri-gate (TG) FinFET has been deployed as the first winning successor of the conventional planar transistor for the sub 22 nm technology node due to its superior electrostatics and subthreshold leakage control [1-3]....


Journal ArticleDOI
TL;DR: A novel methodology for FinFET-based keeper design is introduced, which exploits the exclusive property of FinFET devices (capacitive coupling between the front gate and the back gate in a four-terminal FinFET) to simultaneously achieve higher performance and lower power consumption.
Abstract: Design optimization of FinFET domino logic is particularly challenging due to the unique width quantization property of FinFET devices. Since the keeper device in domino logic is sized based on the leakage current of the pull-down network (PDN) (to meet the noise margin constraint), a reliable statistical framework is required to accurately estimate the domino gate leakage current. Considering the width quantization property, this paper presents such a statistical framework, which provides a reliable design window for keeper sizing to meet the noise margin constraint (for the practical range of threshold voltage variation in sub-32-nm technology nodes). On the other hand, the width quantization property restricts the design optimization (including power/performance characteristics) typically achieved via continuous keeper sizing in planar-CMOS domino logic designs. To cope with this restriction, this paper also introduces a novel methodology for FinFET-based keeper design, which exploits the exclusive property of FinFET devices (capacitive coupling between the front gate and the back gate in a four-terminal FinFET) to simultaneously achieve higher performance and lower power consumption. Using this new methodology, the keeper device is made weaker at the beginning of the evaluation phase to reduce its contention with the PDN, but gradually becomes stronger to provide a higher noise margin.

35 citations


"Performance evaluation of finFET ba..." refers background in this paper

  • ...In addition to new design issues such as width quantization which limits the design optimization [6]....


Journal ArticleDOI
TL;DR: In this article, the design space, including fin thickness, fin height, fin ratio of bit-cell transistors, and surface orientation, is researched to optimize the stability, leakage current, array dynamic energy, and read/write delay of the FinFET SRAM under layout area constraints.
Abstract: In this paper, the design space, including fin thickness (Tfin), fin height (Hfin), fin ratio of bit-cell transistors, and surface orientation, is researched to optimize the stability, leakage current, array dynamic energy, and read/write delay of the FinFET SRAM under layout area constraints. The simulation results, which consider the variations of both Tfin and threshold voltage (Vth), show that most FinFET SRAM configurations achieve a superior read/write noise margin when compared with planar SRAMs. However, when two fins are used as pass gate transistors (PG) in FinFET SRAMs, enormous array dynamic energy is required due to the increased effective gate and drain capacitance. On the other hand, a FinFET SRAM with a one-fin PG in the (110) plane shows a smaller write noise margin than the planar SRAM. Thus, the one-fin PG in the (100) plane is suitable for FinFET SRAM design. The one-fin PG FinFET SRAM with Tfin = 10 nm and Hfin = 40 nm in the (100) plane achieves a three times larger noise margin when compared with the planar SRAM and consumes a 17% smaller bit-line toggling array energy at a cost of a 22% larger word-line toggling energy. It also achieves a 2.3 times smaller read delay and a 30% smaller write delay when compared with the planar SRAM.
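The width quantization behind these fin-count trade-offs can be illustrated with a short sketch. It uses the common first-order approximation that a tri-gate fin contributes an effective width of 2·Hfin + Tfin; this approximation and the function names are assumptions for illustration, not the model used in the cited paper. The fin dimensions (Tfin = 10 nm, Hfin = 40 nm) are the ones quoted in the abstract.

```python
import math


def effective_width_nm(n_fins, h_fin_nm, t_fin_nm):
    """First-order tri-gate effective width: each fin adds 2*Hfin + Tfin."""
    return n_fins * (2 * h_fin_nm + t_fin_nm)


def fins_for_target_width(target_nm, h_fin_nm, t_fin_nm):
    """Smallest whole number of fins whose effective width meets the target."""
    per_fin = effective_width_nm(1, h_fin_nm, t_fin_nm)
    return max(1, math.ceil(target_nm / per_fin))


# Fin geometry quoted in the abstract: Hfin = 40 nm, Tfin = 10 nm.
for n in (1, 2, 3):
    print(n, "fin(s):", effective_width_nm(n, 40, 10), "nm")

# A hypothetical 100 nm width target cannot be met exactly; it rounds up to 2 fins.
print(fins_for_target_width(100, 40, 10))
```

Under this approximation the available widths come only in steps of 90 nm, so going from a one-fin to a two-fin pass gate doubles the effective width, which is consistent with the increase in gate and drain capacitance that the abstract reports for two-fin pass gates.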

33 citations


"Performance evaluation of finFET ba..." refers background or methods in this paper

  • ...However, having new geometry parameters such as the fin thickness, the quantized number of fins, and even surface orientation opens the way for new design optimization techniques [7]....


  • ...In [7], the FinFET SRAM design space is discussed, under different fin thicknesses and fin heights, to optimize stability, delays and leakage current but at constant channel length....
