Showing papers on "Process corners" published in 2009


01 Jan 2009
TL;DR: In this article, a method to extract process variations from a test chip fabricated in a 65-nm process is presented, and three critical IV points from the cut-off and linear regions are identified.
Abstract: Statistical circuit analysis and optimization are critical for robust nanoscale CMOS design. To accurately perform such analysis, primary process variation sources need to be identified and modeled for further circuit simulation. In this work, a rigorous method to extract process variations from in situ IV measurements is presented. Transistor statistics are collected from a test chip fabricated in a 65-nm process. Gate length (L), threshold voltage (Vth) and mobility (μ) are recognized as the leading variation sources, due to the tremendous process challenges in lithography, channel doping, and the stress engineering. To decompose these variations, three critical IV points from the cut-off and linear regions are identified. The extracted L, Vth and μ variations are normally distributed, with negligible spatial correlation. By including extracted variations in the nominal model file, accurate prediction of the change of drive current in all operation regions and process corners is achieved. The new extraction method guarantees excellent model matching with hardware for further statistical circuit analysis.

62 citations


Journal ArticleDOI
TL;DR: A rigorous method to extract process variations from in situ IV measurements is presented, and the new extraction method guarantees excellent model matching with hardware for further statistical circuit analysis.
Abstract: Statistical circuit analysis and optimization are critical for robust nanoscale CMOS design. To accurately perform such analysis, primary process variation sources need to be identified and modeled for further circuit simulation. In this work, a rigorous method to extract process variations from in situ IV measurements is presented. Transistor statistics are collected from a test chip fabricated in a 65-nm process. Gate length (L), threshold voltage (Vth) and mobility (μ) are recognized as the leading variation sources, due to the tremendous process challenges in lithography, channel doping, and the stress engineering. To decompose these variations, three critical IV points from the cut-off and linear regions are identified. The extracted L, Vth and μ variations are normally distributed, with negligible spatial correlation. By including extracted variations in the nominal model file, accurate prediction of the change of drive current in all operation regions and process corners is achieved. The new extraction method guarantees excellent model matching with hardware for further statistical circuit analysis.

54 citations


Proceedings ArticleDOI
18 Dec 2009
TL;DR: This paper proposes a tool, called AutoRex, that produces clock-tuning assignments automatically, by taking data from a volume experiment across multiple process corners and analyzes this data using Satisfiability Modulo Theory (SMT) solvers to create a single “recipe” for delay buffer assignments such that the clock frequency of the chip is improved as much as possible over the entire sample of chips.
Abstract: Post-silicon clock-tuning is a technique used as part of speed-debug efforts to increase the allowable clock frequency of a chip. These days, it is not uncommon for high-end microprocessors to have cores containing a few thousand clock-tuning elements (i.e., variable-delay buffers). Each such buffer can be assigned to one of several possible discrete delay values, as part of the post-silicon speed debugging process. With the proper mix of assignments, many chips that initially could not meet targeted speed requirements, can now run within specification. With thousands of tunable buffers available on chip, the possible combination of assignments to the delay values is quite large. In addition, process variation causes the same design, once fabricated into silicon, to have different critical paths across different chips. Thus a specific buffer-delay assignment that most improves clock frequency for some chips may not be optimal for all chips. In this paper, we propose a tool we call AutoRex, that produces clock-tuning assignments automatically. AutoRex operates by taking data from a volume experiment across multiple process corners and analyzes this data using Satisfiability Modulo Theory (SMT) solvers to create a single “recipe” for delay buffer assignments such that the clock frequency of the chip is improved as much as possible over the entire sample of chips. Our results show up to a 9% improvement in frequency using AutoRex.
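The idea of a single recipe that must serve chips with different critical paths can be illustrated with a small Python sketch. This is not AutoRex: it swaps the SMT solver for an exhaustive search, considers setup constraints only, and every name and number in it (`CHIPS`, `DELAY_STEPS`, the path delays) is a hypothetical stand-in chosen for illustration.

```python
from itertools import product

# Hypothetical toy data: each chip is a list of
# (launch_flop, capture_flop, path_delay_ps) tuples.
CHIPS = [
    [(0, 1, 900), (1, 2, 1000), (2, 0, 800)],   # chip A: path 1 -> 2 is critical
    [(0, 1, 1020), (1, 2, 880), (2, 0, 850)],   # chip B: path 0 -> 1 is critical
]
DELAY_STEPS = (0, 20, 40, 60)  # assumed discrete buffer settings, in ps
NUM_FLOPS = 3


def min_period(chip, recipe):
    """Smallest clock period that meets setup on every path of one chip.

    With clock arrival time recipe[i] at flop i, a path i -> j needs
    recipe[i] + delay <= recipe[j] + period.
    """
    return max(delay + recipe[i] - recipe[j] for i, j, delay in chip)


def worst_period(recipe):
    """Clock period the recipe allows on the slowest chip of the sample."""
    return max(min_period(chip, recipe) for chip in CHIPS)


def best_recipe():
    """Exhaustively pick the one assignment that is best across all chips."""
    return min(product(DELAY_STEPS, repeat=NUM_FLOPS), key=worst_period)


if __name__ == "__main__":
    recipe = best_recipe()
    print(recipe, worst_period((0, 0, 0)), worst_period(recipe))
```

Exhaustive search only works here because the toy has 4^3 candidate recipes; with thousands of buffers the space is far too large, which is presumably why the paper turns to a solver.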

25 citations


Journal ArticleDOI
TL;DR: A divide-by-4 injection-locked frequency divider based on a novel process and temperature compensation technique that possesses a wide locking range over process corners and a wide temperature range due to the proposed compensation technique is described.
Abstract: The authors propose a ring-based injection-locked frequency divider (ILFD) incorporating a novel process and temperature compensation technique. The core of the ILFD consists of a process and temperature compensated ring oscillator based on modified symmetric load delay elements. Measurement results show that the natural frequency of oscillation of the ring oscillator varies only 4.4% across six different chips and for a temperature range of 0–80°C. The ILFD possesses a wide locking range over process corners as well as a wide temperature range because of the proposed compensation technique and the incorporation of the delay cell architecture in the design. A calibration circuitry can be used to further enhance the locking range. Measurement results show that the proposed ILFD functions as a divide-by-4 for an input frequency range of 1.8–3.2 GHz for an input power level as low as −3 dBm. The worst-case power consumption was approximately 2 mW from a 1.8 V power supply. The proposed ILFD can be used as a low-power prescaler for multi-band applications.

23 citations


Proceedings ArticleDOI
16 Mar 2009
TL;DR: Simulations of body bias and source bias effects on SRAM standby current in 65nm technology show an 8X reduction in cell Isb at the 125°C FF process corner with a 1.0V NMOS body bias.
Abstract: Standby power is one of the most critical issues in low power chip applications. In this paper, we have investigated the effects of body bias and source bias in 65nm technology through simulations on SRAM standby current (Isb). The simulation results show an 8X reduction in cell Isb at the 125°C FF process corner with a 1.0V NMOS body bias. This has been experimentally verified on a 16Mb SRAM testchip. Source biasing is shown to be a more effective technique for room temperature leakage reduction (~3X lower Isb@0.4V bias). Optimizing the SRAM cell is crucial to meet the product performance requirements across corners and a methodology for the same is also described. The 16Mb testchip was characterized for read disturb, write margin and read current margin at process corners by applying forward and reverse body biases to shift the cell transistor parameters. Different test sequences tailored for the parameter being measured were used to determine the failing bit count in each case. Voltage shmoo plots were generated from the measured data to obtain the Vccmin at each body bias condition. Based on the above, the threshold voltages of the cell transistors for maximum operating margin were derived.

17 citations


Proceedings ArticleDOI
10 May 2009
TL;DR: A CMS scheme with a dynamic overdriving driver (DOD) is presented whose performance is robust against intra-die and inter-die process variations: its throughput degrades by only 9.5% under intra-die process variations and by 22% in the worst case process corner.
Abstract: The current mode signaling (CMS) scheme is one of the promising alternatives to the voltage mode buffer insertion scheme for high-speed low-power data transmission over long on-chip interconnects. In this paper we present a CMS scheme with dynamic overdriving driver (DOD) whose performance is robust against intra-die and inter-die process variations. We show that the throughput of the CMS scheme proposed in [1] degrades by 33% in the presence of intra-die process variations whereas that of the scheme in [2] degrades by 36% in the worst case process corner. Simulation results show that the throughput of the proposed CMS scheme degrades by only 9.5% in the presence of intra-die process variations and 22% in the worst case process corner. In this process corner, logic speed itself degrades by 23% and hence the 22% throughput degradation of the proposed signaling scheme is not a major concern. In the typical process corner, the proposed CMS scheme shows 14% and 19% improvement in delay and power, respectively, over the CMS scheme proposed in [1].

15 citations


Proceedings ArticleDOI
11 Dec 2009
TL;DR: In this paper, a closed-loop time-amplifier (TA) with a self-calibration technique that adjusts the output capacitance of the conventional TA is presented.
Abstract: This paper presents a closed-loop time-amplifier (TA) with a novel self-calibration technique by adjusting the output capacitance of the conventional TA. The gain of the TA is stabilized, with an input of 0.05∼1 Td (one buffer delay), over a large Process-Voltage-Temperature (PVT) variation: from SS to FF process corner, +/−10% supply voltage, and −40 to 80 °C. The proposed TA is designed with SMIC 0.18-µm mixed-signal CMOS process. Simulation results show that the gain deviation of the TA is well controlled within 0.35% under all circumstances, with regard to the gain in typical PVT condition, and the whole circuit consumes 600 µA with an input signal of 40 MHz.

14 citations


Proceedings ArticleDOI
19 Aug 2009
TL;DR: A centralized control system with region-specific bias control to mitigate the impact of within-die (WID) process variation is introduced and an algorithm for determining the minimum required global supply voltage across all the regions and optimal body-biasing voltages for the individual regions is illustrated.
Abstract: With the scaling of MOSFET dimensions and the enhancements introduced to boost its performance, variation in semiconductor manufacturing has increased. The manufactured designs are usually shifted from the intended operating point, degrading the parametric yield. In this paper, we partition the chip into multiple regions with localized sensors and introduce a centralized control system with region-specific bias control to mitigate the impact of within-die (WID) process variation. An algorithm for determining the minimum required global supply voltage across all the regions and optimal body-biasing voltages for the individual regions is illustrated. This system ensures the desired frequency of operation for the chip under optimal power conditions for each of the regions. Design considerations, simulation results and power-performance characteristics of this fine-grain body biasing compensation technique are presented based on simulations of the IBM 65 nm technology. This method achieves an average reduction of 7.2% in total power dissipated across process corners while bringing the critical path delay in all modules within the desired +/− 3% of nominal delay.

11 citations


Patent
30 Sep 2009
TL;DR: A write boost circuit provides automatic mode control for boost with different modalities with respect to the external supply voltage and also with respect to the extent of boost required at different process corners.
Abstract: A write boost circuit provides an automatic mode control for boost with different modalities with respect to the external supply voltage and also with respect to the extent of boost required at different process corners. The write boost circuit also takes care of the minimum boost provided to process corners with good writability where less boost is required. The boost is realized in terms of ground raising in the particular context and in general applicable to all other methods.

10 citations


Patent
02 Jun 2009
TL;DR: An integrated circuit for achieving power reduction in a transceiver may include a jammer detector that determines an interference level corresponding to a received signal, and a transmit power detector that determines a required transmit power level for a transmitted signal.
Abstract: An integrated circuit for achieving power reduction in a transceiver may include a jammer detector that determines an interference level corresponding to a received signal, and a transmit power detector that determines a required transmit power level for a transmitted signal. The integrated circuit may also include at least one of the following: a process monitor that determines process corners of components within the receiver and/or the transmitter, and a temperature monitor that determines a temperature of the receiver and/or the transmitter. The integrated circuit may also include a state machine. The state machine may transition the receiver from a high linearity mode to a low linearity mode if a set of operating conditions is satisfied. Similarly, the state machine may transition the transmitter from a high power mode to a low power mode if a set of operating conditions is satisfied.

9 citations


Proceedings ArticleDOI
15 Sep 2009
TL;DR: In this article, a methodology to design a performance enhanced sub-threshold standard cell library robust to process variations is discussed and an optimal design choice is made with energy-delay product as a metric.
Abstract: Digital subthreshold circuits are gaining importance because of their ability to serve as an ideal low power solution. In this paper, a methodology to design a performance enhanced subthreshold standard cell library robust to process variations is discussed. Several approaches to design a performance enhanced cell library are discussed and an optimal design choice is made with energy-delay product as a metric. Significant performance improvements of 2X, 8X and 1.5X are achieved for inverter, AND, and OR cells respectively over regular cell library. The variation in delay for the proposed standard cell library with respect to four process corners is studied. A significant reduction of about 75.6% in delay variation across worst case process corners was observed when a normal inverter and inverter from the high performance cell library were simulated.

Proceedings ArticleDOI
18 May 2009
TL;DR: In this paper, a radiation hard PLL using 0.25 µm SOS-CMOS technology for space applications is presented, which is fully self-biased and gives an output frequency of 2.5GHz.
Abstract: This paper presents a radiation hard PLL using 0.25 µm SOS-CMOS technology for space applications. This PLL is fully self-biased and gives an output frequency of 2.5GHz. This robust PLL performs successfully at all process corners from −40°C to 80°C under Cadence-SpectreRF schematic and layout simulations. A new modification has been made to the differential buffers of the VCO used in the PLL to reduce phase noise. Simulation results from the extracted layout, including buffers and pads, are listed for pre- and post-radiation environments.

Patent
29 May 2009
TL;DR: In this paper, an apparatus and a method are provided for adaptively adjusting the impulse response of the optical output of the laser of an optical link in a way that ensures that the optical waveform being transmitted from the optical TX into the optical link has a desired waveform shape.
Abstract: An apparatus and a method are provided for adaptively adjusting the impulse response of the optical output of the laser of the optical TX in a way that ensures that the optical waveform being transmitted from the optical TX into the optical waveguide of the optical link has a desired waveform shape that improves or optimizes the performance of the optical link across variations in temperature, power supply, laser process corners, IC process corners, component aging, mechanical manufacturing tolerances, and part alignment tolerances. Adaptively adjusting the impulse response of the optical signal output from the laser in this way allows the optical TX to dynamically adapt to and compensate for a wide range of factors that typically cause performance degradation and result in reduced product yields, increased testing times, and increased test complexity, and higher costs. This, in turn, allows manufacturing tolerances and alignment tolerances to be relaxed, test times and test complexity to be reduced, and overall manufacturing and testing costs to be reduced.

Proceedings ArticleDOI
Mingxu Huo, Koubao Ding, Yan Han, Shurong Dong, Xiaoyang Du, Dahai Huang, Bo Song
06 Jul 2009
TL;DR: TLP tests show that the trigger voltage of the same pin on some products shifts from 9.5V to 15.5V, and circuit simulations at various process corners are applied to study the snapback device under this situation.
Abstract: The popular electrostatic discharge (ESD) protection device, a multi-finger NMOS with a gate-coupling technique for more uniform turn-on, can be affected by process variation. The transmission line pulsing (TLP) test results reveal this phenomenon. The trigger voltage of the same pin on some products shifts from 9.5V to 15.5V. No such significant difference was ever reported in the literature. In this study, circuit simulations at various process corners are applied to study the snapback device under this situation. With only the NMOS gate-drain overlap as coupling capacitance, the gate-to-ground resistor plays a vital role in counteracting the variation. When it is increased from 3 kΩ to 12 kΩ, the turn-on voltage is reduced and the target ESD performance is achieved. The protection structure, fabricated in an EEPROM process, is used as both an I/O protection circuit and a power clamp. It is able to pass the 4 kV HBM ESD level.

Proceedings ArticleDOI
07 Oct 2009
TL;DR: Simulation results are provided using the predictive technology file for 32nm feature size in CMOS to show that the proposed hardened memory cell is best suited when designing memories for both high performance and soft error tolerance.
Abstract: This paper proposes a new design for hardening a CMOS memory cell at the nano feature size of 32nm. By separating the circuitry for the write and read operations, the static stability of the proposed cell configuration increases by more than 4.4 times at the typical process corner compared to previous designs. Simulation shows that by appropriately sizing the pull-down transistors, the proposed cell results in a 40% higher critical charge and 13% less delay than the conventional design. Simulation results are provided using the predictive technology file for 32nm feature size in CMOS to show that the proposed hardened memory cell is best suited when designing memories for both high performance and soft error tolerance.

Proceedings ArticleDOI
22 Dec 2009
TL;DR: This paper describes a 65nm 16-bit parallel transceiver IP macro, whose bandwidth is 4.8GByte/s with a 5 pF load including the HBM 2000 V ESD protection, which can be applied for the interface of sub-100nm high performance processors which require low latency and high stability.
Abstract: This paper describes a 65nm 16-bit parallel transceiver IP macro, whose bandwidth is 4.8GByte/s with a 5 pF load including the HBM 2000 V ESD protection. To reduce latency, equalizers, CDR modules, CRC checkers and 8b/10b encoders are omitted from the design, and the total latency is 7 ns without cables. Since the transceiver has many robust features, including a PVT independent PLL with calibrations, a low skew differential clock tree, and a stable current mode driver with common mode feedback, it can tolerate 20% power supply variations and work properly at different process corners and at extreme temperatures. The transceiver can be applied for the interface of sub-100nm high performance processors which require low latency and high stability. The transceiver shows a BER of less than 10⁻¹⁵ at 3Gb/s/pin.

Proceedings ArticleDOI
31 Dec 2009
TL;DR: A 4.8GHz LC voltage controlled oscillator for a Wireless Sensor Network (WSN) SoC RFIC chipset is designed in the SMIC 0.18 μm 1P6M RF CMOS process; it achieves good phase noise performance, and a 2-bit switched capacitor array provides extra tuning range.
Abstract: A 4.8GHz LC voltage controlled oscillator (VCO) for a Wireless Sensor Network (WSN) SoC RFIC chipset is designed in the SMIC 0.18 μm 1P6M RF CMOS process. The core circuit adopts a complementary differential negative resistance structure with resistor biasing, which achieves good phase noise performance. The 2-bit switched capacitor array provides extra tuning range. The chip size is 600μm×475μm with testing pads. With a 1.8V supply voltage, the post-simulation and measured results show that the achieved maximum tuning range of 40% can fully compensate for the deviation due to process corners. The measured phase noise is −96dBc/Hz@3MHz with a 4.8GHz carrier. In addition, the operating current of the whole circuit is less than 7mA.

Patent
21 Dec 2009
TL;DR: In a semiconductor integrated circuit with a self-test circuit, when the self-test circuit switches from a low-speed operation mode to a high-speed operation mode, processing is performed in the high-speed operation mode during a given time period, and the processing result is invalidated based on a control signal.
Abstract: A semiconductor integrated circuit includes a self-test circuit, wherein, when an operation mode of the self-test circuit has been switched from a low-speed operation mode to a high-speed operation mode, processing is performed in the high-speed operation mode during a given time period, and the processing result is invalidated based on a control signal.

Proceedings ArticleDOI
31 Aug 2009
TL;DR: Improved circuits of delay monitor and leakage monitor are proposed for both PMOS and NMOS process corner detection, which are uncorrelated in inter-die variations, to improve the yield by adopting correct body bias.
Abstract: As technology scales down to the nanometer regime, the yield degradation caused by inter-die variations is getting worse. Using adaptive body bias is an effective method to mitigate the yield degradation (especially for memory compiler generated SRAMs); however, to use this technique we need to know whether a die has a high or a low threshold voltage (also called its process corner). Unfortunately, it is hard to detect the process corners when PMOS and NMOS variations are uncorrelated. In this paper, we propose some improved circuits of delay monitor and leakage monitor for both PMOS and NMOS process corner detection, which are uncorrelated in inter-die variations. The experimental results show that our circuits can clearly distinguish each process corner of PMOS and NMOS, and thus improve the yield by adopting the correct body bias.

Patent
01 Apr 2009
TL;DR: A semiconductor integrated circuit wafer includes: a plurality of semiconductor integrated circuit regions, each of which includes a semiconductor integrated circuit formed thereon; a scribe region which separates adjacent semiconductor integrated circuit regions; and a built-in self-test (BIST) circuit which is provided in the scribe region and inspects the semiconductor integrated circuit.
Abstract: A semiconductor integrated circuit wafer includes: a plurality of semiconductor integrated circuit regions each of which includes a semiconductor integrated circuit formed thereon; a scribe region which separates the semiconductor integrated circuit regions adjacent to each other; a built-in self-test (BIST) circuit which is provided in the scribe region and inspects the semiconductor integrated circuit; a connection wiring which is formed ranging from the scribe region to the semiconductor integrated circuit region and connects the semiconductor integrated circuit and the BIST circuit; a BIST switching signal input pad which is provided in the semiconductor integrated circuit region; and a BIST switching circuit which is provided in the semiconductor integrated circuit region and is driven by a driving signal input from the BIST switching signal input pad, the BIST switching circuit including: an input-output pad which connects with the semiconductor integrated circuit; a circuit wiring which connects the input-output pad with the semiconductor integrated circuit; and a switch element which is provided at a middle position of the circuit wiring and is driven by the driving signal input from the BIST switching signal input pad.

DOI
01 Jan 2009
TL;DR: SubJPEG, an ultra low-energy multi-standard JPEG encoder co-processor with a sub/near threshold power supply is designed and implemented and is largely applicable to designing other sound/graphic and streaming processors.
Abstract: Voltage scaling is one of the most effective and straightforward means of reducing the energy of CMOS digital circuits. Aggressive voltage scaling to the near or sub-threshold region helps achieve ultra-low energy consumption. However, it brings along big challenges to reach the required throughput and to have good tolerance of process variations. This thesis presents our research work in designing robust near/sub-threshold CMOS digital circuits. Our work has two features. First, unlike other research work that uses sub-threshold operation only for low-frequency low-throughput applications, we use architectural-level parallelism to compensate for throughput degradation, so a medium throughput of up to 100MB/s suitable for digital consumer electronic applications can be achieved. Second, several new techniques are proposed to mitigate the yield degradation due to process variations. These techniques include: (a) A configurable VT balancer to control the VT spread. When facing process corners in the sub-threshold, our balancer will balance the VT of p/nMOS transistors through bulk-biasing. (b) Transistor sizing to combat VT mismatch between transistors. This is needed if the circuit needs to be operated with a very deep sub-threshold supply voltage, i.e., below 250mV for a 65nm CMOS standard VT process. (c) Improving sub-threshold drivability by exploiting the VT mismatch between parallel transistors. While the VT mismatch between parallel transistors is usually regarded as harmful, we propose to utilize it to boost the driving current in the sub-threshold. This interesting approach also suggests using a multiple-finger layout style, which helps reduce silicon area considerably. (d) A selection procedure for the standard cells and how they were modified for higher reliability in the sub-threshold regime. Standard library cells that are sensitive to process variations must be eliminated in the synthesis flow. We provide a basic guideline to select "safe" cells. (e) A method that turns dangerous ratioed logic such as latches and registers into non-ratioed logic. SubJPEG, an ultra low-energy multi-standard JPEG encoder co-processor with a sub/near threshold power supply, is designed and implemented to demonstrate all these ideas. This 8-bit resolution DMA based co-processor has multiple power domains and multiple clock domains. It uses 4 parallel DCT-Quantization engines in the data path. Instruction-level parallelism is also used. All the parallelism is implemented in an efficient manner so as to minimize the associated area overhead. Details about this co-processor architecture and implementation issues are covered in this thesis. The prototype chip is fabricated in a TSMC 65nm 7-layer Low-Power Standard VT CMOS process. The core area is 1.4×1.4 mm². Each engine has its own VT balancer. Each VT balancer is 25×30 µm². The measurement results show that our VT balancer has a very good balancing effect. In the sub-threshold mode the engines can operate with a 2.5MHz clock frequency at 0.4V supply, with 0.75pJ energy per cycle per single engine for DCT and Quantization processing, i.e. 0.75pJ/(engine·cycle). This leads to an 8.3× energy/(engine·cycle) reduction when compared to using a 1.2V nominal supply. In the near-threshold regime the energy dissipation is about 1.1pJ/(engine·cycle) with a 0.45V supply voltage at 4.5MHz. The system throughput can meet the 15fps 640×480 pixel VGA compression standard. By further increasing the supply, the test chip can satisfy multi-standard image encoding. Our methodology is largely applicable to designing other sound/graphic and streaming processors.
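The energy figures quoted above can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes switching energy scales as C·V² and ignores leakage, and the helper names are hypothetical rather than taken from the thesis.

```python
def dynamic_energy_ratio(v_high, v_low):
    """Ideal switching-energy ratio between two supplies, assuming E ~ C * V^2."""
    return (v_high / v_low) ** 2


def power_watts(energy_per_cycle_j, freq_hz):
    """Average power of one engine: energy per cycle times clock frequency."""
    return energy_per_cycle_j * freq_hz


if __name__ == "__main__":
    # 1.2 V nominal versus 0.4 V sub-threshold supply
    print(dynamic_energy_ratio(1.2, 0.4))
    # Sub-threshold mode: 0.75 pJ per cycle at 2.5 MHz
    print(power_watts(0.75e-12, 2.5e6))
    # Near-threshold mode: 1.1 pJ per cycle at 4.5 MHz
    print(power_watts(1.1e-12, 4.5e6))
```

Under this idealised model the 1.2V to 0.4V step would give a 9× reduction, so the measured 8.3× is close to, but slightly below, the C·V² estimate; the abstract does not say what accounts for the difference.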

Patent
09 Sep 2009
TL;DR: A manufacturing method of a semiconductor integrated circuit device is provided that includes, in a plasma process, in-situ monitoring of moisture in a processing chamber by receiving an electromagnetic wave generated from the plasma.
Abstract: The present inventors have found that a wafer process of VLSI (Very Large Scale Integration) has the following problem, that is, generation of foreign matters due to moisture from a wafer as a result of degassing when a barrier metal film or a first-level metal interconnect layer is formed by sputtering as a preliminary step for the formation of a tungsten plug in a pre-metal step. To overcome the problem, the present invention provides a manufacturing method of a semiconductor integrated circuit device including, in a plasma process, in-situ monitoring of moisture in a processing chamber by receiving an electromagnetic wave generated from plasma.

Proceedings ArticleDOI
28 Dec 2009
TL;DR: In this paper, an ESD protected, SiGe BiCMOS wideband LNA operating at 1.1-1.7GHz is presented, where the effects of the ESD protection on the performance are discussed.
Abstract: An ESD protected, SiGe BiCMOS wide-band LNA operating at 1.1–1.7GHz is presented in this paper. The cascoded common-emitter LNA with an LC input matching network and shunt peaked load is adopted. The effects of the ESD protection on the performance are discussed. The LNA is implemented in a 0.35-μm SiGe BiCMOS process with fT = 45GHz. The post simulation results show that the noise figure is 1.7dB with a high S21 (22.5dB) and IIP3 of -17dBm, consuming total current of 8.1mA with an output buffer. The circuit is simulated under the combination of process corners and variation of temperature and power supply voltage.

Journal ArticleDOI
TL;DR: This study proposes a design-dependent statistical interconnect corner extraction methodology, SICE, which achieves a good trade-off between complexity and pessimism by extracting more than one process corner in a statistical sense, with corners that are also design dependent.
Abstract: While traditional worst-case corner analysis is often too pessimistic for nanometer designs, full-blown statistical circuit analysis requires significant modelling infrastructures. In this study, a design-dependent statistical interconnect corner extraction (SICE) methodology is proposed. SICE achieves a good trade-off between complexity and pessimism by extracting more than one process corner in a statistical sense, with corners that are also design dependent. Our new approach removes the pessimism incurred in prior work while being computationally efficient. The efficiency of SICE comes from the use of parameter dimension reduction techniques. The statistical corners are further compacted by an iterative output clustering method. Numerical results show that SICE achieves up to 260X speedups over the Monte Carlo method.

Proceedings ArticleDOI
15 Sep 2009
TL;DR: In this paper, a design of a given circuit block is optimized for multiple process corners, giving rise to multiple sub-designs, which can be implemented using the same front-end-of-the-line mask steps, and having back-end of the line processing differing by as few as one mask step (e.g., the Via1 layer).
Abstract: This paper proposes a new approach for reducing the consequences of global process variation and improving integrated circuit yield. In the proposed technique, a design of a given circuit block is optimized for multiple process corners, giving rise to multiple sub-designs. The sub-designs are constructed such that all can be implemented using the same front-end-of-the-line mask steps, and having back-end-of-the-line processing differing by as few as one mask step (e.g., the Via1 layer). During fabrication, in-line measurements made after the first level of metal deposition determine which sub-design is fabricated through the appropriate selection of the mask step variant. The technique allows for per-wafer or per-reticle circuit customization based on the wafer's or reticle's process parameters. A tapered buffer chain is investigated as an example of the technique. Simulation results show yield improvements of up to 20% and reductions in power dissipation up to 18%.

03 May 2009
TL;DR: This paper examines CM application to statistical and probabilistic technology variations based on the predictive and non-binnable model with minimal physically meaningful parameters in ULSI systems with CMOS design paradigms.
Abstract: ULSI systems are designed by electronic design automation (EDA) tools with performance figures-of-merit (FOM) measured by SPICE circuit simulation, in which nonlinear transistors are modeled by the compact model (CM) with its nominal set of parameters extracted from a golden die of the given technology. Inevitable technology variations are represented by parameter statistical distributions, from which process corners and variations are checked by Monte Carlo simulations within the design margins. In this paper, we examine CM application to statistical and probabilistic technology variations based on the predictive and non-binnable model with minimal physically meaningful parameters. To capture geometry variations physically, a binned model with too many empirical fitting parameters can never provide physically meaningful statistics. Statistics and probability theories are applied to the mathematical CM for describing major transistor FOM and their bias, geometry, and process variations as well as functional parameter sensitivities. Propagation of model statistics and variations to higher-level primitives (such as logic gates) and its application to probabilistic CMOS design paradigms is explored.

01 Jan 2009
TL;DR: In this paper, a methodology to design a performance enhanced sub-threshold standard cell library robust to process variations is discussed and an optimal design choice is made with energy-delay product as a metric.
Abstract: Digital subthreshold circuits are gaining importance because of their ability to serve as an ideal low power solution. In this paper, a methodology to design a performance enhanced subthreshold standard cell library robust to process variations is discussed. Several approaches to design a performance enhanced cell library are discussed and an optimal design choice is made with energy-delay product as a metric. Significant performance improvements of 2X, 8X and 1.5X are achieved for inverter, AND, and OR cells respectively over a regular cell library. The variation in delay for the proposed standard cell library with respect to four process corners is studied. A significant reduction of about 75.6% in delay variation across worst case process corners was observed when a normal inverter and an inverter from the high performance cell library were simulated.
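As a rough illustration of choosing a design by energy-delay product (EDP), the sketch below ranks a few candidate cells by energy times delay and keeps the smallest. The candidate names and the energy and delay values are hypothetical placeholders, not figures from the paper, and the paper's actual selection procedure may differ.

```python
def energy_delay_product(energy_j, delay_s):
    """EDP metric: switching energy (J) multiplied by propagation delay (s)."""
    return energy_j * delay_s

# Hypothetical candidates: name -> (energy in joules, delay in seconds).
candidates = {
    "A": (2.0e-15, 40e-9),
    "B": (3.0e-15, 20e-9),
    "C": (5.0e-15, 15e-9),
}

# Pick the candidate with the lowest EDP (lower is better).
best = min(candidates, key=lambda name: energy_delay_product(*candidates[name]))
print(best, energy_delay_product(*candidates[best]))
```

With these made-up numbers, neither the lowest-energy candidate nor the fastest one wins, which is the reason for using a combined metric.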

Journal ArticleDOI
TL;DR: A design and optimization technique is proposed to minimize the bit-line voltage differential variation across process corners and voltages, which increases the read frequency by reducing the delay guard-band required at the design process corner.

Book ChapterDOI
09 Sep 2009
TL;DR: Statistical Static Timing Analysis (SSTA) is a promising approach to deal with nanometer process variations, especially the intra-die variations that cannot be handled properly by existing corner-based techniques, in the digital design flow.
Abstract: As process parameter dimensions continue to scale down, the gap between the designed layout and what is really manufactured on silicon is increasing. Due to the difficulty in process control in advanced nanometer technologies, manufacturing-induced variations are growing both in number and as a percentage of device feature sizes, and a deep understanding of the different sources of variation, along with their characterization and modeling, has become mandatory. Furthermore, process variability makes the prediction of digital circuit performance an extremely challenging task. Traditionally, the methodology adopted to determine the performance spread of a design in presence of variability is to run multiple Static Timing Analyses at different process corners, where standard cells and interconnects have the worst/best combinations of delay. Unfortunately, as the number of variability sources increases, the corner-based method is becoming computationally very expensive. Moreover, with a larger parameter spread this approach results in overly conservative and suboptimal designs, leaving most of the advantages offered by the new technologies on the table. Statistical Static Timing Analysis (SSTA) is a promising approach to deal with nanometer process variations, especially the intra-die variations that cannot be handled properly by existing corner-based techniques, in the digital design flow. Finally, the complexity and the impact of the variability problem on design productivity and profitability require innovative design solutions at the circuit and architectural level, and some of the most promising techniques for variability-aware design will be presented.
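The pessimism of corner-based analysis for intra-die variation can be seen in a small worked example. The sketch below assumes a path of identical gates whose delays are independent Gaussian random variables; this simplified model and all numbers in it are assumptions for illustration, not taken from the chapter. A worst-case corner puts every gate at its +3-sigma delay at once, whereas the statistical 3-sigma path delay grows only with the square root of the gate count.

```python
import math

def corner_delay(n_gates, nominal, sigma, k=3.0):
    """Worst-case corner: every gate sits at its +k-sigma delay simultaneously."""
    return n_gates * (nominal + k * sigma)

def statistical_delay(n_gates, nominal, sigma, k=3.0):
    """k-sigma path delay when gate delays are independent Gaussians.

    The mean adds linearly, while the standard deviation of the sum
    is sigma * sqrt(n_gates).
    """
    return n_gates * nominal + k * sigma * math.sqrt(n_gates)

# Hypothetical path: 16 gates, 10 ps nominal delay, 1 ps sigma per gate.
n, d, s = 16, 10.0, 1.0
worst = corner_delay(n, d, s)
stat = statistical_delay(n, d, s)
print(f"corner: {worst} ps, statistical 3-sigma: {stat} ps")
```

Under these assumptions the corner estimate exceeds the statistical one, and the gap widens as the path gets longer. Fully correlated (inter-die) variation would not show this gap, which is why the distinction between intra-die and inter-die variation matters.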

Proceedings ArticleDOI
15 Apr 2009
TL;DR: This talk summarizes recent results obtained in the design of such cognitive computing and communication systems and points to directions for future work in this area.
Abstract: CMOS technology scaling along with the resulting large variability of circuit performance has made post-silicon circuit and algorithmic level built-in test and adaptation/tuning almost a necessity for deeply scaled technologies. Currently, circuits are designed to tolerate worst-case process corners. In addition, circuits as well as demodulation/signal processing algorithms must be designed for worst case operating conditions (e.g. environmental noise). This forces designers to excessively guard band their circuits while using “aggressive” back-end algorithms to support the end application, resulting in unacceptable power-performance-yield tradeoffs. One way to tackle this problem is to design circuits and relevant signal processing algorithms that are cognitive of their environmental operating conditions and manufacturing process conditions and use this cognition to perform self-adaptation that conserves power while maximizing yield and reliability. Such self-adaptation involves incorporation of built-in test, diagnosis and tuning/adaptation mechanisms into the circuits and systems concerned. A key issue is that of test, diagnosis and tuning of complex circuit and system-level parameters that must be evaluated and traded off against one another during the adaptation process without access to complex external test instrumentation. This talk summarizes recent results obtained in the design of such cognitive computing and communication systems and points to directions for future work in this area.