Showing papers in "Journal of Low Power Electronics in 2013"


Journal ArticleDOI
TL;DR: It is shown that significant energy, performance and area gains can be achieved, while trading a perceptually tolerable level of error, in contexts where the “quality” of the results of the computation is perceptually determined by human senses.
Abstract: It is widely acknowledged that the exponentially improving benefits of sustained technology scaling prophesied by Moore’s Law will end within the next decade or so, primarily because switching devices can no longer function deterministically as feature sizes are scaled down to molecular levels. The benefits of Moore’s Law could, however, continue provided systems with probabilistic or “error-prone” elements could still process information usefully. We believe that this is indeed possible in contexts where the “quality” of the results of the computation is perceptually determined by our senses, audio and video information being significant examples. To demonstrate this principle, we show how such “inexact” computing based devices, circuits and computing architectures can be used effectively to realize many ubiquitous energy-constrained error-resilient applications. Further, we show that significant energy, performance and area gains can be achieved while trading a perceptually tolerable level of error (which will ultimately be determined based on neurobiological models), applied in the context of video and audio data in digital signal processing. This design philosophy of inexact computing is of particular interest in the domain of embedded and portable multimedia applications and in emerging application domains such as recognition and data mining, all of which can tolerate inaccuracies to varying extents or can synthesize accurate (or sufficient) information even from inaccurate computations.

16 citations



Journal ArticleDOI
TL;DR: This paper proposes a two-write and two-read bit-cell for a multi-port (MP) SRAM design to improve the static noise margin (SNM) and solve the write-disturb issues of nanoscale CMOS technologies.
Abstract: This paper proposes a two-write and two-read (2W2R) bit-cell for a multi-port (MP) SRAM design to improve the static noise margin (SNM) and solve the write-disturb issues of nanoscale CMOS technologies. Using an additional Y-access MOS (column-direction access transistor), the 2W2R MP SRAM adopts a scheme of combining the row access transistor and sharing the write bit-line with an adjacent bit cell. This scheme halves the number of write bit-lines and mitigates the write current consumption caused by pre-charging the bit-line to VDD. This paper also proposes a selective read path structure for read operation. Replacing the ground connection in the read port with a virtual VSS controlled by a Y-select signal reduces read-port current consumption. Results show that the proposed design reduces both the write current and read current consumption by 30%, compared to the conventional MP structure, from 1.3 V to 0.6 V VDD. The proposed 8 Kb 2W2R MP SRAM was fabricated on the test chip using TSMC 40 nm CMOS technology.

11 citations



Journal ArticleDOI
TL;DR: A resonant DC-DC converter suitable for ultra-low power and low voltage sources is presented; it allows self-starting and self-operation under harsh conditions of input voltage and power without any additional start-up assistance.
Abstract: This article presents a resonant DC-DC converter suitable for ultra-low power and low voltage sources. This original topology allows self-starting and self-operation under harsh conditions of input voltage and power without any additional start-up assistance. A global theoretical model of the converter, which includes the start-up and steady-state phases, is presented and a methodology for optimal design is detailed. It is based on a combination of theoretical calculations and circuit simulations. Experimental tests based on discrete prototypes are carried out in order to demonstrate the correct operation of the converter. The experimental tests were performed using an RF energy harvesting source. Operation at power and voltage levels as low as 3 μW and 100 mV, respectively, is demonstrated by the experimental measurements. The low input voltage is stepped up to a conventional level of a few volts, which allows low power circuits to be powered autonomously and solely from energy harvesting sources.

8 citations





Journal ArticleDOI
TL;DR: A new circuit style is proposed to tune the delay, subthreshold leakage (ISUB), and gate leakage of high fan-in multiplexer circuits, such as the FPGA Look-Up Table (LUT) and Switch-Box (SB), without increasing the Gate Induced Drain Leakage (GIDL) current or causing any reliability problems.
Abstract: A new circuit style is proposed to tune the delay, subthreshold leakage (ISUB), and gate leakage (IG) of high fan-in multiplexer circuits, such as the FPGA Look-Up Table (LUT) and Switch-Box (SB), without increasing the Gate Induced Drain Leakage (GIDL) current or causing any reliability problems. In the proposed Adaptive Vgs (AVGS) style, Regular Threshold Voltage (RVT) transistors are replaced with Low-VT (LVT) ones, but during the active mode the new transistors have Vgs = -ΔV in the OFF condition and Vgs = VDD - ΔV in the ON condition, where ΔV is a new adjustable supply rail. AVGS can be a scalable replacement for the Adaptive Body Biasing (ABB) and Adaptive Supply Voltage (ASV) techniques in emerging manufacturing technologies that have a very small body effect and cannot tolerate voltages higher than the nominal supply voltage (VDD) due to reliability issues. The proposed technique is verified on silicon in a 90 nm technology and remarkable results are observed. AVGS can also be utilized to customize the FPGA delay-leakage trade-off. Area, leakage, dynamic power, and performance overheads are small.

6 citations


Journal ArticleDOI
TL;DR: A new recursive recoding algorithm is proposed that shortens the critical path of the multiplier and reduces the hardware complexity of partial-product-generators as well and provides an optimal space/time partitioning of the multiplier architecture for any size N of the operands.
Abstract: This paper addresses the problem of multiplication with large operand sizes (N≥32). We propose a new recursive recoding algorithm that shortens the critical path of the multiplier and reduces the hardware complexity of partial-product-generators as well. The new recoding algorithm provides an optimal space/time partitioning of the multiplier architecture for any size N of the operands. As a result, the critical path is drastically reduced to $3\sqrt[3]{N/2} - 3$ adder stages with no area overhead, in comparison to the modified Booth algorithm, which shows a critical path of $N/2$ adder stages. For instance, only 7 adder stages are needed for a 64-bit two's complement multiplier. Compared with reference algorithms for N=64, gain ratios of 1.62, 1.71 and 2.64 are obtained in terms of multiply time, energy consumption per multiply operation, and total gate count, respectively.

6 citations
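
The adder-stage figures quoted in the abstract above can be checked with a few lines of arithmetic. This is only a sketch: it reads the critical-path expression as 3·∛(N/2) − 3 (an assumption, chosen because it reproduces the 7 stages quoted for N = 64), rounds up to a whole number of stages (also an assumption), and uses hypothetical function names.

```python
import math


def booth_adder_stages(n: int) -> int:
    """Critical path of the modified Booth multiplier, N/2 adder stages."""
    return n // 2


def recoded_adder_stages(n: int) -> int:
    """Critical path under the assumed reading 3 * cbrt(N/2) - 3.

    Rounding up to an integer stage count is an assumption of this sketch.
    """
    return math.ceil(3 * (n / 2) ** (1 / 3) - 3)


if __name__ == "__main__":
    n = 64
    print(n, booth_adder_stages(n), recoded_adder_stages(n))
```

For N = 64 the expression evaluates to about 6.52, which rounds up to the 7 stages stated in the abstract, against 32 stages for the modified Booth form.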



Journal ArticleDOI
TL;DR: A methodology combining neural networks and evolutionary computing for quickly estimating peak power consumption is presented; applied to the Intel 8051 CPU core synthesized with a 65 nm industrial technology, it significantly reduces estimation time with respect to previous methods.
Abstract: High power consumption during test may lead to yield loss and premature aging. In particular, excessive peak power consumption during at-speed delay fault testing represents an important issue. In the literature, several techniques have been proposed to reduce peak power consumption during at-speed LOC or LOS delay testing. On the other hand, limiting the power consumption too much during test may reduce the defect coverage. Hence, techniques for identifying upper and lower functional power limits are crucial for delay fault testing. Yet, the task of computing the maximum functional peak power achievable by CPU cores is challenging, since the functional patterns with maximum peak power depend on specific instruction execution order and operands. In this paper, we present a methodology combining neural networks and evolutionary computing for quickly estimating peak power consumption. The method is used within an algorithm for automatic functional program generation used to identify test programs with maximal functional peak power consumption, which are suitable for defining peak power limits under test. The proposed approach was applied to the Intel 8051 CPU core synthesized with a 65 nm industrial technology, significantly reducing estimation time with respect to previous methods.




Journal ArticleDOI
TL;DR: The proposed approach aims at considerably alleviating the detrimental effects of current contention mechanisms occurring at critical switching nodes of the circuits; in this way, both latency and power consumption of pulse triggered flip-flops are reduced.
Abstract: In this paper, simple circuit techniques to design efficient pulse triggered flip-flops are presented. The proposed approach aims at considerably alleviating the detrimental effects of current contention mechanisms occurring at critical switching nodes of the circuits. In this way, both latency and power consumption of pulse triggered flip-flops are reduced. The proposed approach is assessed by means of simulations in 90-nm ST commercial CMOS technology. When applied to some recently proposed implicit pulse triggered flip-flop architectures, the suggested design strategy allows speed to be improved by up to 13% and the power-delay product to be lowered by up to 14%. Moreover, the process variation tolerance is also considerably improved.

Journal ArticleDOI
TL;DR: A design method for applying the resonant clocking approach for synthesized clock trees is presented, which attempts to minimize the number of LC tanks that can deliver a full swing signal to all the sink nodes by considering the capacitive load at each node to determine the location of LC tanks.
Abstract: Clock distribution networks consume a considerable portion of the power dissipated by synchronous circuits. In conventional clock distribution networks, clock buffers are inserted to retain signal integrity along the long interconnects, which, in turn, significantly increases the power consumed by the clock distribution network. Resonant clock distribution networks are considered efficient low-power alternatives to traditional clock distribution schemes. These networks utilize additional inductive circuits to reduce power while delivering a full swing clock signal to the sink nodes. A design method for applying the resonant clocking approach for synthesized clock trees is presented. The proper number and placement of LC tanks and the related resonance parameters are determined in the proposed method. This method attempts to minimize the number of LC tanks that can deliver a full swing signal to all the sink nodes by considering the capacitive load at each node to determine the location of LC tanks. Resonance parameters, such as the size of the inductor, can be adapted to reduce the power consumption and/or area overhead of the clock distribution network. Simulation results indicate up to 57% reduction in the power consumed by the resonant clock network as compared to a conventional buffered clock network. Compared to existing methods, the number of LC tanks for the proposed technique is reduced by up to 15% and the signal swing is also improved by 44%. Depending on whether power or area is the design objective, two different approaches are followed to determine the parameters of resonance. If the design objective is to lower the power consumed by the network, the power and area of the designed network improve by up to 24% and 51%, respectively, as compared to state-of-the-art methods. If a low area is targeted, the power and area improvements are 11% and 57%, respectively.


Journal ArticleDOI
TL;DR: SPICE simulation of dual-voltage ISCAS’85 benchmark circuits using the 90 nm bulk CMOS PTM (predictive technology model) shows energy savings of up to 60% with no increase in the original critical path delay and up to 70% with relaxed critical path delay.
Abstract: We propose a method for dual supply voltage digital design to reduce energy consumption without violating the given performance requirement. Although the basic idea of placing low voltage gates on non-critical paths is well known, a new two-step procedure does so more efficiently. First, given a circuit and its nominal single supply voltage, we find a suitable value for a lower second supply voltage that is likely to give the best advantage in power reduction. In addition, using the critical path timing constraint and a linear-time gate slack calculation, we classify gates into three groups. All gates in Group 1 can be simultaneously assigned the lower voltage. Any gate in Group 2 can be assigned the lower voltage, but then gate slacks must be recalculated because the group classifications may change. No gate in Group 3 can be assigned the lower voltage. A second step then assigns the lower voltage to the largest possible number of gates using the gate classifications and imposing a topological constraint that prevents any low voltage gate from feeding into a higher voltage gate, thus avoiding the use of level converters. SPICE simulation of dual-voltage ISCAS’85 benchmark circuits using the 90 nm bulk CMOS PTM (predictive technology model) shows energy savings of up to 60% with no increase in the original critical path delay and up to 70% with relaxed critical path delay.
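
Two ingredients named in the abstract above, a linear-time gate slack calculation and the topological constraint that no low voltage gate may feed a higher voltage gate, can be sketched as follows. This is a minimal illustration and not the authors' procedure: the netlist representation (`fanin`, `delay`), the function names, and the unit-free integer delays are all assumptions, and the three-group classification itself is not reproduced because the abstract does not define its thresholds.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def gate_slacks(fanin, delay):
    """Slack of every gate from one forward and one backward pass over the DAG.

    fanin: dict mapping each gate to the list of gates driving it (hypothetical format).
    delay: dict mapping each gate to its propagation delay.
    """
    order = list(TopologicalSorter(fanin).static_order())  # drivers come first
    arrival = {}
    for g in order:
        latest_input = max((arrival[p] for p in fanin.get(g, ())), default=0)
        arrival[g] = latest_input + delay[g]
    t_crit = max(arrival.values())  # critical path delay used as the timing constraint
    required = {g: t_crit for g in order}
    for g in reversed(order):  # every gate is finalized before its drivers are updated
        for p in fanin.get(g, ()):
            required[p] = min(required[p], required[g] - delay[g])
    return {g: required[g] - arrival[g] for g in order}


def low_feeds_high(fanin, low):
    """True if some gate in the low-voltage set drives a gate outside it."""
    return any(g not in low and p in low
               for g, preds in fanin.items() for p in preds)


if __name__ == "__main__":
    fanin = {"a": [], "c": [], "b": ["a"], "d": ["b", "c"]}
    delay = {"a": 1, "b": 1, "c": 3, "d": 1}
    print(gate_slacks(fanin, delay))
    print(low_feeds_high(fanin, {"b"}))
```

In this toy netlist gates a and b have positive slack, yet lowering the voltage of b alone would break the topological constraint because b drives d, which is the kind of case the second step has to rule out.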






Journal ArticleDOI
TL;DR: This paper models the Capture-Power minimization problem as an instance of the Bottleneck Traveling Salesman Path Problem (BTSPP) and presents a methodology for estimating a lower bound on the peak capture-power.
Abstract: IR-drop induced timing failures during testing can be avoided by minimizing the peak capture-power. This paper models the capture-power minimization problem as an instance of the Bottleneck Traveling Salesman Path Problem (BTSPP). The solution to the BTSPP implies an ordering of the input test vectors which, when followed during testing, minimizes the peak capture-power. The paper also presents a methodology for estimating a lower bound on the peak capture-power. Applying the proposed technique to ITC'99 benchmarks yielded optimal (equal to the estimated lower bound) results for all circuits. Interestingly, the technique also significantly reduced the average power consumed during testing when compared with commercial state-of-the-art tools.
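
The ordering idea in the abstract above can be illustrated on a toy scale. The sketch below treats test vectors as nodes and searches for the visiting order whose worst consecutive transition is smallest, which is the bottleneck path objective. It rests on loud assumptions: Hamming distance between consecutive vectors stands in for capture power (the paper's actual cost model is not given here), the search is brute force over all permutations and only viable for a handful of vectors, and every name is hypothetical.

```python
from itertools import permutations


def hamming(a: str, b: str) -> int:
    """Number of differing bits; an assumed proxy for switching activity."""
    return sum(x != y for x, y in zip(a, b))


def bottleneck(order) -> int:
    """Largest transition cost between consecutive vectors in the order."""
    return max((hamming(a, b) for a, b in zip(order, order[1:])), default=0)


def best_order(vectors):
    """Exhaustive bottleneck-path search: minimize the worst transition."""
    return min(permutations(vectors), key=bottleneck)


if __name__ == "__main__":
    vectors = ["0000", "1111", "0011", "0111"]
    order = best_order(vectors)
    print(bottleneck(vectors), bottleneck(order), order)
```

Here the given order has a worst transition of 4 bit flips (0000 to 1111), while a reordering brings the peak down to 2; no order can do better, because 0000 differs from every other vector in at least 2 bits.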