
Showing papers in "ACM Journal on Emerging Technologies in Computing Systems in 2012"


Journal ArticleDOI
TL;DR: Through architecture-space exploration in conjunction with novel power-efficient on-chip wireless link design, it is demonstrated that it is possible to improve performance of conventional NoC architectures significantly without incurring high area overhead.
Abstract: Massive levels of integration are making modern multicore chips pervasive in several domains. High performance, robustness, and energy-efficiency are crucial for the widespread adoption of such platforms. Networks-on-Chip (NoCs) have emerged as communication backbones to enable a high degree of integration in multicore Systems-on-Chip (SoCs). Despite their advantages, an important performance limitation in traditional NoCs arises from planar metal interconnect-based multihop links with high latency and power consumption. This limitation can be addressed by drawing inspiration from the evolution of natural complex networks, which offer great performance-cost trade-offs. Analogous to many natural complex systems, future multicore chips are expected to be hierarchical and heterogeneous in nature as well. In this article we undertake a detailed performance evaluation for hierarchical small-world NoC architectures where long-range communication links are established through millimeter-wave wireless channels. Through architecture-space exploration in conjunction with novel power-efficient on-chip wireless link design, we demonstrate that it is possible to improve performance of conventional NoC architectures significantly without incurring high area overhead.

118 citations


Journal ArticleDOI
TL;DR: The article provides a comprehensive presentation of the architectural, software, and algorithmic issues for energy-aware scheduling of workflows on single, multicore, and parallel architectures and includes a systematic taxonomy of the algorithms developed in the literature based on the overall optimization goals and characteristics of applications.
Abstract: Enabled by high-speed networking in commercial, scientific, and government settings, the realm of high-performance computing is burgeoning with greater amounts of computational and storage resources. Large-scale systems such as computational grids consume a significant amount of energy due to their massive sizes. The energy and cooling costs of such systems are often comparable to the procurement costs over a one-year period. In this survey, we discuss allocation and scheduling algorithms, systems, and software for reducing power and energy dissipation of workflows on the target platforms of single processors, multicore processors, and distributed systems. Furthermore, we investigate recent research achievements that deal with power and energy efficiency via different power management techniques and application scheduling algorithms. The article provides a comprehensive presentation of the architectural, software, and algorithmic issues for energy-aware scheduling of workflows on single, multicore, and parallel architectures. It also includes a systematic taxonomy of the algorithms developed in the literature based on the overall optimization goals and characteristics of applications.

69 citations


Journal ArticleDOI
TL;DR: This work proposes a torus-based hierarchical hybrid optical-electronic NoC, called THOE, which employs several new techniques including floorplan optimization, an adaptive power control mechanism, low-latency control protocols, and hybrid optical-electrical routers with a low-power optical switching fabric.
Abstract: Networks-on-chip (NoCs) are emerging as a key on-chip communication architecture for multiprocessor systems-on-chip (MPSoCs). Optical communication technologies are introduced to NoCs in order to empower ultra-high bandwidth with low power consumption. However, in existing optical NoCs, communication locality is poorly supported, and the importance of floorplanning is overlooked. These significantly limit the power efficiency and performance of optical NoCs. In this work, we address these issues and propose a torus-based hierarchical hybrid optical-electronic NoC, called THOE. THOE takes advantage of both electrical and optical routers and interconnects in a hierarchical manner. It employs several new techniques including floorplan optimization, an adaptive power control mechanism, low-latency control protocols, and hybrid optical-electrical routers with a low-power optical switching fabric. Both the unfolded and folded torus topologies are explored for THOE. Based on a set of real MPSoC applications, we compared THOE with a typical torus-based optical NoC as well as a torus-based electronic NoC in 45nm on a 256-core MPSoC, using a SystemC-based cycle-accurate NoC simulator. Compared with the matched electronic torus-based NoC, THOE achieves 2.46X performance and 1.51X network switching capacity utilization, with 84% less energy consumption. Compared with the optical torus-based NoC, THOE achieves 4.71X performance and 3.05X network switching capacity utilization, while reducing energy consumption by 99%. Besides real MPSoC applications, a uniform traffic pattern is also used to show the average packet delay and network throughput of THOE. Regarding hardware cost, THOE requires 75% fewer laser sources and half as many optical receivers compared with the optical torus-based NoC.

67 citations


Journal ArticleDOI
TL;DR: This work proposes for the first time, a hybrid memory that aims to incorporate the area advantage provided by the utilization of multilevel logic and nanoscale memristive devices in conjunction with CMOS for the realization of a high density nonvolatile multilevel memory.
Abstract: With technology migration into nano and molecular scales, several hybrid CMOS/nano logic and memory architectures have been proposed that aim to achieve high device density with low power consumption. The discovery of the memristor has further enabled the realization of denser nanoscale logic and memory systems by facilitating the implementation of multilevel logic. This work describes the design of such a multilevel nonvolatile memristor memory system, and the design constraints imposed in the realization of such a memory. In particular, we analyze the limitations placed on load, bank size, and the number of bits achievable per device by the noise margin required for accurately reading and writing the data stored in a device. Also analyzed are the nondisruptive read and write methodologies for the hybrid multilevel memristor memory, which program and read the memristive information without corrupting it. This work showcases two write methodologies that leverage the best traits of memristors when used in either linear (low-power) or nonlinear drift (fast-speed) modes. The system can therefore be tailored, depending on the required performance parameters of a given application, for a fast memory or a slower but very energy-efficient system. We propose, for the first time, a hybrid memory that aims to incorporate the area advantage provided by the utilization of multilevel logic and nanoscale memristive devices in conjunction with CMOS for the realization of a high-density nonvolatile multilevel memory.

64 citations


Journal ArticleDOI
TL;DR: The wireless implantable/intracavity micromanometer system was designed to fulfill the unmet need for a chronic bladder pressure sensing device in urological fields such as urodynamics for diagnosis and neuromodulation for bladder control.
Abstract: The wireless implantable/intracavity micromanometer (WIMM) system was designed to fulfill the unmet need for a chronic bladder pressure sensing device in urological fields such as urodynamics for diagnosis and neuromodulation for bladder control. Neuromodulation in particular would benefit from a wireless bladder pressure sensor which could provide real-time pressure feedback to an implanted stimulator, resulting in greater bladder capacity while using less power. The WIMM uses custom integrated circuitry, a MEMS transducer, and a wireless antenna to transmit pressure telemetry at a rate of 10 Hz. Aggressive power management techniques yield an average current draw of 9 μA from a 3.6-Volt micro-battery, which minimizes the implant size. Automatic pressure offset cancellation circuits maximize the sensing dynamic range to account for drifting pressure offset due to environmental factors, and a custom telemetry protocol allows transmission with minimum overhead. Wireless operation of the WIMM has demonstrated that the external receiver can receive the telemetry packets, and the low power consumption allows for at least 24 hours of operation with a 4-hour wireless recharge session.

62 citations


Journal ArticleDOI
TL;DR: A very brief glance back at the early history of implant electronics in the period from the 1950s to the 1970s is offered, by employing selected examples from the author’s research.
Abstract: Implantable systems for biomedical research and clinical care are now a flourishing field of activity in academia as well as industrial institutions. The broad field includes experimental explorations in electronic, mechanical, chemical, and biological components and systems, and the combination of all of these. Today virtually all implants involve both electronic circuits and micro-electro-mechanical systems (MEMS). This article offers a very brief glance back at the early history of implant electronics in the period from the 1950s to the 1970s, by employing selected examples from the author’s research. This short review also discusses the challenges of implantable electronics at present, and suggests some potentially important trends in the future research and development of implantable microsystems. It is intended as an introduction to implantable/attached electronic systems for research engineers who are interested in implantable systems as a branch of biomedical instrumentation.

36 citations


Journal ArticleDOI
TL;DR: In this article, an adder for the 2Dimensional Nearest-Neighbor, Two-Qubit gate, Concurrent (2D NTC) architecture, designed to match the architectural constraints of many quantum computing technologies, is presented.
Abstract: In this work, we propose an adder for the 2-Dimensional Nearest-Neighbor, Two-Qubit gate, Concurrent (2D NTC) architecture, designed to match the architectural constraints of many quantum computing technologies. The chosen architecture allows the layout of logical qubits in two dimensions with √n columns where each column has √n qubits and the concurrent execution of one- and two-qubit gates with nearest-neighbor interaction only. The proposed adder works in three phases. In the first phase, the first column generates the summation output and the other columns do the carry-lookahead operations. In the second phase, these intermediate values are propagated from column to column, preparing for computation of the final carry for each register position. In the last phase, each column, except the first one, generates the summation output using this column-level carry. The depth and the number of qubits of the proposed adder are Θ(√n) and O(n), respectively. The proposed adder executes faster than the adders designed for the 1D NTC architecture when the length of the input registers n is larger than 51.

32 citations
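The three-phase column scheme described in the abstract above can be illustrated with a purely classical model (this is an illustrative sketch, not the paper's quantum circuit; the function name and structure are my own):

```python
import math

def three_phase_add(a: int, b: int, n: int) -> int:
    """Classically model the column decomposition of the 2D NTC adder:
    n-bit registers are split into roughly sqrt(n) columns of w bits."""
    w = max(1, math.isqrt(n))          # bits (qubits) per column
    cols = (n + w - 1) // w            # number of columns
    mask = (1 << w) - 1

    # Phase 1: each column computes carry-lookahead signals in parallel:
    # g = carry-out with carry-in 0, p = carry-out with carry-in 1.
    g, p = [], []
    for i in range(cols):
        ai = (a >> (i * w)) & mask
        bi = (b >> (i * w)) & mask
        g.append((ai + bi) >> w)
        p.append((ai + bi + 1) >> w)

    # Phase 2: column-level carries propagate from column to column.
    c = [0]
    for i in range(cols):
        c.append(g[i] | (p[i] & c[i]))

    # Phase 3: each column forms its final sum using its column carry.
    out = 0
    for i in range(cols):
        ai = (a >> (i * w)) & mask
        bi = (b >> (i * w)) & mask
        out |= ((ai + bi + c[i]) & mask) << (i * w)
    return out & ((1 << n) - 1)
```

Phase 2 is the only part whose work grows with the number of columns, which is why the circuit depth scales as Θ(√n) rather than Θ(n).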


Journal ArticleDOI
TL;DR: This article proposes a family of barely alive active low-power server states that facilitates both fast reactivation and access to memory while in a low- power state and finds that the barely alive states can reduce service energy consumption by up to 38%, compared to an energy-oblivious system.
Abstract: Current resource provisioning schemes in Internet services leave servers less than 50% utilized almost all the time. At this level of utilization, the servers' energy efficiency is substantially lower than at peak utilization. A solution to this problem could be dynamically consolidating workloads into fewer servers and turning others off. However, services typically resist doing so, because of high response times during reactivation in handling traffic spikes. Moreover, services often want the memory and/or storage of all servers to be readily available at all times. In this article, we propose a family of barely alive active low-power server states that facilitates both fast reactivation and access to memory while in a low-power state. We compare these states to previously proposed active and idle states. In particular, we investigate the impact of load bursts in each energy-saving scheme. We also evaluate the additional benefits of memory access under low-power states with a study of a search service using a cooperative main-memory cache. Finally, we propose a system that combines a barely-alive state with the off state. We find that the barely alive states can reduce service energy consumption by up to 38%, compared to an energy-oblivious system. We also find that these energy savings are consistent across a large parameter space.

28 citations


Journal ArticleDOI
TL;DR: Heuristic algorithms are developed and it is proved that the approximation ratio on the minimum total cost is bounded by the number of data centers, and how relaxing the delay requirement for a small fraction of users can increase the cost savings of DAHM is explored.
Abstract: Dynamic Application Hosting Management (DAHM) is proposed for geographically distributed data centers, which decides on the number of active servers and on the workload share of each data center. DAHM achieves cost-efficient application hosting by taking into account: (i) the spatio-temporal variation of energy cost, (ii) the data center computing and cooling energy efficiency, (iii) the live migration cost, and (iv) any SLA violations due to migration overhead or network delay. DAHM is modeled as fixed-charge min-cost flow and mixed integer programming for stateless and stateful applications, respectively, and is shown to be NP-hard. We also develop heuristic algorithms and prove, when applications are stateless and servers have an identical power consumption model, that the approximation ratio on the minimum total cost is bounded by the number of data centers. Further, the heuristics are evaluated in a simulation study using realistic parameter data; compared to a performance-oriented application assignment, that is, hosting at the data center with the least delay, the potential cost savings of DAHM reach 33%. The savings come from reducing the total number of active servers as well as leveraging the cost efficiency of data centers. Through the simulation study, the article further explores how relaxing the delay requirement for a small fraction of users can increase the cost savings of DAHM.

20 citations
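The kind of cost trade-off DAHM optimizes can be sketched with a toy greedy assignment (this is not the paper's algorithm; all parameter names and the cost model are hypothetical simplifications of the factors (i)-(iv) listed above):

```python
def assign_load(demand, centers, delay_bound, sla_penalty):
    """Greedy sketch: route each unit of demand to the data center with
    the lowest marginal cost, combining energy price, cooling overhead
    (PUE), and an SLA penalty when network delay exceeds the bound."""
    share = {c["name"]: 0 for c in centers}
    for _ in range(demand):
        def marginal(c):
            cost = c["energy_price"] * c["power_per_req"] * c["pue"]
            if c["delay"] > delay_bound:
                cost += sla_penalty        # SLA violation surcharge
            if share[c["name"]] >= c["capacity"]:
                cost = float("inf")        # data center is full
            return cost
        best = min(centers, key=marginal)
        share[best["name"]] += 1
    return share
```

With two centers where the cheaper one has limited capacity, load spills over to the more expensive site only once the cheap one fills, mirroring how DAHM trades energy cost against delay and capacity.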


Journal ArticleDOI
TL;DR: A graphical Computer-Aided Design (CAD) environment for the design, analysis, and layout of Carbon NanoTube (CNT) Field-Effect Transistor (CNFET) circuits and provides users with a customizable CNFET technology library with the ability to specify λ-based design rules.
Abstract: In this article, we present a graphical Computer-Aided Design (CAD) environment for the design, analysis, and layout of Carbon NanoTube (CNT) Field-Effect Transistor (CNFET) circuits. This work is motivated by the fact that such a tool currently does not exist in the public domain for researchers. Our tool has been integrated within Electric, a powerful yet free CAD system for custom design of Integrated Circuits (ICs). The tool supports CNFET schematic and layout entry, rule checking, and HSpice/VerilogA netlist generation. We provide users with a customizable CNFET technology library with the ability to specify λ-based design rules. We showcase the capabilities of our tool by demonstrating the design of a large CNFET standard cell and components library. HSPICE simulations are also presented for cell library characterization. We hope that the availability of this tool will invigorate the CAD community to explore novel ideas in CNFET circuit design.

18 citations


Journal ArticleDOI
TL;DR: This study presents an NML programmable logic array (PLA) based on a previously proposed reprogrammable quantum-dot cellular automata PLA design, and uses results from this study to shape a concluding discussion about which architectures appear to be most suitable for NML.
Abstract: In order to continue the performance and scaling trends that we have come to expect from Moore’s Law, many emergent computational models, devices, and technologies are actively being studied to either replace or augment CMOS technology. Nanomagnet Logic (NML) is one such alternative. NML operates at room temperature, it has the potential for low power consumption, and it is CMOS compatible. In this article, we present an NML programmable logic array (PLA) based on a previously proposed reprogrammable quantum-dot cellular automata PLA design. We also discuss the fabrication and simulation validation of the circuit structures unique to the NML PLA, present area, energy, and delay estimates for the NML PLA, compare the area of NML PLAs to other reprogrammable nanotechnologies, and analyze how architectural-level redundancy will affect performance and defect tolerance in NML PLAs. We use results from this study to shape a concluding discussion about which architectures appear to be most suitable for NML.

Journal ArticleDOI
TL;DR: A fast method to identify the given Boolean function as a threshold function with weight assignment based on the parameters that have been defined in the literature is introduced and has been shown to operate fast for functions with as many as forty inputs.
Abstract: A fast method to identify a given Boolean function as a threshold function, with weight assignment, is introduced. It characterizes the function based on parameters that have been defined in the literature. The proposed method can quickly characterize all functions that have fewer than eight inputs and has been shown to operate fast for functions with as many as forty inputs. Furthermore, comparisons with other existing heuristic methods show a huge increase in the number of threshold functions identified, and a drastic reduction in time and complexity.
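For context, the defining property being tested can be checked by brute force over small integer weights (this naive enumeration is only an illustration of the definition, far slower than the paper's method):

```python
from itertools import product

def is_threshold(f, n, wmax=4):
    """A Boolean function f on n inputs is a threshold function iff
    integer weights w and a threshold T exist such that
    f(x) = 1 exactly when sum(w_i * x_i) >= T."""
    points = list(product((0, 1), repeat=n))
    rng = range(-wmax, wmax + 1)
    for w in product(rng, repeat=n):
        for t in rng:
            if all((sum(wi * xi for wi, xi in zip(w, x)) >= t) == bool(f(x))
                   for x in points):
                return True                # found a valid (w, T) pair
    return False
```

For example, 2-input AND is a threshold function (weights (1, 1), threshold 2), while XOR is the classic non-threshold function: no linear separation of its truth table exists.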

Journal ArticleDOI
TL;DR: The DCeP metric was found to be successful in clearly distinguishing different operational states in the data center, thereby validating its utility as a metric for identifying configurations of hardware and software that would improve (or even maximize) energy productivity.
Abstract: As data centers proliferate in size and number, the endeavor to improve their energy efficiency and productivity is becoming increasingly important. We discuss the properties of a number of the proposed metrics of energy efficiency and productivity. In particular, we focus on the Data Center Energy Productivity (DCeP) metric, which is the ratio of useful work produced by the data center to the energy consumed performing that work. We describe our approach for using DCeP as the principal outcome of a designed experiment using a highly instrumented, high-performance computing data center. We found that DCeP was successful in clearly distinguishing different operational states in the data center, thereby validating its utility as a metric for identifying configurations of hardware and software that would improve (or even maximize) energy productivity. We also discuss some of the challenges and benefits associated with implementing the DCeP metric, and we examine the efficacy of the metric in making comparisons within a data center and among data centers.
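The DCeP metric as defined above is a simple ratio; a minimal sketch (function name and units are my own, and the definition of "useful work" is workload-specific, as the article discusses):

```python
def dcep(useful_work: float, energy_kwh: float) -> float:
    """Data Center energy Productivity: useful work produced by the
    data center divided by the energy consumed performing that work."""
    if energy_kwh <= 0:
        raise ValueError("energy consumed must be positive")
    return useful_work / energy_kwh
```

Comparing the metric across operational states with the same energy budget, e.g. dcep(1.2e6, 400) versus dcep(1.0e6, 400), identifies the more energy-productive configuration.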

Journal ArticleDOI
TL;DR: This tutorial article presents a blueprint for a “net-zero data center”: one that offsets any electricity used from the grid via adequate on-site power generation that gets fed back to the grid at a later time.
Abstract: A world consisting of billions of service-oriented client devices and thousands of data centers can deliver a diverse range of services, from social networking to management of our natural resources. However, these services must scale in order to meet the fundamental needs of society. To enable such scaling, the total cost of ownership of the data centers that host the services and comprise the vast majority of service delivery costs will need to be reduced. As energy drives the total cost of ownership of data centers, there is a need for a new paradigm in design and management of data centers that minimizes energy used across their lifetimes, from “cradle to cradle”. This tutorial article presents a blueprint for a “net-zero data center”: one that offsets any electricity used from the grid via adequate on-site power generation that gets fed back to the grid at a later time. We discuss how such a data center addresses the total cost of ownership, illustrating that contrary to the oft-held view of sustainability as “paying more to be green”, sustainable data centers—built on a framework that focuses on integrating supply and demand management from end-to-end—can concurrently lead to lowest cost and lowest environmental impact.

Journal ArticleDOI
TL;DR: Analytical and experimental results demonstrate that the Markov chain-based scheme can improve the performance in terms of connection delay without affecting the time efficiency, or vice versa, as opposed to the trade-off observed in traditional schemes.
Abstract: To extend the lifetime of a wireless sensor network, sensor nodes usually duty cycle between dormant and active states. Duty cycling schemes are often evaluated in terms of connection delay, connection duration, and duty cycle. In this article, we show with experiments on Sun SPOT sensors that duty cycling time (energy) efficiency, that is, the ratio of time (energy) employed in ancillary operations when switching from and into deep sleep mode, is an important performance metric too. We propose a novel randomized duty cycling scheme based on Markov chains with the goal of (i) reducing the connection delay, while maintaining a given time (energy) efficiency, or (ii) keeping a constant connection delay, while increasing the time (energy) efficiency. Analytical and experimental results demonstrate that the Markov chain-based scheme can improve the performance in terms of connection delay without affecting the time efficiency, or vice versa, as opposed to the trade-off observed in traditional schemes. We extend the proposed duty cycling scheme to a partially randomized scheme, where wireless nodes can switch into active state beyond their schedules when their neighbors are active to anticipate message forwarding. The analytical and experimental results confirm the relationship between connection delay and time efficiency also for this scheme.
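The trade-off between switching overhead and time efficiency described above can be sketched with a toy two-state Markov duty-cycle simulation (this is an illustrative model with hypothetical parameters, not the paper's scheme; it assumes a per-switch ancillary cost of at least one slot):

```python
import random

def simulate(p_wake, p_sleep, overhead, slots, seed=1):
    """Two-state Markov duty cycle sketch: each slot, a dormant node
    wakes with prob p_wake and an active node sleeps with prob p_sleep.
    Every state switch costs `overhead` slots of ancillary work (waking
    from / entering deep sleep). Returns (duty_cycle, time_efficiency)."""
    rng = random.Random(seed)
    active = ancillary = t = 0
    state = 0                      # 0 = dormant, 1 = active
    while t < slots:
        if rng.random() < (p_wake if state == 0 else p_sleep):
            state ^= 1
            ancillary += overhead  # switching cost: not useful time
            t += overhead
        else:
            active += state
            t += 1
    return active / t, 1 - ancillary / t
```

Lower switching probabilities raise time efficiency but lengthen connection delay; the Markov chain scheme in the article tunes the transition probabilities to move along this curve rather than accept the trade-off as fixed.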

Journal ArticleDOI
TL;DR: Results indicate that simply increasing the number of planes of a 3D IC does not necessarily lead to lower skew variation and higher operating frequencies, and a multigroup 3D clock tree topology is proposed to effectively mitigate the variability of clock skew.
Abstract: In three-dimensional (3D) integrated circuits, the effect of process variations on clock skew differs from 2D circuits. The combined effect of inter-die and intra-die process variations on the design of 3D clock distribution networks is considered in this article. A statistical clock skew model incorporating both the systematic and random components of process variations is employed to describe this effect. Two regular 3D clock tree topologies are investigated and compared in terms of clock skew variation. The statistical skew model used to describe clock skew variations is verified through Monte-Carlo simulations. The clock skew is shown to change in different ways with the number of planes forming the 3D IC and the clock network architecture. Simulations based on a 45-nm CMOS technology show that the maximum standard deviation of clock skew can vary from 15 ps to 77 ps. Results indicate that simply increasing the number of planes of a 3D IC does not necessarily lead to lower skew variation and higher operating frequencies. A multigroup 3D clock tree topology is proposed to effectively mitigate the variability of clock skew. Tradeoffs between the investigated 3D clock distribution networks and the number of planes comprising a 3D circuit are discussed and related design guidelines are offered. The skew variation in 3D clock trees is also compared with the skew variation of clock grids.
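The statistical decomposition used above (systematic inter-die plus random intra-die components) can be illustrated with a small Monte Carlo sketch (all numeric parameters here are illustrative placeholders, not the paper's 45-nm data):

```python
import random
import statistics

def skew_std(trials=2000, stages=8, sig_inter=5.0, sig_intra=2.0, seed=7):
    """Monte Carlo sketch of 3D clock skew (units arbitrary, e.g. ps):
    two clock sinks sit on different planes of the stack, so each path
    sees its own systematic inter-die delay shift, while every buffer
    adds an independent intra-die random term. Skew = path difference."""
    rng = random.Random(seed)
    skews = []
    for _ in range(trials):
        plane_a = rng.gauss(0, sig_inter)   # per-plane systematic shift
        plane_b = rng.gauss(0, sig_inter)
        path_a = sum(10 + plane_a + rng.gauss(0, sig_intra)
                     for _ in range(stages))
        path_b = sum(10 + plane_b + rng.gauss(0, sig_intra)
                     for _ in range(stages))
        skews.append(path_a - path_b)
    return statistics.pstdev(skews)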

Journal ArticleDOI
TL;DR: The analysis reveals that the nanophotonic switch is resistant to a skew longer than the input signal duration, and the tolerance to skew is asymmetric with respect to the two inputs.
Abstract: We examine the timing dependence of nanophotonic devices based on optical excitation transfer via optical near-field interactions at the nanometer scale. We theoretically analyze the dynamic behavior of a two-input nanophotonic switch composed of three quantum dots based on a density matrix formalism while assuming arrival-time differences, or skew, between the inputs. The analysis reveals that the nanophotonic switch is resistant to a skew longer than the input signal duration, and the tolerance to skew is asymmetric with respect to the two inputs. The skew dependence is also experimentally examined based on near-field spectroscopy of InGaAs quantum dots, showing good agreement with the theory. Elucidating the dynamic properties of nanophotonics, together with the associated spatial and energy dissipation attributes at the nanometer scale, will provide critical insights for novel system architectures.

Journal ArticleDOI
TL;DR: Although conventional DVFS might become less effective with technology scaling, it will continue to play an important role in the context of emerging power management techniques, for example, for massively parallel multiprocessor systems where only a subset of cores can be turned on at any given point of time due to total power constraints.
Abstract: Runtime power management is a critical technique for reducing the energy footprint of digital electronic devices and enabling sustainable computing, since it allows electronic devices to dynamically adapt their power and energy consumption to meet performance requirements. In this article, we consider the case of MultiProcessor Systems-on-Chip (MPSoC) implemented using multiple Voltage and Frequency Islands (VFIs) relying on fine-grained Dynamic Voltage and Frequency Scaling (DVFS) to reduce the system power dissipation. In particular, we present a framework to theoretically analyze the impact of three important technology-driven constraints: (i) reliability-driven upper limits on the maximum supply voltage; (ii) inductive noise-driven constraints on the maximum rate of change of voltage/frequency; and (iii) the impact of manufacturing process variations on the performance of DVFS control for multiple VFI MPSoCs. The proposed analysis is general, in the sense that it is not bound to a specific DVFS control algorithm, but instead focuses on theoretically bounding the performance that any DVFS controller can possibly achieve. Our experimental results on real and synthetic benchmarks show that in the presence of reliability- and temperature-driven constraints on the maximum frequency and maximum frequency increment, any DVFS control algorithm will lose up to 87% of performance in terms of the number of steps required to reach a reference steady state. In addition, increasing process variations can lead to up to 60% of fabricated chips being unable to meet the specified DVFS control specifications, irrespective of the DVFS algorithm used.
Nonetheless, we note that although conventional DVFS might become less effective with technology scaling, it will continue to play an important role in the context of emerging power management techniques, for example, for massively parallel multiprocessor systems where only a subset of cores can be turned on at any given point of time due to total power constraints.
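The step-count performance bound discussed above can be illustrated with a minimal model of a rate-limited frequency controller (an illustrative sketch, not the paper's framework; names and units are my own):

```python
def steps_to_target(f_now, f_target, max_step, f_max):
    """Count DVFS control steps needed to reach a reference frequency
    when both the maximum per-step increment (inductive-noise limit)
    and the maximum frequency (reliability limit) are constrained."""
    f_target = min(f_target, f_max)   # reliability cap on frequency
    steps = 0
    while f_now < f_target:
        f_now = min(f_now + max_step, f_target)  # rate-limited step
        steps += 1
    return steps
```

For example (frequencies in MHz), tightening the per-step limit from 200 to 50 quadruples the number of steps to ramp from 1000 to 2000, which is the mechanism behind the step-count performance loss any controller incurs under these constraints.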

Journal ArticleDOI
TL;DR: A dynamic technique is proposed that controls the instance of data capture in critical path memory flops, by delaying the clock edge trigger, and improves the timing yield of the circuit without significant overcompensation.
Abstract: In the nanometer era, process, voltage, and temperature variations are dominating circuit performance, power, and yield. Over the past few years, statistical optimization methods have been effective in improving yield in the presence of uncertainty due to process variations. However, statistical methods overconsume resources, even in the absence of variations. Hence, to facilitate a better performance-power-yield trade-off, techniques that can dynamically enable variation compensation are becoming necessary. In this article, we propose a dynamic technique that controls the instance of data capture in critical path memory flops, by delaying the clock edge trigger. The methodology employs a dynamic delay detection circuit to identify the uncertainty in delay due to variations and stretches the clock in the destination flip-flops. The delay detection circuit uses a latch and a set of combinational gates to dynamically detect and create the slack needed to accommodate the delay due to variations. The Clock Stretching Logic (CSL) is added only to paths which have a high probability of failure in the presence of variations. The proposed methodology improves the timing yield of the circuit without significant overcompensation. The methodology was simulated using Synopsys design tools for circuit synthesis and Cadence tools for placement and routing of the design. Extracted parasitic timing information was parsed using Perl scripts and simulated with a C++ simulation program. Experimental results based on Monte-Carlo simulations on benchmark circuits indicate considerable improvement in timing yield with negligible area overhead.

Journal ArticleDOI
TL;DR: An orientation strategy for layout of a multichip that reduces routing congestion and consequently facilitates wire routing for the electrode array and supports a hierarchical approach to wire routing that ensures scalability is presented.
Abstract: Potential applications of digital microfluidic (DMF) biochips now include several areas of real-life applications like environmental monitoring, water and air pollutant detection, and food processing, to name a few. In order to achieve sufficiently high throughput for these applications, several instances of the same bioassay may be required to be executed concurrently on different samples. As a straightforward implementation, several identical biochips can be integrated on a single substrate as a multichip to execute the assay for various samples concurrently. Controlling individual electrodes of such a chip by independent pins may not be acceptable since it increases the cost of fabrication. Thus, in order to keep the overall pin-count within an acceptable bound, all the respective electrodes of these individual pieces are connected internally underneath the chip so that they can be controlled with a single external control pin. In this article, we present an orientation strategy for layout of a multichip that reduces routing congestion and consequently facilitates wire routing for the electrode array. The electrode structure of the individual pieces of the multichip may be either direct-addressable or pin-constrained. The method also supports a hierarchical approach to wire routing that ensures scalability. In this scheme, the size of the biochip in terms of the total number of electrodes may be increased by a factor of four by increasing the number of routing layers by only one. In general, for a multichip with 4^n identical blocks, (n + 1) layers are sufficient for wire routing.
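The stated scaling rule (quadrupling the block count costs one extra routing layer) reduces to simple arithmetic; a sketch, assuming block counts that are exact powers of four as in the abstract (function name is my own):

```python
import math

def layers_needed(blocks: int) -> int:
    """Routing layers for a multichip of 4**n identical blocks under
    the scaling rule that each factor-of-four growth in block count
    adds one routing layer: n + 1 layers suffice."""
    n = round(math.log(blocks, 4))
    if 4 ** n != blocks:
        raise ValueError("block count must be a power of 4")
    return n + 1
```

So a single block routes in one layer, 4 blocks in two, and 64 blocks in only four, which is the scalability claim of the hierarchical scheme.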

Journal ArticleDOI
TL;DR: The article lays out the challenges of EAC in various environments in terms of the adaptation of the workload and the infrastructure to cope with energy and cooling deficiencies, and addresses the problem of simultaneous energy demand and energy supply regulation at multiple levels, from servers to the entire data center.
Abstract: The sustainability concerns of Information Technology (IT) go well beyond energy-efficient computing and require techniques for minimizing the environmental impact of IT infrastructure over its entire life-cycle. Traditionally, IT infrastructure is overdesigned at all levels, from chips to entire data centers and the ecosystem; the paradigm explored in this article is to replace overdesign with rightsizing coupled with smarter control, henceforth referred to as Energy-Adaptive Computing or EAC. The article lays out the challenges of EAC in various environments in terms of the adaptation of the workload and the infrastructure to cope with energy and cooling deficiencies. The article then focuses on implementing EAC in a data center environment and addresses the problem of simultaneous energy demand and energy supply regulation at multiple levels, from servers to the entire data center. The proposed control scheme adapts the assignments of tasks to servers in a way that can cope with the varying energy limitations. The article also presents some experimental results to show how the scheme can continue to meet Quality of Service (QoS) requirements of tasks under energy limitations.

Journal ArticleDOI
TL;DR: An implantable system-on-a-chip integrating controller/actuation circuitry and 8 individually addressable drug reservoirs is proposed for on-demand drug delivery, implemented by standard 0.35-μm CMOS technology and post-IC processing.
Abstract: An implantable system-on-a-chip (SoC) integrating controller/actuation circuitry and 8 individually addressable drug reservoirs is proposed for on-demand drug delivery. It is implemented in standard 0.35-μm CMOS technology with post-IC processing. The post-IC processing includes deposition of metallic membranes (200 Å Pt/3000 Å Ti/200 Å Pt) to cap the drug reservoirs, deep dry etching to carve drug reservoirs in silicon as drug containers, and PDMS layer bonding to enlarge the drug storage. Based on an electrothermal activation technique, drug releases can be precisely controlled by wireless signals. The wireless controller/actuation circuits, including an on-off keying (OOK) receiver, microcontroller unit, clock generator, power-on-reset circuit, and switch array, are integrated on the same chip, providing patients with remote drug activation and noninvasive therapy modification. Implanted by minimally invasive surgery, this SoC can be used for precise drug dosing in localized treatment, such as cancer therapy, or for immediate medication in emergencies, such as heart attack. In vitro experimental results show that the reservoir content can be released successfully through the rupture of the membrane designated by the received wireless commands.

Journal ArticleDOI
TL;DR: A data storage system with the emerging nonvolatile memory technologies used for the implantable electrocardiography (ECG) recorder is proposed and the new read and write schemes of STT-RAM and spintronic memristors are presented and optimized to fit the specific application scenario.
Abstract: In this article, we propose a data storage system with the emerging nonvolatile memory technologies used for the implantable electrocardiography (ECG) recorder. The proposed storage system can record the digitalized real-time ECG waveforms continuously inside the implantable device and export the stored data to an external reader periodically to obtain a long-term backup. Spin transfer torque random access memory (STT-RAM) and the spintronic memristor are selected as the storage elements for their nonvolatility, high density, high reliability, low power consumption, good scalability, and CMOS technology compatibility. New read and write schemes for STT-RAM and spintronic memristors are presented and optimized to fit the specific application scenario. The tradeoffs among data accuracy, chip area, and read/write energy for the different technologies are thoroughly analyzed and compared. Our simulation results show that a configuration with a 128-Hz data sampling rate and 12-bit quantization resolution can record 18 hours of real-time data within a ~3.6-mm² chip area when the data storage is built with single-level cell (SLC) STT-RAMs. Daily energy consumption is 5.46 mJ. Utilizing the multilevel cell (MLC) STT-RAMs or the spintronic memristors as the storage elements can further reduce the chip area and decrease energy dissipation.
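The quoted capacity requirement follows from simple arithmetic on the sampling parameters; the helper name below is ours, used only as a sanity check of the 128 Hz × 12 bit × 18 hour configuration.

```python
def ecg_storage_bits(sample_rate_hz, bits_per_sample, hours):
    """Raw (uncompressed) storage needed for continuous single-channel ECG."""
    return sample_rate_hz * bits_per_sample * 3600 * hours

bits = ecg_storage_bits(128, 12, 18)
# 128 samples/s * 12 bits * 64,800 s = 99,532,800 bits (~11.9 MiB).
print(bits, round(bits / 8 / 2**20, 2))
```

About 100 Mb of raw capacity is consistent with the article's ~3.6-mm² SLC STT-RAM figure being an area-limited design point rather than a data-limited one.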

Journal ArticleDOI
TL;DR: A thorough evaluation of a new nanotechnology-enabled power gating structure, CMOS-compatible NEMS switches, in the presence of aggressive supply voltage scaling shows that NEMS-based power-gating warrants further investigation and the fabrication of a prototype.
Abstract: A rapidly growing class of battery constrained electronic applications are those with very long sleep periods, such as structural health monitoring systems, biomedical implants, and wireless border security cameras. The traditional method for sleep-mode power reduction, transistor power gating, has drawbacks, including performance loss and residual leakage. This article presents a thorough evaluation of a new nanotechnology-enabled power gating structure, CMOS-compatible NEMS switches, in the presence of aggressive supply voltage scaling. Due to the infinite off-resistance of the NEMS switches, the average power consumption of an FFT processor performing 1 FFT per hour drops by around 30 times compared to a transistor-based power gating implementation. Additionally, the low on-resistance and nanoscale size means even with current prototypes, area overhead is as much as 5 times lower, with much room for improvement. The major drawback of NEMS switches is the high activation voltage, which can be many times higher than typical CMOS supply voltages. We demonstrate that with a charge pump, these voltages can be generated on-die, and the energy and bootup delay overhead is negligible compared to the FFT processing itself. These results show that NEMS-based power-gating warrants further investigation and the fabrication of a prototype.
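The duty-cycled average-power argument can be sketched as one burst of active energy per period plus standing sleep leakage; the per-FFT energy and leakage numbers below are hypothetical, chosen only to show how the NEMS switch's zero off-state leakage changes the average, not taken from the article's measurements.

```python
def average_power(active_energy_j, period_s, sleep_power_w):
    """Duty-cycled average power: one burst of work per period plus sleep leakage."""
    return active_energy_j / period_s + sleep_power_w

# Hypothetical numbers: 1 mJ per FFT, executed once per hour.
p_transistor = average_power(1e-3, 3600, 300e-9)  # residual transistor gate leakage
p_nems = average_power(1e-3, 3600, 0.0)           # NEMS: infinite off-resistance
print(p_transistor, p_nems)
```

The longer the sleep period, the more the leakage term dominates the transistor-gated average, which is why the article's once-per-hour workload favors the NEMS switch so strongly.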

Journal ArticleDOI
TL;DR: An implantable closed-loop epilepsy prosthesis is presented, which is dedicated to automatically detect seizure onsets based on intracerebral electroencephalographic recordings from intracranial electrode contacts and provide an electrical stimulation feedback to the same contacts in order to disrupt these seizures.
Abstract: In this article, we present an implantable closed-loop epilepsy prosthesis, which is dedicated to automatically detecting seizure onsets based on intracerebral electroencephalographic (icEEG) recordings from intracranial electrode contacts and providing electrical stimulation feedback to the same contacts in order to disrupt these seizures. A novel epileptic seizure detector and a dedicated electrical stimulator were assembled together with common recording electrodes to complete the proposed prosthesis. The seizure detector was implemented in 0.18-μm CMOS by incorporating a new seizure detection algorithm that models the time-amplitude and time-frequency relationships in icEEG. The detector was validated offline on ten patients with refractory epilepsy and showed excellent performance for early detection of seizures. The electrical stimulator, used for suppressing the developing seizure, is composed of two biphasic channels and was assembled with an embedded FPGA on a miniature PCB. The stimulator efficiency was evaluated on cadaveric animal brain tissue in an in vitro morphologic electrical model. Spatial characteristics of the voltage distribution in cortex were assessed in an attempt to identify optimal stimulation parameters required to affect the suspected epileptic focus. The experimental results suggest that lower-frequency stimulation parameters cause a significant amount of shunting of current through the cerebrospinal fluid; however, higher-frequency stimulation parameters produce an effective spatial voltage distribution with a lower stimulation charge.

Journal ArticleDOI
TL;DR: A survey of technology developments relevant to millimeter wave beaming indicates that massive, mass-produced solid-state arrays capable of achieving good efficiency and cost effectiveness are possible in the near term to enable such retail power beaming architectures.
Abstract: Retail delivery of electric power through millimeter waves is relevant in developing areas where the market for communication devices outpaces the power grid infrastructure. It is also a critical component of an evolutionary path towards terrestrial and space-based renewable power generation. Narrow-band power can be delivered as focused beams to receivers near end-users, from central power plants, rural distribution points, UAVs, tethered aerostats, stratospheric airship platforms, or space satellites. The article surveys the available knowledge base on millimeter wave beamed power delivery. It then considers design requirements for a retail beamed power architecture, in the context of rural India where power delivery is lagging behind the demand growth for connectivity. A survey of technology developments relevant to millimeter wave beaming is conducted and indicates that massive, mass-produced solid-state arrays capable of achieving good efficiency and cost effectiveness are possible in the near term to enable such retail power beaming architectures.

Journal ArticleDOI
TL;DR: The calculation results show that the Q-factors of Carbon NanoTube (CNT) wire (SWCNT bundle and MWCNT) inductors are higher than those of the Cu wire inductor, mainly due to the much lower resistance of CNTs and the negligible skin effect in carbon nanotubes at higher frequencies.
Abstract: We have utilized our Multiwalled Carbon NanoTube (MWCNT) and Single-Walled Carbon NanoTube (SWCNT) bundle interconnect models in a widely used π model to study the performance of MWCNT and SWCNT bundle wire inductors and compared these with copper (Cu) inductors. The calculation results show that the Q-factors of Carbon NanoTube (CNT) wire (SWCNT bundle and MWCNT) inductors are higher than those of the Cu wire inductor. This is mainly due to the much lower resistance of CNTs and the negligible skin effect in carbon nanotubes at higher frequencies. The application of CNT wire inductors in LC VCOs is also studied, and Cadence/Spectre simulations show that VCOs with CNT bundle wire inductors have significantly improved performance, such as higher oscillation frequency and lower phase noise, due to their smaller resistances and higher Q-factors. It is also noticed that a CMOS LC VCO using a SWCNT bundle wire inductor has better performance when compared with an LC VCO using the MWCNT wire inductor, due to its lower resistance and higher Q-factor.
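The Q-factor comparison rests on the series-RL relation Q = 2πfL/R: lower series resistance directly raises Q. The inductance and resistance values below are assumed for illustration only, not taken from the article's extracted π-model parameters.

```python
import math

def q_factor(freq_hz, inductance_h, series_resistance_ohm):
    """Series-RL quality factor Q = 2*pi*f*L / R (substrate losses ignored)."""
    return 2 * math.pi * freq_hz * inductance_h / series_resistance_ohm

# Hypothetical 1-nH spiral at 10 GHz. Skin effect raises the effective Cu
# resistance at this frequency; the assumed CNT bundle value is lower.
q_cu = q_factor(10e9, 1e-9, 8.0)
q_cnt = q_factor(10e9, 1e-9, 3.0)
print(round(q_cu, 2), round(q_cnt, 2))
```

With everything else held fixed, Q scales inversely with R, which is the mechanism behind the CNT inductors' advantage reported here.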

Journal ArticleDOI
TL;DR: Resilient and Adaptive Performance (RAP) logic is proposed for maximum adaptive performance and soft error resilience in nanoscale computing and outperforms alternative Delay-Insensitive (DI) code-based static (Domino) RAP logic with less area, higher performance, and lower power consumption for the large test cases.
Abstract: As VLSI technology continues scaling, increasingly significant parametric variations and increasingly prevalent defects present unprecedented challenges to VLSI design at nanometer scale. Specifically, performance variability has hindered performance scaling, while soft errors become an emerging problem for logic computation at recent technology nodes. In this article, we leverage the existing Totally Self-Checking (TSC)/Strongly Fault-Secure (SFS) logic design techniques, and propose Resilient and Adaptive Performance (RAP) logic for maximum adaptive performance and soft error resilience in nanoscale computing. RAP logic clears all timing errors in the absence of external soft errors, albeit at a higher area/power cost compared with Razor logic. Our experimental results further show that dual-rail static (Domino) RAP logic outperforms alternative Delay-Insensitive (DI) code-based static (Domino) RAP logic with less area, higher performance, and lower power consumption for the large test cases, and achieves an average of 2.29(2.41)× performance boost, 2.12(1.91)× layout area, and 2.38(2.34)× power consumption compared with the traditional minimum area static logic based on the Nangate 45-nm open cell library.

Journal ArticleDOI
TL;DR: A low-power, user-programmable architecture for discrete wavelet transform (DWT) based epileptic seizure detection algorithm that was implemented in TSMC-65nm technology and consumes less than 550-nW power at 250-mV supply.
Abstract: In this article, we present a low-power, user-programmable architecture for a discrete wavelet transform (DWT) based epileptic seizure detection algorithm. A simplified, low-pass filter (LPF)-only-DWT technique is employed in which the energy contents of different frequency bands are obtained by subtracting quasi-averaged, consecutive LPF outputs. A training phase is used to identify the range of critical DWT coefficients, which are in turn used to set patient-specific system-level parameters for minimizing power consumption. The proposed optimizations allow the design to work at significantly lower power in the normal operation mode. The system has been tested on neural data obtained from kainate-treated rats. The design was implemented in TSMC 65-nm technology and consumes less than 550 nW at a 250-mV supply.
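The LPF-only band-energy idea (differencing consecutive quasi-averaged low-pass outputs instead of computing high-pass branches) can be sketched in a few lines. This is our simplified reading of the technique, with moving averages standing in for the chip's quasi-averaging filters; it is not the authors' implementation.

```python
import math

def smooth(x, win):
    """Centered moving average whose width grows with `win` (quasi-averaging LPF)."""
    half = max(1, win // 2)
    n = len(x)
    return [
        sum(x[max(0, i - half):min(n, i + half + 1)])
        / (min(n, i + half + 1) - max(0, i - half))
        for i in range(n)
    ]

def lpf_only_band_energies(x, levels=4):
    """Per-band energies from differences of consecutive low-pass outputs."""
    energies, prev = [], list(x)
    for k in range(1, levels + 1):
        sm = smooth(prev, 2 ** k)          # progressively wider averaging
        band = [a - b for a, b in zip(prev, sm)]  # content removed at this stage
        energies.append(sum(d * d for d in band))
        prev = sm
    return energies

# Synthetic stand-in for a neural-signal epoch.
sig = [math.sin(0.6 * math.pi * i) for i in range(256)]
energies = lpf_only_band_energies(sig)
print([round(e, 1) for e in energies])
```

Each subtraction isolates the frequency content removed by one averaging stage, so no explicit high-pass filter hardware is needed, which is the source of the power savings claimed above.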

Journal ArticleDOI
TL;DR: This article presents a blueprint for a net-zero data center, and claims that sustainable data centers that are built on a framework focusing on integrating supply and demand management will concurrently lead to the lowest cost and the lowest environmental impact.
Abstract: It is our pleasure to introduce this special issue on sustainable and green computing systems. Modern large-scale computing systems, such as data centers and high-performance computing (HPC) clusters, are severely constrained by power and cooling costs for solving extreme-scale (or exascale) problems. This increasing power consumption is of growing concern for several reasons, for example, cost, reliability, scalability, and environmental impact. A report from the Environmental Protection Agency (EPA) indicates that the nation's servers and data centers alone use about 1.5% of the total national energy consumed per year, at a cost of approximately $4.5 billion. The growing energy demands in data centers and HPC clusters are of utmost concern, and there is a need to build efficient and sustainable computing environments that reduce the negative environmental impacts. Emerging technologies to support these computing systems are therefore of tremendous interest. Power management of data centers and HPC platforms is getting significant attention from both academia and industry. The power efficiency and sustainability aspects need to be addressed from various angles that include system design, computer architecture, programming languages, compilers, networking, etc. This special issue of the ACM Journal on Emerging Technologies in Computing Systems (JETC) presents several articles that highlight the state of the art in sustainable and green computing systems. While bridging the gap between various disciplines, this special issue highlights new sustainable and green computing paradigms and presents some of their features, advantages, disadvantages, and associated challenges. Contributions were submitted by researchers from both industry and academia, and each article was selected through a rigorous review process. Out of twelve initial submissions, eight articles were accepted.
These feature a range of application areas, from sustainable data centers, to runtime power management in multicore chips, to green wireless sensor networks, energy efficiency of servers, and energy- and performance-aware scheduling of tasks on parallel and distributed systems. In the first article, "Towards a Net-Zero Data Center", Prithviraj Banerjee et al. advocate a need for a new paradigm in the design and management of data centers that minimizes energy used across their lifetimes. This article presents a blueprint for a net-zero data center. It claims that sustainable data centers built on a framework focusing on integrating supply and demand management will concurrently lead to the lowest cost and the lowest environmental impact. The second article, "Technology-Driven Limits on Runtime Power Management Algorithms for Multiprocessor Systems-on-Chip" by …