
Showing papers in "ACM Journal on Emerging Technologies in Computing Systems in 2012"


Journal ArticleDOI
TL;DR: Through architecture-space exploration in conjunction with novel power-efficient on-chip wireless link design, it is demonstrated that it is possible to improve performance of conventional NoC architectures significantly without incurring high area overhead.
Abstract: Massive levels of integration are making modern multicore chips pervasive in several domains. High performance, robustness, and energy-efficiency are crucial for the widespread adoption of such platforms. Networks-on-Chip (NoCs) have emerged as communication backbones to enable a high degree of integration in multicore Systems-on-Chip (SoCs). Despite their advantages, an important performance limitation in traditional NoCs arises from planar metal interconnect-based multihop links with high latency and power consumption. This limitation can be addressed by drawing inspiration from the evolution of natural complex networks, which offer great performance-cost trade-offs. Analogous to many natural complex systems, future multicore chips are expected to be hierarchical and heterogeneous in nature as well. In this article we undertake a detailed performance evaluation for hierarchical small-world NoC architectures where long-range communication links are established through millimeter-wave wireless channels. Through architecture-space exploration in conjunction with novel power-efficient on-chip wireless link design, we demonstrate that it is possible to improve performance of conventional NoC architectures significantly without incurring high area overhead.

118 citations


Journal ArticleDOI
TL;DR: The article provides a comprehensive presentation of the architectural, software, and algorithmic issues for energy-aware scheduling of workflows on single, multicore, and parallel architectures and includes a systematic taxonomy of the algorithms developed in the literature based on the overall optimization goals and characteristics of applications.
Abstract: Enabled by high-speed networking in commercial, scientific, and government settings, the realm of high-performance computing is burgeoning with greater amounts of computational and storage resources. Large-scale systems such as computational grids consume a significant amount of energy due to their massive sizes. The energy and cooling costs of such systems are often comparable to the procurement costs over a one-year period. In this survey, we discuss allocation and scheduling algorithms, systems, and software for reducing power and energy dissipation of workflows on the target platforms of single processors, multicore processors, and distributed systems. Furthermore, we investigate recent research achievements that deal with power and energy efficiency via different power management techniques and application scheduling algorithms. The article provides a comprehensive presentation of the architectural, software, and algorithmic issues for energy-aware scheduling of workflows on single, multicore, and parallel architectures. It also includes a systematic taxonomy of the algorithms developed in the literature based on the overall optimization goals and characteristics of applications.

69 citations


Journal ArticleDOI
TL;DR: This work proposes a torus-based hierarchical hybrid optical-electronic NoC, called THOE, which employs several new techniques including floorplan optimization, an adaptive power control mechanism, low-latency control protocols, and hybrid optical-electrical routers with a low-power optical switching fabric.
Abstract: Networks-on-chip (NoCs) are emerging as a key on-chip communication architecture for multiprocessor systems-on-chip (MPSoCs). Optical communication technologies are introduced to NoCs in order to empower ultra-high bandwidth with low power consumption. However, in existing optical NoCs, communication locality is poorly supported, and the importance of floorplanning is overlooked. These significantly limit the power efficiency and performance of optical NoCs. In this work, we address these issues and propose a torus-based hierarchical hybrid optical-electronic NoC, called THOE. THOE takes advantage of both electrical and optical routers and interconnects in a hierarchical manner. It employs several new techniques including floorplan optimization, an adaptive power control mechanism, low-latency control protocols, and hybrid optical-electrical routers with a low-power optical switching fabric. Both the unfolded and folded torus topologies are explored for THOE. Based on a set of real MPSoC applications, we compared THOE with a typical torus-based optical NoC as well as a torus-based electronic NoC in 45nm on a 256-core MPSoC, using a SystemC-based cycle-accurate NoC simulator. Compared with the matched electronic torus-based NoC, THOE achieves 2.46X performance and 1.51X network switching capacity utilization, with 84% less energy consumption. Compared with the optical torus-based NoC, THOE achieves 4.71X performance and 3.05X network switching capacity utilization, while reducing energy consumption by 99%. Besides real MPSoC applications, a uniform traffic pattern is also used to show the average packet delay and network throughput of THOE. Regarding hardware cost, THOE requires 75% fewer laser sources and half as many optical receivers compared with the optical torus-based NoC.

67 citations


Journal ArticleDOI
TL;DR: This work proposes for the first time, a hybrid memory that aims to incorporate the area advantage provided by the utilization of multilevel logic and nanoscale memristive devices in conjunction with CMOS for the realization of a high density nonvolatile multilevel memory.
Abstract: With technology migration into nano and molecular scales, several hybrid CMOS/nano logic and memory architectures have been proposed that aim to achieve high device density with low power consumption. The discovery of the memristor has further enabled the realization of denser nanoscale logic and memory systems by facilitating the implementation of multilevel logic. This work describes the design of such a multilevel nonvolatile memristor memory system, and the design constraints imposed in the realization of such a memory. In particular, we analyze the limitations placed on load, bank size, and the number of bits achievable per device by the noise margin required for accurately reading and writing the data stored in a device. Also analyzed are the nondisruptive read and write methodologies for the hybrid multilevel memristor memory, which program and read the memristive information without corrupting it. This work showcases two write methodologies that leverage the best traits of memristors when used in either linear (low-power) or nonlinear drift (fast-speed) modes. The system can therefore be tailored, depending on the required performance parameters of a given application, for a fast memory or a slower but very energy-efficient system. We propose, for the first time, a hybrid memory that aims to incorporate the area advantage provided by the utilization of multilevel logic and nanoscale memristive devices in conjunction with CMOS for the realization of a high-density nonvolatile multilevel memory.

64 citations


Journal ArticleDOI
TL;DR: The wireless implantable/intracavity micromanometer system was designed to fulfill the unmet need for a chronic bladder pressure sensing device in urological fields such as urodynamics for diagnosis and neuromodulation for bladder control.
Abstract: The wireless implantable/intracavity micromanometer (WIMM) system was designed to fulfill the unmet need for a chronic bladder pressure sensing device in urological fields such as urodynamics for diagnosis and neuromodulation for bladder control. Neuromodulation in particular would benefit from a wireless bladder pressure sensor which could provide real-time pressure feedback to an implanted stimulator, resulting in greater bladder capacity while using less power. The WIMM uses custom integrated circuitry, a MEMS transducer, and a wireless antenna to transmit pressure telemetry at a rate of 10 Hz. Aggressive power management techniques yield an average current draw of 9 μA from a 3.6-Volt micro-battery, which minimizes the implant size. Automatic pressure offset cancellation circuits maximize the sensing dynamic range to account for drifting pressure offset due to environmental factors, and a custom telemetry protocol allows transmission with minimum overhead. Wireless operation of the WIMM has demonstrated that the external receiver can receive the telemetry packets, and the low power consumption allows for at least 24 hours of operation with a 4-hour wireless recharge session.

62 citations


Journal ArticleDOI
TL;DR: A very brief glance back at the early history of implant electronics in the period from the 1950s to the 1970s is offered, by employing selected examples from the author’s research.
Abstract: Implantable systems for biomedical research and clinical care are now a flourishing field of activity in academia as well as industrial institutions. The broad field includes experimental explorations in electronic, mechanical, chemical, and biological components and systems, and the combination of all of these. Today virtually all implants involve both electronic circuits and micro-electro-mechanical systems (MEMS). This article offers a very brief glance back at the early history of implant electronics in the period from the 1950s to the 1970s, by employing selected examples from the author’s research. This short review also discusses the challenges of implantable electronics at present, and suggests some potentially important trends in the future research and development of implantable microsystems. It is intended as an introduction to implantable/attached electronic systems for research engineers who are interested in implantable systems as a branch of biomedical instrumentation.

36 citations


Journal ArticleDOI
TL;DR: In this article, an adder for the 2Dimensional Nearest-Neighbor, Two-Qubit gate, Concurrent (2D NTC) architecture, designed to match the architectural constraints of many quantum computing technologies, is presented.
Abstract: In this work, we propose an adder for the 2-Dimensional Nearest-Neighbor, Two-Qubit gate, Concurrent (2D NTC) architecture, designed to match the architectural constraints of many quantum computing technologies. The chosen architecture allows the layout of logical qubits in two dimensions with √n columns where each column has √n qubits and the concurrent execution of one- and two-qubit gates with nearest-neighbor interaction only. The proposed adder works in three phases. In the first phase, the first column generates the summation output and the other columns do the carry-lookahead operations. In the second phase, these intermediate values are propagated from column to column, preparing for computation of the final carry for each register position. In the last phase, each column, except the first one, generates the summation output using this column-level carry. The depth and the number of qubits of the proposed adder are Θ(√n) and O(n), respectively. The proposed adder executes faster than the adders designed for the 1D NTC architecture when the length of the input registers n is larger than 51.

32 citations
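The three-phase column scheme described in the abstract above can be illustrated with a purely classical model (this is an illustrative sketch, not the paper's quantum circuit; the function name and structure are my own):

```python
import math

def three_phase_add(a: int, b: int, n: int) -> int:
    """Classically model the column decomposition of the 2D NTC adder:
    n-bit registers are split into roughly sqrt(n) columns of w bits."""
    w = max(1, math.isqrt(n))          # bits (qubits) per column
    cols = (n + w - 1) // w            # number of columns
    mask = (1 << w) - 1

    # Phase 1: each column computes carry-lookahead signals in parallel:
    # g = carry-out with carry-in 0, p = carry-out with carry-in 1.
    g, p = [], []
    for i in range(cols):
        ai = (a >> (i * w)) & mask
        bi = (b >> (i * w)) & mask
        g.append((ai + bi) >> w)
        p.append((ai + bi + 1) >> w)

    # Phase 2: column-level carries propagate from column to column.
    c = [0]
    for i in range(cols):
        c.append(g[i] | (p[i] & c[i]))

    # Phase 3: each column forms its final sum using its column carry.
    out = 0
    for i in range(cols):
        ai = (a >> (i * w)) & mask
        bi = (b >> (i * w)) & mask
        out |= ((ai + bi + c[i]) & mask) << (i * w)
    return out & ((1 << n) - 1)
```

Phase 2 is the only part whose work grows with the number of columns, which is why the circuit depth scales as Θ(√n) rather than Θ(n).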


Journal ArticleDOI
TL;DR: This article proposes a family of barely alive active low-power server states that facilitates both fast reactivation and access to memory while in a low- power state and finds that the barely alive states can reduce service energy consumption by up to 38%, compared to an energy-oblivious system.
Abstract: Current resource provisioning schemes in Internet services leave servers less than 50% utilized almost all the time. At this level of utilization, the servers' energy efficiency is substantially lower than at peak utilization. A solution to this problem could be dynamically consolidating workloads into fewer servers and turning others off. However, services typically resist doing so, because of high response times during reactivation in handling traffic spikes. Moreover, services often want the memory and/or storage of all servers to be readily available at all times. In this article, we propose a family of barely alive active low-power server states that facilitates both fast reactivation and access to memory while in a low-power state. We compare these states to previously proposed active and idle states. In particular, we investigate the impact of load bursts in each energy-saving scheme. We also evaluate the additional benefits of memory access under low-power states with a study of a search service using a cooperative main-memory cache. Finally, we propose a system that combines a barely-alive state with the off state. We find that the barely alive states can reduce service energy consumption by up to 38%, compared to an energy-oblivious system. We also find that these energy savings are consistent across a large parameter space.

28 citations


Journal ArticleDOI
TL;DR: Heuristic algorithms are developed and it is proved that the approximation ratio on the minimum total cost is bounded by the number of data centers, and how relaxing the delay requirement for a small fraction of users can increase the cost savings of DAHM is explored.
Abstract: Dynamic Application Hosting Management (DAHM) is proposed for geographically distributed data centers, which decides on the number of active servers and on the workload share of each data center. DAHM achieves cost-efficient application hosting by taking into account: (i) the spatio-temporal variation of energy cost, (ii) the data center computing and cooling energy efficiency, (iii) the live migration cost, and (iv) any SLA violations due to migration overhead or network delay. DAHM is modeled as fixed-charge min-cost flow and mixed integer programming for stateless and stateful applications, respectively, and is shown to be NP-hard. We also develop heuristic algorithms and prove, when applications are stateless and servers have an identical power consumption model, that the approximation ratio on the minimum total cost is bounded by the number of data centers. Further, the heuristics are evaluated in a simulation study using realistic parameter data; compared to a performance-oriented application assignment, that is, hosting at the data center with the least delay, the potential cost savings of DAHM reach 33%. The savings come from reducing the total number of active servers as well as leveraging the cost efficiency of data centers. Through the simulation study, the article further explores how relaxing the delay requirement for a small fraction of users can increase the cost savings of DAHM.

20 citations
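The kind of cost trade-off DAHM optimizes can be sketched with a toy greedy assignment (this is not the paper's algorithm; all parameter names and the cost model are hypothetical simplifications of the factors (i)-(iv) listed above):

```python
def assign_load(demand, centers, delay_bound, sla_penalty):
    """Greedy sketch: route each unit of demand to the data center with
    the lowest marginal cost, combining energy price, cooling overhead
    (PUE), and an SLA penalty when network delay exceeds the bound."""
    share = {c["name"]: 0 for c in centers}
    for _ in range(demand):
        def marginal(c):
            cost = c["energy_price"] * c["power_per_req"] * c["pue"]
            if c["delay"] > delay_bound:
                cost += sla_penalty        # SLA violation surcharge
            if share[c["name"]] >= c["capacity"]:
                cost = float("inf")        # data center is full
            return cost
        best = min(centers, key=marginal)
        share[best["name"]] += 1
    return share
```

With two centers where the cheaper one has limited capacity, load spills over to the more expensive site only once the cheap one fills, mirroring how DAHM trades energy cost against delay and capacity.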


Journal ArticleDOI
TL;DR: A graphical Computer-Aided Design (CAD) environment for the design, analysis, and layout of Carbon NanoTube (CNT) Field-Effect Transistor (CNFET) circuits and provides users with a customizable CNFET technology library with the ability to specify λ-based design rules.
Abstract: In this article, we present a graphical Computer-Aided Design (CAD) environment for the design, analysis, and layout of Carbon NanoTube (CNT) Field-Effect Transistor (CNFET) circuits. This work is motivated by the fact that such a tool currently does not exist in the public domain for researchers. Our tool has been integrated within Electric, a powerful yet free CAD system for custom design of Integrated Circuits (ICs). The tool supports CNFET schematic and layout entry, rule checking, and HSpice/VerilogA netlist generation. We provide users with a customizable CNFET technology library with the ability to specify λ-based design rules. We showcase the capabilities of our tool by demonstrating the design of a large CNFET standard cell and components library. HSPICE simulations are also presented for cell library characterization. We hope that the availability of this tool will invigorate the CAD community to explore novel ideas in CNFET circuit design.

18 citations


Journal ArticleDOI
TL;DR: This study presents an NML programmable logic array (PLA) based on a previously proposed reprogrammable quantum-dot cellular automata PLA design, and uses results from this study to shape a concluding discussion about which architectures appear to be most suitable for NML.
Abstract: In order to continue the performance and scaling trends that we have come to expect from Moore’s Law, many emergent computational models, devices, and technologies are actively being studied to either replace or augment CMOS technology. Nanomagnet Logic (NML) is one such alternative. NML operates at room temperature, it has the potential for low power consumption, and it is CMOS compatible. In this article, we present an NML programmable logic array (PLA) based on a previously proposed reprogrammable quantum-dot cellular automata PLA design. We also discuss the fabrication and simulation validation of the circuit structures unique to the NML PLA, present area, energy, and delay estimates for the NML PLA, compare the area of NML PLAs to other reprogrammable nanotechnologies, and analyze how architectural-level redundancy will affect performance and defect tolerance in NML PLAs. We use results from this study to shape a concluding discussion about which architectures appear to be most suitable for NML.

Journal ArticleDOI
TL;DR: A fast method to identify the given Boolean function as a threshold function with weight assignment based on the parameters that have been defined in the literature is introduced and has been shown to operate fast for functions with as many as forty inputs.
Abstract: A fast method to identify a given Boolean function as a threshold function, with weight assignment, is introduced. It characterizes the function based on parameters that have been defined in the literature. The proposed method can quickly characterize all functions that have fewer than eight inputs and has been shown to operate fast for functions with as many as forty inputs. Furthermore, comparisons with other existing heuristic methods show a huge increase in the number of threshold functions identified, and a drastic reduction in time and complexity.
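For context, the defining property being tested can be checked by brute force over small integer weights (this naive enumeration is only an illustration of the definition, far slower than the paper's method):

```python
from itertools import product

def is_threshold(f, n, wmax=4):
    """A Boolean function f on n inputs is a threshold function iff
    integer weights w and a threshold T exist such that
    f(x) = 1 exactly when sum(w_i * x_i) >= T."""
    points = list(product((0, 1), repeat=n))
    rng = range(-wmax, wmax + 1)
    for w in product(rng, repeat=n):
        for t in rng:
            if all((sum(wi * xi for wi, xi in zip(w, x)) >= t) == bool(f(x))
                   for x in points):
                return True                # found a valid (w, T) pair
    return False
```

For example, 2-input AND is a threshold function (weights (1, 1), threshold 2), while XOR is the classic non-threshold function: no linear separation of its truth table exists.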

Journal ArticleDOI
TL;DR: The DCeP metric was found to be successful in clearly distinguishing different operational states in the data center, thereby validating its utility as a metric for identifying configurations of hardware and software that would improve (or even maximize) energy productivity.
Abstract: As data centers proliferate in size and number, the endeavor to improve their energy efficiency and productivity is becoming increasingly important. We discuss the properties of a number of the proposed metrics of energy efficiency and productivity. In particular, we focus on the Data Center Energy Productivity (DCeP) metric, which is the ratio of useful work produced by the data center to the energy consumed performing that work. We describe our approach for using DCeP as the principal outcome of a designed experiment using a highly instrumented, high-performance computing data center. We found that DCeP was successful in clearly distinguishing different operational states in the data center, thereby validating its utility as a metric for identifying configurations of hardware and software that would improve (or even maximize) energy productivity. We also discuss some of the challenges and benefits associated with implementing the DCeP metric, and we examine the efficacy of the metric in making comparisons within a data center and among data centers.
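The DCeP metric as defined above is a simple ratio; a minimal sketch (function name and units are my own, and the definition of "useful work" is workload-specific, as the article discusses):

```python
def dcep(useful_work: float, energy_kwh: float) -> float:
    """Data Center energy Productivity: useful work produced by the
    data center divided by the energy consumed performing that work."""
    if energy_kwh <= 0:
        raise ValueError("energy consumed must be positive")
    return useful_work / energy_kwh
```

Comparing the metric across operational states with the same energy budget, e.g. dcep(1.2e6, 400) versus dcep(1.0e6, 400), identifies the more energy-productive configuration.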

Journal ArticleDOI
TL;DR: This tutorial article presents a blueprint for a “net-zero data center”: one that offsets any electricity used from the grid via adequate on-site power generation that gets fed back to the grid at a later time.
Abstract: A world consisting of billions of service-oriented client devices and thousands of data centers can deliver a diverse range of services, from social networking to management of our natural resources. However, these services must scale in order to meet the fundamental needs of society. To enable such scaling, the total cost of ownership of the data centers that host the services and comprise the vast majority of service delivery costs will need to be reduced. As energy drives the total cost of ownership of data centers, there is a need for a new paradigm in design and management of data centers that minimizes energy used across their lifetimes, from “cradle to cradle”. This tutorial article presents a blueprint for a “net-zero data center”: one that offsets any electricity used from the grid via adequate on-site power generation that gets fed back to the grid at a later time. We discuss how such a data center addresses the total cost of ownership, illustrating that contrary to the oft-held view of sustainability as “paying more to be green”, sustainable data centers—built on a framework that focuses on integrating supply and demand management from end-to-end—can concurrently lead to lowest cost and lowest environmental impact.

Journal ArticleDOI
TL;DR: Analytical and experimental results demonstrate that the Markov chain-based scheme can improve the performance in terms of connection delay without affecting the time efficiency, or vice versa, as opposed to the trade-off observed in traditional schemes.
Abstract: To extend the lifetime of a wireless sensor network, sensor nodes usually duty cycle between dormant and active states. Duty cycling schemes are often evaluated in terms of connection delay, connection duration, and duty cycle. In this article, we show with experiments on Sun SPOT sensors that duty cycling time (energy) efficiency, that is, the ratio of time (energy) employed in ancillary operations when switching from and into deep sleep mode, is an important performance metric too. We propose a novel randomized duty cycling scheme based on Markov chains with the goal of (i) reducing the connection delay, while maintaining a given time (energy) efficiency, or (ii) keeping a constant connection delay, while increasing the time (energy) efficiency. Analytical and experimental results demonstrate that the Markov chain-based scheme can improve the performance in terms of connection delay without affecting the time efficiency, or vice versa, as opposed to the trade-off observed in traditional schemes. We extend the proposed duty cycling scheme to a partially randomized scheme, where wireless nodes can switch into active state beyond their schedules when their neighbors are active to anticipate message forwarding. The analytical and experimental results confirm the relationship between connection delay and time efficiency also for this scheme.
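The trade-off between switching overhead and time efficiency described above can be sketched with a toy two-state Markov duty-cycle simulation (this is an illustrative model with hypothetical parameters, not the paper's scheme; it assumes a per-switch ancillary cost of at least one slot):

```python
import random

def simulate(p_wake, p_sleep, overhead, slots, seed=1):
    """Two-state Markov duty cycle sketch: each slot, a dormant node
    wakes with prob p_wake and an active node sleeps with prob p_sleep.
    Every state switch costs `overhead` slots of ancillary work (waking
    from / entering deep sleep). Returns (duty_cycle, time_efficiency)."""
    rng = random.Random(seed)
    active = ancillary = t = 0
    state = 0                      # 0 = dormant, 1 = active
    while t < slots:
        if rng.random() < (p_wake if state == 0 else p_sleep):
            state ^= 1
            ancillary += overhead  # switching cost: not useful time
            t += overhead
        else:
            active += state
            t += 1
    return active / t, 1 - ancillary / t
```

Lower switching probabilities raise time efficiency but lengthen connection delay; the Markov chain scheme in the article tunes the transition probabilities to move along this curve rather than accept the trade-off as fixed.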

Journal ArticleDOI
TL;DR: Results indicate that simply increasing the number of planes of a 3D IC does not necessarily lead to lower skew variation and higher operating frequencies, and a multigroup 3D clock tree topology is proposed to effectively mitigate the variability of clock skew.
Abstract: In three-dimensional (3D) integrated circuits, the effect of process variations on clock skew differs from 2D circuits. The combined effect of inter-die and intra-die process variations on the design of 3D clock distribution networks is considered in this article. A statistical clock skew model incorporating both the systematic and random components of process variations is employed to describe this effect. Two regular 3D clock tree topologies are investigated and compared in terms of clock skew variation. The statistical skew model used to describe clock skew variations is verified through Monte-Carlo simulations. The clock skew is shown to change in different ways with the number of planes forming the 3D IC and the clock network architecture. Simulations based on a 45-nm CMOS technology show that the maximum standard deviation of clock skew can vary from 15 ps to 77 ps. Results indicate that simply increasing the number of planes of a 3D IC does not necessarily lead to lower skew variation and higher operating frequencies. A multigroup 3D clock tree topology is proposed to effectively mitigate the variability of clock skew. Tradeoffs between the investigated 3D clock distribution networks and the number of planes comprising a 3D circuit are discussed and related design guidelines are offered. The skew variation in 3D clock trees is also compared with the skew variation of clock grids.
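The statistical decomposition used above (systematic inter-die plus random intra-die components) can be illustrated with a small Monte Carlo sketch (all numeric parameters here are illustrative placeholders, not the paper's 45-nm data):

```python
import random
import statistics

def skew_std(trials=2000, stages=8, sig_inter=5.0, sig_intra=2.0, seed=7):
    """Monte Carlo sketch of 3D clock skew (units arbitrary, e.g. ps):
    two clock sinks sit on different planes of the stack, so each path
    sees its own systematic inter-die delay shift, while every buffer
    adds an independent intra-die random term. Skew = path difference."""
    rng = random.Random(seed)
    skews = []
    for _ in range(trials):
        plane_a = rng.gauss(0, sig_inter)   # per-plane systematic shift
        plane_b = rng.gauss(0, sig_inter)
        path_a = sum(10 + plane_a + rng.gauss(0, sig_intra)
                     for _ in range(stages))
        path_b = sum(10 + plane_b + rng.gauss(0, sig_intra)
                     for _ in range(stages))
        skews.append(path_a - path_b)
    return statistics.pstdev(skews)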

Journal ArticleDOI
TL;DR: The analysis reveals that the nanophotonic switch is resistant to a skew longer than the input signal duration, and the tolerance to skew is asymmetric with respect to the two inputs.
Abstract: We examine the timing dependence of nanophotonic devices based on optical excitation transfer via optical near-field interactions at the nanometer scale. We theoretically analyze the dynamic behavior of a two-input nanophotonic switch composed of three quantum dots based on a density matrix formalism while assuming arrival-time differences, or skew, between the inputs. The analysis reveals that the nanophotonic switch is resistant to a skew longer than the input signal duration, and the tolerance to skew is asymmetric with respect to the two inputs. The skew dependence is also experimentally examined based on near-field spectroscopy of InGaAs quantum dots, showing good agreement with the theory. Elucidating the dynamic properties of nanophotonics, together with the associated spatial and energy dissipation attributes at the nanometer scale, will provide critical insights for novel system architectures.

Journal ArticleDOI
TL;DR: Although conventional DVFS might become less effective with technology scaling, it will continue to play an important role in the context of emerging power management techniques, for example, for massively parallel multiprocessor systems where only a subset of cores can be turned on at any given point of time due to total power constraints.
Abstract: Runtime power management is a critical technique for reducing the energy footprint of digital electronic devices and enabling sustainable computing, since it allows electronic devices to dynamically adapt their power and energy consumption to meet performance requirements. In this article, we consider the case of MultiProcessor Systems-on-Chip (MPSoC) implemented using multiple Voltage and Frequency Islands (VFIs) relying on fine-grained Dynamic Voltage and Frequency Scaling (DVFS) to reduce the system power dissipation. In particular, we present a framework to theoretically analyze the impact of three important technology-driven constraints: (i) reliability-driven upper limits on the maximum supply voltage; (ii) inductive noise-driven constraints on the maximum rate of change of voltage/frequency; and (iii) the impact of manufacturing process variations on the performance of DVFS control for multiple VFI MPSoCs. The proposed analysis is general, in the sense that it is not bound to a specific DVFS control algorithm, but instead focuses on theoretically bounding the performance that any DVFS controller can possibly achieve. Our experimental results on real and synthetic benchmarks show that in the presence of reliability- and temperature-driven constraints on the maximum frequency and maximum frequency increment, any DVFS control algorithm will lose up to 87% of performance in terms of the number of steps required to reach a reference steady state. In addition, increasing process variations can lead to up to 60% of fabricated chips being unable to meet the specified DVFS control specifications, irrespective of the DVFS algorithm used.
Nonetheless, we note that although conventional DVFS might become less effective with technology scaling, it will continue to play an important role in the context of emerging power management techniques, for example, for massively parallel multiprocessor systems where only a subset of cores can be turned on at any given point of time due to total power constraints.
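The step-count performance bound discussed above can be illustrated with a minimal model of a rate-limited frequency controller (an illustrative sketch, not the paper's framework; names and units are my own):

```python
def steps_to_target(f_now, f_target, max_step, f_max):
    """Count DVFS control steps needed to reach a reference frequency
    when both the maximum per-step increment (inductive-noise limit)
    and the maximum frequency (reliability limit) are constrained."""
    f_target = min(f_target, f_max)   # reliability cap on frequency
    steps = 0
    while f_now < f_target:
        f_now = min(f_now + max_step, f_target)  # rate-limited step
        steps += 1
    return steps
```

For example (frequencies in MHz), tightening the per-step limit from 200 to 50 quadruples the number of steps to ramp from 1000 to 2000, which is the mechanism behind the step-count performance loss any controller incurs under these constraints.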

Journal ArticleDOI
TL;DR: A dynamic technique is proposed that controls the instance of data capture in critical path memory flops, by delaying the clock edge trigger, and improves the timing yield of the circuit without significant overcompensation.
Abstract: In the nanometer era, process, voltage, and temperature variations are dominating circuit performance, power, and yield. Over the past few years, statistical optimization methods have been effective in improving yield in the presence of uncertainty due to process variations. However, statistical methods overconsume resources, even in the absence of variations. Hence, to facilitate a better performance-power-yield trade-off, techniques that can dynamically enable variation compensation are becoming necessary. In this article, we propose a dynamic technique that controls the instance of data capture in critical path memory flops, by delaying the clock edge trigger. The methodology employs a dynamic delay detection circuit to identify the uncertainty in delay due to variations and stretches the clock in the destination flip-flops. The delay detection circuit uses a latch and a set of combinational gates to dynamically detect and create the slack needed to accommodate the delay due to variations. The Clock Stretching Logic (CSL) is added only to paths which have a high probability of failure in the presence of variations. The proposed methodology improves the timing yield of the circuit without significant overcompensation. The methodology was simulated using Synopsys design tools for circuit synthesis and Cadence tools for placement and routing of the design. Extracted parasitic timing information was parsed using Perl scripts and simulated with a C++ simulation program. Experimental results based on Monte-Carlo simulations on benchmark circuits indicate considerable improvement in timing yield with negligible area overhead.

Journal ArticleDOI
TL;DR: An orientation strategy for layout of a multichip that reduces routing congestion and consequently facilitates wire routing for the electrode array and supports a hierarchical approach to wire routing that ensures scalability is presented.
Abstract: Potential applications of digital microfluidic (DMF) biochips now include several areas of real-life applications like environmental monitoring, water and air pollutant detection, and food processing, to name a few. In order to achieve sufficiently high throughput for these applications, several instances of the same bioassay may be required to be executed concurrently on different samples. As a straightforward implementation, several identical biochips can be integrated on a single substrate as a multichip to execute the assay for various samples concurrently. Controlling individual electrodes of such a chip by independent pins may not be acceptable since it increases the cost of fabrication. Thus, in order to keep the overall pin-count within an acceptable bound, all the respective electrodes of these individual pieces are connected internally underneath the chip so that they can be controlled with a single external control pin. In this article, we present an orientation strategy for layout of a multichip that reduces routing congestion and consequently facilitates wire routing for the electrode array. The electrode structure of the individual pieces of the multichip may be either direct-addressable or pin-constrained. The method also supports a hierarchical approach to wire routing that ensures scalability. In this scheme, the size of the biochip in terms of the total number of electrodes may be increased by a factor of four by increasing the number of routing layers by only one. In general, for a multichip with 4^n identical blocks, (n + 1) layers are sufficient for wire routing.
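The stated scaling rule (quadrupling the block count costs one extra routing layer) reduces to simple arithmetic; a sketch, assuming block counts that are exact powers of four as in the abstract (function name is my own):

```python
import math

def layers_needed(blocks: int) -> int:
    """Routing layers for a multichip of 4**n identical blocks under
    the scaling rule that each factor-of-four growth in block count
    adds one routing layer: n + 1 layers suffice."""
    n = round(math.log(blocks, 4))
    if 4 ** n != blocks:
        raise ValueError("block count must be a power of 4")
    return n + 1
```

So a single block routes in one layer, 4 blocks in two, and 64 blocks in only four, which is the scalability claim of the hierarchical scheme.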

Journal ArticleDOI
TL;DR: The article lays out the challenges of EAC in various environments in terms of the adaptation of the workload and the infrastructure to cope with energy and cooling deficiencies, and addresses the problem of simultaneous energy demand and energy supply regulation at multiple levels, from servers to the entire data center.
Abstract: The sustainability concerns of Information Technology (IT) go well beyond energy-efficient computing and require techniques for minimizing the environmental impact of IT infrastructure over its entire life-cycle. Traditionally, IT infrastructure is overdesigned at all levels, from chips to entire data centers and the ecosystem; the paradigm explored in this article is to replace overdesign with rightsizing coupled with smarter control, henceforth referred to as Energy-Adaptive Computing or EAC. The article lays out the challenges of EAC in various environments in terms of the adaptation of the workload and the infrastructure to cope with energy and cooling deficiencies. The article then focuses on implementing EAC in a data center environment and addresses the problem of simultaneous energy demand and energy supply regulation at multiple levels, from servers to the entire data center. The proposed control scheme adapts the assignments of tasks to servers in a way that can cope with the varying energy limitations. The article also presents some experimental results to show how the scheme can continue to meet Quality of Service (QoS) requirements of tasks under energy limitations.

Journal ArticleDOI
TL;DR: An implantable system-on-a-chip integrating controller/actuation circuitry and 8 individually addressable drug reservoirs is proposed for on-demand drug delivery, implemented by standard 0.35-μm CMOS technology and post-IC processing.
Abstract: An implantable system-on-a-chip (SoC) integrating controller/actuation circuitry and 8 individually addressable drug reservoirs is proposed for on-demand drug delivery. It is implemented in standard 0.35-μm CMOS technology with post-IC processing. The post-IC processing includes deposition of metallic membranes (200 Å Pt/3000 Å Ti/200 Å Pt) to cap the drug reservoirs, deep dry etching to carve drug reservoirs in silicon as drug containers, and PDMS layer bonding to enlarge the drug storage. Based on an electrothermal activation technique, drug releases can be precisely controlled by wireless signals. The wireless controller/actuation circuits, including an on-off keying (OOK) receiver, microcontroller unit, clock generator, power-on-reset circuit, and switch array, are integrated on the same chip, providing patients with remote drug activation and noninvasive therapy modification. Implanted by minimally invasive surgery, this SoC can be used for precise drug dosing in localized treatment, such as cancer therapy, or for immediate medication in emergencies, such as heart attack. In vitro experimental results show that the reservoir content can be released successfully through the rupture of the membrane designated by the received wireless commands.

Journal ArticleDOI
TL;DR: A data storage system with the emerging nonvolatile memory technologies used for the implantable electrocardiography (ECG) recorder is proposed and the new read and write schemes of STT-RAM and spintronic memristors are presented and optimized to fit the specific application scenario.
Abstract: In this article, we propose a data storage system with the emerging nonvolatile memory technologies used for the implantable electrocardiography (ECG) recorder. The proposed storage system can record the digitalized real-time ECG waveforms continuously inside the implantable device and export the stored data to an external reader periodically to obtain a long-term backup. Spin transfer torque random access memory (STT-RAM) and the spintronic memristor are selected as the storage elements for their nonvolatility, high density, high reliability, low power consumption, good scalability, and CMOS technology compatibility. New read and write schemes for STT-RAM and spintronic memristors are presented and optimized to fit the specific application scenario. The tradeoffs among data accuracy, chip area, and read/write energy for the different technologies are thoroughly analyzed and compared. Our simulation results show that a configuration with a 128-Hz data sampling rate and 12-bit quantization resolution can record 18 hours of real-time data within a ~3.6-mm² chip area when the data storage is built with single-level cell (SLC) STT-RAMs. Daily energy consumption is 5.46 mJ. Utilizing the multilevel cell (MLC) STT-RAMs or the spintronic memristors as the storage elements can further reduce the chip area and decrease energy dissipation.
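The quoted capacity requirement follows from simple arithmetic on the sampling parameters; the helper name below is ours, used only as a sanity check of the 128 Hz × 12 bit × 18 hour configuration.

```python
def ecg_storage_bits(sample_rate_hz, bits_per_sample, hours):
    """Raw (uncompressed) storage needed for continuous single-channel ECG."""
    return sample_rate_hz * bits_per_sample * 3600 * hours

bits = ecg_storage_bits(128, 12, 18)
# 128 samples/s * 12 bits * 64,800 s = 99,532,800 bits (~11.9 MiB).
print(bits, round(bits / 8 / 2**20, 2))
```

About 100 Mb of raw capacity is consistent with the article's ~3.6-mm² SLC STT-RAM figure being an area-limited design point rather than a data-limited one.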

Journal ArticleDOI
TL;DR: A thorough evaluation of a new nanotechnology-enabled power gating structure, CMOS-compatible NEMS switches, in the presence of aggressive supply voltage scaling shows that NEMS-based power-gating warrants further investigation and the fabrication of a prototype.
Abstract: A rapidly growing class of battery constrained electronic applications are those with very long sleep periods, such as structural health monitoring systems, biomedical implants, and wireless border security cameras. The traditional method for sleep-mode power reduction, transistor power gating, has drawbacks, including performance loss and residual leakage. This article presents a thorough evaluation of a new nanotechnology-enabled power gating structure, CMOS-compatible NEMS switches, in the presence of aggressive supply voltage scaling. Due to the infinite off-resistance of the NEMS switches, the average power consumption of an FFT processor performing 1 FFT per hour drops by around 30 times compared to a transistor-based power gating implementation. Additionally, the low on-resistance and nanoscale size means even with current prototypes, area overhead is as much as 5 times lower, with much room for improvement. The major drawback of NEMS switches is the high activation voltage, which can be many times higher than typical CMOS supply voltages. We demonstrate that with a charge pump, these voltages can be generated on-die, and the energy and bootup delay overhead is negligible compared to the FFT processing itself. These results show that NEMS-based power-gating warrants further investigation and the fabrication of a prototype.
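The duty-cycled average-power argument can be sketched as one burst of active energy per period plus standing sleep leakage; the per-FFT energy and leakage numbers below are hypothetical, chosen only to show how the NEMS switch's zero off-state leakage changes the average, not taken from the article's measurements.

```python
def average_power(active_energy_j, period_s, sleep_power_w):
    """Duty-cycled average power: one burst of work per period plus sleep leakage."""
    return active_energy_j / period_s + sleep_power_w

# Hypothetical numbers: 1 mJ per FFT, executed once per hour.
p_transistor = average_power(1e-3, 3600, 300e-9)  # residual transistor gate leakage
p_nems = average_power(1e-3, 3600, 0.0)           # NEMS: infinite off-resistance
print(p_transistor, p_nems)
```

The longer the sleep period, the more the leakage term dominates the transistor-gated average, which is why the article's once-per-hour workload favors the NEMS switch so strongly.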

Journal ArticleDOI
TL;DR: An implantable closed-loop epilepsy prosthesis is presented, which is dedicated to automatically detect seizure onsets based on intracerebral electroencephalographic recordings from intracranial electrode contacts and provide an electrical stimulation feedback to the same contacts in order to disrupt these seizures.
Abstract: In this article, we present an implantable closed-loop epilepsy prosthesis, which is dedicated to automatically detecting seizure onsets based on intracerebral electroencephalographic (icEEG) recordings from intracranial electrode contacts and providing electrical stimulation feedback to the same contacts in order to disrupt these seizures. A novel epileptic seizure detector and a dedicated electrical stimulator were assembled together with common recording electrodes to complete the proposed prosthesis. The seizure detector was implemented in 0.18-μm CMOS by incorporating a new seizure detection algorithm that models the time-amplitude and time-frequency relationships in icEEG. The detector was validated offline on ten patients with refractory epilepsy and showed excellent performance for early detection of seizures. The electrical stimulator, used for suppressing the developing seizure, is composed of two biphasic channels and was assembled with an embedded FPGA on a miniature PCB. The stimulator efficiency was evaluated on cadaveric animal brain tissue in an in vitro morphologic electrical model. Spatial characteristics of the voltage distribution in cortex were assessed in an attempt to identify optimal stimulation parameters required to affect the suspected epileptic focus. The experimental results suggest that lower-frequency stimulation parameters cause a significant amount of shunting of current through the cerebrospinal fluid; however, higher-frequency stimulation parameters produce an effective spatial voltage distribution with a lower stimulation charge.

Journal ArticleDOI
TL;DR: A survey of technology developments relevant to millimeter wave beaming indicates that massive, mass-produced solid-state arrays capable of achieving good efficiency and cost effectiveness are possible in the near term to enable such retail power beaming architectures.
Abstract: Retail delivery of electric power through millimeter waves is relevant in developing areas where the market for communication devices outpaces the power grid infrastructure. It is also a critical component of an evolutionary path towards terrestrial and space-based renewable power generation. Narrow-band power can be delivered as focused beams to receivers near end-users, from central power plants, rural distribution points, UAVs, tethered aerostats, stratospheric airship platforms, or space satellites. The article surveys the available knowledge base on millimeter wave beamed power delivery. It then considers design requirements for a retail beamed power architecture, in the context of rural India where power delivery is lagging behind the demand growth for connectivity. A survey of technology developments relevant to millimeter wave beaming is conducted and indicates that massive, mass-produced solid-state arrays capable of achieving good efficiency and cost effectiveness are possible in the near term to enable such retail power beaming architectures.

Journal ArticleDOI
TL;DR: The calculation results show that the Q-factors of Carbon NanoTube (CNT) wire (SWCNT bundle and MWCNT) inductors are higher than those of the Cu wire inductor, mainly due to the much lower resistance of CNTs and the negligible skin effect in carbon nanotubes at higher frequencies.
Abstract: We have utilized our Multiwalled Carbon NanoTube (MWCNT) and Single-Walled Carbon NanoTube (SWCNT) bundle interconnect models in a widely used π model to study the performance of MWCNT and SWCNT bundle wire inductors and compared these with copper (Cu) inductors. The calculation results show that the Q-factors of Carbon NanoTube (CNT) wire (SWCNT bundle and MWCNT) inductors are higher than those of the Cu wire inductor. This is mainly due to the much lower resistance of CNTs and the negligible skin effect in carbon nanotubes at higher frequencies. The application of CNT wire inductors in LC VCOs is also studied, and Cadence/Spectre simulations show that VCOs with CNT bundle wire inductors have significantly improved performance, such as higher oscillation frequency and lower phase noise, due to their smaller resistances and higher Q-factors. It is also noticed that a CMOS LC VCO using a SWCNT bundle wire inductor has better performance when compared with an LC VCO using the MWCNT wire inductor, due to its lower resistance and higher Q-factor.
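The Q-factor comparison rests on the series-RL relation Q = 2πfL/R: lower series resistance directly raises Q. The inductance and resistance values below are assumed for illustration only, not taken from the article's extracted π-model parameters.

```python
import math

def q_factor(freq_hz, inductance_h, series_resistance_ohm):
    """Series-RL quality factor Q = 2*pi*f*L / R (substrate losses ignored)."""
    return 2 * math.pi * freq_hz * inductance_h / series_resistance_ohm

# Hypothetical 1-nH spiral at 10 GHz. Skin effect raises the effective Cu
# resistance at this frequency; the assumed CNT bundle value is lower.
q_cu = q_factor(10e9, 1e-9, 8.0)
q_cnt = q_factor(10e9, 1e-9, 3.0)
print(round(q_cu, 2), round(q_cnt, 2))
```

With everything else held fixed, Q scales inversely with R, which is the mechanism behind the CNT inductors' advantage reported here.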

Journal ArticleDOI
TL;DR: Resilient and Adaptive Performance (RAP) logic is proposed for maximum adaptive performance and soft error resilience in nanoscale computing and outperforms alternative Delay-Insensitive (DI) code-based static (Domino) RAP logic with less area, higher performance, and lower power consumption for the large test cases.
Abstract: As VLSI technology continues scaling, increasingly significant parametric variations and increasingly prevalent defects present unprecedented challenges to VLSI design at nanometer scale. Specifically, performance variability has hindered performance scaling, while soft errors become an emerging problem for logic computation at recent technology nodes. In this article, we leverage the existing Totally Self-Checking (TSC)/Strongly Fault-Secure (SFS) logic design techniques, and propose Resilient and Adaptive Performance (RAP) logic for maximum adaptive performance and soft error resilience in nanoscale computing. RAP logic clears all timing errors in the absence of external soft errors, albeit at a higher area/power cost compared with Razor logic. Our experimental results further show that dual-rail static (Domino) RAP logic outperforms alternative Delay-Insensitive (DI) code-based static (Domino) RAP logic with less area, higher performance, and lower power consumption for the large test cases, and achieves an average of 2.29(2.41)× performance boost, 2.12(1.91)× layout area, and 2.38(2.34)× power consumption compared with the traditional minimum area static logic based on the Nangate 45-nm open cell library.

Journal ArticleDOI
TL;DR: A low-power, user-programmable architecture for discrete wavelet transform (DWT) based epileptic seizure detection algorithm that was implemented in TSMC-65nm technology and consumes less than 550-nW power at 250-mV supply.
Abstract: In this article, we present a low-power, user-programmable architecture for a discrete wavelet transform (DWT) based epileptic seizure detection algorithm. A simplified, low-pass filter (LPF)-only-DWT technique is employed in which the energy contents of different frequency bands are obtained by subtracting quasi-averaged, consecutive LPF outputs. A training phase is used to identify the range of critical DWT coefficients, which are in turn used to set patient-specific system-level parameters for minimizing power consumption. The proposed optimizations allow the design to work at significantly lower power in the normal operation mode. The system has been tested on neural data obtained from kainate-treated rats. The design was implemented in TSMC 65-nm technology and consumes less than 550 nW at a 250-mV supply.
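The LPF-only band-energy idea (differencing consecutive quasi-averaged low-pass outputs instead of computing high-pass branches) can be sketched in a few lines. This is our simplified reading of the technique, with moving averages standing in for the chip's quasi-averaging filters; it is not the authors' implementation.

```python
import math

def smooth(x, win):
    """Centered moving average whose width grows with `win` (quasi-averaging LPF)."""
    half = max(1, win // 2)
    n = len(x)
    return [
        sum(x[max(0, i - half):min(n, i + half + 1)])
        / (min(n, i + half + 1) - max(0, i - half))
        for i in range(n)
    ]

def lpf_only_band_energies(x, levels=4):
    """Per-band energies from differences of consecutive low-pass outputs."""
    energies, prev = [], list(x)
    for k in range(1, levels + 1):
        sm = smooth(prev, 2 ** k)          # progressively wider averaging
        band = [a - b for a, b in zip(prev, sm)]  # content removed at this stage
        energies.append(sum(d * d for d in band))
        prev = sm
    return energies

# Synthetic stand-in for a neural-signal epoch.
sig = [math.sin(0.6 * math.pi * i) for i in range(256)]
energies = lpf_only_band_energies(sig)
print([round(e, 1) for e in energies])
```

Each subtraction isolates the frequency content removed by one averaging stage, so no explicit high-pass filter hardware is needed, which is the source of the power savings claimed above.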

Journal ArticleDOI
TL;DR: This article presents a blueprint for a net-zero data center, and claims that sustainable data centers that are built on a framework focusing on integrating supply and demand management will concurrently lead to the lowest cost and the lowest environmental impact.
Abstract: It is our pleasure to introduce this special issue on sustainable and green computing systems. Modern large-scale computing systems, such as data centers and high-performance computing (HPC) clusters, are severely constrained by power and cooling costs for solving extreme-scale (or exascale) problems. This increasing power consumption is of growing concern for several reasons, for example, cost, reliability, scalability, and environmental impact. A report from the Environmental Protection Agency (EPA) indicates that the nation's servers and data centers alone use about 1.5% of the total national energy consumed per year, at a cost of approximately $4.5 billion. The growing energy demands in data centers and HPC clusters are of utmost concern, and there is a need to build efficient and sustainable computing environments that reduce the negative environmental impacts. Emerging technologies to support these computing systems are therefore of tremendous interest. Power management of data centers and HPC platforms is getting significant attention from both academia and industry. The power efficiency and sustainability aspects need to be addressed from various angles that include system design, computer architecture, programming languages, compilers, networking, etc. This special issue of the ACM Journal on Emerging Technologies in Computing Systems (JETC) presents several articles that highlight the state of the art in sustainable and green computing systems. While bridging the gap between various disciplines, this special issue highlights new sustainable and green computing paradigms and presents some of their features, advantages, disadvantages, and associated challenges. Contributions were submitted by researchers from both industry and academia, and each article was selected through a rigorous review process. Out of twelve initial submissions, eight articles were accepted.
These feature a range of application areas, from sustainable data centers, to runtime power management in multicore chips, to green wireless sensor networks, energy efficiency of servers, and energy- and performance-aware scheduling of tasks on parallel and distributed systems. In the first article, "Towards a Net-Zero Data Center", Prithviraj Banerjee et al. advocate a need for a new paradigm in the design and management of data centers that minimizes energy used across their lifetimes. This article presents a blueprint for a net-zero data center. It claims that sustainable data centers built on a framework focusing on integrating supply and demand management will concurrently lead to the lowest cost and the lowest environmental impact. The second article, "Technology-Driven Limits on Runtime Power Management Algorithms for Multiprocessor Systems-on-Chip" by …