Showing papers in "IEEE Design & Test of Computers in 2011"


Journal ArticleDOI
TL;DR: This work presents a leading effort to automate the production of pipelined data-path circuits for implementing numerical functions in FPGA-based acceleration of scientific computing.
Abstract: Efficient implementation of basic, data-path circuit elements is of fundamental importance to achieving high performance in FPGA-based acceleration of scientific computing. This work presents a leading effort to automate the production of pipelined data-path circuits for implementing numerical functions.

296 citations


Journal ArticleDOI
TL;DR: This tutorial provides a glimpse into the theory and practice of metastability, which can arise whenever a signal is sampled close to a transition, leading to indecision as to its correct value.
Abstract: Metastability can arise whenever a signal is sampled close to a transition, leading to indecision as to its correct value. Synchronizer circuits, which guard against metastability, are becoming ubiquitous with the proliferation of timing domains on a chip. Despite the critical importance of reliable synchronization, this topic remains inadequately understood. This tutorial provides a glimpse into the theory and practice of this fascinating subject.

149 citations


Journal ArticleDOI
TL;DR: New opportunities and challenges presented by spin-transfer torque RAM and phase-change RAM with a particular emphasis on modeling and architecture design are discussed.
Abstract: Spin-transfer torque RAM and phase-change RAM are vying to become the next-generation embedded memory, offering high speed, high density, and nonvolatility. This article discusses new opportunities and challenges presented by these two memory technologies with a particular emphasis on modeling and architecture design.

128 citations


Journal ArticleDOI
TL;DR: A new resistive RAM device with fast write operation is described with the aim to improve the speed of embedded nonvolatile memories.
Abstract: Especially for microcontroller and mobile applications, embedded nonvolatile memory is an important technology offering to reduce power and provide local persistent storage. This article describes a new resistive RAM device with fast write operation to improve the speed of embedded nonvolatile memories.

114 citations


Journal ArticleDOI
TL;DR: There is significant opportunity to look beyond parallelization and focus on domain-specific customization to bring significant power-performance efficiency improvement.
Abstract: To meet computing needs and overcome power density limitations, the computing industry has entered the era of parallelization. However, highly parallel, general-purpose computing systems face serious challenges in terms of performance, energy, heat dissipation, space, and cost. We believe that there is significant opportunity to look beyond parallelization and focus on domain-specific customization to bring significant power-performance efficiency improvement.

108 citations


Journal ArticleDOI
TL;DR: This tutorial provides an overview of the best-in-class asynchronous pipelining methods that can be used to fully exploit the advantages of this design style, covering both static and dynamic logic implementations.
Abstract: Pipelining is a key element of high-performance design. Distributed synchronization is at the same time one of the key strengths and one of the major difficulties of asynchronous pipelining. It automatically provides elasticity and on-demand power consumption. This tutorial provides an overview of the best-in-class asynchronous pipelining methods that can be used to fully exploit the advantages of this design style, covering both static and dynamic logic implementations.

101 citations


Journal ArticleDOI
Takayuki Kawahara
TL;DR: Recent trends in spin-transfer-torque RAM technology, an emerging class of nonvolatile memory, are covered, and its impact on the different layers of the computer system hierarchy is discussed.
Abstract: Nonvolatile embedded memories may open the door to new computing paradigms based on "normally-off and instant-on" operation. This article covers recent trends of spin-transfer-torque RAM technology, an emerging class of nonvolatile memory, and discusses its impact on the different layers of computer system hierarchy.

91 citations


Journal ArticleDOI
TL;DR: This review article presents SRAM techniques including new bit cells, novel sensing schemes, and read/write assist circuits for ultra-low-power applications.
Abstract: SRAMs capable of operating at extremely low supply voltages (for example, below the transistor threshold voltage) can enable ultra-low-power battery-operated systems by allowing the logic and memory to operate at the same optimal supply voltage. This review article presents SRAM techniques including new bit cells, novel sensing schemes, and read/write assist circuits for ultra-low-power applications.

85 citations


Journal ArticleDOI
TL;DR: This article presents an industrial-strength asynchronous ASIC CAD flow that enables the automatic synthesis and physical design of high-level specifications into GHz silicon, greatly reducing design time and enabling far wider use of asynchronous technology.
Abstract: Editors' note: The high-performance benefits of asynchronous design have hitherto been obtained only using full-custom design. This article presents an industrial-strength asynchronous ASIC CAD flow that enables the automatic synthesis and physical design of high-level specifications into GHz silicon, greatly reducing design time and enabling far wider use of asynchronous technology.

79 citations


Journal ArticleDOI
TL;DR: Recently proposed techniques to selectively harden nanoelectronics and achieve very low error levels are reviewed.
Abstract: As ICs shrink into the nanometer range, they are increasingly subject to errors induced by physical faults. Traditional hardening for error mitigation consumes too much area and energy to be cost-effective in commercial applications. Selective hardening, applied only to a design's most error-sensitive parts, offers an attractive alternative. This article reviews recently proposed techniques to selectively harden nanoelectronics and achieve very low error levels.

49 citations


Journal ArticleDOI
TL;DR: CPS modeling and design are greatly improved when statistical physics approaches - such as master equations, renormalization group theory, and fractional derivatives - are implemented in the optimization loop.
Abstract: Built to interact with the physical world, a cyberphysical system (CPS) must be efficient, reliable, and safe. To optimize such systems, a science of CPS design considering workload characteristics (e.g., self-similarity and nonstationarity) must be established. CPS modeling and design are greatly improved when statistical physics approaches - such as master equations, renormalization group theory, and fractional derivatives - are implemented in the optimization loop.

Journal ArticleDOI
Youngsoo Shin, Seungwhun Paik
TL;DR: A design methodology and tools for pulsed-latch ASICs that complement a conventional ASIC design environment are identified, and potential solutions are reviewed with quantitative results.
Abstract: Pulsed-latch circuits retain the advantages of both latches and flip-flops, offering higher performance and lower power consumption within a conventional ASIC design environment. This article identifies a design methodology and tools for pulsed-latch ASICs to complement this environment. The authors review potential solutions and provide quantitative results to assess the effectiveness of pulsed-latch circuits.

Journal ArticleDOI
TL;DR: This article describes methods to manage the complexity of analyzing data representation techniques, with the goal of understanding precision/performance trade-offs for scientific computing on FPGAs.
Abstract: Data representation is an important problem for scientific computing problems that are mapped to FPGAs. The key challenge here is to derive the best trade-offs between precision and performance. This article describes methods to manage the complexity associated with the analysis of data representation techniques so that precision/performance trade-offs can be understood.

Journal ArticleDOI
TL;DR: This article presents a wide variety of techniques for realizing transaction-level models of the increasingly large-scale multiprocessor systems on chip and describes how such models of hardware allow subsequent software integration and system performance evaluation.
Abstract: This article presents a wide variety of techniques for realizing transaction-level models of the increasingly large-scale multiprocessor systems on chip. It describes how such models of hardware allow subsequent software integration and system performance evaluation.

Journal ArticleDOI
TL;DR: A single-test, single-tuning-step method is introduced to constrain cost and complexity while reaping the benefits of a tunable design in analog and RF devices.
Abstract: As the semiconductor industry continues scaling devices toward smaller process nodes, maintaining acceptable yields despite process variations has become increasingly challenging. Analog and RF circuits are particularly sensitive to process variations. This article discusses the challenges of cost-effective postfabrication performance calibration in such analog and RF devices and introduces a single-test, single-tuning-step method to constrain cost and complexity while reaping the benefits of a tunable design.

Journal ArticleDOI
TL;DR: This model-based methodology and supporting toolset lets designers estimate application-specific network-on-chip power dissipation at early stages of the design flow by integrating a rate-based power estimation model into the proposed design flow.
Abstract: This model-based methodology and supporting toolset lets designers estimate application-specific network-on-chip (NoC) power dissipation at early stages of the design flow. An actor-oriented simulation framework captures the NoC's dynamic behavior and feeds its parameters to a rate-based power estimation model. Integrating this model into the proposed design flow enables the analysis of different design parameters and the identification of the most power-efficient application platform mappings.

Journal ArticleDOI
TL;DR: The cost and performance of fully synchronous and mixed synchronous-asynchronous implementations of quantum cellular automata are compared, and the case that asynchrony is inevitable at the top levels of QCA designs is made.
Abstract: Emerging computing technologies inherently exhibit high process and timing variation. Many researchers believe that an asynchronous approach is likely to play an enabling role in making these technologies feasible. This article compares the cost and performance of fully synchronous and mixed synchronous-asynchronous implementations of quantum cellular automata, and makes the case that asynchrony is inevitable at the top levels of QCA designs.

Journal ArticleDOI
TL;DR: Orthogonal Latin Square Code (OLSC) can protect the interconnection against transient errors, while also lowering energy consumption, when applied to a 64-bit link using a 45-nm CMOS technology with low-swing signaling.
Abstract: A reliable, energy-efficient on-chip interconnection network employing low-swing signaling can be designed by incorporating error-correcting code. Orthogonal Latin Square Code (OLSC) can protect the interconnection against transient errors, while also lowering energy consumption. When applied to a 64-bit link using a 45-nm CMOS technology with low-swing signaling, OLSC provided up to 55% energy reduction, with only a small area overhead and no loss in reliability.

Journal ArticleDOI
TL;DR: This article proposes a new class of sensors for built-in test in RF devices; placed in close proximity to the DUT on the same substrate without being electrically connected to it, they monitor it by virtue of being subjected to the same process variations.
Abstract: This article proposes a new class of sensors for built-in test in RF devices. These sensors are placed in close proximity to the DUT on the same substrate without being electrically connected to it. Instead, they monitor it by virtue of being subjected to the same process variations. The authors also describe other types of sensors they have studied, including DC probes, an envelope detector, and a current sensor.

Journal ArticleDOI
TL;DR: The proposed optimized simulator enables fast validation of large multicore SoC designs by issuing multiple simulation threads simultaneously while ensuring safe synchronization.
Abstract: Editor's note: To address the limitations of discrete-event simulation engines, this article presents an extension of the SoC simulation kernel to support parallel simulation on multicore hosts. The proposed optimized simulator enables fast validation of large multicore SoC designs by issuing multiple simulation threads simultaneously while ensuring safe synchronization.

Journal ArticleDOI
TL;DR: Circuit techniques pursued by industry to overcome SRAM scaling challenges in future technology nodes are presented.
Abstract: Six-transistor SRAM cells have served as the workhorse embedded memory for several decades. However, with aggressive technology scaling, designers find it increasingly difficult to guarantee robust operation at low voltages because of the worsening process variation. This article presents circuit techniques pursued by industry to overcome SRAM scaling challenges in future technology nodes.

Journal ArticleDOI
TL;DR: The authors survey several thin-film transistor (TFT) technologies for flexible electronics of the future and focus on the reliability issues of these new devices compared to those of classic silicon CMOS.
Abstract: In this review article, the authors survey several thin-film transistor (TFT) technologies for flexible electronics of the future. The review's focus centers on the reliability issues of these new devices compared to those of classic silicon CMOS. In addition, the article examines different digital and analog design techniques, which are discussed within the context of robust circuit design.

Journal ArticleDOI
TL;DR: The design of a cryptographic chip using a globally asynchronous, locally synchronous (GALS) design methodology demonstrates the key advantage of using asynchrony in cryptography: the randomization of event timing internal to the chip leads to a dramatic increase in its robustness to side-channel attacks based on power and electromagnetic emission signatures.
Abstract: This article presents the design of a cryptographic chip using a globally asynchronous, locally synchronous (GALS) design methodology. The design demonstrates the key advantage of using asynchrony in cryptography: the randomization of event timing internal to the chip leads to a dramatic increase in its robustness to side-channel attacks based on power and electromagnetic emission signatures.

Journal ArticleDOI
TL;DR: The development of adaptive testing in response to the ever-growing need to dynamically and cost-effectively tailor IC testing to discriminately manage manufacturing process variations is described.
Abstract: This article describes the development of adaptive testing in response to the ever-growing need to dynamically and cost-effectively tailor IC testing to discriminately manage manufacturing process variations. Various degrees of adoption are presented, together with benefits and examples of its use. Finally, challenges for future development are discussed.

Journal ArticleDOI
Kiyoo Itoh
TL;DR: Memories are categorized as embedded memories (e-memories), which favor high speed and logic-process compatibility, and stand-alone memories, which prioritize low cost and high density; e-memories are further divided into RAMs (SRAMs and DRAMs) and ROMs, exemplified by flash memories and ferroelectric RAMs (FeRAMs).
Abstract: Memories are categorized as embedded memories (e-memories) and stand-alone memories. E-memories favor high speed rather than low cost. In addition, they must maintain compatibility with the logic process, because they must be cofabricated on the same chip as the logic. In contrast, stand-alone memories give first priority to low cost, and thus high density, rather than high speed. There are two types of e-memories: RAMs and ROMs (or "almost ROMs" such as flash memories). RAMs ensure an unlimited number of read and write cycles, and at present, RAMs come in two types: SRAMs, and DRAMs that necessitate refresh operations to retain the data stored at cell capacitors. ROMs ensure nonvolatility of stored data, exemplified by flash memories and ferroelectric RAMs (FeRAMs).

Journal ArticleDOI
TL;DR: This article presents a microsystem, powered only by energy extracted from the environment, that implements an autonomous sensing application using asynchronous logic, which provides greater energy efficiency and graceful adaptation to highly variable power availability.
Abstract: Asynchronous circuits are well-suited to ultra-low-power design. This article presents a microsystem that is powered only by energy extracted from the environment to implement an autonomous sensing application. Key to this application is the use of asynchronous logic, which not only provides greater energy efficiency due to its event-driven nature but, more importantly, allows graceful adaptation to highly variable power availability.

Journal ArticleDOI
TL;DR: A methodology to validate software applications for a multicore platform by automatically generating transaction-level models from task-level specification of the applications is suggested.
Abstract: This article suggests a methodology to validate software applications for a multicore platform by automatically generating transaction-level models from task-level specification of the applications. Software vendors developing applications for multicore platforms can leverage this methodology for early validation.

Journal ArticleDOI
TL;DR: This column examines the considerations that underlie how semiconductor manufacturing technologists determine whether a given technology road is worth pursuing.
Abstract: This column examines the considerations that underlie how semiconductor manufacturing technologists determine whether a given technology road is worth pursuing.

Journal ArticleDOI
TL;DR: Systems with elaborate multiple clock distributions are a necessity, and the authors address the postfabrication debug of such multiclock systems with solutions that achieve a consistent snapshot of the system state and force the erroneous state in the face of nondeterminism.
Abstract: Systems with elaborate multiple clock distributions are a necessity, and the authors address the postfabrication debug of such multiclock systems. Solutions, based on the authors' communication-centric debug approach, are presented that achieve a consistent snapshot of the system state and force the erroneous state in the face of nondeterminism.

Journal ArticleDOI
TL;DR: Flexible, large-area display and sensor arrays are finding growing applications in multimedia and future smart homes and this article discusses the implementation, requirements, and testing of flexible sensor arrays.
Abstract: Editors' note: Flexible, large-area display and sensor arrays are finding growing applications in multimedia and future smart homes. This article first analyzes and compares current flexible devices, then discusses the implementation, requirements, and testing of flexible sensor arrays.