
Showing papers on "Very-large-scale integration published in 1998"


Journal ArticleDOI
01 Jan 1998
TL;DR: In this article, the authors describe new bistable logic families using resonant tunneling diodes (RTD's) in conjunction with high-performance III-V devices such as heterojunction bipolar transistors (HBT's) and modulation-doped field-effect transistors (MODFET's) for binary and multiple-valued logic.
Abstract: Many semiconductor quantum devices utilize a novel tunneling transport mechanism that allows picosecond device switching speeds. The negative differential resistance characteristic of these devices, achieved due to resonant tunneling, is also ideally suited for the design of highly compact, self-latching logic circuits. As a result, quantum device technology is a promising emerging alternative for high-performance very-large-scale-integration design. The bistable nature of the basic logic gates implemented using resonant tunneling devices has been utilized in the development of a gate-level pipelining technique, called nanopipelining, that significantly improves the throughput and speed of pipelined systems. The advent of multiple-peak resonant tunneling diodes provides a viable means for efficient design of multiple-valued circuits with decreased interconnect complexity and reduced device count as compared to multiple-valued circuits in conventional technologies. This paper details various circuit design accomplishments in the area of binary and multiple-valued logic using resonant tunneling diodes (RTD's) in conjunction with high-performance III-V devices such as heterojunction bipolar transistors (HBT's) and modulation-doped field-effect transistors (MODFET's). New bistable logic families using RTD+HBT and RTD+MODFET gates are described that provide a single-gate, self-latching majority function in addition to basic NAND, NOR, and inverter gates.

477 citations


Journal ArticleDOI
TL;DR: Heuristics with good performance bounds can be derived for combinational circuits tested using built-in self-test (BIST) and considerable reduction in power dissipation can be obtained using the proposed techniques.
Abstract: Reduction of power dissipation during test application is studied for scan designs and for combinational circuits tested using built-in self-test (BIST). The problems are shown to be intractable. Heuristics to solve these problems are discussed. We show that heuristics with good performance bounds can be derived for combinational circuits tested using BIST. Experimental results show that considerable reduction in power dissipation can be obtained using the proposed techniques.

338 citations


Proceedings ArticleDOI
29 Sep 1998
TL;DR: By exploiting the spatial regularity of the new algorithm, the requirements for the two dominant elements in a VLSI implementation, the memory size and the number of complex multipliers, have been minimized and the area/power efficiency has been enhanced.
Abstract: The FFT processor is one of the key components in the implementation of wideband OFDM systems. Architectures with a structured pipeline have been used to meet the fast, real-time processing demand and the low-power consumption requirement in a mobile environment. An architecture based on a new form of FFT, the radix-2^i algorithm derived by cascade decomposition, is proposed. By exploiting the spatial regularity of the new algorithm, the requirements for the two dominant elements in a VLSI implementation, the memory size and the number of complex multipliers, have been minimized. Progressive wordlength adjustment has been introduced to optimize the total memory size for a given signal-to-quantization-noise ratio (SQNR) requirement in fixed-point processing. A new complex multiplier based on distributed arithmetic further enhances the area/power efficiency of the design. A single-chip processor for the 1K-complex-point FFT is used to demonstrate the design issues under consideration.
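
To make the cascade decomposition concrete, here is a minimal radix-2 FFT in Python; each recursion level corresponds to one stage of a hardware pipeline. This is an illustration of the FFT structure only, not the paper's radix-2^i architecture, and all names are ours.

```python
# Minimal radix-2 decimation-in-time FFT sketch. In a cascade-decomposed
# pipeline, each recursion level below maps to one hardware stage.
import cmath

def fft(x):
    n = len(x)                      # n must be a power of two
    if n == 1:
        return x
    even = fft(x[0::2])             # half-size sub-transforms
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

print(fft([1, 1, 1, 1, 0, 0, 0, 0]))  # 8-point example
```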

322 citations


Proceedings ArticleDOI
01 May 1998
TL;DR: To achieve a non-iterative design flow, it is proposed that early synthesis stages should use "wireplanning" to distribute delays over the functional elements and interconnect, and layout synthesis should use its degrees of freedom to realize those delays.
Abstract: A shift is proposed in the design of VLSI circuits. In conventional design, higher levels of synthesis produce a netlist, from which layout synthesis builds a mask specification for manufacturing. Timing analysis is built into a feedback loop to detect timing violations which are then used to update specifications to synthesis. Such iteration is undesirable, and for very high performance designs, infeasible. The problem is likely to become much worse with future generations of technology. To achieve a non-iterative design flow, we propose that early synthesis stages should use "wireplanning" to distribute delays over the functional elements and interconnect, and layout synthesis should use its degrees of freedom to realize those delays. In this paper we attempt to quantify this problem for future technologies and propose some solutions for a "constant delay" methodology.

300 citations


Proceedings ArticleDOI
01 Nov 1998
TL;DR: The Imagine architecture supports the stream programming model by providing a bandwidth hierarchy tailored to the demands of media applications, reducing the global register and memory bandwidth required by typical applications by factors of 13 and 21, respectively.
Abstract: Media applications are characterized by large amounts of available parallelism, little data reuse, and a high computation to memory access ratio. While these characteristics are poorly matched to conventional microprocessor architectures, they are a good fit for modern VLSI technology with its high arithmetic capacity but limited global bandwidth. The stream programming model, in which an application is coded as streams of data records passing through computation kernels, exposes both parallelism and locality in media applications that can be exploited by VLSI architectures. The Imagine architecture supports the stream programming model by providing a bandwidth hierarchy tailored to the demands of media applications. Compared to a conventional scalar processor, Imagine reduces the global register and memory bandwidth required by typical applications by factors of 13 and 21, respectively. This bandwidth efficiency enables a single-chip Imagine processor to achieve a peak performance of 16.2 GFLOPS (single-precision floating point) and sustained performance of up to 8.5 GFLOPS on media processing kernels.
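
A rough sketch of the stream programming model (not Imagine's actual programming tools; the kernel names are invented): records flow through a chain of kernels, and each kernel reads only its input stream, which is what exposes the locality the paper exploits.

```python
# Hypothetical two-kernel stream pipeline; generators stand in for the
# stream buffers of a real stream processor.
def scale(stream, k):          # kernel 1: multiply every record by k
    for rec in stream:
        yield rec * k

def clamp(stream, lo, hi):     # kernel 2: saturate records to [lo, hi]
    for rec in stream:
        yield max(lo, min(hi, rec))

pixels = range(10)             # a stream of data records
print(list(clamp(scale(pixels, 3), 0, 20)))
```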

280 citations


Proceedings ArticleDOI
01 Nov 1998
TL;DR: A fast and exact algorithm that minimizes total area subject to a maximum delay bound; it is based on Lagrangian relaxation and "one-gate/wire-at-a-time" greedy optimizations and is extremely economical and fast.
Abstract: This paper considers simultaneous gate and wire sizing for general very large scale integrated (VLSI) circuits under the Elmore delay model. We present a fast and exact algorithm which can minimize total area subject to a maximum delay bound. The algorithm can be easily modified to give exact algorithms for optimizing several other objectives (e.g., minimizing maximum delay or minimizing total area subject to arrival time specifications at all inputs and outputs). No previous algorithm for simultaneous gate and wire sizing can guarantee exact solutions for general circuits. Our algorithm is an iterative one with a guarantee on convergence to global optimal solutions. It is based on Lagrangian relaxation and "one-gate/wire-at-a-time" greedy optimizations, and is extremely economical and fast. For example, we can optimize a circuit with 27648 gates and wires in 11.53 min using under 23 Mbytes of memory on a PC with a 333-MHz Pentium II processor.
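
For reference, the Elmore delay used as the timing model here is, for each node, the sum over upstream resistances of resistance times total downstream capacitance. A toy Python sketch for an RC ladder; the Lagrangian-relaxation sizing iteration itself is not reproduced.

```python
# Elmore delay of an RC ladder driven at one end: each segment's
# resistance charges all capacitance downstream of it.
def elmore_delay(R, C):
    """R[i], C[i]: resistance (ohm) and capacitance (F) of segment i."""
    delay = 0.0
    for i in range(len(R)):
        delay += R[i] * sum(C[i:])   # R_i sees all downstream capacitance
    return delay

# Widening a wire lowers its R but raises its C; a "one-wire-at-a-time"
# pass re-optimizes each segment while holding the others fixed.
print(elmore_delay([100.0, 100.0], [1e-12, 1e-12]))  # ~3e-10 s
```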

255 citations


Proceedings ArticleDOI
11 May 1998
TL;DR: By exploiting the spatial regularity of the new algorithm, a minimal requirement for the two dominant components in a VLSI implementation has been achieved: only 4 complex multipliers and a 1024-complex-word data memory for the pipelined 1K FFT processor.
Abstract: The design and implementation of a 1024-point pipeline FFT processor is presented. The architecture is based on a new form of FFT, the radix-2^2 algorithm. By exploiting the spatial regularity of the new algorithm, a minimal requirement for the two dominant components in a VLSI implementation has been achieved: only 4 complex multipliers and a 1024-complex-word data memory for the pipelined 1K FFT processor. The chip has been implemented in a 0.5 µm CMOS technology and occupies an area of 40 mm². With a 3.3 V power supply, it can compute 2^n-point (n = 0, 1, ..., 10) forward and inverse FFTs in real time at sampling frequencies up to 30 MHz. The SQNR is above 50 dB for white-noise input.

243 citations


Journal ArticleDOI
01 Sep 1998
TL;DR: A detailed survey of yield-enhancement techniques for very-large-scale-integration (VLSI) circuits, in which the authors review statistical yield-prediction models and illustrate the techniques by describing the design of several representative defect-tolerant VLSI circuits.
Abstract: Current very-large-scale-integration (VLSI) technology allows the manufacture of large-area integrated circuits with submicrometer feature sizes, enabling designs with several million devices. However, imperfections in the fabrication process result in yield-reducing manufacturing defects, whose severity grows proportionally with the size and density of the chip. Consequently, the development and use of yield-enhancement techniques at the design stage, to complement existing efforts at the manufacturing stage, is economically justifiable. Design-stage yield-enhancement techniques are aimed at making the integrated circuit "defect tolerant", i.e., less sensitive to manufacturing defects. They include incorporating redundancy into the design, modifying the circuit floorplan, and modifying its layout. Successful designs of defect-tolerant chips must rely on accurate yield projections. This paper reviews the currently used statistical yield-prediction models and their application to defect-tolerant designs. We then provide a detailed survey of various yield-enhancement techniques and illustrate their use by describing the design of several representative defect-tolerant VLSI circuits.
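
As one example of the statistical yield models the paper reviews, the widely used negative-binomial model ties yield to chip area, defect density, and a clustering parameter, with the Poisson model as its unclustered limit. The parameter values below are invented for illustration.

```python
import math

# Negative-binomial yield model, standard in the defect-tolerance
# literature:  Y = (1 + A * D0 / alpha) ** (-alpha)
# A: chip area (cm^2), D0: defect density (defects/cm^2),
# alpha: defect clustering parameter (alpha -> inf gives Poisson).
def yield_nb(area, d0, alpha):
    return (1.0 + area * d0 / alpha) ** (-alpha)

def yield_poisson(area, d0):
    return math.exp(-area * d0)

print(yield_nb(1.0, 0.5, 2.0))      # ~0.64 with clustered defects
print(yield_poisson(1.0, 0.5))      # ~0.61 in the unclustered limit
```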

236 citations


Journal ArticleDOI
TL;DR: This paper presents a tutorial of the principles of wave-pipelining and a survey of wave-pipelined VLSI chips and CAD tools for the synthesis and analysis of wave-pipelined circuits.
Abstract: Wave-pipelining is a method of high-performance circuit design which implements pipelining in logic without the use of intermediate latches or registers. The combination of high-performance integrated circuit (IC) technologies, pipelined architectures, and sophisticated computer-aided design (CAD) tools has converted wave-pipelining from a theoretical oddity into a realistic, although challenging, VLSI design method. This paper presents a tutorial of the principles of wave-pipelining and a survey of wave-pipelined VLSI chips and CAD tools for the synthesis and analysis of wave-pipelined circuits.
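
The defining timing property of wave-pipelining is that the minimum clock period scales with the spread of combinational path delays rather than with the maximum delay alone. Below is a sketch of one commonly quoted form of the constraint, with illustrative constants; the exact terms vary across the wave-pipelining literature.

```python
# Wave-pipelining clock-period bound: successive data "waves" travel
# through the same unlatched logic, so a new wave may be launched as
# soon as it cannot catch up with the previous one.
def min_clock_period(d_max, d_min, t_setup, t_hold, t_skew):
    return (d_max - d_min) + t_setup + t_hold + 2 * t_skew

d_max, d_min = 10.0, 8.5   # ns: longest/shortest register-to-register paths
print(min_clock_period(d_max, d_min, 0.3, 0.2, 0.1))  # ~2.2 ns, far below d_max
```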

235 citations


Journal ArticleDOI
TL;DR: An overview of a comprehensive collection of on-line testing techniques for VLSI, ranging from self-checking design and signature monitoring to fail-safe techniques that avoid complex discrete-component interfaces and radiation-hardened designs that avoid expensive fabrication processes such as SOI.
Abstract: This paper presents an overview of a comprehensive collection of on-line testing techniques for VLSI. Such techniques include, for instance: self-checking design, allowing high-quality concurrent checking at a hardware cost drastically lower than duplication; signature monitoring, allowing low-cost concurrent error detection for FSMs; on-line monitoring of reliability-relevant parameters such as current, temperature, abnormal delay, signal activity during steady state, radiation dose, clock waveforms, etc.; exploitation of standard BIST, or implementation of BIST techniques specific to on-line testing (transparent BIST, built-in concurrent self-test, ...); exploitation of scan paths to transfer internal states for performing various tasks for on-line testing or fault tolerance; fail-safe techniques for VLSI, avoiding complex fail-safe interfaces built from discrete components; and radiation-hardened designs, avoiding expensive fabrication processes such as SOI.

234 citations


Book
01 Jan 1998
TL;DR: One of the first books on the subject, this guide covers all stages of design and focuses on the algorithms that are the building blocks of the design automation software that generates the layout of VLSI circuits.
Abstract: From the Publisher: Modern microprocessors such as Intel's Pentium chip typically contain millions of transistors. Known generically as Very Large-Scale Integrated (VLSI) systems, these chips have a scale and complexity that has necessitated the development of CAD tools to automate their design. This book focuses on the algorithms that are the building blocks of the design automation software that generates the layout of VLSI circuits. One of the first books on the subject, this guide covers all stages of design.

Book
01 Jan 1998
TL;DR: This book provides an introduction to the foundations of this interdisciplinary research area, emphasizing its applications in computer aided circuit design.
Abstract: From the Publisher: This book provides an introduction to the foundations of this interdisciplinary research area, emphasizing its applications in computer-aided circuit design.

Journal ArticleDOI
H.H. Chen1, J.S. Neely1
TL;DR: This integrated chip-and-package model provides a complete analysis of the resistive IR drop, inductive delta-I noise, and the on-chip Vdd distribution and allows designers to identify the hot spots on the chip and optimize design variables to minimize the noise.
Abstract: This paper describes the interconnect and circuit modeling techniques to analyze the on-chip power supply noise for high-performance very large scale integration (VLSI) design. To reduce the complexity of full-chip analysis, a hierarchical power supply distribution model, which consists of a 12×12 package model, a 50×50 on-chip power bus model, and a distributed switching circuit model, is developed. This integrated chip-and-package model provides a complete analysis of the resistive IR drop, inductive delta-I noise, and the on-chip Vdd distribution. It also allows designers to identify the hot spots on the chip and optimize design variables to minimize the noise. Analysis results of our benchmark microprocessor chips will be presented to demonstrate the various applications of this methodology.
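
A deliberately simplified 1-D illustration of resistive IR drop (the paper's model is a hierarchical 2-D chip-and-package network): current drawn at each tap flows through, and drops voltage across, every upstream rail segment.

```python
# Toy IR-drop calculation along a single power rail. Segment j carries
# the total current of tap j and everything downstream of it.
def ir_drop(vdd, r_segment, tap_currents):
    v = vdd
    drops = []
    remaining = sum(tap_currents)        # current through the first segment
    for i in tap_currents:
        v -= r_segment * remaining       # all downstream current flows here
        drops.append(vdd - v)
        remaining -= i                   # this tap's current leaves the rail
    return drops

print(ir_drop(1.8, 0.05, [0.1, 0.1, 0.2]))  # volts of drop at each tap
```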

Journal ArticleDOI
16 Aug 1998
TL;DR: The sum-product algorithm (belief/probability propagation) can be naturally mapped into analog transistor circuits, which enable the construction of analog-VLSI decoders for turbo codes, low-density parity-check codes, and similar codes.
Abstract: The sum-product algorithm (belief/probability propagation) can be naturally mapped into analog transistor circuits. These circuits enable the construction of analog-VLSI decoders for turbo codes, low-density parity-check codes, and similar codes.
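
For concreteness, in the log-likelihood-ratio (LLR) domain the sum-product check-node update that such analog circuits compute is L_out = 2 atanh(prod_j tanh(L_j / 2)); a scalar sketch:

```python
import math

# Sum-product (belief propagation) check-node update in the LLR domain.
# The analog decoders realize the tanh relationship with transistor
# circuits; this is only the numerical form of the update.
def check_node_update(incoming_llrs):
    p = 1.0
    for llr in incoming_llrs:
        p *= math.tanh(llr / 2.0)
    return 2.0 * math.atanh(p)

print(check_node_update([1.2, -0.4, 2.0]))  # combined extrinsic belief
```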

Book ChapterDOI
01 Sep 1998
TL;DR: This work has shown that multiplexing is an effective way of leveraging the five-order-of-magnitude difference in bandwidth between a neuron and a digital bus, enabling us to replace dedicated point-to-point connections among thousands of neurons with a handful of high-speed connections and thousands of switches.
Abstract: The small number of input-output connections available with standard chip-packaging technology, and the small number of routing layers available in VLSI technology, place severe limitations on the degree of intra- and interchip connectivity that can be realized in multichip neuromorphic systems. Inspired by the success of time-division multiplexing in communications [16] and computer networks [19], many researchers have adopted multiplexing to solve the connectivity problem [12, 67, 17]. Multiplexing is an effective way of leveraging the five-order-of-magnitude difference in bandwidth between a neuron (hundreds of Hz) and a digital bus (tens of megahertz), enabling us to replace dedicated point-to-point connections among thousands of neurons with a handful of high-speed connections and thousands of switches (transistors). This approach pays off in VLSI technology because transistors take up a lot less area than wires, and are becoming relatively more and more compact as the fabrication process scales down to deep submicron feature sizes.
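
The bandwidth arithmetic can be checked directly; the rates below are illustrative stand-ins for "hundreds of Hz" and "tens of megahertz".

```python
# Back-of-the-envelope version of the multiplexing argument: a shared
# digital bus at tens of MHz can carry the spike traffic of roughly 1e5
# neurons firing at hundreds of Hz.
neuron_rate_hz = 300       # a neuron fires at hundreds of Hz
bus_rate_hz = 30e6         # a digital bus moves tens of mega-events/s
print(f"{bus_rate_hz / neuron_rate_hz:.0f} neurons per shared bus")  # 100000
```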

Journal ArticleDOI
TL;DR: The results suggest that OE-VLSI integration offers substantial power and speed improvements even when relatively small numbers of photonic devices are driven with commodity complementary metal-oxide-semiconductor logic technologies.
Abstract: Optoelectronic-VLSI (OE-VLSI) technology represents the intimate integration of photonic devices with silicon VLSI electronics. We review the motivations and status of emerging OE-VLSI technologies and examine the performance of OE-VLSI technology versus conventional wire-bonded OE packaging. The results suggest that OE-VLSI integration offers substantial power and speed improvements even when relatively small numbers of photonic devices are driven with commodity complementary metal-oxide-semiconductor logic technologies.

Proceedings ArticleDOI
23 Feb 1998
TL;DR: This paper presents innovative encoding techniques suitable for minimizing the switching activity of system-level address buses, and targets the reduction of the average number of bus line transitions per clock cycle.
Abstract: The power dissipated by system-level buses is the largest contribution to the global power of complex VLSI circuits. Therefore, the minimization of the switching activity at the I/O interfaces can provide significant savings on the overall power budget. This paper presents innovative encoding techniques suitable for minimizing the switching activity of system-level address buses. In particular, the schemes illustrated here target the reduction of the average number of bus line transitions per clock cycle. Experimental results, conducted on address streams generated by a real microprocessor, have demonstrated the effectiveness of the proposed methods.
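
The paper's own address-bus codes are not reproduced here, but the classic bus-invert scheme illustrates the general principle of spending one redundant line to cut transitions; this sketch inverts a word whenever that reduces the Hamming distance to the previous bus value.

```python
# Transition counting plus the classic bus-invert code, shown only as a
# *representative* low-power encoding; the paper's address-bus schemes
# differ in detail. The transition on the extra invert line itself is
# ignored here for simplicity.
def hamming(a, b):
    return bin(a ^ b).count("1")

def bus_invert(prev, word, width=8):
    """Send ~word and raise the invert flag when that halves transitions."""
    if hamming(prev, word) > width // 2:
        return ~word & ((1 << width) - 1), 1
    return word, 0

prev = 0b00000000
for addr in (0b11111110, 0b11111111, 0b00000001):
    sent, inv = bus_invert(prev, addr)
    print(f"sent={sent:08b} invert={inv} transitions={hamming(prev, sent)}")
    prev = sent
```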

01 Jan 1998
TL;DR: This thesis presents the design, implementation, and evaluation of T0 (Torrent-0): the first single-chip vector microprocessor, a compact but highly parallel processor that can sustain over 24 operations per cycle while issuing only a single 32-bit instruction per cycle.
Abstract: Most previous research into vector architectures has concentrated on supercomputing applications and small enhancements to existing vector supercomputer implementations. This thesis expands the body of vector research by examining designs appropriate for single-chip full-custom vector microprocessor implementations targeting a much broader range of applications. I present the design, implementation, and evaluation of T0 (Torrent-0): the first single-chip vector microprocessor. T0 is a compact but highly parallel processor that can sustain over 24 operations per cycle while issuing only a single 32-bit instruction per cycle. T0 demonstrates that vector architectures are well suited to full-custom VLSI implementation and that they perform well on many multimedia and human-machine interface tasks. The remainder of the thesis contains proposals for future vector microprocessor designs. I show that the most area-efficient vector register file designs have several banks with several ports, rather than many banks with few ports as used by traditional vector supercomputers, or one bank with many ports as used by superscalar microprocessors. To extend the range of vector processing, I propose a vector flag processing model which enables speculative vectorization of "while" loops. To improve the performance of inexpensive vector memory systems, I introduce virtual processor caches, a new form of primary vector cache which can convert some forms of strided and indexed vector accesses into unit-stride bursts.

Journal ArticleDOI
TL;DR: A new chip-level electrothermal timing simulator for CMOS VLSI circuits is presented, and temperature-dependent reliability and timing problems of VLSI circuits can be accurately identified.
Abstract: In this paper, we present ILLIADS-T, a new chip-level electrothermal timing simulator for CMOS VLSI circuits. Given the chip layout, the packaging specification, and the periodic input signal pattern, it finds the on-chip steady-state temperature profile and the resulting circuit performance. A tester chip has been designed for verification of ILLIADS-T, and very good agreement between simulation and experiment was found. Using this electrothermal simulator, temperature-dependent reliability and timing problems of VLSI circuits can be accurately identified.

Journal ArticleDOI
01 Apr 1998
TL;DR: In this article, the recent advances of silicon-on-insulator (SOI) technology for complementary metal-oxide-semiconductor (CMOS) very-large-scale-integration memory and logic applications are considered.
Abstract: This paper reviews the recent advances of silicon-on-insulator (SOI) technology for complementary metal-oxide-semiconductor (CMOS) very-large-scale-integration memory and logic applications. Static random access memories (SRAMs), dynamic random access memories (DRAMs), and digital CMOS logic circuits are considered. Particular emphases are placed on the design issues and advantages resulting from the unique SOI device structure. The impact of floating-body in partially depleted devices on the circuit operation, stability, and functionality are addressed. The use of smart-body contact to improve the power and delay performance is discussed, as are global design issues.

Proceedings ArticleDOI
Byron L. Krauter1, Sharad Mehrotra1
01 May 1998
TL;DR: This work proposes a rules-based method that efficiently and accurately captures the high and low frequency characteristics directly from layout shapes, and subsequently synthesizes a simple frequency independent ladder circuit suitable for timing analysis.
Abstract: It is well understood that frequency-independent lumped-element circuits can be used to accurately model proximity and skin effects in transmission lines [7]. Furthermore, it is also understood that these circuits can be synthesized knowing only the high- and low-frequency resistances and inductances [4]. Existing VLSI extraction tools, however, are not efficient enough to solve for the frequency-dependent resistances and inductances on large VLSI layouts, nor do they synthesize circuits suitable for timing analysis. We propose a rules-based method that efficiently and accurately captures the high- and low-frequency characteristics directly from layout shapes, and subsequently synthesizes a simple frequency-independent ladder circuit suitable for timing analysis. We compare our results to other simulation results.

Journal ArticleDOI
TL;DR: Augmenting a conventional systolic-architecture-based VLSI motion estimator with the estimation technique reduces the power consumption by a factor of 2, while still preserving the optimal solution and the throughput.
Abstract: This paper presents an architectural enhancement to reduce the power consumption of full-search block-matching (FSBM) motion estimation. Our approach is based on eliminating unnecessary computation using conservative approximation. Augmenting a conventional systolic-architecture-based VLSI motion estimator with the estimation technique reduces the power consumption by a factor of 2, while still preserving the optimal solution and the throughput. A register-transfer-level implementation as well as simulation results on benchmark video clips are presented.
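
The computation-elimination idea can be sketched in software (the paper's contribution is a systolic-hardware realization of it): abandon a candidate block as soon as its partial sum of absolute differences (SAD) exceeds the best complete SAD so far, which saves arithmetic without changing the winning match.

```python
# Conservative early termination for block matching: the partial SAD only
# grows, so once it passes the current best it can be abandoned safely.
def sad_with_early_exit(block_a, block_b, best_so_far):
    partial = 0
    for a, b in zip(block_a, block_b):
        partial += abs(a - b)
        if partial >= best_so_far:      # cannot beat the current best
            return None                 # abandon this candidate
    return partial

best = float("inf")
for candidate in ([10, 12, 9, 11], [50, 60, 55, 58], [10, 11, 10, 11]):
    s = sad_with_early_exit([10, 11, 10, 10], candidate, best)
    if s is not None:
        best = s
print(best)  # optimal SAD, identical to the exhaustive computation
```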

Journal ArticleDOI
TL;DR: Compared to other techniques for fault tolerance in FPGAs, these methods are shown to provide significantly greater yield improvement, and a 35 percent non-FT chip yield for a 16×16 FPGA is more than doubled.
Abstract: The very high levels of integration and submicron device sizes used in current and emerging VLSI technologies for FPGAs lead to higher occurrences of defects and operational faults. Thus, there is a critical need for fault tolerance and reconfiguration techniques for FPGAs to increase chip yields (with factory reconfiguration) and/or system reliability (with field reconfiguration). We first propose techniques utilizing the principle of node-covering to tolerate logic or cell faults in SRAM-based FPGAs. A routing discipline is developed that allows each cell to cover (to be able to replace) its neighbor in a row. Techniques are also proposed for tolerating wiring faults by means of replacement with spare portions. The replaceable portions can be individual segments, or else sets of segments, called "grids". Fault detection in the FPGAs is accomplished by separate testing, either at the factory or by the user. If reconfiguration around faulty cells and wiring is performed at the factory (with laser-burned fuses, for example), it is completely transparent to the user. In other words, user configuration data loaded into the SRAM remains the same, independent of whether the chip is defect-free or whether it has been reconfigured around defective cells or wiring, a major advantage for hardware vendors who design and sell FPGA-based logic (e.g., glue logic in microcontrollers, video cards, DSP cards) in production-scale quantities. Compared to other techniques for fault tolerance in FPGAs, our methods are shown to provide significantly greater yield improvement, and a 35 percent non-FT chip yield for a 16×16 FPGA is more than doubled.

Proceedings ArticleDOI
31 May 1998
TL;DR: The proposed approach is based on a re-ordering of the vectors in the test sequence to minimize the switching activity of the circuit during test application and guarantees a decrease in power consumption and heat dissipation.
Abstract: This paper considers the problem of testing VLSI integrated circuits without exceeding their power ratings during test. The proposed approach is based on a re-ordering of the vectors in the test sequence to minimize the switching activity of the circuit during test application. Our technique uses the Hamming distance between test vectors and guarantees a decrease in power consumption and heat dissipation without modifying the initial fault coverage. Results of experiments are presented at the end of this paper and show a reduction of circuit activity in the range of 8.2% to 54.1% during test application.
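
A minimal sketch of Hamming-distance-guided re-ordering (a plausible greedy variant; the paper's exact procedure may differ): repeatedly append the unused vector closest to the previously applied one, so that consecutive test vectors flip as few inputs as possible.

```python
# Greedy nearest-neighbour ordering of a test set by Hamming distance.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def reorder(vectors):
    remaining = list(vectors)
    order = [remaining.pop(0)]
    while remaining:
        nxt = min(remaining, key=lambda v: hamming(order[-1], v))
        remaining.remove(nxt)
        order.append(nxt)
    return order

tests = ["0000", "1111", "0001", "1110"]
print(reorder(tests))  # ['0000', '0001', '1111', '1110']: 5 flips vs 12
```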

Journal ArticleDOI
TL;DR: In this paper, the authors prove the necessity for extending the system of design rules, propose a thermal design rule, and present an efficient and quantitatively accurate thermal simulator as a tool for the design process.
Abstract: In this paper, self-heating of interconnects is shown to affect the lifetime of next-generation integrated circuits significantly more severely than that of today's. The paper proves the necessity of extending the system of design rules, proposes a thermal design rule, and presents an efficient and quantitatively accurate thermal simulator as a tool for the design process.

Proceedings ArticleDOI
13 Sep 1998
TL;DR: A new VLSI-oriented fast Fourier transform (FFT) algorithm, radix-2/4/8, which can effectively minimize the number of complex multiplications and is used to design an 8K FFT ASIC for the DVB application in a 0.6 µm, 3.3 V triple-metal CMOS process.
Abstract: In this paper, we present a new VLSI-oriented fast Fourier transform (FFT) algorithm, radix-2/4/8, which can effectively minimize the number of complex multiplications. This algorithm can be implemented efficiently using a pipelined architecture. Based on this pipelined architecture, an 8K FFT ASIC is designed for use in the DVB (Digital Video Broadcasting) application in a 0.6 µm, 3.3 V triple-metal CMOS process.

Proceedings ArticleDOI
Phillip J. Restle1, Alina Deutsch1
11 Jun 1998
TL;DR: Novel modeling and measurement techniques are used to investigate on-chip transmission-line effects that are important for high performance clock distribution networks.
Abstract: Clock distribution has become an increasingly challenging problem for VLSI designs, consuming an increasing fraction of resources such as wiring, power, and design time. Unwanted differences or uncertainties in clock network delays degrade performance or cause functional errors. Three dramatically different strategies being used in the VLSI industry to address these challenges are compared. Novel modeling and measurement techniques are used to investigate on-chip transmission-line effects that are important for high performance clock distribution networks.

Book ChapterDOI
23 Sep 1998
TL;DR: After defining an ‘operational envelope’ of robustness, the feasibility of performing fitness evaluations in widely varying physical conditions in order to provide a selection-pressure for robustness is demonstrated and preliminary experimental results are encouraging.
Abstract: ‘Unconstrained intrinsic hardware evolution’ allows an evolutionary algorithm freedom to find the forms and processes natural to a reconfigurable VLSI medium. It has been shown to produce highly unconventional but extremely compact FPGA configurations for simple tasks, but these circuits are usually not robust enough to be useful: they malfunction if used on a slightly different FPGA, or at a different temperature. After defining an ‘operational envelope’ of robustness, the feasibility of performing fitness evaluations in widely varying physical conditions in order to provide a selection-pressure for robustness is demonstrated. Preliminary experimental results are encouraging.

Proceedings ArticleDOI
09 Jan 1998
TL;DR: It is shown that the average macroblock (MB) complexity per arbitrarily shaped P-VOP varies significantly over time for the encoder, with only minor variations for the decoder.
Abstract: A complexity analysis of the video part of the emerging ISO/IEC MPEG-4 standard was performed as a basis for HW/SW partitioning for VLSI implementation of a portable MPEG-4 terminal. While the computational complexity of previously standardized video coding schemes was predictable for I-, P-, and B-frames over time, the support of arbitrarily shaped visual objects as well as various coding options within MPEG-4 now introduces content-dependent computational requirements with significant variance. In this paper, the results of a time-dependent complexity analysis of the encoding and decoding process of a binary-shape-coded video object (VO), and a comparison with a rectangular-shaped VO, are given for the complete codec as well as for the individual tools of the encoding and decoding process. It is shown that the average macroblock (MB) complexity per arbitrarily shaped P-VOP varies significantly over time for the encoder, with only minor variations for the decoder.

Journal ArticleDOI
TL;DR: An algorithm for automatically restructuring the controllers of the data paths in which variable-latency units have been introduced is formulated, and results show an average throughput improvement exceeding 27%, at the price of a modest area increase.
Abstract: This paper introduces a novel optimization paradigm for increasing the throughput of digital systems. The basic idea consists of transforming fixed-latency units into variable-latency ones that run with a faster clock cycle. The transformation is fully automatic and can be used in conjunction with traditional design techniques to improve the overall performance of speed-critical units. In addition, we introduce procedures for reducing the area overhead of the modified units, and we formulate an algorithm for automatically restructuring the controllers of the data paths in which variable-latency units have been introduced. Results, obtained on a large set of benchmark circuits, show an average throughput improvement exceeding 27%, at the price of a modest area increase (less than 8% on average).
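
The throughput arithmetic behind the idea is simple. With invented numbers (not taken from the paper), a unit whose fixed-latency clock is 10 ns is re-clocked at 6 ns so that 90% of operations complete in one cycle and the remainder take two:

```python
# Expected-throughput comparison of fixed- vs variable-latency operation.
fixed_cycle = 10.0                      # ns, fixed-latency clock period
fast_cycle = 6.0                        # ns, variable-latency clock period
p_one_cycle = 0.9                       # fraction finishing in one cycle
avg_cycles = p_one_cycle * 1 + (1 - p_one_cycle) * 2
avg_time = fast_cycle * avg_cycles      # 6.6 ns per operation on average
print(f"throughput gain: {fixed_cycle / avg_time - 1:.0%}")  # ~52%
```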