
Showing papers on "Very-large-scale integration" published in 1982


Journal ArticleDOI
TL;DR: It is shown that addition of n-bit binary numbers can be performed on a chip with a regular layout in time proportional to log n and with area proportional to n.
Abstract: With VLSI architecture, the chip area and design regularity represent a better measure of cost than the conventional gate count. We show that addition of n-bit binary numbers can be performed on a chip with a regular layout in time proportional to log n and with area proportional to n.

1,147 citations
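
The logarithmic-time claim can be illustrated by treating carry computation as a prefix problem over (generate, propagate) pairs. The sketch below is a software model only, assuming a simple doubling-distance prefix schedule chosen for brevity; it is not the layout from the paper and says nothing about chip area. The function name `prefix_add` and its return convention are hypothetical.

```python
def prefix_add(a, b, n):
    """Add two n-bit numbers; return (sum mod 2**n, number of prefix levels)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]    # bit i generates a carry
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # bit i propagates a carry
    G, P = g[:], p[:]
    levels, d = 0, 1
    while d < n:
        # Combine each position with the group ending d positions below it.
        new_g, new_p = G[:], P[:]
        for i in range(d, n):
            new_g[i] = G[i] | (P[i] & G[i - d])
            new_p[i] = P[i] & P[i - d]
        G, P = new_g, new_p
        d *= 2
        levels += 1
    # After the loop, G[i] is the carry out of bit i.
    total = 0
    for i in range(n):
        carry_in = G[i - 1] if i > 0 else 0
        total |= (p[i] ^ carry_in) << i
    return total, levels


print(prefix_add(200, 100, 8))  # sum is taken modulo 2**8
```

The number of levels grows as the base-2 logarithm of n (rounded up), which is the depth behavior the abstract refers to.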


Journal ArticleDOI
TL;DR: It is shown that for most practical ALU implementations, including the carry-lookahead adders, the RESO technique will detect all errors caused by faults in a bit-slice or a specific subcircuit of the bit slice.
Abstract: A new method of concurrent error detection in the Arithmetic and Logic Units (ALU's) is proposed. This method, called "Recomputing with Shifted Operands" (RESO), can detect errors in both the arithmetic and logic operations. RESO uses the principle of time redundancy in detecting the errors and achieves its error detection capability through the use of the already existing replicated hardware in the form of identical bit slices. It is shown that for most practical ALU implementations, including the carry-lookahead adders, the RESO technique will detect all errors caused by faults in a bit-slice or a specific subcircuit of the bit slice. The fault model used is more general than the commonly assumed stuck-at fault model. Our fault model assumes that the faults are confined to a small area of the circuit and that the precise nature of the faults is not known. This model is very appropriate for the VLSI circuits.

344 citations
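
A minimal sketch of the RESO idea, assuming a ripple-carry adder built from identical bit slices and one illustrative fault (the sum output of a single slice stuck at 0). The paper's fault model is more general; the function names and the choice of a one-bit left shift are assumptions made for this example.

```python
def sliced_add(x, y, width, faulty_slice=None):
    """Ripple-carry adder of `width` identical bit slices.

    If faulty_slice is given, the sum output of that slice is stuck at 0.
    The carry out of the top slice is returned as bit `width`.
    """
    carry, result = 0, 0
    for i in range(width):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        s = xi ^ yi ^ carry
        carry = (xi & yi) | (carry & (xi ^ yi))
        if i == faulty_slice:
            s = 0  # injected fault
        result |= s << i
    return result | (carry << width)


def reso_add(a, b, n, faulty_slice=None):
    """Add two n-bit operands twice and compare; return (result, error_detected)."""
    first = sliced_add(a, b, n + 1, faulty_slice)
    # Recompute with both operands shifted left by one bit, then shift back,
    # so that each result bit is produced by a different slice.
    second = sliced_add(a << 1, b << 1, n + 1, faulty_slice) >> 1
    return first, first != second


print(reso_add(5, 3, 4))                  # fault-free run
print(reso_add(5, 3, 4, faulty_slice=3))  # faulty slice corrupts the first pass
```

Because the fault corrupts a different result bit in each pass, the two passes disagree whenever the first pass is wrong under this fault.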


Journal ArticleDOI
TL;DR: Calculations of time delay for interconnections made of poly-Si, WSi2, W, and Al indicate that as the chip area is increased and other device-related dimensions are decreased the interconnection time delay becomes significant compared to the device time delay and in extreme cases dominates the chip performance.
Abstract: Effect of scaling of dimensions, i.e., increase in chip size and decrease in minimum feature size, on the RC time delay associated with interconnections in VLSIC's has been investigated. Analytical expressions have been developed to relate this time delay to various elements of technology, i.e., interconnection material, minimum feature size, chip area, length of the interconnect, etc. Empirical expressions to predict the trends of the technological elements as a function of chronological time have been developed. Calculations of time delay for interconnections made of poly-Si, WSi2, W, and Al have been done and they indicate that as the chip area is increased and other device-related dimensions are decreased the interconnection time delay becomes significant compared to the device time delay and in extreme cases dominates the chip performance.

207 citations
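
The scaling trend can be sketched with a first-order estimate, assuming a line of resistivity ρ, thickness H and length L over an oxide of thickness t_ox, with parallel-plate capacitance and no fringing. Under those assumptions R = ρL/(WH) and C = εWL/t_ox, so RC = ρεL²/(H·t_ox) and the line width W cancels. This is not the paper's analytical expression, and the resistivities below are order-of-magnitude placeholders, not values taken from the paper.

```python
EPS_OX = 3.9 * 8.854e-14  # permittivity of SiO2 in F/cm

# Illustrative order-of-magnitude resistivities in ohm*cm (assumptions only).
RESISTIVITY = {"poly-Si": 1e-3, "WSi2": 5e-5, "W": 1e-5, "Al": 3e-6}


def rc_delay(material, length_cm, thickness_cm, oxide_cm):
    """First-order RC product in seconds for a line over oxide (no fringing)."""
    rho = RESISTIVITY[material]
    return rho * EPS_OX * length_cm ** 2 / (thickness_cm * oxide_cm)


# Compare materials for a hypothetical 1 cm line, 0.5 um thick, over 1 um of oxide.
for name in RESISTIVITY:
    print(name, rc_delay(name, 1.0, 0.5e-4, 1.0e-4))
```

The quadratic dependence on length is why larger chips make interconnect delay grow even when device delay shrinks.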


Journal ArticleDOI
TL;DR: In this article, a multichip module for future VLSI computer packages on which an array of silicon chips is directly attached and interconnected by high-density thin-film lossy transmission lines is discussed.
Abstract: This paper discusses a multichip module for future VLSI computer packages on which an array of silicon chips is directly attached and interconnected by high-density thin-film lossy transmission lines. Since the high-performance VLSI chips contain a large number of off-chip driver circuits which are allowed to switch simultaneously in operation, low-inductance on-module capacitors are found to be essential for stabilizing the on-module power supply. Novel on-module capacitor structures are therefore proposed, discussed, and evaluated. Material systems and processing techniques for both the thin-film interconnection lines and the capacitor structures are also briefly discussed in the paper. Development of novel defect detection and repair techniques has been identified as essential for fabricating the Thin-Film Module with practical yields.

171 citations


Journal ArticleDOI
TL;DR: This correspondence is concerned with the development of algorithms for special-purpose VLSI arrays and the approach used is to identify algorithm transformations which modify favorably the index set and the data dependences, but preserve the ordering imposed on the index set by the data dependences.
Abstract: This correspondence is concerned with the development of algorithms for special-purpose VLSI arrays. The approach used in this correspondence is to identify algorithm transformations which modify favorably the index set and the data dependences, but preserve the ordering imposed on the index set by the data dependences. Conditions for the existence of such transformations are given for a class of algorithms. Also, a methodology is proposed for the synthesis of VLSI algorithms.

164 citations
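
One common way to read "preserve the ordering imposed by the data dependences" is that a linear time mapping π must assign every dependence vector d a strictly positive time step, π·d > 0. The sketch below checks that condition. The dependence vectors are a hypothetical example for a triple loop nest, not taken from the correspondence, and `preserves_ordering` is an illustrative name.

```python
def preserves_ordering(pi, dependences):
    """True if the linear schedule pi satisfies pi . d > 0 for every dependence d."""
    return all(sum(p * x for p, x in zip(pi, d)) > 0 for d in dependences)


# Hypothetical dependence vectors for a loop nest indexed by (i, j, k).
DEPS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

print(preserves_ordering((1, 1, 1), DEPS))   # True
print(preserves_ordering((1, -1, 1), DEPS))  # False: (0, 1, 0) would run backwards
```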


Journal ArticleDOI
TL;DR: A simple formula for the estimation of the capacitance of a single interconnection line in VLSI circuits is presented and it is shown that the approximation agrees favorably with the results obtained from much more costly two-dimensional simulations.
Abstract: A simple formula for the estimation of the capacitance of a single interconnection line in VLSI circuits is presented. It is shown that the approximation agrees favorably with the results obtained from much more costly two-dimensional simulations. The approximation is also simpler and more accurate than other approximations that have been proposed.

144 citations


Journal ArticleDOI
TL;DR: Calculations of time delay for interconnections made of poly-Si, WSi2, W, and Al indicate that as the chip area is increased and other device-related dimensions are decreased the interconnection time delay becomes significant compared to the device time delay and in extreme cases dominates the chip performance.
Abstract: Effect of scaling of dimensions, i.e., increase in chip size and decrease in minimum feature size, on the RC time delay associated with interconnections in VLSIC's has been investigated. Analytical expressions have been developed to relate this time delay to various elements of technology, i.e., interconnection material, minimum feature size, chip area, length of the interconnect, etc. Empirical expressions to predict the trends of the technological elements as a function of chronological time have been developed. Calculations of time delay for interconnections made of poly-Si, WSi2, W, and Al have been done and they indicate that as the chip area is increased and other device-related dimensions are decreased the interconnection time delay becomes significant compared to the device time delay and in extreme cases dominates the chip performance.

139 citations


Journal ArticleDOI
TL;DR: Two bit-serial parallel processing systems are developed: an airborne associative processor and a ground based massively parallel processor.
Abstract: About a decade ago, a bit-serial parallel processing system STARAN® was developed. It used standard integrated circuits that were available at that time. Now, with the availability of VLSI, a much greater processing capability can be packed in a unit volume. This has led to the recent development of two bit-serial parallel processing systems: an airborne associative processor and a ground based massively parallel processor.

133 citations


Journal ArticleDOI
TL;DR: A new type of tactile sensor is presented that was designed to give a robot manipulation system information about contact between its hand and objects in the environment and analyses based on the Poisson model indicate that working arrays with 1,000 functional cells are possible if computing elements are replicated within each tactile cell.
Abstract: A new type of tactile sensor is presented that was designed to give a robot manipulation system information about contact between its hand and objects in the environment. We describe a device that ...

119 citations


Proceedings Article
01 Jan 1982

118 citations


Proceedings ArticleDOI
Prabhakar Goel, M. T. McMahon
01 Jan 1982
TL;DR: The ECIPT methodology additionally provides a mechanism for simplified tests of failures associated with interchip wiring and chip I/O connections.
Abstract: Electronic Chip-in-Place Test (ECIPT) is a design approach and a test methodology for VLSI packages containing multiple semi-conductor chips. Shift register latches are used in such a way that each chip on a package is accessible for testing from the package pins without in-circuit probing. A means is therefore provided, whereby tests generated for a chip can be reapplied at the package level. The ECIPT methodology additionally provides a mechanism for simplified tests of failures associated with interchip wiring and chip I/O connections.

Journal ArticleDOI
TL;DR: The benefits and the limitations of on-chip modularization and the use of spare elements are presented, and significant yield improvements are shown to be possible.
Abstract: In order to take full advantage of VLSI, new design methods are necessary to improve the yield and testability. Designs which incorporate redundancy to improve the yields of high density memory chips are well known. The goal of this paper is to motivate the extension of this technique to other types of VLSI logic circuits. The benefits and the limitations of on-chip modularization and the use of spare elements are presented, and significant yield improvements are shown to be possible.
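
The effect of spare elements on yield can be sketched under stated assumptions: defects follow a Poisson model, modules fail independently, and the spares and their switching logic add no defects or area overhead. None of these assumptions is stated in the abstract, and the function names are hypothetical; the paper's own analysis also covers the limitations that this sketch ignores.

```python
from math import comb, exp


def module_yield(defect_density, module_area):
    """Probability that one module is defect free under a Poisson defect model."""
    return exp(-defect_density * module_area)


def chip_yield(n_required, n_spares, y):
    """Probability that at least n_required of n_required + n_spares modules work."""
    total = n_required + n_spares
    return sum(
        comb(total, k) * y ** k * (1 - y) ** (total - k)
        for k in range(n_required, total + 1)
    )


y = module_yield(defect_density=2.0, module_area=0.05)  # illustrative numbers
print(chip_yield(8, 0, y), chip_yield(8, 2, y))
```

With no spares the expression reduces to y raised to the number of modules, so any improvement in the second figure comes entirely from the redundancy.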

Journal ArticleDOI
TL;DR: A new class of partitioned matrix algorithms is developed for possible VLSI implementation of large-scale matrix solvers that can be applied to solve arbitrarily large linear systems of equations in an iterative fashion.
Abstract: A new class of partitioned matrix algorithms is developed for possible VLSI implementation of large-scale matrix solvers. Fast matrix solvers are in high demand in signal/image processing and in many real-time and scientific applications. Only a few functional types of VLSI arithmetic chips are needed for submatrix computations after partitioning. This partitioned approach is not restricted by problem sizes and thus can be applied to solve arbitrarily large linear systems of equations in an iterative fashion. The following four matrix computations are shown systematically partitionable into submatrix operations, which are feasible for direct VLSI implementation.

Patent
22 Oct 1982
TL;DR: In this article, a hardware network or system is described for testing LSI and VLSI logic device design and system design by simulation utilizing individual gate functions using switching logic, random access memory, and a state table device.
Abstract: A hardware network or system is disclosed for testing LSI and VLSI logic device design and system design by simulation utilizing individual gate functions. The simulator system uses switching logic, random access memory, and a state table device to simulate particular test routines to test device design with functions which may appear in random or semi-random sequence.

Journal ArticleDOI
05 Oct 1982
TL;DR: The MIPS processor is a fast pipelined engine without pipeline interlocks, which attempts to achieve high performance with the use of a simplified instruction set, similar to those found in microengines.
Abstract: MIPS is a new single chip VLSI microprocessor. It attempts to achieve high performance with the use of a simplified instruction set, similar to those found in microengines. The processor is a fast pipelined engine without pipeline interlocks. Software solutions to several traditional hardware problems, such as providing pipeline interlocks, are used.

Journal ArticleDOI
TL;DR: The design of a dictionary machine that is suitable for VLSI implementation that supports the operations of SEARCH, INSERT, DELETE, and EXTRACTMIN on an arbitrary ordered set is presented.
Abstract: We present the design of a dictionary machine that is suitable for VLSI implementation, and we discuss how to realize this implementation efficiently. The machine supports the operations of SEARCH, INSERT, DELETE, and EXTRACTMIN on an arbitrary ordered set. Each of these operations takes time O(log n), where n is the number of entries present when the operation is performed. Moreover, arbitrary sequences of these instructions can be pipelined through the machine at a constant rate (i.e., independent of n and the capacity of the machine). The time O(log n) is an improvement over previous VLSI designs of dictionary machines which require time O(log N) per operation, where N is the maximum number of keys that can be stored.

Journal ArticleDOI
Taylor
TL;DR: The moduli size limitation is overcome using VLSI technology, special architectures, and moduli choice, resulting in a residue multiplier having a 48–72 bit dynamic range, capable of performing 10M multiplication/s.
Abstract: Recently, residue arithmetic has received increased attention in the open literature. Using table lookup methods and high-speed memory, modular arithmetic has been demonstrated. However, the memory size limitation of ECL, bipolar, and high-speed MOS limits the admissible size of the moduli used in the numbering system. In this paper the moduli size limitation is overcome using VLSI technology, special architectures, and moduli choice. A residue multiplier having a 48–72 bit dynamic range, capable of performing 10M multiplication/s is reported.
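
A minimal sketch of residue multiplication, assuming three small pairwise-coprime moduli chosen for illustration (not the moduli used in the paper). Each residue channel multiplies independently, which is what makes small lookup tables per modulus attractive; the result is recovered with the Chinese remainder theorem.

```python
from math import prod

MODULI = (29, 31, 32)  # hypothetical pairwise-coprime moduli; range is 29*31*32


def to_rns(x):
    """Encode a non-negative integer as its residues."""
    return tuple(x % m for m in MODULI)


def rns_mul(a, b):
    """Multiply channel by channel; no carries cross between moduli."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues):
    """Decode with the Chinese remainder theorem."""
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        partial = big_m // m
        total += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return total % big_m


print(from_rns(rns_mul(to_rns(123), to_rns(45))))  # 5535
```

The product is exact only while it stays below the product of the moduli, which is why the admissible moduli size bounds the dynamic range.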

Journal ArticleDOI
K. Ohta, K. Yamada, K. Shimizu, Y. Tarui
TL;DR: In this paper, a new one-transistor, one-capacitor RAM cell structure called a Quadruply Self-Aligned Stacked High Capacitance (QSA SHC) was proposed as a basic cell for a future one-million-bit VLSI memory.
Abstract: A new one-transistor, one-capacitor RAM cell structure called a Quadruply Self-Aligned Stacked High Capacitance (QSA SHC) RAM is proposed as a basic cell for a future one-million-bit VLSI memory. This cell consists of a QSA MOSFET and a Ta2O5 capacitor stacked on it. By this cell, the ultimate cell area 3F × 2F can be realized with sufficient operating margin. Here, F is the minimum feature size. The basic cell was fabricated and its operation was experimentally verified. The leakage current of Ta2O5 film was small enough for the storage capacitor dielectric. Using a 3F × 4F cell and a 4F pitch sense amplifier, a one-million-bit memory was designed with a 2-µm rule. A cell size of 6.5 × 8 µm², and a chip size of 9.2 × 9.5 mm² were obtained. The access time, neglecting the RC time constant of the word line, was estimated to be about 170 ns. Based on this design, it is argued that a future one-million-bit memory can be realized by QSA SHC technology with a 2-1-µm process. The mask set of the 1-Mbit RAM was actually fabricated by an electron-beam mask maker. A photomicrograph of the 1-Mbit RAM chip patterned by the mask set is shown. This chip was patterned not to get an operating sample but to show an actual chip image of the future 1-Mbit RAM. The area of each circuit block including storage array can be seen in this chip image.


Journal ArticleDOI
TL;DR: The Geometry Engine is a special-purpose VLSI processor for computer graphics for accomplishing three basic operations in computer graphi...
Abstract: The Geometry Engine[1] is a special-purpose VLSI processor for computer graphics. It is a four-component vector, floating-point processor for accomplishing three basic operations in computer graphi...

Book
01 Jan 1982

Journal ArticleDOI
TL;DR: A quasi-physical short channel MOSFET current model is derived and a "process box" based on the statistical variation of parameters is extracted from a completely automated device characterization system to allow the circuit response to be simulated across the process window.
Abstract: VLSI circuit simulation requires computationally efficient MOSFET models. In this paper, VLSI circuit simulator models for the active device and some important passive devices are described. A quasi-physical short channel MOSFET current model is derived. This current model contains both above-threshold and subthreshold components. The values of the model parameter are extracted automatically from measured I-V data. The reduction in process information in this representation is shown to be tolerable using a proper quantization of the geometry and device type space. Narrow width effect is also included. A charge conserving MOSFET capacitor model is also given. The importance of the parasitic devices on VLSI circuit is shown and a model for the fringing capacitance due to finite gate thickness is introduced. A "process box" based on the statistical variation of parameters is extracted from a completely automated device characterization system. Experimental results indicate that the width and length are independent random variables. This statistical information allows the circuit response to be simulated across the process window.

BookDOI
01 Jan 1982
TL;DR: The book is divided into nine sections, ranging from Invited Papers, Models of Computation, and Layout Theory and Algorithms to Systems and Processors, which contains papers describing frameworks for entire systems.
Abstract: The book is divided into nine sections: Invited Papers. Six distinguished researchers from industry and academia presented invited papers. Models of Computation. The papers in this section deal with abstracting the properties of VLSI circuits into models that can be used to analyze the chip area, time or energy required for a particular computation. Complexity Theory. This section shows how computations can be analyzed to obtain bounds on the resources (chip area, time, energy) required to perform some computation. The last paper in this section is a light-hearted reminder that complexity theories must acknowledge reality. Layout Theory and Algorithms. Papers in this section describe ways to route wires that connect together different circuits on a chip. This topic is of importance in computer-aided design, but also relates to the complexity of circuit layouts. Languages and Verification. This section presents several results on the specification and verification of circuits and of entire systems. The large number of communicating processes in some VLSI architectures must be designed methodically to ensure proper operation. Special-Purpose Architectures. This section deals with systolic computing architectures and their application to areas such as signal processing. Multiplier Designs. The problem of designing an efficient multiplier is of both practical and theoretical interest. An important application for multipliers is in signal processing. Processors. Two papers in this section describe new designs for single-chip general-purpose computers whose architecture is influenced by VLSI design opportunities. Systems and Processors. This section contains papers describing frameworks for entire systems, such as parallel processing arrays and content-addressable memories.

Journal ArticleDOI
TL;DR: In this article, a simple empirical relation for the calculation of the capacitance of interconnection lines in MOSFET VLSI, including edge effects, is presented, which gives approximate results compared to two-dimensional computer calculations.
Abstract: A simple empirical relation for the calculation of the capacitance of interconnection lines in MOSFET VLSI, including edge effects, is presented. The equation gives approximate results compared to two-dimensional computer calculations.

Book
01 Jan 1982
TL;DR: This dissertation exploits VLSI to design a signal processing chip capable of efficiently implementing such popular algorithms as the discrete Fourier transform, ladder filters and associated matrix algebra operations, mapping algorithms onto architectures by maintaining a close link with the theoretical basis of each signal processing method.
Abstract: The advent of the Very Large Scale Integration (VLSI) technology has provided the ability to construct large systems on a single silicon chip. This dissertation is concerned with exploiting this ability to design a powerful signal processing chip capable of efficiently implementing such popular algorithms as the discrete Fourier transform, ladder filters and associated matrix algebra operations. The latter include Givens rotations and Cholesky factorization. The goal of the present work is to efficiently map algorithms onto architectures by maintaining a close link with the theoretical basis of a particular signal processing method. It is shown that all of the algorithms considered can be cast into a mathematical framework involving generalized vector rotations. Such rotation operations provide a natural description of the algorithms and the computational complexity measured in terms of these elementary operations is much lower than in terms of the usual measure of total number of multiplications. Thus, unlike present day signal processing computers which emphasize rapid multiplication, the signal processing architectures in this thesis are based on the ability to perform vector rotations in generalized coordinate systems. It is shown that the CORDIC algorithm of Volder provides a convenient implementation of vector rotations with only simple components such as adders, registers and shifters. Unfortunately, throughput is severely compromised owing to the need for performing special operations to account for the limited region of convergence and spurious scale constants inherent to the method. New techniques to circumvent these problems with no additional hardware and only a marginal speed penalty are described. Further speed enhancements through the use of a newly developed method known as hybrid CORDIC are discussed. 
Additionally, floating point CORDIC (FLORDIC) algorithms that are conceptually simpler than their fixed point counterparts are developed and the connection of CORDIC to the convergence computation methods is shown. The architecture of a dual CORDIC block chip is described for a target application of real time speech analysis. The resulting chip is shown to have a higher throughput per area than conventional chips based on fast multiplications. This is attributed to the close match of the present chip to the algorithms. Large mesh connected processor architectures for matrix factorization are developed which are also closely matched to the algorithms. Individual processing elements in the mesh are based on CORDIC operations, in fact on the aforementioned signal processing chip. Finally, a new technique for signal detection in additive Gaussian noise is developed with a view towards ease of implementation. It is based on ladder filters and may be implemented using the signal processing chip mentioned above.
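
The basic CORDIC rotation referred to in the abstract can be sketched as follows. This is the textbook circular rotation mode simulated in floating point, not the dissertation's improved, hybrid, or floating point variants; the function name and iteration count are assumptions. The final division by the accumulated gain and the restriction on the input angle correspond to the scale constant and limited convergence region mentioned above.

```python
import math


def cordic_rotate(x, y, angle, iterations=32):
    """Rotate (x, y) by `angle` radians using shift-and-add style steps.

    Converges only for |angle| up to roughly 1.74 rad (sum of the step angles).
    """
    z = angle
    gain = 1.0
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        step = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        x, y = x - d * y * step, y + d * x * step
        z -= d * math.atan(step)  # table lookup in hardware
        gain *= math.sqrt(1.0 + step * step)
    # Each micro-rotation stretches the vector; undo the accumulated scale.
    return x / gain, y / gain


cx, cy = cordic_rotate(1.0, 0.0, math.pi / 6)
print(cx, cy)  # close to cos(pi/6), sin(pi/6)
```

In hardware each iteration needs only adders, shifters and a small angle table, which is why the method suits the vector-rotation formulation described in the abstract.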

Patent
12 Jul 1982
TL;DR: In this article, a process for producing VLSI (very large scale integrated) circuits employs techniques of self-aligned gates and contacts for FET devices and for both diffused conducting lines in the substrate and polysilicon conducting lines situated on isolating field oxide formed on the substrate.
Abstract: A process for producing VLSI (very large scale integrated) circuits employs techniques of self-aligned gates and contacts for FET devices and for both diffused conducting lines in the substrate and polysilicon conducting lines situated on isolating field oxide formed on the substrate. Mask alignment tolerances are increased and rendered non-critical. The use of materials in successive layers having different etch characteristics permits selective oxidation of desired portions only of the structure without need for masking and removal of selected material from desired locations by batch removal processes again without use of masking. There results VLSI circuits having increased density and reliability.

Journal ArticleDOI
Yasuura, Takagi, Yajima
TL;DR: A new hardware algorithm of parallel enumeration sorting circuits whose processing time is linearly proportional to the number of data for sorting is designed, suitable for VLSI implementation.
Abstract: We propose a new parallel sorting scheme, called the parallel enumeration sorting scheme, which is suitable for VLSI implementation. This scheme can be introduced to conventional computer systems without changing their architecture. In this scheme, sorting is divided into two stages, the ordering process and the rearranging one. The latter can be efficiently performed by central processing units or intelligent memory devices. For implementations of the ordering process by VLSI technology, we design a new hardware algorithm of parallel enumeration sorting circuits whose processing time is linearly proportional to the number of data for sorting. Data are serially transmitted between the sorting circuit and memory devices and the total communication between them is minimized. The basic structure used in the algorithm is called a bus connected cellular array structure with pipeline and parallel processing. The circuit consists of a linear array of one type of simple cell and two buses connecting all cells for efficient global communications in the circuit. The sorting circuit is simple, regular and small enough for realization by today's VLSI technology. We discuss several applications of the sorting circuit and evaluate its performance.
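
The two stages described in the abstract can be sketched in software. This is a sequential model of enumeration sorting in general, assuming ties are broken by input position; it does not model the bus connected cellular array or its linear-time pipelining, and the function name is hypothetical.

```python
def enumeration_sort(data):
    """Sort by computing each item's rank, then placing items by rank."""
    n = len(data)
    # Ordering stage: the rank of item i is the number of items that must
    # precede it (ties broken by original position so ranks are distinct).
    ranks = [
        sum(
            1
            for j in range(n)
            if data[j] < data[i] or (data[j] == data[i] and j < i)
        )
        for i in range(n)
    ]
    # Rearranging stage: move every item to the slot given by its rank.
    out = [None] * n
    for i, r in enumerate(ranks):
        out[r] = data[i]
    return out


print(enumeration_sort([3, 1, 2, 1]))  # [1, 1, 2, 3]
```

Each rank depends only on comparisons against the other inputs, so all ranks can be computed concurrently by identical cells, which is the property the hardware scheme exploits.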

Journal ArticleDOI
TL;DR: In this article, the physics of the TI RAM cell are discussed as well as circuit considerations for its implementation into an array, and an experimental array (64 rows by 8 columns), representing a cross section of a 16K dRAM, with on-chip decoding and sensing has been fabricated using the TI dRAM cell as the memory element.
Abstract: The TI dRAM cell, a MOSFET with two dynamically programmable threshold states, is very attractive for VLSI dRAM's because of its potential 3× density advantage over the one-transistor, one-capacitor (1-T) cell, 10× lower leakage at high temperatures compared to the 1-T cell, and its immunity to soft errors. Linear scaling of the 1-T cell by a factor k reduces the available signal by ~k³, whereas the charging current for the TI RAM cell is invariant to scaling since the W/L ratio remains constant allowing it to scale to higher density. An experimental array (64 rows by 8 columns), representing a cross section of a 16K dRAM, with on-chip decoding and sensing has been fabricated using the TI RAM cell as the memory element. Using 4-µm design rules, the cell size was 204 µm² due to pitch requirements for the decoder and sense amplifier. This compares with 170-200 µm² for the 1-T cell using 2.5-µm design rules being fabricated in the 64K dRAM's today. The array which is compatible with 5-V-only operation was designed to provide diagnostic capability rather than speed and shows the data can be accessed 85-100 ns after the $\overline{\text{CAS}}$ signal. In this paper, the physics of the TI RAM cell are discussed as well as circuit considerations for its implementation into an array.

Journal ArticleDOI
TL;DR: Conditions are outlined under which propagation delays in VLSI circuits can be achieved that are logarithmic in the wire lengths, imposed by area requirements and the velocity of light.
Abstract: Conditions are outlined under which propagation delays in VLSI circuits can be achieved that are logarithmic in the wire lengths. These conditions are imposed by area requirements and the velocity of light.