
Showing papers on "Very-large-scale integration" published in 1995


Journal ArticleDOI
01 Feb 1995
TL;DR: An overview of architectures for VLSI implementations of video compression schemes as specified by standardization committees of the ITU and ISO is presented.
Abstract: The paper presents an overview of architectures for VLSI implementations of video compression schemes as specified by standardization committees of the ITU and ISO. VLSI implementation strategies are discussed and split into function-specific and programmable architectures. As examples of the function-oriented approach, alternative architectures for DCT and block matching are evaluated. Dedicated decoder chips are also included. Programmable video signal processors are classified and specified as homogeneous and heterogeneous processor architectures. Architectures are presented for reported design examples from the literature. Heterogeneous processors outperform homogeneous processors because of adaptation to the requirements of special subtasks by dedicated modules. The majority of heterogeneous processors incorporate dedicated modules for high-performance subtasks of high regularity, such as DCT and block matching. By normalization to a fictive 1.0 µm CMOS process, typical linear relationships between silicon area and throughput rate have been determined for the different architectural styles. This relationship indicates a figure of merit for silicon efficiency.

362 citations


Book
01 Jan 1995
TL;DR: This book covers low-voltage, low-power VLSI design, from process technology and device modeling through CMOS and BiCMOS circuit design, random access memory circuits, and subsystem design to design methodology.
Abstract: Preface. 1. Low-Power VLSI Design: An Overview. 2. Low-Voltage Process Technology. 3. Low-Voltage Device Modeling. 4. Low-Voltage Low-Power VLSI CMOS Circuit Design. 5. Low-Voltage VLSI BiCMOS Circuit Design. 6. Low-Power CMOS Random Access Memory Circuits. 7. VLSI CMOS Subsystem Design. 8. Low-Power VLSI Design Methodology. References. Index.

327 citations


Proceedings ArticleDOI
01 Jan 1995
TL;DR: This work surveys state-of-the-art optimization methods that target low power dissipation in VLSI circuits and considers the circuit, logic, architectural and system levels.
Abstract: We survey state-of-the-art optimization methods that target low power dissipation in VLSI circuits. Optimizations at the circuit, logic, architectural and system levels are considered.

257 citations


BookDOI
01 Jan 1995

249 citations


Journal Article
TL;DR: In this paper, a frequency-synthesizing, all-digital phase-locked loop (ADPLL) is integrated with a 0.5 µm CMOS microprocessor.
Abstract: A frequency-synthesizing, all-digital phase-locked loop (ADPLL) is fully integrated with a 0.5 µm CMOS microprocessor. The ADPLL has a 50-cycle phase lock, has a gain mechanism independent of process, voltage, and temperature, and is immune to input jitter. A digitally-controlled oscillator (DCO) forms the core of the ADPLL and operates from 50 to 550 MHz, running at 4× the reference clock frequency. The DCO has 16 b of binarily weighted control and achieves LSB resolution under 500 fs.

189 citations


Journal ArticleDOI
TL;DR: This paper presents the first VLSI single chip dedicated to the computation of direct or inverse fast Fourier transforms of up to 8192 complex points, a function which could therefore be introduced in the coming years in digital terrestrial TV receivers at low cost.
Abstract: Large-scale single-frequency networks are now being considered in Europe as very promising network topologies to achieve drastic savings in spectrum usage for digital terrestrial television transmission. Such networks are possible using the COFDM system, with large guard intervals (more than 200 µs) to absorb long echoes. In order to limit the spectral efficiency loss to about 20%, very long fast Fourier transforms (up to 8 K complex points) have to be performed in real time for the demodulation of every COFDM symbol (every 1 ms). This paper presents the first VLSI single chip dedicated to the computation of direct or inverse fast Fourier transforms of up to 8192 complex points. Due to its pipelined architecture, it can perform an 8 K FFT every 400 µs and a 1 K FFT every 50 µs. All the storage is on-chip, so that no external memories are required. A new internal result scaling technique, called convergent block floating point, has been introduced in order to minimize the required storage for a given quantization noise. The chip, 1 cm² large with 1.5 million transistors, has been designed in a 3.3 V, 0.5 µm triple-level metal CMOS process and is fully functional. The 8 K complex FFT function could therefore be introduced in the coming years in digital terrestrial TV receivers at low cost.

187 citations
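
The timing figures quoted in the FFT abstract above can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not anything taken from the paper: the constant and function names are hypothetical, and the only inputs are the numbers stated in the abstract (8 K points in 400 µs, 1 K points in 50 µs, one COFDM symbol about every 1 ms).

```python
# Figures quoted in the abstract; all names here are illustrative assumptions.
FFT_POINTS_8K = 8192
FFT_TIME_8K_S = 400e-6      # one 8 K FFT every 400 microseconds
FFT_POINTS_1K = 1024
FFT_TIME_1K_S = 50e-6       # one 1 K FFT every 50 microseconds
SYMBOL_PERIOD_S = 1e-3      # one COFDM symbol roughly every 1 ms


def sustained_rate(points: int, seconds: float) -> float:
    """Return the number of complex points processed per second."""
    return points / seconds


rate_8k = sustained_rate(FFT_POINTS_8K, FFT_TIME_8K_S)
rate_1k = sustained_rate(FFT_POINTS_1K, FFT_TIME_1K_S)

# How many 8 K transforms fit into one symbol period.
headroom = SYMBOL_PERIOD_S / FFT_TIME_8K_S

print(f"8 K mode: {rate_8k / 1e6:.2f} M complex points/s")
print(f"1 K mode: {rate_1k / 1e6:.2f} M complex points/s")
print(f"8 K transforms per symbol period: {headroom:.1f}")
```

Both modes work out to about 20.48 million complex points per second, which is what one would expect from a pipeline that accepts points at a fixed rate regardless of transform size, and the 8 K transform time leaves a factor of 2.5 of margin against the 1 ms symbol period.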


Journal ArticleDOI
TL;DR: Automatic mapping tools for Triptych, an FPGA architecture with improved logic density and performance over commercial FPGAs, and extensions to these algorithms for mapping asynchronous circuits to Montage, the first FPGA architecture to completely support asynchronous and synchronous interface applications, are described.
Abstract: Field-programmable gate arrays (FPGAs) are becoming an increasingly important implementation medium for digital logic. One of the most important keys to using FPGAs effectively is a complete, automated software system for mapping onto the FPGA architecture. Unfortunately, many of the tools necessary require different techniques than traditional circuit implementation options, and these techniques are often developed specifically for only a single FPGA architecture. In this paper we describe automatic mapping tools for Triptych, an FPGA architecture with improved logic density and performance over commercial FPGAs. These tools include a simulated-annealing placement algorithm that handles the routability issues of fine-grained FPGAs, and an architecture-adaptive routing algorithm that can easily be retargeted to other FPGAs. We also describe extensions to these algorithms for mapping asynchronous circuits to Montage, the first FPGA architecture to completely support asynchronous and synchronous interface applications.

177 citations


Journal ArticleDOI
TL;DR: A pattern-independent, linear time algorithm (iMax) that estimates at every contact point, an upper bound envelope of all possible current waveforms that result by the application of different input patterns to the circuit is proposed.
Abstract: Currents flowing in the power and ground (P&G) buses of CMOS digital circuits affect both circuit reliability and performance by causing excessive voltage drops. Excessive voltage drops manifest themselves as glitches on the P&G buses and cause erroneous logic signals and degradation in switching speeds. Maximum current estimates are needed at every contact point in the buses to study the severity of the voltage drop problems and to redesign the supply lines accordingly. These currents, however, depend on the specific input patterns that are applied to the circuit. Since it is prohibitively expensive to enumerate all possible input patterns, this problem has, for a long time, remained largely unsolved. In this paper, we propose a pattern-independent, linear time algorithm (iMax) that estimates at every contact point, an upper bound envelope of all possible current waveforms that result by the application of different input patterns to the circuit. The algorithm is extremely efficient and produces good results for most circuits as is demonstrated by experimental results on several benchmark circuits. The accuracy of the algorithm can be further improved by resolving the signal correlations that exist inside a circuit. We also present a novel partial input enumeration (PIE) technique to resolve signal correlations and significantly improve the upper bounds for circuits where the bounds produced by iMax are not tight. We establish with extensive experimental results that these algorithms represent a good time-accuracy trade-off and are applicable to VLSI circuits.

156 citations


Patent
28 Feb 1995
TL;DR: In this article, an emulation modeling apparatus (54) comprises a combination of a device under simulation (48) to be emulated and means for keeping the VLSI circuit in a quiescent state at normal operating speeds and in a normal operating sequence so as to allow dual access to the emulation modeling apparatus without loss of data or accuracy of functions.
Abstract: An emulation modeling apparatus (54) comprises a combination of a device under simulation (48) to be emulated and means for keeping the device under simulation (48) in a quiescent state at normal operating speeds and in a normal operating sequence so as to allow dual access to the emulation modeling apparatus (54) without loss of data or accuracy of functions. One access is from a host simulation environment (26) while the other is from a model debug user interface (20) where internal architecturally visible registers and status are available to the user for greater debug control on the simulated subsystem within simulation environment (26). Specifically, any of a wide variety of physical VLSI circuits (48) to be modeled is kept in a quiescent state after power-on by a device control (50). It is then accessed through simulation means by the simulated subsystem within a simulation environment (26), to change the architecturally visible internal state of the VLSI circuit (48). Control (50) brings VLSI circuit (48) out of the quiescent state and submits the requested simulated access. After taking the response, control (50) returns VLSI circuit (48) again to its quiescent state so as to keep its internal state current. The response is sent back to simulation environment (26) to update the simulated subsystem. Independently, any user request for accessing the architecturally visible internal state of the circuit is gathered by model debug and user interface (20). Interface (20) enables control (50) to bring VLSI circuit (48) out of the quiescent state and to submit the user request access. Subsequently, control (50) monitors the response and returns VLSI circuit (48) to its quiescent state so as to maintain the internal state of VLSI circuit (48) current. Control (50) then sends the response back to user interface (20). VLSI circuit (48) thus is always kept ready and current for the next request, either from simulation environment (26) or from user interface (20), without having to reset it. If any user-defined breakpoint condition is met during the simulated accesses on the VLSI circuit (48), this information is forwarded by control (50) to simulation environment (26) for stopping the simulation and to user interface (20) to update the debug screen accordingly.

134 citations


Journal ArticleDOI
TL;DR: In this paper, the integration of GaAs-AlGaAs multiple quantum well modulators directly on top of active silicon CMOS circuits is presented, which enables optoelectronic VLSI circuits to be achieved and also allows the design and optimization of the CMOS circuit to proceed independently of the placement and the bonding of surface normal optical modulators to the circuit.
Abstract: We accomplish the integration of GaAs-AlGaAs multiple quantum well modulators directly on top of active silicon CMOS circuits. This enables optoelectronic VLSI circuits to be achieved and also allows the design and optimization of the CMOS circuits to proceed independently of the placement and the bonding of surface-normal optical modulators to the circuit. Using this technique, we demonstrate operation of a 0.8 micron CMOS transimpedance receiver-transmitter circuit at 375 Mb/s.

133 citations


Journal ArticleDOI
TL;DR: Simulations and measurements are used to study details of interconnect and insulator electrical properties, pulse propagation, and CPU cycle-time estimation, with particular attention to potential advantages of advanced materials and processes for wiring of high-performance CMOS microprocessors.
Abstract: We examine electrical performance issues associated with advanced VLSI semiconductor on-chip interconnections or “interconnects.” Performance can be affected by wiring geometry, materials, and processing details, as well as by processor-level needs. Simulations and measurements are used to study details of interconnect and insulator electrical properties, pulse propagation, and CPU cycle-time estimation, with particular attention to potential advantages of advanced materials and processes for wiring of high-performance CMOS microprocessors. Detailed performance improvements are presented for migration to copper wiring, low-ε dielectrics, and scaled-up interconnects on the final levels for long-line signal propagation.

Journal ArticleDOI
TL;DR: Experimental results for a communication architecture tailored for analog VLSI perceptive systems satisfactorily support the theoretical basis upon which the system was constructed, and extensions to the communication architecture are presented.
Abstract: A communication architecture tailored for analog VLSI perceptive systems is proposed. Information is generated on a transmitter array of cells each driving a pulse generator. The resulting pulse-frequency modulated signals are transmitted through the nonarbitered, asynchronous access of pulses to a common bus. Pulses are decoded and accumulated in a receiver chip and the mapping of the activity distribution of the transmitter onto the receiver is achieved. One possible implementation of these principles is presented. The circuit description of all blocks is given and experimental results are shown: they satisfactorily support the theoretical basis upon which the system was constructed. Extensions to the communication architecture are finally presented.

Journal ArticleDOI
01 Jan 1995
TL;DR: A fully pipelined single chip VLSI architecture for implementing the JPEG baseline image compression standard that exploits the principles of pipelining and parallelism to the maximum extent in order to obtain high speed and throughput.
Abstract: In this paper, we describe a fully pipelined single chip VLSI architecture for implementing the JPEG baseline image compression standard. The architecture exploits the principles of pipelining and parallelism to the maximum extent in order to obtain high speed and throughput. The architecture for discrete cosine transform and the entropy encoder are based on efficient algorithms designed for high speed VLSI implementation. The entire architecture can be implemented on a single VLSI chip to yield a clock rate of about 100 MHz which would allow an input rate of 30 frames per second for 1024×1024 color images.
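
The link between the quoted clock rate of about 100 MHz and the quoted input rate of 30 frames per second can be made concrete with a short calculation. This is a rough sketch under two assumptions that the abstract does not state: each color pixel carries three full-resolution components, and the pipeline accepts one component sample per clock cycle. All names are hypothetical.

```python
# Figures quoted in the abstract.
WIDTH = 1024
HEIGHT = 1024
FRAMES_PER_SECOND = 30
CLOCK_HZ = 100e6            # "about 100 MHz"

# Assumption (not from the abstract): three full-resolution color
# components per pixel, one component sample consumed per clock cycle.
COMPONENTS_PER_PIXEL = 3

pixels_per_second = WIDTH * HEIGHT * FRAMES_PER_SECOND
samples_per_second = pixels_per_second * COMPONENTS_PER_PIXEL

# Fraction of clock cycles needed to keep up with the input stream.
utilization = samples_per_second / CLOCK_HZ

print(f"pixels per second:  {pixels_per_second:,}")
print(f"samples per second: {samples_per_second:,}")
print(f"clock utilization:  {utilization:.1%}")
```

Under these assumptions the input stream is roughly 94.4 million samples per second, which fits within a 100 MHz clock with a few percent to spare.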

Journal ArticleDOI
TL;DR: A new approach for realistic worst-case analysis of VLSI circuit performances and a novel methodology for circuit performance optimization that is formulated as a constrained multicriteria optimization are presented.
Abstract: In this paper, we present a new approach for realistic worst-case analysis of VLSI circuit performances and a novel methodology for circuit performance optimization. Circuit performance measures are modeled as response surfaces of the designable and uncontrollable (noise) parameters. Worst-case analysis proceeds by first computing the worst-case circuit performance value and then determining the worst-case noise parameter values by solving a nonlinear programming problem. A new circuit optimization technique is developed to find an optimal design point at which all of the circuit specifications are met under worst-case conditions. This worst-case design optimization method is formulated as a constrained multicriteria optimization. The methodologies described in this paper are applied to several VLSI circuits to demonstrate their accuracy and efficiency.

Journal ArticleDOI
TL;DR: Several HF compensation architectures, such as parallel, Miller, multipath nested Miller, and multipath hybrid nested Miller are presented, which approach the physical limitations of bandwidth, gain and power consumption.
Abstract: VLSI operational amplifier cells that approach the physical limitations of bandwidth, gain and power consumption are here described. To this purpose several HF compensation architectures are presented, such as parallel, Miller, multipath nested Miller, and multipath hybrid nested Miller.

Journal ArticleDOI
TL;DR: Triptych is presented, an FPGA architecture designed to achieve improved logic density with competitive performance by allowing a per-mapping tradeoff between logic and routing resources, and with a routing scheme designed to match the structure of typical circuits.
Abstract: Field-programmable gate arrays (FPGAs) are an important implementation medium for digital logic. Unfortunately, they currently suffer from poor silicon area utilization due to routing constraints. In this paper we present Triptych, an FPGA architecture designed to achieve improved logic density with competitive performance. This is done by allowing a per-mapping tradeoff between logic and routing resources, and with a routing scheme designed to match the structure of typical circuits. We show that, using manual placement, this architecture yields a logic density improvement of up to a factor of 3.5 over commercial FPGAs, with comparable performance. We also describe Montage, the first FPGA architecture to fully support asynchronous and synchronous interface circuits.

Proceedings ArticleDOI
27 Mar 1995
TL;DR: A novel bipartitioning algorithm that includes both new and pre-existing techniques is presented and produces results that are at least 17% better than the state-of-the-art while also being efficient in run time.
Abstract: Logic partitioning is an important issue in VLSI CAD, and has been an active area of research for at least the last 25 years. Numerous approaches have been developed and many different techniques have been combined for a wide range of applications. In this paper, we examine many of the existing techniques for logic bipartitioning and present a methodology for determining the best mix of approaches. The result is a novel bipartitioning algorithm that includes both new and pre-existing techniques. Our algorithm produces results that are at least 17% better than the state-of-the-art while also being efficient in run time.

Journal ArticleDOI
TL;DR: The state-of-the-art salicide and polycide processes are reviewed, with emphasis on work at IBM, and the limitations that pertain to future implementations in high-performance VLSI circuit applications are discussed.
Abstract: As the minimum VLSI feature size continues to scale down to the 0.1–0.2-µm regime, the need for low-resistance local interconnections will become increasingly critical. Although reduction in the MOSFET channel length will remain the dominant factor in achieving higher circuit performance, existing local interconnection materials will impose greater than acceptable performance limitations. We review the state-of-the-art salicide and polycide processes, with emphasis on work at IBM, and discuss the limitations that pertain to future implementations in high-performance VLSI circuit applications. A brief review of various silicide-based and tungsten-based approaches for forming local interconnections is presented, along with a more detailed description of a tungsten-based “damascene” local interconnection approach.

Book
01 Jan 1995
TL;DR: This book covers paradigms and models for neural networks, VLSI design technology (including design methodologies of VLSI neural networks), and applications and system prototyping, with selected commercial products from industry.
Abstract: Preface Part I: Paradigms and Models 1 Introduction 2 Artificial Neural Network Algorithms 3 Other Computational Intelligence Topics 4 Biologically-Inspired Vision Processing 5 Cellular Neural Networks 6 Paralleled Hardware Annealing for Optimal Solutions Part II: VLSI Design Technology 7 Design Methodologies of VLSI Neural Networks 8 Analog VLSI Building Blocks 9 Digital VLSI Neuroprocessors Part III: Applications and System Prototyping 10 Back-Propagation Neural Networks 11 Self-Organization Neural Networks 12 Advanced Vision Chips and Systems 13 Photonic Neural Networks 14 Smart-Pixel, Cellular Neural Network, and Chaotic Chips 15 Various Subsystem and System Construction Examples 16 Selected Commercial Products from Industry A: Spice CMOS Level-2, Level-4, and BSIM_Plus Model Files B: Basic VLSI Building Blocks C: Current-Mode Circuits for Piecewise-Linear Functions D: Selected Software Listing Subject Index

Proceedings ArticleDOI
01 Jan 1995
TL;DR: The current state-of-the-art in parallel logic simulation, including parallel simulation techniques, factors that impact simulation performance, performance results to date, and the directions currently being pursued by the research community are described.
Abstract: Design verification via simulation is an important component in the development of digital systems. However, with continuing increases in the capabilities of VLSI systems, the simulation task has become a significant bottleneck in the design process. As a result, researchers are attempting to exploit parallel processing techniques to improve the performance of VLSI logic simulation. This tutorial describes the current state-of-the-art in parallel logic simulation, including parallel simulation techniques, factors that impact simulation performance, performance results to date, and the directions currently being pursued by the research community.

Patent
07 Apr 1995
TL;DR: In this paper, a temperature sensing circuit directly measures the chip temperature, producing a temperature output signal, and a power management circuit, which is connected to the temperature sensor and to the chip logic, responds to the output signal to either stop or modify the operating frequency of the clock signal, depending upon the state of the control signal.
Abstract: Chip logic, a frequency multiplication and/or division, a temperature sensing circuit, and a power management circuit, are integrated on a very large scale integrated (VLSI) circuit chip. The temperature sensing circuit directly measures the chip temperature, producing a temperature output signal. The power management circuit, which is connected to the temperature sensing circuit and to the chip logic, responds to the temperature output signal and to a functional state of the chip logic to generate a control signal to the PLL. The PLL responds to the control signal to either stop the clock signal or modify the operating frequency of the clock signal, depending upon the state of the control signal.

Proceedings ArticleDOI
01 Jan 1995
TL;DR: Given a synthesized network, the algorithm modifies it minimally to realize a new specification, so that a large part of the engineering effort can be preserved.
Abstract: In the process of VLSI design, specifications are often changed. It is desirable that such changes will not lead to a very different design so that a large part of engineering effort can be preserved. We consider synthesis algorithms for handling such engineering changes. Given a synthesized network, our algorithm modifies it minimally to realize a new specification.

Proceedings ArticleDOI
TL;DR: According to the evaluation results based on an FPGA implementation, the hardware portion of these functionalities can be executed within 250 ns and the task scheduling can be performed within 750 ns simultaneously, which is about 6 to 50 times faster than a software implementation.
Abstract: This paper proposes a new approach to realize a very high performance real-time OS using VLSI technology. In this method, quick and steady response can be guaranteed by implementing basic operations of a real-time OS as a peripheral chip (Silicon TRON) to be connected to general purpose microprocessors. In order to confirm the effectiveness of this method, most basic system calls of µITRON have been designed using an HDL. Synthesis results using a 0.8 µm CMOS technology show that the most important part of the system calls can be realized as a VLSI chip. According to the evaluation results based on an FPGA implementation, the hardware portion of these functionalities can be executed within 250 ns and the task scheduling can be performed within 750 ns simultaneously, which is about 6 to 50 times faster than a software implementation. Accordingly, very high performance real-time systems can be realized by the proposed method.

Journal ArticleDOI
TL;DR: An exact algorithm is developed for selecting partial scan flip-flops to break all feedback cycles, solving the MFVS problem exactly for large, practical instances.
Abstract: We develop an exact algorithm for selecting partial scan flip-flops to break all feedback cycles. We also permit the option of not breaking self-loops. The key ideas that allow us to solve this complex problem exactly for large, practical instances are: an MFVS-preserving graph transformation, a partitioning scheme used in the branch and bound procedure, and pruning techniques based on an integer linear programming formulation of the MFVS problem. We have obtained optimum solutions for all ISCAS'89 benchmark circuits and several production VLSI circuits within reasonable computation time. For example, the optimal number of scan flip-flops required to eliminate all cycles except self-loops in the circuit s38417 is 374. An optimal solution was obtained in 32 CPU seconds on a SUN Sparc 2 workstation.
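
To make the problem statement in the abstract above concrete, the following sketch finds a minimum feedback vertex set on a tiny directed graph by exhaustive search, with the option of leaving self-loops unbroken. It is deliberately naive and is not the paper's method: the graph transformation, branch and bound partitioning, and ILP-based pruning described in the abstract are not reproduced, and the function names and example graph are hypothetical. The graph is assumed to be a dict that lists every node as a key.

```python
from itertools import combinations


def has_cycle(graph, removed, ignore_self_loops=False):
    """Depth-first cycle check on a directed graph with `removed` nodes deleted."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph if v not in removed}

    def visit(v):
        color[v] = GRAY
        for w in graph[v]:
            if w in removed:
                continue
            if w == v:
                if ignore_self_loops:
                    continue
                return True
            if color[w] == GRAY:          # back edge closes a cycle
                return True
            if color[w] == WHITE and visit(w):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in color)


def min_feedback_vertex_set(graph, ignore_self_loops=False):
    """Smallest node set whose removal breaks all cycles (exhaustive search)."""
    nodes = sorted(graph)
    for size in range(len(nodes) + 1):
        for candidate in combinations(nodes, size):
            if not has_cycle(graph, set(candidate), ignore_self_loops):
                return set(candidate)


# Hypothetical example: nodes stand for flip-flops, edges for
# combinational paths between them. Node "d" has a self-loop.
example = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["d", "b"]}

all_cycles = min_feedback_vertex_set(example)
except_self_loops = min_feedback_vertex_set(example, ignore_self_loops=True)
print(all_cycles, except_self_loops)
```

The exhaustive search is exponential in the number of nodes, which is why exact solutions on circuits the size of the ISCAS'89 benchmarks need the reduction and pruning ideas the abstract lists.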

Journal ArticleDOI
15 Feb 1995
TL;DR: Strict design methodology allowed fully functional first silicon which met all speed targets, and high clock speed was obtained by the use of delayed reset logic, a new register file design, and novel comparators.
Abstract: A 167 MHz 64 b VLSI CPU chip is described. The chip delivers 333 MFLOPS (peak) with an estimated system performance of 270 SPECint92/380 SPECfp92 (@167 MHz, 2 MB E-cache). The 17.7×17.8 mm die is fabricated with a 0.5 micron CMOS technology with four metal layers and contains 5.2 M transistors. The superscalar processor is capable of sustaining an execution rate of four instructions per cycle even in the presence of conditional branches and cache misses. Four fully pipelined 8×16 b multipliers and four single-cycle latency 16 b adders combine to speed up image processing, 2-D and 3-D graphics, and video compression/decompression by up to an order of magnitude. High clock speed was obtained by the use of delayed reset logic, a new register file design, and novel comparators. Strict design methodology allowed fully functional first silicon which met all speed targets. The power dissipation of the chip is 28 W.

Journal ArticleDOI
TL;DR: In this article, a modular, high density 0.5 µm complementary BiCMOS technology with integrated high-voltage Lateral Diffused MOS (LDMOS) and conductivity modulated Lateral Insulated Gate Bipolar Transistor (LIGBT) structures designed for high performance, multi-functional integrated circuit applications is described.
Abstract: A modular, high density 0.5 µm Complementary BiCMOS technology with integrated high-voltage Lateral Diffused MOS (LDMOS) and conductivity modulated Lateral Insulated Gate Bipolar Transistor (LIGBT) structures designed for high performance, multi-functional integrated circuit applications is described. The advantages of VLSI processing and 0.5 µm compatible layout rules have been applied to the design and fabrication of the tight-pitch high-voltage devices without sacrificing the performance of 0.5 µm dual-poly (N+/P+) gate CMOS and complementary vertical bipolar transistors. Single chip integration of VLSI microprocessors with high-voltage and/or high-current input and output functions for "Smart Power" applications can be achieved using this technology.

Journal ArticleDOI
11 Jan 1995
TL;DR: This survey paper reviews numerous high-level transformation techniques which can be applied at the algorithm or the architecture level to improve the performance of digital signal and image processing architectures and circuits implemented using VLSI technology.
Abstract: This survey paper reviews numerous high-level transformation techniques which can be applied at the algorithm or the architecture level to improve the performance of digital signal and image processing architectures and circuits implemented using VLSI technology. Successful design of VLSI signal and image processors requires careful selection of algorithms, architectures, implementation styles, and synthesis techniques. High-level transformations can play an important role in reducing silicon area or power at the same speed or in increasing the speed for the same area. These transformations can also increase the suitability of an algorithm for a particular architectural style. The transformation techniques reviewed in this paper include pipelining, parallel processing, retiming, unfolding, folding, look-ahead, relaxed look-ahead, associativity, distributivity, and reduction in strength.

Book ChapterDOI
01 Jan 1995
TL;DR: The AMULET group at Manchester University has developed an asynchronous implementation of the ARM microprocessor based on micropipelines as part of a broad investigation into low power techniques.
Abstract: High-performance VLSI microprocessors are becoming very power hungry; this presents an increasing problem of heat removal in desk-top machines and of battery life in portable machines. Asynchronous operation is proposed as a route to more energy efficient computing. In his 1988 Turing Award Lecture, Ivan Sutherland proposed a modular approach to asynchronous design based on “Micropipelines”. The AMULET group at Manchester University has developed an asynchronous implementation of the ARM microprocessor based on micropipelines as part of a broad investigation into low power techniques. The design is described in detail, the rationale for the work is presented and the characteristics of the chip described. The first silicon from the design arrived in April 1994 and an evaluation of it is presented here.

Journal ArticleDOI
TL;DR: In this article, the SWAMI LOCOS technique is applied for monolithic integration of both integrated optical devices and microelectronic circuits, and the results of static and dynamic measurements are discussed.
Abstract: Optical waveguides, photodetectors, and VLSI CMOS circuits are integrated monolithically in different ways: in a combined integration technique, the light-guiding film is deposited and covered with a SiO₂ layer replacing standard PSG as the dielectric insulation of polysilicon and metallization. In a stacked method, the waveguide fabrication starts after metallization and test of the CMOS circuits. Electrooptical coupling is performed by butt, leaky-wave, or mirror coupling of waveguides and photodetectors. To fabricate the system, the SWAMI LOCOS technique is applied for monolithic integration of both integrated optical devices and microelectronic circuits. This paper discusses the integration technology and the results of static and dynamic measurements.

Journal ArticleDOI
TL;DR: Robust, miniature, and energetically efficient VLSI systems for AOR can ultimately be achieved by following a path which optimizes the design at and between all levels of system integration, i.e., from devices and circuit techniques all the way to algorithms and architectural level considerations.