
Showing papers on "Very-large-scale integration" published in 1997


Journal ArticleDOI
TL;DR: In this paper, the results of a comprehensive investigation into the characteristics and optimization of inductors fabricated with the top-level metal of a submicron silicon VLSI process are presented.
Abstract: The results of a comprehensive investigation into the characteristics and optimization of inductors fabricated with the top-level metal of a submicron silicon VLSI process are presented. A computer program which extracts a physics-based model of microstrip components that is suitable for circuit (SPICE) simulation has been used to evaluate the effect of variations in metallization, layout geometry, and substrate parameters upon monolithic inductor performance. Three-dimensional (3-D) numerical simulations and experimental measurements of inductors were also used to benchmark the model accuracy. It is shown in this work that low inductor Q is primarily due to the restrictions imposed by the thin interconnect metallization available in most very large scale integration (VLSI) technologies, and that computer optimization of the inductor layout can be used to achieve a 50% improvement in component Q-factor over unoptimized designs.

541 citations


Proceedings ArticleDOI
Howard H. Chen1, David D. Ling1
13 Jun 1997
TL;DR: A new design methodology to analyze the on-chip power supply noise for high-performance microprocessors, based on an integrated package-level and chip-level power bus model and a simulated switching circuit model for each functional block, offers the most complete and accurate analysis of Vdd distribution.
Abstract: This paper describes a new design methodology to analyze the on-chip power supply noise for high-performance microprocessors. Based on an integrated package-level and chip-level power bus model, and a simulated switching circuit model for each functional block, this methodology offers the most complete and accurate analysis of Vdd distribution for the entire chip. The analysis results not only provide designers with the inductive ΔI noise and the resistive IR drop data at the same time, but also allow designers to easily identify the hot spots on the chip and ΔV across the chip. Global and local optimization such as buffer sizing, power bus sizing, and on-chip decoupling capacitor placement can then be conducted to maximize the circuit performance and minimize the noise.

325 citations


Book
Gary Yeap1
31 Aug 1997
TL;DR: This book grew out of a company-wide training class, Tutorial on Low Power Digital VLSI Design, that the author developed for designers at Motorola; feedback from the tutorial attendees helped to improve the quality of the training.
Abstract: Practical Low Power Digital VLSI Design, by Gary K. Yeap (Kluwer Academic Publishers). The book was conceived when the author developed a company-wide training class, Tutorial on Low Power Digital VLSI Design, for designers at Motorola. The feedback from the tutorial attendees helps to

323 citations


Proceedings ArticleDOI
15 Sep 1997
TL;DR: The paper describes the structure of a high-performance asynchronous pipeline, in particular precise exceptions, pipelined caches, arithmetic, and registers, and the circuit techniques developed to achieve high throughput.
Abstract: The design of an asynchronous clone of a MIPS R3000 microprocessor is presented. In 0.6 μm CMOS, we expect performance close to 280 MIPS, for a power consumption of 7 W. The paper describes the structure of a high-performance asynchronous pipeline, in particular precise exceptions, pipelined caches, arithmetic, and registers, and the circuit techniques developed to achieve high throughput.

317 citations


DissertationDOI
01 Jan 1997
TL;DR: It is found that the ripple-carry, the carry-lookahead, and the proposed carry-increment adders show the best overall performance characteristics for cell-based design.
Abstract: The addition of two binary numbers is the fundamental and most often used arithmetic operation on microprocessors, digital signal processors (DSP), and data-processing application-specific integrated circuits (ASIC). Therefore, binary adders are crucial building blocks in very large-scale integrated (VLSI) circuits. Their efficient implementation is not trivial because a costly carry-propagation operation involving all operand bits has to be performed. Many different circuit architectures for binary addition have been proposed over the last decades, covering a wide range of performance characteristics. Also, their realization at the transistor level for full-custom circuit implementations has been addressed intensively. However, the suitability of adder architectures for cell-based design and hardware synthesis, both prerequisites for the ever increasing productivity in ASIC design, was hardly investigated. Based on the various speed-up schemes for binary addition, a comprehensive overview and a qualitative evaluation of the different existing adder architectures are given in this thesis. In addition, a new multilevel carry-increment adder architecture is proposed. It is found that the ripple-carry, the carry-lookahead, and the proposed carry-increment adders show the best overall performance characteristics for cell-based design. These three adder architectures, which together cover the entire range of possible area vs. delay trade-offs, are comprised in the more general prefix adder architecture reported in the literature. It is shown that this universal and flexible prefix adder structure also allows the realization of various customized adders and of adders fulfilling arbitrary timing and area constraints. A non-heuristic algorithm for the synthesis and optimization of prefix adders is proposed. It allows the runtime-efficient generation of area-optimal adders for given timing constraints.
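
The relationship between ripple-carry and prefix addition mentioned in the abstract above can be illustrated with a small bit-level model. This is only a sketch under stated assumptions: the function names are hypothetical, the prefix network shown is the well-known Kogge-Stone arrangement rather than the thesis's carry-increment architecture or its synthesis algorithm, and the model checks functional equivalence only, not area or delay.

```python
def ripple_carry_add(a, b, width):
    """Add two unsigned integers bit by bit; the carry ripples through all bits."""
    carry, total = 0, 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i          # sum bit uses the incoming carry
        carry = (x & y) | (carry & (x ^ y))    # carry out of bit i
    return total, carry


def prefix_add(a, b, width):
    """Add via a parallel-prefix carry computation (Kogge-Stone arrangement)."""
    # Bitwise generate and propagate signals.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    G, P = g[:], p[:]
    dist = 1
    while dist < width:                        # about log2(width) prefix levels
        nG, nP = G[:], P[:]
        for i in range(dist, width):
            # Prefix operator: (G, P)[i] combined with (G, P)[i - dist].
            nG[i] = G[i] | (P[i] & G[i - dist])
            nP[i] = P[i] & P[i - dist]
        G, P = nG, nP
        dist *= 2
    # After the last level, G[i] is the carry out of bit i (carry-in assumed 0).
    total = 0
    for i in range(width):
        carry_in = G[i - 1] if i > 0 else 0
        total |= (p[i] ^ carry_in) << i
    return total, G[width - 1]
```

Both functions return the `width`-bit sum and the carry out, so one can be checked against the other exhaustively for small widths. Different prefix networks trade the number of prefix operators (area) against the number of levels (delay), which is the trade-off space the abstract refers to.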

268 citations


Patent
22 Oct 1997
TL;DR: In this paper, a single chip implementation of a digital receiver for multicarrier signals that are transmitted by orthogonal frequency division multiplexing is presented, which has highly accurate sampling rate control and frequency control circuitry.
Abstract: The invention provides a single chip implementation of a digital receiver for multicarrier signals that are transmitted by orthogonal frequency division multiplexing. Improved channel estimation and correction circuitry are provided. The receiver has highly accurate sampling rate control and frequency control circuitry. BCH decoding of TPS data carriers is achieved with minimal resources with an arrangement that includes a small Galois field multiplier. An improved FFT window synchronization circuit is coupled to the resampling circuit for locating the boundary of the guard interval transmitted with the active frame of the signal. A real-time pipelined FFT processor is operationally associated with the FFT window synchronization circuit and operates with reduced memory requirements.

240 citations


Proceedings ArticleDOI
01 Apr 1997
TL;DR: This paper proposes a novel VLSI artwork modification technique based on the concept of a minimum layout perturbation, and proposes and implements a practical graph-based simplex algorithm, which is compared to a commercially available linear programming package.
Abstract: In this paper we propose a novel VLSI artwork modification technique based on the concept of a minimum layout perturbation. Layouts are designed so that minimum design rules must be satisfied. Often layout processes such as custom layout methodologies and design rule migration activities introduce design rule violations in layouts. A minimum layout perturbation defines a minimum cost change to a layout, such that the resulting layout satisfies all design rules. We formulate the minimum perturbation cost with the objective of preserving as much as possible the geometric and topological features of the original layout. The proposed minimum perturbation problem formulation is transformed into a linear programming problem with special structure. We exploit the structure of the problem to propose efficient algorithms that solve the problem. We also propose and implement a practical graph-based simplex algorithm, which we compare to a commercially available linear programming package, resulting in more than 40X performance improvements in some cases. Finally, the proposed methods have been implemented and used in real life problems, for example in the technology migration of data path macros and a 300-cell gate array library.

197 citations


Journal ArticleDOI
14 Mar 1997-Science
TL;DR: The Research News article “Can chip devices keep shrinking?” by Robert F. Service represents Moore's Law as a doubling in the number of transistors on computer chips every 18 months.
Abstract: The Research News article “Can chip devices keep shrinking?” by Robert F. Service (13 Dec., p. 1834) represents Moore's Law as a doubling in the number of transistors on computer chips every 18 months. In the graphic by VLSI Research Inc., with a caption heralding the validity of Moore's

194 citations


Journal ArticleDOI
TL;DR: This work proposes several approaches to address the problem of power dissipation in high performance CMOS VLSI, and proposes codes that can be used on a class of terminated off-chip board-level buses with level signaling, or on tristate on-chip buses with level or transition signaling.
Abstract: Technology trends and especially portable applications are adding a third dimension (power) to the previously two-dimensional (speed, area) VLSI design space. A large portion of power dissipation in high performance CMOS VLSI is due to the inherent difficulties in global communication at high rates and we propose several approaches to address the problem. These techniques can be generalized at different levels in the design process. Global communication typically involves driving large capacitive loads which inherently require significant power. However, by carefully choosing the data representation, or encoding, of these signals, the average and peak power dissipation can be minimized. Redundancy can be added in space (number of bus lines), time (number of cycles) and voltage (number of distinct amplitude levels). The proposed codes can be used on a class of terminated off-chip board-level buses with level signaling, or on tristate on-chip buses with level or transition signaling.
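
As an illustration of adding redundancy "in space" (one extra bus line), the sketch below implements bus-invert coding, a well-known low-power bus encoding. Whether it coincides with any of the codes proposed in this paper is an assumption; the function names and the initial all-zero bus state are hypothetical choices made for the example.

```python
def bus_invert_encode(words, width):
    """Encode words so that at most width/2 data lines toggle per cycle.

    Returns a list of (data_lines, invert_line) pairs. The extra invert
    line is the redundancy in space; the bus is assumed to start at 0.
    """
    mask = (1 << width) - 1
    bus = 0
    encoded = []
    for word in words:
        word &= mask
        flips = bin(word ^ bus).count("1")   # Hamming distance to current bus state
        if flips > width / 2:
            bus, invert = ~word & mask, 1    # send the complement instead
        else:
            bus, invert = word, 0
        encoded.append((bus, invert))
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words from (data_lines, invert_line) pairs."""
    mask = (1 << width) - 1
    return [(~data & mask) if invert else data for data, invert in encoded]


def data_line_transitions(encoded):
    """Count toggling data lines per cycle, starting from an all-zero bus."""
    prev, counts = 0, []
    for data, _ in encoded:
        counts.append(bin(data ^ prev).count("1"))
        prev = data
    return counts
```

The peak number of toggling data lines per cycle is bounded by half the bus width, at the cost of one extra line that may itself toggle. This is the kind of average and peak power trade-off the abstract describes.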

188 citations


Proceedings ArticleDOI
13 Jun 1997
TL;DR: A modeling approach is presented that captures the dependence of the power dissipation of a combinational logic circuit on its input/output signal switching activity, and gives very good accuracy, with an RMS error of under about 6%.
Abstract: A modeling approach is presented that captures the dependence of the power dissipation of a combinational logic circuit on its input/output signal switching activity. The resulting power macromodel, consisting of a single three-dimensional table, can be used to estimate the power consumed in the circuit for any given input/output signal statistics. Given a low-level (typically gate-level) description of the circuit, we describe a characterization process by which such a table model can be automatically built. In contrast to other proposed techniques, this can be done for any given logic circuit without any user intervention, and applies to all possible input/output signal statistics; it does not require one to construct specialized analytical equations for the power dissipation. The three dimensions of our table-based model are the average input signal probability, average input transition density, and average output zero-delay transition density. This approach has been implemented and models have been built for many benchmark circuits. Over a wide range of input signal statistics, we show that this model gives very good accuracy, with an RMS error of under about 6%.
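
A table-based macromodel of this kind can be queried roughly as sketched below. The abstract names the three table dimensions but does not say how values between grid points are obtained, so the trilinear interpolation, the clamping to the table range, and all names here are assumptions made for illustration.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return (lo, hi, t): neighbouring grid indices and the fractional position."""
    x = min(max(x, axis[0]), axis[-1])                  # clamp to the table range
    hi = min(max(bisect_right(axis, x), 1), len(axis) - 1)
    lo = hi - 1
    t = (x - axis[lo]) / (axis[hi] - axis[lo])
    return lo, hi, t


def lookup_power(table, axes, p_in, d_in, d_out):
    """Trilinear interpolation in a 3-D power table.

    table[i][j][k] holds the characterized power at
    (axes[0][i], axes[1][j], axes[2][k]), i.e. average input signal
    probability, average input transition density, and average output
    zero-delay transition density.
    """
    i0, i1, ti = _bracket(axes[0], p_in)
    j0, j1, tj = _bracket(axes[1], d_in)
    k0, k1, tk = _bracket(axes[2], d_out)
    total = 0.0
    # Weighted sum over the eight corners of the enclosing grid cell.
    for i, wi in ((i0, 1 - ti), (i1, ti)):
        for j, wj in ((j0, 1 - tj), (j1, tj)):
            for k, wk in ((k0, 1 - tk), (k1, tk)):
                total += wi * wj * wk * table[i][j][k]
    return total
```

In such a flow the table would be filled once per circuit by a characterization run on the gate-level description. Later power estimates then need only the three signal statistics, not a re-simulation of the circuit.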

174 citations


Journal ArticleDOI
TL;DR: In this article, a study of SEU generated transient pulse attenuation in combinational logic structures built using common digital CMOS design practices is presented, showing that while there is an observable effect, it cannot be generally assumed that attenuation will significantly reduce observed circuit bit error rates.
Abstract: Results are presented of a study of SEU generated transient pulse attenuation in combinational logic structures built using common digital CMOS design practices. SPICE circuit analysis, heavy ion tests, and pulsed, focused laser simulations were used to examine the response characteristics of transient pulse behavior in long logic strings. Results show that while there is an observable effect, it cannot be generally assumed that attenuation will significantly reduce observed circuit bit error rates.

Journal ArticleDOI
TL;DR: In this paper, the authors present two methods for estimating the velocity of a visual stimulus and their implementations with analog circuits using CMOS VLSI technology, where velocity is computed by identifying particular features in the image at different locations; these features are abrupt temporal changes in image irradiance.
Abstract: We present two algorithms for estimating the velocity of a visual stimulus and their implementations with analog circuits using CMOS VLSI technology. Both are instances of so-called token methods, where velocity is computed by identifying particular features in the image at different locations; in our algorithms, these features are abrupt temporal changes in image irradiance. Our circuits integrate photoreceptors and associated electronics for computing motion onto a single chip and unambiguously extract bidirectional velocity for stimuli of high and intermediate contrasts over considerable irradiance and velocity ranges. At low contrasts, the output signal for a given velocity tends to decrease gracefully with contrast, while direction-selectivity is maintained. The individual motion-sensing cells are compact and highly suitable for use in dense 1-D or 2-D imaging arrays.

Journal ArticleDOI
TL;DR: A new family of temperature sensors will be presented, developed by the authors especially for the purpose of thermal monitoring of VLSI chips, characterized by the very low silicon area and the low power consumption.
Abstract: The paper presents appropriate sensors for the realization of the design principle of design for thermal testability (DfTT). After a short overview of the available CMOS temperature sensors, a new family of temperature sensors will be presented, developed by the authors especially for the purpose of thermal monitoring of VLSI chips. These sensors are characterized by the very low silicon area of about 0.003-0.02 mm² and the low power consumption (200 μW). The accuracy is in the order of 1 °C. Using the frequency-output versions an easy interfacing of digital test circuitry is assured. They can be very easily incorporated into the usual test circuitry, via the boundary-scan architecture. The paper presents measured results obtained by the experimental circuits. The facilities provided by the sensor connected to the boundary-scan test circuitry are also demonstrated experimentally.

Proceedings ArticleDOI
01 Apr 1997
TL;DR: This paper considers the delay minimization problem of an interconnect wire by simultaneously considering buffer insertion, buffer sizing and wire sizing and provides elegant closed form optimal solutions for all three problems.
Abstract: In this paper, we consider the delay minimization problem of an interconnect wire by simultaneously considering buffer insertion, buffer sizing and wire sizing. We consider three cases, namely using no buffer (i.e., wire sizing alone), using a given number of buffers, and using the optimal number of buffers. We provide elegant closed form optimal solutions for all three problems. These closed form solutions are useful in early stages of the VLSI design flow such as logic synthesis and floorplanning.
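
For intuition about the "optimal number of buffers" case, the sketch below uses the classical Elmore delay model for a uniform wire split into equal segments by identical buffers. This is a textbook simplification and not the paper's closed-form solutions, which also size the wire and the buffers. All parameter names and the numeric values in the usage note are hypothetical.

```python
import math


def buffered_wire_delay(k, length, r, c, r_buf, c_buf, t_buf=0.0):
    """Elmore delay of a uniform wire split into k equal buffered stages.

    r, c: wire resistance and capacitance per unit length.
    r_buf, c_buf, t_buf: buffer output resistance, input capacitance,
    and intrinsic delay. The driver and the final load are assumed to
    look like the same buffer, so k stages mean k - 1 inserted buffers.
    """
    seg = length / k
    stage = r_buf * (c * seg + c_buf) + r * seg * (c * seg / 2 + c_buf) + t_buf
    return k * stage


def continuous_optimum_stages(length, r, c, r_buf, c_buf, t_buf=0.0):
    """Stage count minimizing the delay when k is treated as a real number."""
    return length * math.sqrt(r * c / (2 * (r_buf * c_buf + t_buf)))


def best_stage_count(length, r, c, r_buf, c_buf, t_buf=0.0, k_max=64):
    """Brute-force search for the best integer stage count up to k_max."""
    return min(
        range(1, k_max + 1),
        key=lambda k: buffered_wire_delay(k, length, r, c, r_buf, c_buf, t_buf),
    )
```

With the illustrative values `length=40, r=1, c=1, r_buf=10, c_buf=2`, the continuous optimum is about 6.3 stages and the integer search picks 6. Because the delay is convex in `k`, the integer optimum is always one of the two neighbours of the continuous one, which is what makes a closed-form estimate useful early in the design flow.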

Journal ArticleDOI
TL;DR: This survey discusses the need for higher-level design automation tools, describes basic techniques for the subtasks of high-level synthesis, and covers new synthesis objectives including testability, power efficiency, and reliability.
Abstract: We survey recent developments in high level synthesis technology for VLSI design. The need for higher-level design automation tools is discussed first. We then describe some basic techniques for various subtasks of high-level synthesis. Techniques that have been proposed in the past few years (since 1994) for various subtasks of high-level synthesis are surveyed. We also survey some new synthesis objectives including testability, power efficiency, and reliability.

01 Jan 1997
TL;DR: In this article, a new family of temperature sensors for the purpose of thermal monitoring of VLSI chips is presented, characterized by the very low silicon area of about 0.003-0.02 mm² and the low power consumption (200 μW).
Abstract: The paper presents appropriate sensors for the realization of the design principle of design for thermal testability (DfTT). After a short overview of the available CMOS temperature sensors, a new family of temperature sensors will be presented, developed by the authors especially for the purpose of thermal monitoring of VLSI chips. These sensors are characterized by the very low silicon area of about 0.003-0.02 mm² and the low power consumption (200 μW). The accuracy is in the order of 1 °C. Using the frequency-output versions an easy interfacing of digital test circuitry is assured. They can be very easily incorporated into the usual test circuitry, via the boundary-scan architecture. The paper presents measured results obtained by the experimental circuits. The facilities provided by the sensor connected to the boundary-scan test circuitry are also demonstrated experimentally.

Journal ArticleDOI
01 Jul 1997
TL;DR: A synchronous two-phase clocking scheme for RSFQ circuits of arbitrary complexity is introduced, which for critical circuit topologies offers advantages over previous synchronous and asynchronous schemes.
Abstract: Rapid Single Flux Quantum (RSFQ) logic is a digital circuit technology based on superconductors that has emerged as a possible alternative to advanced semiconductor technologies for large scale ultra-high speed, very low power digital applications. Timing of RSFQ circuits at frequencies of tens to hundreds of gigahertz is a challenging and still unresolved problem. Despite the many fundamental differences between RSFQ and semiconductor logic at the device and at the circuit level, timing of large scale digital circuits in both technologies is principally governed by the same rules and constraints. Therefore, RSFQ offers a new perspective on the timing of ultra-high speed digital circuits. This paper is intended as a comprehensive review of RSFQ timing, from the viewpoint of the principles, concepts, and language developed for semiconductor VLSI. It includes RSFQ clocking schemes, both synchronous and asynchronous, which have been adapted from semiconductor design methodologies as well as those developed specifically for RSFQ logic. The primary features of these synchronization schemes, including timing equations, are presented and compared. In many circuit topologies of current medium to large scale RSFQ circuits, single-phase synchronous clocking outperforms asynchronous schemes in speed, device/area overhead, and simplicity of the design procedure. Synchronous clocking of RSFQ circuits at multigigahertz frequencies requires the application of non-standard design techniques such as pipelined clocking and intentional non-zero clock skew. Even with these techniques, there exist difficulties which arise from the deleterious effects of process variations on circuit yield and performance. As a result, alternative synchronization techniques, including but not limited to asynchronous timing, should be considered for certain circuit topologies. A synchronous two-phase clocking scheme for RSFQ circuits of arbitrary complexity is introduced, which for critical circuit topologies offers advantages over previous synchronous and asynchronous schemes.

Journal ArticleDOI
TL;DR: The architectural and circuit design aspects of a mixed analog/digital very large scale integration (VLSI) motion detection chip based on models of the insect visual system are described, implementing a reconfigurable architecture which facilitates the evaluation of several newly designed analog circuits.
Abstract: The architectural and circuit design aspects of a mixed analog/digital very large scale integration (VLSI) motion detection chip based on models of the insect visual system are described. The chip comprises two one-dimensional 64-cell arrays as well as front-end analog circuitry for early visual processing and digital control circuits. Each analog processing cell comprises a photodetector, circuits for spatial averaging and multiplicative noise cancellation, differentiation, and thresholding. The operation and configuration of the analog cells is controlled by digital circuits, thus implementing a reconfigurable architecture which facilitates the evaluation of several newly designed analog circuits. The chip has been designed and fabricated in a 1.2-μm CMOS process and occupies an area of 2×2 mm².

Journal Article
TL;DR: The ATLANTA™ switching architecture has the following distinguishing characteristics: it is nonblocking, scales modularly over a wide range of switching and buffering capacities using commonly available implementation technology, achieves high buffer utilization while using distributed buffers, has low complexity, and provides a clear path for future growth in features.
Abstract: The ATLANTA™ switching architecture has the following distinguishing characteristics: (1) is nonblocking, (2) scales modularly over a wide range of switching and buffering capacities using commonly available implementation technology, (3) achieves high buffer utilization while using distributed buffers, (4) has low complexity, and (5) provides a clear path for future growth in features. The ATLANTA architecture uses an innovative structure with ingress and egress buffers, where selective backpressure is applied from the fabric to the ingress cards. Selective backpressure makes the buffers in the ingress cards act as an extension of the output buffers in the fabric, achieving "sharing" of the distributed buffers and buffer utilization comparable with a centralized shared-memory switch. The advantage is that the majority of the buffers are in the ingress and egress port cards, and are implemented using low-cost off-the-shelf memories regardless of the total switching capacity. Different arrangements are possible for the switch fabric. In the smallest configuration, the fabric consists of a single standalone switching module; for larger switching capacities, the fabric is a modular three-stage memory/space/memory (MSM) arrangement. The ATLANTA architecture provides optimal support of multicast traffic. The ATLANTA chipset provides the complete set of building blocks for implementing ATM switches ranging in capacity from 622 Mb/s to 25 Gb/s. The chipset consists of four chips, two devices to be used in the fabric and two in the port cards. The port devices provide full-duplex ingress and egress functionality at 622 Mb/s port rate (plus the overhead due to the local header used internally to the switch). The physical interface to the incoming/outgoing lines supports the UTOPIA II multiplexing standard, and the port devices manage multiplexing/demultiplexing from/to a maximum of 30 subports per port. Although our current implementation of the architecture is targeted primarily to ATM, the principles behind the architecture are more general, and apply to IP switching and routing technologies.

Journal ArticleDOI
TL;DR: The latest advances in the SISSI package (simulator for integrated structures by simultaneous iteration) are discussed, including electro-thermal ac and transient simulation and the consideration of the thermal voltage of Si-Al contacts.
Abstract: Due to severe thermal problems of today's VLSI integrated circuits the need for reliable and quick thermal, electro-thermal and logi-thermal simulation tools is increasing. In this paper, we discuss the latest advances in the SISSI package (simulator for integrated structures by simultaneous iteration) which is a tool developed originally for analog VLSI design. The improvements include electro-thermal ac and transient simulation and the consideration of the thermal voltage of Si-Al contacts. Furthermore, we introduce a new module of SISSI, LOGITHERM, which is aimed at the self-consistent logic and thermal simulation of large digital VLSI designs. The features of our simulator package are highlighted by simulation examples that are compared in most cases with measurement results.

Journal ArticleDOI
TL;DR: A new rail-to-rail CMOS input architecture is presented that delivers behavior nearly independent of the common-mode level in terms of both transconductance and slewing characteristics, and can be easily described in VHDL to allow simulation of large mixed-signal systems.
Abstract: A new rail-to-rail CMOS input architecture is presented that delivers behavior nearly independent of the common-mode level in terms of both transconductance and slewing characteristics. Feedforward is used to achieve high common-mode bandwidth, and operation does not rely on analytic square law characteristics, making the technique applicable to deep submicron technologies. From the basis of a transconductor design, an asynchronous comparator and a video bandwidth op amp are also developed, providing a family of general purpose analog circuit functions which may be used in high (and low) bandwidth mixed-signal systems. Benefits for the system designer are that the need for rigorous control of common-mode levels is avoided and input signal swings right across the power supply range can be easily handled. A further benefit is that having very consistent performance, the circuits can be easily described in VHDL (or other behavioral language) to allow simulation of large mixed-signal systems. The circuits presented may be easily adapted for a range of requirements. Results are presented for representative transconductor, op amp, and comparator designs fabricated in a 0.5 μm 3.3 V digital CMOS process.

Journal ArticleDOI
TL;DR: A neural network based approach to the electromagnetic (EM) simulation and optimization of high-speed interconnects is discussed, which is ideally suited for use in iterative CAD and optimization routines.
Abstract: In this paper, a neural network based approach to the electromagnetic (EM) simulation and optimization of high-speed interconnects is discussed. Traditional techniques used to model interconnects in high-speed very large scale integration (VLSI) circuits are based on EM-field simulation, and are thus highly demanding on central processing unit (CPU) resources. This limits their suitability for computer-aided design (CAD) and optimization techniques which are, in general, iterative in nature. Neural networks can be used to map the complex relationship between the physical and electrical parameters of interconnect structures in an efficient manner. The models, once developed, operate with minimal on-line CPU resources and are thus ideally suited for use in iterative CAD and optimization routines.

Proceedings ArticleDOI
13 Nov 1997
TL;DR: A reduced-order modelling approach that allows for passive multiport reduction of RC netlists as impedance macromodels while preserving the symmetry and sparsity of the state matrices for efficient storage is described.
Abstract: Noise is becoming one of the most important metrics in the design of VLSI systems, certainly of comparable importance to area, timing, and power. In this paper, we describe Global Harmony, a methodology for the analysis of coupling noise in the global interconnect of large VLSI chips, being developed for the design of high-performance microprocessors. The architecture of Global Harmony involves a careful combination of static noise analysis, static timing analysis, and reduced-order modelling techniques. We describe a reduced-order modelling approach that allows for passive multiport reduction of RC netlists as impedance macromodels while preserving the symmetry and sparsity of the state matrices for efficient storage. We describe how the macromodels are practically employed to perform coupling analysis and how timing constraints can be used to limit pessimism in the analysis.

Journal ArticleDOI
TL;DR: The scheme described herein was used to reduce the amount of digital information which must be sent to control a large quantity of stimulating electrodes, and is scalable to a 625-channel stimulator while keeping data transmission rates under 2 Mbps.
Abstract: A CMOS very large scale integration (VLSI) chip has been designed and built to implement a scheme developed for multiplexing/demultiplexing the signals required to operate an intracortical stimulating electrode array. Because the use of radio telemetry in a proposed system utilizing this chip may impose limits upon the rate of data transmission to the chip, the scheme described herein was used to reduce the amount of digital information which must be sent to control a large quantity (up to several hundred) of stimulating electrodes. By incorporating multiple current sources on chip, many channels may be stimulated simultaneously. By incorporating on-chip timers, control over pulse timing is assigned to the chip, reducing by up to fourfold the amount of control data which must be sent. By incorporating on-chip RAM, information associated with the desired stimulus amplitude and pulse timing can be stored on chip. In this manner, it is necessary to send control information to the chip only when the information changes, rather than at the stimulus repeat rate for each channel. This further reduces the data rate by a factor of five to ten or more. The architecture described here, implemented as an eight-channel stimulator, is scalable to a 625-channel stimulator while keeping data transmission rates under 2 Mbps.

Journal ArticleDOI
TL;DR: In this paper, a methodology for simulating the static and dynamic performance of integrated circuits in the presence of electro-thermal interactions on the integrated circuit die is presented, which is based on the coupling of a finite element method (FEM) program with a circuit simulator.
Abstract: The paper presents a methodology for simulating the static and dynamic performance of integrated circuits in the presence of electro-thermal interactions on the integrated circuit die. The technique is based on the coupling of a finite element method (FEM) program with a circuit simulator. In contrast to other known simulator couplings, a time step algorithm is used. Its implementation in simulation tools is described. The thermal modeling of the die/package structure and the extended modeling of the electronic circuit are discussed. Simulation results which indicate the capabilities of the methodology for electro-thermal simulation are compared to experimental results.

Journal ArticleDOI
TL;DR: Several versions of the routing problem arising in VLSI design are described and it is shown that some of the instances have no feasible solution if Manhattan routing is used instead of knock-knee routing.
Abstract: In this paper we describe several versions of the routing problem arising in VLSI design and indicate how the Steiner tree packing problem can be used to model these problems mathematically. We focus on switchbox routing problems and provide integer programming formulations for routing in the knock-knee and in the Manhattan model. We give a brief sketch of cutting plane algorithms that we developed and implemented for these two models. We report on computational experiments using standard test instances. Our codes are able to determine optimum solutions in most cases, and in particular, we can show that some of the instances have no feasible solution if Manhattan routing is used instead of knock-knee routing.

Journal ArticleDOI
TL;DR: Digital very large scale integration CMOS circuit families including static and dynamic CMOS logic, static cascade voltage switch logic (static CVSL), and dynamic cascade voltage switch logic (dynamic CVSL) are investigated with particular emphasis on circuit topologies where the parasitic bipolar effect resulting from the floating body affects the circuit operation and stability.
Abstract: This paper presents a detailed study on the impact of a floating body in partially depleted (PD) silicon-on-insulator (SOI) MOSFET's on various CMOS circuits. Digital very large scale integration (VLSI) CMOS circuit families including static and dynamic CMOS logic, static cascade voltage switch logic (static CVSL), and dynamic cascade voltage switch logic (dynamic CVSL) are investigated with particular emphasis on circuit topologies where the parasitic bipolar effect resulting from the floating body affects the circuit operation and stability. Commonly used circuit building blocks for fast arithmetic operations in processor data-flow, such as static and dynamic carry lookahead circuits and Manchester carry chains, are examined. Pass-transistor-based designs including latch, multiplexer, and pseudo two-phase dynamic logic are then discussed. It is shown that under certain circuit topologies and switching patterns, the parasitic bipolar effect causes extra power consumption and degrades the noise margin and stability of the circuits. In certain dynamic circuits, the parasitic bipolar effect is shown to cause logic state error if not properly accounted for.

Journal ArticleDOI
TL;DR: The parallel approach is shown to consistently perform better than a sequential genetic algorithm when applied to these routing problems and is able to significantly reduce the occurrence of crosstalk.
Abstract: This paper presents a novel approach to solve the VLSI (very large scale integration) channel and switchbox routing problems. The approach is based on a parallel genetic algorithm (PGA) that runs on a distributed network of workstations. The algorithm optimizes both physical constraints (length of nets, number of vias) and crosstalk (delay due to coupled capacitance). The parallel approach is shown to consistently perform better than a sequential genetic algorithm when applied to these routing problems. An extensive investigation of the parameters of the algorithm yields routing results that are qualitatively better than or as good as the best published results. In addition, the algorithm is able to significantly reduce the occurrence of crosstalk.

Journal ArticleDOI
TL;DR: This paper presents an evaluation of several well-known block-matching motion estimation algorithms from a system-level very large scale integration (VLSI) design viewpoint using three criteria: silicon area, input/output requirement, and image quality.
Abstract: This paper presents an evaluation of several well-known block-matching motion estimation algorithms from a system-level very large scale integration (VLSI) design viewpoint. Because a straightforward block-matching algorithm (BMA) demands a very large amount of computing power, many fast algorithms have been developed. However, these fast algorithms are often designed to merely reduce arithmetic operations without considering their overall performance in VLSI implementation. Three criteria are used to compare various block-matching algorithms: (1) silicon area, (2) input/output requirement, and (3) image quality. A basic systolic array architecture is chosen to implement all the selected algorithms. The purpose of this study is to compare these representative BMAs using the aforementioned criteria. The advantages/disadvantages of these algorithms in terms of their hardware tradeoff are discussed. The methodology and results presented provide useful guidelines to system designers in selecting a BMA for VLSI implementation.
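
As a point of reference for the "straightforward" BMA mentioned above, the following is a minimal sketch of exhaustive (full-search) block matching using the sum of absolute differences (SAD) as the matching cost. The choice of SAD, the function names `sad` and `full_search`, and the frame representation (lists of rows of integers) are assumptions made for illustration. The abstract does not specify them, and the sketch says nothing about the systolic array architecture the paper evaluates.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Try every displacement (dx, dy) with |dx|, |dy| <= p that keeps the
    candidate block inside `ref`; return (dx, dy, cost) of the best match.
    Ties are resolved in favor of the first candidate scanned."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= by + dy and by + dy + n <= h and
                      0 <= bx + dx and bx + dx + n <= w)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Each block costs on the order of `(2p + 1)**2 * n**2` absolute-difference operations, which is the computational load that motivates the fast algorithms compared in the paper.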