
Showing papers in "IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems in 1989"


Journal Article•DOI•
TL;DR: A general scheduling methodology is presented that can be integrated into specialized or general-purpose high-level synthesis systems and reduces the number of functional units, storage units, and buses required by balancing the concurrency of operations assigned to them.
Abstract: A general scheduling methodology is presented that can be integrated into specialized or general-purpose high-level synthesis systems. An initial version of the force-directed scheduling algorithm at the heart of this methodology was originally presented by the authors in 1987. The latest implementation of the algorithm introduced here reduces the number of functional units, storage units, and buses required by balancing the concurrency of operations assigned to them. The algorithm supports a comprehensive set of constraint types and scheduling modes. These include multicycle and chained operations; mutually exclusive operations; scheduling under fixed global timing constraints with minimization of functional unit costs, minimization of register costs, and minimization of global interconnect requirements; scheduling with local time constraints (on operation pairs); scheduling under fixed hardware resource constraints; functional pipelining; and structural pipelining (use of pipelined functional units). Examples from current literature, one of which was chosen as a benchmark for the 1988 High-Level Synthesis Workshop, are used to illustrate the effectiveness of the approach.
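The core idea of force-directed scheduling can be sketched in a few lines. The example below is an illustrative reconstruction, not the paper's implementation: each unscheduled operation is treated as equally likely to start anywhere in its [ASAP, ALAP] time frame, the "distribution graph" sums those probabilities per control step, and the (self-)force of tentatively fixing an operation to a step measures how much that assignment unbalances the distribution. The tiny data-flow graph `frames` is made up.

```python
def distribution_graph(frames, n_steps):
    """frames: {op: (asap, alap)}; returns expected operation count per step."""
    dg = [0.0] * n_steps
    for asap, alap in frames.values():
        width = alap - asap + 1
        for t in range(asap, alap + 1):
            dg[t] += 1.0 / width        # uniform probability over the time frame
    return dg

def force(frames, n_steps, op, step):
    """Self-force of fixing `op` to `step`: change of its DG contribution
    relative to the uniform distribution (negative = improves balance)."""
    dg = distribution_graph(frames, n_steps)
    asap, alap = frames[op]
    p = 1.0 / (alap - asap + 1)
    return sum(dg[t] * ((1.0 if t == step else 0.0) - p)
               for t in range(asap, alap + 1))

# Made-up frames for three operations over three control steps.
frames = {"mul1": (0, 1), "mul2": (0, 2), "add1": (1, 2)}
dg = distribution_graph(frames, 3)
# The scheduler would pick the assignment with the most negative force;
# for mul2 that avoids the congested middle step.
best = min(range(3), key=lambda t: force(frames, 3, "mul2", t))
```

In the full algorithm, predecessor/successor forces are added and the time frames shrink after each assignment; only the balancing principle is shown here.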

1,093 citations


Journal Article•DOI•
TL;DR: A hierarchically structured framework for analog circuit synthesis is described and mechanisms are described that select from among alternate design styles and translate performance specifications from one level in the hierarchy to the next lower, more concrete level.
Abstract: A hierarchically structured framework for analog circuit synthesis is described. This hierarchical structure has two important features: it decomposes the design task into a sequence of smaller tasks with uniform structure, and it simplifies the reuse of design knowledge. Mechanisms are described that select from among alternate design styles and translate performance specifications from one level in the hierarchy to the next lower, more concrete level. A prototype implementation, OASYS, synthesizes sized transistor schematics for CMOS operational amplifiers from performance specifications and process parameters. Measurements from detailed circuit simulation and from actual fabricated analog ICs based on OASYS-synthesized designs demonstrate that OASYS is capable of synthesizing functional circuits.

417 citations


Journal Article•DOI•
TL;DR: The cellular automata-logic-block-observation circuits presented are expected to improve upon conventional design for testability circuitry such as built-in logic-block observation as a direct consequence of reduced cross correlation between the bit streams that are used as inputs to the logic unit under test.
Abstract: A variation on a built-in self-test technique is presented that is based on a distributed pseudorandom number generator derived from a one-dimensional cellular automata (CA) array. The cellular automata-logic-block-observation circuits presented are expected to improve upon conventional design for testability circuitry such as built-in logic-block observation as a direct consequence of reduced cross correlation between the bit streams that are used as inputs to the logic unit under test. Certain types of circuit faults are undetectable using the correlated bit streams produced by a conventional linear-feedback-shift-register (LFSR). It is also noted that CA implementations exhibit data compression properties similar to those of the LFSR and that they display locality and topological regularity, which are important attributes for a very large-scale integration implementation. It is noted that some CAs may be able to generate weighted pseudorandom test patterns. It is also possible that some of the analysis of pseudorandom testing may be more directly applicable to CA-based pseudorandom testing than to LFSR-based schemes.
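A one-dimensional CA pattern generator of the kind the abstract describes can be sketched directly. This is an illustrative example, not the paper's exact circuit: rule 90 (each cell becomes the XOR of its two neighbours, with null boundaries) is one of the standard CA rules used for distributed pseudorandom pattern generation.

```python
def ca_step(cells):
    """One clock of a rule-90 CA with null (zero) boundary conditions."""
    n = len(cells)
    return [(cells[i - 1] if i > 0 else 0) ^ (cells[i + 1] if i < n - 1 else 0)
            for i in range(n)]

def ca_patterns(seed, steps):
    """Successive CA states, each usable as a parallel test pattern."""
    out, cells = [], list(seed)
    for _ in range(steps):
        out.append(tuple(cells))
        cells = ca_step(cells)
    return out

# A single seeded 1 spreads as Pascal's triangle mod 2 (low neighbour correlation).
pats = ca_patterns([0, 0, 1, 0, 0, 0, 0, 0], 16)
```

Unlike an LFSR, where each bit stream is a shifted copy of its neighbour, here adjacent cell streams differ structurally, which is the reduced-cross-correlation property the abstract refers to.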

349 citations


Journal Article•DOI•
TL;DR: Simulated-annealing-based algorithms are presented which provide excellent solutions to the entire allocation process, namely register, arithmetic unit, and interconnect allocation, while effectively exploring the existing tradeoffs in the design space.
Abstract: Novel algorithms for the simultaneous cost/resource-constrained allocation of registers, arithmetic units, and interconnect in a data path have been developed. The entire allocation process can be formulated as a two-dimensional placement problem of microinstructions in space and time. This formulation readily lends itself to the use of a variety of heuristics for solving the allocation problem. The authors present simulated-annealing-based algorithms which provide excellent solutions to this formulation of the allocation problem. These algorithms operate under a variety of user-specifiable constraints on hardware resources and costs. They also incorporate conditional resource sharing and simultaneously address all aspects of the allocation problem, namely register, arithmetic unit, and interconnect allocation, while effectively exploring the existing tradeoffs in the design space.
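The two-dimensional placement formulation lends itself to a very compact simulated-annealing skeleton. The sketch below is a stand-in, not the authors' algorithm: four operations are assigned to time steps, and the toy cost mixes the number of distinct steps used (a proxy for resource count) with a penalty for violated data dependences.

```python
import math, random

def anneal(state, cost, neighbour, t0=10.0, alpha=0.95, iters=2000, seed=1):
    """Generic simulated annealing: accept improvements always, worsenings
    with probability exp(-delta/T), and cool geometrically."""
    rng = random.Random(seed)
    best = cur = state
    best_c = cur_c = cost(cur)
    t = t0
    for _ in range(iters):
        cand = neighbour(cur, rng)
        c = cost(cand)
        if c < cur_c or rng.random() < math.exp((cur_c - c) / t):
            cur, cur_c = cand, c
            if c < best_c:
                best, best_c = cand, c
        t = max(t * alpha, 1e-3)
    return best, best_c

# Toy instance: ops 0->1->2 form a chain; op 3 is independent.
deps = [(0, 1), (1, 2)]

def cost(assign):
    clash = sum(assign[a] >= assign[b] for a, b in deps)  # dependence violations
    return len(set(assign)) + 10 * clash                  # steps used + penalty

def neighbour(assign, rng):
    a = list(assign)
    a[rng.randrange(len(a))] = rng.randrange(4)           # move one op in time
    return tuple(a)

best, c = anneal((0, 0, 0, 0), cost, neighbour)
```

With the chain forcing three distinct steps and op 3 free to share one of them, the optimum cost here is 3.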

250 citations


Journal Article•DOI•
TL;DR: The construction of a set of measurements that detects many faulty circuits before specification testing is described, and its effectiveness in detecting faulty circuits is evaluated.
Abstract: The IC fabrication process contains several testing stages. Because of the high cost of packaging, the testing stage prior to aging, called wafer probe, is key in reducing the overall manufacturing cost. Typically in this stage, specification tests are performed. Even though specification tests can certainly distinguish a good circuit from all faulty ones, they are expensive, and many types of faulty behavior can be detected by simpler tests. The construction of a set of measurements that detects many faulty circuits before specification testing is described. Bounds on these measurements are specified, and an algorithm for test selection is presented. An example of a possible simple test is a test of DC voltages (i.e., parametric tests). This type of test is defined rigorously, and its effectiveness in detecting faulty circuits is evaluated.

225 citations


Journal Article•DOI•
F. El-Turky1, E.E. Perry2•
TL;DR: An expert-systems-based automated design approach for analog circuits is presented, together with BLADES, believed to be the first successful design expert system in the analog design domain.
Abstract: An expert-systems-based automated design approach for analog circuits is presented. The approach uses both formal and intuitive knowledge in the design process. A prototype design environment, BLADES, which uses a divide-and-conquer solution strategy, has been successfully implemented and is currently capable of designing a wide range of subcircuit functional blocks as well as a limited class of integrated bipolar operational amplifiers. BLADES is believed to be the first successful design expert system in the analog design domain. It uses different levels of abstraction depending on the complexity of the design task under consideration. The importance of the abstraction level lies in the fact that once design primitives are defined, the problem of extracting the knowledge (design rules) becomes less complex. Two design examples are given to demonstrate the viability and versatility of the knowledge-based design technique as an analog design tool. None of the circuits designed and tested using BLADES were unstable.

212 citations


Journal Article•DOI•
TL;DR: The authors construct a processor design approach that does not require the distribution of a clocking signal and develops a deterministic algorithm to synthesize asynchronous interconnection circuits from high-level specifications.
Abstract: The authors construct a processor design approach that does not require the distribution of a clocking signal. To facilitate design of processors that use fully asynchronous components, the first step is to design hazard-free asynchronous interconnection circuits. To this end, a deterministic algorithm was developed to synthesize asynchronous interconnection circuits from high-level specifications. This approach systematically designs correct asynchronous interconnection circuits with the weakest possible constraints and minimal overhead. The authors are primarily concerned with the synthesis of nonmetastable circuits, even though the procedure is also valid for metastable circuit synthesis. The synthesized logic is hazard-free and guaranteed to have the fastest operation according to a behavioral specification. A high-level description is used to specify circuit behavior, not only for a simpler input format, but also as a basis for determining the final optimum designs. Automatic synthesis and the ability to localize the timing considerations reduce design effort when systems become complex.

201 citations


Journal Article•DOI•
Ravi Nair1, C.L. Berman1, P.S. Hauge1, E.J. Yoffa1•
TL;DR: Methods are presented for generating bounds on interconnection delays in a combinational network having specified timing requirements at its input and output terminals, and fast algorithms are provided that maximize the delay range, and hence the margin for error in layout, for various types of timing constraint.
Abstract: Methods are presented for generating bounds on interconnection delays in a combinational network having specified timing requirements at its input and output terminals. An automatic placement program that uses wirability as its primary objective could use these delay bounds to generate length or capacitance bounds for interconnection nets as secondary objectives. Thus, unlike previous timing-driven placement algorithms, the desired performance of the circuit is guaranteed when a wirable placement meeting these objectives is found. Fast algorithms are provided that maximize the delay range, and hence the margin for error in layout, for various types of timing constraint.
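The timing quantities underlying such delay bounds are arrival times, required times, and slack. The sketch below is an illustrative reconstruction of that standard computation (not the paper's bound-distribution algorithm): longest-path arrival times from the inputs, required times back from the outputs, and per-node slack — the room that could be handed to interconnect as a delay budget. The graph and delays are invented.

```python
def slacks(gates, delay, outputs, t_req):
    """gates: {node: [fanin nodes]}; delay: {node: gate delay}.
    Returns slack (required - arrival) for every node."""
    order, seen = [], set()
    def visit(n):                              # DFS topological order
        if n in seen:
            return
        seen.add(n)
        for f in gates.get(n, []):
            visit(f)
        order.append(n)
    for o in outputs:
        visit(o)
    arrive = {}
    for n in order:                            # longest path from inputs
        fan = gates.get(n, [])
        arrive[n] = delay.get(n, 0) + (max(arrive[f] for f in fan) if fan else 0)
    require = {n: float("inf") for n in order}
    for o in outputs:
        require[o] = t_req
    for n in reversed(order):                  # required times backward
        for f in gates.get(n, []):
            require[f] = min(require[f], require[n] - delay.get(n, 0))
    return {n: require[n] - arrive[n] for n in order}

g = {"a": [], "b": [], "x": ["a", "b"], "y": ["x"]}
d = {"x": 2, "y": 3}
s = slacks(g, d, outputs=["y"], t_req=8)
```

With a required output time of 8 and a critical path of length 5, every node here has slack 3; the paper's contribution is how to apportion such slack among nets so that meeting all per-net bounds guarantees timing.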

197 citations


Journal Article•DOI•
TL;DR: Algorithms are given to select, with minimum cardinality, sets of paths that include at least one path with maximum modeled delay for each circuit lead or gate input.
Abstract: In order to ascertain correct operation of digital logic circuits it is necessary to verify correct functional operation as well as correct operation at desired clock rates. To ascertain correct operation at desired clock rates, it is verified that signal propagation delays along a set of selected paths fall within allowed limits by applying appropriate stimuli. It has previously been suggested that an appropriate set of paths to test would be the one that includes at least one path, with maximum modeled delay, for each circuit lead or gate input. Here, algorithms to select such sets of paths with minimum cardinality are given.
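The covering criterion itself is easy to state computationally (the minimum-cardinality selection is the paper's contribution and is not attempted here): for each gate input, the maximum modeled delay of any path through it follows from longest paths to and from each node. The circuit below is a made-up example.

```python
def max_delay_through_edges(edges, delay, topo):
    """edges: (driver, gate) pairs; delay: {node: delay}; topo: topological
    order. Returns, per edge, the max modeled delay of any path through it."""
    fanin = {n: [] for n in topo}
    fanout = {n: [] for n in topo}
    for u, v in edges:
        fanin[v].append(u)
        fanout[u].append(v)
    to_here = {}                    # longest delay input -> n (inclusive)
    for n in topo:
        to_here[n] = delay[n] + max((to_here[u] for u in fanin[n]), default=0)
    from_here = {}                  # longest delay n -> output (inclusive)
    for n in reversed(topo):
        from_here[n] = delay[n] + max((from_here[v] for v in fanout[n]), default=0)
    return {(u, v): to_here[u] + from_here[v] for u, v in edges}

topo = ["a", "b", "g1", "g2"]
edges = [("a", "g1"), ("b", "g1"), ("g1", "g2")]
delay = {"a": 0, "b": 1, "g1": 2, "g2": 1}
crit = max_delay_through_edges(edges, delay, topo)
```

A path set satisfying the criterion must contain, for each edge, some path achieving that edge's maximum; the selection algorithms then minimize how many paths are needed in total.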

174 citations


Journal Article•DOI•
TL;DR: The authors argue that statecharts can be beneficially used as a behavioral hardware description language and present a VLSI synthesis methodology by which layout area and delay periods can be reduced relative to the conventional finite-state-machine (FSM) synthesis method.
Abstract: Statecharts have been proposed recently as a visual formalism for the behavioral description of complex systems. They extend classical state diagrams in several ways, while retaining their formality and visual nature. The authors argue that statecharts can be beneficially used as a behavioral hardware description language. They illustrate some of the main features of the approach, including hierarchical decomposition, multilevel timing specifications, and flexible concurrency and synchronization capabilities. The authors also present a VLSI synthesis methodology by which layout area and delay periods can be reduced relative to the conventional finite-state-machine (FSM) synthesis method.

173 citations


Journal Article•DOI•
TL;DR: Theoretical and simulation studies were performed to demonstrate that the test pattern generation efficiency of the CSTP is comparable to that of a pseudorandom generator, regardless of the functionality of the circuit under test.
Abstract: A technique for designing self-test VLSI circuits, referred to as circular self-test path (CSTP), is introduced. The CSTP is a feedback shift register (output of the last flip-flop is supplied to the first flip-flop) with a data communication capability. It serves simultaneously for test pattern generation and test response compaction, thereby minimizing the test schedule complexity; the whole chip is tested in a single test session. A distinguishing attribute of built-in self-test (BIST) chips designed using this technique is a low silicon area overhead, slightly exceeding that of scan path designs, but substantially lower than that of built-in logic block observer (BILBO)-based circuits. Theoretical and simulation studies were performed to demonstrate that the test pattern generation efficiency of the CSTP is comparable to that of a pseudorandom generator, regardless of the functionality of the circuit under test.
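Behaviourally, a circular self-test path is easy to model. This is a minimal sketch under simplifying assumptions (one response bit observed per flip-flop; the circuit under test is an arbitrary stand-in): the register is a ring, and each flip-flop captures the XOR of its predecessor's state with the response bit it observes, so the ring generates patterns and compacts responses at the same time.

```python
def cstp_cycle(state, cut):
    """One clock of the circular path. state: flip-flop bits; cut: maps the
    current pattern to the circuit's response bits."""
    resp = cut(state)
    n = len(state)
    # state[i - 1] with i == 0 gives state[-1]: the ring closes on itself.
    return [state[i - 1] ^ resp[i] for i in range(n)]

def toy_cut(bits):
    """Stand-in combinational logic, not from the paper."""
    return [bits[i] & bits[(i + 1) % len(bits)] for i in range(len(bits))]

sig = [1, 0, 0, 1]            # initial register contents
for _ in range(8):            # one test session: clock the ring 8 times
    sig = cstp_cycle(sig, toy_cut)
# `sig` is now the compacted signature to compare against a fault-free reference.
```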

Journal Article•DOI•
TL;DR: The authors discuss in detail the synthesis of structures from behavioural-domain descriptions: design description using a formal language, internal representation of the behaviour, synthesis based on data-flow analysis, optimization, and generation of a hardware structure.
Abstract: The authors discuss in detail the synthesis of structures from behavioural domain descriptions. The overall synthesis approach is explained, the techniques and methods used to solve the main problems are discussed, implementation results are given, and experiences with various examples are described. The principal topics that are addressed are design description in the behavioural domain using a formal language, internal representation of the behaviour, synthesis based on data-flow analysis, optimizations, and generation of a hardware structure. These techniques were implemented in the Karlsruhe DSL synthesis system.

Journal Article•DOI•
TL;DR: The authors present an improved implication procedure and an improved unique sensitization procedure; with these, SOCRATES generates a test pattern for every testable fault in a set of combinational benchmark circuits and identifies all redundant faults with fewer than ten backtrackings.
Abstract: The authors present several concepts and techniques aiming at a further improvement and acceleration of the deterministic test-pattern-generation and redundancy identification process. In particular, they describe an improved implication procedure and an improved unique sensitization procedure. While the improved implication procedure takes advantage of the dynamic application of a learning procedure, the improved unique sensitization procedure profits from a dynamic and careful consideration of the existing situation of value assignments in the circuit. As a result of the application of the proposed techniques, SOCRATES is capable of both successfully generating a test pattern for all testable faults in a set of combinational benchmark circuits, and of identifying all redundant faults with fewer than ten backtrackings.

Journal Article•DOI•
TL;DR: A design methodology for automated mapping of DSP algorithms into VLSI architectures is presented, which takes into account explicit algorithm requirements on throughput and latency, in addition to V LSI technology constraints on silicon area and power dissipation.
Abstract: A design methodology for automated mapping of DSP algorithms into VLSI architectures is presented. The methodology takes into account explicit algorithm requirements on throughput and latency, in addition to VLSI technology constraints on silicon area and power dissipation. Algorithm structure, design style of functional units, and parallelism of the architecture are all explored in the design space. The synthesized architecture is a multibus multifunction unit processor matched to the implemented algorithm. The architecture has a linear topology and uses a lower number of interconnects and multiplexer inputs compared to other synthesized architectures with random topology having the same performance. The synthesized processor is a self-timed element externally, while it is internally synchronous. The methodology is implemented in a design aid tool called SPAID. Results obtained using SPAID for two DSP algorithms compare favorably with other synthesis techniques.

Journal Article•DOI•
TL;DR: ESP (evolution-based standard cell placement) uses the novel heuristic method of simulating an evolutionary process to minimize the cell interconnection wire length to achieve comparable results to popular simulated annealing algorithms.
Abstract: ESP (evolution-based standard cell placement) is a program package designed to perform standard cell placement including macro-block placement capabilities. It uses the novel heuristic method of simulating an evolutionary process to minimize the cell interconnection wire length. While achieving comparable results to popular simulated annealing algorithms, ESP usually requires less CPU time. A concurrent version designed to run on a network of loosely coupled processors, such as workstations connected via Ethernet, has also been developed. For medium to large circuits (>250 cells per processor) concurrent ESP achieves linear speedup.
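A deliberately small mutation-and-selection sketch conveys the evolutionary idea; ESP itself also uses crossover and handles macro blocks, none of which is shown. Here individuals are linear cell orders, fitness is total net span in one dimension, and each generation keeps the fittest orders and mutates them by swapping two cells. The net list is made up.

```python
import random

nets = [(0, 3), (1, 2), (2, 4), (0, 4)]      # made-up two-pin nets over 5 cells

def wirelength(order):
    pos = {c: i for i, c in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)

def mutate(order, rng):
    child = list(order)
    i, j = rng.randrange(len(child)), rng.randrange(len(child))
    child[i], child[j] = child[j], child[i]  # swap two cell positions
    return child

def evolve(n_cells=5, pop=20, gens=60, seed=7):
    rng = random.Random(seed)
    population = [rng.sample(range(n_cells), n_cells) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=wirelength)
        parents = population[: pop // 4]     # survival of the fittest
        population = parents + [mutate(p, rng) for p in parents for _ in range(3)]
    return min(population, key=wirelength)

best = evolve()
```

For these nets the order [3, 0, 4, 2, 1] gives every net a span of 1, so the optimum wire length is 4; the evolutionary search approaches it without ever enumerating all 120 permutations.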

Journal Article•DOI•
TL;DR: An introduction to the hydrodynamic model for semiconductor devices is presented and arguments for existence of solutions and convergence of numerical methods are given for the case of subsonic electron flow.
Abstract: An introduction to the hydrodynamic model for semiconductor devices is presented. Special attention is paid to classifying the hydrodynamic PDEs (partial differential equations) and analyzing their nonlinear wave structure. Numerical simulations of the ballistic diode using the hydrodynamical device model are presented, as an illustrative elliptic problem. The importance of nonlinear block iterative methods is emphasized. Arguments for existence of solutions and convergence of numerical methods are given for the case of subsonic electron flow.
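For reference, the electron hydrodynamic model is commonly written as conservation laws for particle density n, momentum density p, and energy density w, coupled to Poisson's equation. This is one standard form; relaxation terms and sign conventions vary between papers, so treat it as a generic statement rather than the exact system analyzed here:

```latex
\begin{aligned}
&\frac{\partial n}{\partial t} + \nabla\cdot(n\mathbf{v}) = 0 \\
&\frac{\partial \mathbf{p}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{p}
   = -qn\mathbf{E} - \nabla(nk_BT) - \frac{\mathbf{p}}{\tau_p} \\
&\frac{\partial w}{\partial t} + \nabla\cdot(\mathbf{v}w)
   = -qn\mathbf{v}\cdot\mathbf{E} - \nabla\cdot(\mathbf{v}\,nk_BT)
     + \nabla\cdot(\kappa\nabla T) - \frac{w - w_0}{\tau_w} \\
&\nabla\cdot(\epsilon\nabla\phi) = q\,(n - N_D + N_A),
   \qquad \mathbf{E} = -\nabla\phi
\end{aligned}
```

The "subsonic electron flow" condition in the abstract refers to the electron velocity staying below the characteristic sound speed of this nonlinear system, the regime in which the steady equations remain elliptic.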

Journal Article•DOI•
TL;DR: The standard cell design style is investigated, and two probabilistic models are presented that estimate the wiring space requirements in the routing channels between the cell rows and the number of feedthroughs that must be inserted in thecell rows to interconnect cells placed several rows apart.
Abstract: The standard cell design style is investigated. Two probabilistic models are presented. The first model estimates the wiring space requirements in the routing channels between the cell rows. The second model estimates the number of feedthroughs that must be inserted in the cell rows to interconnect cells placed several rows apart. These models were implemented in the standard cell area estimation program PLEST (PLotting ESTimator). PLEST was used to estimate the areas of a set of 12 standard cell chips. In all cases, the estimates were accurate to within 10% of the actual areas. PLEST's estimation of a chip layout area takes only a few seconds to produce, as compared with more than 10 h to generate the chip layout itself using an industrial layout system.

Journal Article•DOI•
F. Venturi1, R.K. Smith2, E. Sangiorgi3, M.R. Pinto2, Bruno Ricco1 •
TL;DR: An efficient self-consistent device simulator coupling Poisson equation and Monte Carlo transport suitable for general silicon devices, including those with regions of high doping/carrier densities, is discussed.
Abstract: An efficient self-consistent device simulator coupling Poisson equation and Monte Carlo transport suitable for general silicon devices, including those with regions of high doping/carrier densities, is discussed. Key features include an original iteration scheme and an almost complete vectorization of the program. The simulator has been used to characterize nonequilibrium effects in deep submicron nMOSFETs. Substantial overshoot effects are noticeable at gate lengths of 0.25 µm at room temperature.

Journal Article•DOI•
TL;DR: Experimental results show the effectiveness of applying a concurrent fault simulator to automatic test vector generation for combinational and sequential circuits.
Abstract: A description is given of the application of a concurrent fault simulator to automatic test vector generation. As faults are simulated in the fault simulator a cost function is simultaneously computed. A simple cost function is the distance (in terms of the number of gates and flip-flops) of a fault effect from a primary output. The input vector is then modified to reduce the cost function until a test is found. Experimental results are presented showing the effectiveness of this method in generating tests for combinational and sequential circuits. By defining suitable cost functions, it has been possible to generate: (1) initialization sequences; (2) tests for a group of faults; and (3) a test for a given fault. Even asynchronous sequential circuits can be handled by this approach.
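The cost-directed search can be illustrated on a fabricated single-output example. Everything here is a stand-in: a four-input circuit, a stuck-at-0 fault, and a hand-written cost that counts unmet activation/propagation conditions (playing the role of the simulator's distance-to-output measure). The vector is perturbed one bit at a time, keeping changes that do not increase the cost, until the fault is detected.

```python
import random

def good(v):            # made-up circuit: out = (a & b) | (c & d)
    a, b, c, d = v
    return (a & b) | (c & d)

def faulty(v):          # same circuit with input `a` stuck-at-0
    _, b, c, d = v
    return (0 & b) | (c & d)

def cost(v):
    """0 when v detects the fault; otherwise the number of unmet
    conditions (a=1 and b=1 to activate; c&d=0 to propagate)."""
    if good(v) != faulty(v):
        return 0
    a, b, c, d = v
    return (1 - a) + (1 - b) + (c & d)

def generate(seed=3, tries=200):
    rng = random.Random(seed)
    v = [rng.randint(0, 1) for _ in range(4)]
    for _ in range(tries):
        if cost(v) == 0:
            return v
        w = list(v)
        w[rng.randrange(4)] ^= 1          # flip one input bit
        if cost(w) <= cost(v):            # keep non-worsening changes
            v = w
    return None

test = generate()
```

In the real system the cost comes for free from concurrent fault simulation, and richer cost functions yield initialization sequences and group tests as the abstract lists.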

Journal Article•DOI•
TL;DR: Experimental results showed that SILK, when solving all the benchmarks from the literature, outperformed WEAVER, the most successful switch-box router to date, in both quality and speed.
Abstract: The authors present a rip-up and reroute router based on a matrix representation scheme and simulated evolution technique for solving detailed routing problems in VLSI layout. The status of the routing region is represented as a matrix. Rip-up and reroute operations are emulated as matrix subtractions and additions, respectively. The quality of a routing result can be measured by a few simple matrix operations on the matrix. A rip-up and reroute switch-box/channel router, called SILK, using a simulated evolution technique has been implemented based on this representation alone. Experimental results showed that SILK, when solving all the benchmarks from the literature, outperformed WEAVER, the most successful switch-box router to date, in both quality and speed.

Journal Article•DOI•
TL;DR: The authors outline a synthesis procedure which, beginning from a state transition graph (STG) description of a sequential machine, produces an optimized, fully and easily testable logic implementation, and which guarantees testability for both Moore and Mealy machines.
Abstract: The authors outline a synthesis procedure which, beginning from a state transition graph (STG) description of a sequential machine, produces an optimized, fully and easily testable logic implementation. This logic-level implementation is guaranteed to be testable for all single stuck-at faults in the combinational logic and the test sequences for these faults can be obtained using combinational test generation techniques alone. The sequential machine is assumed to have a reset state and be R-reachable. All single stuck-at faults in the combinational logic and the input and output stuck-at faults of the memory elements in the synthesized logic-level automaton can be tested without access to the memory elements using these test sequences. Thus this procedure represents an alternative to a scan design methodology. The area penalty incurred due to the constraints on the optimization is small. The performance of the synthesized design is usually better than that of an unconstrained design optimized for area alone. The authors show that an intimate relationship exists between state assignment and the testability of a sequential machine. They propose a procedure of constrained state assignment and logic optimization which guarantees testability for both Moore and Mealy machines.

Journal Article•DOI•
TL;DR: An attempt was made to define the algorithmic level of design and to provide the designer with the means to explore various design issues within the framework of the System Architect's Workbench.
Abstract: An attempt was made to define the algorithmic level of design (also known as the behavioral level) and to provide the designer with the means to explore various design issues. Within the framework of the System Architect's Workbench, a new set of behavioral and structural transformations was developed to allow the interactive exploration of algorithmic-level design alternatives. A description is given of these transformations, and a set of examples is presented both to demonstrate the application of the transformations and to further illustrate their effects.

Journal Article•DOI•
D. Varma1, E.A. Trachtenberg1•
TL;DR: The disjoint decomposition problem is formulated in the spectral domain, allowing the development of an algorithm that can simultaneously detect multiple decompositions of a given function and has the ability to detect the nonexistence of decompositions quickly.
Abstract: A description is given of linear and disjoint decompositions of completely specified Boolean functions using transform methods. Since previously known transform methods are impractical for automation due to their enormous computational complexity, polynomial approximations to the linear decomposition procedure that use reduced representations of functions are used. Experimental results are reported which establish that such decompositions can often result in improved implementations of logic functions. The disjoint decomposition problem is formulated in the spectral domain, allowing the development of an algorithm that can simultaneously detect multiple decompositions of a given function. This algorithm has low average complexity and has the ability to detect the nonexistence of decompositions quickly.
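The spectral-domain machinery rests on the Walsh spectrum of a Boolean function, computable in O(n·2ⁿ) by the fast Walsh-Hadamard transform. The sketch below shows that transform in ±1 encoding on an arbitrary 3-variable example (not one from the paper); decomposition methods inspect where such spectra concentrate.

```python
def walsh_spectrum(truth):
    """truth: table of length 2**n with 0/1 entries. Returns the Walsh
    spectrum of the +/-1 encoding via in-place butterfly passes."""
    f = [1 - 2 * b for b in truth]  # encode 0 -> +1, 1 -> -1
    h, n = 1, len(f)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                f[j], f[j + h] = f[j] + f[j + h], f[j] - f[j + h]
        h *= 2
    return f

# x0 XOR x1 (ignoring x2): the spectrum concentrates on a single coefficient,
# the signature of a linear (XOR) structure that decomposition can exploit.
truth = [(i & 1) ^ ((i >> 1) & 1) for i in range(8)]
spec = walsh_spectrum(truth)
```

Parseval's relation fixes the total spectral energy at (2ⁿ)², so a single coefficient of magnitude 2ⁿ means the function is exactly a parity of the selected variables.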

Journal Article•DOI•
TL;DR: The authors present two O(n²) planarization algorithms, PLANARIZE and MAXIMAL-PLANARIZE, based on A. Lempel, S. Even, and I. Cederbaum's (1967) planarity testing algorithm and its implementation using PQ-trees.
Abstract: The authors present two O(n²) planarization algorithms, PLANARIZE and MAXIMAL-PLANARIZE. These algorithms are based on A. Lempel, S. Even, and I. Cederbaum's (1967) planarity testing algorithm and its implementation using PQ-trees. Algorithm PLANARIZE is for the construction of a spanning planar subgraph of an n-vertex nonplanar graph. The algorithm proceeds by embedding one vertex at a time and, at each step, adds the maximum number of edges possible without creating nonplanarity of the resultant graph. Given a biconnected spanning planar subgraph G_p of a nonplanar graph G, the MAXIMAL-PLANARIZE algorithm constructs a maximal planar subgraph of G which contains G_p. This latter algorithm can also be used to maximally planarize a biconnected planar graph.

Journal Article•DOI•
Shmuel Wimer1, Israel Koren, Israel Cederbaum•
TL;DR: This paper discusses the problem of selecting an optimal implementation for each building block so that the area of the final layout is minimized, and suggests a branch-and-bound algorithm which proves to be very efficient and can handle successfully large general nonslicing floorplans.
Abstract: The building blocks in a given floorplan have several possible physical implementations yielding different layouts. A discussion is presented of the problem of selecting an optimal implementation for each building block so that the area of the final layout is minimized. A polynomial algorithm that solves this problem for slicing floorplans was presented elsewhere, and it has been proved that for general (nonslicing) floorplans the problem is NP-complete. The authors suggest a branch-and-bound algorithm which proves to be very efficient and can handle successfully large general nonslicing floorplans. The high efficiency of the algorithm stems from the branching strategy and the bounding function used in the search procedure. The branch-and-bound algorithm is supplemented by a heuristic minimization procedure which further prunes the search, is computationally efficient, and does not prevent achieving a global minimum. Finally, the authors show how the nonslicing and the slicing algorithms can be combined to handle efficiently very large general floorplans.
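A toy branch-and-bound for implementation selection fits in a few lines if the "floorplan" is reduced to the simplest possible case: blocks stacked vertically, so layout width is the widest block and height is the sum of heights. Each block offers several (width, height) implementations, and the bound optimistically adds each remaining block's smallest height while assuming no width growth. This is only a sketch of the search structure, not the paper's branching strategy or bounding function.

```python
def select(blocks):
    """blocks: list of implementation lists [(w, h), ...] per block.
    Returns (minimal stacked area, chosen implementations)."""
    best = [float("inf"), None]
    min_h = [min(h for _, h in b) for b in blocks]    # bound ingredient

    def bound(i, w, h):
        # Optimistic completion: remaining blocks add only their min heights.
        return max(w, 1) * (h + sum(min_h[i:]))

    def branch(i, w, h, chosen):
        if bound(i, w, h) >= best[0]:
            return                                    # prune this subtree
        if i == len(blocks):
            best[0], best[1] = w * h, list(chosen)
            return
        for bw, bh in blocks[i]:
            branch(i + 1, max(w, bw), h + bh, chosen + [(bw, bh)])

    branch(0, 0, 0, [])
    return best[0], best[1]

# Two orientations each for two blocks, one fixed block.
blocks = [[(4, 2), (2, 4)], [(4, 1), (1, 4)], [(3, 3)]]
area, impl = select(blocks)
```

Because the bound never exceeds the true cost of any completion, pruning with it cannot discard the optimum; the efficiency of a real implementation hinges on how tight such bounds are on nonslicing floorplan graphs.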

Journal Article•DOI•
TL;DR: A two-chain maximum dominance problem, which is of interest in its own right, is considered, and its applications to other very large-scale integration layout problems are shown.
Abstract: A topological via minimization problem in a two-layer routing environment is examined. The problem of minimizing the number of vias needed to route n two-terminal nets in a bounded routing region is shown to be NP-hard. However, in the case of a two-shore routing region, the topological via minimization problem can be solved in O(n² log n) time. As a basis for the algorithm, a two-chain maximum dominance problem, which is of interest in its own right, is considered, and its applications to other very large-scale integration layout problems are shown.

Journal Article•DOI•
TL;DR: A unified approach is proposed for modeling gate oxide shorts in MOS transistors using lumped-element models that take into account the possible structures of gate oxide shorts and the resulting changes that affect the I-V characteristics of MOS transistors.
Abstract: A unified approach is proposed for modeling gate oxide shorts in MOS transistors using lumped-element models. These models take into account the possible structures of gate oxide shorts and the resulting changes that affect the I-V characteristics of MOS transistors. They can be used with the circuit simulator to predict the performance degradation of the VLSI circuit with gate oxide shorts. Demonstrated examples of the models show close agreement with the experimental data.

Journal Article•DOI•
TL;DR: In this scheme, self-checking techniques and built-in self-test techniques are combined in an original way to take advantage of each other and the result is a unified BIST scheme (UBIST), allowing high fault coverage for all tests needed for integrated circuits.
Abstract: An original built-in self-test (BIST) scheme is proposed aimed at covering some of the shortcomings of self-checking circuits and applicable to all tests needed for integrated circuits. In this scheme, self-checking techniques and built-in self-test techniques are combined in an original way to take advantage of each other. The result is a unified BIST scheme (UBIST), allowing high fault coverage for all tests needed for integrated circuits, e.g., offline test (design verification, manufacturing test, maintenance test) and online concurrent error detection. An important concept introduced is that of self-exercising checkers. The strongly code-disjoint property of the checkers is ensured for a very large class of fault hypotheses by internal test pattern generation, and the design of the checkers is simplified.

Journal Article•DOI•
Y.-H. Jun1, Ki Jun1, S.-B. Park1•
TL;DR: A delay model for multiple delay simulation for NMOS and CMOS logic circuits is proposed and shows that the proposed models can predict the delay times within 5% error and with a speedup of three orders of magnitude for several circuits tested as compared with the SPICE simulation.
Abstract: A delay model for multiple delay simulation for NMOS and CMOS logic circuits is proposed. For the simple inverter the rise or fall delay time is approximated by a product of polynomials of the input waveform slope, the output loading capacitance, and the device configuration ratio, with the polynomial coefficients determined so as to best fit the SPICE simulation results for a given fabrication process. This approach can easily be extended to the case of multiple-input transitions. The simulation results show that the proposed models can predict the delay times within 5% error and with a speedup of three orders of magnitude for several circuits tested as compared with the SPICE simulation.
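The model form in the abstract, a product of low-order polynomials in input slope s, load capacitance c, and configuration ratio r, is straightforward to evaluate. The coefficients below are invented placeholders standing in for values that would come from fitting SPICE results; only the structure is taken from the abstract.

```python
def poly(coeffs, x):
    """Evaluate a polynomial given coefficients low order first (Horner)."""
    acc = 0.0
    for a in reversed(coeffs):
        acc = acc * x + a
    return acc

def rise_delay(s, c, r, ks=(1.0, 0.3), kc=(0.5, 2.0), kr=(1.0, -0.2)):
    """Product-of-polynomials delay model; ks/kc/kr are placeholder
    fitted coefficients, one polynomial per physical variable."""
    return poly(ks, s) * poly(kc, c) * poly(kr, r)

d = rise_delay(s=0.2, c=0.1, r=1.0)
```

Because each factor is a cheap polynomial, the model trades SPICE's numerical integration for a handful of multiplies per transition, which is where the quoted three-orders-of-magnitude speedup comes from.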

Journal Article•DOI•
TL;DR: PEPPER is a computer program that simulates in one dimension the ion implantation, diffusion, oxidation, epitaxy, deposition, and etch processes used in VLSI technology and contains an efficient Monte Carlo ion implantation algorithm that includes explicit calculation of ion channeling and damage.
Abstract: PEPPER is a computer program that simulates in one dimension the ion implantation, diffusion, oxidation, epitaxy, deposition, and etch processes used in VLSI technology. The program contains an efficient Monte Carlo ion implantation algorithm that includes explicit calculation of ion channeling and damage. The key feature of the diffusion calculation is a general partial differential equation solver for rapid prototyping of physical models. The solver has been used to develop several unique diffusion models. A novel model for impurity diffusion in polysilicon treats the problem as a two-stream process, with relatively slow standard diffusion within the grain and a much more rapid component of diffusion along the grain boundaries. The two components are coupled at each point by a segregation term. Two other models for impurity diffusion in silicon include explicit calculation of the coupling of point defects with impurities. One of the point-defect models is a general and detailed formulation from a chemical kinetics viewpoint, while the other makes further assumptions to simplify the model for engineering analysis.
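The textbook core of such a 1-D diffusion step is an explicit finite-difference solution of Fick's second law, ∂C/∂t = D ∂²C/∂x². PEPPER's actual engine is a general PDE solver with coupled two-stream and point-defect models; the sketch below shows only the basic single-impurity calculation, with made-up grid and diffusivity values. Stability of the explicit scheme requires D·Δt/Δx² ≤ 1/2.

```python
def diffuse(profile, d_coeff, dx, dt, steps):
    """Explicit finite-difference diffusion of a 1-D impurity profile.
    Boundary concentrations are held fixed at their initial values."""
    c = list(profile)
    lam = d_coeff * dt / dx ** 2
    assert lam <= 0.5, "explicit scheme unstable for this dt/dx"
    for _ in range(steps):
        nxt = list(c)
        for i in range(1, len(c) - 1):
            # Discrete Laplacian: flux in from both neighbours.
            nxt[i] = c[i] + lam * (c[i - 1] - 2 * c[i] + c[i + 1])
        c = nxt
    return c

# Delta-like doping spike in the middle of a 21-point grid spreads out.
prof = [0.0] * 21
prof[10] = 1.0
out = diffuse(prof, d_coeff=1.0, dx=1.0, dt=0.25, steps=40)
```

An implicit or adaptive solver, as in a real process simulator, removes the time-step restriction and allows the concentration-dependent and coupled diffusivities the abstract describes.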