Showing papers presented at "Asia and South Pacific Design Automation Conference in 1997"


Proceedings ArticleDOI
28 Jan 1997
TL;DR: A new stochastic optimization method for general floorplanning, named genetic simulated annealing (GSA), is proposed; it is based on a new representation for non-slicing floorplans called the bounded slicing grid (BSG) structure.
Abstract: A new method of non-slicing floorplanning is proposed, which is based on the new representation for non-slicing floorplans, called bounded slicing grid (BSG) structure. We developed a new greedy algorithm based on the BSG structure, running in linear time, to select the alternative shape for each soft block so as to minimize the overall area for general floorplan, including non-slicing structures. We propose a new stochastic optimization method, named genetic simulated annealing (GSA) for general floorplanning. Based on BSG structure, we extend SA-based local search and GA-based global crossover to L-shaped, T-shaped blocks and obtain high density packing of rectilinear blocks.

76 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: A mapping from sequence-pair to rectangular dissection is given, which represents channels by line segments, and candidate arrangements of modules and channels are successfully represented with the generality and the efficiency inherited from the seq-pair.
Abstract: A fundamental issue in floorplanning is in how to represent candidate solutions. A representation called sequence-pair was recently proposed. Seq-pair is so general as to represent an area minimum placement, and also efficient because it does not represent any overlapping placement. However, seq-pair is not expressive enough since channels are not represented. The paper gives a mapping from seq-pair to rectangular dissection, which represents channels by line segments. Consequently, candidate arrangements of modules and channels are successfully represented with the generality and the efficiency inherited from the seq-pair.
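
To make the claim that a sequence-pair never encodes an overlapping placement more concrete, here is a minimal sketch of decoding a sequence-pair into coordinates. It assumes the commonly cited convention (a before b in both sequences means a is left of b; a after b in the first sequence and before b in the second means a is below b). The function name `place`, the module names, and the sizes are hypothetical, and this is not the paper's mapping to rectangular dissection.

```python
def place(gamma_pos, gamma_neg, sizes):
    """Decode a sequence-pair into lower-left coordinates (O(n^2) sketch).

    Assumed convention:
      a before b in both sequences                 -> a is left of b
      a after b in gamma_pos, before b in gamma_neg -> a is below b
    sizes maps each module name to (width, height).
    """
    pos_index = {m: i for i, m in enumerate(gamma_pos)}
    x, y = {}, {}
    # Processing in gamma_neg order guarantees every module that must be
    # left of or below b has already been placed.
    for j, b in enumerate(gamma_neg):
        x[b], y[b] = 0, 0
        for a in gamma_neg[:j]:
            if pos_index[a] < pos_index[b]:
                x[b] = max(x[b], x[a] + sizes[a][0])  # a is left of b
            else:
                y[b] = max(y[b], y[a] + sizes[a][1])  # a is below b
    return x, y


def overlaps(a, b, x, y, sizes):
    """True if the open rectangles of modules a and b intersect."""
    return (x[a] < x[b] + sizes[b][0] and x[b] < x[a] + sizes[a][0]
            and y[a] < y[b] + sizes[b][1] and y[b] < y[a] + sizes[a][1])


sizes = {"a": (2, 3), "b": (2, 1), "c": (1, 2)}
x, y = place(["a", "b", "c"], ["b", "a", "c"], sizes)
print({m: (x[m], y[m]) for m in sizes})
```

Because every pair of modules receives either a horizontal or a vertical constraint, no two rectangles can overlap under this decoding.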

41 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: An integer programming (IP) based approach taken in the OSCAR (Optimum Simultaneous sCheduling, Allocation and Resource assignment) synthesis system, which extends the set of library components considered in architectural synthesis by components with built-in chaining (BIC).
Abstract: Extends the set of library components which are usually considered in architectural synthesis by components with built-in chaining (BIC). For such components, the result of some internally computed arithmetic function is made available as an argument to some other function through a local connection. These components can be used to implement chaining in a data-path in a single component. Components with BIC are combinatorial circuits. They correspond to "complex gates" in logic synthesis. If compared to implementations with several components, components with BIC usually provide a denser layout, reduced power consumption and a shorter delay time. Multiplier/accumulators are the most prominent example of such components. Such components require new approaches for library mapping in architectural synthesis. In this paper, we describe an integer programming (IP) based approach taken in our OSCAR (Optimum Simultaneous sCheduling, Allocation and Resource assignment) synthesis system.

39 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: An algorithm to find the minimum clock-period of a circuit whose signal propagation delays are given is proposed, and experimental results show that this technique achieves as much as about 16% reduction of clock-period compared with the conventional maximum signal delay based methods.
Abstract: It is known that the clock-period in a sequential circuit can be shorter than the maximum signal delay between registers if the clock arrival time to each register is controlled. We propose an algorithm to find the minimum clock-period of a circuit whose signal propagation delays are given. Experimental results on LGSynth93 benchmarks show that this technique achieves as much as about 16% reduction of clock-period compared with the conventional maximum signal delay based methods. An application of this technique to improve the reliability of circuits is considered.
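
The first sentence of the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a simplified textbook model in which only setup-style constraints are kept (for each register-to-register path i to j with maximum delay d, s_i + d <= s_j + T, where s is the clock arrival time and T the period) and hold constraints are ignored. Feasibility of a given T is checked with Bellman-Ford on the difference constraints, and T is found by bisection. All names and the two-register example are hypothetical.

```python
def feasible(num_regs, paths, period, eps=1e-12):
    """Check whether clock arrival times exist for the given period.

    Each path (src, dst, delay) imposes s_src - s_dst <= period - delay.
    The system is solvable iff the constraint graph has no negative cycle.
    """
    dist = [0.0] * num_regs  # implicit virtual source with zero-weight edges
    for _ in range(num_regs):
        changed = False
        for src, dst, delay in paths:
            w = period - delay
            if dist[dst] + w < dist[src] - eps:
                dist[src] = dist[dst] + w
                changed = True
        if not changed:
            return True
    return False  # still relaxing after num_regs passes: negative cycle


def min_clock_period(num_regs, paths, iterations=60):
    """Bisect on the period between 0 and the maximum path delay."""
    lo, hi = 0.0, max(d for _, _, d in paths)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if feasible(num_regs, paths, mid):
            hi = mid
        else:
            lo = mid
    return hi


# Two registers in a loop: 0 -> 1 takes 10 time units, 1 -> 0 takes 4.
paths = [(0, 1, 10.0), (1, 0, 4.0)]
zero_skew_period = max(d for _, _, d in paths)
skewed_period = min_clock_period(2, paths)
print(zero_skew_period, round(skewed_period, 6))
```

In this toy loop the period is bounded by the average delay around the cycle (7) rather than by the maximum delay (10), which is the effect the abstract describes.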

38 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: A hybrid approach that combines the advantages of BDD-based and ATPG-based approaches is introduced and a technique called partial justification is incorporated to explore the sequential similarity between the two circuits under verification to speed up the verification process.
Abstract: In this paper, we address the problem of verifying the equivalence of two sequential circuits. A hybrid approach that combines the advantages of BDD-based and ATPG-based approaches is introduced. Furthermore, we incorporate a technique called partial justification to explore the sequential similarity between the two circuits under verification to speed up the verification process. Compared with existing approaches, our method is much less vulnerable to the memory explosion problem, and therefore can handle larger designs. The experimental results show that in a few minutes of CPU time, our tool can verify the sequential equivalence of an intensively optimized benchmark circuit with hundreds of flip-flops against its original version.

34 citations


Book ChapterDOI
28 Jan 1997
TL;DR: An Evolutionary Algorithm that learns good heuristics for OKFDD minimization starting from a given set of basic operations and combines high quality results with reasonable time overhead is presented.
Abstract: Ordered Kronecker Functional Decision Diagrams (OKFDDs) are a data structure for efficient representation and manipulation of Boolean functions. OKFDDs are very sensitive to the chosen variable ordering and the decomposition type list, i.e. the size may vary from linear to exponential. In this paper we present an Evolutionary Algorithm (EA) that learns good heuristics for OKFDD minimization starting from a given set of basic operations. The difference to other previous approaches to OKFDD minimization is that the EA does not solve the problem directly. Rather, it develops strategies for solving the problem. To demonstrate the efficiency of our approach experimental results are given. The newly developed heuristics combine high quality results with reasonable time overhead.

32 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: This paper introduces and characterizes a family of dynamic Markov trees that can model the complex spatiotemporal correlations which occur during power estimation in both combinational and sequential circuits.
Abstract: Presents an effective and robust technique for compacting a large sequence of input vectors into a much smaller input sequence so as to reduce the circuit/gate-level simulation time by orders of magnitude and maintain the accuracy of the power estimates. In particular, this paper introduces and characterizes a family of dynamic Markov trees that can model the complex spatiotemporal correlations which occur during power estimation in both combinational and sequential circuits. As the results demonstrate, large compaction ratios of 1-2 orders of magnitude can be obtained without a significant loss (less than 5% on average) in the accuracy of the power estimates.

29 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: A numerical noise analysis method for oscillators is proposed that can be applied to strongly nonlinear circuits and thermal noise, shot noise and flicker noise are considered as noise sources.
Abstract: A numerical noise analysis method for oscillators is proposed. Noise sources are usually small and can be considered as perturbations to a large amplitude oscillation. Transfer functions from each noise source to the oscillator output can be calculated by modeling the oscillator as a linear periodic time-varying circuit. The proposed method is a time domain method and can be applied to strongly nonlinear circuits. Thermal noise, shot noise and flicker noise are considered as noise sources. Error in the time domain method is also discussed.

27 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: This paper clarifies the representational power of bit-level Decision Diagrams and demonstrates that a restriction of the K(*)BMD concept to subclasses, such as OBDDs, MTBDDs and (*)BMDs as well, results in families of functions which lose their efficient representation.
Abstract: Several types of Decision Diagrams (DDs) have been proposed in the area of Computer Aided Design (CAD), among them being bit-level DDs like OBDDs, OFDDs and OKFDDs. While the aforementioned types of DDs are suitable for representing Boolean functions at the bit-level and have proved useful for a lot of applications in CAD, recently DDs to represent integer-valued functions, like MTBDDs (=ADDs), EVBDDs, FEVBDDs, (*)BMDs, HDDs (=KBMDs), and K*BMDs, attract more and more interest, e.g., using *BMDs it was for the first time possible to verify multipliers of bit length up to n=256. In this paper we clarify the representational power of these DD classes. Several (inclusion) relations and (exponential) gaps between specific classes differing in the availability of additive and/or multiplicative edge weights and in the choice of decomposition types are shown. It turns out, for example, that K(*)BMDs, a generalization of OKFDDs to the word-level, also "include" OBDDs, MTBDDs and (*)BMDs. On the other hand, it is demonstrated that a restriction of the K(*)BMD concept to subclasses, such as OBDDs, MTBDDs, (*)BMDs as well, results in families of functions which lose their efficient representation.

24 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: A new technique is presented for computing noise in nonlinear circuits based on a formulation that uses harmonic power spectral densities (HPSDs), using which a block-structured matrix relation between the second-order statistics of noise within a circuit is derived.
Abstract: A new technique is presented for computing noise in nonlinear circuits. The method is based on a formulation that uses harmonic power spectral densities (HPSDs), using which a block-structured matrix relation between the second-order statistics of noise within a circuit is derived. The HPSD formulation is used to devise a harmonic-balance-based noise algorithm that requires O(nN log N) time and O(nN) memory, where n represents circuit size and N the number of harmonics of the large-signal steady state. The method treats device noise sources with arbitrarily shaped PSDs (including thermal, shot and flicker noises), handles noise input correlations, and computes correlations between different outputs. The HPSD formulation is also used to establish the non-intuitive result that bandpass filtering of cyclostationary noise can result in stationary noise. The new technique is illustrated using an example that exhibits noise folding and interaction between harmonic PSD components. The results are validated against Monte-Carlo simulations. The noise performance of a large industrial integrated RF circuit (with >300 nodes) is also analyzed in less than 2 hours using the new method.

21 citations


Proceedings ArticleDOI
28 Jan 1997
TL;DR: This paper shows that the relation $E_{SW} = 2p(1-p)$ between switching activity and signal probability also gives the expected value of the transition activity in any sequence that satisfies the given signal probability (averaged over all such sequences).
Abstract: In current probability calculation algorithms for power estimation, switching activity $E_{SW}$ of a node is calculated from its signal probability $p$ by the following simple relation: $E_{SW} = 2p(1-p)$. It is generally understood that this simple relationship holds under the temporal independence assumption for the node. This paper however shows that the above equation also gives the expected value of the transition activity in any sequence that satisfies the given signal probability (averaged over all such sequences). Therefore, this equation can be used to calculate the switching activity under more general conditions than previously thought.
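
The averaging argument can be explored numerically. The sketch below rests on one assumed reading of "averaged over all such sequences", which the abstract does not spell out: every binary sequence of length n with exactly k ones is taken as equally likely, and activity is the fraction of adjacent positions that differ. Under that assumption the exact finite-length average is 2p(1-p) scaled by n/(n-1), which tends to 2p(1-p) as n grows. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def average_transition_activity(n, k):
    """Average fraction of adjacent-bit transitions over all length-n
    binary sequences that contain exactly k ones (signal probability k/n)."""
    total = Fraction(0)
    count = 0
    for ones in combinations(range(n), k):
        bits = [0] * n
        for i in ones:
            bits[i] = 1
        transitions = sum(bits[i] != bits[i + 1] for i in range(n - 1))
        total += Fraction(transitions, n - 1)
        count += 1
    return total / count


n, k = 12, 3
p = Fraction(k, n)
avg = average_transition_activity(n, k)
formula = 2 * p * (1 - p)
print(float(avg), float(formula))
```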

Proceedings ArticleDOI
28 Jan 1997
TL;DR: This paper introduces the notion of cycle-accurate macro-models for RT-level power evaluation, which provide the capability to estimate the circuit power dissipation cycle by cycle at RT-level without the need to invoke low level simulations.
Abstract: This paper introduces the notion of cycle-accurate macro-models for RT-level power evaluation. These macro-models provide us with the capability to estimate the circuit power dissipation cycle by cycle at RT-level without the need to invoke low level simulations. The statistical framework allows us to compute the error interval for the predicted value from the user specified confidence level. The proposed macro-model generation strategy has been applied to a number of RT-level blocks and detailed results and comparisons are provided.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: ChipEst-FPGA is a chip level estimator for designs implemented using a hierarchical design methodology for Lookup Table Based FPGAs; it uses a realistic model which takes the component area/delay as well as wiring effects into account.
Abstract: The importance of efficient area and timing estimation techniques for hierarchical design methodology is well-established in High-Level Synthesis (HLS), since the estimation allows more realistic exploration of the design space, and hierarchical design methodology matches well with HLS paradigm. In this paper, we present ChipEst-FPGA, a chip level estimator for designs implemented using a hierarchical design methodology for Lookup Table Based FPGAs. In FPGAs, the wire delay may contribute to a significant portion of the overall design delay. ChipEst-FPGA uses a realistic model which takes the component area/delay as well as wiring effects into account. We tested our ChipEst-FPGA on several benchmarks and the results show that we can get accurate area and timing estimates efficiently.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: A design method for AND-OR-EXOR three-level networks, where a single two-input EXOR gate is used, is presented, and the μ-equivalence of logic functions is introduced to develop minimization algorithms for EX-SOPs with up to five variables.
Abstract: Presents a design method for AND-OR-EXOR three-level networks, where a single two-input EXOR gate is used. The network realizes an exclusive-OR of two sum-of-products expressions (EX-SOP), where the two sum-of-products expressions (SOPs) cannot share products. The problem is to minimize the total number of products in the two SOPs. We introduced the μ-equivalence of logic functions to develop minimization algorithms for EX-SOPs with up to five variables. We minimized all the representative functions of NP-equivalence classes for up to five variables and found that five-variable functions require up to nine products in minimum EX-SOPs. For n-variable functions, minimum EX-SOPs require at most $9 \cdot 2^{n-5}$ ($n \ge 6$) products. This upper bound is smaller than $2^{n-1}$, the upper bound for conventional SOPs.
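
As a quick arithmetic check of the two upper bounds quoted in the abstract, the sketch below tabulates them for a few values of n. The function names are hypothetical; only the two formulas come from the abstract.

```python
from fractions import Fraction


def ex_sop_upper_bound(n):
    """Upper bound on products in a minimum EX-SOP, stated for n >= 6."""
    if n < 6:
        raise ValueError("the bound 9 * 2**(n - 5) is stated for n >= 6")
    return 9 * 2 ** (n - 5)


def sop_upper_bound(n):
    """Upper bound on products in a conventional SOP."""
    return 2 ** (n - 1)


for n in range(6, 11):
    ratio = Fraction(ex_sop_upper_bound(n), sop_upper_bound(n))
    print(n, ex_sop_upper_bound(n), sop_upper_bound(n), ratio)
```

The ratio of the two bounds is 9/16 for every n >= 6, since both grow as powers of two.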

Proceedings ArticleDOI
28 Jan 1997
TL;DR: This paper describes how to help the designer in this task, by providing a flexible co-simulation environment in which these alternatives can be interactively evaluated.
Abstract: Current design methodologies for embedded systems often force the designer to evaluate early in the design process architectural choices that will heavily impact the cost and performance of the final product. Examples of these choices are hardware/software partitioning, choice of the micro-controller, and choice of a run-time scheduling method. This paper describes how to help the designer in this task, by providing a flexible co-simulation environment in which these alternatives can be interactively evaluated.

Proceedings ArticleDOI
Masahiro Fujita, R. Murgai
28 Jan 1997
TL;DR: This paper surveys state-of-the-art methods for estimation and optimization of delays of logic circuits at the technology-independent stage, where reasonably accurate estimation techniques exist even though final delays cannot be completely predicted.
Abstract: Logic synthesis has two stages of optimization: technology-independent and technology-dependent. This paper surveys state-of-the-art methods for estimation and optimization of delays of logic circuits at the technology-independent stage. Although at this stage we cannot completely predict final delays after technology mapping, there exist reasonably accurate estimation techniques. Final delays can be reduced with optimization techniques that use such estimation.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: It will be demonstrated that AND/OR reasoning graphs allow us to naturally extend basic notions of two-level switching circuit theory to multi-level circuits, and it is proved that AND/OR reasoning graphs represent all these implicants.
Abstract: This paper presents a technique to determine prime implicants in multi-level combinational networks. The method is based on a graph representation of Boolean functions called AND/OR reasoning graphs. This representation follows from a search strategy to solve the satisfiability problem that is radically different from conventional search for this purpose (such as exhaustive simulation, backtracking, BDDs). The paper shows how to build AND/OR reasoning graphs for arbitrary combinational circuits and proves basic theoretical properties of the graphs. It will be demonstrated that AND/OR reasoning graphs allow us to naturally extend basic notions of two-level switching circuit theory to multi-level circuits. In particular, the notions of prime implicants and permissible prime implicants are defined for multi-level circuits and it is proved that AND/OR reasoning graphs represent all these implicants. Experimental results are shown for PLA factorization.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: An outline of a concurrent cell generation and mapping strategy is shown, and a method to map an input Boolean network into CMOS transistor network is proposed.
Abstract: The conventional technology mapping method is selecting cells from a limited standard library, and the performance of the resultant circuit deeply depends on the characteristics of the library. To realize detailed optimization not limited by an instance of cell library and to reduce the maintenance cost of standard cell libraries, a novel paradigm for technology mapping, in which cell generation and mapping can be executed concurrently, is considered. This paper shows an outline of a concurrent cell generation and mapping strategy, and proposes a method to map an input Boolean network into CMOS transistor network. The transduction in transistor level is introduced for cell generation and the Dynamic Programming is utilized for cell assignment.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: This paper proposes a new approach to realize a very high performance real-time OS using VLSI technology and the most basic system calls have been designed in order to confirm the effectiveness of this method.
Abstract: This paper proposes a new approach to realize a very high performance real-time OS using VLSI technology. In order to confirm the effectiveness of this method, the most basic system calls have been designed. According to the evaluation results based on a gate array implementation, hardware portion of system calls can be executed within 4 clocks and the task scheduler can be performed in only 8 clocks simultaneously, which are about 130 to 1880 times faster than software implementation.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: In this article, the authors considered three types of ternary decision diagrams (TDDs): AND-TDDs, EXOR-TDDs, and Kleene-TDDs.
Abstract: Three types of ternary decision diagrams (TDDs) are considered: AND-TDDs, EXOR-TDDs, and Kleene-TDDs. Kleene-TDDs are useful for logic simulation in the presence of unknown inputs. Let N(BDD:f), N(AND-TDD:f), and N(EXOR-TDD:f) be the number of non-terminal nodes in the BDD, the AND-TDD, and the EXOR-TDD for f, respectively. Let N(Kleene-TDD:F) be the number of non-terminal nodes in the Kleene-TDD for F, where F is the Kleenean ternary function corresponding to f. Then N(BDD:f) ≤ N(TDD:f). For parity functions, N(BDD:f)=N(AND-TDD:f)=N(EXOR-TDD:f)=N(Kleene-TDD:F). For unate functions, N(BDD:f)=N(AND-TDD:f). The sizes of Kleene-TDDs are $O(3^n/n)$ and $O(n^3)$ for arbitrary functions and symmetric functions, respectively. There exists a 2n-variable function for which Kleene-TDDs require $O(n)$ nodes with the best order, while they require $O(3^n)$ nodes in the worst order.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: A class of FPGA architectures in which the mapping problem remains NP-complete, even with $6(W-1)^2 + 6W^2$ SpSB (this is close to the maximum number of SpSB, which is $6W^2$).
Abstract: It has been observed experimentally that the mapping of global to detailed routing in a conventional FPGA routing architecture (2D array) yields unpredictable results. A different class of FPGA structures called greedy routing architectures (GRAs), where a locally optimal switch box routing can be extended to an optimal entire-chip routing, were investigated by Wu et al. (1994), Takashima et al. (1996) and Wu et al. (1996). It was shown that GRAs have good mapping properties. An H-tree GRA with $W^2 + 2W$ switches per switch box (SpSB) and a 2D array GRA with $4W^2 + 2W$ SpSB were proposed by those authors ($W$ is the number of tracks in each switch box). We continue this work by introducing an H-tree GRA with $W^2/2 + 2W$ SpSB and a 2D array GRA with $3.5W^2 + 2W$ SpSB. These new GRAs have the same good mapping properties but use fewer switches. We also show a class of FPGA architectures in which the mapping problem remains NP-complete, even with $6(W-1)^2 + 6W^2$ SpSB (this is close to the maximum number of SpSB, which is $6W^2$). Thus, more switches do not necessarily result in more routability.

Proceedings ArticleDOI
H. Ochi
28 Jan 1997
TL;DR: A new approach is proposed that makes it possible for every undergraduate student to perform experiments of developing a pipelined RISC processor within limited time available for the course.
Abstract: This paper proposes a new approach that makes it possible for every undergraduate student to perform experiments of developing a pipelined RISC processor within limited time available for the course. The approach consists of 4 steps; at the first step, modeling of pipelined RISC processor is simplified by avoiding structural hazard and by ignoring other hazards, and in the succeeding steps, students learn difficulties of pipelining by themselves. An educational FPGA board ASAver.1 and results of feasibility study are also shown.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: An efficient approach to the synthesis of CA (Cellular Architecture)-type FPGAs is presented, which removes the need for generating minimal SOP or ESOP expressions which can be costly in some cases.
Abstract: In this paper, an efficient approach to the synthesis of CA (Cellular Architecture)-type FPGAs is presented. To exploit the array structure of cells in CA-type FPGAs, logic expressions called Maitra terms, which can be mapped directly to the cell arrays, are generated. In this approach, a BDD is modified so that each node of the BDD has another branch which is an exclusive-OR of the two branches of a node. Once the modified BDD is obtained, a traversal of the BDD is sufficient to generate the Maitra terms needed. Since a BDD can be traversed in O(n) steps, where n is the number of nodes in the BDD, Maitra terms are generated very efficiently. This also removes the need for generating minimal SOP or ESOP expressions, which can be costly in some cases. The experiments show that the proposed method generates better results than existing methods.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: In this paper, a new HDL called AIDL (Architecture- and Implementation-level Description Language) is proposed, and three processors are described and compared in both AIDL and VHDL descriptions.
Abstract: In order to design advanced processors in a short time, designers must simulate their designs and reflect the results to the designs at the very early stages. However, conventional hardware description languages (HDLs) do not have enough ability to describe designs easily and accurately at these stages. Thus, we have proposed a new HDL called AIDL (Architecture- and Implementation-level Description Language). In this paper, in order to evaluate the effectiveness of AIDL, we describe and compare three processors in both AIDL and VHDL descriptions.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: A high performance edge detection architecture for real-time image processing applications that is capable of producing one edge-pixel every clock cycle and can process 30 frames per second.
Abstract: We present a high performance edge detection architecture for real-time image processing applications. The architecture is finely pipelined. The proposed ASIC is capable of producing one edge-pixel every clock cycle. At a clock rate of 10 MHz, the architecture can process 30 frames per second, where the size of each frame is 640×480 8-bit pixels. The ASIC was laid out and fabricated using Samsung's 0.8 μm double-metal CMOS process.
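
The throughput figures in the abstract can be checked with a few lines of arithmetic. The sketch below assumes, as the abstract states, one edge-pixel per clock cycle; the variable names are hypothetical and pipeline fill and blanking overhead are not modeled.

```python
WIDTH, HEIGHT = 640, 480        # frame size in pixels, from the abstract
FRAMES_PER_SECOND = 30          # frame rate, from the abstract
CLOCK_HZ = 10_000_000           # 10 MHz clock, one pixel per cycle

pixels_per_second = WIDTH * HEIGHT * FRAMES_PER_SECOND
utilization = pixels_per_second / CLOCK_HZ

print(pixels_per_second)        # required pixel rate
print(f"{utilization:.4f}")     # fraction of clock cycles used
```

The required rate of 9,216,000 pixels per second fits within the 10,000,000 cycles available each second, which is consistent with the stated 30 frames per second.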

Proceedings ArticleDOI
28 Jan 1997
TL;DR: The method exploits common subexpressions among constants based on hierarchical clustering and reduces the number of shifts, additions, and subtractions to solve the Multiple Constant Multiplication problem.
Abstract: In this paper, we propose an efficient solution for the Multiple Constant Multiplication (MCM) problem. The method exploits common subexpressions among constants based on hierarchical clustering and reduces the number of shifts, additions, and subtractions. The algorithm defines appropriate weights which indicate the operation priorities and selects the common subexpressions which result in the least number of local operations. It can also be extended to various high-level synthesis tasks such as arbitrary linear transforms. Experimental results show the effectiveness of our method.
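
The idea of sharing subexpressions among constants can be shown on a hand-picked toy case. The sketch below is not the paper's hierarchical clustering algorithm; the constants 11 and 27 and the function names are hypothetical choices made only to show how reusing one product reduces the number of additions.

```python
def multiply_separately(x):
    """Compute 11*x and 27*x independently from their binary expansions."""
    y11 = (x << 3) + (x << 1) + x              # 11 = 0b1011, 2 additions
    y27 = (x << 4) + (x << 3) + (x << 1) + x   # 27 = 0b11011, 3 additions
    return y11, y27, 5                         # total additions used


def multiply_shared(x):
    """Reuse 11*x as a common subexpression, since 27 = 16 + 11."""
    y11 = (x << 3) + (x << 1) + x              # 2 additions
    y27 = (x << 4) + y11                       # 1 addition
    return y11, y27, 3                         # total additions used


print(multiply_separately(7))
print(multiply_shared(7))
```

Both versions produce the same products; the shared version needs three additions instead of five in this example.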

Proceedings ArticleDOI
28 Jan 1997
TL;DR: This paper proposes a scheduling method which derives an optimal schedule achieving the minimum iteration period and latency for a given signal processing algorithm on the specified processor array.
Abstract: In high-level synthesis for digital signal processing systems of array structured architecture, one of the most important procedures is the scheduling. By taking into account the allocation of operations to processors, it is mandatory to take into account the communication time between processors. In this paper we propose a scheduling method which derives an optimal schedule achieving the minimum iteration period and latency for a given signal processing algorithm on the specified processor array. The scheduling problem is modeled as an integer linear programming and solved by an ILP solver. Furthermore, we improve the scheduling method so that it can be applied to large scale signal processing algorithms without degrading the schedule optimality.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: This work proposes a placement tool that allows arbitrarily sized and shaped convex components to be placed, and extends the rectangle-packing method proposed by Kajitani.
Abstract: When designing integrated circuits, sub-components rarely end up being perfectly rectangular. However, currently most block-placers only consider rectangular components, resulting in inefficient area utilization. We propose a placement tool that allows arbitrarily sized and shaped convex components. It extends the rectangle-packing method proposed by Kajitani. We describe the methods used to create the placement and give some performance results.

Proceedings ArticleDOI
28 Jan 1997
TL;DR: The design, the implementation, and the performance test of the Serial Viterbi decoder (SVD) using VHDL and FPGAs are described and it is shown that the SVD works well.
Abstract: This paper describes the design, the implementation, and the performance test of the Serial Viterbi decoder (SVD) using VHDL and FPGAs. The decoding scheme assumes the transmitted symbols were coded with a K=9, 32 Kbps, rate 1/2 convolutional encoder with generator functions $g_0 = (753)_8$ and $g_1 = (561)_8$ as defined in the JTC TAG-7 W-CDMA PCS standard. The SVD is designed using VHDL and implemented using FPGAs. The main algorithm is implemented in two Altera FLEX81500 FPGAs. The performance test results with 3 dB Gaussian noise show that the SVD works well.
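
For orientation, here is a minimal sketch of the kind of encoder the decoder assumes: constraint length K=9, rate 1/2, generators 753 and 561 in octal. The bit-ordering convention (most significant generator bit taps the current input bit) and the output order (g0 bit first, then g1 bit) are assumptions, not details taken from the abstract or verified against the standard, and the function names are hypothetical.

```python
K = 9          # constraint length, from the abstract
G0 = 0o753     # generator g0 in octal, from the abstract
G1 = 0o561     # generator g1 in octal, from the abstract


def parity(value):
    """Return 1 if value has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1


def encode(bits):
    """Rate-1/2 convolutional encoding: two output bits per input bit.

    Assumed convention: the newest input bit enters at the most
    significant position of the K-bit shift register.
    """
    state = 0
    out = []
    for b in bits:
        state = ((b & 1) << (K - 1)) | (state >> 1)
        out.append(parity(state & G0))
        out.append(parity(state & G1))
    return out


impulse_response = encode([1] + [0] * (K - 1))
print(impulse_response)
```

Feeding a single 1 followed by zeros reads out the generator taps one position at a time, which is a convenient sanity check on the register direction.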

Proceedings ArticleDOI
28 Jan 1997
TL;DR: Experiments on a set of benchmarks demonstrate that combining entropy-based power measures with input-output correlation analyses of logic functions leads to a viable measure for high-level power estimation.
Abstract: In this paper, we present a study on the relationship between entropy and the average power consumption of circuits generated from Boolean functions. Based on a general-delay model, an entropy-based formulation for power estimation is derived from a large set of experimental data. The study shows that the entropy measure provides an effective power estimate for single-output and fully-correlated multiple-output functions. The study also shows that if entropy is used as a power measure, the internal structure of a circuit must be considered in order to achieve accurate power estimates for non-correlated multiple-output functions. Experiments on a set of benchmarks demonstrate that combining entropy-based power measures with input-output correlation analyses of logic functions leads to a viable measure for high-level power estimation.