
Showing papers in "Design Automation for Embedded Systems in 2004"


Journal ArticleDOI
TL;DR: This paper surveys and classifies different techniques used for watermarking IP designs, and defines several evaluation criteria that can be used as a benchmark for new IP watermarking developments.
Abstract: Intellectual property (IP) block reuse is essential for facilitating the design process of system-on-a-chip. Sharing IP designs poses significant security risks. Recently, digital watermarking emerged as a candidate solution for the copyright protection of IP blocks. In this paper, we survey and classify the different techniques used for watermarking IP designs. To this end, we define several evaluation criteria, which can also be used as a benchmark for new IP watermarking developments. Furthermore, we establish a comprehensive set of requirements for future IP watermarking techniques.
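One family of techniques that such surveys typically cover is constraint-based watermarking, in which signature-derived constraints are added to an NP-hard design step so that the published solution carries the owner's mark. The sketch below is only illustrative, using a toy graph-colouring instance and invented names; real schemes embed constraints into actual CAD problems such as placement, routing or FSM encoding.

```python
# Minimal, illustrative sketch of constraint-based IP watermarking; not any
# specific scheme from the survey. The "design problem" here is a toy graph
# colouring instance, and all names and inputs are hypothetical.
import hashlib
import itertools

def signature_edges(signature: str, nodes, count):
    """Derive extra 'watermark' edges pseudo-randomly from the owner's signature."""
    digest = hashlib.sha256(signature.encode()).digest()
    pairs = list(itertools.combinations(sorted(nodes), 2))
    picked, i = [], 0
    while len(picked) < count and i < len(digest) - 1:
        idx = (digest[i] << 8 | digest[i + 1]) % len(pairs)
        if pairs[idx] not in picked:
            picked.append(pairs[idx])
        i += 2
    return picked

def greedy_colouring(nodes, edges):
    """Greedy graph colouring; stands in for the real optimisation step."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    colour = {}
    for n in nodes:
        used = {colour[m] for m in adj[n] if m in colour}
        colour[n] = next(c for c in itertools.count() if c not in used)
    return colour

nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
wm_edges = signature_edges("Alice's IP core", nodes, count=2)

solution = greedy_colouring(nodes, edges + wm_edges)
# Verification: the published solution also satisfies the signature-derived
# constraints, which an independently produced solution is unlikely to do.
assert all(solution[a] != solution[b] for a, b in wm_edges)
print(solution, wm_edges)
```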

88 citations


Journal ArticleDOI
TL;DR: A software tool is described that assists system designers in moduli selection for the design of RNS-based FIR filters and filter banks, and indicates that a certain level of trade-off among delay, area and power consumption exists for RNS-based filter and filter bank implementations using different moduli sets.
Abstract: Moduli selection is one of the most important issues in the implementation of systems that make use of residue number systems. In this paper, we describe a software tool that assists system designers in moduli selection for the design of RNS-based FIR filters and filter banks. Based on a set of filter specification parameters, the software tool constructs valid moduli sets and calculates their estimated implementation costs in terms of delay, area and power consumption, using results obtained in logic synthesis. The moduli set that is most suitable for the user requirements is selected, together with its estimated cost, as the output. The outputs of the software tool also indicate that a certain level of trade-off among delay, area and power consumption exists for RNS-based filter and filter bank implementations using different moduli sets.
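As a rough illustration of what such a tool automates, the sketch below (not the authors' tool) enumerates moduli sets of the popular {2^n - 1, 2^n, 2^n + 1} form, keeps those whose dynamic range covers the filter's requirement, and ranks them with a purely hypothetical cost model standing in for the synthesis-based delay/area/power estimates.

```python
# Illustrative sketch only: construct candidate RNS moduli sets, filter out
# invalid ones, and rank the rest by an invented cost model.
from math import gcd, prod

def pairwise_coprime(moduli):
    return all(gcd(a, b) == 1 for i, a in enumerate(moduli) for b in moduli[i + 1:])

def candidate_sets(max_n):
    for n in range(2, max_n + 1):
        yield (2**n - 1, 2**n, 2**n + 1)

def estimated_cost(moduli, w_delay=1.0, w_area=1.0, w_power=1.0):
    # Hypothetical stand-in for synthesis-based estimates: wider channels cost more.
    widths = [m.bit_length() for m in moduli]
    return w_delay * max(widths) + w_area * sum(widths) + w_power * sum(widths)

def select_moduli(required_range, max_n=12):
    valid = [s for s in candidate_sets(max_n)
             if pairwise_coprime(s) and prod(s) >= required_range]
    return min(valid, key=estimated_cost) if valid else None

# Example: a FIR filter whose accumulator needs roughly 2^20 of dynamic range.
print(select_moduli(2**20))   # -> (127, 128, 129) under this toy cost model
```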

39 citations


Journal ArticleDOI
TL;DR: A single-source approach is proposed, that is, the use of the same code for system-level specification and profiling, and, after architectural mapping, for HW/SW co-simulation and embedded software generation.
Abstract: There is a clear need for new methodologies supporting efficient design of embedded systems on complex platforms implementing both hardware and software modules. Software development has to be carried out under a closer relationship with the underlying platform. The current trend is towards an increasing embedded software development effort under more stringent time-to-market requirements. As a consequence, it is necessary to reduce software generation cost while maintaining reliability and design quality. In that context, languages centered on describing whole systems, with software and hardware parts, have been proposed. Among these, SystemC is gaining increasing interest as a specification language for embedded systems. SystemC supports the specification of the complete system and the modeling of the platform. In this paper, the application of SystemC to performance analysis and embedded software generation is discussed. A single-source approach is proposed, that is, the use of the same code for system-level specification and profiling, and, after architectural mapping, for HW/SW co-simulation and embedded software generation. A design environment based on C++ libraries for performance analysis and software generation is presented. This approach avoids working on intermediate formats and translators, which facilitates the designer's interaction with the system description throughout the development process. Additionally, it ensures the preservation of the computational models used for the system specification during architectural mapping and compilation.

20 citations


Journal ArticleDOI
TL;DR: This paper aims to introduce a time partitioning algorithm, an important step during the design process for fully reconfigurable systems, which divides the input task graph model into an optimal number of partitions and puts each task in the appropriate partition so that the latency of the input task graph is optimal.
Abstract: This paper aims to introduce a time partitioning algorithm, which is an important step during the design process for fully reconfigurable systems. This algorithm is used to solve the time partitioning problem. It divides the input task graph model into an optimal number of partitions and puts each task in the appropriate partition so that the latency of the input task graph is optimal. A part of this paper is also devoted to the implementation of some examples on a fully reconfigurable architecture following our approach.
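To make the time-partitioning idea concrete, here is a minimal sketch (not the authors' algorithm): tasks are levelised over the dependence graph and then packed, in level order, into reconfiguration contexts under an invented area budget; the real algorithm additionally optimises the number of partitions and the overall latency.

```python
# Toy temporal partitioning of a task graph onto a reconfigurable device.
# Task names, areas and the area budget are made up for illustration.

def asap_levels(tasks, deps):
    """deps[t] = set of predecessors of t; returns t -> ASAP level (0-based)."""
    level = {}
    def visit(t):
        if t not in level:
            level[t] = 1 + max((visit(p) for p in deps.get(t, ())), default=-1)
        return level[t]
    for t in tasks:
        visit(t)
    return level

def partition(tasks, deps, area, budget):
    level = asap_levels(tasks, deps)
    ordered = sorted(tasks, key=lambda t: level[t])
    partitions, current, used = [], [], 0
    for t in ordered:
        # Level order keeps every dependence pointing forward; a new
        # reconfiguration context is opened when the area budget would overflow.
        if used + area[t] > budget and current:
            partitions.append(current)
            current, used = [], 0
        current.append(t)
        used += area[t]
    if current:
        partitions.append(current)
    return partitions

tasks = ["t1", "t2", "t3", "t4", "t5"]
deps = {"t3": {"t1", "t2"}, "t4": {"t3"}, "t5": {"t3"}}
area = {"t1": 40, "t2": 30, "t3": 50, "t4": 20, "t5": 20}
print(partition(tasks, deps, area, budget=80))   # [['t1','t2'], ['t3','t4'], ['t5']]
```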

15 citations


Journal ArticleDOI
TL;DR: A multi-module, multi-port memory design procedure is described that satisfies area and/or energy constraints for embedded applications by applying loop transformations and reordering array accesses, followed by memory allocation and assignment procedures based on ILP models and heuristic-based algorithms.
Abstract: In this paper we describe a multi-module, multi-port memory design procedure that satisfies area and/or energy constraints for embedded applications. Our procedure consists of the application of loop transformations and the reordering of array accesses to reduce the memory bandwidth, followed by memory allocation and assignment procedures based on ILP models and heuristic-based algorithms. The specific problems include determination of (a) the memory configuration with minimum area, given the energy bound, (b) the memory configuration with minimum energy, given the area bound, and (c) the array allocation such that the energy consumption is minimum for a given memory configuration (number of modules, size and number of ports per module). The results obtained by the heuristics match well with those obtained by the ILP methods.
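The following toy greedy heuristic is a stand-in for the assignment step of problem (c) above, with invented module parameters and array profiles; the paper's ILP models solve this kind of assignment exactly, and its heuristics are more elaborate than this sketch.

```python
# Toy greedy array-to-module assignment: frequently accessed arrays go to the
# cheapest module that can still hold them. All numbers are invented.

modules = [  # (name, capacity in words, energy per access, arbitrary units)
    ("small_fast", 1024, 1.0),
    ("large_slow", 8192, 3.5),
]
arrays = [  # (name, size in words, accesses per frame)
    ("coeff", 256, 50000),
    ("line_buf", 720, 20000),
    ("frame_buf", 4096, 8000),
]

def assign(arrays, modules):
    free = {name: cap for name, cap, _ in modules}
    energy = {name: e for name, _, e in modules}
    placement, total = {}, 0.0
    # Most frequently accessed arrays get first pick of the cheapest module.
    for name, size, accesses in sorted(arrays, key=lambda a: -a[2]):
        fitting = [m for m in free if free[m] >= size]
        if not fitting:
            raise ValueError(f"no module can hold {name}")
        best = min(fitting, key=lambda m: energy[m])
        placement[name] = best
        free[best] -= size
        total += accesses * energy[best]
    return placement, total

print(assign(arrays, modules))
```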

11 citations


Journal ArticleDOI
TL;DR: This paper applies an assertion checking methodology, based on Linear Temporal Logic and Logic of Constraints, to the system design of network processors, and demonstrates that the assertion-based methodology is very useful for both system-level verification and design exploration.
Abstract: System level modeling with executable languages such as C/C++ has been crucial in the development of large electronic systems from general processors to application specific designs. To make sure that the executable models behave as they should, the designers often have to "eye-ball" the simulation traces and at best, apply simple "assert" statements or write simple trace checkers in some scripting languages. The problem is the lack of a concise and formal method to specify and check desired properties, whether they be functional or performance in nature. In this paper, we apply assertion checking methodology to the system design of network processors. Functional and performance assertions, based on Linear Temporal Logic and Logic of Constraints, are written during the design process. Trace checkers and simulation monitors are automatically generated to validate particular simulation runs or to analyze their performance characteristics. Several categories of assertions are checked throughout the design process, such as equivalence, functionality, transaction, and performance. We demonstrate that the assertion-based methodology is very useful for both system level verification and design exploration.
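For flavour, a checker of the kind such a flow generates automatically might look like the sketch below: a Logic-of-Constraints-style performance assertion ("every packet leaves within MAX_LATENCY time units of arriving") evaluated over a recorded simulation trace. The event format, bound and trace values are invented.

```python
# Hedged sketch of a generated trace checker, not the authors' generator.
MAX_LATENCY = 50  # hypothetical bound, in simulation time units

trace = [  # (timestamp, event, packet index) as a simulator might log them
    (5, "in", 0), (12, "in", 1), (30, "out", 0),
    (47, "in", 2), (60, "out", 1), (110, "out", 2),
]

def check_latency(trace, bound):
    """Check the LOC-style constraint t(out[i]) - t(in[i]) <= bound for all i."""
    in_time, violations = {}, []
    for t, event, idx in trace:
        if event == "in":
            in_time[idx] = t
        elif event == "out":
            latency = t - in_time[idx]
            if latency > bound:
                violations.append((idx, latency))
    return violations

print(check_latency(trace, MAX_LATENCY))   # packet 2 violates: latency 63 > 50
```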

9 citations


Journal ArticleDOI
TL;DR: The approach proposed in this paper introduces a hardware/software co-design framework for developing complex embedded systems based on the combined use of UML and the B language for system modeling and design, and the seamless transition from UML specifications to system descriptions in B.
Abstract: The approach proposed in this paper introduces a hardware/software co-design framework for developing complex embedded systems. The method relies on formal proof of system properties at every phase of the co-design cycle. The key concept is the combined use of UML and the B language for system modeling and design, and the seamless transition from UML specifications to system descriptions in B. The final system prototype emerges from correct-by-construction subsystems described in the B language; the hardware components are translated into VHDL/SystemC, while C/C++ is used for the software components. The outcome is a formally proven correct system implementation. The efficiency of the proposed method is demonstrated through the design of a case study from the telecommunication domain.

8 citations


Journal ArticleDOI
TL;DR: This paper aims to develop a fast and automatic prototyping process dedicated to parallel architectures especially suited to static executives, based on the SynDEx CAD software, which improves algorithm implementation on multiprocessor architectures by finding the best match between algorithms and architectures.
Abstract: Embedded real-time signal, image and control applications have very significant time constraints and thus require the use of several powerful numerical calculation units. Self-timed scheduling developed from single-processor applications cannot take advantage of multiprocessor architectures: manual data transfers and synchronizations quickly become very complex, leading to wasted time and potential deadlocks. We aim to develop a fast and automatic prototyping process dedicated to parallel architectures especially suited to static executives. The process is based on the SynDEx CAD software, which improves algorithm implementation on multiprocessor architectures by finding the best match between algorithms and architectures. This paper describes how to use SynDEx, from the high-level description of an application and its functional verification to the final optimized multiprocessor execution. New SynDEx kernels are described for automatic multi-PC and multi-DSP dedicated code generation. Finally, we demonstrate the effectiveness of the approach with a full MPEG-4 codec application distributed over PC and multi-DSP platforms.
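At the heart of such a match between algorithm and architecture is a heuristic mapping/scheduling step. The sketch below is a generic greedy list scheduler with a fixed communication penalty, using invented tasks, durations and processor names; it is not SynDEx's own heuristic.

```python
# Greedy list scheduling of a task graph onto processors, with a fixed
# inter-processor transfer penalty. Everything here is illustrative.
COMM = 3  # hypothetical transfer cost between processors

tasks = ["grab", "dct_y", "dct_c", "vlc"]               # listed in dependence order
deps = {"dct_y": ["grab"], "dct_c": ["grab"], "vlc": ["dct_y", "dct_c"]}
duration = {"grab": 4, "dct_y": 10, "dct_c": 10, "vlc": 5}
procs = ["pc", "dsp0", "dsp1"]

def schedule(tasks, deps, duration, procs):
    ready_at = {p: 0 for p in procs}          # when each processor becomes free
    placed, finish = {}, {}
    for t in tasks:
        best = None
        for p in procs:
            # Data is ready once every predecessor has finished, plus a
            # transfer penalty if it ran on a different processor.
            data_ready = max((finish[d] + (COMM if placed[d] != p else 0)
                              for d in deps.get(t, [])), default=0)
            end = max(ready_at[p], data_ready) + duration[t]
            if best is None or end < best[0]:
                best = (end, p)
        end, p = best
        placed[t], finish[t], ready_at[p] = p, end, end
    return placed, max(finish.values())

print(schedule(tasks, deps, duration, procs))   # makespan 22 vs 29 on one processor
```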

6 citations


Journal ArticleDOI
TL;DR: A method of polynomial simulation to calculate switching activities in a general-delay logic circuit is described, a generalization of the exact signal probability evaluation method due to Parker and McCluskey, which has been extended to handle temporal correlation and arbitrary transport delays.
Abstract: We describe a method of polynomial simulation to calculate switching activities in a general-delay logic circuit. This method is a generalization of the exact signal probability evaluation method due to Parker and McCluskey, which has been extended to handle temporal correlation and arbitrary transport delays. The method can target both combinational and sequential circuits. Our method is parameterized by a single parameter l, which determines the speed-accuracy tradeoff. l indicates the depth in terms of logic levels over which spatial signal correlation is taken into account. This is done by only taking into account reconvergent paths whose length is at most l. The rationale is that ignoring spatial correlation for signals that reconverge after many levels of logic introduces negligible error. When l = L, where L is the total number of levels of logic in the circuit, the method will produce the exact switching activity under a zero delay model, taking into account all internal correlation. We present results that show that the error in the switching activity and power estimates is very small even for small values of l. In fact, for most of the examples, power estimates with l = 0 are within 5% of the exact. However, this error can be higher than 20% for some examples. More robust estimates are obtained with l = 2, providing a good compromise between speed and accuracy.
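As a point of reference for the l = 0 extreme of that trade-off, the sketch below propagates signal probabilities gate by gate assuming independent inputs (so reconvergent-fanout correlation is ignored) and reports the zero-delay switching activity 2p(1 - p) per net; the paper's polynomial method keeps symbolic terms so that correlation up to depth l is recovered. The netlist is invented.

```python
# Drastically simplified signal-probability propagation, roughly the l = 0 case.

def propagate(netlist, primary):
    """netlist: list of (output, gate, inputs) in topological order."""
    p = dict(primary)                       # net -> probability of being 1
    for out, gate, ins in netlist:
        if gate == "AND":
            prob = 1.0
            for i in ins:
                prob *= p[i]
        elif gate == "OR":
            prob = 1.0
            for i in ins:
                prob *= (1.0 - p[i])
            prob = 1.0 - prob
        elif gate == "NOT":
            prob = 1.0 - p[ins[0]]
        p[out] = prob
    return p

def switching_activity(p):
    # Zero-delay, temporally independent model: activity of a net is 2p(1-p).
    return {net: 2.0 * q * (1.0 - q) for net, q in p.items()}

primary = {"a": 0.5, "b": 0.5, "c": 0.5}
netlist = [("n1", "AND", ["a", "b"]),
           ("n2", "OR",  ["n1", "c"]),
           ("y",  "NOT", ["n2"])]
print(switching_activity(propagate(netlist, primary)))
```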

4 citations


Journal ArticleDOI
TL;DR: A predictive precharging scheme is proposed to reduce bitline leakage energy consumption, and results show that the energy savings are significant with little performance degradation.
Abstract: As technology scales down into deep-submicron, leakage energy is becoming a dominant source of energy consumption. Leakage energy is generally proportional to the area of a circuit and caches constitute a large portion of the die area. Therefore, there has been much effort to reduce leakage energy in caches. Most techniques have been targeted at cell leakage energy optimization. Bitline leakage energy is critical as well. To this end, we propose a predictive precharging scheme to reduce bitline leakage energy consumption. Results show that energy savings are significant with little performance degradation. Also, our predictive precharging is more beneficial in more aggressively scaled technologies.
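A behavioural toy model of the idea (not the paper's circuit technique) is sketched below: only the sub-bank that a trivial "same bank as last time" predictor expects to be accessed is kept precharged, and a misprediction costs an on-demand precharge cycle. Bank count, trace and costs are invented.

```python
# Toy behavioural model of predictive precharging; all numbers are invented.
NUM_BANKS = 4

def simulate(bank_trace):
    predicted = 0                 # hypothetical predictor: "same bank as last time"
    precharges_saved, penalty_cycles = 0, 0
    for bank in bank_trace:
        precharges_saved += NUM_BANKS - 1   # the other banks were left un-precharged
        if bank != predicted:
            penalty_cycles += 1             # precharge this bank on demand first
        predicted = bank
    return precharges_saved, penalty_cycles

# With spatial locality the stream mostly stays in one bank, so most precharges
# (and their bitline leakage) are avoided at a small latency cost.
trace = [0, 0, 0, 0, 1, 1, 1, 2, 2, 0, 0, 0]
print(simulate(trace))   # (36, 3) for this invented trace
```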

4 citations


Journal ArticleDOI
TL;DR: An automated framework is presented that partitions the code and data types of object-oriented source code for the needs of data management, identifying the crucial data types from a data management perspective and separating them from the rest of the code.
Abstract: We present an automated framework that partitions the code and data types for the needs of data management in object-oriented source code. The goal is to identify the crucial data types from a data management perspective and separate them from the rest of the code. In this way, the design complexity is reduced, allowing the designer to easily focus on the important parts of the code to perform further refinements and optimizations. To achieve this, static and dynamic analysis is performed on the initial C++ specification code. Based on the analysis results, the data types of the application are characterized as crucial or non-crucial. Subsequently, the initial code is rewritten automatically in such a way that the crucial data types and the code portions that manipulate them are separated from the rest of the code. Experiments on well-known multimedia and telecom applications demonstrate the correctness of the performed automated analysis and code rewriting as well as the applicability of the introduced framework in terms of execution time and memory requirements. Comparisons with Rational's Quantify™ suite show the failure of Quantify™ to correctly analyze the initial code for the needs of data management.
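The classification step can be pictured with the simplified sketch below (not the framework itself): dynamic-analysis access counts per data type are ranked, and the types that together account for most of the traffic are marked crucial. The profile numbers and the 90% coverage cut-off are invented.

```python
# Simplified crucial/non-crucial classification from a hypothetical access profile.
def classify(access_profile, coverage=0.90):
    total = sum(access_profile.values())
    crucial, covered = [], 0
    for dtype, count in sorted(access_profile.items(), key=lambda kv: -kv[1]):
        if covered >= coverage * total:
            break
        crucial.append(dtype)
        covered += count
    non_crucial = [t for t in access_profile if t not in crucial]
    return crucial, non_crucial

profile = {  # invented counts, as dynamic analysis of C++ code might report them
    "PacketBuffer": 1_200_000,
    "MacroblockList": 800_000,
    "ConfigRecord": 900,
    "LogEntry": 300,
}
print(classify(profile))   # (['PacketBuffer', 'MacroblockList'], ['ConfigRecord', 'LogEntry'])
```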

Journal ArticleDOI
TL;DR: The paradox that it can be advantageous to wait for intervals longer than the shutdown threshold is explained, and it is shown that using idle period lengths instead of interarrival periods “blurs” the input distribution, leading to non-optimal decisions.
Abstract: A power-aware system can reduce its energy dissipation by dynamically powering off during idle periods and powering on again upon a new service request arrival. We minimize the dissipated energy by selecting the optimal waiting interval before powering off, under consideration of the expected time of the next arrival. This approach has already been proposed in the past, using the idle-time distribution rather than the interarrival periods captured at the moment of service completion. Algorithms proposed in the literature utilize the history of idle periods or assume a vanishing service time. There has been no clear proposition on how service time affects the time instant of the power-off decision; rather, whenever service time has been significant, a "blurred" image of the system's characteristics and a correspondingly approximate optimal policy have resulted. We clearly show, analytically and experimentally, that the idle-time distribution should not be used as a primary design input, since it is the product of two separate inputs: the interarrival times and the service times. We give insight into the problem, using a mechanical equivalent established at the moment of service completion of all pending requests, and show through analytical examples how service time affects the power-off decision. We explain the paradox that it can be advantageous to wait for intervals longer than the shutdown threshold (which is a system characteristic) and show how using idle period lengths instead of interarrival periods "blurs" the input distribution, leading to non-optimal decisions. Our contribution is to define and solve the proper problem, relying solely on the interarrival distribution. Further, we examine the problem under the framework of competitive analysis. We show how the interarrival distribution that maximizes the competitive ratio, namely the exponential distribution, interferes with power management; it renders the optimization procedure worthless through its "memoryless property". Exponential interarrivals, irrespective of the service time pattern, are the marginal case where we cannot obtain energy gains. In all other cases the framework we promote ensures considerable advantages compared to other approaches in the literature. Moreover, it leads to a self-contained module, implementable in software or hardware, which is based on an iterative formula and thus reduces power management calculations significantly. Here we exploit all operational features of the problem in proposing an implementation which spreads computations over the whole of the waiting period. We extensively compare our results numerically, both against claimed expectations and against previous proposals. The outcome fully supports our framework as the most appropriate one for the application of power management.
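As a numerical illustration of the optimisation the paper argues for (not the authors' iterative formula), the sketch below takes the gaps measured from service completion to the next arrival and picks the waiting threshold that minimises expected energy under a deliberately simple cost model with invented parameters.

```python
# Pick the power-off waiting threshold that minimises expected energy, using
# interarrival gaps measured at service completion. P_IDLE, E_WAKE and the
# sample gaps are invented; a real policy would use the fitted distribution.
P_IDLE = 1.0    # power drawn while waiting, arbitrary units
E_WAKE = 30.0   # energy to power off and back on

def expected_energy(threshold, gaps):
    total = 0.0
    for g in gaps:
        if g <= threshold:          # request arrived before we powered off
            total += P_IDLE * g
        else:                       # powered off at the threshold, paid the wake-up energy
            total += P_IDLE * threshold + E_WAKE
    return total / len(gaps)

def best_threshold(gaps, candidates):
    return min(candidates, key=lambda t: expected_energy(t, gaps))

# Gaps measured from service completion to the next arrival, not idle periods.
gaps = [2, 3, 5, 8, 40, 55, 70, 90, 120, 4]
t_star = best_threshold(gaps, candidates=range(0, 121, 5))
print(t_star, expected_energy(t_star, gaps))   # threshold 10 wins for these samples
```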

Journal ArticleDOI
TL;DR: The presented method has been successfully tested on a large experimental benchmark, attaining quality levels close to those provided by the Synopsys Behavioral Compiler, and shows a significant improvement in the process time, while keeping the good precision and fidelity levels that the traditional estimation models usually offer.
Abstract: As codesign problems become larger and more realistic, the time required to estimate their solutions becomes an important bottleneck. This paper presents a new approach to improving the traditional estimation techniques in order to avoid this drawback. The presented method has been successfully tested on a large experimental benchmark, attaining quality levels close to those provided by the Synopsys Behavioral Compiler. Finally, a case study based on the standard H.261 video codec is described, demonstrating the suitability of the technique for real-life situations. The obtained results show a significant improvement in the process time, while keeping the good precision and fidelity levels that traditional estimation models usually offer.