
Showing papers by "Zili Shao" published in 2004


Proceedings ArticleDOI
10 May 2004
TL;DR: A new efficient algorithm for dynamic SPT update that avoids the disadvantages of static SPT update algorithms, based on an understanding of the update procedure that reduces redundancy.
Abstract: Shortest path tree (SPT) construction is critical to high-performance routing in an interior network using link-state protocols such as Open Shortest Path First (OSPF) and IS-IS. In this paper, we propose a new efficient algorithm for dynamic SPT update that avoids the disadvantages (e.g., redundant computation) of static SPT update algorithms. The new algorithm is based on an understanding of the update procedure that allows redundancy to be reduced: it focuses only on the significant elements that contribute to constructing the new SPT from the old one. Efficiency improves because the algorithm considers only the edges that actually matter to the update process. Experimental results show that the running time of the proposed algorithm is greatly reduced. Furthermore, our algorithm can easily be generalized to solve the SPT update problem in a graph with negative-weight edges and applied to the scenario of multiple edge weight changes.

16 citations
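
As a rough illustration of why a dynamic update can avoid redundant work, the sketch below handles a single edge-weight decrease by propagating changes only from the affected node, instead of rebuilding the tree. It is a generic textbook-style incremental update, not the algorithm from the paper above; the function names, the adjacency-dictionary graph format, and the restriction to non-negative weights are all assumptions made for this example.

```python
import heapq


def dijkstra(graph, source):
    """Build an SPT from scratch; returns (dist, parent) dictionaries."""
    dist = {node: float("inf") for node in graph}
    parent = {node: None for node in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, parent


def decrease_edge(graph, dist, parent, u, v, new_weight):
    """Update dist/parent in place after edge u->v gets a smaller weight.

    Returns the number of nodes whose distance changed.
    Assumes non-negative weights (hypothetical simplification).
    """
    graph[u][v] = new_weight
    if dist[u] + new_weight >= dist[v]:
        return 0  # the old SPT is still valid; nothing to recompute
    dist[v] = dist[u] + new_weight
    parent[v] = u
    touched = {v}
    heap = [(dist[v], v)]
    while heap:  # propagate the improvement only below v
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue
        for y, w in graph[x].items():
            if d + w < dist[y]:
                dist[y] = d + w
                parent[y] = x
                touched.add(y)
                heapq.heappush(heap, (dist[y], y))
    return len(touched)


# Small directed example graph (made up for illustration).
graph = {
    "s": {"a": 4, "b": 1},
    "a": {"c": 1},
    "b": {"a": 5, "c": 7},
    "c": {"d": 2},
    "d": {},
}
dist, parent = dijkstra(graph, "s")
touched = decrease_edge(graph, dist, parent, "b", "a", 1)
print(touched, dist)
```

In this example only the nodes below the changed edge are revisited, while the source and the unaffected branch are left alone; a static approach would recompute every node.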


Proceedings ArticleDOI
22 Sep 2004
TL;DR: It is shown that any "J+K" model loop can be legally fused using the legalizing fusion technique, and the experimental results show that the loop fusion technique always significantly reduces the schedule length.
Abstract: Loop fusion is commonly used to improve the instruction-level parallelism of loops for high-performance embedded computing systems. Loop fusion, however, is not always directly applicable because fusion-preventing dependencies may exist among loops. Most of the existing techniques still have limitations in fully exploiting the advantages of loop fusion. In this paper, we present a general loop fusion technique for loops or nested loops based on the loop dependency graph model, retiming, and multi-dimensional retiming concepts. We show that any "J+K" model loop can be legally fused using our legalizing fusion technique. Polynomial-time algorithms are developed to solve the loop fusion problem for "J+K" model loops considering both timing and code size of the final code. Our technique produces the final code and calculates the resultant code size directly from the retiming values. The experimental results show that our loop fusion technique always significantly reduces the schedule length.

16 citations
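
To make the idea of a fusion-preventing dependency concrete, here is a minimal one-dimensional sketch. The second loop reads `a[i + 1]`, so fusing the two loops directly is illegal; shifting the second statement by one iteration (the simplest form of retiming) makes the fusion legal at the cost of a one-statement prologue. The loops and function names are invented for illustration and do not reproduce the "J+K" model or the legalizing technique of the paper above.

```python
def unfused(x):
    """Reference semantics: two separate loops. Assumes len(x) >= 1."""
    n = len(x)
    a = [0] * n
    b = [0] * (n - 1)
    for i in range(n):          # loop 1
        a[i] = 2 * x[i]
    for i in range(n - 1):      # loop 2 depends on a[i + 1]
        b[i] = a[i + 1] + 1
    return a, b


def naive_fused(x):
    """Illegal fusion: b[i] reads a[i + 1] before loop 1 has produced it."""
    n = len(x)
    a = [0] * n
    b = [0] * (n - 1)
    for i in range(n):
        a[i] = 2 * x[i]
        if i < n - 1:
            b[i] = a[i + 1] + 1  # reads the initial 0, not the real value
    return a, b


def retimed_fused(x):
    """Legal fusion: the loop-2 statement is delayed by one iteration."""
    n = len(x)
    a = [0] * n
    b = [0] * (n - 1)
    a[0] = 2 * x[0]              # prologue created by the shift
    for i in range(1, n):
        a[i] = 2 * x[i]
        b[i - 1] = a[i] + 1      # a[i] was computed just above
    return a, b


x = [3, 1, 4, 1, 5]
print(unfused(x))
print(naive_fused(x))
print(retimed_fused(x))
```

The prologue is also where the code-size cost of retiming shows up, which is why the abstract treats timing and code size together.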


Proceedings Article
01 Jan 2004
TL;DR: The HSDefender (Hardware/Software Defender) technique considers protection and checking together to solve the buffer overflow problem in embedded systems by designing a secure instruction set and requiring third-party software developers to use secure instructions to call functions.
Abstract: With more embedded systems networked, it becomes an important research problem to effectively defend embedded systems against buffer overflow attacks and efficiently check if systems have been protected. In this paper, we propose the HSDefender (Hardware/Software Defender) technique that considers the protection and checking together to solve this problem. Our basic idea is to design a secure instruction set and require third-party software developers to use secure instructions to call functions. Then the security checking can be easily performed by system integrators even without the knowledge of the source code. We first classify buffer overflow attacks into two categories, stack smashing attacks and function pointer attacks, and then provide two corresponding defending strategies. We analyze the HSDefender technique in respect of hardware cost, security, and performance, and experiment with it on the SimpleScalar/ARM simulator using benchmarks from MiBench. The results show that HSDefender can defend a system against more types of buffer overflow attacks with less overhead compared with the previous work.

16 citations


Proceedings ArticleDOI
05 Apr 2004
TL;DR: The basic idea is to design a secure instruction set and require third-party software developers to use secure instructions to call functions, so that the security checking can be easily performed by system integrators even without the knowledge of the source code.
Abstract: With more embedded systems networked, it becomes an important research problem to effectively defend embedded systems against buffer overflow attacks and efficiently check if systems have been protected. In this paper, we propose the HSDefender (hardware/software Defender) technique that considers the protection and checking together to solve this problem. Our basic idea is to design a secure instruction set and require third-party software developers to use secure instructions to call functions. Then the security checking can be easily performed by system integrators even without the knowledge of the source code. We first classify buffer overflow attacks into two categories, stack smashing attacks and function pointer attacks, and then provide two corresponding defending strategies. We analyze the HSDefender technique in respect of hardware cost, security, and performance, and experiment with it on the SimpleScalar/ARM simulator using benchmarks from MiBench. The results show that HSDefender can defend a system against more types of buffer overflow attacks with less overhead compared with the previous work.

13 citations


Journal ArticleDOI
01 Jan 2004
TL;DR: An instruction-level energy-minimisation scheduling technique to reduce the energy consumption of applications on VLIW processors is proposed and it is formally proved that this problem is NP-complete.
Abstract: Switching activity and schedule length are the two most important factors that influence the energy consumption of an application executed on a VLIW (very long instruction word) processor. Considering these two factors together, we propose an instruction-level energy-minimisation scheduling technique to reduce the energy consumption of applications on VLIW processors. We first formally prove that this problem is NP-complete. Then three heuristic algorithms, MSAS, MLMSA, and EMSA, are proposed. While switching activity and schedule length are given higher priority in MSAS and MLMSA respectively, EMSA gives the best result considering both of them. The experimental results show that EMSA gives a 31.7% reduction in energy compared with the traditional list scheduling approach on average.

7 citations
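
The interaction between instruction order and switching activity can be sketched with a simple cost model. The example below assumes that switching activity is measured as the Hamming distance between consecutive instruction words on the bus, and that the four words are mutually independent so that any order is a legal schedule; both are assumptions for illustration, and the bit patterns are made up. None of the heuristics named above (MSAS, MLMSA, EMSA) is reproduced here.

```python
def hamming(a, b):
    """Number of bit positions in which two instruction words differ."""
    return bin(a ^ b).count("1")


def switching_activity(words):
    """Total bit toggles when the words are issued in the given order."""
    return sum(hamming(p, q) for p, q in zip(words, words[1:]))


# Hypothetical 4-bit instruction encodings.
original_order = [0b0000, 0b1111, 0b0001, 0b1110]
reordered = [0b0000, 0b0001, 0b1111, 0b1110]

print(switching_activity(original_order))  # 11 toggles
print(switching_activity(reordered))       # 5 toggles
```

Both orders have the same schedule length, yet the second toggles fewer than half as many bits, which is the kind of freedom an energy-aware scheduler can exploit.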


Proceedings ArticleDOI
01 Dec 2004
TL;DR: Two dynamic algorithms (MaxR, MinD) are proposed to reduce the number of node updates during the dynamic update of the shortest path tree (SPT).
Abstract: Previous approaches to the dynamic update of the shortest path tree (SPT) have mainly focused on the case of a single link state change. Little work has been done on the problem of deriving a new SPT from the old one for multiple link state decrements in a network that applies link-state routing protocols. The complexity of this problem arises because there is no accurate boundary of the nodes to be updated in an update process and because multiple decrements can accumulate. Two dynamic algorithms (MaxR, MinD) are proposed to reduce the number of node updates. Compared with other algorithms for the SPT update under multiple edge weight decrements, our algorithms perform fewer node updates during the dynamic update process. This is achieved by updating only part of the nodes in a branch of the SPT after a particular node is selected from a built node list. Simulation results are given to show our improvements.

4 citations


Proceedings ArticleDOI
15 Aug 2004
TL;DR: The experimental results show that the SPINE technique significantly outperforms both standard software pipelining and MD retiming, and that the computation time and code size of a software-pipelined loop nest are affected by the execution sequence and the retiming function.
Abstract: Software pipelining for nested loops remains a challenging problem for embedded system design. The existing software pipelining techniques for single loops can only explore the parallelism of the innermost loop, so the final timing performance is inferior. While multidimensional (MD) retiming can explore the outer loop parallelism, it introduces large overheads in loop index generation and code size due to transformation. We use MD retiming to model the software pipelining problem of nested loops. We show that the computation time and code size of a software-pipelined loop nest are affected by the execution sequence and the retiming function. The SPINE (software pipelining for nested loops) technique is proposed to generate fully parallelized loops efficiently with overheads as small as possible. The experimental results show that our technique significantly outperforms both standard software pipelining and MD retiming.

3 citations


Proceedings ArticleDOI
27 Sep 2004
TL;DR: The experimental results show that the proposed SAMLS (Switching-Activity Minimization Loop Scheduling) algorithm can greatly reduce both schedule length and bus switching activities compared with the previous work.
Abstract: This paper develops an instruction-level loop scheduling technique to reduce both execution time and bus switching activities for applications with loops on VLIW architectures. We propose an algorithm, SAMLS (Switching-Activity Minimization Loop Scheduling), to minimize both schedule length and switching activities for applications with loops. In the algorithm, we obtain the best schedule from the ones that are generated from an initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show that our algorithm can greatly reduce both schedule length and bus switching activities compared with the previous work.

3 citations


Book ChapterDOI
25 Aug 2004
TL;DR: It is formally proved that to find a schedule that has the minimal switching activities among all minimum-latency schedules with or without resource constraints is NP-complete.
Abstract: This paper studies the scheduling problem that minimizes both schedule length and switching activities for applications with loops on multiple-functional-unit architectures. We formally prove that to find a schedule that has the minimal switching activities among all minimum-latency schedules with or without resource constraints is NP-complete. An algorithm, SAMLS (Switching-Activity Minimization Loop Scheduling), is proposed to minimize both schedule length and switching activities. In SAMLS, the best schedule is selected from the ones generated from a given initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show our algorithm can greatly reduce both schedule length and switching activities compared with the previous work.

2 citations


Proceedings ArticleDOI
26 Apr 2004
TL;DR: It is proved that the heterogeneous assignment problem is NP-complete, and several algorithms are proposed to solve it; the best of them, DFG-assign-repeat, gives a reduction of 25.7% in total cost compared with the previous work.
Abstract: Summary form only given. In high-level synthesis for real-time digital signal processing (DSP) architectures using heterogeneous functional units (FUs), an important problem is how to assign a proper FU type to each operation of a DSP application and generate a schedule in such a way that all requirements can be met and the total cost can be minimized. We propose a two-phase approach to solve this problem. In the first phase, we solve the heterogeneous assignment problem, i.e., how to assign a proper FU type to a DSP application such that the total cost is minimized while the timing constraint is satisfied. In the second phase, based on the assignments obtained from the first phase, we propose a minimum resource scheduling algorithm to generate a schedule and a feasible configuration that uses as few resources as possible. We prove that the heterogeneous assignment problem is NP-complete and propose several algorithms to solve it. The experiments show that the algorithm DFG-assign-repeat is the best, giving a reduction of 25.7% in total cost compared with the previous work.

2 citations
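
The first phase described above can be illustrated on the easiest possible input: a simple chain of operations, where the total time is just the sum of the per-operation times. For that special case a small dynamic program over the accumulated time finds the cheapest FU-type assignment that meets the deadline. This is only a sketch under that chain assumption; the function name, the `(time, cost)` table, and the deadline are hypothetical, and it is not the DFG-assign-repeat algorithm from the paper.

```python
def assign_chain(options, deadline):
    """Cheapest FU-type assignment for a chain of operations.

    options[i] lists one (time, cost) pair per FU type for operation i.
    Returns (min_cost, chosen_type_indices), or None if no assignment
    finishes within the deadline.
    """
    # best[t] = (cost, choices) for the cheapest assignment of the
    # operations processed so far with total execution time exactly t.
    best = {0: (0, [])}
    for op in options:
        nxt = {}
        for t, (cost, choices) in best.items():
            for k, (dt, dc) in enumerate(op):
                nt = t + dt
                if nt > deadline:
                    continue  # this choice would violate the timing constraint
                candidate = (cost + dc, choices + [k])
                if nt not in nxt or candidate[0] < nxt[nt][0]:
                    nxt[nt] = candidate
        if not nxt:
            return None
        best = nxt
    return min(best.values(), key=lambda entry: entry[0])


# Three operations; type 0 is fast and expensive, type 1 slow and cheap.
options = [
    [(1, 10), (3, 4)],
    [(2, 8), (4, 3)],
    [(1, 6), (2, 2)],
]
print(assign_chain(options, 7))  # (14, [1, 0, 1])
print(assign_chain(options, 3))  # None
```

With a deadline of 7 the cheapest feasible choice uses the slow type for the first and third operations and the fast type for the second; tightening the deadline to 3 leaves no feasible assignment.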


01 Dec 2004
TL;DR: In this paper, a switching-activity minimization loop scheduling (SAMLS) algorithm is proposed to minimize both schedule length and switching activities for applications with loops on multiple-functional-unit architectures.
Abstract: This paper studies the scheduling problem that minimizes both schedule length and switching activities for applications with loops on multiple-functional-unit architectures. We formally prove that to find a schedule that has the minimal switching activities among all minimum-latency schedules with or without resource constraints is NP-complete. An algorithm, SAMLS (Switching-Activity Minimization Loop Scheduling), is proposed to minimize both schedule length and switching activities. In SAMLS, the best schedule is selected from the ones generated from a given initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show our algorithm can greatly reduce both schedule length and switching activities compared with the previous work.

Book ChapterDOI
25 Aug 2004
TL;DR: In this paper, an address assignment and scheduling algorithm for multiple-functional-unit processors is proposed to reduce code size and schedule length by first constructing a nice address assignment and then scheduling.
Abstract: DSP architectures typically provide indirect addressing modes with auto-increment and auto-decrement. Subsuming the address arithmetic into auto-increment and auto-decrement modes improves the size and performance of generated code. A lot of previous work has been done on address assignment optimization to achieve code size reduction by minimizing address operations for single-functional-unit processors. However, minimizing address operations alone may not directly reduce code size and schedule length for multiple-functional-unit processors. In this paper, we exploit address assignment and scheduling for multiple-functional-unit processors. Our approach is to first construct a nice address assignment and then do scheduling. By fully taking advantage of the address assignment during scheduling, code size and schedule length can be significantly reduced. We propose a multiple-functional-unit algorithm to do both address assignment and scheduling. The experimental results show that our algorithm can greatly reduce code size and schedule length compared to the previous work.
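
The effect of address assignment on address operations can be sketched with a simple cost count. The example assumes a single address register and a single functional unit: moving between two consecutive accesses is free when their addresses differ by at most one (covered by auto-increment or auto-decrement), and otherwise needs one explicit address operation. The variable names, access sequence, and layouts are made up, and the multiple-functional-unit algorithm of the paper is not reproduced.

```python
def address_ops(access_sequence, layout):
    """Count explicit address operations for a given memory layout.

    layout lists the variables in memory order; consecutive accesses whose
    addresses differ by more than 1 cannot use auto-increment/decrement.
    """
    addr = {var: i for i, var in enumerate(layout)}
    return sum(
        1
        for prev, cur in zip(access_sequence, access_sequence[1:])
        if abs(addr[cur] - addr[prev]) > 1
    )


sequence = ["a", "c", "a", "c", "b", "d", "b"]
print(address_ops(sequence, ["a", "b", "c", "d"]))  # 5 explicit operations
print(address_ops(sequence, ["a", "c", "b", "d"]))  # 0 explicit operations
```

Placing frequently adjacent variables next to each other removes every explicit address operation in this sequence, which is the saving that address assignment targets before scheduling is taken into account.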

Proceedings ArticleDOI
18 Oct 2004
TL;DR: The experimental results show that the proposed algorithm, SAMLS (switching-activity minimization loop scheduling), can greatly reduce both schedule length and bus switching activities compared with the previous work.
Abstract: This work develops an instruction-level loop scheduling technique to reduce both execution time and bus switching activities for applications with loops on VLIW architectures. We propose an algorithm, SAMLS (switching-activity minimization loop scheduling), to minimize both schedule length and switching activities for applications with loops. In the algorithm, we obtain the best schedule from the ones that are generated from an initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show that our algorithm can greatly reduce both schedule length and bus switching activities compared with the previous work.