Showing papers by "Zili Shao" published in 2004

Proceedings Article

Dynamic update of shortest path tree in OSPF

Bin Xiao, Jiannong Cao, Qingfeng Zhuge¹, Zili Shao¹, Edwin H.-M. Sha

10 May 2004

TL;DR: A new efficient algorithm for dynamic SPT update that avoids the disadvantages of static SPT update algorithms, reducing redundancy based on an understanding of the update procedure.

Abstract: Shortest path tree (SPT) construction is critical to high-performance routing in an interior network using link-state protocols such as Open Shortest Path First (OSPF) and IS-IS. In this paper, we propose a new efficient algorithm for dynamic SPT update to avoid the disadvantages (e.g., redundant computation) of static SPT update algorithms. The new algorithm is based on an understanding of the update procedure that reduces redundancy: it focuses only on the significant elements that contribute to the construction of the new SPT from the old one. The algorithm is more efficient because it pays attention only to the edges that really count for the update process. Experimental results show that the running time of the proposed algorithm is maximally reduced. Furthermore, our algorithm can be easily generalized to solve the SPT updating problem in a graph with negative-weight edges and applied to the scenario of multiple edge weight changes.
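
To make the contrast between static and dynamic SPT update concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it is a generic incremental repair for one edge-weight decrease with nonnegative weights, and the names `dijkstra` and `decrease_edge` are hypothetical. The static approach reruns Dijkstra from scratch; the dynamic one touches only nodes whose distance actually improves.

```python
import heapq

def dijkstra(graph, source):
    """Static SPT construction from scratch. graph: node -> {neighbor: weight}."""
    dist = {v: float("inf") for v in graph}
    parent = {v: None for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, parent

def decrease_edge(graph, dist, parent, u, v, new_w):
    """Lower the weight of edge (u, v) and repair dist/parent in place.

    Returns the set of nodes whose distance changed. Assumes nonnegative
    weights and that new_w is not larger than the old weight.
    """
    graph[u][v] = new_w
    changed = set()
    if dist[u] + new_w >= dist[v]:
        return changed  # the cheaper edge does not improve any path
    dist[v] = dist[u] + new_w
    parent[v] = u
    changed.add(v)
    heap = [(dist[v], v)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry
        for y, w in graph[x].items():
            if d + w < dist[y]:  # only improved nodes are propagated further
                dist[y] = d + w
                parent[y] = x
                changed.add(y)
                heapq.heappush(heap, (dist[y], y))
    return changed

if __name__ == "__main__":
    g = {"s": {"a": 1, "b": 5}, "a": {"c": 2}, "b": {"d": 1}, "c": {}, "d": {}}
    dist, parent = dijkstra(g, "s")
    print(decrease_edge(g, dist, parent, "s", "b", 2))  # nodes that were updated
```

In the small demo graph, lowering the weight of the edge from `s` to `b` updates only `b` and its descendant `d`; the subtree under `a` is never visited.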

16 citations

Proceedings Article

General loop fusion technique for nested loops considering timing and code size

Meilin Liu¹, Qingfeng Zhuge¹, Zili Shao¹, Edwin H.-M. Sha¹

University of Texas at Dallas¹

22 Sep 2004

TL;DR: It is shown that any "J+K" model loop can be legally fused using the legalizing fusion technique, and the experimental results show that the loop fusion technique always significantly reduces the schedule length.

Abstract: Loop fusion is commonly used to improve the instruction-level parallelism of loops for high-performance embedded computing systems. Loop fusion, however, is not always directly applicable because fusion-preventing dependencies may exist among loops. Most existing techniques still have limitations in fully exploiting the advantages of loop fusion. In this paper, we present a general loop fusion technique for loops or nested loops based on the loop dependency graph model, retiming, and multi-dimensional retiming concepts. We show that any "J+K" model loop can be legally fused using our legalizing fusion technique. Polynomial-time algorithms are developed to solve the loop fusion problem for "J+K" model loops considering both timing and code size of the final code. Our technique produces the final code and calculates the resultant code size directly from the retiming values. The experimental results show that our loop fusion technique always significantly reduces the schedule length.
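
A fusion-preventing dependency and its repair by shifting one loop body can be illustrated with a small Python sketch. This is a one-dimensional toy example under assumed loop bodies, not the paper's "J+K" legalizing algorithm; the function names are hypothetical. The second loop reads `a[i + 1]`, which the first loop has not yet produced at iteration `i`, so direct fusion is illegal. Delaying the second body by one iteration (a retiming of 1) makes the fusion legal at the cost of a one-statement prologue.

```python
def unfused(x):
    """Reference semantics: two separate loops. Assumes x is non-empty."""
    n = len(x)
    a = [0] * n
    b = [0] * (n - 1)
    for i in range(n):
        a[i] = 2 * x[i]
    for i in range(n - 1):
        b[i] = a[i] + a[i + 1]
    return b

def naive_fused(x):
    """Illegal fusion: b[i] reads a[i + 1] before it has been written."""
    n = len(x)
    a = [0] * n
    b = [0] * (n - 1)
    for i in range(n):
        a[i] = 2 * x[i]
        if i < n - 1:
            b[i] = a[i] + a[i + 1]  # a[i + 1] still holds its initial 0
    return b

def retimed_fused(x):
    """Legal fusion: the second body is delayed by one iteration."""
    n = len(x)
    a = [0] * n
    b = [0] * (n - 1)
    a[0] = 2 * x[0]  # prologue created by the shift
    for i in range(1, n):
        a[i] = 2 * x[i]
        b[i - 1] = a[i - 1] + a[i]  # both operands are already computed
    return b

if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(unfused(data), naive_fused(data), retimed_fused(data))
```

The prologue is the code-size cost of the shift, which is the kind of trade-off between timing and code size that the abstract refers to.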

16 citations

Proceedings Article

Security protection and checking in embedded system integration against buffer overflow attacks

Zili Shao, Chun Xue, Qingfeng Zhuge, Edwin H.-M. Sha, Bin Xiao

01 Jan 2004

TL;DR: The HSDefender (Hardware/Software Defender) technique considers protection and checking together to solve the buffer overflow problem in embedded systems by designing a secure instruction set and requiring third-party software developers to use secure instructions to call functions.

Abstract: With more embedded systems networked, it becomes an important research problem to effectively defend embedded systems against buffer overflow attacks and efficiently check if systems have been protected. In this paper, we propose the HSDefender (Hardware/Software Defender) technique that considers the protection and checking together to solve this problem. Our basic idea is to design a secure instruction set and require third-party software developers to use secure instructions to call functions. Then the security checking can be easily performed by system integrators even without knowledge of the source code. We first classify buffer overflow attacks into two categories, stack smashing attacks and function pointer attacks, and then provide two corresponding defending strategies. We analyze the HSDefender technique with respect to hardware cost, security, and performance, and experiment with it on the SimpleScalar/ARM simulator using benchmarks from MiBench. The results show that HSDefender can defend a system against more types of buffer overflow attacks with less overhead compared with the previous work.

16 citations

Proceedings Article

Security protection and checking in embedded system integration against buffer overflow attacks

Zili Shao, Chun Xue, Qingfeng Zhuge, Edwin H.-M. Sha, Bin Xiao¹

Hong Kong Polytechnic University¹

05 Apr 2004

TL;DR: The basic idea is to design a secure instruction set and require third-party software developers to use secure instructions to call functions, so that the security checking can be easily performed by system integrators even without the knowledge of the source code.

Abstract: With more embedded systems networked, it becomes an important research problem to effectively defend embedded systems against buffer overflow attacks and efficiently check if systems have been protected. In this paper, we propose the HSDefender (Hardware/Software Defender) technique that considers the protection and checking together to solve this problem. Our basic idea is to design a secure instruction set and require third-party software developers to use secure instructions to call functions. Then the security checking can be easily performed by system integrators even without knowledge of the source code. We first classify buffer overflow attacks into two categories, stack smashing attacks and function pointer attacks, and then provide two corresponding defending strategies. We analyze the HSDefender technique with respect to hardware cost, security, and performance, and experiment with it on the SimpleScalar/ARM simulator using benchmarks from MiBench. The results show that HSDefender can defend a system against more types of buffer overflow attacks with less overhead compared with the previous work.

13 citations

Journal Article

Algorithms and analysis of scheduling for low-power high-performance DSP on VLIW processors

Zili Shao¹, Qingfeng Zhuge¹, Youtao Zhang¹, Edwin H.-M. Sha¹

University of Texas at Dallas¹

01 Jan 2004

TL;DR: An instruction-level energy-minimisation scheduling technique to reduce the energy consumption of applications on VLIW processors is proposed and it is formally proved that this problem is NP-complete.

Abstract: Switching activity and schedule length are the two most important factors that influence the energy consumption of an application executed on a VLIW (very long instruction word) processor. Considering these two factors together, we propose an instruction-level energy-minimisation scheduling technique to reduce the energy consumption of applications on VLIW processors. We first formally prove that this problem is NP-complete. Then three heuristic algorithms, MSAS, MLMSA, and EMSA, are proposed. While switching activity and schedule length are given higher priority in MSAS and MLMSA respectively, EMSA gives the best result considering both of them. The experimental results show that EMSA gives a 31.7% reduction in energy compared with the traditional list scheduling approach on average.
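
As a rough illustration of why instruction order matters for switching activity, the sketch below uses a common modeling assumption that the abstract itself does not spell out: switching activity on an instruction bus is approximated by the Hamming distance between instruction words issued in consecutive cycles. The encodings and function names are hypothetical, and this is not MSAS, MLMSA, or EMSA.

```python
def hamming(a, b):
    """Number of bit positions in which two instruction words differ."""
    return bin(a ^ b).count("1")

def switching_activity(schedule):
    """Total bit toggles on one bus for words issued in consecutive cycles."""
    return sum(hamming(p, q) for p, q in zip(schedule, schedule[1:]))

if __name__ == "__main__":
    # Four independent operations with made-up 4-bit encodings.
    order_a = [0b0000, 0b1111, 0b0001, 0b1110]
    order_b = [0b0000, 0b0001, 0b1111, 0b1110]
    print(switching_activity(order_a), switching_activity(order_b))
```

Both orders have the same schedule length of four cycles, yet under this model they differ in bit toggles, which is why a scheduler can trade the two factors against each other.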

7 citations

Proceedings Article

Dynamic shortest path tree update for multiple link state decrements

Bin Xiao, Jiannong Cao, Qingfeng Zhuge¹, Zili Shao¹, Edwin H.-M. Sha

University of Texas at Dallas¹

01 Dec 2004

TL;DR: Two dynamic algorithms (MaxR, MinD) are proposed to reduce the number of node updates during the dynamic update of the shortest path tree (SPT).

Abstract: Previous approaches to dynamic update of the shortest path tree (SPT) have mainly focused on the case of one link state change. Little work has been done on the problem of deriving a new SPT from the old one for multiple link state decrements in a network that applies link-state routing protocols. The complexity of this problem arises because there is no accurate boundary on the nodes to be updated in an updating process and because multiple decrements can accumulate. Two dynamic algorithms (MaxR, MinD) are proposed to reduce the number of node updates. Compared with other algorithms for SPT update under multiple edge weight decrements, our algorithms require fewer node updates during the dynamic update process. This is achieved by updating only part of the nodes in a branch of the SPT after selecting a particular node from a built node list. Simulation results are given to show our improvements.

4 citations

Proceedings Article

Timing optimization of nested loops considering code size for DSP applications

Qingfeng Zhuge, Zili Shao, Edwin H.-M. Sha

15 Aug 2004

TL;DR: The experimental results show that the SPINE technique significantly outperforms both standard software pipelining and MD retiming, and it is shown that the computation time and code size of a software-pipelined loop nest are affected by the execution sequence and the retiming function.

Abstract: Software pipelining for nested loops remains a challenging problem for embedded system design. The existing software pipelining techniques for single loops can only explore the parallelism of the innermost loop, so the final timing performance is inferior. While multidimensional (MD) retiming can explore the outer loop parallelism, it introduces large overheads in loop index generation and code size due to transformation. We use MD retiming to model the software pipelining problem of nested loops. We show that the computation time and code size of a software-pipelined loop nest are affected by the execution sequence and the retiming function. An algorithm, software pipelining for nested loops (SPINE), is proposed to generate fully parallelized loops efficiently with overheads as small as possible. The experimental results show that our technique significantly outperforms both standard software pipelining and MD retiming.

3 citations

Proceedings Article

Switching-Activity Minimization on Instruction-Level Loop Scheduling for VLIW DSP Applications

Zili Shao¹, Qingfeng Zhuge¹, Meilin Liu¹, Bin Xiao¹, Edwin H.-M. Sha¹

University of Texas at Dallas¹

27 Sep 2004

TL;DR: The experimental results show that the proposed SAMLS (Switching-Activity Minimization Loop Scheduling) algorithm can greatly reduce both schedule length and bus switching activities compared with the previous work.

Abstract: This paper develops an instruction-level loop scheduling technique to reduce both execution time and bus switching activities for applications with loops on VLIW architectures. We propose an algorithm, SAMLS (Switching-Activity Minimization Loop Scheduling), to minimize both schedule length and switching activities for applications with loops. In the algorithm, we obtain the best schedule from the ones that are generated from an initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show that our algorithm can greatly reduce both schedule length and bus switching activities compared with the previous work.

3 citations

Book Chapter

Loop Scheduling for Real-Time DSPs with Minimum Switching Activities on Multiple-Functional-Unit Architectures

Zili Shao¹, Qingfeng Zhuge¹, Meilin Liu¹, Edwin H.-M. Sha¹, Bin Xiao²

University of Texas at Dallas¹, Hong Kong Polytechnic University²

25 Aug 2004

TL;DR: It is formally proved that finding a schedule with minimal switching activities among all minimum-latency schedules, with or without resource constraints, is NP-complete.

Abstract: This paper studies the scheduling problem that minimizes both schedule length and switching activities for applications with loops on multiple-functional-unit architectures. We formally prove that finding a schedule with minimal switching activities among all minimum-latency schedules, with or without resource constraints, is NP-complete. An algorithm, SAMLS (Switching-Activity Minimization Loop Scheduling), is proposed to minimize both schedule length and switching activities. In SAMLS, the best schedule is selected from the ones generated from a given initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show that our algorithm can greatly reduce both schedule length and switching activities compared with the previous work.

2 citations

Proceedings Article

Assignment and scheduling of real-time DSP applications for heterogeneous functional units

Zili Shao, Qingfeng Zhuge, Y. He, Chun Jason Xue, Meilin Liu, Edwin H.-M. Sha

26 Apr 2004

TL;DR: It is proved that the heterogeneous assignment problem is NP-complete, and several algorithms are proposed to solve it; the best, DFG-assign-repeat, reduces total cost by 25.7% compared with the previous work.

Abstract: Summary form only given. In high-level synthesis for real-time digital signal processing (DSP) architectures using heterogeneous functional units (FUs), an important problem is how to assign a proper FU type to each operation of a DSP application and generate a schedule in such a way that all requirements can be met and the total cost can be minimized. We propose a two-phase approach to solve this problem. In the first phase, we solve the heterogeneous assignment problem, i.e., how to assign a proper FU type to each operation of a DSP application such that the total cost is minimized while the timing constraint is satisfied. In the second phase, based on the assignments obtained from the first phase, we propose a minimum resource scheduling algorithm to generate a schedule and a feasible configuration that uses as few resources as possible. We prove that the heterogeneous assignment problem is NP-complete and propose several algorithms to solve it. The experiments show that the DFG-assign-repeat algorithm performs best, reducing total cost by 25.7% compared with the previous work.
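
The first-phase problem can be stated precisely with a small Python sketch. Because the abstract says the problem is NP-complete, the sketch below simply enumerates every assignment, which is only viable for tiny graphs; it is not DFG-assign-repeat or any other algorithm from the paper. The data layout (`order`, `preds`, `options`) and the function names are assumptions made for illustration.

```python
from itertools import product

def critical_path(order, preds, times):
    """Longest path length through the DAG, given each node's execution time."""
    finish = {}
    for v in order:  # order must be a topological order
        start = max((finish[p] for p in preds[v]), default=0)
        finish[v] = start + times[v]
    return max(finish.values())

def min_cost_assignment(order, preds, options, deadline):
    """Exhaustively pick one (fu_type, time, cost) option per node.

    Returns (total_cost, {node: fu_type}) for the cheapest assignment whose
    critical path fits within the deadline, or None if none is feasible.
    """
    best = None
    for choice in product(*(options[v] for v in order)):
        times = {v: c[1] for v, c in zip(order, choice)}
        if critical_path(order, preds, times) > deadline:
            continue  # violates the timing constraint
        cost = sum(c[2] for c in choice)
        if best is None or cost < best[0]:
            best = (cost, {v: c[0] for v, c in zip(order, choice)})
    return best

if __name__ == "__main__":
    order = ["A", "B", "C"]                      # A and B both feed C
    preds = {"A": [], "B": [], "C": ["A", "B"]}
    options = {v: [("fast", 1, 10), ("slow", 3, 2)] for v in order}
    print(min_cost_assignment(order, preds, options, deadline=4))
```

With a deadline of 4 in the demo, the cheapest feasible choice runs `A` and `B` on the slow, cheap unit and only `C` on the fast one; relaxing the deadline to 6 lets every node use the cheap unit.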

2 citations

Loop scheduling for real-time DSPs with minimum switching activities on multiple-functional-unit architectures

Zili Shao¹, Qingfeng Zhuge¹, Meilin Liu¹, Edwin H.-M. Sha¹, Bin Xiao²

University of Texas at Dallas¹, Hong Kong Polytechnic University²

01 Dec 2004

TL;DR: In this paper, a switching-activity minimization loop scheduling (SAMLS) algorithm is proposed to minimize both schedule length and switching activities for applications with loops on multiple-functional-unit architectures.

Book Chapter

Optimizing address assignment for scheduling embedded DSPs

Chun Xue¹, Zili Shao¹, Edwin H.-M. Sha¹, Bin Xiao²

University of Texas at Dallas¹, Hong Kong Polytechnic University²

25 Aug 2004

TL;DR: In this paper, an address assignment and scheduling algorithm for multiple-functional-unit processors is proposed that reduces code size and schedule length by first constructing a nice address assignment and then scheduling.

Abstract: DSP architectures typically provide indirect addressing modes with auto-increment and auto-decrement. Subsuming the address arithmetic into auto-increment and auto-decrement modes improves the size and performance of generated code. Much previous work on address assignment optimization has aimed to reduce code size by minimizing address operations for single-functional-unit processors. However, minimizing address operations alone may not directly reduce code size and schedule length for multiple-functional-unit processors. In this paper, we exploit address assignment and scheduling for multiple-functional-unit processors. Our approach is to first construct a nice address assignment and then do scheduling. By fully taking advantage of the address assignment during scheduling, code size and schedule length can be significantly reduced. We propose a multiple-functional-unit algorithm to do both address assignment and scheduling. The experimental results show that our algorithm can greatly reduce code size and schedule length compared to the previous work.
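
The effect of memory layout on address arithmetic can be shown with a short Python sketch. It uses the classic single-address-register cost model as an assumption: moving to a variable in an adjacent memory slot (or the same slot) is free through auto-increment or auto-decrement, while any larger jump costs one explicit address operation. This covers only the single-functional-unit baseline mentioned in the abstract, not the paper's multiple-functional-unit algorithm, and the function name is hypothetical.

```python
def address_op_cost(access_seq, layout):
    """Count explicit address operations needed for an access sequence.

    layout lists the variables in memory order. A step between slots whose
    positions differ by at most 1 is covered by auto-increment/decrement.
    """
    pos = {var: i for i, var in enumerate(layout)}
    return sum(
        1
        for cur, nxt in zip(access_seq, access_seq[1:])
        if abs(pos[cur] - pos[nxt]) > 1
    )

if __name__ == "__main__":
    seq = ["a", "c", "b", "c", "a"]
    print(address_op_cost(seq, ["a", "b", "c"]))  # declaration-order layout
    print(address_op_cost(seq, ["a", "c", "b"]))  # layout matched to the accesses
```

For the demo sequence, placing `c` between `a` and `b` removes every explicit address operation that the declaration-order layout needs.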

Proceedings Article

Switching-activity minimization on instruction-level loop for VLIW DSP applications

Zili Shao, Qingfeng Zhuge, Meilin Liu, Bin Xiao, Edwin H.-M. Sha

18 Oct 2004

TL;DR: The experimental results show that the proposed algorithm, SAMLS (switching-activity minimization loop scheduling), can greatly reduce both schedule length and bus switching activities compared with the previous work.

Abstract: This work develops an instruction-level loop scheduling technique to reduce both execution time and bus switching activities for applications with loops on VLIW architectures. We propose an algorithm, SAMLS (switching-activity minimization loop scheduling), to minimize both schedule length and switching activities for applications with loops. In the algorithm, we obtain the best schedule from the ones that are generated from an initial schedule by repeatedly rescheduling the nodes with schedule length and switching activities minimization based on rotation scheduling and bipartite matching. The experimental results show that our algorithm can greatly reduce both schedule length and bus switching activities compared with the previous work.
