InSyn: Integrated Scheduling for DSP Applications

doi:10.1145/157485.164926

Proceedings Article•DOI•

Alok Sharma¹, Rajiv Jain•Institutions (1)

University of Wisconsin-Madison¹

01 Jul 1993-pp 349-354

TL;DR: This paper presents InSyn, an integrated allocation and scheduling approach for high-level synthesis that considers functional units, busses, and registers during time-step assignment.

Abstract: In this paper, we present InSyn, an integrated allocation and scheduling approach for high-level synthesis applications. The scheduler considers functional units, busses, and registers while performing time-step assignment. The results show that incorporating all these features during scheduling can produce very good designs.

Citations

Journal Article•DOI•

Recent developments in high-level synthesis

Youn-Long Lin

01 Jan 1997-ACM Transactions on Design Automation of Electronic Systems

TL;DR: The need for higher-level design automation tools is discussed first, followed by basic techniques for the various subtasks of high-level synthesis and newer synthesis objectives, including testability, power efficiency, and reliability.

Abstract: We survey recent developments in high-level synthesis technology for VLSI design. The need for higher-level design automation tools is discussed first. We then describe some basic techniques for various subtasks of high-level synthesis. Techniques that have been proposed in the past few years (since 1994) for various subtasks of high-level synthesis are surveyed. We also survey some new synthesis objectives, including testability, power efficiency, and reliability.

111 citations

Journal Article•DOI•

Ant Colony Optimizations for Resource- and Timing-Constrained Operation Scheduling

Gang Wang¹, Wenrui Gong², Brian DeRenzi³, Ryan Kastner¹•Institutions (3)

University of California, Santa Barbara¹, Mentor Graphics², University of Washington³

01 Jun 2007-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This work presents novel OS algorithms using the ant colony optimization approach for both timing-constrained scheduling (TCS) and resource-constrained scheduling (RCS) problems, using a hybrid approach that combines the MAX-MIN ant system metaheuristic with traditional scheduling heuristics.

Abstract: Operation scheduling (OS) is a fundamental problem in mapping an application to a computational device. It takes a behavioral application specification and produces a schedule to minimize either the completion time or the computing resources required to meet a given deadline. The OS problem is NP-hard; thus, effective heuristic methods are necessary to provide qualitative solutions. We present novel OS algorithms using the ant colony optimization approach for both timing-constrained scheduling (TCS) and resource-constrained scheduling (RCS) problems. The algorithms use a unique hybrid approach by combining the MAX-MIN ant system metaheuristic with traditional scheduling heuristics. We compiled a comprehensive testing benchmark set from real-world applications in order to verify the effectiveness and efficiency of our proposed algorithms. For TCS, our algorithm achieves better results compared with force-directed scheduling on almost all the testing cases, with a maximum 19.5% reduction in the number of resources. For RCS, our algorithm outperforms a number of different list-scheduling heuristics with better stability and generates better results with up to 14.7% improvement. Our algorithms outperform the simulated annealing method for both scheduling problems in terms of quality, computing time, and stability.
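
For readers unfamiliar with the RCS baseline named in this abstract, the following is a minimal sketch of resource-constrained list scheduling. It is not the ant colony algorithm from the paper. It assumes unit-latency operations, an acyclic dependence graph, at least one unit of every resource type, and a longest-path-to-sink priority; the function name `list_schedule` and its argument layout are illustrative choices, not taken from any cited work.

```python
from collections import defaultdict


def list_schedule(ops, deps, limits):
    """Resource-constrained list scheduling for unit-latency operations.

    ops:    {op_name: resource_type}
    deps:   list of (producer, consumer) edges of an acyclic graph
    limits: {resource_type: units available in each time step (>= 1)}
    Returns {op_name: time_step}, with time steps starting at 0.
    """
    succs = defaultdict(list)
    preds = defaultdict(list)
    for u, v in deps:
        succs[u].append(v)
        preds[v].append(u)

    # Priority: number of operations on the longest path from the op to a sink.
    depth = {}

    def longest_path(n):
        if n not in depth:
            depth[n] = 1 + max((longest_path(s) for s in succs[n]), default=0)
        return depth[n]

    for n in ops:
        longest_path(n)

    schedule = {}
    step = 0
    while len(schedule) < len(ops):
        # Ready: not yet scheduled, and every predecessor ran in an earlier step.
        ready = [
            n for n in ops
            if n not in schedule
            and all(p in schedule and schedule[p] < step for p in preds[n])
        ]
        # Most critical operations first; ties broken by name for determinism.
        ready.sort(key=lambda n: (-depth[n], n))
        used = defaultdict(int)
        for n in ready:
            r = ops[n]
            if used[r] < limits[r]:
                schedule[n] = step
                used[r] += 1
        step += 1
    return schedule
```

On a hypothetical five-operation graph (three multiplications feeding two chained additions), one multiplier and one adder give a four-step schedule, and a second multiplier shortens it to three steps.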

55 citations

Cites background from "InSyn: Integrated Scheduling for DS..."

...If an operation can be performed by more than one resource type, we call it “heterogeneous” scheduling [7]....
[...]

Proceedings Article•DOI•

Comprehensive lower bound estimation from behavioral descriptions

Seong Yong Ohm¹, Fadi J. Kurdahi¹, Nikil Dutt¹•Institutions (1)

University of California, Irvine¹

06 Nov 1994

TL;DR: This paper presents a comprehensive technique for lower bound estimation (LBE) of resources from behavioral descriptions that accounts for storage resources in addition to functional resources and uses a finer granularity that permits the modeling of functional unit, register and interconnect delays.

Abstract: In this paper, we present a comprehensive technique for lower bound estimation (LBE) of resources from behavioral descriptions. Previous work has focused on LBE techniques that use very simple cost models which primarily focus on the functional unit resources. Our cost model accounts for storage resources in addition to functional resources. Our timing model uses a finer granularity that permits the modeling of functional unit, register and interconnect delays. We tested our LBE technique for both functional unit and storage requirements on several high-level synthesis benchmarks and observed near-optimal results.
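
As a point of reference, the "very simple cost models" that the abstract contrasts itself with can be illustrated by the basic work-based bound on functional units. This is a sketch of that simple bound only, not the paper's technique; it assumes non-pipelined units that stay busy for the full latency of each operation, and the name `fu_lower_bound` and the numbers in the usage note are hypothetical.

```python
def fu_lower_bound(op_counts, latency, time_steps):
    """Work-based lower bound on functional units of each type.

    op_counts:  {op_type: number of operations of that type}
    latency:    {op_type: time steps one operation occupies a unit}
    time_steps: total time steps allowed for the schedule
    A type with N operations of latency L needs at least ceil(N * L / T) units,
    because each unit can supply at most T busy steps.
    """
    if time_steps <= 0:
        raise ValueError("time_steps must be positive")
    bounds = {}
    for op_type, count in op_counts.items():
        work = count * latency[op_type]
        bounds[op_type] = -(-work // time_steps)  # integer ceiling division
    return bounds
```

For example, 26 additions of latency 1 and 8 multiplications of latency 2 in 17 time steps give a bound of 2 adders and 1 multiplier. The bound ignores dependences, registers, and interconnect, which is the gap the abstract describes.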

39 citations

Cites background or result from "InSyn: Integrated Scheduling for DS..."

...[17] A. Sharma and R. Jain, InSyn: Integrated Scheduling for DSP Applications, Proc....
[...]
...Basically we compared our results with OASIC [10], ILP approach [15], HAL [16], and InSyn [17]....
[...]
...Delay InSyn [17] Our Estimation (ns) FU Reg....
[...]
...340 3(+), 2(*p) 10 3(+), 2(*p) 12 3(+), 2(*p) 10 360 3(+), 1(*p) 10 3(+), 1(*p) - 3(+), 1(*p) 10 380 2(+), 1(*p) 9 2(+), 1(*p) 12 2(+), 1(*p) 9 *p: 2-stage pipelined multiplier (delay of 25.0 ns), +: adder (delay of 15.0 ns) Table 6: 5th order elliptic wave filter - design III Delay (ns) InSyn [17] Our Estimation FU Reg....
[...]

Journal Article•DOI•

Resource-constrained loop list scheduler for DSP algorithms

Ching Yi Wang¹, Keshab K. Parhi¹•Institutions (1)

University of Minnesota¹

30 Oct 1995

TL;DR: This work defines and makes use of new graph-dependent constraints to obtain a lower bound estimate on the iteration period for any data-flow graph and incorporates implicit retiming and pipelining to generate optimal and near-optimal schedules.

Abstract: We present a new algorithm for resource-constrained scheduling for digital signal processing (DSP) applications when the number of processors is fixed and the objective is to obtain a schedule with the minimum iteration period. This type of scheduling is best suited for moderate-speed applications where conservation of area and power is more important than speed. We define and make use of new graph-dependent constraints to obtain a lower bound estimate on the iteration period for any data-flow graph. By satisfying these constraints before performing the scheduling task, we can restrict the design space and can generate valid schedules in less time than previously reported. The graph-dependent constraints provide a more accurate lower bound estimate on the iteration period than previously published results. This new scheduling algorithm exploits the iterative nature of DSP algorithms and uses an iterative-loop-based scheduling approach. This resource scheduling algorithm has been incorporated in the Minnesota ARchitecture Synthesis (MARS) system. Our approach exploits inter-iteration and intra-iteration precedence constraints and incorporates implicit retiming and pipelining to generate optimal and near-optimal schedules.
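
To make the notion of a lower bound on the iteration period concrete, the sketch below computes the two classical bounds for a data-flow graph: the loop (recurrence) bound and the processor-count bound. These are the textbook bounds, not the graph-dependent constraints this paper introduces. The sketch assumes a small graph (it enumerates simple cycles by depth-first search), integer execution times, no unfolding, and processors that can execute any operation; `iteration_bound` and `period_lower_bound` are illustrative names.

```python
import math
from fractions import Fraction


def iteration_bound(exec_time, edges):
    """Maximum over all cycles of (cycle execution time / cycle delay count).

    exec_time: {node: integer execution time}
    edges:     list of (src, dst, delays), delays being the number of
               iteration delays (registers) on the edge
    Returns a Fraction; 0 if the graph has no cycle.
    """
    nodes = sorted(exec_time)
    index = {n: i for i, n in enumerate(nodes)}
    adj = {n: [] for n in nodes}
    for u, v, d in edges:
        adj[u].append((v, d))
    best = Fraction(0)

    def dfs(start, node, time, delays, on_path):
        nonlocal best
        for nxt, d in adj[node]:
            if nxt == start:
                total = delays + d
                if total == 0:
                    raise ValueError("delay-free cycle: graph is not computable")
                best = max(best, Fraction(time, total))
            elif index[nxt] > index[start] and nxt not in on_path:
                # Only visit nodes ordered after `start`, so each simple cycle
                # is enumerated once, from its smallest node.
                dfs(start, nxt, time + exec_time[nxt], delays + d, on_path | {nxt})

    for s in nodes:
        dfs(s, s, exec_time[s], 0, {s})
    return best


def period_lower_bound(exec_time, edges, processors):
    """Larger of the loop bound and the total-work-per-processor bound."""
    loop_bound = math.ceil(iteration_bound(exec_time, edges))
    work_bound = -(-sum(exec_time.values()) // processors)  # ceiling division
    return max(loop_bound, work_bound)
```

With a fixed processor count, the work bound dominates when processors are scarce and the loop bound dominates once enough processors are available, which is why a scheduler with a fixed number of processors benefits from checking both before searching for a schedule.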

27 citations

Cites background or methods from "InSyn: Integrated Scheduling for DS..."

...To obtain more optimal designs, both tasks should be performed simultaneously [4-6], [14-27]....
[...]
...In recent years many synthesis systems have been developed for automated design of high performance dedicated architectures, especially for digital signal processing (DSP) applications [1-27]....
[...]

Journal Article•DOI•

A unified lower bound estimation technique for high-level synthesis

Seong Yong Ohm¹, Fadi J. Kurdahi², Nikil Dutt²•Institutions (2)

Seoul Women's University¹, University of California, Irvine²

01 May 1997-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This paper presents an integrated approach for predicting lower bounds on the hardware resources needed to implement a behavioral description within a given amount of time; the authors believe it can lead to better-quality HLS solutions in less time.

Abstract: The importance of effective lower bound estimation (LBE) techniques is well established in high-level synthesis (HLS) since it allows more efficient exploration of the design space while providing other HLS tools with the capability of predicting the effect of specific tools on the design space. Much of the previous work has focused on LBE techniques that use very simple cost models which primarily focus on the functional unit resources. With the push toward submicron technologies, simple models that use functional unit resources alone are not accurate enough to allow effective design space exploration since the effects of storage and interconnect can indeed dominate the cost function. In this paper, we present an integrated approach aimed at predicting lower bounds on hardware resources needed to implement a behavioral description within a given amount of time. Our area cost model accounts for storage (register) and interconnect resources (buses) in addition to functional resources. Our timing model uses a finer granularity that permits the modeling of functional unit, register, and interconnect delays. Our approach is integrated because we consider the dependencies between the different types of resources as well as the ordering in which the resources are allocated. We tested our technique for functional unit, storage, and interconnect requirements on several high-level synthesis benchmarks, and observed near-optimal results. We believe that our comprehensive LBE approach can lead to better quality HLS solutions in less time, and we demonstrate this approach in our paper.

26 citations

References

Journal Article•DOI•

Force-directed scheduling for the behavioral synthesis of ASICs

P.G. Paulin¹, J.P. Knight²•Institutions (2)

Bell-Northern Research¹, Carleton University²

01 Jun 1989-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: A general scheduling methodology is presented that can be integrated into specialized or general-purpose high-level synthesis systems and reduces the number of functional units, storage units, and buses required by balancing the concurrency of operations assigned to them.

Abstract: A general scheduling methodology is presented that can be integrated into specialized or general-purpose high-level synthesis systems. An initial version of the force-directed scheduling algorithm at the heart of this methodology was originally presented by the authors in 1987. The latest implementation of the algorithm introduced here reduces the number of functional units, storage units, and buses required by balancing the concurrency of operations assigned to them. The algorithm supports a comprehensive set of constraint types and scheduling modes. These include multicycle and chained operations; mutually exclusive operations; scheduling under fixed global timing constraints with minimization of functional unit costs, minimization of register costs, and minimization of global interconnect requirements; scheduling with local time constraints (on operation pairs); scheduling under fixed hardware resource constraints; functional pipelining; and structural pipelining (use of pipelined functional units). Examples from current literature, one of which was chosen as a benchmark for the 1988 High-Level Synthesis Workshop, are used to illustrate the effectiveness of the approach.
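
The idea of "balancing the concurrency of operations" rests on two quantities that can be sketched briefly: the time frame of each operation (its ASAP and ALAP steps) and the distribution graph, which sums each operation's probability of occupying a step when it is spread uniformly over its frame. The sketch below covers only these two ingredients, not the force calculation or the full algorithm. It assumes unit-latency operations, an acyclic graph, and a step budget no shorter than the critical path; the function names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction


def time_frames(ops, deps, n_steps):
    """Return {op: (asap, alap)} for unit-latency ops, steps numbered 1..n_steps."""
    preds, succs = defaultdict(list), defaultdict(list)
    for u, v in deps:
        preds[v].append(u)
        succs[u].append(v)
    asap, alap = {}, {}

    def early(n):
        # Earliest step: one after the latest predecessor, or step 1.
        if n not in asap:
            asap[n] = 1 + max((early(p) for p in preds[n]), default=0)
        return asap[n]

    def late(n):
        # Latest step: one before the earliest successor, or the last step.
        if n not in alap:
            alap[n] = min((late(s) for s in succs[n]), default=n_steps + 1) - 1
        return alap[n]

    return {n: (early(n), late(n)) for n in ops}


def distribution_graph(ops, frames, n_steps):
    """Return {op_type: {step: expected number of ops of that type in the step}}."""
    dg = {t: {s: Fraction(0) for s in range(1, n_steps + 1)}
          for t in set(ops.values())}
    for n, t in ops.items():
        lo, hi = frames[n]
        prob = Fraction(1, hi - lo + 1)  # uniform over the op's time frame
        for s in range(lo, hi + 1):
            dg[t][s] += prob
    return dg
```

A peak in the distribution graph for one operation type marks a step where that type is likely to need many units, so a balancing scheduler would prefer placements that flatten such peaks.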

1,093 citations

Journal Article•DOI•

Software pipelining: an effective scheduling technique for VLIW machines

Monica S. Lam¹•Institutions (1)

Carnegie Mellon University¹

01 Jun 1988

TL;DR: This paper shows that software pipelining is an effective and viable scheduling technique for VLIW processors, and proposes a hierarchical reduction scheme whereby entire control constructs are reduced to an object similar to an operation in a basic block.

Abstract: This paper shows that software pipelining is an effective and viable scheduling technique for VLIW processors. In software pipelining, iterations of a loop in the source program are continuously initiated at constant intervals, before the preceding iterations complete. The advantage of software pipelining is that optimal performance can be achieved with compact object code. This paper extends previous results of software pipelining in two ways: First, this paper shows that by using an improved algorithm, near-optimal performance can be obtained without specialized hardware. Second, we propose a hierarchical reduction scheme whereby entire control constructs are reduced to an object similar to an operation in a basic block. With this scheme, all innermost loops, including those containing conditional statements, can be software pipelined. It also diminishes the start-up cost of loops with a small number of iterations. Hierarchical reduction complements the software pipelining technique, permitting a consistent performance improvement to be obtained. The techniques proposed have been validated by an implementation of a compiler for Warp, a systolic array consisting of 10 VLIW processors. This compiler has been used for developing a large number of applications in the areas of image, signal and scientific processing.

936 citations

Journal Article•DOI•

Advanced compiler optimizations for supercomputers

David Padua¹, Michael Wolfe¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Dec 1986-Communications of The ACM

TL;DR: Compilers for vector or multiprocessor computers must have certain optimization features to successfully generate parallel code.

Abstract: Compilers for vector or multiprocessor computers must have certain optimization features to successfully generate parallel code.

758 citations

Journal Article•DOI•

The high-level synthesis of digital systems

Michael C. McFarland¹, Alice C. Parker, Raul Camposano•Institutions (1)

Boston College¹

01 Feb 1990

TL;DR: It is shown how the high-level synthesis task can be decomposed into a number of distinct but not independent subtasks.

Abstract: High-level synthesis systems start with an abstract behavioral specification of a digital system and find a register-transfer level structure that realizes the given behavior. The various tasks involved in developing a register-transfer level structure from an algorithmic level specification are described. In particular, it is shown how the high-level synthesis task can be decomposed into a number of distinct but not independent subtasks. The techniques that have been developed for solving those subtasks are presented. Areas related to high-level synthesis that are still open problems are examined.

639 citations

Journal Article•DOI•

Automated Synthesis of Data Paths in Digital Systems

Chia-Jeng Tseng¹, Daniel P. Siewiorek²•Institutions (2)

Bell Labs¹, Carnegie Mellon University²

01 Jul 1986-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This paper presents a unifying procedure, called Facet, for the automated synthesis of data paths at the register-transfer level that minimizes the number of storage elements, data operators, and interconnection units.

Abstract: This paper presents a unifying procedure, called Facet, for the automated synthesis of data paths at the register-transfer level. The procedure minimizes the number of storage elements, data operators, and interconnection units. A design generator named Emerald, based on Facet, was developed and implemented to facilitate extensive experiments with the methodology. The input to the design generator is a behavioral description which is viewed as a code sequence. Emerald provides mechanisms for interactively manipulating the code sequence. Different forms of the code sequence are mapped into data paths of different cost and speed. Data paths for the behavioral descriptions of the AM2910, the AM2901, and the IBM System/370 were produced and analyzed. Designs for the AM2910 and the AM2901 are compared with commercial designs. Overall, the total number of gates required for Emerald's designs is about 15 percent more than the commercial designs. The design space spanned by the behavioral specification of the AM2901 is extensively explored.

567 citations