scispace - formally typeset

Proceedings ArticleDOI

Heterogeneous built-in resiliency of application specific programmable processors

01 Nov 1996-pp 406-411

TL;DR: A new approach for permanent fault-tolerance: Heterogeneous Built-In-Resiliency (HBIR) is developed and the effectiveness of the overall approach, the synthesis algorithms, and software implementations on a number of designs are demonstrated.

AbstractUsing the flexibility provided by multiple functionalities we have developed a new approach for permanent fault-tolerance: Heterogeneous Built-In-Resiliency (HBIR). HBIR processor synthesis imposes several unique tasks on the synthesis process: (i) latency determination targeting k-unit fault-tolerance, (ii) application-to-faulty-unit matching and (iii) HBIR scheduling and assignment algorithms. We address each of them and demonstrate the effectiveness of the overall approach, the synthesis algorithms, and software implementations on a number of designs.

Topics: High-level synthesis (52%)

...read more

Content maybe subject to copyright    Report

Citations
More filters

Journal ArticleDOI
TL;DR: The method utilizes the register-transfer level (RTL) circuit description of an ASPP or ASIP to come up with a set of test microcode patterns which can be written into the instruction read-only memory (ROM) of the processor.
Abstract: In this paper, we present design for testability (DFT) and hierarchical test generation techniques for facilitating the testing of application-specific programmable processors (ASPPs) and application-specific instruction processors (ASIPs). The method utilizes the register-transfer level (RTL) circuit description of an ASPP or ASIP to come up with a set of test microcode patterns which can be written into the instruction read-only memory (ROM) of the processor. These lines of microcode dictate a new control/data flow in the circuit and can be used to test modules which are not easily testable. The new control/data flow is used to justify precomputed test sets of a module from the system primary inputs to the module inputs and propagate output responses from the module output to the system primary outputs. The testability analysis, which is based on the relevant control/data flow extracted from the RTL circuit, is symbolic. Thus, it is independent of the bit-width of the data path and is extremely fast. The test microcode patterns are a by-product of this analysis. If the derived test microcode cannot test all untested modules in the circuit, then test multiplexers are added (usually to the off-critical paths of the data path) to test these modules. This is done to guarantee the testability of all modules in the circuit. If the control microcode memory of the processor is erasable, then the test microcode lines can be erased once the testing of the chip is over. In that case, the DFT scheme has very little overhead (typically less than 1%). Otherwise, the test microcode lines remain as an overhead in the control memory. The method requires the addition of only one external test pin. Application of this technique to several examples has resulted in a very high fault coverage (above 99.6%) for all of them. The test generation time is about three orders of magnitude smaller compared to an efficient gate-level sequential test generator. The average area overhead (without assuming an erasable ROM) is 3.1% while the delay overheads are negligible. This method does not require any scan in the controller or data path. It is also amenable to at-speed testing.

39 citations


Journal ArticleDOI
TL;DR: Two low-cost approaches to graceful degradation-based permanent fault tolerance of ASPPs are presented and the effectiveness of the overall approach, the synthesis algorithms, and software implementations on a number of industrial-strength designs are demonstrated.
Abstract: Application Specific Programmable Processors (ASPP) provide efficient implementation for any of m specified functionalities. Due to their flexibility and convenient performance-cost trade-offs, ASPPs are being developed by DSP, video, multimedia, and embedded lC manufacturers. In this paper, we present two low-cost approaches to graceful degradation-based permanent fault tolerance of ASPPs. ASPP fault tolerance constraints are incorporated during scheduling, allocation, and assignment phases of behavioral synthesis: Graceful degradation is supported by implementing multiple schedules of the ASPP applications, each with a different throughput constraint. In this paper, we do not consider concurrent error detection. The first ASPP fault tolerance technique minimizes the hardware resources while guaranteeing that the ASPP remains operational in the presence of all k-unit faults. On the other hand, the second fault tolerance technique maximizes the ASPP fault tolerance subject to constraints on the hardware resources. These ASPP fault tolerance techniques impose several unique tasks, such as fault-tolerant scheduling, hardware allocation, and application-to-faulty-unit assignment. We address each of them and demonstrate the effectiveness of the overall approach, the synthesis algorithms, and software implementations on a number of industrial-strength designs.

24 citations


Journal ArticleDOI
TL;DR: Techniques to incorporate micropreemption constraints during multitask VLSI system synthesis are presented and algorithms to insert and refine preemption points in scheduled task graphs subject to preemption latency constraints are presented.
Abstract: Task preemption is a critical enabling mechanism in multitask very large scale integration (VLSI) systems. On preemption, data in the register files must be preserved for the task to be resumed. This entails extra memory to preserve the context and additional clock cycles to save and restore the context. In this paper, techniques and algorithms to incorporate micropreemption constraints during multitask VLSI system synthesis are presented. Specifically, algorithms to insert and refine preemption points in scheduled task graphs subject to preemption latency constraints, techniques to minimize the context switch overhead by considering the dedicated registers required to save the state of a task on preemption and the shared registers required to save the remaining values in the tasks, and a controller-based scheme to preclude the preemption-related performance degradation by: 1) partitioning the states of a task into critical sections; 2) executing the critical sections atomically; and 3) preserving atomicity by rolling forward to the end of the critical sections on preemption have been developed. The effectiveness of all approaches, algorithms, and software implementations is demonstrated on real examples. Validation of all the results is complete in the sense that functional simulation is conducted to complete layout implementation.

10 citations


Proceedings ArticleDOI
01 May 1998
TL;DR: A framework that makes it possible for a designer to rapidly explore the application-specific programmable processor design space under area constraints is reported, which can be valuable in making early design decisions such as area and architectural trade-offs, cache and instruction issue width trade-off under area constraint, and the number of branch units and issue width.
Abstract: In this paper we report a framework that makes it possible for a designer to rapidly explore the application-specific programmable processor design space under area constraints. The framework uses a production-quality compiler and simulation tools to synthesize a high performance machine for an application. Using the framework we evaluate the validity of the fundamental assumption behind the development of application-specific programmable processors. Application-specific processors are based on the idea that applications differ from each other in key architectural parameters, such as the available instruction-level parallelism, demand on various hardware components (e.g. cache memory units, register files) and the need for different number of functional units. We found that the framework introduced in this paper can be valuable in making early design decisions such as area and architectural trade-off, cache and instruction issue width trade-off under area constraint, and the number of branch units and issue width.

9 citations


Proceedings ArticleDOI
13 Nov 1997
Abstract: Task preemption is a critical enabling mechanism in multi-task VLSI systems. On preemption, data in the register files must be preserved in order for the task to be resumed. This entails extra memory to preserve the context and additional clock cycles to save and restore the context. In this paper, we present techniques and algorithms to incorporate micro-preemption constraints during multi-task VLSI system synthesis. Specifically, we have developed: (i) Algorithms to insert and refine preemption points in scheduled task graphs subject to preemption latency constraints. (ii) Techniques to minimize the context switch overhead by considering the dedicated registers required to save the state of a task on preemption and the shared registers required to save the remaining values in the tasks. (iii) A controller based scheme to preclude preemption related performance degradation. The effectiveness of all approaches, algorithms, and software implementations is demonstrated on real examples.

3 citations


References
More filters

Book
01 Dec 1989
TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.
Abstract: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today. In this edition, the authors bring their trademark method of quantitative analysis not only to high-performance desktop machine design, but also to the design of embedded and server systems. They have illustrated their principles with designs from all three of these domains, including examples from consumer electronics, multimedia and Web technologies, and high-performance computing.

11,485 citations


"Heterogeneous built-in resiliency o..." refers methods in this paper

  • ...We assume the dedicated register le model [12, 11] where each register is connected to a single input of an execution unit, while each unit can send data to an arbitrary number of registers....

    [...]


Book
01 Jan 1990
TL;DR: The new edition of Breuer-Friedman's Diagnosis and Reliable Design ofDigital Systems offers comprehensive and state-ofthe-art treatment of both testing and testable design.
Abstract: For many years, Breuer-Friedman's Diagnosis and Reliable Design ofDigital Systems was the most widely used textbook in digital system testing and testable design. Now, Computer Science Press makes available a new and greativ expanded edition. Incorporating a significant amount of new material related to recently developed technologies, the new edition offers comprehensive and state-ofthe-art treatment of both testing and testable design.

2,737 citations


Journal ArticleDOI
TL;DR: This self-contained paper develops the theory necessary to statically schedule SDF programs on single or multiple processors, and a class of static (compile time) scheduling algorithms is proven valid, and specific algorithms are given for scheduling SDF systems onto single ormultiple processors.
Abstract: Large grain data flow (LGDF) programming is natural and convenient for describing digital signal processing (DSP) systems, but its runtime overhead is costly in real time or cost-sensitive applications. In some situations, designers are not willing to squander computing resources for the sake of programmer convenience. This is particularly true when the target machine is a programmable DSP chip. However, the runtime overhead inherent in most LGDF implementations is not required for most signal processing systems because such systems are mostly synchronous (in the DSP sense). Synchronous data flow (SDF) differs from traditional data flow in that the amount of data produced and consumed by a data flow node is specified a priori for each input and output. This is equivalent to specifying the relative sample rates in signal processing system. This means that the scheduling of SDF nodes need not be done at runtime, but can be done at compile time (statically), so the runtime overhead evaporates. The sample rates can all be different, which is not true of most current data-driven digital signal processing programming methodologies. Synchronous data flow is closely related to computation graphs, a special case of Petri nets. This self-contained paper develops the theory necessary to statically schedule SDF programs on single or multiple processors. A class of static (compile time) scheduling algorithms is proven valid, and specific algorithms are given for scheduling SDF systems onto single or multiple processors.

1,356 citations


"Heterogeneous built-in resiliency o..." refers background in this paper

  • ...Therefore, a natural and proper computational model for those important application domains is synchronous data flow [ 7 ]....

    [...]


Book
29 Feb 1992
TL;DR: This paper presents a methodology for High-Level Synthesis of Architectural Models in Synthesis and its applications in Design Description Languages and Design Representation and Transformations.
Abstract: Preface. 1. Introduction. 2. Architectural Models in Synthesis. 3. Quality Measures. 4. Design Description Languages. 5. Design Representation and Transformations. 6. Partitioning. 7. Scheduling. 8. Allocation. 9. Design Methodology for High-Level Synthesis. Bibliography. Index.

1,096 citations


"Heterogeneous built-in resiliency o..." refers background in this paper

  • ...Behavioral synthesis has been an active area of research for more than two decades [3, 8], and numerous outstanding systems have been built targeting both data path oriented and control oriented applications [15, 8]....

    [...]


BookDOI
01 Jan 1992

661 citations