Proceedings ArticleDOI

Heterogeneous built-in resiliency of application specific programmable processors

TL;DR: A new approach to permanent fault tolerance, Heterogeneous Built-In-Resiliency (HBIR), is developed, and the effectiveness of the overall approach, the synthesis algorithms, and software implementations is demonstrated on a number of designs.
Abstract: Using the flexibility provided by multiple functionalities, we have developed a new approach to permanent fault tolerance: Heterogeneous Built-In-Resiliency (HBIR). HBIR processor synthesis imposes several unique tasks on the synthesis process: (i) latency determination targeting k-unit fault tolerance, (ii) application-to-faulty-unit matching, and (iii) HBIR scheduling and assignment algorithms. We address each of them and demonstrate the effectiveness of the overall approach, the synthesis algorithms, and software implementations on a number of designs.
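
As a rough illustration of task (ii) and the k-unit fault-tolerance target, the sketch below checks whether an application still fits on a processor after any k execution units fail. The resource model, unit counts, and application names are hypothetical stand-ins of my own, not the paper's actual synthesis algorithm.

```python
from itertools import combinations

# Hypothetical resource model: the processor holds a pool of execution
# units by type, and each application needs a minimum number of each
# type to meet its latency bound (a crude stand-in for the paper's
# scheduling and assignment step).
UNITS = {"alu": 3, "mul": 2, "mem": 2}
NEEDS = {"viterbi": {"alu": 2, "mul": 1, "mem": 1},
         "fft":     {"alu": 1, "mul": 2, "mem": 1}}

def survives(needs, faulty_units):
    """True if the application still fits once the faulty units are removed."""
    left = dict(UNITS)
    for unit_type in faulty_units:
        left[unit_type] -= 1
    return all(left[t] >= n for t, n in needs.items())

def k_fault_tolerant(needs, k):
    """Check every combination of k failed units (k-unit fault tolerance)."""
    pool = [t for t, count in UNITS.items() for _ in range(count)]
    return all(survives(needs, faulty) for faulty in combinations(pool, k))

for app, needs in NEEDS.items():
    print(app, "tolerates any single unit failure:", k_fault_tolerant(needs, 1))
```

In this toy instance the fft version fails the check (losing one multiplier leaves too few), so an HBIR-style flow would have to match it to units that remain intact or re-derive a longer latency for the faulty configuration.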

Citations
Proceedings ArticleDOI
01 Jan 1997
TL;DR: In this paper, the authors present techniques and algorithms to incorporate micro-preemption constraints during multi-task VLSI system synthesis, along with a controller-based scheme to preclude preemption-related performance degradation.
Abstract: Task preemption is a critical enabling mechanism in multi-task VLSI systems. On preemption, data in the register files must be preserved in order for the task to be resumed. This entails extra memory to save the context and additional clock cycles to restore it. We present techniques and algorithms to incorporate micro-preemption constraints during multi-task VLSI system synthesis. Specifically, we have developed: algorithms to insert and refine preemption points in scheduled task graphs subject to preemption latency constraints; techniques to minimize context-switch overhead by considering the dedicated registers required to save the state of a task on preemption and the shared registers required to save the remaining values in the tasks; and a controller-based scheme to preclude preemption-related performance degradation.
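
A minimal sketch of the first of these steps, under simplifying assumptions of my own (a linear operation sequence rather than the authors' scheduled task graphs): insert a preemption point whenever the time since the last point would otherwise exceed the preemption-latency bound.

```python
# Greedy insertion of preemption points into a scheduled operation
# sequence: after any point, the next point must be reachable within
# latency_bound cycles, so a pending task never waits longer than that.
# (Assumes no single operation exceeds the bound.)
def insert_preemption_points(op_cycles, latency_bound):
    points, elapsed = [], 0
    for i, cycles in enumerate(op_cycles):
        if elapsed + cycles > latency_bound:
            points.append(i)   # allow preemption just before operation i
            elapsed = 0
        elapsed += cycles
    return points

# Ten operations with varying cycle counts and an 8-cycle latency bound:
print(insert_preemption_points([3, 4, 2, 5, 1, 6, 2, 3, 4, 2], 8))  # [2, 5, 7, 9]
```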

2 citations

Journal ArticleDOI
01 Feb 2001
TL;DR: The framework introduced in this paper is valuable in making early architecture design decisions, such as the cache versus issue-width trade-off when area is constrained, and the number of branch units and instruction issue width.
Abstract: Distributed hypermedia systems that support collaboration are important emerging tools for the creation, discovery, management, and delivery of information. These systems are becoming increasingly desirable and practical as other areas of information technology advance. A framework is developed for efficiently exploring the hypermedia design space while capitalizing on trade-offs between performance and area. We focus on a category of processors that are programmable yet optimized to a hypermedia application. The key components of the framework presented in this paper are a retargetable instruction-level parallelism compiler, instruction-level simulators, a set of complete media applications written in a high-level language, and a media processor synthesis algorithm. The framework addresses the need for efficient use of silicon by exploiting the instruction-level parallelism that compilers targeting multiple-instruction-issue processors find in media applications. Using the developed framework we conduct an extensive exploration of the design space for a hypermedia application. We find that there is enough instruction-level parallelism in typical media and communication applications to achieve highly concurrent execution when throughput requirements are high. On the other hand, when throughput requirements are low, there is little value in multiple-instruction-issue processors: increased area does not improve performance enough to justify their use. The framework is valuable in making early architecture design decisions, such as the cache versus issue-width trade-off when area is constrained, and the number of branch units and instruction issue width.
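
The flavor of such an exploration can be sketched as follows; the area and throughput models here are toy stand-ins of my own, not the paper's retargetable compiler, simulators, or synthesis algorithm.

```python
# Sweep issue width and cache size, keeping the cheapest configuration
# that meets a throughput requirement. Diminishing returns in the toy
# throughput model mirror the paper's finding that wide issue only pays
# off when throughput requirements are high.
def area_mm2(issue_width, cache_kb):
    return 10 * issue_width ** 1.5 + 0.5 * cache_kb    # toy area model

def throughput(issue_width, cache_kb):
    return issue_width ** 0.6 * (1 + 0.02 * cache_kb)  # toy simulator stand-in

def cheapest_config(required_throughput):
    feasible = [(area_mm2(w, c), w, c)
                for w in (1, 2, 4, 8)
                for c in (8, 16, 32, 64)
                if throughput(w, c) >= required_throughput]
    return min(feasible, default=None)  # (area, issue width, cache KB)

print(cheapest_config(1.5))   # low requirement: single-issue suffices
print(cheapest_config(3.0))   # high requirement: multiple issue pays off
```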

1 citation


Cites background from "Heterogeneous built-in resiliency of application specific programmable processors"

  • ...Several research groups have published results on the topic of selecting and designing instruction sets and processor architectures for particular application domains [44], [25]....

Journal ArticleDOI
TL;DR: An accurate chip area estimate and a set of aggressive hardware optimization algorithms are developed, forming a unique framework for system-level synthesis that yields valuable insights about the design and use of application-specific programmable processors for modern applications.
Abstract: We evaluate the validity of the fundamental assumption behind application-specific programmable processors: that applications differ from each other in key exploitable parameters, such as the available instruction-level parallelism (ILP), the demand on various hardware resources, and the desired mix of function units. Following the tradition of the CAD community, we develop an accurate chip area estimate and a set of aggressive hardware optimization algorithms. We follow the tradition of the architecture community by using comprehensive real-life benchmarks and production-quality tools. This combination enables us to build a unique framework for system-level synthesis and to gain valuable insights about the design and use of application-specific programmable processors for modern applications. We explore the application-specific programmable processor (ASPP) design space to understand the relationship between performance and area. The architecture model we use is the Hewlett-Packard PA-RISC with single-level caches. The system, including all memory and bus latencies, is simulated, and no other specialized ALU or memory structures are used. The experimental results reveal a number of important characteristics of the ASPP design space. For example, we found that in most cases a single programmable architecture performs similarly to a set of architectures tuned to individual applications. A notable exception is highly cost-sensitive designs, which need a small number of specialized architectures that require smaller areas. It is also clear that there is enough parallelism in typical media and communication applications to justify the use of a high number of function units. We found that the framework introduced in this paper can be very valuable in making early design decisions such as the trade-off between area and architectural configuration, the cache versus issue-width trade-off under an area constraint, and the number of branch units and issue width.
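
At the core of such an exploration is a Pareto analysis over (area, performance) points; a minimal sketch with made-up numbers (not the paper's data) follows.

```python
# Keep only configurations not dominated by a smaller, faster alternative.
def pareto_frontier(points):
    frontier = []
    for area, perf in sorted(points):          # ascending area
        if not frontier or perf > frontier[-1][1]:
            frontier.append((area, perf))      # strictly better performance
    return frontier

# (area, performance) of candidate ASPP configurations (illustrative values):
configs = [(50, 1.0), (60, 1.8), (70, 1.7), (90, 2.4), (120, 2.5)]
print(pareto_frontier(configs))   # (70, 1.7) is dominated by (60, 1.8)
```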

1 citation

29 Jun 2010
TL;DR: In this paper, the authors propose two distinct methods for high-resolution fault diagnosis of reconfigurable logic resources, based on the function-generator and shift-register modes of operation of an FPGA slice.
Abstract: Ever-shrinking technology features have as a direct consequence an increase in defect density in VLSI chips. Going into the nano-scale era, fabrication procedures cannot keep improving at the pace of this feature shrinking. Fault tolerance emerges as a much cheaper solution, and it is imperative that future systems be built reliably from unreliable components. Reconfigurable platforms offer the ideal substrate for such approaches, because their regularity and reconfigurability allow basic resources to be substituted, relaxing the defect-free requirement for the whole chip. Sparing and matching techniques allow for substitution and alternative utilization of resources, respectively, paving the way to the nano-scale era. Although a significant number of research works have focused on sparing, very few actually go on to reuse the defective resources, and even in those cases the characterization is conservative, sacrificing more functionality than necessary. We focus on this drawback by proposing two distinct methods for high-resolution fault diagnosis of reconfigurable logic resources. The methods are based on the function-generator and shift-register modes of operation of an FPGA slice. We choose to decouple the diagnosis problem from fault detection and localization, which have been extensively researched, and in this way relax the fault-coverage requirements for our methods: it is more important to rescue the core functionality of a defective resource at minimal cost than to cover 100% of its possible faults. Substitutable-resource characterization is then performed in a modular manner based on the diagnosis result. Both diagnostic testers are prototyped on an FPGA and applied to a real circuit under test with the help of fault injection. The experimental results show that our approach offers the basis for a viable, low-overhead integrated fault-tolerance strategy, which we hope to continue developing in the near future.
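
As an illustration of the function-generator-mode idea (a hypothetical fault model and test pattern of my own, not the thesis' actual procedure), a k-input LUT can be diagnosed cell by cell by programming a known truth table and reading the output at every input address.

```python
K = 4                                     # LUT inputs
PATTERN = [i & 1 for i in range(2 ** K)]  # checkerboard truth table

def faulty_lut(addr):
    """Model of a LUT whose configuration cell 5 is stuck-at-0."""
    return 0 if addr == 5 else PATTERN[addr]

def diagnose(lut):
    """Return addresses of defective configuration cells. A complete test
    would repeat this with the complemented pattern so stuck-at faults
    that happen to match the pattern are also exposed."""
    return [a for a in range(2 ** K) if lut(a) != PATTERN[a]]

print("defective cells:", diagnose(faulty_lut))   # -> [5]
```

Cell-level resolution is precisely the point made above: knowing that only cell 5 is dead leaves every function that avoids that minterm implementable, instead of discarding the whole slice.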

1 citation

01 Jan 2010
TL;DR: The experimental results show that the two proposed methods for high-resolution fault diagnosis of reconfigurable logic resources form the basis for a viable, low-overhead integrated fault-tolerance strategy, which the authors hope to continue developing in the near future.

Cites methods from "Heterogeneous built-in resiliency of application specific programmable processors"

  • ...Outside the FPGA domain, a good example of a coarse-grain matching-based technique is presented by Kim et al. in [25]....

References
Book
01 Dec 1989
TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.
Abstract: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today. In this edition, the authors bring their trademark method of quantitative analysis not only to high-performance desktop machine design, but also to the design of embedded and server systems. They have illustrated their principles with designs from all three of these domains, including examples from consumer electronics, multimedia and Web technologies, and high-performance computing.

11,671 citations


"Heterogeneous built-in resiliency o..." refers methods in this paper

  • ...We assume the dedicated register file model [12, 11] where each register is connected to a single input of an execution unit, while each unit can send data to an arbitrary number of registers....

Book
01 Jan 1990
TL;DR: The new edition of Breuer-Friedman's Diagnosis and Reliable Design of Digital Systems offers comprehensive and state-of-the-art treatment of both testing and testable design.
Abstract: For many years, Breuer-Friedman's Diagnosis and Reliable Design of Digital Systems was the most widely used textbook in digital system testing and testable design. Now, Computer Science Press makes available a new and greatly expanded edition. Incorporating a significant amount of new material related to recently developed technologies, the new edition offers comprehensive and state-of-the-art treatment of both testing and testable design.

2,758 citations

Journal ArticleDOI
TL;DR: This self-contained paper develops the theory necessary to statically schedule SDF programs on single or multiple processors; a class of static (compile-time) scheduling algorithms is proven valid, and specific algorithms are given for scheduling SDF systems onto single or multiple processors.
Abstract: Large grain data flow (LGDF) programming is natural and convenient for describing digital signal processing (DSP) systems, but its runtime overhead is costly in real-time or cost-sensitive applications. In some situations, designers are not willing to squander computing resources for the sake of programmer convenience. This is particularly true when the target machine is a programmable DSP chip. However, the runtime overhead inherent in most LGDF implementations is not required for most signal processing systems, because such systems are mostly synchronous (in the DSP sense). Synchronous data flow (SDF) differs from traditional data flow in that the amount of data produced and consumed by a data flow node is specified a priori for each input and output. This is equivalent to specifying the relative sample rates in a signal processing system. It means that the scheduling of SDF nodes need not be done at runtime, but can be done at compile time (statically), so the runtime overhead evaporates. The sample rates can all be different, which is not true of most current data-driven digital signal processing programming methodologies. Synchronous data flow is closely related to computation graphs, a special case of Petri nets. This self-contained paper develops the theory necessary to statically schedule SDF programs on single or multiple processors. A class of static (compile-time) scheduling algorithms is proven valid, and specific algorithms are given for scheduling SDF systems onto single or multiple processors.
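
For a simple actor chain, the balance equations behind this result can be solved directly. A minimal sketch follows; general SDF graphs require the smallest positive integer solution of the full system of balance equations, not just chain propagation.

```python
from fractions import Fraction
from math import lcm

# Arcs of a chain A -> B -> C: (producer, consumer, produced, consumed).
# SDF fixes these token rates a priori, so a repetition vector q with
# produced * q[src] == consumed * q[dst] on every arc exists at compile time.
ARCS = [("A", "B", 2, 3), ("B", "C", 1, 2)]

def repetition_vector(arcs):
    rates = {arcs[0][0]: Fraction(1)}           # fix the first actor, propagate
    for src, dst, produced, consumed in arcs:   # arcs listed in chain order
        rates[dst] = rates[src] * produced / consumed
    scale = lcm(*(r.denominator for r in rates.values()))
    return {actor: int(rate * scale) for actor, rate in rates.items()}

print(repetition_vector(ARCS))   # {'A': 3, 'B': 2, 'C': 1}
```

One period of the static schedule then fires A three times, B twice, and C once, and no runtime scheduler is needed.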

1,380 citations


"Heterogeneous built-in resiliency o..." refers background in this paper

  • ...Therefore, a natural and proper computational model for those important application domains is synchronous data flow [7]....

Book
29 Feb 1992
TL;DR: This book presents a design methodology for high-level synthesis, covering architectural models, quality measures, design description languages, design representation and transformations, partitioning, scheduling, and allocation.
Abstract: Preface. 1. Introduction. 2. Architectural Models in Synthesis. 3. Quality Measures. 4. Design Description Languages. 5. Design Representation and Transformations. 6. Partitioning. 7. Scheduling. 8. Allocation. 9. Design Methodology for High-Level Synthesis. Bibliography. Index.
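
As a taste of the scheduling material such a book covers, an ASAP (as-soon-as-possible) schedule, the simplest of the classic high-level synthesis scheduling techniques, assigns each operation to the earliest control step after all of its predecessors. This is an illustrative sketch of the textbook technique, not code taken from the book.

```python
def asap_schedule(deps):
    """deps maps each operation to the operations whose results it uses;
    returns operation -> control step (1-based)."""
    step = {}
    def visit(op):
        if op not in step:
            step[op] = 1 + max((visit(p) for p in deps[op]), default=0)
        return step[op]
    for op in deps:
        visit(op)
    return step

# a*b + c*d: both multiplies can run in step 1, the add follows in step 2.
print(asap_schedule({"mul1": [], "mul2": [], "add": ["mul1", "mul2"]}))
```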

1,104 citations


"Heterogeneous built-in resiliency o..." refers background in this paper

  • ...Behavioral synthesis has been an active area of research for more than two decades [3, 8], and numerous outstanding systems have been built targeting both data-path-oriented and control-oriented applications [15, 8]....
