Media architecture: general purpose vs. multiple application-specific programmable processor

doi:10.1145/277044.277136

Proceedings Article•DOI•

Chunho Lee¹, Johnson Kin¹, Miodrag Potkonjak¹, William H. Mangione-Smith¹•Institutions (1)

University of California, Los Angeles¹

01 May 1998-pp 321-326

TL;DR: A framework that makes it possible for a designer to rapidly explore the application-specific programmable processor design space under area constraints is reported, which can be valuable in making early design decisions such as area and architectural trade-offs, cache and instruction issue width trade-off under area constraint, and the number of branch units and issue width.

Abstract: In this paper we report a framework that makes it possible for a designer to rapidly explore the application-specific programmable processor design space under area constraints. The framework uses a production-quality compiler and simulation tools to synthesize a high-performance machine for an application. Using the framework, we evaluate the validity of the fundamental assumption behind the development of application-specific programmable processors. Application-specific processors are based on the idea that applications differ from each other in key architectural parameters, such as the available instruction-level parallelism, the demand on various hardware components (e.g., cache memory units, register files), and the need for different numbers of functional units. We found that the framework introduced in this paper can be valuable in making early design decisions such as area and architectural trade-offs, cache and instruction issue width trade-offs under area constraints, and the number of branch units and issue width.
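
The kind of area-constrained exploration the abstract describes can be illustrated with a small sketch. This is not the authors' framework: the `Config` fields, the `area_mm2` cost coefficients, and the `toy_cycles` performance model are all hypothetical stand-ins. In the paper, performance comes from a production-quality compiler and simulation tools, and the paper's own area model is not reproduced here.

```python
from itertools import product
from typing import Callable, NamedTuple, Optional, Tuple


class Config(NamedTuple):
    """One candidate machine (hypothetical parameter set)."""
    issue_width: int
    branch_units: int
    icache_kb: int
    dcache_kb: int


def area_mm2(cfg: Config) -> float:
    """Hypothetical linear area model; the coefficients are made up."""
    datapath = 2.0 * cfg.issue_width + 0.5 * cfg.branch_units
    caches = 0.25 * (cfg.icache_kb + cfg.dcache_kb)
    return datapath + caches


def toy_cycles(cfg: Config) -> float:
    """Stand-in for a compile-and-simulate run of one application.

    Assumes the application exposes at most 4-way instruction-level
    parallelism, so issue slots beyond 4 cost area but add no speed.
    """
    usable_width = min(cfg.issue_width, 4)
    compute = 1_000_000 / usable_width
    branch_stalls = 100_000 / cfg.branch_units
    miss_stalls = 200_000 / cfg.icache_kb + 400_000 / cfg.dcache_kb
    return compute + branch_stalls + miss_stalls


def explore(
    area_budget: float,
    evaluate: Callable[[Config], float] = toy_cycles,
) -> Optional[Tuple[Config, float]]:
    """Exhaustively search a small grid; return the fastest config that fits."""
    widths = (1, 2, 4, 8)
    branch_unit_counts = (1, 2, 4)
    cache_sizes_kb = (1, 2, 4, 8, 16, 32)
    best: Optional[Tuple[Config, float]] = None
    for w, b, ic, dc in product(widths, branch_unit_counts,
                                cache_sizes_kb, cache_sizes_kb):
        cfg = Config(w, b, ic, dc)
        # Skip machines with more branch units than issue slots,
        # and machines that exceed the area budget.
        if b > w or area_mm2(cfg) > area_budget:
            continue
        cycles = evaluate(cfg)
        if best is None or cycles < best[1]:
            best = (cfg, cycles)
    return best


if __name__ == "__main__":
    for budget in (10.0, 20.0, 40.0):
        print(budget, explore(budget))
```

Running `explore` with a different `evaluate` function for each application gives a rough feel for the question the paper asks: whether different applications, under the same area budget, end up preferring different issue widths, branch unit counts, and cache sizes.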

Citations

Journal Article•DOI•

Local watermarks: methodology and application to behavioral synthesis

Darko Kirovski¹, Miodrag Potkonjak²•Institutions (2)

Microsoft¹, University of California, Los Angeles²

04 Sep 2003-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This work introduces local watermarks, an IP protection technique which facilitates watermark detection in many realistic design and adversarial scenarios, while satisfying the demand for low overhead and design transparency.

Abstract: Recently, the electronic design automation industry has adopted the intellectual property (IP) business model as a dominant system-on-chip development platform. Since copyright fraud has been recognized as the most devastating obstruction to this model, a number of techniques for IP protection have been introduced. Most of them rely on a selection of a global solution to a design optimization problem according to a unique user-specific digital signature. Although such techniques provide strong proof of authorship, they fail to provide an effective procedure for watermark detection when a protected core design is augmented into a larger design. To address this fundamental issue, we introduce local watermarks, an IP protection technique which facilitates watermark detection in many realistic design and adversarial scenarios, while satisfying the demand for low overhead and design transparency. We demonstrate the efficiency of the new IP protection paradigm by applying its principles to a set of behavioral synthesis tasks such as operation scheduling and template matching.

45 citations

Architecture and compiler design issues in programmable media processors

Wayne Wolf, Jason E. Fritts

01 Jan 2000

TL;DR: A speculative run-time technique for data parallelism that executes loop iterations in parallel across a multi-clustered architecture is proposed and provides architecture support for identifying and recovering from misspeculations.

Abstract: The processing demands for multimedia applications are rapidly escalating. Many current applications are pushing the limits of existing microprocessors, and the next generation of multimedia promises considerably greater demands. Adequate support for future multimedia requires the flexibility and computing power of high-level language (HLL) programmable media processors. This thesis examines the architecture and compiler design issues for programmable media processors. Design of the architecture requires an accurate understanding of multimedia characteristics. Using the MediaBench benchmark suite and the Impact compiler, workload and architecture evaluations were performed to define the essential architecture for programmable media processors. The workload evaluation examines various processing aspects, including functional necessities, data types and sizes, branch performance, loop characteristics, memory statistics, and instruction level parallelism. The architecture evaluation examines the performance of dynamic versus static architecture features. Most existing media processors use static architectures, but as processors progress to higher frequencies, the dynamic aspects become more prominent and dynamic hardware may be necessary to minimize stall penalties. The architecture evaluation examines static versus dynamic scheduling, dynamic aspects of instruction fetch, and performance effects in higher frequency processors. Finally, an investigation of the memory hierarchy identifies the most significant bottlenecks in memory performance. The high degree of parallelism available in multimedia applications is well researched, but less well understood is how a compiler extracts and schedules that parallelism to highly parallel architectures. Evaluation of the compiler issues begins with an investigation of the available parallelism in multimedia. 
While instruction level parallelism unfortunately provides only modest performance, data parallelism offers a promising avenue for increased parallelism. However, data parallelism is of a coarser level of granularity than instruction level parallelism, so conventional compiler methods do not prove very effective. Parallel compiler methods are necessary to realize the benefits of data parallelism. Unfortunately, parallel compilation requires complex dependence analysis that is often unable to identify all available parallelism. Consequently, we propose a speculative run-time technique for data parallelism that executes loop iterations in parallel across a multi-clustered architecture. This method speculatively executes several loop iterations in parallel and provides architecture support for identifying and recovering from misspeculations.

26 citations

Cites background or result from "Media architecture: general purpose..."

...The overall results, while not spectacular, are reasonable and consistent with similar results found in a separate study of ILP on MediaBench [102]....
[...]
...Only a select few have examined media processors using full application benchmarks [66][102][60], and these have all proposed static architectures....
[...]
...Among these, only [66] and [102] use compiled code from full applications, and only the second of these two evaluates multiple-issue media processors....
[...]
...Some small data sets and traces. Augmented for greater representation of future multimedia: MPEG-4 object-oriented video, H.263 very-low bitrate video. IMPACT Compiler: aggressive ILP research compiler with superblock (speculation), hyperblock (predication), and loop unrolling. Three levels of optimizations: Classical (classical optimizations only), Superscalar (adds loop unrolling and superblock formation), Hyperblock (adds hyperblock optimization). Architecture-independent evaluation: large, generic instruction set; retargetable back-end. Performance analysis tools: profiling; simulation for superscalar, VLIW architectures. Workload Evaluation. Characteristics of Multimedia: compile with classical optimizations only. Related Work: [Lee97] “MediaBench: A Tool for Evaluating and Synthesizing Multimedia Communication Systems,” MICRO-30, 1997....
[...]
...(diagram labels: Parallelism Analysis, Performance, Applications, Performance Numbers, Mapping, Architecture Instance) MediaBench Benchmark Suite: developed at UCLA. [CLee97] “MediaBench: A Tool for Evaluating and Synthesizing Multimedia Communication Systems,” MICRO-30, 1997....
[...]

Journal Article•DOI•

Processor-memory coexploration using an architecture description language

Prabhat Mishra¹, Mahesh Mamidipaka¹, Nikil Dutt¹•Institutions (1)

University of California, Irvine¹

01 Feb 2004-ACM Transactions on Embedded Computing Systems

TL;DR: This paper presents a language-based approach to explicitly capture the memory subsystem configuration, generate a memory-aware software toolkit, and perform coexploration of the processor-memory architectures.

Abstract: Memory represents a major bottleneck in modern embedded systems in terms of cost, power, and performance. Traditionally, memory organizations for programmable embedded systems assume a fixed cache hierarchy. With the widening processor-memory gap, more aggressive memory technologies and organizations have appeared, allowing customization of a heterogeneous memory architecture tuned for specific target applications. However, such a processor-memory coexploration approach critically needs the ability to explicitly capture heterogeneous memory architectures. We present in this paper a language-based approach to explicitly capture the memory subsystem configuration, generate a memory-aware software toolkit, and perform coexploration of the processor-memory architectures. We present a set of experiments using our memory-aware architectural description language (ADL) to drive the exploration of the memory subsystem for the TI C6211 processor architecture, demonstrating cost, performance, and energy trade-offs.

18 citations

Evaluation of Static and Dynamic Scheduling for Media Processors

Jason E. Fritts¹, Wayne Wolf²•Institutions (2)

Washington University in St. Louis¹, Princeton University²

01 Jan 2000

TL;DR: The results indicate that dynamic out-of-order scheduling methods enable significantly higher degrees of parallelism and are less susceptible to high frequency effects and may be necessary for maximizing performance in future media processors.

Abstract: This paper presents the results of an architecture style evaluation that compares the performance of static scheduling and dynamic scheduling for media processors. Existing programmable media processors have predominantly used statically scheduled architectures. As future media processors progress to higher frequencies and higher degrees of parallelism, the dynamic aspects of processing become more pronounced and dynamic hardware support may be needed to achieve high performance. This paper explores many of the dynamic aspects of media processing by evaluating various fundamental architecture styles and the frequency effects on those architecture models. The results indicate that dynamic out-of-order scheduling methods enable significantly higher degrees of parallelism and are less susceptible to high frequency effects. Consequently, dynamic scheduling support may be necessary for maximizing performance in future media processors.

15 citations

Cites background from "Media architecture: general purpose..."

...Only a select few have examined media processors using full applications on static architectures [7][8], and only the latter evaluates multiple-issue processors....
[...]

Proceedings Article•DOI•

Localized watermarking: methodology and application to operation scheduling

Darko Kirovski¹, Miodrag Potkonjak¹•Institutions (1)

University of California, Los Angeles¹

07 Nov 1999

TL;DR: Localized watermarking is proposed as an IP protection technique that protects parts of a design and keeps the watermark detectable when the design is embedded in a larger one, while satisfying the demand for low cost and transparency; a set of protocols implements the new watermarking methodology at the operation scheduling design level.

Abstract: Recently, a number of techniques for IP protection have been introduced that rely on a selection of a global solution to an optimization problem according to a unique user-specific digital signature. Although such techniques may provide convincing proof of authorship with low hardware overhead, they fail to protect parts of a design, do not provide an easy procedure for watermark detection, and are not capable of detecting the watermark when the design or a part of it is embedded in another, larger design. Since these demands are of the highest interest for the IP business, we introduce localized watermarking as an IP protection technique that enables these features while satisfying the demand for low cost and transparency. We propose a set of protocols that implement the new watermarking methodology at the operation scheduling design level. We have demonstrated that erasing the signature, or finding another signature, in the synthesized design can be made arbitrarily computationally difficult. The watermarking method has been tested on a set of real-life benchmarks where high likelihood of authorship has been achieved with negligible overhead in solution quality.

11 citations

References

Computers and Intractability: A Guide to the Theory of NP-Completeness

Michael Randolph Garey

01 Jan 1979

42,654 citations

Book•

Computers and Intractability: A Guide to the Theory of NP-Completeness

Michael Randolph Garey, David S. Johnson

01 Jan 1979

TL;DR: The second edition of a quarterly column as discussed by the authors provides a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book “Computers and Intractability: A Guide to the Theory of NP-Completeness,” W. H. Freeman & Co., San Francisco, 1979.

Abstract: This is the second edition of a quarterly column the purpose of which is to provide a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book “Computers and Intractability: A Guide to the Theory of NP-Completeness,” W. H. Freeman & Co., San Francisco, 1979 (hereinafter referred to as “[G&J]”; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed. Readers having results they would like mentioned (NP-hardness, PSPACE-hardness, polynomial-time-solvability, etc.), or open problems they would like publicized, should send them to David S. Johnson, Room 2C355, Bell Laboratories, Murray Hill, NJ 07974, including details, or at least sketches, of any new proofs (full papers are preferred). In the case of unpublished results, please state explicitly that you would like the results mentioned in the column. Comments and corrections are also welcome. For more details on the nature of the column and the form of desired submissions, see the December 1981 issue of this journal.

40,020 citations

Book•

Computer Architecture: A Quantitative Approach

John L. Hennessy¹, David A. Patterson²•Institutions (2)

Stanford University¹, University of California, Berkeley²

01 Dec 1989

TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.

Abstract: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today. In this edition, the authors bring their trademark method of quantitative analysis not only to high-performance desktop machine design, but also to the design of embedded and server systems. They have illustrated their principles with designs from all three of these domains, including examples from consumer electronics, multimedia and Web technologies, and high-performance computing.

11,671 citations

Computers and Intractability: A Guide to the Theory of NP-Completeness

Michael Randolph Garey, D. S. Johnson

01 Jan 1999

3,564 citations

Proceedings Article•

The Art of Computer Systems Performance Analysis.

Raj Jain

01 Jan 1990