Showing papers by "Rajeev Balasubramonian" published in 2005


Proceedings ArticleDOI
12 Feb 2005
TL;DR: This paper proposes and evaluates microarchitectural techniques that exploit a heterogeneous interconnect composed of wires with varying latency, bandwidth, and energy characteristics, and demonstrates that these techniques reduce overall processor ED² by up to 11% compared to a baseline processor that employs a homogeneous interconnect.
Abstract: Future high-performance billion-transistor processors are likely to employ partitioned architectures to achieve high clock speeds, high parallelism, low design complexity, and low power. In such architectures, inter-partition communication over global wires has a significant impact on overall processor performance and power consumption. VLSI techniques allow a variety of wire implementations, but these wire properties have previously never been exposed to the microarchitecture. This paper advocates global wire management at the microarchitecture level and proposes a heterogeneous interconnect that is comprised of wires with varying latency, bandwidth, and energy characteristics. We propose and evaluate microarchitectural techniques that can exploit such a heterogeneous interconnect to improve performance and reduce energy consumption. These techniques include a novel cache pipeline design, the identification of narrow bit-width operands, the classification of non-critical data, and the detection of interconnect load imbalance. For a dynamically scheduled partitioned architecture, our results demonstrate that the proposed innovations result in up to 11% reductions in overall processor ED², compared to a baseline processor that employs a homogeneous interconnect.

64 citations


Patent
08 Jun 2005
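TL;DR: This patent describes a multi-cluster processor in which the number of active clusters is varied dynamically: the optimal configuration is determined at the start of each program phase and used until the next phase change is detected, and the optimal instruction interval is found by starting with a minimum interval and doubling it until a low stability factor is reached.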
Abstract: In a processor having multiple clusters which operate in parallel, the number of clusters in use can be varied dynamically. At the start of each program phase, each configuration option is run for an interval to determine the optimal configuration, which is used until the next phase change is detected. The optimal instruction interval is determined by starting with a minimum interval and doubling it until a low stability factor is reached.

21 citations
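
The interval search in the patent abstract above (start at a minimum interval, double until the stability factor is low) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the abstract does not define the stability factor, so `stability_factor` is a hypothetical callback that returns a number where lower means more stable, and `min_interval`, `max_interval`, and `threshold` are made-up parameters.

```python
from typing import Callable


def pick_interval(stability_factor: Callable[[int], float],
                  min_interval: int = 1_000,
                  max_interval: int = 1_000_000,
                  threshold: float = 0.05) -> int:
    """Double the interval length until the stability factor is low.

    All names and default values here are assumptions for illustration;
    the abstract only states the start-small-and-double strategy.
    """
    interval = min_interval
    # Keep doubling while the measurement is still above the threshold
    # and doubling again would not exceed the (assumed) upper bound.
    while stability_factor(interval) > threshold and interval * 2 <= max_interval:
        interval *= 2
    return interval


if __name__ == "__main__":
    # Toy model (assumption): longer intervals look more stable.
    print(pick_interval(lambda n: 150 / n))
```

The upper bound is a design choice added here so the loop always terminates, even for a program whose behavior never stabilizes; the abstract does not mention one.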