Author

Raghavendra K

Bio: Raghavendra K is an academic researcher from the Indian Institutes of Technology. The author has contributed to research in the topics of Cache and Shared memory, has an h-index of 1, and has co-authored 2 publications receiving 5 citations.
Topics: Cache, Shared memory, Queue, Access time, Router

Papers
Proceedings ArticleDOI
10 Mar 2008
TL;DR: Experimental results reveal that the proposed process variation aware issue queue design recovers most of the lost performance due to process variation and incurs a performance penalty of less than 2% with respect to the performance of issue queues without process variation.
Abstract: In sub-90 nm process technology it becomes harder to control the fabrication process, which in turn causes variations between the design-time parameters and the fabricated parameters. Variations in the critical process parameters can result in significant fluctuations in the switching speed and leakage power consumption of different transistors on the same chip. In this paper, we study the impact of process variation on issue queues. Due to process variation, issue queue entries can have variable access latencies. To work with issue queues of non-uniform access latency, we propose a process-variation-aware issue queue design that exploits the operands of instructions that are already ready at dispatch time. Experimental results reveal that, for a 64-entry issue queue with half of the entries affected by process variation, our technique recovers most of the performance lost to process variation and incurs a performance penalty of less than 2% with respect to issue queues without process variation.
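
As a toy illustration of the dispatch-time idea above (an instruction whose source operands are all ready at dispatch no longer depends on fast wakeup, so it can tolerate a slow, variation-affected entry), the following Python sketch steers instructions between fast and slow issue-queue partitions. The partition sizes, the steering policy, and the data structures are illustrative assumptions, not the design evaluated in the paper.

```python
# Hypothetical sketch of dispatch-time steering into a PV-affected issue queue.
# Partition sizes and the steering policy are illustrative assumptions,
# not the design described in the paper.
from collections import deque

FAST, SLOW = "fast", "slow"

class IssueQueue:
    def __init__(self, n_entries=64, n_slow=32):
        # Assume half of the entries are slowed down by process variation.
        self.free = {FAST: deque(range(n_entries - n_slow)),
                     SLOW: deque(range(n_entries - n_slow, n_entries))}

    def dispatch(self, instr):
        """Steer an instruction to a slow entry only if all of its source
        operands are already ready, so it will not rely on fast wakeup."""
        prefer = SLOW if all(instr["ready"]) else FAST
        fallback = FAST if prefer == SLOW else SLOW
        for kind in (prefer, fallback):
            if self.free[kind]:
                return kind, self.free[kind].popleft()
        return None  # queue full: dispatch stalls

iq = IssueQueue()
print(iq.dispatch({"op": "add", "ready": [True, True]}))   # -> slow entry
print(iq.dispatch({"op": "mul", "ready": [False, True]}))  # -> fast entry
```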

5 citations

Proceedings ArticleDOI
01 Oct 2020
TL;DR: A congestion management technique in the LLC is proposed that equips the NoC router with a small storage to keep copies of heavily shared cache blocks, together with a prediction classifier in the LLC controller to identify such blocks.
Abstract: Multiple cores in a tiled multi-core processor are connected using a network-on-chip mechanism. All these cores share the last-level cache (LLC). For large-sized LLCs, generally, non-uniform cache architecture design is considered, where the LLC is split into multiple slices. Accessing highly shared cache blocks from an LLC slice by several cores simultaneously results in congestion at the LLC, which in turn increases the access latency. To deal with this issue, we propose a congestion management technique in the LLC that equips the NoC router with small storage to keep a copy of heavily shared cache blocks. To identify highly shared cache blocks, we also propose a prediction classifier in the LLC controller. We implement our technique in Sniper, an architectural simulator for multi-core systems, and evaluate its effectiveness by running a set of parallel benchmarks. Our experimental results show that the proposed technique is effective in reducing the LLC access time.
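
To make the mechanism concrete, here is a minimal sketch of how an LLC-side classifier might count sharers and replicate blocks it predicts to be heavily shared into a small router buffer. The sharing threshold, the buffer capacity, and the LRU replacement are assumptions made for illustration; the paper's actual predictor and storage organization may differ.

```python
# Illustrative sketch of a sharer-count classifier that flags heavily shared
# LLC blocks for replication in a small NoC-router buffer. The threshold,
# buffer capacity, and LRU policy are assumptions, not the paper's design.
from collections import OrderedDict

SHARE_THRESHOLD = 4       # distinct cores before a block is considered "hot"
ROUTER_BUFFER_BLOCKS = 8  # small per-router storage (assumed capacity)

class SharingClassifier:
    def __init__(self):
        self.sharers = {}                   # block address -> set of core ids
        self.router_buffer = OrderedDict()  # block address -> data (LRU order)

    def on_llc_access(self, core_id, addr, data):
        self.sharers.setdefault(addr, set()).add(core_id)
        if len(self.sharers[addr]) >= SHARE_THRESHOLD:
            # Predicted hot: keep a copy near the router to absorb future requests.
            self.router_buffer[addr] = data
            self.router_buffer.move_to_end(addr)
            if len(self.router_buffer) > ROUTER_BUFFER_BLOCKS:
                self.router_buffer.popitem(last=False)  # evict LRU copy

    def lookup(self, addr):
        return self.router_buffer.get(addr)  # a hit avoids the congested LLC slice

clf = SharingClassifier()
for core in range(5):
    clf.on_llc_access(core, 0x80, data="blockA")
print(clf.lookup(0x80))  # "blockA" once the sharer count crosses the threshold
```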

2 citations


Cited by
Journal ArticleDOI
TL;DR: A survey of architectural techniques for managing process variation (PV) in modern processors is presented and these techniques are classified based on several important parameters to bring out their similarities and differences.
Abstract: Process variation—deviation in parameters from their nominal specifications—threatens to slow down and even pause technological scaling, and mitigation of it is the way to continue the benefits of chip miniaturization. In this article, we present a survey of architectural techniques for managing process variation (PV) in modern processors. We also classify these techniques based on several important parameters to bring out their similarities and differences. The aim of this article is to provide insights to researchers into the state of the art in PV management techniques and motivate them to further improve these techniques for designing PV-resilient processors of tomorrow.

68 citations

Proceedings ArticleDOI
24 Jun 2008
TL;DR: This paper provides a comprehensive analysis of the effects of PV on the microprocessor issue queue and proposes mechanisms that allow fast and slow issue-queue entries to co-exist, in turn enabling instruction dispatch, issue, and forwarding to proceed with minimal stalls.
Abstract: The last few years have witnessed an unprecedented explosion in transistor densities. Diminutive feature sizes have enabled microprocessor designers to break the billion-transistors-per-chip mark. However, various new reliability challenges such as process variation (PV) have emerged that can no longer be ignored by chip designers. In this paper, we provide a comprehensive analysis of the effects of PV on the microprocessor's issue queue. Variations can slow down issue queue entries and result in as much as 20.5% performance degradation. To counter this, we look at different solutions that include instruction steering, operand- and port-switching mechanisms. Given that PV is non-deterministic at design time, our mechanisms allow the fast and slow issue-queue entries to co-exist, in turn enabling instruction dispatch, issue and forwarding to proceed with minimal stalls. Evaluation on a detailed simulation environment indicates that the proposed mechanisms can reduce performance degradation due to PV to a low 1.3%.
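
The abstract names operand switching as one of the mechanisms; one plausible reading of it is sketched below: when process variation leaves one tag comparator of an issue-queue entry slower than the other, the operand that is still waiting for wakeup is placed on the fast slot. The two-slot entry model and the swap condition are assumptions made for illustration only, not the paper's definition.

```python
# Hypothetical illustration of operand switching: if one tag comparator in an
# issue-queue entry is slowed by process variation, place the operand that is
# still waiting for wakeup on the fast comparator slot. The entry model and
# the per-slot speeds are assumptions for illustration.

def place_operands(entry_speeds, operands):
    """entry_speeds: ('fast'|'slow', 'fast'|'slow') for the entry's two slots.
    operands: two (tag, ready) pairs for the instruction's source operands.
    Returns the operands reordered so a waiting operand sits on a fast slot."""
    assert len(entry_speeds) == len(operands) == 2
    (s0, s1), (op0, op1) = entry_speeds, operands
    # Swap only when it moves a not-yet-ready operand from a slow to a fast slot.
    if s0 == "slow" and s1 == "fast" and not op0[1] and op1[1]:
        return [op1, op0]
    if s1 == "slow" and s0 == "fast" and not op1[1] and op0[1]:
        return [op0, op1]
    return [op0, op1]

print(place_operands(("slow", "fast"), [("r3", False), ("r7", True)]))
# -> [('r7', True), ('r3', False)] : the waiting tag r3 now uses the fast slot
```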

8 citations

Proceedings ArticleDOI
04 May 2008
TL;DR: The goal of this work is to find the optimal frequency that balances performance against power in the presence of core asymmetry, and it is demonstrated that traditional task scheduling techniques need to be revisited to mitigate the effects of process variations.
Abstract: Faced with the challenge of finding ways to use an ever-growing transistor budget, microarchitects have moved towards chip multiprocessors (CMPs) as an attractive solution. CMPs have become a common way of reducing chip complexity and power consumption while maintaining high performance. Multiple cores are replicated on a single chip, resulting in a potential linear scaling of performance, and cores are becoming sufficiently small with technology scaling. As technology continues to scale, inter-die and intra-die variations in process parameters can have a significant impact on performance and power consumption, leading to asymmetry among cores that were designed to be symmetric. Adaptive voltage scaling can be used to bring all cores to the same performance level, leaving only core-to-core power variations. The goal of our work is to find the optimal frequency that balances performance against power in the presence of this asymmetry. We also demonstrate that traditional task scheduling techniques need to be revisited to mitigate the effects of process variations.
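
The trade-off described above can be pictured with a toy model: to run variation-affected cores at one common frequency, the slower cores need a higher supply voltage and therefore more power, so raising the common frequency eventually costs more in power than it returns in throughput. The dynamic-power expression, the per-core voltage values, and the throughput-per-watt objective below are simplifying assumptions, not the paper's formulation.

```python
# Toy model of picking a single chip-wide frequency for PV-asymmetric cores.
# The power model (P ~ C * V^2 * f), the per-core voltages, and the
# throughput-per-watt objective are illustrative assumptions only.

def chip_power(freq_ghz, core_voltages_at_freq, c_eff=1.0):
    # Slower cores need a higher supply voltage to sustain the common frequency.
    return sum(c_eff * v * v * freq_ghz for v in core_voltages_at_freq)

def best_common_frequency(candidates):
    """candidates: list of (freq_ghz, [per-core voltage needed at that freq])."""
    def throughput_per_watt(item):
        f, volts = item
        return (f * len(volts)) / chip_power(f, volts)
    return max(candidates, key=throughput_per_watt)[0]

# Example: at higher frequencies the slow (PV-affected) cores need much more voltage.
candidates = [
    (2.0, [0.90, 0.92, 0.95, 1.05]),
    (2.4, [0.95, 0.98, 1.05, 1.25]),
    (2.8, [1.00, 1.05, 1.20, 1.45]),
]
print(best_common_frequency(candidates))  # frequency with the best throughput/watt
```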

8 citations

Dissertation
01 Jan 2013
TL;DR: This thesis presents combinatorial digital design in a 65 nm transistor technology operating in the near/sub-threshold region and proposes new standard digital building blocks optimized for sub-threshold performance.
Abstract: This thesis shows combinatorial digital design using 65 nm transistor technology operating in the near/sub-threshold region. The goal is a 16-by-9-bit adder for micro-beamforming, optimized for power consumption under a speed requirement of 50 MHz per operation. To optimize the addition of the sixteen 9-bit numbers, different building blocks are studied to find those best suited for low power consumption, robustness, and regular layout design, without breaking the speed requirement. New standard digital building blocks optimized for sub-threshold performance are proposed, and a way to make regular layout designs is shown. As a final result, a 16-by-9-bit adder layout is presented with a delay of 17.7 ns (56.5 MHz) and a power consumption of 25 µW at 20 °C, and a delay of 10 ns (100 MHz) with a power consumption of 36.2 µW at 80 °C. The design is built from 6736 transistors and uses an area of 240 µm × 84 µm ≈ 20,160 µm² (about 0.02 mm²).
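
The delay, frequency, and area figures can be cross-checked with simple unit arithmetic, which is also how the area quoted above was restated:

```python
# Quick arithmetic check of the figures quoted in the abstract.

for delay_ns in (17.7, 10.0):
    print(f"{delay_ns} ns  ->  {1e3 / delay_ns:.1f} MHz")  # 56.5 MHz and 100.0 MHz

width_um, height_um = 240, 84
area_um2 = width_um * height_um
print(f"area = {area_um2} um^2 = {area_um2 / 1e6:.4f} mm^2")  # 20160 um^2, ~0.02 mm^2
```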

3 citations

Proceedings ArticleDOI
16 Mar 2009
TL;DR: This paper investigates exploiting the statistical features of circuit delay by cascading dependent instructions to reduce variations, and examines how efficiently instruction cascading improves the performance yield of processors.
Abstract: As semiconductor technologies are aggressively advanced, the problem of parameter variations is emerging. Process variations in transistors affect circuit delay, resulting in serious yield loss. Given this situation, variation-aware designs for yield enhancement are of interest to researchers. This paper investigates exploiting the statistical features of circuit delay and cascading dependent instructions to reduce variations. Combining statistical static timing analysis at the circuit level with performance evaluation at the processor level, this paper tries to unveil how efficiently instruction cascading improves the performance yield of processors. Cascading instructions increases logic depth and decreases the standard deviation of the circuit delay, which might improve the performance yield of microprocessors. Unfortunately, however, it is found that variability reduction at the circuit level does not always mean yield enhancement at the microarchitecture level.
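
The statistical intuition behind cascading is that an n-stage path's delay is the sum of n stage delays; for independent, identically distributed stages the mean grows linearly with n while the standard deviation grows only with the square root of n, so the relative variation shrinks as 1/sqrt(n). The snippet below illustrates that scaling; the per-stage mean and sigma are arbitrary illustrative values.

```python
# Relative delay variation of a cascaded path, assuming n independent,
# identically distributed stage delays: mean scales as n, sigma as sqrt(n),
# so sigma/mean shrinks as 1/sqrt(n). Stage mean/sigma are illustrative values.
import math

stage_mean_ps, stage_sigma_ps = 20.0, 3.0

for n in (1, 4, 16, 64):
    path_mean = n * stage_mean_ps
    path_sigma = math.sqrt(n) * stage_sigma_ps
    print(f"depth {n:3d}: mean {path_mean:7.1f} ps, "
          f"sigma {path_sigma:5.1f} ps, sigma/mean {path_sigma / path_mean:.3f}")
```

Note that the same arithmetic shows the nominal path delay growing with depth, which hints at why, as the abstract concludes, a lower relative sigma at the circuit level does not automatically translate into better yield at the microarchitecture level.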

1 citation