Showing papers by "V. Kamakoti" published in 2011


Proceedings ArticleDOI
04 Jul 2011
TL;DR: A two-phase algorithm to size the switch buffers in NoCs is proposed that results in a 42% reduction in the amount of buffering required to meet the application constraints when compared to a standard buffering approach.
Abstract: Buffers in on-chip networks constitute a significant proportion of the power consumption and area of the interconnect. Hence, reducing the buffering overhead of Networks on Chips (NoCs) is an important problem. For application-specific designs, the network utilization across the different links and switches is non-uniform, thereby requiring a buffer sizing approach that tackles the non-uniformity. Moreover, congestion effects that occur during network operation need to be captured when sizing the buffers. To this end, we propose a two-phase algorithm to size the switch buffers in NoCs. Our algorithm considers both the static (based on bandwidth and latency requirements) and dynamic (based on simulation) effects when sizing buffers. Our experiments show that the application of the algorithm results in a 42% reduction in the amount of buffering required to meet the application constraints when compared to a standard buffering approach.

9 citations


Proceedings ArticleDOI
19 Dec 2011
TL;DR: This paper presents a new randomization-based heuristic algorithm for QAP and shows that the proposed algorithm finds competitive solutions comparable with one of the best heuristics reported in the literature, while consuming a significantly smaller amount of CPU time.
Abstract: The problem of placement is well known in Computer Aided Design (CAD) of VLSI chips, DNA microarrays and microfluidic biochips. Because of the similarity of the placement problem across diverse domains, a generalization of the same is reported in the literature. The generalized placement problem is an instance of the classical Quadratic Assignment Problem (QAP). In this paper, we present a new randomization-based heuristic algorithm for QAP. The key to the success of the proposed technique is a novel probability distribution that is employed by the heuristic to generate the necessary randomization. We show through simulation results that the proposed algorithm finds competitive solutions comparable with one of the best heuristics reported in the literature, while consuming a significantly smaller amount of CPU time.

5 citations
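
For readers unfamiliar with the QAP mentioned in the entry above, the classical objective assigns facility i to location perm[i] and sums flow times distance over all pairs. The sketch below is not the authors' algorithm: the abstract does not describe their probability distribution, so plain uniform random sampling is substituted as a baseline. The names `qap_cost` and `random_search` and the parameters `trials` and `seed` are illustrative assumptions.

```python
import random


def qap_cost(flow, dist, perm):
    """Classical QAP objective: facility i is placed at location perm[i]."""
    n = len(perm)
    return sum(
        flow[i][j] * dist[perm[i]][perm[j]]
        for i in range(n)
        for j in range(n)
    )


def random_search(flow, dist, trials=1000, seed=0):
    """Baseline only: sample uniform random permutations and keep the best.

    Assumption: uniform sampling stands in for the paper's (unspecified)
    probability distribution.
    """
    rng = random.Random(seed)
    n = len(flow)
    best_perm = list(range(n))
    best_cost = qap_cost(flow, dist, best_perm)
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        cost = qap_cost(flow, dist, perm)
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost
```

A randomized heuristic of the kind the abstract describes would replace the uniform `shuffle` step with draws from a problem-aware distribution; the cost function stays the same.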


Journal ArticleDOI
01 Jan 2011
TL;DR: This work presents a hardware implementation of an FIR filter that is self-adaptive; that responds to arbitrary frequency response landscapes; that has built-in coefficient error tolerance capabilities; and that has a minimal adaptation latency.
Abstract: This work presents a hardware implementation of an FIR filter that is self-adaptive; that responds to arbitrary frequency response landscapes; that has built-in coefficient error tolerance capabilities; and that has a minimal adaptation latency. This hardware design is based on a heuristic genetic algorithm. Experimental results show that the proposed design is more efficient than non-evolutionary designs even for arbitrary response filters. As a byproduct, the paper also presents a novel flow for the complete hardware design of what is termed an Evolutionary System on Chip (ESoC). With the inclusion of an evolutionary process, the ESoC is a new paradigm in modern System on Chip (SoC) designs. The ESoC methodology could be a very useful structured FPGA/ASIC implementation alternative in many practical applications of FIR filters.

2 citations


Journal ArticleDOI
TL;DR: The activities and practice in the area of MVR-based error concealment during the last two decades are elaborated here, and a performance comparison of the prominent MVR techniques is also presented.
Abstract: Error concealment in video communication is becoming increasingly important because of the growing interest in video delivery over unreliable channels such as wireless networks and the Internet. A subclass of this error concealment in video communication is known as motion vector recovery (MVR). MVR techniques try to retrieve the lost motion information in the compressed video streams based on the available information in the locality (both spatial and temporal) of the lost data. The activities and practice in the area of MVR-based error concealment during the last two decades are elaborated here. A performance comparison of the prominent MVR techniques is also presented.

1 citation


Book ChapterDOI
26 Sep 2011
TL;DR: It is argued that Hardware Transactional Memory (HTM) can be a suitable implementation choice for these systems and the knowledge about the workload is extremely useful to make appropriate design choices in the workload optimized HTM.
Abstract: Workload optimized systems consisting of a large number of general and special purpose cores, and with support for shared memory programming, are slowly becoming prevalent. One of the major impediments to effective parallel programming on these systems is lock-based synchronization. An alternate synchronization solution called Transactional Memory (TM) is currently being explored. We observe that most of the TM design proposals in the literature are catered to match the constraints of general purpose computing platforms. Given the fact that workload optimized systems utilize wider hardware design spaces and on-chip parallelism, we argue that Hardware Transactional Memory (HTM) can be a suitable implementation choice for these systems. We re-evaluate the criteria to be satisfied by an HTM and identify possible scope for relaxations in the context of workload optimized systems. Based on the relaxed criteria, we demonstrate the scope for building HTM design variants, such that each variant caters to a specific workload requirement. We carry out suitable experiments to bring out the trade-offs between the design variants. Overall, we show how the knowledge about the workload is extremely useful to make appropriate design choices in the workload optimized HTM.

1 citation


Journal ArticleDOI
TL;DR: An introductory material on TM is provided, followed by a background of important historical work on synchronization leading to current TM research, and a list of interesting open problems in this field is brought out.
Abstract: Transactional memory (TM) is being viewed by researchers as a suitable mechanism to perform shared-memory synchronization on upcoming many-core systems. This paper provides an introductory ...

Proceedings ArticleDOI
04 Jul 2011
TL;DR: It is found that a significant number of nets have high probabilities of being constant at 0 or 1, and it is shown how these signals can be used to put gates to sleep, thus saving significant leakage power.
Abstract: We consider the problem of reducing active mode leakage power by modifying the post-synthesis netlists of combinational logic blocks. The stacking effect is used to reduce leakage power, but instead of a separate signal, one of the inputs to the gate itself is used. The approach is studied on multiplier blocks. It is found that a significant number of nets have high probabilities of being constant at 0 or 1. In specific applications such as those having a high peak-to-average ratio, like audio and other signal processing applications, this effect is more pronounced. We show how these signals can be used to put gates to sleep, thus saving significant leakage power.
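
The observation that many multiplier nets are nearly constant can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the paper's analysis: it looks only at the product output bits of an unsigned multiplier (not internal nets), and it models low-amplitude inputs by drawing both operands uniformly from a small range. The function names, `operand_limit`, `width`, and `threshold` are all hypothetical.

```python
def product_bit_probabilities(operand_limit=16, width=16):
    """Probability that each product bit is 1 when both operands are
    uniform over range(operand_limit) (a stand-in for small-amplitude data)."""
    counts = [0] * width
    total = operand_limit * operand_limit
    for a in range(operand_limit):
        for b in range(operand_limit):
            product = a * b
            for k in range(width):
                counts[k] += (product >> k) & 1
    return [c / total for c in counts]


def near_constant_bits(probs, threshold=0.05):
    """Indices of bits whose probability of being 1 is within `threshold`
    of 0 or of 1, i.e. candidates for a sleep-control signal."""
    return [
        k for k, p in enumerate(probs)
        if p <= threshold or p >= 1 - threshold
    ]


if __name__ == "__main__":
    probs = product_bit_probabilities()
    print(near_constant_bits(probs))
```

With operands below 16 the product never reaches 256, so the upper eight bits of a 16-bit product are always 0 in this toy model; a real analysis would propagate signal probabilities through the synthesized netlist instead.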