Showing papers by "V. Kamakoti" published in 2006


Proceedings ArticleDOI
01 Aug 2006
TL;DR: This paper proposes a set of basic sequential elements that could be used for building large reversible sequential circuits leading to logic and garbage reduction by a factor of 2 to 6 when compared to existing reversible designs reported in the literature.
Abstract: Reversible logic has attracted growing interest in recent years because of its low heat dissipation. It has been proved that any Boolean function can be implemented using reversible gates. In this paper, we propose a set of basic sequential elements that can be used to build large reversible sequential circuits, reducing logic and garbage by a factor of 2 to 6 compared to existing reversible designs reported in the literature.
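
The abstract does not list the gates it uses, so the following is only a minimal sketch of the underlying idea, assuming the well-known Toffoli (CCNOT) gate as the reversible primitive. It checks reversibility by confirming that the truth table is a bijection, and shows how an irreversible AND can be realized reversibly at the cost of extra "garbage" outputs. The helper names `is_reversible` and `reversible_and` are hypothetical and do not come from the paper.

```python
from itertools import product


def toffoli(a, b, c):
    """Toffoli (CCNOT) gate: flips the target c only when both controls are 1."""
    return a, b, c ^ (a & b)


def is_reversible(gate, n_bits):
    """A gate is reversible when distinct inputs always give distinct outputs."""
    inputs = list(product((0, 1), repeat=n_bits))
    outputs = {gate(*bits) for bits in inputs}
    return len(outputs) == len(inputs)


def reversible_and(a, b):
    """AND built from a Toffoli gate with the target preset to 0.

    Returns the AND result plus the two pass-through outputs, which are
    unused here and therefore count as garbage outputs.
    """
    g1, g2, result = toffoli(a, b, 0)
    return result, (g1, g2)


if __name__ == "__main__":
    print(is_reversible(toffoli, 3))
    for a, b in product((0, 1), repeat=2):
        print(a, b, reversible_and(a, b))
```

Counting the garbage outputs of such building blocks is one way to compare reversible designs, which is the kind of metric the abstract refers to.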

75 citations


Proceedings ArticleDOI
03 Jan 2006
TL;DR: A new high-speed VLSI architecture for decoding Reed-Solomon codes with the Berlekamp-Massey algorithm is presented, which utilizes the folding property of systolic array architectures and reduces the number of multipliers and adders drastically at the expense of some compromise in the speed.
Abstract: In this paper, a new high-speed VLSI architecture for decoding Reed-Solomon codes with the Berlekamp-Massey algorithm is presented. The proposed scheme uses a fully folded systolic architecture in which a single array of processors computes both the error-locator and the error-evaluator polynomials. It exploits the folding property of systolic array architectures and drastically reduces the number of multipliers and adders, at the expense of some compromise in speed. Notably, the proposed architecture requires approximately 60% fewer multipliers and a simpler control structure than the popular RiBM architecture. The reduction in the number of multipliers and adders leads to smaller silicon area and lower power consumption.
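
The abstract does not spell out the Berlekamp-Massey recurrence itself. As a rough illustration only, the sketch below runs the algorithm over GF(2) (binary LFSR synthesis) rather than over the GF(2^m) field a Reed-Solomon decoder would use, and it models none of the folded systolic hardware. The function name `berlekamp_massey_gf2` is hypothetical.

```python
def berlekamp_massey_gf2(s):
    """Return (L, c) for the shortest LFSR that generates the bit list s.

    c is the connection polynomial [1, c1, ..., cL], meaning
    s[i] ^ c1*s[i-1] ^ ... ^ cL*s[i-L] == 0 for every i >= L.
    """
    n = len(s)
    c = [0] * (n + 1)  # current connection polynomial
    b = [0] * (n + 1)  # copy saved at the last length change
    c[0] = b[0] = 1
    L, m = 0, -1       # current LFSR length, index of last length change
    for i in range(n):
        # Discrepancy between the LFSR prediction and the actual bit.
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            shift = i - m
            # c(x) <- c(x) + x^shift * b(x), with addition in GF(2).
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L, c[:L + 1]


if __name__ == "__main__":
    # Bits generated by s[i] = s[i-2] ^ s[i-3], seeded with 1, 0, 0.
    print(berlekamp_massey_gf2([1, 0, 0, 1, 0, 1, 1, 1]))
```

In a Reed-Solomon decoder the same recurrence runs on syndromes with field multiplications in place of the bitwise AND, which is why multiplier count dominates the hardware cost that the abstract discusses.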

27 citations


Proceedings ArticleDOI
30 Apr 2006
TL;DR: A temporal redundancy based encoding technique is proposed for delay and peak power minimization; experimental results show that it is very effective in reducing peak power and that it reduces delay by at least 11% (4%) in the address (data) buses compared to data transmission without encoding.
Abstract: In this paper, we propose a novel temporal redundancy based encoding technique for delay and peak power minimization. The proposed encoding scheme is tested with the SPEC2000 CINT benchmarks for 90nm and 65nm technologies. The experimental results show that our approach is very effective in reducing the peak power. From the delay perspective, our approach reduces the delay by at least 11% (4%) in the address (data) buses compared to the data transmission without encoding.

13 citations


Journal ArticleDOI
TL;DR: A new temporal encoding scheme is proposed, which uses self-shielding memory-less codes to completely eliminate worst-case crosstalk effects and hence significantly reduces the power consumption and delay of the bus.
Abstract: Power consumption and delay are two of the most important constraints in current-day on-chip bus design. The two major sources of dynamic power dissipation on a bus are the self capacitance and the coupling capacitance. As technology scales, the interconnect resistance increases due to shrinking wire width. At the same time, the spacing between the interconnects decreases, resulting in an increase in the coupling capacitance. This, in turn, leads to stronger crosstalk effects between the interconnects. In deep submicron technology the coupling capacitance exceeds the self capacitance, which, in turn, causes more power consumption and delay on the bus. Recently, interest has also shifted to minimizing peak power dissipation, because higher peak power leads to an undesired increase in switching noise, metal electromigration problems and operation-induced variations due to non-uniform temperature on the die. Thus, minimizing power consumption and delay are the most important design objectives for on-chip buses. Several bus encoding schemes have been proposed in the literature for reducing crosstalk. Most of these encoding techniques use spatial redundancy, which requires additional transmission wires on the bus. In this paper, a new temporal encoding scheme is proposed, which uses self-shielding memory-less codes to completely eliminate worst-case crosstalk effects and hence significantly reduces the power consumption and delay of the bus. A major advantage of the proposed temporal redundancy based encoding scheme is the reduction in the number of wires of the on-chip bus. This reduction allows extra spacing between the bus wires, when compared with the normal bus, for a given area. This, in turn, leads to reduced crosstalk effects between the wires. The proposed encoding scheme is tested with the SPEC2000 CINT benchmarks. The experimental results, when compared to transmission over a normal bus, show that on average the proposed technique reduces the peak-power consumption by 51% (28%), 51% (29%) and 52% (30%) in the data (address) bus for 90nm, 65nm and 45nm technologies, respectively. For a bus length of 10mm, the proposed technique also achieves 17%, 31% and 37% reduction in the bus delay for 90nm, 65nm and 45nm technologies, respectively, when compared to the delay incurred by data transmission on a normal bus.
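
The abstract does not give the code construction, so the sketch below does not implement the proposed encoding. It only illustrates, under a common assumption from the crosstalk literature, what a "worst-case" transition looks like: a wire that switches in the opposite direction to both of its neighbours (for example 010 to 101). The function names `wire_deltas` and `has_worst_case_crosstalk` are hypothetical.

```python
def wire_deltas(prev_word, next_word, width):
    """Per-wire transition: +1 rising, -1 falling, 0 unchanged (bit 0 first)."""
    return [((next_word >> i) & 1) - ((prev_word >> i) & 1) for i in range(width)]


def has_worst_case_crosstalk(prev_word, next_word, width):
    """True if some wire switches opposite to BOTH of its neighbours.

    Assumption: this is the usual definition of the worst-case coupling
    pattern; the paper may classify transitions differently.
    """
    d = wire_deltas(prev_word, next_word, width)
    return any(
        d[i] != 0 and d[i - 1] == -d[i] and d[i + 1] == -d[i]
        for i in range(1, width - 1)
    )


if __name__ == "__main__":
    print(has_worst_case_crosstalk(0b010, 0b101, 3))  # opposing neighbours
    print(has_worst_case_crosstalk(0b000, 0b111, 3))  # all wires rise together
```

A checker of this kind could be used to verify that a candidate codebook never produces such a transition between consecutive bus words.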

5 citations



Proceedings ArticleDOI
03 Jan 2006
TL;DR: A function-generation based area-aware configurable logic block (CLB) architecture and an associated packing technique for SRAM-based FPGAs are proposed; the packing technique is shown to produce designs with almost the same routing cost and performance overhead as those produced by the T-VPack algorithm on standard benchmark circuits.
Abstract: This paper proposes a function-generation based area-aware configurable logic block (CLB) architecture and an associated packing technique for SRAM-based FPGAs. The new CLB architecture provides the same logic functionality, but occupies 38% less area, consumes 38.31% less power and requires 50% fewer configuration bits per CLB when compared to the standard 4-LUT CLB architecture. The proposed packing technique is timing-driven and is shown to produce designs with almost the same routing cost and performance overhead as those produced by the T-VPack algorithm on standard benchmark circuits.
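
For context on the baseline the abstract compares against, a k-input lookup table stores one configuration bit per input combination, so a standard 4-LUT needs 16 SRAM bits. The sketch below models only that conventional LUT, not the proposed function-generation based CLB, whose internals the abstract does not describe. The helper name `make_lut` is hypothetical.

```python
def make_lut(config_bits):
    """Build a k-input LUT from 2**k configuration bits (a truth table)."""
    n = len(config_bits)
    if n == 0 or n & (n - 1):
        raise ValueError("number of configuration bits must be a power of two")
    k = n.bit_length() - 1  # number of LUT inputs

    def lut(*inputs):
        if len(inputs) != k:
            raise ValueError(f"expected {k} inputs")
        # The inputs form the address of the configuration bit to read.
        index = 0
        for position, bit in enumerate(inputs):
            index |= (bit & 1) << position
        return config_bits[index]

    return lut


# A 4-input AND: only address 15 (all inputs high) stores a 1.
and4 = make_lut([0] * 15 + [1])
# A 4-input XOR (odd parity): bit i is the parity of the address i.
xor4 = make_lut([bin(i).count("1") & 1 for i in range(16)])

if __name__ == "__main__":
    print(and4(1, 1, 1, 1), and4(1, 0, 1, 1))
    print(xor4(1, 0, 0, 0), xor4(1, 1, 0, 0))
```

Because every function of four inputs is just a different 16-bit table, any saving in configuration bits has to come from restricting or restructuring how functions are generated, which is the trade-off the abstract reports.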