Showing papers by "V. Kamakoti" published in 2005


Proceedings Article
03 Jan 2005
TL;DR: A new CLB architecture for FPGAs and an associated testing technique that detects routing errors caused by SEUs in the SRAM configuration memory of the FPGA are proposed; the time required for error detection is independent of both the number of switch matrices and the number of logic blocks in the FPGA.
Abstract: This paper proposes a new CLB architecture for FPGAs and an associated testing technique that detects routing errors caused by SEUs in the SRAM configuration memory of the FPGA. The proposed testing technique detects all possible routing errors including bridging faults, and requires a single configuration of only the LUTs of the FPGA. Any routing error that affects the logic of the circuit is detected by the proposed technique in a maximum of 8 clock cycles. It is noteworthy that the time required for error detection is independent of both the number of switch matrices and the number of logic blocks in the FPGA.

27 citations


Proceedings Article
03 Jan 2005
TL;DR: An efficient placement and routing algorithm is proposed for 3D-FPGAs which yields better results in terms of total interconnect length and channel-width and is implemented and tested on standard benchmark circuits and the results obtained are encouraging.
Abstract: The primary advantage of using a 3D-FPGA over a 2D-FPGA is that the vertical stacking of active layers reduces the Manhattan distance between components compared with their placement on a 2D-FPGA. This results in a considerable reduction in total interconnect length. Reduced wire length in turn leads to a reduction in delay and hence improved performance and speed. The design of an efficient placement and routing algorithm for 3D-FPGAs that fully exploits this advantage is a problem of deep research and commercial interest. In this paper, an efficient placement and routing algorithm is proposed for 3D-FPGAs which yields better results in terms of total interconnect length and channel width. The proposed algorithm employs two important techniques, namely, reinforcement learning (RL) and support vector machines (SVMs), to perform the placement. The proposed algorithm is implemented and tested on standard benchmark circuits and the results obtained are encouraging. This is one of the very few instances where reinforcement learning is used for solving a problem in the area of VLSI.
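The wire-length argument can be illustrated with a toy calculation. The sketch below is not the paper's placement algorithm; it only compares the average pairwise Manhattan distance of 64 blocks on a single 8x8 layer with the same 64 blocks arranged as four stacked 4x4 layers. The grid sizes, the function name `average_manhattan`, and the assumption that one vertical hop costs the same as one horizontal hop are all illustrative choices, not taken from the abstract.

```python
from itertools import combinations, product


def average_manhattan(dims):
    """Average pairwise Manhattan distance over a full grid of the given dimensions."""
    sites = list(product(*(range(d) for d in dims)))
    pairs = list(combinations(sites, 2))
    total = sum(sum(abs(a - b) for a, b in zip(p, q)) for p, q in pairs)
    return total / len(pairs)


flat = average_manhattan((8, 8))        # 64 blocks on one layer
stacked = average_manhattan((4, 4, 4))  # 64 blocks on four stacked layers
print(f"2D 8x8:   {flat:.3f}")
print(f"3D 4x4x4: {stacked:.3f}")
```

Under these assumptions the average distance drops from about 5.33 grid units to about 3.81. Real inter-layer vias have a different cost than horizontal wires, so the actual saving depends on the technology.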

15 citations


Proceedings Article
03 Jan 2005
TL;DR: A template which captures the commonalities among the different random testing tools and enables the user to quickly design a random test generator by adding product-specific details and using most of the methods available in the template is proposed.
Abstract: This paper presents a universal random test generator template for the design verification of microprocessors and system-on-chips (SOCs). The tool enables verification of the product in one continuous, integrated environment, from C model to behavioral RTL and gate to system-level integration, all in one self-contained chassis. Due to the complexity of large designs, it has been common practice to rely on the power of randomization to expose corner cases that are not humanly conceivable but can arise in reality. Random tools used for testing products with diverse functionalities share many common features. This paper proposes a template which captures the commonalities among the different random testing tools and enables the user to quickly design a random test generator by adding product-specific details and using most of the methods available in the template. This leads to a high degree of code reuse, less debugging of the random tool and a huge reduction in design-cycle time. In addition, the template provides enough flexibility and interfaces to enable the execution of the generated tests on targets which may be a C model, RTL or the final chip. In this way, one may test a software component, say boot-up code for the system-on-chip or microprocessor, at all stages of its design, namely, the software prototype, the RTL at the pre-silicon level and finally the chip at the post-silicon level. This satisfies the expectations of a verification platform for a hardware-software codesign environment. The random test generator template was employed for testing an x86-compatible microprocessor at both the RTL and post-silicon stages, and a software model of an 802.11 MAC. The results are presented in the paper.
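The idea of a shared template plus product-specific details resembles the template-method pattern. The following is a minimal sketch of that pattern, not the tool described in the paper: the class names `RandomTestGenerator` and `ToyAluGenerator`, the toy ALU operations, and the `reference_model` target are all hypothetical and exist only to show where common and product-specific code would sit.

```python
import random
from abc import ABC, abstractmethod


class RandomTestGenerator(ABC):
    """Hypothetical template: shared machinery lives here, product details in subclasses."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # seeded so a failing test can be replayed

    @abstractmethod
    def random_operation(self):
        """Return one product-specific random operation."""

    def generate(self, length):
        """Common method: build a test as a list of random operations."""
        return [self.random_operation() for _ in range(length)]

    def run(self, test, targets):
        """Common method: execute one test on every target and collect the results."""
        return {name: execute(test) for name, execute in targets.items()}


class ToyAluGenerator(RandomTestGenerator):
    """Product-specific part for an imaginary two-operand 8-bit ALU."""

    def random_operation(self):
        op = self.rng.choice(["add", "sub", "xor"])
        return (op, self.rng.randrange(256), self.rng.randrange(256))


def reference_model(test):
    """Stand-in for a C model; an RTL simulator or chip harness would share this interface."""
    ops = {"add": lambda a, b: (a + b) & 0xFF,
           "sub": lambda a, b: (a - b) & 0xFF,
           "xor": lambda a, b: a ^ b}
    return [ops[op](a, b) for op, a, b in test]


gen = ToyAluGenerator(seed=2005)
test = gen.generate(5)
results = gen.run(test, {"c_model": reference_model})
print(test)
print(results)
```

Because every target is just a callable that takes a test, the same generated test could be passed to several targets and the result lists compared against each other.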

12 citations


Proceedings Article
03 Jan 2005
TL;DR: This paper presents a new PCNN-based face recognition system that can tolerate local variations in the face such as expression changes and directional lighting and an optimal digital hardware design is proposed for PCNN.
Abstract: Principal component analysis (PCA) finds wide usage in computer-aided vision applications and one such application is face recognition. The neural network that performs PCA is called a principal component neural network (PCNN). This paper presents a new PCNN-based face recognition system. The proposed recognition system can tolerate local variations in the face such as expression changes and directional lighting. An optimal digital hardware design is proposed for PCNN. An ASIC implementation of the proposed design yields a throughput of about 11,000 inputs per second during the training phase and about 19,000 inputs per second during the retrieval phase. The customized hardware-based recognition is about 10^5 times faster than a software-based recognition on a PC. Such results are valuable for high-speed applications.
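The abstract does not say which learning rule the PCNN uses. As one well-known example of a neural network that performs PCA, the sketch below trains a single linear neuron with Oja's rule on synthetic two-dimensional data. The choice of Oja's rule, the function name `oja_first_component`, the learning rate, and the data are assumptions made for illustration only.

```python
import math
import random


def oja_first_component(samples, learning_rate=0.01, epochs=5):
    """Train one linear neuron with Oja's rule; its weights approach the first principal direction."""
    w = [1.0] + [0.0] * (len(samples[0]) - 1)  # arbitrary non-zero start
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))        # neuron output
            w = [wi + learning_rate * y * (xi - y * wi)     # Hebbian term with decay
                 for wi, xi in zip(w, x)]
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]


# Synthetic zero-mean data: most of the variance lies along the direction (1, 1).
rng = random.Random(0)
data = []
for _ in range(2000):
    t, n = rng.gauss(0.0, 1.0), rng.gauss(0.0, 0.1)
    data.append((t + n, t - n))

w = oja_first_component(data)
alignment = abs(w[0] + w[1]) / math.sqrt(2.0)  # |cosine| with (1, 1)/sqrt(2)
print(w, alignment)
```

The update needs only multiply-accumulate operations per input, which is the kind of regular arithmetic that maps naturally onto dedicated hardware.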

11 citations


Proceedings Article
04 Apr 2005
TL;DR: A new CLB architecture for FPGAs and associated online testing and reconfiguration techniques that detect configuration upsets in the LUTs of SRAM-based FPGAs and correct them using partial reconfiguration are proposed.
Abstract: This paper proposes a new CLB architecture for FPGAs and associated online testing and reconfiguration techniques that detect configuration upsets in the LUTs of SRAM-based FPGAs and correct them using partial reconfiguration. These configuration upsets may either be single event upsets (SEUs) or even multiple configuration upsets. Any error in a CLB is detected with a latency of just 16 clock cycles and the errors are diagnosed by propagating them to a single output port by a chain-like shift register. The proposed CLB architectures require only 2 additional SRAM configuration bits per LUT for a Xilinx Virtex II architecture. This is extremely low when compared to the 16 additional SRAM configuration bits required by CLB architectures used to implement standard DWC techniques for detecting configuration upsets in LUTs.
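For context on the baseline, duplication with comparison (DWC) keeps a second copy of each LUT and flags an error when the two outputs disagree; for a 4-input LUT the copy is presumably where the 16 additional configuration bits come from. The sketch below models only that baseline, not the proposed architecture. The function names and the example truth table are hypothetical.

```python
def lut_output(config_bits, inputs):
    """Read a 4-input LUT: the inputs form the address into the 16 configuration bits."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return (config_bits >> address) & 1


def dwc_mismatch(primary, duplicate, inputs):
    """Duplication with comparison: flag an error when the two copies disagree."""
    return lut_output(primary, inputs) != lut_output(duplicate, inputs)


golden = 0x6996              # truth table of a 4-input XOR
upset = golden ^ (1 << 5)    # a configuration upset flips bit 5 of the primary copy

print(dwc_mismatch(upset, golden, (1, 0, 1, 0)))  # address 5 is read: mismatch
print(dwc_mismatch(upset, golden, (0, 0, 0, 0)))  # address 0 is read: copies agree
```

In this simple model the upset is noticed only when the corrupted address is actually exercised by the inputs.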

10 citations


Proceedings Article
07 Mar 2005
TL;DR: In this article, the authors report on the general design of an algorithm-agile FPGA-based co-processor and its proof-of-concept implementation.
Abstract: With the growing computational needs of many real-world applications, frequently changing specifications of standards, and the high design and NRE costs of ASICs, an algorithm-agile FPGA-based co-processor has become a viable alternative. In this article, we report on the general design of an algorithm-agile co-processor and the proof-of-concept implementation.

8 citations


Proceedings Article
18 Jan 2005
TL;DR: A new configurable logic block (CLB) architecture containing a single LUT that stores the truth table of a Boolean function F and is capable of generating three split-equivalent functions of F is proposed.
Abstract: The main objective of the technique presented in this paper is to exploit the relations between a set of Boolean functions so as to generate one function from another. The paper defines a relation termed split-equivalence between logical functions. Using this relation, a single look-up table (LUT) storing the truth table of a function F may be used to generate other functions that are split-equivalent to F, resulting in an overall reduction in the logic area used to map the circuit on the FPGA. This paper proposes a new configurable logic block (CLB) architecture containing a single LUT that stores the truth table of a Boolean function F and is capable of generating three split-equivalent functions of F. Given a set of Boolean functions to be mapped onto LUTs, the proposed technique identifies sets of four functions such that any three of them are split-equivalent to the fourth. These sets are mapped onto the proposed CLB architecture. The proposed CLB architecture was compared with the standard CLBs available on the Xilinx Virtex architecture and it was found that the former occupies 26% less area than the latter, with a small increase in the SRAM configuration bits required to configure a CLB.

6 citations


Proceedings Article
18 Dec 2005
TL;DR: This paper presents an Automatic Assembly Program Generator (A^2 PG), that handles the design at the behavioral RTL level and is based on function-oriented test generation schemes, hence making it scalable and usable for some specific tasks.
Abstract: Pre-silicon functional design verification, performance measurements and post-silicon functional testing of processor cores consume the major portion of time and cost investment in any concept-to-silicon design flow. Most of the tools reported in the literature are based on function/fault-independent test generation schemes which cannot be effectively employed for verification or testing of specific functional behavior or for generating inputs for performance measurement of a specific parameter or functional unit in the design. In addition, the crucial bottleneck with existing tools is their scalability with larger designs. It is well studied and reported in the literature that for a tool to be scalable with larger designs, it is important to handle the design at higher levels of abstraction, typically, at the RTL level. In this paper, we present an Automatic Assembly Program Generator (A^2 PG) that handles the design at the behavioral RTL level and is based on function-oriented test generation schemes, hence making it scalable and usable for some specific tasks as mentioned above.

3 citations


Proceedings Article
18 Jan 2005
TL;DR: A cluster-based parity-checking technique that can detect 100% of all single event upset (SEU) faults in the LUTs of SRAM-based FPGAs is proposed and two different configurable logic block (CLB) architectures that could be used to implement the proposed SEU detection technique are described.
Abstract: This paper proposes a cluster-based parity-checking technique that can detect 100% of all Single Event Upset (SEU) faults in the LUTs of SRAM-based FPGAs. The paper describes two different Configurable Logic Block (CLB) architectures that could be used to implement the proposed SEU detection technique. Of the two, the first architecture can perform at-speed testing of the LUTs without interrupting the normal functioning of the FPGA. The second one works by switching the CLBs from normal-mode to testing-mode and vice-versa. The LUTs are tested in the testing-mode. The switching frequency can be externally programmed and hence varied depending on the rate of SEU occurrences. Both the proposed architectures were compared with the Xilinx Virtex and Virtex Pro architecture. The proposed architectures require only 2 (when compared with Virtex) and 4 (when compared with Virtex Pro) additional SRAM configuration bits per LUT. This is extremely low when compared to the 16 additional SRAM configuration bits required by CLB architectures used to implement standard DWC techniques for detecting SEUs in LUTs. The area requirements of both the proposed architectures are also significantly less than the area requirements of DWC techniques. The proposed detection technique requires only 3 clock cycles of the Xilinx Virtex internal clock to detect the effect of an SEU in any LUT of the FPGA.
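Why parity catches every single-bit upset can be shown in a few lines. The sketch below uses one parity bit over a 16-bit LUT; it is a generic parity check and not the cluster-based scheme of the paper, whose details the abstract does not give. The function names and the example truth table are hypothetical.

```python
def parity(word):
    """Parity of an integer's bits: 1 if the number of set bits is odd, else 0."""
    return bin(word).count("1") & 1


def seu_detected(lut_bits, stored_parity):
    """A single flipped bit always changes the parity, so the comparison fails."""
    return parity(lut_bits) != stored_parity


golden = 0x8000            # truth table of a 4-input AND
stored = parity(golden)    # computed once at configuration time
flips = [golden ^ (1 << i) for i in range(16)]

print(all(seu_detected(f, stored) for f in flips))  # True: every single-bit upset is caught
print(seu_detected(golden, stored))                 # False: no upset, no alarm
```

A plain parity bit misses an even number of flips in the same word, so it targets single upsets only.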

2 citations


Proceedings Article
18 Jan 2005
TL;DR: This paper proposes a new reconfigurable system which has a function generator-based CLB architecture, different from the standard look-up table (LUT) based CLB architectures available in commercial FPGAs.
Abstract: This paper proposes a new reconfigurable system which has a function generator-based CLB architecture. This is different from the standard look-up table (LUT) based CLB architectures available in commercial FPGAs. The new function generation architecture is based on the fact that a small set of k-input Boolean functions can generate all the 2^(2^k) k-input Boolean functions using a simple mapping technique. The area required by the new function generation architecture is 58.6% less than the area required by a standard 16×1 LUT used in commercial FPGAs. In addition, the proposed architecture consumes 40.8% less power than the standard 16×1 LUT. The routing architecture for the proposed reconfigurable system is the same as those present in current-day FPGAs. Hence, the algorithms presently used for technology mapping, packing, placement and routing on FPGAs can be used for the proposed reconfigurable system without much modification. The new architecture requires a 10% increase in the SRAM configuration memory. This is an insignificant penalty in comparison to the reduction in the area of the FPGA and power consumption achieved by the proposed CLB architecture.
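The count of k-input Boolean functions follows from the size of the truth table: there are 2^k input combinations, and each can map to 0 or 1, giving 2^(2^k) functions. A 16×1 LUT corresponds to k = 4. The sketch below only illustrates this counting and how a function packs into LUT contents; it does not model the paper's function generator, and the helper names are hypothetical.

```python
from itertools import product


def count_boolean_functions(k):
    """Number of distinct k-input Boolean functions: one per truth table of 2**k rows."""
    return 2 ** (2 ** k)


def truth_table(func, k):
    """Pack a k-input function into the integer a (2**k)x1 LUT would store.

    Rows are enumerated with the last input as the least significant address bit.
    """
    table = 0
    for address, inputs in enumerate(product((0, 1), repeat=k)):
        table |= func(*inputs) << address
    return table


for k in range(1, 5):
    print(k, count_boolean_functions(k))  # 4, 16, 256, 65536

and4 = truth_table(lambda a, b, c, d: a & b & c & d, 4)
print(hex(and4))  # 0x8000: only the all-ones row outputs 1
```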

2 citations


Journal Article
TL;DR: Novel pseudo-online built-in self-test based techniques for detecting and locating multiple faults in lookup tables (LUTs), interconnects and dedicated clock lines of field programmable gate arrays (FPGAs).

Proceedings Article
20 Feb 2005
TL;DR: A new CLB architecture for FPGAs and associated testing and reconfiguration techniques that detect single routing/interconnect errors and correct them using partial reconfiguration are proposed; the time required for error detection is independent of both the number of switch matrices and the number of logic blocks in the FPGA.
Abstract: This paper proposes a new CLB architecture for FPGAs and associated testing and reconfiguration techniques that detect single routing/interconnect errors and correct them using partial reconfiguration. The results of error detection are propagated to a single output port by a chain-like shift register and are used to reduce the segment of the routing architecture that has to be reconfigured. The error is corrected by partially reconfiguring this minimal segment alone, thereby reducing the time for reconfiguration. The proposed testing technique detects all possible routing errors that affect the logic of the circuit, including bridging faults. It is noteworthy that the time required for error detection is independent of both the number of switch matrices and the number of logic blocks in the FPGA. Empirically, our technique detected all single interconnect errors in benchmark circuits. In addition, for the majority of errors, our correction technique required less than 10% of the switch matrices to be reconfigured to correct the errors.
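How a chain-like shift register can bring many detection results to one output port can be sketched as follows. This is a behavioral toy model under assumed conventions (one error flag per CLB, one bit shifted per clock, the last stage driving the port), not the circuit from the paper. The function names `shift_out` and `faulty_clbs` are hypothetical.

```python
def shift_out(flags):
    """Shift per-CLB error flags out of a chain, one bit per clock, through one output port."""
    chain = list(flags)
    stream = []
    for _ in range(len(chain)):
        stream.append(chain[-1])    # the last stage drives the output port
        chain = [0] + chain[:-1]    # every stage takes its predecessor's bit
    return stream


def faulty_clbs(stream):
    """Map positions in the serial stream back to CLB indices."""
    n = len(stream)
    return [n - 1 - cycle for cycle, bit in enumerate(stream) if bit]


flags = [0, 0, 1, 0, 0, 0, 1, 0]    # CLBs 2 and 6 reported an error
stream = shift_out(flags)
print(stream)                        # [0, 1, 0, 0, 0, 1, 0, 0]
print(faulty_clbs(stream))           # [6, 2]
```

Knowing which positions raised a flag is what would allow reconfiguration to be limited to the affected segment rather than the whole device.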