
Showing papers by "V. Kamakoti" published in 2007


Proceedings ArticleDOI
06 May 2007
TL;DR: The authors show that glitching activity on nodes must be considered in order to correctly handle constraints on instantaneous peak power and include a power profiler that can analyze a pattern source for violations and a PODEM-based pattern generation engine for generating power-safe patterns.
Abstract: Excessive dynamic voltage drop in the power supply rails during test mode is known to result in false failures and impact yield when testing devices that use low-cost wire-bond packages. Identifying and debugging such test failures is a complex and effort-intensive process, especially when scan compression is involved. From a design cycle-time viewpoint, it is best to avoid this problem by generating "power-safe" scan patterns. The generation of power-safe patterns must take into consideration the DFT architecture, physical design, timing and power constraints. In this paper, the authors propose such a framework and show experimental results on some benchmark circuits. The framework can address a non-uniform power grid and region-based power constraints. The authors show that glitching activity on nodes must be considered in order to correctly handle constraints on instantaneous peak power. The framework includes a power profiler that can analyze a pattern source for violations and a PODEM-based pattern generation engine for generating power-safe patterns.

42 citations


Proceedings ArticleDOI
01 Oct 2007
TL;DR: A power-managed scan (PMScan) scheme is proposed that exploits the presence of adaptive voltage scaling logic to reduce test power and can be used as a vehicle to trade off test application time against test power by suitably adjusting the scan shift frequency and scan-mode power supplies.
Abstract: In sub-70 nm technologies, leakage power becomes a significant component of the total power. Designers address this concern by extensive use of adaptive voltage scaling techniques to reduce dynamic as well as leakage power. Low-power scan test schemes that have evolved in the past primarily address dynamic power reduction, and are less effective in reducing the total power. We propose a power-managed scan (PMScan) scheme which exploits the presence of adaptive voltage scaling logic to reduce test power. We also discuss some practical implementation challenges that arise when the proposed scheme is employed on industrial designs. Experimental results on benchmark circuits and industrial designs show a significant reduction in dynamic and leakage power. The proposed method can also be used as a vehicle to trade off test application time against test power by suitably adjusting the scan shift frequency and scan-mode power supplies.

28 citations


Proceedings ArticleDOI
01 Oct 2007
TL;DR: It is argued that false delay test failures can be avoided by generating "safe" patterns that are tolerant to on-chip variations; the proposed statistical framework uses process variation information, power grid topology and regional constraints on switching activity.
Abstract: Process variation is an increasingly dominant phenomenon affecting both power and performance in sub-100 nm technologies. Cost considerations often do not permit over-designing the power supply infrastructure for test mode, considering the worst-case scenario. Test application must not over-exercise the power supply grids, lest the tests will damage the device or lead to false test failures. The problem of debugging a delay test failure can therefore be highly complex. We argue that false delay test failures can be avoided by generating "safe" patterns that are tolerant to on-chip variations. A statistical framework for power-safe pattern generation is proposed, which uses process variation information, power grid topology and regional constraints on switching activity. Experimental results are provided on benchmark circuits to demonstrate the effectiveness of the framework.

27 citations


Proceedings ArticleDOI
06 May 2007
TL;DR: The proposed technique converts the given behavioral model automatically to an integer (word-level) constraint model and employs an integer constraint solver to generate the required power virus vectors.
Abstract: The problem of peak power estimation in CMOS circuits is essential for analyzing the reliability and performance of circuits at extreme conditions. The dynamic power dissipated is directly proportional to the switching activity (the number of gate outputs that toggle, i.e. change state) in the circuit. The power virus problem involves finding input vectors that cause maximum dynamic power dissipation (maximum toggles) in circuits. As the power virus problem is NP-complete, the gate-level techniques are less scalable with increasing design size and produce less optimal vectors. In this paper, an approach for power virus generation using behavioral models of digital circuits is presented. The proposed technique converts the given behavioral model automatically to an integer (word-level) constraint model and employs an integer constraint solver to generate the required power virus vectors. Experiments with the proposed technique on ISCAS behavioral-level benchmark circuits and the standard DLX processor model show that the above technique is fast and yields higher-quality results than the known gate-level techniques. Interestingly, the paper attempts to generate an assembly program that causes the maximum dynamic power dissipation on the given DLX processor model. To the best of our knowledge, the proposed technique is the first reported that considers power virus generation using behavioral-level models.

27 citations
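
As a rough illustration of the toggle-count definition in the abstract above, the following sketch counts how many gate outputs change state between two input vectors and finds the worst-case pair by exhaustive search. It is not the paper's behavioral-level constraint-solver method: the three-gate netlist, the names `NETLIST`, `simulate`, `toggles` and `brute_force_power_virus`, and the zero-delay evaluation (which ignores glitches) are all assumptions made for this example.

```python
from itertools import product

# Hypothetical toy netlist: (output, gate type, inputs), in topological order.
NETLIST = [
    ("n1", "AND", ("a", "b")),
    ("n2", "OR", ("b", "c")),
    ("n3", "XOR", ("n1", "n2")),
]
INPUTS = ("a", "b", "c")

OPS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
}


def simulate(vector):
    """Zero-delay evaluation: return the settled value of every gate output."""
    values = dict(zip(INPUTS, vector))
    for out, op, (x, y) in NETLIST:
        values[out] = OPS[op](values[x], values[y])
    return {out: values[out] for out, _, _ in NETLIST}


def toggles(v1, v2):
    """Number of gate outputs whose settled value differs between v1 and v2."""
    before, after = simulate(v1), simulate(v2)
    return sum(before[g] != after[g] for g in before)


def brute_force_power_virus():
    """Exhaustive search over all vector pairs; feasible only for tiny circuits."""
    vectors = list(product((0, 1), repeat=len(INPUTS)))
    pairs = ((v1, v2) for v1 in vectors for v2 in vectors)
    return max(pairs, key=lambda pair: toggles(*pair))
```

The exhaustive search grows as 4^n in the number of inputs n, which is one way to see why scalable techniques are needed for realistic designs.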


Proceedings ArticleDOI
16 Apr 2007
TL;DR: A timing-based, power- and layout-aware pattern generation technique is proposed that minimizes both global and localized switching activity, together with techniques for power-profiling and optimizing an initial pattern set to obtain a power-safe pattern set.
Abstract: With increasing use of low cost wire-bond packages for mobile devices, excessive dynamic IR-drop may cause tests to fail on the tester. Identifying and debugging such scan test failures is a very complex and effort-intensive process. A better solution is to generate correct-by-construction "power-safe" patterns. Moreover, with glitch power contributing to a significant component of dynamic power, pattern generation needs to be timing-aware to minimize glitching. In this paper, we propose a timing-based, power and layout-aware pattern generation technique that minimizes both global and localized switching activity. Techniques are also proposed for power-profiling and optimizing an initial pattern set to obtain a power-safe pattern set, with the addition of minimal patterns. The proposed technique also comprehends irregular power grid topologies for constraints on localized switching activity. Experiments on ISCAS benchmark circuits reveal the effectiveness of the proposed scheme.

26 citations


Journal ArticleDOI
01 Feb 2007
TL;DR: The design of a parallel architecture for on-line face recognition using weighted modular principal component analysis (WMPCA) and its system-on-programmable-chip (SoPC) implementation are discussed and an architecture that exploits this parallelism is presented.
Abstract: In this paper, the design of a parallel architecture for on-line face recognition using weighted modular principal component analysis (WMPCA) and its system-on-programmable-chip (SoPC) implementation are discussed. The WMPCA methodology, proposed by us earlier, is based on the assumption that the rates of variation of the different regions of a face are different due to variations in expression and illumination. Given a database of sample faces for training and a query face for recognizing, the WMPCA methodology involves division of the face into horizontal regions. Each of these regions is analyzed independently by computing the eigenfeatures and comparing the same with the corresponding eigenfeatures of the faces stored in the sample database to calculate the corresponding error. The final decision of the face recognizer is based on the weighted sum of the errors computed from each of the regions. These weights are calculated based on the extent to which the various samples of the subject are spread in the eigenspace. The WMPCA methodology has a better recognition rate compared to the modular PCA approach developed by Rajkiran and Vijayan [Rajkiran, G., Vijayan, K., 2004. An improved face recognition technique based on modular PCA approach. Pattern Recognition Letters, 25(4), 429-436]. The methodology also has a wide scope for parallelism. We present an architecture that exploits this parallelism and implement the same as a system-on-programmable-chip on an ALTERA-based field programmable gate array (FPGA) platform. The implementation has achieved a processing speed of about 26 frames per second at an operating frequency of 33.33 MHz.

25 citations
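
The decision step described in the abstract above (a weighted sum of per-region errors) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the per-subject weight layout, the sample numbers, and the rule that the subject with the lowest weighted error is selected are all assumptions. Computing the eigenfeatures and the weights themselves is outside the scope of this sketch.

```python
def weighted_error(region_errors, weights):
    """Weighted sum of the per-region matching errors for one candidate subject."""
    if len(region_errors) != len(weights):
        raise ValueError("need exactly one weight per face region")
    return sum(w * e for w, e in zip(weights, region_errors))


def recognize(errors_by_subject, weights_by_subject):
    """Return the best-matching subject and all scores.

    Assumption: the subject with the smallest weighted error wins.
    """
    scores = {
        subject: weighted_error(errors, weights_by_subject[subject])
        for subject, errors in errors_by_subject.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical errors for three horizontal face regions per subject.
errors = {"alice": [0.2, 0.9, 0.1], "bob": [0.5, 0.4, 0.6]}
# Hypothetical weights; in WMPCA these would come from the eigenspace spread.
weights = {"alice": [0.5, 0.1, 0.4], "bob": [0.3, 0.3, 0.4]}

best, scores = recognize(errors, weights)
```

Because each region's error is computed independently before the final sum, the per-region work can run in parallel, which is the property the architecture in the abstract exploits.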


Proceedings ArticleDOI
16 Apr 2007
TL;DR: A timing-based, power and layout-aware pattern generation technique that minimizes both global and localized switching activity and comprehends irregular power grid topologies for constraints on localized switching activity is proposed.
Abstract: With increasing use of low cost wire-bond packages for mobile devices, excessive dynamic IR-drop may cause tests to fail on the tester. Identifying and debugging such scan test failures is a very complex and effort-intensive process. A better solution is to generate correct-by-construction "power-safe" patterns. Moreover, with glitch power contributing to a significant component of dynamic power, pattern generation needs to be timing-aware to minimize glitching. In this paper, we propose a timing-based, power and layout-aware pattern generation technique that minimizes both global and localized switching activity. Techniques are also proposed for power-profiling and optimizing an initial pattern set to obtain a power-safe pattern set, with the addition of minimal patterns. The proposed technique also comprehends irregular power grid topologies for constraints on localized switching activity. Experiments on ISCAS benchmark circuits reveal the effectiveness of the proposed scheme.

22 citations


Proceedings ArticleDOI
06 Jan 2007
TL;DR: This paper takes up the challenge of reducing the overhead of daisy mode in divide-and-conquer testing by a careful analysis of the interactions between partitions, and introduces additional test modes to increase the coverage of glue logic while making sure that the number of scan cells involved in these "intermediate daisy modes" is minimal.
Abstract: A hierarchical or "divide-and-conquer" scan test methodology enables us to partition a large SoC into several partitions and perform design-for-testability (DFT) functions such as scan insertion, pattern generation, and pattern validation separately on individual partitions. Since the effort for DFT-related tasks grows super-linearly with gate count, partitioning reduces the effort for DFT tasks. Further, test application can be divided into k + 1 modes, where k modes correspond to independent testing of the partitions and the (k + 1)th mode corresponds to a "residual" (or daisy) mode where faults that are not covered by the individual modes are considered. In reality, however, the daisy mode can be a killer and wipe out the benefits of divide-and-conquer testing. This is especially true for partitions that do not have test wrappers. In this paper, we take up the challenge of reducing the overhead of daisy mode in divide-and-conquer testing. By a careful analysis of the interactions between partitions, additional test modes are introduced to increase the coverage of glue logic, at the same time making sure that the number of scan cells involved in these "intermediate daisy modes" is minimal. We refer to this version of hierarchical scan testing as "quiet and optimized divide-and-conquer scan". Experimental results reveal that the proposed technique reduces the test time overhead of the conventional daisy mode by about 20 times. In addition, the technique drastically reduces the switching activity in the daisy modes and hence reduces the test power.

17 citations


Proceedings ArticleDOI
06 Jan 2007
TL;DR: An approach for power virus generation for both combinational and sequential circuits is presented and the basic intuition behind the approach is to use the 0- and 1-controllability measures of the gate outputs in the circuit to guide the D-algorithm.
Abstract: The problem of peak power estimation in CMOS circuits is essential for analyzing the reliability and performance of circuits at extreme conditions. The power virus problem involves finding input vectors that cause maximum dynamic power dissipation (maximum toggles) in circuits. In this paper, an approach for power virus generation for both combinational and sequential circuits is presented. The basic intuition behind the approach is to use the 0- and 1-controllability measures of the gate outputs in the circuit to guide the D-algorithm. The proposed technique was employed on the ISCAS'85 and ISCAS'89 circuits. The results of the above show a significant increase in power dissipation when compared to the best known existing techniques reported in the literature.

9 citations


Journal ArticleDOI
01 Jan 2007
TL;DR: An artificial neural network (ANN) based parallel evolutionary solution to the placement and routing problems for field programmable gate arrays (FPGAs) is presented, and the results obtained are extremely encouraging, especially for circuits with a very large number of nets.
Abstract: This paper presents an artificial neural network (ANN) based parallel evolutionary solution to the placement and routing problems for field programmable gate arrays (FPGAs). The concepts of artificial neural networks are utilized for guiding the parallel genetic algorithm to intelligently transform a set of initial populations of randomly generated solutions to a final set of populations that contain solutions approximating the optimal one. The fundamental concept of this paper lies in capturing the various intuitive strategies of the human brain into neural networks, which may help the genetic algorithm to evolve its population in a more lucrative manner. A carefully chosen fitness function acts in the capacity of a yardstick to appraise the quality of each "chromosome" to aid the selection phase. In conjunction with the migration phase and the chosen fitness function, various genetic operators are employed to expedite the transformation of the initial population towards the final solution. The suggested algorithms have been implemented on a 12-node SGI Origin-2000 platform using the message passing interface (MPI) standard and the neural network utilities provided by MATLAB software. The results obtained by executing the same are extremely encouraging, especially for circuits with a very large number of nets.

8 citations


Journal ArticleDOI
TL;DR: By generating safe patterns - those that tolerate on-chip variations - this framework avoids false delay test failures and uses power grid information and regional constraints on switching activity to minimize peak power and optimize the pattern set.
Abstract: By generating safe patterns - those that tolerate on-chip variations - this framework avoids false delay test failures. It uses power grid information and regional constraints on switching activity to minimize peak power and optimize the pattern set. Experimental results on benchmark circuits demonstrate the framework's effectiveness.

Journal ArticleDOI
TL;DR: Results of FPGA implementation of the design for a principal component neural network show that as many as 500 input vectors can be processed during the training phase and 700 input vectors during the retrieval phase in a second, which is valuable for high-speed applications.
Abstract: Principal Component Analysis (PCA) finds wide applications in machine vision. The neural network that performs PCA is called a Principal Component Neural Network (PCNN). This paper presents a digital hardware design for a principal component neural network. The design is efficient in the sense that the learning rule is implemented with a reusable circuit. Results of FPGA implementation of the design show that as many as 500 input vectors can be processed during the training phase and 700 input vectors during the retrieval phase in a second. Such results are valuable for high-speed applications.

Posted Content
TL;DR: The general design of an algorithm-agile FPGA-based co-processor and its proof-of-concept implementation are reported.
Abstract: With growing computational needs of many real-world applications, frequently changing specifications of standards, and the high design and NRE costs of ASICs, an algorithm-agile FPGA-based co-processor has become a viable alternative. In this article, we report on the general design of an algorithm-agile co-processor and the proof-of-concept implementation.

Proceedings ArticleDOI
01 Dec 2007
TL;DR: This work provides genetic methods to directly optimize truth table inputs using transistor level simplification to eliminate the intermediate gate level optimization step and provides optimized transistor netlists which could be used for dynamic library cell generation for custom and semi-custom designs on the fly.
Abstract: In this work, we propose a novel technique for evolving transistor netlists from truth table descriptions of arbitrary digital circuits. The proposed methods incorporate the effective use of genetic algorithms (GAs). In typical semi-custom and custom design flows, logic optimization is done at the gate level after Boolean translation of the input truth table. The final transistor netlist is then deduced from the simplified gate logic to be laid out on a chip. However, transistor-level optimizations after the Boolean simplification step would still not lead to the minimum number of transistors. This final optimization level is non-existent in present custom design flows. This work aims to address this need. A salient feature of the proposed technique is the bypassing of gate-level representation and optimization in the VLSI design flow. We provide genetic methods to directly optimize truth table inputs using transistor-level simplification. This eliminates the intermediate gate-level optimization step and provides optimized transistor netlists which could be used for dynamic library cell generation for custom and semi-custom designs on the fly.