
Showing papers by "Kapil Ahuja" published in 2020


Journal ArticleDOI
TL;DR: This paper proposes a novel and scalable technique, with two different modes, for quantizing the parameters of pre-trained neural networks. The technique represents parameters as powers of 2, thereby eliminating the need for resource- and computation-intensive multiplier units in hardware accelerators for neural networks.
Abstract: Deep neural networks are machine learning techniques that are increasingly used in a variety of applications. However, their significantly high memory and computation demands often limit their deployment on embedded systems. Many recent works have addressed this problem by proposing different types of data quantization schemes. However, most of these techniques either require post-quantization retraining of deep neural networks or bear a significant loss in output accuracy. In this paper, we propose a novel and scalable technique with two different modes for the quantization of the parameters of pre-trained neural networks. In the first mode, referred to as log_2_lead, we use a single template for the quantization of all parameters. In the second mode, denoted as ALigN, we analyze the trained parameters of each layer and adaptively adjust the quantization template to achieve even higher accuracy. Our technique largely preserves the accuracy of the parameters and does not require retraining of the networks. Moreover, it supports quantization to an arbitrary bit-size. For example, compared to an implementation based on single-precision floating-point numbers, our proposed 8-bit quantization technique incurs only $\sim 0.2\%$ and $\sim 0.1\%$ loss in the Top-1 and Top-5 accuracies, respectively, for the VGG-16 network on the ImageNet dataset. We have observed similarly minimal losses in the Top-1 and Top-5 accuracies for AlexNet and ResNet-18 using the proposed quantization scheme for the 8-bit range. Our proposed quantization technique also provides a higher mean intersection over union for semantic segmentation when compared with state-of-the-art quantization techniques. The proposed technique represents parameters as powers of 2, thereby eliminating the need for resource- and computation-intensive multiplier units in hardware accelerators for neural networks. We also present a design for implementing the multiplication operation using bit-shifts and additions for the proposed quantization technique.
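To make the power-of-2 idea concrete, below is a minimal, illustrative Python sketch of rounding weights to the nearest signed power of two and replacing multiplication by a bit-shift. It is not the paper's log_2_lead or ALigN scheme (which keeps additional leading-bit information and adapts the template per layer); the function names and the fixed-point example are ours.

    import numpy as np

    def quantize_pow2(weights):
        # Round each weight to the nearest signed power of two (illustration only;
        # the paper's templates retain more information than a single exponent).
        signs = np.sign(weights)
        mags = np.abs(weights)
        exps = np.round(np.log2(np.where(mags > 0, mags, 1e-12))).astype(int)
        return signs, exps

    def shift_multiply(activation_fixed, sign, exp):
        # Multiply a fixed-point activation by a power-of-two weight using a shift:
        # positive exponents shift left, negative exponents shift right.
        shifted = activation_fixed << exp if exp >= 0 else activation_fixed >> (-exp)
        return int(sign) * shifted

    # Example: the weight 0.23 is rounded to 2^-2, so activation 48 becomes 48 >> 2 = 12.
    s, e = quantize_pow2(np.array([0.23]))
    print(shift_multiply(48, s[0], e[0]))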

9 citations


Journal ArticleDOI
TL;DR: This work proposes the use of a block variant of the problem-dependent underlying iterative method, together with a technique to cheaply update the SPAI preconditioner, while solving parametrically changing linear systems.
Abstract: The main computational cost of algorithms for computing reduced-order models of parametric dynamical systems is in solving sequences of very large and sparse linear systems of equations, which are ...
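For background (standard SPAI theory, not this paper's specific update technique): a sparse approximate inverse preconditioner $M \approx A^{-1}$ with a prescribed sparsity pattern $\mathcal{S}$ is obtained from a Frobenius-norm minimization that decouples into one small least-squares problem per column,

$$\min_{M \in \mathcal{S}} \|AM - I\|_F^2 \;=\; \sum_{k=1}^{n} \min_{m_k} \|A m_k - e_k\|_2^2,$$

where $m_k$ is the $k$-th column of $M$ (restricted to the pattern) and $e_k$ is the $k$-th unit vector. Because these column problems are small and independent, they can be warm-started or selectively recomputed when the system matrix changes mildly with the parameter, which is what makes cheap SPAI updates attractive in this setting; the paper's exact update rule is not reproduced here.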

6 citations


Proceedings ArticleDOI
09 Mar 2020
TL;DR: This paper proposes a novel quantization technique for the parameters of pre-trained deep neural networks that largely preserves the accuracy of the parameters and does not require retraining of the networks.
Abstract: Deep neural networks are machine learning techniques that are increasingly used in a variety of applications. However, their significantly high memory and computation demands often limit their deployment on embedded systems. Many recent works have addressed this problem by proposing different types of data quantization schemes. However, most of these techniques either require post-quantization retraining of deep neural networks or bear a significant loss in output accuracy. In this paper, we propose a novel quantization technique for the parameters of pre-trained deep neural networks. Our technique largely preserves the accuracy of the parameters and does not require retraining of the networks. Compared to an implementation based on single-precision floating-point numbers, our proposed 8-bit quantization technique incurs only ~1% and ~0.4% loss in the Top-1 and Top-5 accuracies, respectively, for the VGG-16 network on the ImageNet dataset.

5 citations


Journal ArticleDOI
TL;DR: In this article, the impact of link formation between a pair of agents on the resource availability of other agents (that is, externalities) in a social cloud network, a special case of endogenously formed networks, is investigated.
Abstract: This paper investigates the impact of link formation between a pair of agents on the resource availability of other agents (that is, externalities) in a social cloud network, a special case of endo...
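In the standard formalization of such externalities (our notation, not necessarily the paper's): if $u_k(g)$ denotes the utility of agent $k$ in network $g$, then the formation of a link $ij$ imposes a positive externality on an agent $k \notin \{i, j\}$ if $u_k(g + ij) > u_k(g)$, and a negative externality if $u_k(g + ij) < u_k(g)$.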

4 citations


Journal ArticleDOI
TL;DR: This article shows that reusing preconditioners is an art, via detailed algorithmic implementations in multiple model order reduction (MOR) algorithms, and demonstrates that reusing preconditioners while reducing a real-life industrial problem (of size 1.2 million) leads to relative savings of up to 64% in the total computation time (in absolute terms, a saving of 5 days).
Abstract: Dynamical systems are pervasive in almost all engineering and scientific applications. Simulating such systems is computationally very intensive. Hence, Model Order Reduction (MOR) is used to reduce them to a lower dimension. Most MOR algorithms require solving sequences of large sparse linear systems. Since using direct methods for solving such systems does not scale well in time with respect to the increase in the input dimension, efficient preconditioned iterative methods are commonly used. In one of our previous works, we have shown substantial improvements by reusing preconditioners for parametric MOR (Singh et al. 2019). There, we had proposed techniques for both the non-parametric and the parametric cases, but had applied them only to the latter. We have three main contributions here. First, we demonstrate that preconditioners can be reused more effectively in the non-parametric case than in the parametric one. Second, we show that reusing preconditioners is an art, via detailed algorithmic implementations in multiple MOR algorithms. Third and finally, we demonstrate that reusing preconditioners for reducing a real-life industrial problem (of size 1.2 million) leads to relative savings of up to 64% in the total computation time (in absolute terms, a saving of 5 days).
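Below is a minimal Python sketch of the kind of reuse being described, assuming SciPy: one incomplete-LU preconditioner is built for the first system in a sequence of shifted sparse systems (as arise in MOR) and then reused for all later systems. It only illustrates the general idea of reuse; the matrices, the ILU choice, and the solver settings are placeholders, not the paper's algorithmic strategy.

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 2000
    A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
    E = sp.identity(n, format="csc")
    b = np.ones(n)
    shifts = [1.0, 1.1, 1.2, 1.3]   # e.g. expansion points in a MOR method

    # Build one preconditioner at the first shift and reuse it for every system.
    ilu = spla.spilu((A + shifts[0] * E).tocsc())
    M = spla.LinearOperator((n, n), matvec=ilu.solve)

    for s in shifts:
        x, info = spla.gmres(A + s * E, b, M=M)
        print(f"shift {s}: GMRES exit flag = {info}")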

3 citations


Journal ArticleDOI
TL;DR: The concept of bilateral stability is proposed, which refines the pairwise stability concept defined by Jackson and Wolinsky by requiring mutual consent for both the addition and the deletion of links, rather than mutual consent for link addition only.
Abstract: Social storage systems are a good alternative to existing data backup systems, namely local, centralized, and P2P backup. To date, researchers have mostly focussed either on building such systems on top of existing underlying social networks (exogenously built) or on studying quality-of-service related issues. In this paper, we look at two untouched aspects of social storage systems. The first aspect involves modelling social storage as an endogenous social network, where agents themselves decide with whom they want to build a data backup relation, which is more intuitive than exogenous social networks. The second aspect involves studying the stability of social storage systems, which would help reduce maintenance costs and, further, help build efficient as well as contented networks. We have a four-fold contribution that covers these two aspects. First, we model the social storage system as a strategic network formation game. We define the utility of each agent in the network under two different frameworks: one where the cost to add and maintain links is considered in the utility function, and the other where budget constraints are considered. In the context of social storage and social cloud computing, these utility functions are the first of their kind, and we use them to define and analyse the social storage network game. Second, we propose the concept of bilateral stability, which refines the pairwise stability concept defined by Jackson and Wolinsky (J Econ Theory 71(1):44–74, 1996) by requiring mutual consent for both the addition and the deletion of links, rather than mutual consent for link addition only. Mutual consent for link deletion is especially important in the social storage setting. The notion of bilateral stability subsumes the bilateral equilibrium definition of Goyal and Vega-Redondo (J Econ Theory 137(1):460–492, 2007). Third, we prove necessary and sufficient conditions for the bilateral stability of social storage networks. For symmetric social storage networks, we prove that there exists a unique neighborhood size, independent of the number of agents (for all non-trivial cases), at which no pair of agents has any incentive to increase or decrease their neighborhood size. We call this neighborhood size the stability point. Fourth, given the number of agents and other parameters, we discuss which bilaterally stable networks would evolve and also which of these stable networks are efficient, that is, stable networks with the maximum sum of utilities of all agents. We also discuss ways to build contented networks, where each agent achieves the maximum possible utility.
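One way to formalize the mutual-consent requirement described above (our reading; the paper's precise definition may differ in details): a network $g$ with utilities $u_i$ is bilaterally stable if no pair of agents would jointly agree to add or to delete a link between them, where joint agreement means both weakly gain and at least one gains strictly. In symbols,

$$\begin{aligned} &\forall\, ij \in g:\quad \neg\bigl[\, u_i(g - ij) \ge u_i(g) \ \text{and}\ u_j(g - ij) \ge u_j(g), \ \text{with at least one inequality strict} \,\bigr],\\ &\forall\, ij \notin g:\quad \neg\bigl[\, u_i(g + ij) \ge u_i(g) \ \text{and}\ u_j(g + ij) \ge u_j(g), \ \text{with at least one inequality strict} \,\bigr]. \end{aligned}$$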

2 citations


Posted Content
TL;DR: The Spectral Clustering (SC) algorithm with Pivotal Sampling achieves substantially higher accuracy than all the other proposed competitive clustering-with-sampling algorithms (e.g., SC with VQ), and it outperforms the standard HC algorithm in both accuracy and computational complexity.
Abstract: Clustering genotypes based upon their phenotypic characteristics is used to obtain diverse sets of parents that are useful in breeding programs. The Hierarchical Clustering (HC) algorithm is the current standard for clustering phenotypic data. This algorithm suffers from low accuracy and high computational complexity. To address the accuracy challenge, we propose the use of the Spectral Clustering (SC) algorithm. To make the algorithm computationally cheap, we propose using sampling, specifically Pivotal Sampling, which is probability based. Since the application of sampling to phenotypic data has not been explored much, for effective comparison, another sampling technique called Vector Quantization (VQ) is adapted to this data as well. VQ has recently given promising results for genome data. The novelty of our SC with Pivotal Sampling algorithm lies in constructing the crucial similarity matrix for the clustering algorithm and in defining the probabilities for the sampling technique. Although our algorithm can be applied to any plant genotypes, we test it on phenotypic data obtained from about 2400 Soybean genotypes. SC with Pivotal Sampling achieves substantially higher accuracy (in terms of Silhouette Values) than all the other proposed competitive clustering-with-sampling algorithms (i.e., SC with VQ, HC with Pivotal Sampling, and HC with VQ). The complexities of our SC with Pivotal Sampling algorithm and these three variants are almost the same because of the sampling involved. In addition, SC with Pivotal Sampling outperforms the standard HC algorithm in both accuracy and computational complexity. We experimentally show that we are up to 45% more accurate than HC in terms of clustering accuracy, and the computational complexity of our algorithm is more than an order of magnitude lower than that of HC.
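As a minimal, illustrative Python sketch of the overall pipeline (sample first, then spectrally cluster the sample), assuming scikit-learn: the toy feature matrix, the equal inclusion probabilities, and the RBF similarity below are placeholders, and a simple probability-weighted subsample stands in for Pivotal Sampling; the paper's similarity-matrix construction and data-driven probabilities are not reproduced here.

    import numpy as np
    from sklearn.cluster import SpectralClustering

    rng = np.random.default_rng(0)
    X = rng.normal(size=(2400, 16))        # toy stand-in for phenotypic traits

    # Probability-weighted subsample (a stand-in for Pivotal Sampling): here every
    # genotype gets an equal inclusion probability, whereas the paper derives the
    # probabilities from the data itself.
    m = 300
    probs = np.full(len(X), 1.0 / len(X))
    idx = rng.choice(len(X), size=m, replace=False, p=probs)
    X_sample = X[idx]

    # Spectral clustering of the sample on a Gaussian (RBF) similarity matrix.
    labels = SpectralClustering(n_clusters=8, affinity="rbf", gamma=0.5,
                                assign_labels="kmeans", random_state=0).fit_predict(X_sample)
    print(np.bincount(labels))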

1 citation


Posted Content
TL;DR: It is demonstrated that preconditioners can be reused more effectively in the non-parametric case than in the parametric one, and it is shown that reusing preconditioners is an art, via detailed algorithmic implementations in multiple MOR algorithms.
Abstract: Dynamical systems are pervasive in almost all engineering and scientific applications. Simulating such systems is computationally very intensive. Hence, Model Order Reduction (MOR) is used to reduce them to a lower dimension. Most MOR algorithms require solving sequences of large sparse linear systems. Since using direct methods for solving such systems does not scale well in time with respect to the increase in the input dimension, efficient preconditioned iterative methods are commonly used. In one of our previous works, we have shown substantial improvements by reusing preconditioners for parametric MOR (Singh et al. 2019). There, we had proposed techniques for both the non-parametric and the parametric cases, but had applied them only to the latter. We have four main contributions here. First, we demonstrate that preconditioners can be reused more effectively in the non-parametric case than in the parametric one because of the lack of parameters in the former. Second, we show that reusing preconditioners is an art and needs to be fine-tuned for the underlying MOR algorithm. Third, we describe the pitfalls in the algorithmic implementation of reusing preconditioners. Fourth, and finally, we demonstrate this theory on a real-life industrial problem (of size 1.2 million), where savings of up to 64% in the total computation time are obtained by reusing preconditioners. In absolute terms, this leads to a saving of 5 days.

1 citation


Posted Content
TL;DR: A set of novel Lagrange heuristics that improve the Lagrange relaxation process is introduced; compared with ParaLarPD, these heuristics lead to a halving of the constraint violations, up to a 10% improvement in the minimum channel width, and up to an 8% reduction in the critical path delay.
Abstract: Routing of the nets in the Field Programmable Gate Array (FPGA) design flow is one of its most time-consuming steps. Although Versatile Place and Route (VPR), a commonly used algorithm for this purpose, routes effectively, it is slow in execution. One way to accelerate this design flow is parallelization. Since VPR is intrinsically sequential, a set of parallel algorithms has recently been proposed for this purpose (ParaLaR and ParaLarPD). These algorithms formulate the routing process as a Linear Program (LP) and solve it using Lagrange relaxation, the sub-gradient method, and a Steiner tree algorithm. Among the many metrics available to check the effectiveness of routing, ParaLarPD, an improved version of ParaLaR, suffers from large violations of the LP constraints (which relate to the minimum channel width metric), and its critical path delay, an easily measurable metric, can be improved further. In this paper, we introduce a set of novel Lagrange heuristics that improve the Lagrange relaxation process. When tested on the MCNC benchmark circuits, this leads, on average, to a halving of the constraint violations, up to a 10% improvement in the minimum channel width, and up to an 8% reduction in the critical path delay as compared with ParaLarPD. We term our new algorithm ParaLarH. Due to the increased work in the Lagrange relaxation process compared to ParaLarPD, ParaLarH slightly reduces the speedup obtained from parallelization; however, this is easily compensated for by using a larger number of threads.
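For context, the generic projected sub-gradient update used in Lagrangian relaxation of such constrained LPs looks as follows (our simplified notation; ParaLarH's routing-specific heuristics modify how the multipliers and step sizes evolve and are not reproduced here). For an LP $\min_x c^\top x$ subject to $Ax \le b$, the constraints are moved into the objective with multipliers $\lambda \ge 0$, and after the relaxed subproblem is solved (in ParaLaR-style algorithms, by routing each net with a Steiner tree algorithm) the multipliers are updated by

$$\lambda^{(k+1)} = \max\!\bigl(0,\ \lambda^{(k)} + \alpha_k\,\bigl(A x^{(k)} - b\bigr)\bigr),$$

where $x^{(k)}$ minimizes the relaxed objective $c^\top x + (\lambda^{(k)})^\top (Ax - b)$ and $\alpha_k$ is the step size; the positive entries of $A x^{(k)} - b$ correspond to constraint violations of the kind reported above.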

1 citation