Author

Sajid Anwar

Bio: Sajid Anwar is an academic researcher from the Information Technology Institute. The author has contributed to research on the topics of software systems and deep learning, has an h-index of 16, and has co-authored 67 publications receiving 1,862 citations. Previous affiliations of Sajid Anwar include the Ghulam Ishaq Khan Institute of Engineering Sciences and Technology and Seoul National University.


Papers
Journal ArticleDOI
TL;DR: The proposed work shows that when pruning granularities are applied in combination, the CIFAR-10 network can be pruned by more than 70% with less than a 1% loss in accuracy.
Abstract: Real-time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique to solve this problem. However, pruning usually results in irregular network connections that not only demand extra representation efforts but also do not fit well on parallel computation. We introduce structured sparsity at various scales for convolutional neural networks: feature map-wise, kernel-wise, and intra-kernel strided sparsity. This structured sparsity is very advantageous for direct computational resource savings on embedded computers, in parallel computing environments, and in hardware-based systems. To decide the importance of network connections and paths, the proposed method uses a particle filtering approach. The importance weight of each particle is assigned by assessing the misclassification rate with a corresponding connectivity pattern. The pruned network is retrained to compensate for the losses due to pruning. While implementing convolutions as matrix products, we particularly show that intra-kernel strided sparsity with a simple constraint can significantly reduce the size of the kernel and feature map tensors. The proposed work shows that when pruning granularities are applied in combination, we can prune the CIFAR-10 network by more than 70% with less than a 1% loss in accuracy.
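As a concrete illustration, the kernel-wise granularity amounts to zeroing whole 2-D kernels of a convolutional weight tensor. Below is a minimal numpy sketch that ranks kernels with a simple L1-magnitude criterion; the paper itself selects connectivity patterns with a particle filter, and the layer shape and keep ratio here are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical conv layer: 8 output feature maps, 4 input channels, 3x3 kernels.
W = rng.normal(size=(8, 4, 3, 3))

def prune_kernel_wise(W, keep_ratio=0.3):
    """Zero out entire 2-D kernels, keeping those with the largest L1 norms.

    One of the paper's granularities, but with a plain magnitude heuristic
    standing in for its particle-filter importance estimation.
    """
    norms = np.abs(W).sum(axis=(2, 3))           # (out, in) L1 norm per kernel
    k = max(1, int(norms.size * keep_ratio))     # number of kernels to keep
    thresh = np.sort(norms, axis=None)[-k]       # k-th largest norm
    mask = (norms >= thresh)[:, :, None, None]   # broadcast over kernel dims
    return W * mask, mask

W_pruned, mask = prune_kernel_wise(W)
print("kernels kept:", int(mask.any(axis=(2, 3)).sum()), "of", W.shape[0] * W.shape[1])
```

Because entire kernels (or feature maps, or regular intra-kernel strides) are removed, the surviving computation stays dense and regular, which is what makes the sparsity usable on parallel and embedded hardware.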

476 citations

Posted Content
TL;DR: In this article, the importance weight of each particle is assigned by computing the misclassification rate with the corresponding connectivity pattern, and the pruned network is retrained to compensate for the losses due to pruning.
Abstract: Real-time application of deep learning algorithms is often hindered by high computational complexity and frequent memory accesses. Network pruning is a promising technique to solve this problem. However, pruning usually results in irregular network connections that not only demand extra representation efforts but also do not fit well on parallel computation. We introduce structured sparsity at various scales for convolutional neural networks: channel-wise, kernel-wise, and intra-kernel strided sparsity. This structured sparsity is very advantageous for direct computational resource savings on embedded computers, in parallel computing environments, and in hardware-based systems. To decide the importance of network connections and paths, the proposed method uses a particle filtering approach. The importance weight of each particle is assigned by computing the misclassification rate with the corresponding connectivity pattern. The pruned network is retrained to compensate for the losses due to pruning. While implementing convolutions as matrix products, we particularly show that intra-kernel strided sparsity with a simple constraint can significantly reduce the size of the kernel and feature map matrices. The pruned network is finally fixed-point optimized with reduced word-length precision. This results in a significant reduction in total storage size, providing advantages for on-chip-memory-based implementations of deep neural networks.
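The closing fixed-point step is essentially uniform rounding at a reduced word length. A minimal sketch follows, assuming an 8-bit word with 6 fractional bits; the paper tunes such lengths per layer rather than fixing them globally.

```python
import numpy as np

def to_fixed_point(w, word_length=8, frac_bits=6):
    """Round weights to signed fixed-point values.

    word_length total bits, frac_bits of them fractional; both are
    illustrative defaults, not the paper's per-layer settings.
    """
    step = 2.0 ** -frac_bits
    lo = -(2 ** (word_length - 1))        # most negative integer code
    hi = 2 ** (word_length - 1) - 1       # most positive integer code
    return np.clip(np.round(w / step), lo, hi) * step

w = np.random.default_rng(1).normal(scale=0.1, size=1000)
print("max abs rounding error:", np.abs(w - to_fixed_point(w)).max())
```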

454 citations

Proceedings ArticleDOI
19 Apr 2015
TL;DR: The results indicate that quantization induces sparsity in the network, which reduces the effective number of network parameters and improves generalization; the method reduces the required memory storage to about one tenth and achieves better classification results than the high-precision networks.
Abstract: Deep convolutional neural networks have shown promising results in image and speech recognition applications. The learning capability of the network improves with increasing depth and size of each layer. However, this capability comes at the cost of increased computational complexity. Thus, reduced hardware complexity and faster classification are highly desired. This work proposes an optimization method for fixed-point deep convolutional neural networks. The parameters of a pre-trained high-precision network are first directly quantized using L2 error minimization. We quantize each layer one by one, while the other layers keep computing at high precision, to determine each layer's sensitivity to word-length reduction. Then the network is retrained with quantized weights. Two examples on object recognition, MNIST and CIFAR-10, are presented. Our results indicate that quantization induces sparsity in the network, which reduces the effective number of network parameters and improves generalization. This work reduces the required memory storage to about one tenth and achieves better classification results than the high-precision networks.
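The direct quantization step can be pictured as choosing the uniform step size that minimizes the L2 error between the original and quantized weights. The grid-search sketch below is an illustration under assumed settings (4-bit symmetric quantizer, linear candidate grid), not the paper's exact procedure.

```python
import numpy as np

def quantize(w, step, bits):
    """Symmetric uniform quantizer with the given step size."""
    half = 2 ** (bits - 1) - 1                   # e.g. 7 levels each side for 4 bits
    return np.clip(np.round(w / step), -half, half) * step

def best_step(w, bits=4, n_candidates=200):
    """Grid-search the step size minimizing the L2 quantization error."""
    cands = np.linspace(1e-4, np.abs(w).max(), n_candidates)
    errs = [np.sum((w - quantize(w, s, bits)) ** 2) for s in cands]
    return cands[int(np.argmin(errs))]

w = np.random.default_rng(2).normal(scale=0.05, size=4096)
print("chosen step size:", best_step(w))
```

Repeating this layer by layer, while the other layers stay at high precision, exposes how sensitive each layer is to word-length reduction.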

255 citations

Journal ArticleDOI
TL;DR: The empirical results demonstrate that MTDF and genetic-algorithm-based rule generation delivered the best overall predictive performance among the evaluated oversampling methods and rule-generation algorithms.
Abstract: Customer retention is a major issue for various service-based organizations, particularly the telecom industry, where predictive models for observing customer behavior are one of the key instruments in the customer retention process and in inferring customers' future behavior. However, the performance of predictive models is greatly affected when the real-world data set is highly imbalanced. A data set is called imbalanced if the number of samples in one class is much smaller or larger than in the other classes. Over/under-sampling is the most commonly used technique for handling the class-imbalance problem (CIP) in various domains. In this paper, we survey six well-known sampling techniques and compare their performance: the mega-trend diffusion function (MTDF), the synthetic minority oversampling technique, the adaptive synthetic sampling approach, couples top-N reverse k-nearest neighbor, the majority weighted minority oversampling technique, and the immune centroids oversampling technique. Moreover, this paper also presents an evaluation of four rule-generation algorithms (the learning from example module, version 2 (LEM2), covering, exhaustive, and genetic algorithms) using publicly available data sets. The empirical results demonstrate that MTDF and genetic-algorithm-based rule generation delivered the best overall predictive performance among the evaluated oversampling methods and rule-generation algorithms.
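Among the surveyed methods, the synthetic minority oversampling technique is the simplest to sketch: each synthetic sample is an interpolation between a minority sample and one of its k nearest minority-class neighbors. The numpy sketch below uses invented data and is not any of the benchmarked implementations.

```python
import numpy as np

def smote_like(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples, SMOTE-style.

    A bare-bones illustration; real implementations handle edge
    cases (tiny classes, categorical features) that this omits.
    """
    rng = rng or np.random.default_rng()
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # a point is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]            # k nearest neighbors per sample
    base = rng.integers(0, len(X_min), n_new)    # seed sample for each new point
    nbr = nn[base, rng.integers(0, k, n_new)]    # one random neighbor each
    lam = rng.random((n_new, 1))                 # interpolation factor in [0, 1)
    return X_min[base] + lam * (X_min[nbr] - X_min[base])

X_min = np.random.default_rng(3).normal(size=(20, 2))
print(smote_like(X_min, n_new=40).shape)         # (40, 2)
```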

198 citations

Journal ArticleDOI
TL;DR: This study proposes an intelligent rule-based decision-making technique based on rough set theory (RST) to extract important decision rules related to customer churn and non-churn, and shows that RST with genetic algorithms (GA) is the most efficient technique for extracting implicit knowledge, in the form of decision rules, from a publicly available benchmark telecom dataset.
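At a high level, such extracted rules are if-then conditions over customer attributes, judged by how many records they cover (support) and how often the covered records actually churn (confidence). The toy example below uses invented attributes, thresholds, and records purely for illustration; none of it comes from the study.

```python
# Hypothetical churn records; fields and values are invented.
records = [
    {"day_minutes": 310, "service_calls": 5, "churn": True},
    {"day_minutes": 120, "service_calls": 1, "churn": False},
    {"day_minutes": 290, "service_calls": 4, "churn": True},
    {"day_minutes": 150, "service_calls": 0, "churn": False},
]

def rule(r):
    # IF heavy daytime use AND many service calls THEN churn
    return r["day_minutes"] > 250 and r["service_calls"] >= 4

covered = [r for r in records if rule(r)]
support = len(covered) / len(records)
confidence = sum(r["churn"] for r in covered) / max(len(covered), 1)
print(f"support={support:.2f} confidence={confidence:.2f}")
```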

155 citations


Cited by

Proceedings Article
15 Feb 2016
TL;DR: This paper proposes "deep compression", a three-stage pipeline of pruning, trained quantization, and Huffman coding that reduces the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
Abstract: Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we introduce "deep compression", a three-stage pipeline: pruning, trained quantization, and Huffman coding, which work together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy. Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce weight sharing; finally, we apply Huffman coding. After the first two steps we retrain the network to fine-tune the remaining connections and the quantized centroids. Pruning reduces the number of connections by 9x to 13x; quantization then reduces the number of bits that represent each connection from 32 to 5. On the ImageNet dataset, our method reduced the storage required by AlexNet by 35x, from 240MB to 6.9MB, without loss of accuracy. Our method reduced the size of VGG-16 by 49x, from 552MB to 11.3MB, again with no loss of accuracy. This allows fitting the model into on-chip SRAM cache rather than off-chip DRAM memory. Our compression method also facilitates the use of complex neural networks in mobile applications where application size and download bandwidth are constrained. Benchmarked on CPU, GPU, and mobile GPU, the compressed network has a 3x to 4x layer-wise speedup and 3x to 7x better energy efficiency.
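The first two stages are straightforward to sketch: magnitude pruning followed by k-means weight sharing over the surviving weights, with 32 centroids corresponding to the 5-bit codes mentioned above. The sketch below assumes an illustrative pruning ratio and a plain Lloyd iteration, and omits the retraining and Huffman-coding stages.

```python
import numpy as np

rng = np.random.default_rng(4)
W = rng.normal(size=1024)

# Stage 1: prune small-magnitude connections (keep ratio is illustrative).
mask = np.abs(W) > np.quantile(np.abs(W), 0.9)
w_kept = W[mask]

# Stage 2: weight sharing via k-means over the surviving weights.
k = 32                                           # 32 centroids ~ 5-bit codes
centroids = np.linspace(w_kept.min(), w_kept.max(), k)   # linear init
for _ in range(20):                              # Lloyd iterations
    assign = np.argmin(np.abs(w_kept[:, None] - centroids[None, :]), axis=1)
    for j in range(k):
        if np.any(assign == j):
            centroids[j] = w_kept[assign == j].mean()

W_shared = np.zeros_like(W)
W_shared[mask] = centroids[assign]
print("unique shared weights:", len(np.unique(W_shared[mask])))
# Stage 3 (not shown): Huffman-code the centroid indices and sparse structure.
```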

7,256 citations

Book ChapterDOI
08 Oct 2016
TL;DR: The Binary-Weight-Network version of AlexNet is compared with recent network binarization methods, BinaryConnect and BinaryNets, and outperforms them by large margins on ImageNet, more than 16% in top-1 accuracy.
Abstract: We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks. In Binary-Weight-Networks, the filters are approximated with binary values, resulting in 32x memory savings. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This results in 58x faster convolutional operations (in terms of the number of high-precision operations) and 32x memory savings. XNOR-Nets offer the possibility of running state-of-the-art networks on CPUs (rather than GPUs) in real time. Our binary networks are simple, accurate, efficient, and work on challenging visual tasks. We evaluate our approach on the ImageNet classification task. The classification accuracy with a Binary-Weight-Network version of AlexNet is the same as the full-precision AlexNet. We compare our method with recent network binarization methods, BinaryConnect and BinaryNets, and outperform these methods by large margins on ImageNet, more than 16% in top-1 accuracy. Our code is available at: http://allenai.org/plato/xnornet.
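The Binary-Weight-Network approximation has a simple closed form: each filter W is replaced by alpha * sign(W), where alpha is the mean absolute value of that filter's weights. A minimal numpy sketch with an illustrative tensor shape:

```python
import numpy as np

def binarize_filters(W):
    """Approximate each filter as alpha * sign(W), alpha = mean(|W|).

    The per-filter scaling follows the paper's closed-form solution;
    the tensor shape below is illustrative.
    """
    alpha = np.abs(W).reshape(W.shape[0], -1).mean(axis=1)
    alpha = alpha.reshape(-1, 1, 1, 1)           # one scale per output filter
    B = np.sign(W)
    B[B == 0] = 1.0                              # map sign(0) to +1
    return alpha, B

W = np.random.default_rng(5).normal(size=(16, 8, 3, 3))
alpha, B = binarize_filters(W)
err = np.linalg.norm(W - alpha * B) / np.linalg.norm(W)
print(f"relative approximation error: {err:.3f}")
```

At inference time, convolving with alpha * B reduces to sign-based operations plus one multiply per filter, which is where the memory and speed savings come from.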

3,288 citations

Journal ArticleDOI
20 Nov 2017
TL;DR: In this paper, the authors provide a comprehensive tutorial and survey of recent advances toward enabling efficient processing of DNNs, discuss various hardware platforms and architectures that support DNNs, and highlight key trends in reducing the computation cost of deep neural networks either solely via hardware design changes or via joint hardware and DNN algorithm changes.
Abstract: Deep neural networks (DNNs) are currently widely used for many artificial intelligence (AI) applications, including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, this accuracy comes at the cost of high computational complexity. Accordingly, techniques that enable efficient processing of DNNs to improve energy efficiency and throughput without sacrificing application accuracy or increasing hardware cost are critical to the wide deployment of DNNs in AI systems. This article aims to provide a comprehensive tutorial and survey of the recent advances toward the goal of enabling efficient processing of DNNs. Specifically, it will provide an overview of DNNs, discuss various hardware platforms and architectures that support DNNs, and highlight key trends in reducing the computation cost of DNNs either solely via hardware design changes or via joint hardware design and DNN algorithm changes. It will also summarize various development resources that enable researchers and practitioners to quickly get started in this field, and highlight important benchmarking metrics and design considerations that should be used for evaluating the rapidly growing number of DNN hardware designs, optionally including algorithmic co-designs, being proposed in academia and industry. The reader will take away the following concepts from this article: understand the key design considerations for DNNs; be able to evaluate different DNN hardware implementations with benchmarks and comparison metrics; understand the tradeoffs between various hardware architectures and platforms; be able to evaluate the utility of various DNN design techniques for efficient processing; and understand recent implementation trends and opportunities.

2,391 citations