Author

Kishor Sarawadekar

Bio: Kishor Sarawadekar is an academic researcher at the Indian Institute of Technology (BHU) Varanasi. He has contributed to research on the topics of Gesture and JPEG 2000, has an h-index of 7, and has co-authored 39 publications receiving 182 citations. His previous affiliations include Xilinx and the Indian Institute of Technology Kharagpur.

Papers
Proceedings ArticleDOI
01 Oct 2019
TL;DR: A new warm restart technique, inspired by the cyclical learning rate and stochastic gradient descent with warm restarts, is introduced; it uses the “poly” LR policy, which helps the DNN converge faster and achieves slightly higher classification accuracy.
Abstract: Learning rate (LR) is one of the most important hyper-parameters in any deep neural network (DNN) optimization process. It controls the speed of convergence toward the global minimum by navigating the non-convex loss surface. The performance of a DNN is affected by the presence of local minima, saddle points, etc. in the loss surface. Decaying the learning rate by a factor at a fixed number of epochs, or exponentially, is the conventional way of varying the LR. Recently, two new approaches for setting the learning rate have been introduced, namely the cyclical learning rate and stochastic gradient descent with warm restarts. In both approaches, the learning rate is varied in a cyclic pattern between two boundary values. This paper introduces another warm restart technique, inspired by these two approaches, which uses the “poly” LR policy. The proposed technique, called polynomial learning rate with warm restart, requires only a single warm restart. The proposed LR policy helps the DNN converge faster and achieves slightly higher classification accuracy. Its performance is demonstrated on the CIFAR-10, CIFAR-100, and Tiny ImageNet datasets with CNN, ResNet, and Wide Residual Network (WRN) architectures.
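The “poly” policy decays the learning rate as base_lr · (1 − t/T)^power; with a single warm restart, the schedule resets to the base LR once and then decays again over the remaining epochs. A minimal sketch of such a schedule (the restart point, power value, and function name here are illustrative assumptions, not the paper's exact settings):

```python
def poly_lr_warm_restart(base_lr, epoch, total_epochs, power=2.0, restart_at=0.5):
    """Sketch of a 'poly' LR schedule with a single warm restart.

    The LR decays as base_lr * (1 - t/T)^power; at the restart point the
    schedule resets to base_lr and decays again over the remaining epochs.
    """
    restart_epoch = int(total_epochs * restart_at)
    if epoch < restart_epoch:
        t, span = epoch, restart_epoch          # first decay phase
    else:
        t, span = epoch - restart_epoch, total_epochs - restart_epoch  # after restart
    return base_lr * (1.0 - t / span) ** power
```

At the restart epoch the LR jumps back to `base_lr`, which is the "warm restart" the paper builds on.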

57 citations

Journal ArticleDOI
TL;DR: To encode all samples in a stripe-column concurrently, a new technique named compact context coding is devised; it attains high throughput while also cutting down the hardware requirement.
Abstract: Embedded block coding with optimized truncation (EBCOT) is a key algorithm in the JPEG 2000 image compression system. Applications such as medical imaging, satellite imagery, and digital cinema require a high-speed, high-performance EBCOT architecture. Although efficient EBCOT architectures have been proposed, their hardware requirements are very high and their throughput is low. To address this problem, we investigated the rate of concurrent context generation. Our study revealed that the rate of generating four or more context pairs in an image is about 68.9%. Therefore, to encode all samples in a stripe-column concurrently, a new technique named compact context coding is devised. As a consequence, high throughput is attained and the hardware requirement is also cut down. The performance of the MQ coder is improved by operating the renormalization and byte-out stages concurrently. The entire EBCOT encoder design is tested on a field-programmable gate array platform. The implementation results show that the throughput of the proposed architecture is 163.59 MSamples/s, which is equivalent to encoding a 1080p (1920 × 1080, 4:2:2) high-definition TV picture sequence at 39 f/s. The bit plane coder (BPC) architecture alone operates at 315.06 MHz, making it 2.86 times faster than the fastest BPC design available so far. Moreover, it is capable of encoding digital cinema frames (2048 × 1080) at 42 f/s. Thus, it satisfies the requirements of applications such as cartography, medical imaging, and satellite imagery, which demand a high-speed real-time image compression system.

29 citations

Journal ArticleDOI
TL;DR: A new dactylology is proposed to achieve functionality similar to a standard keyboard, together with a new feature extraction technique called reduced shape signature (RSS), which is rotation, translation, and scale invariant.

25 citations

Journal ArticleDOI
TL;DR: A new transform kernel for HEVC is proposed that uses a new set of real-valued DCT coefficients; it reduces the hardware cost and processing time by lowering the complexity as well as the intermediate data length.
Abstract: Integer discrete cosine transform (DCT) reduces the complexity of the transform kernel in High Efficiency Video Coding (HEVC) by eliminating the need for floating point multiplications. However, the dynamic range of integer DCT is large and therefore hardware cost is high. In this brief, a new transform kernel for HEVC is proposed which uses a new set of real-valued DCT coefficients. The proposed real-valued DCT reduces the hardware cost and the processing time by reducing the complexity as well as intermediate data length. However, it maintains coding performance similar to that of the integer DCT. Further, a hardware efficient data flow model of 2D-DCT architecture is also presented, which shows that a transpose memory of 15-bit data depth is enough to process 9-bit residual data. Field-programmable gate array implementation of the proposed 1-D DCT architecture reduces the area-delay product and power by 37.5% and 43.4%, respectively, as compared to that of the integer DCT. The proposed architecture requires 88.6K logic gates to produce a constant throughput of 32 samples per clock and it operates at 256.4 MHz on CMOS 90-nm ASIC platform.
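The separable 2-D DCT at the heart of such transform kernels applies a 1-D transform along rows and then along columns. A minimal sketch of an orthonormal real-valued DCT-II, for illustration only (this is the textbook coefficient set, not the paper's proposed one):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II transform matrix with real-valued coefficients."""
    k = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)        # DC row uses a smaller scale factor
    return m

def dct2(block):
    """Separable 2-D DCT: transform rows, then columns."""
    m = dct_matrix(block.shape[0])
    return m @ block @ m.T
```

For a constant block all the energy lands in the DC coefficient, which is a quick sanity check that the matrix is correct.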

20 citations

Journal ArticleDOI
TL;DR: A relative figure of merit is computed to compare the overall efficiency of all architectures, which shows that the proposed architecture provides a good balance between throughput and hardware cost.

14 citations


Cited by
Journal ArticleDOI
TL;DR: The proposed method is evaluated on a public dataset of real depth images captured from various users, and experimental results show that it outperforms state-of-the-art recognition accuracy under a leave-one-out evaluation strategy.
Abstract: Sign language is the most natural and effective way of communication between deaf and hearing people. American Sign Language (ASL) alphabet recognition (i.e., fingerspelling) using a marker-less vision sensor is a challenging task due to the difficulties in hand segmentation and appearance variations among signers. Existing color-based sign language recognition systems suffer from many challenges, such as complex backgrounds, hand segmentation, and large inter-class and intra-class variations. In this paper, we propose a new user-independent recognition system for the American Sign Language alphabet using depth images captured from the low-cost Microsoft Kinect depth sensor. Exploiting depth information instead of color images overcomes many problems owing to its robustness against illumination and background variations. The hand region can be segmented by applying a simple preprocessing algorithm over the depth image. Feature learning using convolutional neural network architectures is applied instead of classical hand-crafted feature extraction methods. Local features extracted from the segmented hand are effectively learned using a simple unsupervised Principal Component Analysis Network (PCANet) deep learning architecture. Two strategies for learning the PCANet model are proposed: training a single PCANet model from samples of all users, and training a separate PCANet model for each user. The extracted features are then recognized using a linear Support Vector Machine (SVM) classifier. The performance of the proposed method is evaluated on a public dataset of real depth images captured from various users. Experimental results show that the proposed method outperforms state-of-the-art recognition accuracy under a leave-one-out evaluation strategy.
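The first stage of PCANet learns its convolutional filters as the top principal components of mean-removed image patches. A rough sketch of that idea (the patch size, filter count, and random sampling scheme are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def learn_pca_filters(images, patch=5, n_filters=8, samples_per_image=200, rng=None):
    """Sketch of PCANet-style filter learning: the filters are the top
    principal components of mean-removed patches sampled from the images."""
    if rng is None:
        rng = np.random.default_rng(0)
    patches = []
    for img in images:
        for _ in range(samples_per_image):
            r = rng.integers(0, img.shape[0] - patch)
            c = rng.integers(0, img.shape[1] - patch)
            p = img[r:r + patch, c:c + patch].ravel()
            patches.append(p - p.mean())          # remove the patch mean
    x = np.stack(patches)
    # rows of Vt are the principal directions of the patch set
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return vt[:n_filters].reshape(n_filters, patch, patch)
```

The learned filters would then be convolved with the segmented hand image, and the responses fed (after binarization and histogramming, in the full PCANet pipeline) to the linear SVM.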

69 citations

Posted Content
Tu Zheng, Hao Fang, Yi Zhang, Wenjian Tang, Zheng Yang, Haifeng Liu, Deng Cai
TL;DR: A novel module named REcurrent Feature-Shift Aggregator (RESA) is proposed to enrich lane features after preliminary feature extraction with an ordinary CNN; it achieves state-of-the-art results on two popular lane detection benchmarks (CULane and Tusimple).
Abstract: Lane detection is one of the most important tasks in self-driving. Due to various complex scenarios (e.g., severe occlusion, ambiguous lanes, etc.) and the sparse supervisory signals inherent in lane annotations, the lane detection task is still challenging. Thus, it is difficult for an ordinary convolutional neural network (CNN) trained on general scenes to capture subtle lane features from raw images. In this paper, we present a novel module named REcurrent Feature-Shift Aggregator (RESA) to enrich lane features after preliminary feature extraction with an ordinary CNN. RESA takes advantage of the strong shape priors of lanes and captures spatial relationships of pixels across rows and columns. It shifts sliced feature maps recurrently in the vertical and horizontal directions and enables each pixel to gather global information. With the help of slice-by-slice information propagation, RESA can conjecture lanes accurately in challenging scenarios with weak appearance clues. Moreover, we also propose a Bilateral Up-Sampling Decoder, which combines coarse-grained and fine detailed features in the up-sampling stage and can meticulously recover the low-resolution feature map into a pixel-wise prediction. Our method achieves state-of-the-art results on two popular lane detection benchmarks (CULane and Tusimple). The code will be made publicly available.
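The core idea of RESA is to shift sliced feature maps by decreasing strides and aggregate them so that each position gathers information from across the map. A heavily simplified sketch of that propagation pattern (the strides, the additive weighting, and the use of np.roll are illustrative assumptions; the actual module operates on CNN feature slices with learned convolutions and nonlinearities):

```python
import numpy as np

def recurrent_feature_shift(fmap, strides=(8, 4, 2, 1), alpha=0.5):
    """Simplified RESA-style propagation: the map is repeatedly combined
    with shifted copies of itself, so information spreads across rows and
    columns in a logarithmic number of steps."""
    out = fmap.astype(float).copy()
    for s in strides:
        out = out + alpha * np.roll(out, s, axis=0)   # vertical shift
        out = out + alpha * np.roll(out, s, axis=1)   # horizontal shift
    return out
```

After a few shift-and-add rounds, a response at one pixel influences distant pixels, which is what lets the module infer occluded lane segments from visible ones.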

69 citations

Journal ArticleDOI
TL;DR: This paper presents a modular and extensible quadrotor architecture and its specific prototyping for automatic tracking applications, including what is, to the best of the authors' knowledge, the first proposed hardware architecture for SBPG compression integrated with an SDC.
Abstract: Image or video exchange over the Internet of Things (IoT) is a requirement in diverse applications, including smart health care, smart structures, and smart transportations. This paper presents a modular and extensible quadrotor architecture and its specific prototyping for automatic tracking applications. The architecture is extensible and based on off-the-shelf components for easy system prototyping. A target tracking and acquisition application is presented in detail to demonstrate the power and flexibility of the proposed design. Complete design details of the platform are also presented. The designed module implements the basic proportional–integral–derivative control and a custom target acquisition algorithm. Details of the sliding-window-based algorithm are also presented. This algorithm performs 20× faster than comparable approaches in OpenCV with equal accuracy. Additional modules can be integrated for more complex applications, such as search-and-rescue, automatic object tracking, and traffic congestion analysis. A hardware architecture for the newly introduced Better Portable Graphics (BPG) compression algorithm is also introduced in the framework of the extensible quadrotor architecture. Since its introduction in 1987, the Joint Photographic Experts Group (JPEG) graphics format has been the de facto choice for image compression. However, the new compression technique BPG outperforms JPEG in terms of compression quality and size of the compressed file. The objective is to present a hardware architecture for enhanced real-time compression of the image. Finally, a prototyping platform of a hardware architecture for a secure digital camera (SDC) integrated with the secure BPG (SBPG) compression algorithm is presented. The proposed architecture is suitable for high-performance imaging in the IoT and is prototyped in Simulink. To the best of our knowledge, this is the first ever proposed hardware architecture for SBPG compression integrated with an SDC.

61 citations

Journal ArticleDOI
TL;DR: This paper presents a novel approach to Multimodality Medical Image Fusion (MMIF) for analyzing lesions for diagnostic purposes and post-treatment review of NCC, showing promising results superior to state-of-the-art wavelet-based fusion algorithms.

50 citations