
Showing papers by "Santanu Chaudhury published in 2019"


Journal ArticleDOI
TL;DR: A fully convolutional neural network with attentional deep supervision for the automatic and accurate segmentation of the ultrasound images with improvement in overall segmentation accuracy is developed.
Abstract: Objective: Segmentation of anatomical structures in ultrasound images requires vast radiological knowledge and experience. Moreover, manual segmentation often results in subjective variations; therefore, automatic segmentation is desirable. We aim to develop a fully convolutional neural network (FCNN) with attentional deep supervision for automatic and accurate segmentation of ultrasound images. Method: FCNNs/CNNs are used to infer high-level context using low-level image features. In this paper, a sub-problem-specific deep supervision of the FCNN is performed. The attention of fine-resolution layers is steered to learn object boundary definitions using auxiliary losses, whereas coarse-resolution layers are trained to discriminate object regions from the background. Furthermore, a customized scheme for downweighting the auxiliary losses and a trainable fusion layer are introduced. This produces an accurate segmentation and helps in dealing with the broken boundaries usually found in ultrasound images. Results: The proposed network is first tested for blood vessel segmentation in liver images. It yields an $F1$ score, mean intersection over union, and dice index of 0.83, 0.83, and 0.79, respectively. The best values among existing approaches are produced by U-net: 0.74, 0.81, and 0.75, respectively. The proposed network also achieves a dice index of 0.91 in the lumen segmentation experiments on the MICCAI 2011 IVUS challenge dataset, close to the provided reference value of 0.93. Furthermore, improvements similar to those in the vessel segmentation experiments are observed in the lesion segmentation experiment. Conclusion: Deep supervision of the network based on the input-output characteristics of the layers improves overall segmentation accuracy. Significance: Sub-problem-specific deep supervision for ultrasound image segmentation is the main contribution of this paper.
Currently, the network is trained and tested on fixed-size inputs; this requires image resizing and limits performance on small images.
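The customized downweighting scheme for the auxiliary losses is not spelled out in the abstract; as an illustrative sketch (the exponential decay schedule below is an assumption, not the paper's), the combined deep-supervision objective can look like:

```python
import numpy as np

def combined_loss(main_loss, aux_losses, epoch, decay=0.1):
    """Total loss = main loss + downweighted auxiliary losses.
    The auxiliary weight decays over training so deep supervision
    guides early learning without dominating the final objective.
    (Illustrative schedule; the paper's exact scheme is customized.)"""
    w = np.exp(-decay * epoch)
    return main_loss + w * sum(aux_losses)
```

At epoch 0 the auxiliary terms contribute fully; after many epochs the total approaches the main loss alone.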

111 citations


Proceedings ArticleDOI
16 Jun 2019
TL;DR: This paper reviews the NTIRE challenge on image colorization (estimating color information from the corresponding gray image) with focus on proposed solutions and results.
Abstract: This paper reviews the NTIRE challenge on image colorization (estimating color information from the corresponding gray image) with focus on the proposed solutions and results. It is the first challenge of its kind. The challenge had two tracks. Track 1 takes a single gray image as input. In Track 2, in addition to the gray input image, some color seeds (randomly sampled from the latent color image) are also provided to guide the colorization process. The operators were learnable through provided pairs of gray and color training images. The tracks had 188 registered participants, and 8 teams competed in the final testing phase.
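The Track 2 seeding protocol described above can be sketched as follows (a hypothetical helper; the challenge's actual sampling code is not given):

```python
import numpy as np

def sample_color_seeds(color_img, n_seeds, rng=None):
    """Randomly sample pixel locations and their RGB values from the
    latent color image, to serve as hints for guided colorization."""
    rng = rng or np.random.default_rng(0)
    h, w, _ = color_img.shape
    ys = rng.integers(0, h, n_seeds)
    xs = rng.integers(0, w, n_seeds)
    return np.stack([ys, xs], axis=1), color_img[ys, xs]

img = np.zeros((64, 64, 3), dtype=np.uint8)
img[..., 0] = 200                       # a uniformly red image
coords, colors = sample_color_seeds(img, 5)
```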

22 citations


Journal ArticleDOI
TL;DR: In this paper, morphological, physiological, biochemical and molecular variations between drought-tolerant (PB6 and Moroberakan) and drought-sensitive (Way Rarem) varieties have been evaluated, and notable differences have been observed in root morphology, root xylem number and area, stomata number, relative water content, proline content, protein and gene expression.

10 citations


Proceedings ArticleDOI
16 Jun 2019
TL;DR: A Robust Image Colorization using self-attention based Progressive Generative Adversarial Network (RIC-SPGAN), which consists of a residual encoder-decoder (RED) network and a self-attention based progressive generative network (SP-GAN) in cascaded form, to perform denoising and colorization of the image.
Abstract: Automatic image colorization is a very interesting computer graphics problem wherein an input grayscale image is transformed into its RGB domain. However, it is an ill-posed problem, as there can be multiple RGB outcomes for a particular grayscale pixel. The problem is further complicated if noise is present in the grayscale image. In this paper, we propose Robust Image Colorization using a Self-attention based Progressive Generative Adversarial Network (RIC-SPGAN), which consists of a residual encoder-decoder (RED) network and a self-attention based progressive generative network (SP-GAN) in cascaded form to perform denoising and colorization of the image. We use the self-attention based progressive network to model long-range dependencies and gradually enhance the resolution of the colorized image for fast, stable and variation-rich generation of the image. We also present the stabilization technique for the generative model. Our model shows exceptional perceptual results on noisy and normal grayscale images. We trained our model on ILSVRC2012. Visual results on DIV2K images with and without noise are presented in the paper, along with failure cases of the model.

7 citations


Journal ArticleDOI
TL;DR: A distributed locality sensitive hashing based framework for image super resolution that exploits the computational and storage efficiency of the cloud and provides promising results in comparison to existing approaches.
Abstract: In this paper we propose a distributed locality sensitive hashing (LSH) based framework for image super resolution that exploits the computational and storage efficiency of the cloud. Nowadays, huge volumes of multimedia data are available on the cloud and can be utilized through a store-anywhere, access-anywhere model. Super resolution is required for consumer electronics display devices for various reasons. The proposed framework exploits image correlation for super resolution using LSH for manifold learning. In our work we exploit the benefits of manifold learning for image super resolution, which in turn is a highly time-complex operation. The time complexity stems from finding approximate nearest neighbors among trillions of image patches for the locally linear embedding (LLE) operation. In our approach this is mitigated by a distributed framework which internally uses hash tables to map patches in the target image from a database of internet picture collections. The proposed framework provides promising results in comparison to existing approaches.
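The LSH-based patch lookup can be sketched with signed random projections (a standard hyperplane-LSH construction; the paper's exact hash family is not specified, so treat this as an assumption):

```python
import numpy as np

def lsh_hash(patches, planes):
    """Hyperplane LSH: each patch's sign pattern under random
    projections gives a binary code; similar patches tend to land
    in the same hash-table bucket."""
    bits = (patches @ planes.T) > 0                 # (n, n_bits)
    return bits.astype(int) @ (1 << np.arange(bits.shape[1]))

rng = np.random.default_rng(0)
planes = rng.standard_normal((8, 25))   # 8 bits for 5x5 patches
a = rng.standard_normal(25)
b = a + 0.01 * rng.standard_normal(25)  # near-duplicate patch
codes = lsh_hash(np.stack([a, b]), planes)
```

Buckets keyed by these codes let a distributed worker fetch candidate neighbors without an exhaustive search.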

5 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: It is found that the convolutional neural network performs better when trained with the assistance of fixation information compared to the network trained without eye fixations.
Abstract: This paper is concerned with the development of techniques for the recognition of ornamental characters, motivated by the perceptual processes involved in humans. To understand the perceptual process, we performed an eye-tracking experiment on the recognition of a special set of characters with artistic variations in character structure and form. The novelty of this paper is the use of human visual fixations to supervise the intermediate layers of a convolutional neural network. From the results obtained, we found that the network performs better when trained with the assistance of fixation information than when trained without eye fixations.
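One way to sketch fixation supervision of an intermediate layer (the paper's exact auxiliary loss is not given; an MSE between normalized spatial maps is an assumption):

```python
import numpy as np

def fixation_loss(act_map, fix_map, eps=1e-12):
    """Auxiliary loss encouraging a layer's spatial activation map
    to match a human fixation density map; both maps are first
    normalized to sum to one so only their shapes are compared."""
    a = act_map / (act_map.sum() + eps)
    f = fix_map / (fix_map.sum() + eps)
    return float(((a - f) ** 2).mean())
```

This term would be added to the classification loss so intermediate features attend where humans look.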

4 citations


Book ChapterDOI
22 Dec 2019
TL;DR: A refined optical flow estimation method that performs well in case of low contrast, highly cluttered background, dynamic background, occlusion and illumination change is presented.
Abstract: Optical flow is a popular computer vision method for motion estimation. In this paper, we present a refined optical flow estimation method. Central to our approach is exploiting contour information, as most of the motion lies on the edges. Further, we formulate it as sparse-to-dense motion estimation. The proposed method has been evaluated on challenging real-life image sequences from the KITTI and Fish4Knowledge databases. Results demonstrate that the method performs well in cases of low contrast, highly cluttered background, dynamic background, occlusion and illumination change.
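The sparse-to-dense step can be sketched with nearest-seed interpolation (a deliberately minimal stand-in for the paper's refinement; function and variable names are illustrative):

```python
import numpy as np

def densify_flow(seeds, flows, shape):
    """Propagate sparse flow vectors (typically estimated on contour
    points) to every pixel by copying from the nearest seed."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    pix = np.stack([ys.ravel(), xs.ravel()], axis=1)        # (H*W, 2)
    d2 = ((pix[:, None, :] - seeds[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)                             # closest seed
    return flows[nearest].reshape(shape[0], shape[1], 2)

seeds = np.array([[0, 0], [3, 3]])            # two contour points
flows = np.array([[1.0, 0.0], [0.0, 2.0]])    # their flow vectors
dense = densify_flow(seeds, flows, (4, 4))
```

Real systems replace the nearest-neighbor copy with edge-aware interpolation followed by a variational refinement.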

4 citations


Book ChapterDOI
17 Dec 2019
TL;DR: This paper aims to provide a solution to the problem of real-time vehicle detection in aerial images and videos by using hyper feature maps generated by a skip-connected convolutional network to generate object-like proposals accurately.
Abstract: Detection of objects in aerial images has gained significant attention in recent years due to its extensive needs in civilian and military reconnaissance and surveillance applications. With the advent of Unmanned Aerial Vehicles (UAVs), the scope for performing such surveillance tasks has increased. The small size of objects in aerial images makes them very difficult to detect. Two-stage region-based convolutional neural network frameworks for object detection have proved quite effective. The main problem with these frameworks is their low speed compared to single-stage object detectors, owing to the computational complexity of generating region proposals. Region-based methods also suffer from poor localization of objects, which leads to a significant number of false positives. This paper aims to provide a solution to the problem of real-time vehicle detection in aerial images and videos. The proposed approach uses hyper feature maps generated by a skip-connected convolutional network. The hyper feature maps are then passed through a region proposal network to generate object-like proposals accurately. The issue of detecting objects similar to the background is addressed by modifying the loss function of the proposal network. The performance of the proposed network has been evaluated on the publicly available VEDAI dataset.
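The abstract does not specify how the proposal loss is modified; a focal-loss-style reweighting (Lin et al.) is one common choice when objects resemble the background, sketched here as an assumption rather than the paper's method:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss: downweights easy, confidently classified examples
    so hard proposals (e.g. vehicles resembling background) dominate
    the gradient. p: predicted objectness probability; y: 0/1 label."""
    pt = np.where(y == 1, p, 1 - p)             # prob of the true class
    w = np.where(y == 1, alpha, 1 - alpha)      # class-balance weight
    return float(-(w * (1 - pt) ** gamma * np.log(pt + 1e-12)).mean())
```

A well-classified positive (p = 0.9) contributes far less loss than a hard one (p = 0.1), which is the intended effect.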

3 citations


Journal ArticleDOI
TL;DR: The proposed method is promising as a first post-hoc explainable NN approach for brain-connectivity analysis in clinical applications, proposing a novel weight-dependent score as a quantitative measure of connectivity, called the relative relevance score (xNN-RRS).

3 citations


Book ChapterDOI
17 Dec 2019
TL;DR: This paper presents data-driven sensing for spatial multiplexers trained with a combined mean square error (MSE) and perceptual loss using deep convolutional neural networks, and experimentally infers that the encoded information from such spatial multiplexers can directly be used for action recognition.
Abstract: Tasks such as action recognition require high-quality features for accurate inference. But the use of high-resolution, large-volume video data poses a significant challenge for inference in terms of storage and computational complexity. In addition, compressive sensing, a potential solution to the aforementioned problems, has been shown to recover signals at higher compression ratios only with loss of information. Hence, a framework is required that performs good-quality action recognition on compressively sensed data. In this paper, we present data-driven sensing for spatial multiplexers trained with a combined mean square error (MSE) and perceptual loss using deep convolutional neural networks. We employ subpixel convolutional layers with a 2D convolutional encoder-decoder model, which learns the downscaling filters that bring the input from higher to lower dimension in the encoder, and learns the reverse, i.e. upscaling filters, in the decoder. We stack this encoder with an Inflated 3D ConvNet and train the cascaded network with cross-entropy loss for action recognition. After encoding the data and undersampling it by over 100 times (10 × 10) relative to the input size, we obtain 75.05% accuracy on UCF-101 and 50.39% accuracy on HMDB-51 with the proposed architecture, setting a baseline for reconstruction-free action recognition with data-driven sensing using deep learning. We experimentally infer that the encoded information from such spatial multiplexers can directly be used for action recognition.
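The 10 × 10 spatial multiplexing (100 pixels compressed to 1 measurement per block) can be sketched with a fixed linear sensor; in the paper the measurement weights are learned end-to-end, so the hand-picked averaging `phi` below is only an illustrative stand-in:

```python
import numpy as np

def multiplex(frame, phi):
    """Apply one measurement vector phi to each non-overlapping
    10x10 block: 100 pixels -> 1 value (100x undersampling)."""
    h, w = frame.shape
    blocks = frame.reshape(h // 10, 10, w // 10, 10).transpose(0, 2, 1, 3)
    return blocks.reshape(h // 10, w // 10, 100) @ phi

rng = np.random.default_rng(0)
frame = rng.standard_normal((240, 320))
phi = np.full(100, 1 / 100)          # simple averaging sensor (assumption)
meas = multiplex(frame, phi)         # compressed measurements
```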

2 citations


Book ChapterDOI
12 Dec 2019
TL;DR: The performance of convolutional stacked autoencoder and convolutional long short-term memory models for classification of motor imagery EEG signals has been evaluated and compared with different machine learning models.
Abstract: Navigation of drones can conceivably be performed by operators through analysis of their brain signals. EEG signals corresponding to motor imagination can be used to generate control signals for a drone. Different machine learning and deep learning approaches have been developed in the literature for classification of motor imagery EEG signals. There is still a need for a suitable model that can classify motor imagery signals quickly and generate navigation commands for a drone in real time. In this paper, we report the performance of convolutional stacked autoencoder and convolutional long short-term memory models for classification of motor imagery EEG signals. The developed models have been optimized using TensorRT, which speeds up inference, and the inference engine has been deployed on the Jetson TX2 embedded platform. The performance of these models has been compared with different machine learning models.
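Before classification, motor imagery recordings are typically segmented into fixed-length windows that the CNN/LSTM consumes; a minimal sketch (channel counts, window and step sizes are illustrative, not the paper's):

```python
import numpy as np

def epoch_eeg(signal, win, step):
    """Slice a (channels, samples) EEG recording into overlapping
    fixed-length windows, the usual input format for deep models."""
    c, n = signal.shape
    starts = range(0, n - win + 1, step)
    return np.stack([signal[:, s:s + win] for s in starts])

eeg = np.zeros((8, 1000))                  # 8 channels, 1000 samples
epochs = epoch_eeg(eeg, win=250, step=125) # 50% overlap
```

Each epoch would then be classified independently, and low per-epoch latency is what makes real-time drone commands feasible.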

Book ChapterDOI
17 Dec 2019
TL;DR: A context-aware reasoning framework that adapts to the needs and preferences of inhabitants continuously to provide contextually relevant recommendations to the group of users in a smart home environment is introduced.
Abstract: This paper introduces a context-aware reasoning framework that adapts continuously to the needs and preferences of inhabitants to provide contextually relevant recommendations to a group of users in a smart home environment. User activity and mobility play a crucial role in defining various contexts in and around the home. The observation data acquired from disparate sensors, called the user's context, is interpreted semantically to implicitly disambiguate the users being recommended to. Recommendations are provided based on the relationships that exist among multiple users, and the decision is made according to preference or priority. The proposed approach makes extensive use of a multimedia ontology in the life cycle of situation recognition to explicitly model and represent the user's context in the smart home. Further, dynamic reasoning is exploited to facilitate context-aware situation tracking and to intelligently recommend appropriate actions that suit the situation. We illustrate the use of the proposed framework for a smart home use case.
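As a toy illustration of priority-based conflict resolution among multiple users (all names and the flat dictionary representation are hypothetical; the paper uses ontology-based semantic reasoning, not lookup tables):

```python
def recommend(context, prefs, priority):
    """Pick an action for a group: look up each user's preferred
    action for the sensed context, then let the highest-priority
    user's preference win."""
    candidates = [(priority[u], prefs[u][context])
                  for u in prefs if context in prefs[u]]
    return max(candidates)[1] if candidates else None

prefs = {"alice": {"evening": "dim lights", "morning": "open blinds"},
         "bob": {"evening": "bright lights"}}
priority = {"alice": 2, "bob": 1}
```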

Proceedings ArticleDOI
01 Oct 2019
TL;DR: This paper shows that by training a Generative Adversarial Network with raw image pixels as input, it can generate scenes which constitute the objects as well as generate the surrounding environment suitable for the combination of the input objects.
Abstract: In this paper, we propose to synthesize natural images from a set of input objects. The proposed technique generates a scene which has high correlation with the provided set of input objects while also maintaining the natural placement of objects within the scene. The technique consists of a generative adversarial network trained on a large corpus of objects and natural scenes. This is in contrast with earlier works, where the objective was to generate a natural scene from a noise vector or by conditioning the network on a variable. However, such methods are limited in their ability to control the objects within the generated images. On the contrary, we show that by training a generative adversarial network with raw image pixels as input, we can generate scenes which contain the input objects as well as a surrounding environment suitable for their combination. We provide qualitative and quantitative results on the challenging MS-COCO dataset to show the effectiveness of the proposed technique.