
Showing papers by "Santanu Chaudhury published in 2020"


Posted Content
TL;DR: The proposed network not only predicts whether a CXR shows COVID-19 features but also performs semantic segmentation of the regions of interest, making the model explainable.
Abstract: With the number of COVID-19 cases increasing globally, countries are ramping up testing. While RT-PCR kits are available in sufficient quantity in several countries, others face challenges with the limited availability of testing kits and processing centers in remote areas. This has motivated researchers to find alternative testing methods that are reliable, easily accessible and faster. The chest X-ray is one modality gaining acceptance for screening. Towards this direction, the paper makes two primary contributions. First, we present the COVID-19 Multi-Task Network, an automated end-to-end network for COVID-19 screening. The proposed network not only predicts whether a CXR shows COVID-19 features but also performs semantic segmentation of the regions of interest, making the model explainable. Second, with the help of medical professionals, we manually annotate the lung regions of 9000 frontal chest radiographs taken from ChestXray-14, CheXpert and a consolidated COVID-19 dataset. Further, 200 chest radiographs pertaining to COVID-19 patients are also annotated for semantic segmentation. This database will be released to the research community.
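The network is trained jointly for classification and segmentation. The abstract does not specify the loss, so the sketch below shows one plausible joint objective, cross-entropy for both heads with a weighting term; the function name, the weighting, and the toy inputs are all illustrative assumptions, not the paper's method.

```python
import math

def multitask_loss(cls_prob, cls_label, seg_probs, seg_mask, seg_weight=1.0):
    """Joint objective for a screening network: image-level classification
    loss plus a pixel-wise segmentation loss (illustrative weighting)."""
    eps = 1e-7
    # Binary cross-entropy on the COVID-19 / non-COVID-19 prediction.
    cls_loss = -(cls_label * math.log(cls_prob + eps)
                 + (1 - cls_label) * math.log(1 - cls_prob + eps))
    # Mean binary cross-entropy over the predicted segmentation mask.
    seg_loss = sum(
        -(m * math.log(p + eps) + (1 - m) * math.log(1 - p + eps))
        for p, m in zip(seg_probs, seg_mask)
    ) / len(seg_mask)
    return cls_loss + seg_weight * seg_loss

# Toy example: one image-level prediction and a 4-pixel mask.
loss = multitask_loss(0.9, 1, [0.8, 0.2, 0.7, 0.1], [1, 0, 1, 0])
print(round(loss, 3))
```

The weighting between the two heads is the usual tuning knob in such multi-task setups.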

29 citations


Journal ArticleDOI
TL;DR: The proposed scheme is quite encouraging on sequences with hazy and degraded, partially occluded, and camouflaged objects; performance is evaluated by comparing the scheme with five recent state-of-the-art tracking schemes.
Abstract: One of the well-established research domains among computer vision scientists is object tracking. However, not much work has been done in underwater scenarios. This article addresses the problem of visual tracking in the underwater environment with stationary and nonstationary camera setups. In order to deal with underwater optical dynamics, a dominant color component-based scene representation is employed in the YCbCr color space. An adaptive approach is devised to select the Walsh–Hadamard (WH) kernels for the efficient extraction of color, edge, and texture strengths, and a new feature called range strength is proposed to extract the variation of intensity in the local neighborhood of underwater sequences using the WH kernel. The likelihoods of these feature strengths are integrated in a particle filter framework to track the object of interest in underwater sequences. The reference feature strengths used in assigning weights to the particles are updated based on the Sørensen distance. The coefficients of the feature strengths are calculated in such a way that if one feature fails, its coefficient becomes insignificant, whereas the more suitable features get higher coefficients. The effectiveness of the proposed scheme is evaluated using the underwater video datasets reefVid, fish4knowledge (F4K), underwaterchangedetection (UWCD), and National Oceanic and Atmospheric Administration (NOAA). The performance evaluation is performed by comparing the scheme with five recent state-of-the-art tracking schemes. The quantitative analysis of the proposed scheme is carried out using three evaluation measures: overall intersection over union, centroid location error, and average tracking error. The performance of the proposed scheme is quite encouraging in the case of sequences with hazy and degraded, partially occluded, and camouflaged objects.
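The template-update rule relies on the Sørensen distance between reference and current feature strengths. A minimal sketch of the standard Sørensen distance (sum of absolute differences normalised by total mass); the function name and the thresholding comment are illustrative, not taken from the paper:

```python
import numpy as np

def sorensen_distance(p, q):
    """Sorensen distance between two non-negative feature vectors:
    sum of absolute differences divided by the combined mass."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.abs(p - q).sum() / (p + q).sum()

# Reference strengths would typically be updated only while the tracked
# region still resembles the template, i.e. while the distance stays small.
ref = np.array([0.2, 0.5, 0.3])
cur = np.array([0.25, 0.45, 0.30])
print(round(sorensen_distance(ref, cur), 3))  # 0.05
```

The distance is bounded in [0, 1] for non-negative inputs, which makes it convenient as an update gate.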

13 citations


Book ChapterDOI
01 Jan 2020
TL;DR: The winning solution to the NeurIPS 2018 AutoML challenge is described, entitled AutoGBT, which combines an adaptive self-optimized end-to-end machine learning pipeline based on gradient boosting trees with automatic hyper-parameter tuning using Sequential Model-Based Optimization (SMBO).
Abstract: Data abundance along with scarcity of machine learning experts and domain specialists necessitates progressive automation of end-to-end machine learning workflows. To this end, Automated Machine Learning (AutoML) has emerged as a prominent research area. Real world data often arrives as streams or batches, and data distribution evolves over time causing concept drift. Models need to handle data that is not independent and identically distributed (iid), and transfer knowledge across time through continuous self-evaluation and adaptation adhering to resource constraints. Creating autonomous self-maintaining models which not only discover an optimal pipeline, but also automatically adapt to concept drift to operate in a lifelong learning setting was the crux of NeurIPS 2018 AutoML challenge. We describe our winning solution to the challenge, entitled AutoGBT, which combines an adaptive self-optimized end-to-end machine learning pipeline based on gradient boosting trees with automatic hyper-parameter tuning using Sequential Model-Based Optimization (SMBO). We report experimental results on the challenge datasets as well as several benchmark datasets affected by concept drift and compare it with the baseline model for the challenge and Auto-sklearn. Results indicate the effectiveness of the proposed methodology in this context.
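The abstract stresses adaptation to concept drift, where the data distribution shifts over time. AutoGBT's actual drift-handling mechanism is not described here; purely as a generic illustration, a simple mean-shift drift trigger over sliding windows of a feature might look like this (function name and threshold are assumptions):

```python
import statistics

def drift_detected(window_old, window_new, threshold=0.5):
    """Flag concept drift when the mean of a feature in the new window
    shifts by more than `threshold` standard deviations of the old one."""
    mu_old = statistics.mean(window_old)
    sd_old = statistics.stdev(window_old)
    mu_new = statistics.mean(window_new)
    return abs(mu_new - mu_old) > threshold * sd_old

stable = [1.0, 1.1, 0.9, 1.05, 0.95]
shifted = [2.0, 2.1, 1.9, 2.05, 1.95]
print(drift_detected(stable, shifted))  # True: the mean jumped by ~1.0
```

In a lifelong-learning pipeline such a trigger would prompt model retraining or hyper-parameter re-optimization.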

9 citations


Journal ArticleDOI
TL;DR: An intelligent video skimming technique for lecture video sequences that requires no human intervention is put forward and found to outperform existing schemes.
Abstract: Video skimming is a recently popular technique for preparing previews of long video sequences. Most video skimming techniques in the literature use manual intervention to prepare the preview, and they mostly target the sports and movie industries: in sports, the portions of video where the audience claps are used, while in movies important content is manually selected for the preview. However, hardly any work has been reported on skimming lecture video sequences. Lecture videos are generally recorded indoors under low-illumination, noisy conditions, and the scene content rarely changes much, so designing an automatic skimming scheme is quite a difficult task. In this article, we put forward an intelligent video skimming technique for lecture video sequences that requires no human intervention. In the proposed scheme, the lecture video is initially segmented into a number of shots; we propose a radiometric correlation technique for lecture video segmentation, i.e., finding the shot transitions. After the shot transitions are found, the shots are recognized using a proposed fuzzy K-nearest neighbor technique into three categories: title slides, written texts/displayed slides, and talking heads/writing hands. Three contrast-based features, one existing, average sharpness (AS), and two newly proposed, relative height (RH) and edge potential (EP), are used to find the contents of a frame. Frames with different contrast values are categorized to prepare the video skim, or capsule, and the media recreation is achieved by selecting a set of frames around these selected content frames. The effectiveness of the proposed scheme is demonstrated using five test sequences, three NPTEL and two non-NPTEL. The capsule prepared by the proposed scheme provides a better preview of the actual sequence. The performance of the proposed scheme is tested against three state-of-the-art techniques using three evaluation measures, and it is found to be better than the existing schemes.
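The abstract does not define radiometric correlation precisely. One common realisation of shot-boundary detection in this spirit, a Pearson correlation between intensity histograms of consecutive frames with low values suggesting a cut, can be sketched as follows (function name and thresholds are illustrative):

```python
import numpy as np

def radiometric_correlation(h1, h2):
    """Pearson correlation between intensity histograms of two frames;
    a low value suggests a shot transition between them."""
    h1, h2 = np.asarray(h1, float), np.asarray(h2, float)
    a, b = h1 - h1.mean(), h2 - h2.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# Consecutive frames within one shot have near-identical histograms...
same_shot = radiometric_correlation([10, 40, 30, 20], [12, 38, 31, 19])
# ...while frames across a cut do not.
cut = radiometric_correlation([10, 40, 30, 20], [45, 10, 15, 30])
print(same_shot > 0.9, cut < 0.5)  # True True
```

A shot transition would be declared wherever the correlation falls below a tuned threshold.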

6 citations


Proceedings ArticleDOI
05 Oct 2020
TL;DR: In this article, a spatio-contextual Gaussian mixture model based background subtraction method is used to detect prominent objects among a large group of fishes in a stationary camera setup, and the detected objects are analyzed to determine a predefined number of the most prominent objects in the scene of view.
Abstract: Tracking a fish, or some specific fishes, in a school of fish is quite a challenging task. This could help in understanding the behavior of a fish or a small group of fish in a crowd of different varieties of fishes. In this paper we propose a technique to detect prominent objects among a large group of fishes. The problem is formulated with a stationary camera setup. The moving objects are initially detected by a spatio-contextual Gaussian mixture model based background subtraction method. Further, all the detected objects are analyzed to determine a predefined number of the most prominent objects in the field of view. To characterize the objects we employ a dual-feature framework comprising color and texture features. The overall feature strength is computed by combining the two feature strengths adaptively, so that color gets more weight when color degradation is low and texture gets more weight otherwise. This weight is computed adaptively using prior information about the color degradation phenomenon in the underwater environment. The proposed technique is tested on a large number of underwater videos and is found to perform satisfactorily.
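The adaptive color/texture weighting described above can be sketched as a simple convex blend; the linear form and the `degradation` input (assumed to come from a prior underwater attenuation model) are illustrative assumptions, not the paper's exact formulation:

```python
def combined_strength(color_s, texture_s, degradation):
    """Blend color and texture feature strengths: color dominates when
    color degradation is low, texture takes over as degradation grows.
    `degradation` is assumed to lie in [0, 1]."""
    w_color = 1.0 - degradation
    w_texture = degradation
    return w_color * color_s + w_texture * texture_s

# Clear water: color is trusted.
print(round(combined_strength(0.8, 0.4, degradation=0.1), 2))  # 0.76
# Turbid water: texture dominates.
print(round(combined_strength(0.8, 0.4, degradation=0.9), 2))  # 0.44
```

Because the weights sum to one, the combined strength stays on the same scale as the individual features.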

4 citations


Proceedings ArticleDOI
10 Dec 2020
TL;DR: In this article, an end-to-end approach using an Inception network trained from scratch achieves 80% accuracy in predicting age within 1 year of the ground truth. Attention maps explain which regions of the image the model focuses on while assessing bone age, and the heat maps thus generated match the features used by radiologists in manual prediction.
Abstract: Skeletal bone age assessment is one of the routine radiological procedures performed by paediatricians and endocrinologists for investigating genetic disorders, developmental abnormalities and metabolic complications. In this process, skeletal age is compared against the child's chronological age to uncover discrepancies, if any. Hand radiographs, being the cheapest, most reliable and most widely used modality, are used to predict bone age in children from 1-18 years of age. Conventional methods make use of atlases to predict the age; these are time consuming, tedious and suffer from inter-observer variability. We propose an end-to-end approach which uses an Inception network trained from scratch and achieves 80% accuracy in predicting age within 1 year of the ground truth. Further, attention maps are generated to explain which regions of the image the model focuses on while assessing bone age, and the heat maps thus generated match the features used by radiologists when predicting manually.
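The reported 80% figure is an accuracy-within-tolerance metric. A sketch of how such a metric is typically computed (function name and toy data are illustrative):

```python
def within_one_year_accuracy(pred_ages, true_ages, tolerance=1.0):
    """Fraction of predictions whose absolute error is within
    `tolerance` years of the ground-truth bone age."""
    hits = sum(abs(p - t) <= tolerance for p, t in zip(pred_ages, true_ages))
    return hits / len(true_ages)

preds = [5.2, 8.9, 12.0, 15.5, 7.0]
truth = [5.0, 10.5, 12.4, 15.0, 7.8]
print(within_one_year_accuracy(preds, truth))  # 0.8: four of five within a year
```

Unlike mean absolute error, this metric directly reflects how often a prediction is clinically close enough.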

1 citation


Book ChapterDOI
24 Nov 2020
TL;DR: In this article, a bone-age assessment model using triplet loss for children in 0-3 years of age is proposed, which achieves an AUC of 0.92 for binary and 0.82 for multi-class classification with visible separation in embedding clusters.
Abstract: Skeletal bone age assessment is a routine clinical procedure carried out by paediatricians and endocrinologists for investigating a variety of endocrinological, metabolic, genetic and growth disorders in children. Skeletal maturity advances with changes in the structure and size of the skeletal bones with respect to age. Assessment is commonly done by radiological investigation of the left hand due to its non-dominant use. A discrepancy between skeletal age and chronological age indicates abnormality. In this study, a bone-age assessment model using triplet loss for children in the 0–3 years age group is proposed. Furthermore, this is the first automated bone age assessment study on lower age groups with comparable results, using one tenth of the training data samples required by conventional deep neural networks. We use a small number of radiographs per class from the Digital Hand Atlas Database System (DHA), a publicly available comprehensive X-ray dataset. The trained model achieves an AUC of 0.92 for binary and 0.82 for multi-class classification, with visible separation in the embedding clusters, thereby resulting in correct predictions on the test data set.
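The standard triplet loss, which the model presumably uses in its usual form max(0, d(a,p) − d(a,n) + margin), can be sketched in a few lines (the margin value and toy embeddings are illustrative):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the anchor towards the positive and
    push it at least `margin` beyond the negative in embedding space."""
    d_pos = np.sum((anchor - positive) ** 2)   # squared distance to same class
    d_neg = np.sum((anchor - negative) ** 2)   # squared distance to other class
    return float(max(0.0, d_pos - d_neg + margin))

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same age class, nearby
n = np.array([1.0, 0.0])   # different class, far away
print(triplet_loss(a, p, n))  # 0.0: the margin constraint is already satisfied
```

Training on triplets rather than individual labels is what allows reasonable embeddings from roughly a tenth of the usual data.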

1 citation


Proceedings ArticleDOI
21 Sep 2020
TL;DR: In this paper, a state trajectory estimator based on ancestor sampling (STEAS) was proposed for gaze data classification and video retrieval; it captures the features of the human temporal gaze pattern to identify the kind of visual stimuli.
Abstract: Human gaze dynamics mainly concern the sequence of occurrence of three eye movements: fixations, saccades, and microsaccades. In this paper, we correlate them, as three different states, to the velocities of eye movements. We build a state trajectory estimator based on ancestor sampling (STEAS) model, which captures the features of the human temporal gaze pattern to identify the kind of visual stimuli. We used a gaze dataset of 72 viewers watching 60 video clips equally split into four visual categories. Uniformly sampled velocity vectors from the training set are used to find the best parameters of the proposed statistical model. The optimized model is then used for both gaze data classification and video retrieval on the test set. We observed a classification accuracy of 93.265% and a mean reciprocal rank of 0.888 for video retrieval on the test set. Hence, this model can be used for viewer-independent video indexing, providing viewers an easier way to navigate through content.
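The retrieval result is reported as a mean reciprocal rank (MRR), a standard measure that averages 1/rank of the correct item over all queries. A minimal sketch (function name and toy ranks are illustrative):

```python
def mean_reciprocal_rank(ranks):
    """Mean reciprocal rank over queries, where `ranks` holds the 1-based
    position of the correct video for each gaze-based query."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# If the correct clip is ranked 1st, 1st, 2nd and 4th across four queries:
print(mean_reciprocal_rank([1, 1, 2, 4]))  # (1 + 1 + 0.5 + 0.25) / 4 = 0.6875
```

An MRR of 0.888, as reported, implies the correct clip usually appears at or very near the top of the ranking.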

1 citation


Proceedings ArticleDOI
01 Feb 2020
TL;DR: A data-adaptive CS framework based on deep learning for image recognition, where sampling considers the global context and the encoding that produces measurements is learned from data, so as to generalize over a large-scale dataset.
Abstract: Compressive sensing (CS) using deep learning for the recovery of images from measurements has been well explored in recent years. Instead of sensing/sampling the full image, block- or patch-based compressive sensing is chosen to overcome memory and computation limitations. The drawback of block-based CS sampling and recovery is that it does not capture the global context and focuses only on the local context, which results in artifacts at the boundary of two consecutive image blocks. Random Gaussian or random Bernoulli matrices are commonly used as sensing matrices to sample an image block and generate corresponding linear measurements. Although random Gaussian and random Bernoulli matrices exhibit the Restricted Isometry Property (RIP), which is a guarantee of a good-quality reconstructed image, their two main disadvantages are: 1) large memory and computational requirements and 2) their encoded measurements do not generalize well to a large-scale dataset. In this paper, we propose a data-adaptive CS framework based on deep learning for image recognition in which 1) sampling is done considering the global context and 2) the encoding used to obtain measurements is learned from data, so as to achieve generalization over a large-scale dataset.
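The conventional baseline the paper argues against, block-based sampling with a random Gaussian sensing matrix, can be sketched as follows (block size, measurement count, and normalisation are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Block-based compressive sensing: each 8x8 image block (64 pixels) is
# flattened and projected to m << 64 measurements by a fixed random
# Gaussian sensing matrix.
n, m = 64, 16                                # block dimension, measurements
phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random Gaussian sensing matrix
block = rng.random(n)                        # one flattened image block
y = phi @ block                              # linear measurements
print(y.shape)                               # (16,)
```

The paper's proposal replaces the fixed matrix phi with an encoding learned from data over the whole image, avoiding the block-boundary artifacts and generalization issues described above.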

1 citation


Book ChapterDOI
01 Jan 2020
TL;DR: This work proposes a novel framework where the small labelled dataset is appropriately augmented using the intelligent learning mechanisms of artificial immune systems to train the proposed model and shows that the generative deep framework utilizing artificial immune system principles provides a highly competitive approach for learning in the semi-supervised environment.
Abstract: Labelled data are not only time consuming but often expensive and difficult to procure, as labelling involves skilful human input to tag and annotate. In contrast, unlabelled data are comparatively easy to procure, but fewer methods exist to use them optimally. Semi-supervised learning overcomes this problem and helps build better classifiers by using unlabelled data along with sufficient labelled data, and it may actually yield higher accuracy with considerably less human effort. But if the labelled dataset is inadequate in size, semi-supervised techniques also fail. We propose a novel framework in which a small labelled dataset is appropriately augmented using the intelligent learning mechanisms of artificial immune systems to train the proposed model. The model then retrains with the unlabelled data to fortify the learning mechanism. We show that the generative deep framework utilizing artificial immune system principles provides a highly competitive approach for learning in the semi-supervised environment.
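Artificial immune systems typically augment data through clonal selection: promising samples are cloned with small mutations. The sketch below shows only that generic idea; the function name, mutation model, and parameters are illustrative assumptions, and the paper's actual immune-system mechanism is richer than this.

```python
import random

def clonal_augment(sample, n_clones=5, mutation_scale=0.05, seed=42):
    """Clonal-selection-style augmentation: replicate a labelled sample
    and apply small Gaussian mutations to its features, yielding extra
    training points near it while preserving the label."""
    rng = random.Random(seed)
    features, label = sample
    clones = []
    for _ in range(n_clones):
        mutated = [x + rng.gauss(0.0, mutation_scale) for x in features]
        clones.append((mutated, label))
    return clones

labelled = ([0.3, 0.7, 0.1], "class_a")
augmented = clonal_augment(labelled)
print(len(augmented), augmented[0][1])  # 5 class_a
```

In a semi-supervised pipeline, such clones would expand the small labelled set before the model retrains on the unlabelled pool.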