
Showing papers by "Santanu Chaudhury published in 2020"


Posted Content
TL;DR: The proposed network not only predicts whether a CXR shows COVID-19 features but also performs semantic segmentation of the regions of interest, making the model explainable.
Abstract: With the number of COVID-19 cases increasing globally, countries are ramping up testing. While RT-PCR kits are available in sufficient quantity in several countries, others face challenges with the limited availability of testing kits and processing centers in remote areas. This has motivated researchers to find alternative testing methods that are reliable, easily accessible and faster. The chest X-ray is one modality gaining acceptance for screening. Towards this direction, the paper makes two primary contributions. First, we present the COVID-19 Multi-Task Network, an automated end-to-end network for COVID-19 screening. The proposed network not only predicts whether a CXR shows COVID-19 features but also performs semantic segmentation of the regions of interest, making the model explainable. Second, with the help of medical professionals, we manually annotate the lung regions of 9000 frontal chest radiographs taken from ChestXray-14, CheXpert and a consolidated COVID-19 dataset. Further, 200 chest radiographs pertaining to COVID-19 patients are also annotated for semantic segmentation. This database will be released to the research community.
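The network is trained jointly for classification and segmentation. The abstract does not specify the loss, so the sketch below shows one plausible joint objective, cross-entropy for both heads with a weighting term; the function name, the weighting, and the toy inputs are all illustrative assumptions, not the paper's method.

```python
import math

def multitask_loss(cls_prob, cls_label, seg_probs, seg_mask, seg_weight=1.0):
    """Joint objective for a screening network: image-level classification
    loss plus a pixel-wise segmentation loss (illustrative weighting)."""
    eps = 1e-7
    # Binary cross-entropy on the COVID-19 / non-COVID-19 prediction.
    cls_loss = -(cls_label * math.log(cls_prob + eps)
                 + (1 - cls_label) * math.log(1 - cls_prob + eps))
    # Mean binary cross-entropy over the predicted segmentation mask.
    seg_loss = sum(
        -(m * math.log(p + eps) + (1 - m) * math.log(1 - p + eps))
        for p, m in zip(seg_probs, seg_mask)
    ) / len(seg_mask)
    return cls_loss + seg_weight * seg_loss

# Toy example: one image-level prediction and a 4-pixel mask.
loss = multitask_loss(0.9, 1, [0.8, 0.2, 0.7, 0.1], [1, 0, 1, 0])
print(round(loss, 3))
```

The weighting between the two heads is the usual tuning knob in such multi-task setups.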

29 citations


Journal ArticleDOI
TL;DR: The proposed scheme is quite encouraging on sequences with hazy and degraded, partially occluded, and camouflaged objects; performance is evaluated by comparing the scheme with five recent state-of-the-art tracking schemes.
Abstract: One of the well-established research domains among computer vision scientists is object tracking. However, not much work has been done in underwater scenarios. This article addresses the problem of visual tracking in the underwater environment with stationary and nonstationary camera setups. In order to deal with underwater optical dynamics, a dominant color component-based scene representation is employed in the YCbCr color space. An adaptive approach is devised to select the Walsh–Hadamard (WH) kernels for the efficient extraction of color, edge, and texture strengths, and a new feature called range strength is proposed to extract the variation of intensity in the local neighborhood of underwater sequences using the WH kernel. The likelihoods of these feature strengths are integrated in a particle filter framework to track the object of interest in underwater sequences. The reference feature strengths used in assigning weights to the particles are updated based on the Sørensen distance. The coefficients of the feature strengths are calculated in such a way that if one feature fails, its coefficient becomes insignificant, whereas the more suitable features get higher coefficients. The effectiveness of the proposed scheme is evaluated using the underwater video datasets reefVid, fish4knowledge (F4K), underwaterchangedetection (UWCD), and National Oceanic and Atmospheric Administration (NOAA). The performance evaluation is performed by comparing the scheme with five recent state-of-the-art tracking schemes. The quantitative analysis of the proposed scheme is carried out using three evaluation measures: overall intersection over union, centroid location error, and average tracking error. The performance of the proposed scheme is quite encouraging in the case of sequences with hazy and degraded, partially occluded, and camouflaged objects.
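The template-update rule relies on the Sørensen distance between reference and current feature strengths. A minimal sketch of the standard Sørensen distance (sum of absolute differences normalised by total mass); the function name and the thresholding comment are illustrative, not taken from the paper:

```python
import numpy as np

def sorensen_distance(p, q):
    """Sorensen distance between two non-negative feature vectors:
    sum of absolute differences divided by the combined mass."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.abs(p - q).sum() / (p + q).sum()

# Reference strengths would typically be updated only while the tracked
# region still resembles the template, i.e. while the distance stays small.
ref = np.array([0.2, 0.5, 0.3])
cur = np.array([0.25, 0.45, 0.30])
print(round(sorensen_distance(ref, cur), 3))  # 0.05
```

The distance is bounded in [0, 1] for non-negative inputs, which makes it convenient as an update gate.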

13 citations


Book ChapterDOI
01 Jan 2020
TL;DR: The winning solution to the NeurIPS 2018 AutoML challenge is described, entitled AutoGBT, which combines an adaptive self-optimized end-to-end machine learning pipeline based on gradient boosting trees with automatic hyper-parameter tuning using Sequential Model-Based Optimization (SMBO).
Abstract: Data abundance along with scarcity of machine learning experts and domain specialists necessitates progressive automation of end-to-end machine learning workflows. To this end, Automated Machine Learning (AutoML) has emerged as a prominent research area. Real world data often arrives as streams or batches, and data distribution evolves over time causing concept drift. Models need to handle data that is not independent and identically distributed (iid), and transfer knowledge across time through continuous self-evaluation and adaptation adhering to resource constraints. Creating autonomous self-maintaining models which not only discover an optimal pipeline, but also automatically adapt to concept drift to operate in a lifelong learning setting was the crux of NeurIPS 2018 AutoML challenge. We describe our winning solution to the challenge, entitled AutoGBT, which combines an adaptive self-optimized end-to-end machine learning pipeline based on gradient boosting trees with automatic hyper-parameter tuning using Sequential Model-Based Optimization (SMBO). We report experimental results on the challenge datasets as well as several benchmark datasets affected by concept drift and compare it with the baseline model for the challenge and Auto-sklearn. Results indicate the effectiveness of the proposed methodology in this context.
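The abstract stresses adaptation to concept drift, where the data distribution shifts over time. AutoGBT's actual drift-handling mechanism is not described here; purely as a generic illustration, a simple mean-shift drift trigger over sliding windows of a feature might look like this (function name and threshold are assumptions):

```python
import statistics

def drift_detected(window_old, window_new, threshold=0.5):
    """Flag concept drift when the mean of a feature in the new window
    shifts by more than `threshold` standard deviations of the old one."""
    mu_old = statistics.mean(window_old)
    sd_old = statistics.stdev(window_old)
    mu_new = statistics.mean(window_new)
    return abs(mu_new - mu_old) > threshold * sd_old

stable = [1.0, 1.1, 0.9, 1.05, 0.95]
shifted = [2.0, 2.1, 1.9, 2.05, 1.95]
print(drift_detected(stable, shifted))  # True: the mean jumped by ~1.0
```

In a lifelong-learning pipeline such a trigger would prompt model retraining or hyper-parameter re-optimization.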

9 citations


Journal ArticleDOI
TL;DR: An intelligent video skimming technique for lecture video sequences that requires no human intervention is put forward and found to outperform existing schemes.
Abstract: Video skimming is a recently popular technique for preparing previews of long video sequences. Most video skimming techniques in the literature use manual intervention to prepare the preview, and they mostly target the sports and movie industries: in sports, the portions of video where the audience claps are used, while in movies important content is manually selected for the preview. However, hardly any work has been reported on skimming lecture video sequences. Lecture videos are generally recorded indoors under low-illumination, noisy conditions, and the scene content rarely changes much, so designing an automatic skimming scheme is quite a difficult task. In this article, we put forward an intelligent video skimming technique for lecture video sequences that requires no human intervention. In the proposed scheme, the lecture video is initially segmented into a number of shots; we propose a radiometric correlation technique for lecture video segmentation, i.e., finding the shot transitions. After the shot transitions are found, the shots are recognized using a proposed fuzzy K-nearest neighbor technique into three categories: title slides, written texts/displayed slides, and talking heads/writing hands. Three contrast-based features, one existing, average sharpness (AS), and two newly proposed, relative height (RH) and edge potential (EP), are used to find the contents of a frame. Frames with different contrast values are categorized to prepare the video skim, or capsule, and the media recreation is achieved by selecting a set of frames around these selected content frames. The effectiveness of the proposed scheme is demonstrated using five test sequences, three NPTEL and two non-NPTEL. The capsule prepared by the proposed scheme provides a better preview of the actual sequence. The performance of the proposed scheme is tested against three state-of-the-art techniques using three evaluation measures, and it is found to be better than the existing schemes.
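The abstract does not define radiometric correlation precisely. One common realisation of shot-boundary detection in this spirit, a Pearson correlation between intensity histograms of consecutive frames with low values suggesting a cut, can be sketched as follows (function name and thresholds are illustrative):

```python
import numpy as np

def radiometric_correlation(h1, h2):
    """Pearson correlation between intensity histograms of two frames;
    a low value suggests a shot transition between them."""
    h1, h2 = np.asarray(h1, float), np.asarray(h2, float)
    a, b = h1 - h1.mean(), h2 - h2.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# Consecutive frames within one shot have near-identical histograms...
same_shot = radiometric_correlation([10, 40, 30, 20], [12, 38, 31, 19])
# ...while frames across a cut do not.
cut = radiometric_correlation([10, 40, 30, 20], [45, 10, 15, 30])
print(same_shot > 0.9, cut < 0.5)  # True True
```

A shot transition would be declared wherever the correlation falls below a tuned threshold.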

6 citations


Proceedings ArticleDOI
05 Oct 2020
TL;DR: In this article, a spatio-contextual Gaussian mixture model based background subtraction method is used to detect prominent objects among a large group of fishes in a stationary camera setup, and the detected objects are analyzed to determine a predefined number of the most prominent objects in the scene of view.
Abstract: Tracking a fish, or some specific fishes, in a school of fish is quite a challenging task. This could help in understanding the behavior of a fish or a small group of fish in a crowd of different varieties of fishes. In this paper we propose a technique to detect prominent objects among a large group of fishes. The problem is formulated with a stationary camera setup. The moving objects are initially detected by a spatio-contextual Gaussian mixture model based background subtraction method. Further, all the detected objects are analyzed to determine a predefined number of the most prominent objects in the field of view. To characterize the objects we employ a dual-feature framework comprising color and texture features. The overall feature strength is computed by combining the two feature strengths adaptively, so that color gets more weight when color degradation is low and texture gets more weight otherwise. This weight is computed adaptively using prior information about the color degradation phenomenon in the underwater environment. The proposed technique is tested on a large number of underwater videos and is found to perform satisfactorily.
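The adaptive color/texture weighting described above can be sketched as a simple convex blend; the linear form and the `degradation` input (assumed to come from a prior underwater attenuation model) are illustrative assumptions, not the paper's exact formulation:

```python
def combined_strength(color_s, texture_s, degradation):
    """Blend color and texture feature strengths: color dominates when
    color degradation is low, texture takes over as degradation grows.
    `degradation` is assumed to lie in [0, 1]."""
    w_color = 1.0 - degradation
    w_texture = degradation
    return w_color * color_s + w_texture * texture_s

# Clear water: color is trusted.
print(round(combined_strength(0.8, 0.4, degradation=0.1), 2))  # 0.76
# Turbid water: texture dominates.
print(round(combined_strength(0.8, 0.4, degradation=0.9), 2))  # 0.44
```

Because the weights sum to one, the combined strength stays on the same scale as the individual features.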

4 citations


Proceedings ArticleDOI
10 Dec 2020
TL;DR: In this article, an end-to-end approach using an Inception network trained from scratch achieves 80% accuracy in predicting age within 1 year of the ground truth. Attention maps explain which regions of the image the model focuses on while assessing bone age, and the heat maps thus generated match the features used by radiologists in manual prediction.
Abstract: Skeletal bone age assessment is one of the routine radiological procedures performed by paediatricians and endocrinologists for investigating genetic disorders, developmental abnormalities and metabolic complications. In this process, skeletal age is compared against the child's chronological age to uncover discrepancies, if any. Hand radiographs, being the cheapest, most reliable and most widely used modality, are used to predict bone age in children from 1-18 years of age. Conventional methods make use of atlases to predict the age; these are time consuming, tedious and suffer from inter-observer variability. We propose an end-to-end approach which uses an Inception network trained from scratch and achieves 80% accuracy in predicting age within 1 year of the ground truth. Further, attention maps are generated to explain which regions of the image the model focuses on while assessing bone age, and the heat maps thus generated match the features used by radiologists when predicting manually.
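The reported 80% figure is an accuracy-within-tolerance metric. A sketch of how such a metric is typically computed (function name and toy data are illustrative):

```python
def within_one_year_accuracy(pred_ages, true_ages, tolerance=1.0):
    """Fraction of predictions whose absolute error is within
    `tolerance` years of the ground-truth bone age."""
    hits = sum(abs(p - t) <= tolerance for p, t in zip(pred_ages, true_ages))
    return hits / len(true_ages)

preds = [5.2, 8.9, 12.0, 15.5, 7.0]
truth = [5.0, 10.5, 12.4, 15.0, 7.8]
print(within_one_year_accuracy(preds, truth))  # 0.8: four of five within a year
```

Unlike mean absolute error, this metric directly reflects how often a prediction is clinically close enough.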

1 citation


Book ChapterDOI
24 Nov 2020
TL;DR: In this article, a bone-age assessment model using triplet loss for children in 0-3 years of age is proposed, which achieves an AUC of 0.92 for binary and 0.82 for multi-class classification with visible separation in embedding clusters.
Abstract: Skeletal bone age assessment is a routine clinical procedure carried out by paediatricians and endocrinologists for investigating a variety of endocrinological, metabolic, genetic and growth disorders in children. Skeletal maturity advances with changes in the structure and size of the skeletal bones with respect to age. Assessment is commonly done by radiological investigation of the left hand due to its non-dominant use. A discrepancy between skeletal age and chronological age indicates abnormality. In this study, a bone-age assessment model using triplet loss for children in the 0–3 years age group is proposed. Furthermore, this is the first automated bone age assessment study on lower age groups with comparable results, using one tenth of the training data samples required by conventional deep neural networks. We use a small number of radiographs per class from the Digital Hand Atlas Database System (DHA), a publicly available comprehensive X-ray dataset. The trained model achieves an AUC of 0.92 for binary and 0.82 for multi-class classification, with visible separation in the embedding clusters, thereby resulting in correct predictions on the test data set.
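The standard triplet loss, which the model presumably uses in its usual form max(0, d(a,p) − d(a,n) + margin), can be sketched in a few lines (the margin value and toy embeddings are illustrative):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the anchor towards the positive and
    push it at least `margin` beyond the negative in embedding space."""
    d_pos = np.sum((anchor - positive) ** 2)   # squared distance to same class
    d_neg = np.sum((anchor - negative) ** 2)   # squared distance to other class
    return float(max(0.0, d_pos - d_neg + margin))

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # same age class, nearby
n = np.array([1.0, 0.0])   # different class, far away
print(triplet_loss(a, p, n))  # 0.0: the margin constraint is already satisfied
```

Training on triplets rather than individual labels is what allows reasonable embeddings from roughly a tenth of the usual data.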

1 citation


Proceedings ArticleDOI
21 Sep 2020
TL;DR: In this paper, a state trajectory estimator based on ancestor sampling (STEAS) was proposed for gaze data classification and video retrieval; it captures the features of the human temporal gaze pattern to identify the kind of visual stimuli.
Abstract: Human gaze dynamics mainly concern the sequence of occurrence of three eye movements: fixations, saccades, and microsaccades. In this paper, we correlate them, as three different states, to the velocities of eye movements. We build a state trajectory estimator based on ancestor sampling (STEAS) model, which captures the features of the human temporal gaze pattern to identify the kind of visual stimuli. We used a gaze dataset of 72 viewers watching 60 video clips equally split into four visual categories. Uniformly sampled velocity vectors from the training set are used to find the best parameters of the proposed statistical model. The optimized model is then used for both gaze data classification and video retrieval on the test set. We observed a classification accuracy of 93.265% and a mean reciprocal rank of 0.888 for video retrieval on the test set. Hence, this model can be used for viewer-independent video indexing, providing viewers an easier way to navigate through content.
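The retrieval result is reported as a mean reciprocal rank (MRR), a standard measure that averages 1/rank of the correct item over all queries. A minimal sketch (function name and toy ranks are illustrative):

```python
def mean_reciprocal_rank(ranks):
    """Mean reciprocal rank over queries, where `ranks` holds the 1-based
    position of the correct video for each gaze-based query."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# If the correct clip is ranked 1st, 1st, 2nd and 4th across four queries:
print(mean_reciprocal_rank([1, 1, 2, 4]))  # (1 + 1 + 0.5 + 0.25) / 4 = 0.6875
```

An MRR of 0.888, as reported, implies the correct clip usually appears at or very near the top of the ranking.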

1 citation


Proceedings ArticleDOI
01 Feb 2020
TL;DR: A data-adaptive CS framework based on deep learning for image recognition, where sampling considers the global context and the encoding that produces measurements is learned from data, so as to generalize over a large-scale dataset.
Abstract: Compressive sensing (CS) using deep learning for the recovery of images from measurements has been well explored in recent years. Instead of sensing/sampling the full image, block- or patch-based compressive sensing is chosen to overcome memory and computation limitations. The drawback of block-based CS sampling and recovery is that it does not capture the global context and focuses only on the local context, which results in artifacts at the boundary of two consecutive image blocks. Random Gaussian or random Bernoulli matrices are commonly used as sensing matrices to sample an image block and generate corresponding linear measurements. Although random Gaussian and random Bernoulli matrices exhibit the Restricted Isometry Property (RIP), which is a guarantee of a good-quality reconstructed image, their two main disadvantages are: 1) large memory and computational requirements and 2) their encoded measurements do not generalize well to a large-scale dataset. In this paper, we propose a data-adaptive CS framework based on deep learning for image recognition in which 1) sampling is done considering the global context and 2) the encoding used to obtain measurements is learned from data, so as to achieve generalization over a large-scale dataset.
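The conventional baseline the paper argues against, block-based sampling with a random Gaussian sensing matrix, can be sketched as follows (block size, measurement count, and normalisation are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Block-based compressive sensing: each 8x8 image block (64 pixels) is
# flattened and projected to m << 64 measurements by a fixed random
# Gaussian sensing matrix.
n, m = 64, 16                                # block dimension, measurements
phi = rng.normal(size=(m, n)) / np.sqrt(m)   # random Gaussian sensing matrix
block = rng.random(n)                        # one flattened image block
y = phi @ block                              # linear measurements
print(y.shape)                               # (16,)
```

The paper's proposal replaces the fixed matrix phi with an encoding learned from data over the whole image, avoiding the block-boundary artifacts and generalization issues described above.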

1 citation


Book ChapterDOI
01 Jan 2020
TL;DR: This work proposes a novel framework where the small labelled dataset is appropriately augmented using the intelligent learning mechanisms of artificial immune systems to train the proposed model and shows that the generative deep framework utilizing artificial immune system principles provides a highly competitive approach for learning in the semi-supervised environment.
Abstract: Labelled data are not only time consuming but often expensive and difficult to procure, as labelling involves skilful human input to tag and annotate. In contrast, unlabelled data are comparatively easy to procure, but fewer methods exist to use them optimally. Semi-supervised learning overcomes this problem and helps build better classifiers by using unlabelled data along with sufficient labelled data, and it may actually yield higher accuracy with considerably less human effort. But if the labelled dataset is inadequate in size, semi-supervised techniques also fail. We propose a novel framework in which a small labelled dataset is appropriately augmented using the intelligent learning mechanisms of artificial immune systems to train the proposed model. The model then retrains with the unlabelled data to fortify the learning mechanism. We show that the generative deep framework utilizing artificial immune system principles provides a highly competitive approach for learning in the semi-supervised environment.
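Artificial immune systems typically augment data through clonal selection: promising samples are cloned with small mutations. The sketch below shows only that generic idea; the function name, mutation model, and parameters are illustrative assumptions, and the paper's actual immune-system mechanism is richer than this.

```python
import random

def clonal_augment(sample, n_clones=5, mutation_scale=0.05, seed=42):
    """Clonal-selection-style augmentation: replicate a labelled sample
    and apply small Gaussian mutations to its features, yielding extra
    training points near it while preserving the label."""
    rng = random.Random(seed)
    features, label = sample
    clones = []
    for _ in range(n_clones):
        mutated = [x + rng.gauss(0.0, mutation_scale) for x in features]
        clones.append((mutated, label))
    return clones

labelled = ([0.3, 0.7, 0.1], "class_a")
augmented = clonal_augment(labelled)
print(len(augmented), augmented[0][1])  # 5 class_a
```

In a semi-supervised pipeline, such clones would expand the small labelled set before the model retrains on the unlabelled pool.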