Author

Arnav Bhavsar

Bio: Arnav Bhavsar is an academic researcher from the Indian Institute of Technology Mandi. The author has contributed to research in topics including Computer science and Contextual image classification, has an h-index of 13, and has co-authored 106 publications receiving 657 citations. Previous affiliations of Arnav Bhavsar include the University of North Carolina at Chapel Hill and the Indian Institutes of Technology.


Papers
Proceedings ArticleDOI
01 Jul 2017
TL;DR: An approach that uses joint colour-texture features and a classifier ensemble to classify breast histopathology images is proposed, demonstrating a visible classification invariance under cross-magnification training and testing.
Abstract: Breast cancer is one of the most common cancers in women worldwide. It is typically diagnosed via histopathological microscopy imaging, for which image analysis can aid physicians in making a more effective diagnosis. Given the large variability in tissue appearance, images can be acquired at different optical magnifications to better capture discriminative traits. In this paper, we propose an approach which utilizes joint colour-texture features and a classifier ensemble for classifying breast histopathology images. While we demonstrate the effectiveness of the proposed framework, an important objective of this work is to study image classification across different optical magnification levels. We provide interesting experimental results and related discussions, demonstrating a visible classification invariance with cross-magnification training and testing. Along with the magnification-specific model, we also evaluate a magnification-independent model, and compare the two to gain some insights.
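As a hedged illustration of such a pipeline, the sketch below combines per-channel colour histograms with local-binary-pattern (LBP) texture histograms and feeds them to a soft-voting ensemble; the specific features, classifiers, and parameters are illustrative assumptions, not necessarily the paper's exact choices.

```python
# Hypothetical sketch of a joint colour-texture + classifier-ensemble pipeline.
# Feature choices (colour histograms + LBP) and classifiers are illustrative only.
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import local_binary_pattern
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.svm import SVC

def colour_texture_features(rgb_image):
    """Concatenate per-channel colour histograms with an LBP texture histogram.
    Assumes an RGB image with values scaled to [0, 1]."""
    colour_hist = [np.histogram(rgb_image[..., c], bins=32, range=(0, 1))[0]
                   for c in range(3)]
    lbp = local_binary_pattern(rgb2gray(rgb_image), P=8, R=1, method="uniform")
    texture_hist = np.histogram(lbp, bins=10, range=(0, 10))[0]
    feats = np.concatenate(colour_hist + [texture_hist]).astype(float)
    return feats / feats.sum()                      # simple normalisation

def build_ensemble():
    """Soft-voting ensemble of two heterogeneous classifiers."""
    return VotingClassifier(
        estimators=[("svm", SVC(probability=True)),
                    ("rf", RandomForestClassifier(n_estimators=200))],
        voting="soft")

# Usage sketch:
#   X = np.stack([colour_texture_features(img) for img in images])
#   build_ensemble().fit(X, labels)
```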

93 citations

Book ChapterDOI
10 Jun 2008
TL;DR: This work follows a Bayesian framework by modeling the original HR range as a Markov random field (MRF) and proposes the use of an edge-adaptive MRF prior to handle discontinuities.
Abstract: Photonic mixer device (PMD) range cameras are becoming popular as an alternative to algorithmic 3D reconstruction, but their main drawbacks are low resolution (LR) and noise. Recently, some interesting works have addressed resolution enhancement of PMD range data. These works use high-resolution (HR) CCD images or stereo pairs, but such systems require a complex setup and camera calibration. In contrast, we propose a super-resolution method that uses induced camera motion to create an HR range image from multiple LR range images. We follow a Bayesian framework by modeling the original HR range as a Markov random field (MRF). To handle discontinuities, we propose the use of an edge-adaptive MRF prior. Since such a prior renders the energy function non-convex, we minimize it by graduated non-convexity.
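As a hedged sketch of the kind of MAP-MRF energy such a formulation minimizes (the paper's exact data term and edge-adaptive weighting may differ), with z the unknown HR range map, y_k the k-th LR range observation, D_k the combined motion-and-downsampling operator, C the set of neighbouring pixel pairs, and w_{ij} edge-adaptive weights that are reduced across detected discontinuities:

E(z) = \sum_k \frac{1}{2\sigma^2} \| y_k - D_k z \|^2 + \lambda \sum_{(i,j) \in C} w_{ij} \, \rho(z_i - z_j)

Because the discontinuity-preserving penalty \rho makes E(z) non-convex, graduated non-convexity minimizes it by starting from a convex approximation of \rho and tracking the minimizer through progressively less convex approximations.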

57 citations

Proceedings ArticleDOI
29 Apr 2020
TL;DR: This work analyzes several deep learning approaches in the context of deepfakes classification in high compression scenarios and demonstrates that a proposed approach based on metric learning can be very effective in performing such a classification.
Abstract: With the arrival of several face-swapping applications such as FaceApp, SnapChat, MixBooth, FaceBlender and many more, the authenticity of digital media content is hanging on a very loose thread. On social media platforms, videos are widely circulated, often at a high compression factor. In this work, we analyze several deep learning approaches in the context of deepfakes classification in high compression scenarios and demonstrate that a proposed approach based on metric learning can be very effective in performing such a classification. Using fewer frames per video to assess its realism, the metric-learning approach using a triplet network architecture proves to be fruitful. It learns to increase the feature-space distance between the clusters of embedding vectors of real and fake videos. We validated our approaches on two datasets to analyze the behavior in different environments. We achieved a state-of-the-art AUC score of 99.2% on the Celeb-DF dataset and an accuracy of 90.71% on a highly compressed Neural Texture dataset. Our approach is especially helpful on social media platforms, where data compression is inevitable.
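A minimal sketch of triplet-based metric learning on face-frame embeddings, assuming a small convolutional encoder and PyTorch's built-in TripletMarginLoss; the encoder architecture and hyper-parameters are illustrative, not the paper's exact network.

```python
# Hypothetical triplet-network sketch for separating real and fake embeddings.
import torch
import torch.nn as nn

class FrameEncoder(nn.Module):
    """Tiny illustrative encoder mapping a face crop to a unit-norm embedding."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(64, embed_dim)

    def forward(self, x):
        z = self.features(x).flatten(1)
        return nn.functional.normalize(self.fc(z), dim=1)

encoder = FrameEncoder()
criterion = nn.TripletMarginLoss(margin=1.0)

# Anchor and positive come from the same class (e.g. real frames),
# the negative from the other class (fake frames).
anchor = torch.randn(8, 3, 64, 64)
positive = torch.randn(8, 3, 64, 64)
negative = torch.randn(8, 3, 64, 64)
loss = criterion(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()   # pulls same-class embeddings together, pushes the classes apart
```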

51 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: Considering the dependency that exists among the layers of a deep network, a sequential framework is proposed which utilizes multi-layered deep features extracted from a fine-tuned DenseNet and yields better performance, in most cases, than the typically used highest-layer features.
Abstract: Computerized approaches for automated classification of histopathology images can help in reducing the manual observational workload of pathologists. In recent years, as in other areas, deep networks have also attracted attention for histopathology image analysis. However, existing approaches have paid little attention to exploring multi-layer features for improving classification. We believe that considering multi-layered features is important, as different regions in the images, which are in turn at different magnifications, may contain useful discriminative information at different levels of the hierarchy. Considering the dependency that exists among the layers of a deep network, we propose a sequential framework which utilizes multi-layered deep features extracted from a fine-tuned DenseNet. A decision is made by a layer for a sample only if it passes a pre-defined confidence cut-off for that layer; otherwise, the sample is passed on to the next layer. Various experiments on the publicly available BreaKHis dataset demonstrate that the proposed framework yields better performance, in most cases, than the typically used highest-layer features. We also compare results with a framework where each layer is treated independently. This indicates that low- and mid-level features also carry useful discriminative information when explicitly considered. We also demonstrate an improved performance over various state-of-the-art methods.
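A minimal sketch of the layer-wise confidence cascade described above, assuming one already-fitted classifier per DenseNet feature level and a per-layer confidence cut-off; the function and variable names are hypothetical.

```python
# Hypothetical confidence cascade over per-layer classifiers.
import numpy as np

def sequential_decision(layer_features, layer_classifiers, thresholds):
    """Walk the layers in order; accept a layer's prediction only if its
    top-class probability clears that layer's cut-off, otherwise defer."""
    last_probs = None
    for feats, clf, tau in zip(layer_features, layer_classifiers, thresholds):
        probs = clf.predict_proba(feats.reshape(1, -1))[0]
        if probs.max() >= tau:
            return int(np.argmax(probs))       # confident enough: stop here
        last_probs = probs
    return int(np.argmax(last_probs))          # fall back to the final layer

# layer_features: feature vectors from low to high DenseNet layers for one sample;
# layer_classifiers: matching list of fitted scikit-learn style classifiers.
```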

51 citations

Journal ArticleDOI
TL;DR: This paper proposes an integrated approach to estimate the HR depth and the SR image from multiple LR stereo observations and demonstrates the efficacy of the proposed method not only in bringing out image details but also in enhancing the HR depth over its LR counterpart.
Abstract: Under stereo settings, the twin problems of image superresolution (SR) and high-resolution (HR) depth estimation are intertwined. The subpixel registration information required for image superresolution is tightly coupled to the 3D structure. The effects of parallax and pixel averaging (inherent in the downsampling process) preclude a priori estimation of pixel motion for superresolution. These factors also compound the correspondence problem at low resolution (LR), which in turn affects the quality of the LR depth estimates. In this paper, we propose an integrated approach to estimate the HR depth and the SR image from multiple LR stereo observations. Our results demonstrate the efficacy of the proposed method in not only being able to bring out image details but also in enhancing the HR depth over its LR counterpart.
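As a hedged sketch of the general shape of such a joint estimation (the paper's precise observation model and priors may differ), with x the SR image, d the HR depth map, y_k the k-th LR stereo observation, D the downsampling-and-averaging operator, and W_k(d) the depth-dependent warp encoding the parallax-induced sub-pixel motion:

E(x, d) = \sum_k \| y_k - D \, W_k(d) \, x \|^2 + \lambda_x P(x) + \lambda_d P(d)

where P(x) and P(d) are regularizers on the image and depth, and the coupling through W_k(d) is why the two quantities are estimated together rather than sequentially.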

47 citations


Cited by
Journal ArticleDOI
TL;DR: In this mini-review, the application of digital pathological image analysis using machine learning algorithms is introduced, some problems specific to such analysis are addressed, and possible solutions are proposed.
Abstract: Abundant accumulation of digital histopathological images has led to the increased demand for their analysis, such as computer-aided diagnosis using machine learning techniques. However, digital pathological images and related tasks have some issues to be considered. In this mini-review, we introduce the application of digital pathological image analysis using machine learning algorithms, address some problems specific to such analysis, and propose possible solutions.

545 citations

Proceedings ArticleDOI
13 Jun 2010
TL;DR: The surprising result that 3D scans of reasonable quality can be obtained even with a sensor of such low data quality is shown, using a new combination of a 3D superresolution method with a probabilistic scan alignment approach that explicitly takes into account the sensor's noise characteristics.
Abstract: We describe a method for 3D object scanning by aligning depth scans that were taken from around an object with a time-of-flight camera. These ToF cameras can measure depth scans at video rate. Due to their comparably simple technology, they bear potential for low-cost production in large volumes. Our easy-to-use, cost-effective scanning solution based on such a sensor could make 3D scanning technology more accessible to everyday users. The algorithmic challenge we face is that the sensor's level of random noise is substantial and there is a non-trivial systematic bias. In this paper we show the surprising result that 3D scans of reasonable quality can nevertheless be obtained with a sensor of such low data quality. Established filtering and scan alignment techniques from the literature fail to achieve this goal. In contrast, our algorithm is based on a new combination of a 3D superresolution method with a probabilistic scan alignment approach that explicitly takes into account the sensor's noise characteristics.
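As a hedged illustration of what a noise-aware probabilistic alignment objective generally looks like (not the paper's specific sensor model), each correspondence (p_i, q_i) between two scans is weighted by its estimated measurement variance \sigma_i^2 when searching for the rigid transform T:

T^* = \arg\min_T \sum_i \frac{\| q_i - T p_i \|^2}{\sigma_i^2}

so that unreliable, noisy depth samples contribute less to the estimated alignment.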

308 citations

Book ChapterDOI
TL;DR: The siamese neural network architecture is described and its main applications in a number of computational fields since its appearance in 1994 are outlined, along with the programming languages, software packages, tutorials, and guides that readers can use in practice to implement this powerful machine learning model.
Abstract: Similarity has always been a key aspect in computer science and statistics. Any time two element vectors are compared, many different similarity approaches can be used, depending on the final goal of the comparison (Euclidean distance, Pearson correlation coefficient, Spearman's rank correlation coefficient, and others). But if the comparison has to be applied to more complex data samples, with features having different dimensionality and types which might need compression before processing, these measures would be unsuitable. In these cases, a siamese neural network may be the best choice: it consists of two identical artificial neural networks each capable of learning the hidden representation of an input vector. The two neural networks are both feedforward perceptrons, and employ error back-propagation during training; they work in tandem and compare their outputs at the end, usually through a cosine distance. The output generated by a siamese neural network execution can be considered the semantic similarity between the projected representation of the two input vectors. In this overview we first describe the siamese neural network architecture, and then we outline its main applications in a number of computational fields since its appearance in 1994. Additionally, we list the programming languages, software packages, tutorials, and guides that can be practically used by readers to implement this powerful machine learning model.
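A minimal PyTorch sketch of the twin-network idea described above, with a single shared feed-forward encoder applied to both inputs and a cosine-similarity comparison at the end; layer sizes are illustrative.

```python
# Hypothetical siamese-network sketch: one shared encoder applied to both inputs,
# outputs compared by cosine similarity.
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    def __init__(self, in_dim=256, hidden=128, embed=64):
        super().__init__()
        # A single encoder: weight sharing is what makes the two branches "twins".
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, embed))
        self.cosine = nn.CosineSimilarity(dim=1)

    def forward(self, a, b):
        return self.cosine(self.encoder(a), self.encoder(b))

net = SiameseNet()
x1, x2 = torch.randn(4, 256), torch.randn(4, 256)
similarity = net(x1, x2)       # one value in [-1, 1] per input pair
```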

281 citations

Journal ArticleDOI
01 Jul 2012
TL;DR: This work presents a technique for performing high-dimensional filtering of images and videos in real time by computing the filter's response at a reduced set of sampling points and using these for interpolation at all N input pixels, and shows that for a proper choice of these sampling points the total cost of the filtering operation is linear in both N and the dimension of the space in which the filter operates.
Abstract: We present a technique for performing high-dimensional filtering of images and videos in real time. Our approach produces high-quality results and accelerates filtering by computing the filter's response at a reduced set of sampling points, and using these for interpolation at all N input pixels. We show that for a proper choice of these sampling points, the total cost of the filtering operation is linear both in N and in the dimension d of the space in which the filter operates. As such, ours is the first high-dimensional filter with such a complexity. We present formal derivations for the equations that define our filter, as well as for an algorithm to compute the sampling points. This provides a sound theoretical justification for our method and for its properties. The resulting filter is quite flexible, being capable of producing responses that approximate either standard Gaussian, bilateral, or non-local-means filters. Such flexibility also allows us to demonstrate the first hybrid Euclidean-geodesic filter that runs in a single pass. Our filter is faster and requires less memory than previous approaches, being able to process a 10-Megapixel full-color image at 50 fps on modern GPUs. We illustrate the effectiveness of our approach by performing a variety of tasks ranging from edge-aware color filtering in 5-D and noise reduction (using up to 147 dimensions) to single-pass hybrid Euclidean-geodesic filtering and detail enhancement.
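A hedged toy illustration of the underlying cost-saving trick, evaluating a filter only at a reduced set of sample points and interpolating at all N positions; this 1-D Gaussian example is not the paper's actual high-dimensional algorithm, just the sample-then-interpolate idea.

```python
# Toy 1-D illustration: compute a Gaussian filter response only at every
# `step`-th position, then linearly interpolate back to all N positions.
import numpy as np

def sampled_gaussian_filter(signal, sigma=3.0, step=4):
    n = len(signal)
    radius = int(3 * sigma)
    kernel = np.exp(-0.5 * (np.arange(-radius, radius + 1) / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(signal, radius, mode="edge")

    sample_idx = np.arange(0, n, step)
    # Exact filter response only at the sampled positions...
    sampled = np.array([np.dot(padded[i:i + 2 * radius + 1], kernel)
                        for i in sample_idx])
    # ...then cheap linear interpolation at every one of the N positions.
    return np.interp(np.arange(n), sample_idx, sampled)

noisy = np.sin(np.linspace(0, 6, 200)) + 0.1 * np.random.randn(200)
smooth = sampled_gaussian_filter(noisy)
```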

273 citations