
Showing papers by "Rafael Molina published in 2019"


Journal ArticleDOI
TL;DR: This paper introduces a new generator network optimized for the VSR problem, named VSRResNet, along with a new discriminator architecture to properly guide VSRResNet during GAN training, and introduces the PercepDist metric, which evaluates the perceptual quality of SR solutions obtained from neural networks more accurately than the commonly used PSNR/SSIM metrics.
Abstract: Video super-resolution (VSR) has become one of the most critical problems in video processing. In the deep learning literature, recent works have shown the benefits of using adversarial and perceptual losses to improve performance on various image restoration tasks; however, these have yet to be applied to video super-resolution. In this paper, we propose a generative adversarial network (GAN)-based formulation for VSR. We introduce a new generator network optimized for the VSR problem, named VSRResNet, along with a new discriminator architecture to properly guide VSRResNet during GAN training. We further enhance our VSR GAN formulation with two regularizers, distance losses in feature space and pixel space, to obtain our final VSRResFeatGAN model. We show that pre-training our generator with only the mean-squared-error loss already surpasses the current state-of-the-art VSR models quantitatively. We then employ the PercepDist metric to compare the state-of-the-art VSR models, and show that it evaluates the perceptual quality of SR solutions obtained from neural networks more accurately than the commonly used PSNR/SSIM metrics. Finally, we show that our proposed VSRResFeatGAN model outperforms the current state-of-the-art SR models, both quantitatively and qualitatively.

112 citations
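To make the loss construction above concrete: a minimal PyTorch sketch of how an adversarial term can be combined with the feature-space and pixel-space distances described in the abstract. All names, weights, and signatures here are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def vsr_gan_generator_loss(sr, hr, disc_logits, feat_sr, feat_hr,
                           w_adv=1e-3, w_feat=1.0, w_pix=1.0):
    """Combine the three terms described in the abstract: an adversarial
    term plus distances in feature space and pixel space.
    disc_logits: discriminator outputs on the super-resolved frames.
    feat_sr/feat_hr: feature maps from a fixed pretrained network."""
    adv = F.binary_cross_entropy_with_logits(
        disc_logits, torch.ones_like(disc_logits))  # try to fool the discriminator
    feat = F.mse_loss(feat_sr, feat_hr)             # feature-space distance
    pix = F.mse_loss(sr, hr)                        # pixel-space distance
    return w_adv * adv + w_feat * feat + w_pix * pix
```

The MSE-only pre-training mentioned in the abstract corresponds to optimizing the `pix` term alone before the adversarial and feature terms are switched on.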


Journal ArticleDOI
TL;DR: A new crowdsourcing model and inference procedure are introduced which train a Gaussian Process classifier using the noisy labels provided by the annotators; the resulting model can predict the class of new samples and assess the expertise of the involved annotators.

36 citations


Journal ArticleDOI
TL;DR: The achieved detection accuracy defines a new state of the art in object detection on PMMWIs, and the low computational training and testing costs of the solution allow its use in real-time applications.
Abstract: Passive millimeter wave images (PMMWIs) can be used to detect and localize objects concealed under clothing. Unfortunately, the quality of the acquired images and the unknown position, shape, and size of the hidden objects render these tasks challenging. In this paper, we discuss a deep learning approach to this detection/localization problem. The effect of the nonstationary acquisition noise on different architectures is analyzed and discussed. A comparison with shallow architectures is also presented. The achieved detection accuracy defines a new state of the art in object detection on PMMWIs. The low computational training and testing costs of the solution allow its use in real-time applications.

32 citations


Journal ArticleDOI
TL;DR: A novel family of morphological descriptors, extracted in the appropriate image space and combined with shallow and deep Gaussian-process-based classifiers, improves early prostate cancer diagnosis and is competitive with state-of-the-art CNN architectures, both on the proposed SICAPv1 database and on an external database.

25 citations


Journal ArticleDOI
TL;DR: This work introduces two scalable and efficient GP-based crowdsourcing methods that allow for processing previously prohibitive datasets, and compares them with state-of-the-art probabilistic approaches on synthetic and real crowdsourcing datasets of different sizes.

17 citations


Journal ArticleDOI
TL;DR: This paper proposes the use of a spike-and-slab prior, together with an efficient variational Expectation Maximization (EM) inference scheme, to estimate the blur in an image, and investigates the behavior of the prior experimentally.

10 citations
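For reference, a common form of the spike-and-slab prior named above, written for a single coefficient; the paper's exact parameterization may differ:

```latex
% Spike-and-slab prior: a point mass at zero (the "spike")
% mixed with a broad Gaussian (the "slab").
p(x_i \mid w_i) = (1 - w_i)\,\delta(x_i) + w_i\,\mathcal{N}(x_i \mid 0, \sigma^2),
\qquad w_i \sim \mathrm{Bernoulli}(\pi)
```

The binary indicators $w_i$ switch coefficients off exactly, which is what makes this family of priors attractive for enforcing sparsity during blur estimation.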


Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper proposes a fast, efficient post-processing method based on fine-tuning that enhances the solution originally provided by a neural network, maintaining its restoration quality while reducing the observed artifacts, as measured both qualitatively and quantitatively.
Abstract: While Deep Neural Networks trained for solving inverse imaging problems (such as super-resolution, denoising, or inpainting tasks) regularly achieve new state-of-the-art restoration performance, this increase in performance is often accompanied by undesired artifacts in their solutions. These artifacts are usually specific to the type of neural network architecture, training, or test input image used for the inverse imaging problem at hand. In this paper, we propose a fast, efficient post-processing method for reducing these artifacts. Given a test input image and its known image formation model, we fine-tune the parameters of the trained network and iteratively update them using a data consistency loss. We show that, in addition to being efficient and applicable to a large variety of problems, our post-processing-through-fine-tuning approach enhances the solution originally provided by the neural network, maintaining its restoration quality while reducing the observed artifacts, as measured qualitatively and quantitatively.

6 citations
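A minimal sketch of the post-processing loop described in the abstract, assuming a trained network `net`, a degraded test input `y`, and a callable image formation model `A` (all placeholders supplied by the user):

```python
import torch

def finetune_on_test_image(net, y, A, steps=50, lr=1e-5):
    """Post-processing through fine-tuning: update the trained network so
    that re-degrading its output stays consistent with the observation."""
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        x_hat = net(y)                          # current restored estimate
        loss = torch.mean((A(x_hat) - y) ** 2)  # data consistency loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return net(y).detach()
```

The step count and learning rate are illustrative; note that the method needs no extra training data, only the test image and its known formation model.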


Posted Content
TL;DR: The approach, referred to as scalable variational Gaussian processes for crowdsourcing (SVGPCR), brings GP-based methods back to a state-of-the-art level and excels at uncertainty quantification.
Abstract: In recent years, crowdsourcing has been transforming the way classification training sets are obtained. Instead of relying on a single expert annotator, crowdsourcing shares the labelling effort among a large number of collaborators. For instance, this is being applied to the data acquired by the Laser Interferometer Gravitational-Wave Observatory (LIGO), in order to detect glitches which might hinder the identification of true gravitational waves. The crowdsourcing scenario poses new challenging difficulties, as it deals with different opinions from a heterogeneous group of annotators with unknown degrees of expertise. Probabilistic methods, such as Gaussian Processes (GPs), have proven successful in modeling this setting. However, GPs do not scale well to large data sets, which hampers their broad adoption in practice (in particular at LIGO). This has led to the recent introduction of deep learning based crowdsourcing methods, which have become the state of the art. However, the accurate uncertainty quantification of GPs has been partially sacrificed. This is an important aspect for astrophysicists at LIGO, since a glitch detection system should provide very accurate probability distributions of its predictions. In this work, we leverage the most popular sparse GP approximation to develop a novel GP based crowdsourcing method that factorizes into mini-batches. This makes it able to cope with previously prohibitive data sets. The approach, which we refer to as Scalable Variational Gaussian Processes for Crowdsourcing (SVGPCR), brings GP-based methods back to the state of the art and excels at uncertainty quantification. SVGPCR is shown to outperform deep learning based methods and previous probabilistic approaches when applied to the LIGO data. Moreover, its behavior and main properties are carefully analyzed in a controlled experiment based on the MNIST data set.

5 citations
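The crowdsourcing-specific ingredient here is the annotator noise model. Below is a hedged sketch of an expected log-likelihood term that factorizes over samples, and hence over mini-batches, assuming per-annotator confusion matrices and class probabilities coming from the sparse GP posterior; the tensor layouts are assumptions, not the authors' implementation.

```python
import torch

def crowd_log_likelihood(class_probs, annotations, confusion):
    """class_probs: (B, K)   posterior probabilities of the true class
    annotations:    (B, A)   label from each of A annotators (-1 = missing)
    confusion:      (A, K, K) confusion[a, k, c] = p(annotator a says c | true k)
    Returns the mini-batch expected log-likelihood; because it is a sum
    over samples, it supports stochastic (mini-batch) optimization."""
    B, A = annotations.shape
    total = class_probs.new_zeros(())
    for a in range(A):
        mask = annotations[:, a] >= 0
        if mask.any():
            labels = annotations[mask, a]      # observed labels of annotator a
            lik = confusion[a][:, labels].T    # (b, K): p(label | each true class)
            total = total + (class_probs[mask] * torch.log(lik + 1e-12)).sum()
    return total
```

Averaging the diagonal of an annotator's estimated confusion matrix gives one simple read-out of that annotator's reliability.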


Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work proposes to pseudo-invert the image formation model, with regularization, using GANs and perceptual losses, and additionally introduces two feature losses used to obtain perceptually improved high-resolution images.
Abstract: While high and ultra-high definition displays are becoming popular, most of the available content has been acquired at much lower resolutions. In this work we propose to pseudo-invert the image formation model, with regularization, using GANs and perceptual losses. Our model, which does not require the use of motion compensation, explicitly utilizes the low-resolution image formation model and additionally introduces two feature losses which are used to obtain perceptually improved high-resolution images. The experimental validation shows that our approach outperforms current learning-based video super-resolution models.

4 citations


Proceedings ArticleDOI
12 May 2019
TL;DR: This paper aims to train a GAN guided by a spatially adaptive loss function and demonstrates that the learned model achieves improved results with sharper images, fewer artifacts and less noise.
Abstract: Deep Learning techniques, and more specifically Generative Adversarial Networks (GANs), have recently been used for solving the video super-resolution (VSR) problem. In some of the published works, feature-based perceptual losses have also been used, with promising results. While there has been work in the literature incorporating temporal information into the loss function, studies which make use of spatial activity to improve GAN models are still lacking. Towards this end, this paper aims to train a GAN guided by a spatially adaptive loss function. Experimental results demonstrate that the learned model achieves improved results with sharper images, fewer artifacts, and less noise.

3 citations
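One plausible reading of a spatially adaptive loss (an assumption for illustration; the paper's exact formulation may differ) weights the per-pixel error by the local spatial activity of the ground truth, so that textured regions contribute more than flat ones:

```python
import torch
import torch.nn.functional as F

def spatially_adaptive_l2(sr, hr, win=7, eps=1e-3):
    """Weight the squared error with a local-variance activity map
    computed on the ground-truth frame hr (shape: N x C x H x W)."""
    mean = F.avg_pool2d(hr, win, stride=1, padding=win // 2)
    var = F.avg_pool2d(hr ** 2, win, stride=1, padding=win // 2) - mean ** 2
    weight = var.clamp(min=0) + eps   # spatial activity, strictly positive
    weight = weight / weight.mean()   # normalize to keep the loss scale stable
    return (weight * (sr - hr) ** 2).mean()
```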


Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work formulates the blind color deconvolution problem within the Bayesian framework and takes into account the similarity to a given reference color-vector matrix as well as, through a total variation prior, spatial relations among the concentration pixels.
Abstract: In digital brightfield microscopy, tissues are usually stained with two or more dyes. Color deconvolution aims at separating multi-stained images into single stained images. We formulate the blind color deconvolution problem within the Bayesian framework. Our model takes into account the similarity to a given reference color-vector matrix and spatial relations among the concentration pixels by a total variation prior. It utilizes variational inference and an evidence lower bound to estimate all the latent variables. The proposed algorithm is tested on real images and compared with classical and state-of-the-art color deconvolution algorithms.
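As a point of reference for the blind method above, the classical non-blind baseline simply inverts the Beer-Lambert model with a fixed reference color-vector matrix. A minimal NumPy sketch follows; the paper additionally estimates the color vectors themselves and regularizes the concentrations with a total variation prior.

```python
import numpy as np

def color_deconvolve(rgb, M_ref):
    """rgb:   (H, W, 3) image with values in (0, 1]
    M_ref:    (3, S) reference color-vector matrix, one column per stain
    Returns   (H, W, S) stain concentration maps."""
    od = -np.log(np.clip(rgb, 1e-6, 1.0))            # Beer-Lambert optical density
    C = od.reshape(-1, 3) @ np.linalg.pinv(M_ref).T  # least-squares unmixing
    return C.reshape(rgb.shape[0], rgb.shape[1], -1)
```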

Proceedings ArticleDOI
01 Sep 2019
TL;DR: Experimental results show that the proposed semantic prior based Generative Adversarial Network model for video super-resolution is advantageous in sharpening video frames, reducing noise and artifacts, and recovering realistic textures.
Abstract: Semantic information is widely used in the deep learning literature to improve the performance of visual media processing. In this work, we propose a semantic prior based Generative Adversarial Network (GAN) model for video super-resolution. The model fully utilizes various texture styles from different semantic categories of video-frame patches, contributing to more accurate and efficient learning for the generator. Based on the GAN framework, we introduce the semantic prior by making use of the spatial feature transform during the learning process of the generator. The patch-wise semantic prior is extracted on the whole video frame by a semantic segmentation network. A hybrid loss function is designed to guide the learning performance. Experimental results show that our proposed model is advantageous in sharpening video frames, reducing noise and artifacts, and recovering realistic textures.
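The spatial feature transform mentioned in the abstract modulates generator features with a per-pixel scale and shift predicted from the semantic condition maps. A minimal PyTorch sketch, with illustrative channel sizes:

```python
import torch.nn as nn

class SFTLayer(nn.Module):
    """Spatial feature transform: semantic condition maps predict an
    affine (scale, shift) modulation for each feature-map location."""
    def __init__(self, feat_ch=64, cond_ch=32):
        super().__init__()
        self.scale = nn.Sequential(
            nn.Conv2d(cond_ch, feat_ch, 1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 1))
        self.shift = nn.Sequential(
            nn.Conv2d(cond_ch, feat_ch, 1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 1))

    def forward(self, feat, cond):
        # cond: segmentation-derived maps at the same spatial size as feat
        return feat * (1 + self.scale(cond)) + self.shift(cond)
```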

01 Jan 2019
TL;DR: This paper proposes an efficient, fully self-supervised approach to remove the observed artifacts; the method is applied to image and video super-resolution neural networks, and the proposed framework consistently enhances the solution originally provided by the neural network.
Abstract: While Deep Neural Networks (DNNs) trained for image and video super-resolution regularly achieve new state-of-the-art performance, they also suffer from significant drawbacks. One of their limitations is their tendency to generate strong artifacts in their solution. This may occur when the low-resolution image formation model does not match that seen during training. Artifacts also regularly arise when training Generative Adversarial Networks for inverse imaging problems. In this paper, we propose an efficient, fully self-supervised approach to remove the observed artifacts. More specifically, at test time, given an image and its known image formation model, we fine-tune the parameters of the trained network and iteratively update them using a data consistency loss. We apply our method to image and video super-resolution neural networks and show that our proposed framework consistently enhances the solution originally provided by the neural network.

Posted Content
TL;DR: In this paper, a self-supervised fine-tuning approach is proposed to correct a sub-optimal super-resolution solution by entirely relying on internal learning at test time.
Abstract: While Convolutional Neural Networks (CNNs) trained for image and video super-resolution (SR) regularly achieve new state-of-the-art performance, they also suffer from significant drawbacks. One of their limitations is their lack of robustness to unseen image formation models during training. Other limitations include the generation of artifacts and hallucinated content when training Generative Adversarial Networks (GANs) for SR. While the Deep Learning literature focuses on presenting new training schemes and settings to resolve these various issues, we show that one can avoid training and correct for SR results with a fully self-supervised fine-tuning approach. More specifically, at test time, given an image and its known image formation model, we fine-tune the parameters of the trained network and iteratively update them using a data fidelity loss. We apply our fine-tuning algorithm on multiple image and video SR CNNs and show that it can successfully correct for a sub-optimal SR solution by entirely relying on internal learning at test time. We apply our method on the problem of fine-tuning for unseen image formation models and on removal of artifacts introduced by GANs.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work proposes a new Convolutional Neural Network for video super-resolution which is robust to multiple degradation models and uses the pseudo-inverse image formation model as part of the network architecture during training.
Abstract: With the increasing popularity of high and ultra-high definition displays, the need to improve the quality of content already obtained at much lower resolutions has grown. Since current video super-resolution methods are trained with a single degradation model (usually bicubic downsampling), they are not robust to a mismatch between training and testing degradation models, in which case their performance deteriorates. In this work we propose a new Convolutional Neural Network for video super-resolution which is robust to multiple degradation models and uses the pseudo-inverse image formation model as part of the network architecture during training. The experimental validation shows that our approach outperforms current state-of-the-art methods.
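A toy worked example of the pseudo-inverse image formation model idea: for non-overlapping box-filter downsampling by factor s, A A^T = (1/s) I, so the Moore-Penrose pseudo-inverse A^+ = A^T (A A^T)^{-1} reduces to s A^T, i.e. replication of each low-resolution sample. The 1-D NumPy check below is illustrative only; the paper's operator involves 2-D blur kernels.

```python
import numpy as np

s, n = 2, 8                  # downsampling factor, signal length
A = np.zeros((n // s, n))
for i in range(n // s):
    A[i, i * s:(i + 1) * s] = 1.0 / s     # average each window of s samples

assert np.allclose(np.linalg.pinv(A), s * A.T)   # A+ = s * A^T for this A

y = A @ np.arange(n, dtype=float)   # low-resolution observation
x0 = np.linalg.pinv(A) @ y          # pseudo-inverse back-projection
```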

Book ChapterDOI
14 Nov 2019
TL;DR: The optical density of each whole-slide image is calculated and its eosin and hematoxylin concentration components are estimated; hand-crafted features, expected to capture the expertise of pathologists, are then extracted from patches of these two concentration components.
Abstract: The increasing use of whole-slide digital scanners has led to an enormous interest in the application of machine learning techniques to detect prostate cancer using eosin- and hematoxylin-stained histopathological images. In this work the above problem is approached as follows: the optical density of each whole-slide image is calculated and its eosin and hematoxylin concentration components estimated. Then, hand-crafted features, which are expected to capture the expertise of pathologists, are extracted from patches of these two concentration components. Finally, patches are classified using a Deep Gaussian Process on the extracted features. The new approach outperforms current state-of-the-art shallow as well as deep classifiers like InceptionV3, Xception, and VGG19, with an AUC value higher than 0.98.
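A sketch of the preprocessing step described above, using scikit-image's standard Ruifrok-Johnston stain separation; the paper may rely on its own reference color matrix, so treat this as an approximation of the pipeline's first stage.

```python
from skimage.color import rgb2hed

def stain_concentrations(rgb):
    """Convert an RGB patch to optical-density space and split it into
    hematoxylin and eosin concentration components; hand-crafted
    features would then be extracted from these two maps."""
    hed = rgb2hed(rgb)                 # channels: hematoxylin, eosin, DAB
    return hed[..., 0], hed[..., 1]    # keep the H and E components
```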

Proceedings ArticleDOI
01 Sep 2019
TL;DR: A new framework is proposed which incorporates two collaborative discriminators that jointly improve the quality of the reconstructed video sequence; it outperforms current state-of-the-art models and obtains super-resolved frames.
Abstract: Generative Adversarial Networks (GANs) have been used for solving the video super-resolution problem. So far, GAN-based video super-resolution methods have used the traditional GAN framework, which consists of a single generator and a single discriminator trained against each other. In this work we propose a new framework which incorporates two collaborative discriminators whose aim is to jointly improve the quality of the reconstructed video sequence. While one discriminator concentrates on general properties of the images, the second one specializes in obtaining realistically reconstructed features, such as edges. Experimental results demonstrate that the learned model outperforms current state-of-the-art models and obtains super-resolved frames with fine details, sharp edges, and fewer artifacts.
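A hedged sketch of how a generator can be trained against two collaborative discriminators, one judging whole frames and one judging reconstructed edges; the edge extractor, losses, and weights are illustrative choices, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

def sobel_edges(x):
    """Cheap edge maps from an image batch (N, C, H, W)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    k = torch.stack([kx, kx.t()]).unsqueeze(1)   # (2, 1, 3, 3) Sobel pair
    gray = x.mean(dim=1, keepdim=True)
    return F.conv2d(gray, k.to(x), padding=1)

def two_discriminator_generator_loss(sr, hr, d_image, d_edge,
                                     w_img=1e-3, w_edge=1e-3):
    """Pixel loss plus adversarial feedback from both discriminators."""
    pix = F.mse_loss(sr, hr)
    logits_img = d_image(sr)               # judges general image properties
    logits_edge = d_edge(sobel_edges(sr))  # specializes on edge realism
    adv_img = F.binary_cross_entropy_with_logits(
        logits_img, torch.ones_like(logits_img))
    adv_edge = F.binary_cross_entropy_with_logits(
        logits_edge, torch.ones_like(logits_edge))
    return pix + w_img * adv_img + w_edge * adv_edge
```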