
Upsampling

About: Upsampling is a research topic. Over the lifetime, 2426 publications have been published within this topic receiving 57613 citations.
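As a minimal illustration of the topic itself: the simplest upsampling scheme is nearest-neighbor interpolation, which repeats each pixel along both spatial axes. The sketch below (function name `upsample_nearest` is illustrative, not from any paper listed here) shows it in NumPy:

```python
import numpy as np

def upsample_nearest(img, factor):
    """Nearest-neighbor upsampling: repeat each pixel `factor`
    times along both spatial axes."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

x = np.array([[1, 2],
              [3, 4]])
y = upsample_nearest(x, 2)
# each original pixel becomes a 2x2 block in the 4x4 output
```

More sophisticated methods (bilinear, bicubic, learned transposed convolutions) differ in how they fill in the new samples, but all share this resolution-increasing structure.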


Papers
Posted Content
TL;DR: A deep interpretation of this framework is presented that achieves state-of-the-art results on face hallucination under challenging scenarios, together with a new super-resolution loss function that combines reconstruction error with a learned face quality measure in an adversarial setting.
Abstract: Face hallucination, which is the task of generating a high-resolution face image from a low-resolution input image, is a well-studied problem that is useful in widespread application areas. Face hallucination is particularly challenging when the input face resolution is very low (e.g., 10 x 12 pixels) and/or the image is captured in an uncontrolled setting with large pose and illumination variations. In this paper, we revisit the algorithm introduced in [1] and present a deep interpretation of this framework that achieves state-of-the-art results under such challenging scenarios. In our deep network architecture the global and local constraints that define a face can be efficiently modeled and learned end-to-end using training data. Conceptually our network design can be partitioned into two sub-networks: the first one implements the holistic face reconstruction according to global constraints, and the second one enhances face-specific details and enforces local patch statistics. We optimize the deep network using a new loss function for super-resolution that combines reconstruction error with a learned face quality measure in an adversarial setting, producing improved visual results. We conduct extensive experiments in both controlled and uncontrolled setups and show that our algorithm improves the state of the art both numerically and visually.
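The abstract describes a loss that combines pixel-wise reconstruction error with an adversarially learned quality measure, but does not give the exact formulation. The sketch below is only an illustrative combination of those two terms under assumed names (`combined_sr_loss`, weight `lam`, and a quality score in [0, 1] are all inventions for this example, not the paper's definitions):

```python
import numpy as np

def combined_sr_loss(sr, hr, quality_score, lam=0.01):
    """Hedged sketch: reconstruction error (MSE) plus a weighted
    adversarial-style penalty driven by a learned quality score.
    A low quality score (output judged unrealistic) raises the loss."""
    reconstruction = np.mean((sr - hr) ** 2)
    adversarial = -np.log(quality_score + 1e-8)
    return reconstruction + lam * adversarial

perfect = combined_sr_loss(np.zeros((4, 4)), np.zeros((4, 4)), quality_score=1.0)
bad = combined_sr_loss(np.zeros((4, 4)), np.ones((4, 4)), quality_score=0.5)
```

The design intuition is that the reconstruction term keeps the output faithful to the ground truth, while the adversarial term pushes outputs toward the manifold of realistic faces.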

39 citations

Journal ArticleDOI
TL;DR: The qualitative and quantitative results show that PedNet achieves promising results against state-of-the-art methods with substantial improvement in terms of all the performance metrics.
Abstract: Articulation modeling, feature extraction, and classification are the important components of pedestrian segmentation. Usually, these components are modeled independently from each other and then combined in a sequential way. However, this approach is prone to poor segmentation if any individual component is weakly designed. To cope with this problem, we proposed a spatio-temporal convolutional neural network named PedNet which exploits temporal information for spatial segmentation. The backbone of PedNet consists of an encoder–decoder network for downsampling and upsampling the feature maps, respectively. The input to the network is a set of three frames and the output is a binary mask of the segmented regions in the middle frame. Unlike classical deep models where the convolution layers are followed by a fully connected layer for classification, PedNet is a Fully Convolutional Network (FCN). It is trained end-to-end and the segmentation is achieved without the need for any pre- or post-processing. The main characteristic of PedNet is its unique design where it performs segmentation on a frame-by-frame basis but uses the temporal information from the previous and the future frame for segmenting the pedestrian in the current frame. Moreover, to combine the low-level features with the high-level semantic information learned by the deeper layers, we used long-skip connections from the encoder to the decoder network and concatenate the output of low-level layers with the higher-level layers. This approach helps to get a segmentation map with sharp boundaries. To show the potential benefits of temporal information, we also visualized different layers of the network. The visualization showed that the network learned different information from the consecutive frames and then combined the information optimally to segment the middle frame. We evaluated our approach on eight challenging datasets where humans are involved in different activities with severe articulation (football, road crossing, surveillance). The most common CamVid dataset, which is used for calculating the performance of segmentation algorithms, is evaluated against seven state-of-the-art methods. The performance is shown on precision/recall, F1, F2, and mIoU. The qualitative and quantitative results show that PedNet achieves promising results against state-of-the-art methods with substantial improvement in terms of all the performance metrics.
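The long-skip connection described above (upsample deep decoder features, then concatenate them with shallow encoder features) is a generic encoder–decoder pattern. The sketch below shows only that pattern in NumPy with assumed shapes and names, not PedNet's actual implementation:

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbor 2x upsampling of an (H, W, C) feature map."""
    return np.repeat(np.repeat(feat, 2, axis=0), 2, axis=1)

def long_skip_concat(encoder_feat, decoder_feat):
    """Bring the deeper (lower-resolution) decoder features up to the
    encoder resolution, then concatenate along the channel axis so
    low-level detail and high-level semantics feed the next layer."""
    up = upsample2x(decoder_feat)
    assert up.shape[:2] == encoder_feat.shape[:2]
    return np.concatenate([encoder_feat, up], axis=-1)

enc = np.random.rand(8, 8, 16)   # low-level features, full resolution
dec = np.random.rand(4, 4, 32)   # high-level features, half resolution
fused = long_skip_concat(enc, dec)   # shape (8, 8, 48)
```

Concatenation (rather than addition) preserves both feature sets intact, which is what lets the decoder recover sharp boundaries from the low-level maps.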

38 citations

Journal ArticleDOI
TL;DR: A deep convolutional network built within the mature Gaussian–Laplacian pyramid for pansharpening (LPPNet) is presented, where each pyramid level is handled by a spatial subnetwork in a divide-and-conquer way to make the network more efficient.
Abstract: Hyperspectral (HS) pansharpening aims to create a pansharpened image that integrates the spatial details of the panchromatic (PAN) image and the spectral content of the HS image. In this article, we present a deep convolutional network within the mature Gaussian–Laplacian pyramid for pansharpening (LPPNet). The overall structure of LPPNet is a cascade of the Laplacian pyramid dense network with a similar structure at each pyramid level. Following the general idea of multiresolution analysis (MRA), the subband residuals of the desired HS images are extracted from the PAN image and injected into the upsampled HS image to reconstruct the high-resolution HS images level by level. Applying the mature Laplacian pyramid decomposition technique to the convolutional neural network (CNN) can simplify the pansharpening problem into several pyramid-level learning problems so that the pansharpening problem can be solved with a shallow CNN with fewer parameters. Specifically, the Laplacian pyramid technique is used to decompose the image into different levels that can differentiate large- and small-scale details, and each level is handled by a spatial subnetwork in a divide-and-conquer way to make the network more efficient. Experimental results show that the proposed LPPNet method performs favorably against some state-of-the-art pansharpening methods in terms of objective indexes and subjective visual appearance.
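The Laplacian pyramid decomposition that LPPNet builds on is a classic technique and can be illustrated with a minimal NumPy sketch. Average pooling stands in for the Gaussian filtering step here; this shows the decomposition itself, not the paper's network:

```python
import numpy as np

def downsample2x(img):
    """Average-pool by 2 (a crude stand-in for Gaussian blur + decimate)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2x(img):
    """Nearest-neighbor 2x expansion."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    """Each level stores the band-pass residual between the image and
    its downsampled, re-expanded version; the final entry stores the
    coarsest low-pass image."""
    pyramid, current = [], img
    for _ in range(levels):
        low = downsample2x(current)
        pyramid.append(current - upsample2x(low))  # residual (details)
        current = low
    pyramid.append(current)                        # coarsest level
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition: expand and add residuals level by level."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample2x(current) + residual
    return current

img = np.arange(64, dtype=float).reshape(8, 8)
pyr = laplacian_pyramid(img, 2)
rec = reconstruct(pyr)
```

Because each residual is defined against the same expansion used in reconstruction, the round trip is exact; in LPPNet the residuals at each level become separate, smaller learning problems.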

38 citations

Journal ArticleDOI
TL;DR: This paper proposes luma-aware chroma downsampling and upsampling algorithms to jointly improve the quality of chroma image reconstruction, and explores the applicability of the proposed scheme to screen content compression, aiming to improve the decoded chroma image quality for display.
Abstract: Screen content images are originally captured in a full-chroma format. The chroma downsampling, which is commonly applied to the chroma component in screen content image representation and processing (e.g., YUV4:2:0 compression), will significantly degrade the image quality and create annoying artifacts such as blur and color shifting. To tackle this problem, in this paper we propose luma-aware chroma downsampling and upsampling algorithms to jointly improve the quality of the chroma image reconstruction. Guided by the luma information, the chroma upsampling algorithm is proposed with the utilization of major color and index map representation. The geometric information-based linear mapping is developed to transfer the structure of luma to the interpolated chroma. Subsequently, the error sensitivity of the upsampling method is analyzed, and a content-dependent downsampling algorithm is presented to minimize the error sensitivity function. We further explore the applicability of the proposed scheme in the scenario of screen content compression, aiming to improve the decoded chroma image quality for display. Extensive experimental results demonstrate the viability and efficiency of the proposed scheme.
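The baseline this paper improves on is standard 4:2:0 chroma subsampling, which halves the chroma resolution in both dimensions without looking at the luma channel. A minimal NumPy sketch of that baseline (the proposed method additionally uses luma guidance, which is not shown here):

```python
import numpy as np

def chroma_downsample_420(chroma):
    """4:2:0 baseline: average each 2x2 block of the chroma plane.
    Content-agnostic, so sharp chroma edges (common in screen
    content) get blurred."""
    h, w = chroma.shape
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def chroma_upsample_420(chroma):
    """Nearest-neighbor reconstruction back to full resolution."""
    return np.repeat(np.repeat(chroma, 2, axis=0), 2, axis=1)

u = np.array([[10., 10., 200., 200.],
              [10., 10., 200., 200.],
              [50., 50.,  50.,  50.],
              [50., 50.,  50.,  50.]])
down = chroma_downsample_420(u)   # 2x2 plane: [[10, 200], [50, 50]]
up = chroma_upsample_420(down)    # flat regions survive the round trip
```

When a 2x2 block straddles a color edge instead of being flat, the averaging mixes the two colors, producing exactly the blur and color-shift artifacts the abstract describes.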

38 citations

Journal ArticleDOI
TL;DR: Experimental evaluations demonstrate that the proposed RAN compares favorably against the state-of-the-art methods and that its performance generalizes well across different upscaling factors.
Abstract: In existing deep network-based image super-resolution (SR) methods, each network is only trained for a fixed upscaling factor and can hardly generalize to unseen factors at test time, which is non-scalable in real applications. To mitigate this issue, this paper proposes a resolution-aware network (RAN) for simultaneous SR of multiple factors. The key insight is that SR of multiple factors is essentially different but also shares common operations. To attain stronger generalization across factors, we design an upsampling network (U-Net) consisting of several sub-modules, in which each sub-module implements an intermediate step of the overall image SR and can be shared by SR of different factors. A decision network (D-Net) is further adopted to identify the quality of the input low-resolution image and adaptively select suitable sub-modules to perform SR. U-Net and D-Net together constitute the proposed RAN model, and are jointly trained using a new hierarchical loss function on SR tasks of multiple factors. Experimental evaluations demonstrate that the proposed RAN compares favorably against the state-of-the-art methods and that its performance generalizes well across different upscaling factors.
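The idea of sharing sub-modules across factors can be illustrated schematically: a x4 upscale reuses the same x2 step twice. The sketch below uses plain pixel repetition as a stand-in for a learned x2 sub-module; the function names and structure are inventions for illustration, not RAN's actual architecture:

```python
import numpy as np

def upsample2x(img):
    """Stand-in for a learned x2 SR sub-module."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def multi_factor_sr(img, factor):
    """Compose the shared x2 step log2(factor) times, so SR of
    different power-of-two factors reuses the same operation
    instead of requiring a separately trained network."""
    steps = int(np.log2(factor))
    out = img
    for _ in range(steps):
        out = upsample2x(out)
    return out

lr = np.ones((4, 4))
sr2 = multi_factor_sr(lr, 2)   # one application of the shared step
sr4 = multi_factor_sr(lr, 4)   # same step applied twice
```

In the paper, a decision network additionally chooses which sub-modules to apply based on the input's quality; that selection logic is omitted here.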

38 citations


Network Information
Related Topics (5)

- Convolutional neural network: 74.7K papers, 2M citations, 90% related
- Image segmentation: 79.6K papers, 1.8M citations, 90% related
- Feature extraction: 111.8K papers, 2.1M citations, 89% related
- Deep learning: 79.8K papers, 2.1M citations, 88% related
- Feature (computer vision): 128.2K papers, 1.7M citations, 87% related
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    469
2022    859
2021    330
2020    322
2019    298
2018    236