Journal ArticleDOI

Image super-resolution

Linwei Yue1, Huanfeng Shen1, Jie Li1, Qiangqiang Yuan1, Hongyan Zhang1, Liangpei Zhang1 
01 Nov 2016-Signal Processing (Elsevier)-Vol. 128, pp 389-408
TL;DR: This paper reviews SR from the perspective of techniques and applications, with emphasis on the main contributions of recent years, and discusses the current obstacles for future research.
About: This article was published in Signal Processing on 2016-11-01. It has received 378 citations to date.
Citations
Journal ArticleDOI
TL;DR: Wang et al. propose a deformable convolution alignment module with a multiscale residual block to alleviate the alignment difficulties caused by scarce motion and various scales of moving objects in remote-sensing images.
Abstract: As a new earth observation tool, satellite video has been widely used in the remote-sensing field for dynamic analysis. Video super-resolution (VSR) has thus attracted increasing attention because it improves the spatial resolution of satellite video. However, the difficulty of remote-sensing image alignment and the low efficiency of spatial–temporal information fusion cause conventional VSR methods to generalize poorly to satellite videos. In this article, a novel fusion strategy of temporal grouping projection and an accurate alignment module are proposed for satellite VSR. First, we propose a deformable convolution alignment module with a multiscale residual block to alleviate the alignment difficulties caused by scarce motion and various scales of moving objects in remote-sensing images. Second, a temporal grouping projection fusion strategy is proposed, which can reduce the complexity of projection and make the spatial features of reference frames play a continuous guiding role in spatial–temporal information fusion. Finally, a temporal attention module is designed to adaptively learn the different contributions of temporal information extracted from each group. Extensive experiments on Jilin-1 satellite video demonstrate that our method is superior to current state-of-the-art VSR methods.

14 citations

Proceedings ArticleDOI
01 Oct 2018
TL;DR: This paper provides a comprehensive survey of existing state-of-the-art and recently published face hallucination methods, together with the detailed reconstruction procedure of the most successful hallucination approach, i.e., position-patch based super-resolution.
Abstract: In several real-world scenarios, recorded pictures often have various artifacts such as blur, noise, varying illumination, and occlusion, owing to many causes including cheap and low-resolution imaging systems, image processing errors, and a large distance between the object and the camera/sensor. Facial images captured from such low-resolution pictures severely impact the performance of various systems, namely human-computer interaction, speaker recognition by mouth movements, visual speech recognition, facial expression recognition, face recognition, etc. Facial image super-resolution (or hallucination), as one of the core innovations in the field of computer vision and image processing, has been an engaging but challenging technique for overcoming the above problems. This paper provides a comprehensive survey of existing state-of-the-art and recently published face hallucination methods. Along with this, the detailed reconstruction procedure of the most successful hallucination approach, i.e., position-patch based super-resolution, is also provided. Moreover, some useful research directions are presented at the end, which may help the research community of this field to design and develop new face hallucination methods that provide more efficient solutions to existing problems.

13 citations

Journal ArticleDOI
TL;DR: Experiments show that deep learning-based SR algorithms produce visually pleasing images, retaining sharp edges, enhanced spatial data, and clarity in feature representation while zooming at a certain level beyond interest.
Abstract: Super-resolution (SR) algorithms have now become a bottleneck for several remote sensing applications. SR is a technique that enhances minute details of the image by increasing the spatial resolution of imaging systems. SR overcomes the problems of conventional resolution enhancement techniques, such as the introduction of noise, spectral distortion, and lack of clarity in the details of the image. This paper surveys SR algorithms from their inception to the latest state-of-the-art techniques, to elucidate the importance of the SR algorithms that led to paradigm shifts in the last two decades, moving toward visually pleasing high-resolution images. Inspired by work on natural images, the algorithms addressing SR problems such as ill-posedness, prior and regularization, the inverse problem, the multi-frame problem, and illumination and shadow in remote sensing applications are analyzed. For an intuitive understanding of the paradigm shifts, publicly available images are tested with representative paradigm shift SR algorithms. The paradigm shift analysis is performed both qualitatively and quantitatively in terms of blur in the image, pattern clarity, edge strength, and super-resolving capability. The convergence of the natural image to the remote sensed image is critically analyzed. Challenges in super-resolving remote sensed images are identified, and possible solutions are recommended. Experiments show that deep learning-based SR algorithms produce visually pleasing images, retaining sharp edges, enhanced spatial data, and clarity in feature representation while zooming at a certain level beyond interest.

13 citations

Journal ArticleDOI
TL;DR: A component semantic prior guided generative adversarial network (CSPGAN) to synthesize faces is proposed and semantic probability maps of facial components are exploited to modulate features in the CSPGAN through affine transformation.
Abstract: Face super-resolved (SR) images aid human perception. The state-of-the-art face SR methods leverage the spatial location of facial components as prior knowledge. However, it remains a great challenge to generate natural textures. In this paper, we propose a component semantic prior guided generative adversarial network (CSPGAN) to synthesize faces. Specifically, semantic probability maps of facial components are exploited to modulate features in the CSPGAN through affine transformation. To compensate for the overly smooth performance of the generative network, a gradient loss is proposed to recover the high-frequency details. Meanwhile, the discriminative network is designed to perform multiple tasks which predict semantic category and distinguish authenticity simultaneously. The extensive experimental results demonstrate the superiority of the CSPGAN in reconstructing photorealistic textures.

13 citations

Journal ArticleDOI
Shaopeng Hu1, Kohei Shimasaki1, Mingjun Jiang1, Taku Senoo1, Idaku Ishii1 
TL;DR: In this paper, a dual-camera system that can simultaneously capture zoomed-in images using an ultrafast pan-tilt camera and a fixed wide-view camera using deep learning methods is presented.
Abstract: This paper presents a novel dual-camera system that can simultaneously capture zoomed-in images using an ultrafast pan-tilt camera and a fixed wide-view camera using deep learning methods. An ultrafast pan-tilt camera can function as multiple virtual pan-tilt cameras by synchronizing a high-frame-rate zooming-view camera and an ultrafast pan-tilt mirror device that can switch over 500 different views in a second. A wide-view camera can obtain images in a fixed view in which multiple targets to be tracked are captured in low resolution and then recognized by processing the images using deep learning methods at a rate of dozens of frames per second. Based on the positions of all targets, recognized by the wide-view camera, the pan and tilt angles of multiple pan-tilt cameras are virtually controlled using an ultrafast pan-tilt camera through multithread viewpoint control to simultaneously capture the zoomed-in images of all targets. Our developed system can operate 10 virtual pan-tilt cameras (25 fps) with multithread viewpoint control and 4 ms time granularity in synchronization with a convolutional neural-network-based recognition model operating at 25 fps, which is accelerated by general-purpose computing on graphics processing units. The effectiveness of our system was demonstrated by the results of several experiments conducted on simultaneous zoom shooting of multiple running objects (persons and cars) at a range of approximately 80 m or more in a natural outdoor scene, which was formerly too wide for a single fixed camera to capture clearly and simultaneously.

13 citations

References
Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and it is compared with both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
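
As a rough illustration of the structural similarity index described in the abstract above, the sketch below computes a single-window (global) SSIM between two pixel sequences in plain Python. The function name ssim_global, the 8-bit data range, and the constants k1 = 0.01 and k2 = 0.03 are assumptions made for this example; it is not the authors' MATLAB implementation, which evaluates the index over local windows and averages the results.

```python
def ssim_global(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equal-length pixel sequences.

    Simplified sketch: the whole image is treated as one window, and
    k1, k2 and data_range are assumed default values.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased sample variances and covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    # Small constants that stabilise the ratio when the means or variances are near zero.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [10, 50, 90, 130, 170, 210]
flattened = [110] * 6  # same mean as the reference, but all structure removed
print(ssim_global(reference, reference))
print(ssim_global(reference, flattened))
```

An image compared with itself scores 1, while the flattened version scores close to 0 even though its mean intensity is unchanged, which is the sense in which the index tracks loss of structure rather than a simple pixel-wise error.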

40,609 citations

Book
23 May 2011
TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.
Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for l1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.
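
To make the alternating direction method of multipliers concrete, here is a minimal sketch in plain Python for a toy l1-regularized problem: minimize 0.5*||x - b||^2 + lam*||z||_1 subject to x = z. This is the lasso with an identity design matrix, chosen here (as an assumption for illustration) so that every update is element-wise and no linear solver is needed. The function names, rho = 1.0, and the fixed iteration count are hypothetical choices, not taken from the book.

```python
def soft_threshold(v, kappa):
    """Proximal operator of kappa * |v| (element-wise shrinkage)."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0


def admm_l1_denoise(b, lam, rho=1.0, iters=200):
    """Scaled-form ADMM for min 0.5*||x - b||^2 + lam*||z||_1 s.t. x = z."""
    n = len(b)
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: minimise the quadratic term plus the augmented penalty.
        x = [(b[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: soft thresholding handles the l1 term.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # Dual update: accumulate the constraint residual x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


print(admm_l1_denoise([3.0, 0.5, -2.0], lam=1.0))
```

For this toy problem the exact minimizer is known in closed form (soft thresholding of b by lam), so the iterates can be checked against it; with a general design matrix the x-update becomes a linear system solve instead.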

17,433 citations

Journal ArticleDOI
TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Abstract: Summary. We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the

16,538 citations


"Image super-resolution" refers background in this paper

  • ...As the l2 norm represents a smoothing prior and the l1 norm tends to preserve the edges, the lp (1 ≤ p ≤ 2) norm achieves a balance between them, thereby avoiding the staircase effect [110]....
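
The trade-off in the excerpt above can be illustrated numerically. The sketch below assumes a penalty of the form sum(|d|^p) over the first differences d of a 1-D signal; this form, the function name lp_penalty, and the two test signals are assumptions for illustration, since the excerpt does not give the paper's exact regularizer.

```python
def lp_penalty(signal, p):
    """Sum of |first differences|**p for a 1-D signal (assumed penalty form)."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    return sum(abs(d) ** p for d in diffs)


step = [0, 0, 0, 0, 4]  # one sharp edge of height 4
ramp = [0, 1, 2, 3, 4]  # the same rise spread over four small steps

for p in (1, 1.5, 2):
    print(p, lp_penalty(step, p), lp_penalty(ramp, p))
```

With p = 1 the edge and the ramp cost the same, so edges are not discouraged; with p = 2 the edge costs four times as much as the ramp, which favors smoothing. An intermediate p such as 1.5 penalizes the edge only twice as much, sitting between the two behaviors.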


Journal ArticleDOI
TL;DR: In this article, a constrained optimization type of numerical algorithm for removing noise from images is presented, where the total variation of the image is minimized subject to constraints involving the statistics of the noise.
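
The quantity minimized in the TL;DR above, the total variation of an image, can be sketched in a few lines of plain Python. The version below is an isotropic discretization using forward differences, with the difference set to zero at the last row and column; that discretization and the name total_variation are assumptions for illustration, and the constrained minimization itself is not shown.

```python
import math


def total_variation(img):
    """Isotropic total variation of a 2-D image given as a list of rows.

    Uses forward differences, taken as zero past the last row and column
    (an assumed boundary treatment).
    """
    rows, cols = len(img), len(img[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            dx = img[i][j + 1] - img[i][j] if j + 1 < cols else 0.0
            dy = img[i + 1][j] - img[i][j] if i + 1 < rows else 0.0
            tv += math.hypot(dx, dy)  # gradient magnitude at pixel (i, j)
    return tv


clean_edge = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
noisy_edge = [[0, 0.2, 1], [0.1, 0, 1], [0, 0.1, 0.9]]
print(total_variation(clean_edge))
print(total_variation(noisy_edge))
```

The clean vertical edge has a total variation of 3 (one unit jump per row), and the noisy copy has a larger value. Lowering total variation therefore suppresses noise-like oscillation without requiring the edge itself to be blurred.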

15,225 citations


"Image super-resolution" refers background in this paper

  • ...[93,103], based on the fact that an image is naturally “blocky” and discontinuous....


Book
01 Jan 1977

8,009 citations


"Image super-resolution" refers background in this paper

  • ...In the early years, the smoothness of natural images was mainly considered, which leads to the quadratic property of the regularizations [99,100]....
