scispace - formally typeset

Showing papers on "Image scaling published in 2018"


Book ChapterDOI
08 Sep 2018
TL;DR: In this article, an end-to-end deep learning codec is proposed for video compression, which is based on repeated image interpolation; it outperforms H.261 and MPEG-4 Part 2 and performs on par with H.264.
Abstract: An ever increasing amount of our digital communication, media consumption, and content creation revolves around videos. We share, watch, and archive many aspects of our lives through them, all of which are powered by strong video compression. Traditional video compression is laboriously hand designed and hand optimized. This paper presents an alternative in an end-to-end deep learning codec. Our codec builds on one simple idea: Video compression is repeated image interpolation. It thus benefits from recent advances in deep image interpolation and generation. Our deep video codec outperforms today’s prevailing codecs, such as H.261, MPEG-4 Part 2, and performs on par with H.264.

252 citations


Book ChapterDOI
01 Jan 2018
TL;DR: This chapter proposes solutions to the problem of changing image resolution using computational intelligence means constructed on a new neuro-paradigm, the Geometric Transformations Model.
Abstract: This chapter proposes solutions to the problem of changing image resolution based on computational intelligence means constructed using a new neuro-paradigm, the Geometric Transformations Model. The topologies, training algorithms, and usage of neural-like structures of the Geometric Transformations Model are described. Two methods for reducing and increasing image resolution are considered: one using neural-like structures of the Geometric Transformations Model and one based on the matrix operator of the weight coefficients of synaptic connections. The influence of the image preprocessing parameters, as well as of the parameters of the neural-like structures, on the quality of both methods is investigated. A number of practical experiments using different quality indicators for the synthesized images (PSNR, SSIM, UIQ, MSE) are performed, and the effectiveness of the developed method is compared with that of the existing one.

54 citations


Journal ArticleDOI
TL;DR: The empirical results suggest that the proposed framework not only effectively utilizes the multi-scale images but also outperforms other similar techniques in terms of classification accuracy rate.
Abstract: Driver fatigue is a major cause of traffic accidents. Automatic vision-based driver fatigue recognition is one of the most promising commercial applications of facial expression analysis technology. Generally, factors such as noise, illumination effects, image scaling, and redundant data affect the performance of facial expression recognition systems. In this paper, we propose an efficient algorithm, which is not only capable of working with multi-scale images but also able to overcome the mentioned obstacles. The proposed framework can be divided into three main phases. In the first step, the input image is converted into four sub-band images by applying a discrete wavelet transform, which preserves the important information of the face image. Also, the original image is down-sampled to obtain images of different sizes. Based on entropy analysis, each image is then further divided into a number of blocks classified as either informative or non-informative. In the second step, the high-variance features are selected in a zigzag manner using the discrete cosine transform. In the final step, classifiers are trained and tested to accurately classify the expressions into seven generic expression classes. The empirical results suggest that the proposed framework not only effectively utilizes the multi-scale images but also outperforms other similar techniques in terms of classification accuracy.
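The entropy-based block screening and zigzag coefficient selection in the first two phases can be sketched in a few lines. The block size, entropy threshold, and the DCT step itself are the paper's details, so this is only an illustrative outline:

```python
import math

def block_entropy(block):
    """Shannon entropy of a block's pixel values; low entropy flags a
    non-informative (near-uniform) block, as in the paper's first phase."""
    flat = [v for row in block for v in row]
    n = len(flat)
    counts = {}
    for v in flat:
        counts[v] = counts.get(v, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def zigzag(block):
    """Scan a square block in JPEG-style zigzag order, so that the
    low-frequency (high-variance) DCT coefficients come out first."""
    n = len(block)
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else -p[0]))
    return [block[i][j] for i, j in order]

print(block_entropy([[5, 5], [5, 5]]))            # uniform block: zero entropy
print(zigzag([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # diagonal traversal order
```

In a full pipeline, `zigzag` would run on a block's DCT coefficients rather than raw pixels; it is applied to pixels here only to keep the sketch self-contained.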

44 citations


Journal ArticleDOI
TL;DR: Simulation results clearly show better performance of the proposed scheme compared to the previous techniques using interpolation-based reversible watermarking on different DICOM images.
Abstract: In the digital world, watermarking technology is a solution for data hiding and essential for the management and secure communication of digital data propagated over internet-based platforms. Reversible watermarking is a quality-aware type of watermarking which has been applied in managing digital content such as digital images, texts, audio, and videos. Reversible watermarking is also known as lossless watermarking due to its preservation of all details of the host and hidden data. One of the important uses of this kind of watermarking is to manage medical data such as DICOM images. In recent years, a new type of reversible watermarking entitled interpolation-based reversible watermarking has been introduced, and we enhance it for DICOM images by using a hybrid approach based on computing an error histogram and by applying an image interpolation with greedy (adaptive) weights. In practice, simulation results clearly show better performance of the proposed scheme compared to previous interpolation-based reversible watermarking techniques on different DICOM images.

39 citations


Journal ArticleDOI
TL;DR: Simulation results show that the scaling image using bilinear interpolation is clearer than that using the nearest-neighbor interpolation, and the complexity analysis of the scaling circuits based on the elementary gates is deduced.
Abstract: Image scaling is the basic operation that is widely used in classic image processing, including nearest-neighbor interpolation, bilinear interpolation, and bicubic interpolation. In quantum image p...
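The two classical kernels this comparison rests on, nearest-neighbor and bilinear, can be sketched on plain Python lists (the quantum circuits themselves are beyond a short example):

```python
def nearest_neighbor(img, new_w, new_h):
    """Scale a 2-D grayscale image (list of lists) by copying the
    geometrically nearest source pixel."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        sy = min(int(y * h / new_h + 0.5), h - 1)
        out.append([img[sy][min(int(x * w / new_w + 0.5), w - 1)]
                    for x in range(new_w)])
    return out

def bilinear(img, new_w, new_h):
    """Scale by distance-weighting the four surrounding source pixels
    (corner-aligned mapping)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        fy = y * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(fy); y1 = min(y0 + 1, h - 1); dy = fy - y0
        row = []
        for x in range(new_w):
            fx = x * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(fx); x1 = min(x0 + 1, w - 1); dx = fx - x0
            row.append(img[y0][x0] * (1 - dy) * (1 - dx)
                       + img[y0][x1] * (1 - dy) * dx
                       + img[y1][x0] * dy * (1 - dx)
                       + img[y1][x1] * dy * dx)
        out.append(row)
    return out

img = [[0, 100], [100, 200]]
print(nearest_neighbor(img, 4, 4)[0])  # blocky: values copied from the source
print(bilinear(img, 4, 4)[0])          # graded values in between
```

The blockiness of the first output row versus the smooth ramp of the second is exactly why upscaled bilinear images look clearer than nearest-neighbor ones.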

34 citations


Journal ArticleDOI
TL;DR: In this paper, reversible data hiding methods using interpolation techniques are described and analyzed with respect to embedding capacity and visual image quality, the two measurements many researchers have tried to improve.
Abstract: A data hiding method is called reversible when the receiver can restore the cover object while extracting the secret data. Within reversible data hiding, interpolation-based methods have recently been proposed, in which image interpolation techniques are applied before embedding the secret data. In this paper, reversible data hiding methods using interpolation techniques are described and analyzed with respect to embedding capacity and visual image quality, the two measurements many researchers have tried to improve. The paper concludes with directions for research and some recommendations.
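The family's core idea, enlarge by interpolation and hide data only in the interpolated pixels, can be caricatured in a 1-D sketch. The embedding rule below (adding the bit to the interpolated sample) is a deliberate simplification of the surveyed schemes, which use more elaborate capacity rules:

```python
def interpolate_cover(samples):
    """Enlarge a 1-D pixel row: insert the neighbor mean between samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out += [a, (a + b) // 2]
    out.append(samples[-1])
    return out

def embed(cover, bits):
    """Add each secret bit to an interpolated pixel (odd indices).

    Reversible because the original samples (even indices) are untouched:
    the receiver recomputes the interpolation and reads the difference."""
    stego = list(cover)
    for k, bit in enumerate(bits):
        stego[2 * k + 1] += bit
    return stego

def extract(stego, n_bits):
    """Recover the bits by re-interpolating from the untouched samples."""
    recomputed = interpolate_cover(stego[::2])
    return [stego[2 * k + 1] - recomputed[2 * k + 1] for k in range(n_bits)]

cover = interpolate_cover([10, 20, 30])   # [10, 15, 20, 25, 30]
stego = embed(cover, [1, 0])
print(extract(stego, 2))                  # the embedded bits come back
```

Real schemes differ in how many bits each interpolated pixel carries and how overflow is handled, which is precisely the capacity/quality trade-off the survey analyzes.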

32 citations


Journal ArticleDOI
TL;DR: Simulation experiments involving different classical images and scaling ratios, run in MATLAB 2014b on a classical computer, demonstrate that the proposed NNV interpolation method outperforms nearest-neighbor and bilinear interpolation in terms of resolution.
Abstract: This paper presents the nearest neighbor value (NNV) interpolation algorithm for the improved novel enhanced quantum representation of digital images (INEQR). Interpolation is necessary in image scaling because the number of pixels increases or decreases. The difference between the proposed scheme and nearest-neighbor interpolation is that the estimate of the missing pixel value is guided by the nearest value rather than the distance. First, a sequence of quantum operations is predefined, such as cyclic shift transformations and basic arithmetic operations. Then, the feasibility of the nearest neighbor value interpolation method for INEQR quantum images is proven using the previously designed quantum operations. Furthermore, a quantum image scaling algorithm, in the form of circuits implementing NNV interpolation for INEQR, is constructed for the first time. The merit of the proposed INEQR circuits lies in their low complexity, which is achieved by utilizing the unique properties of quantum superposition and entanglement. Finally, experiments involving different classical (i.e., conventional, non-quantum) images and scaling ratios are simulated in MATLAB 2014b on a classical computer, demonstrating that the proposed interpolation method achieves higher resolution than nearest-neighbor and bilinear interpolation.

32 citations


Journal ArticleDOI
TL;DR: The sparsity and non-local self-similarity priors are used as regularization terms to enhance the stability of an interpolation model, and patches from different polarization channels are joined to learn an adaptive sub-dictionary.
Abstract: To address the key image interpolation issue in microgrid polarimeters, we propose a machine learning model based on sparse representation. The sparsity and non-local self-similarity priors are used as regularization terms to enhance the stability of the interpolation model. Moreover, to make the best use of the correlation among different polarization orientations, patches from different polarization channels are joined to learn an adaptive sub-dictionary. Synthetic and real images are used to evaluate interpolation performance. The experimental results demonstrate that our proposed method achieves state-of-the-art results in terms of quantitative measures and visual quality.

29 citations


Journal ArticleDOI
TL;DR: A generalized directional PVO with varying block size is proposed that outperforms state-of-the-art methods by a good margin, and steganographic analysis shows it is robust against several attacks.
Abstract: Pixel Value Ordering (PVO) is an efficient data hiding scheme in which pixels are ranked in ascending order within an image block and the minimum and maximum pixel values are then modified to embed secret data. The embedding capacity of existing PVO-based data hiding schemes was limited to two bits per row of any block, and they were unable to perform repeated embedding. To solve this problem, we propose a generalized directional PVO (DPVO) with varying block size. The original image is partitioned into blocks and then enlarged using image interpolation. A new parameter (α) is introduced, added to the maximum pixel value and subtracted from the minimum pixel value, to maintain the rank order; its value depends on the size of the image block. To improve data hiding capacity, overlapped embedding has been considered in three directions for each block: (1) horizontal, (2) vertical, and (3) diagonal. Experiments show that the proposed scheme outperforms state-of-the-art methods by a good margin, and steganographic analysis shows it is robust against several attacks.
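The basic PVO step the scheme generalizes, embedding in a block's maximum pixel, can be sketched classically. The directional, overlapped, α-parameterized variant is the paper's contribution and is not reproduced here:

```python
def pvo_embed(block, bit):
    """Embed one bit in a block's maximum pixel (core PVO step).

    With pixels ranked ascending, the gap e = max - second_max decides:
    e == 1 carries the bit (max += bit); e > 1 is shifted by 1 so the
    decoder stays unambiguous; ties (e == 0) are skipped. A single-block
    sketch of classic PVO, minimum-side embedding omitted."""
    b = list(block)
    i = max(range(len(b)), key=lambda k: b[k])
    second = max(v for k, v in enumerate(b) if k != i)
    e = b[i] - second
    if e == 1:
        b[i] += bit
    elif e > 1:
        b[i] += 1
    return b

def pvo_extract(stego):
    """Return (bit_or_None, restored_block); None means no bit was carried."""
    b = list(stego)
    i = max(range(len(b)), key=lambda k: b[k])
    second = max(v for k, v in enumerate(b) if k != i)
    e = b[i] - second
    if e == 2:
        b[i] -= 1
        return 1, b
    if e == 1:
        return 0, b
    if e > 2:
        b[i] -= 1
        return None, b
    return None, b  # tie: block was skipped at embedding

stego = pvo_embed([10, 12, 13], 1)
print(pvo_extract(stego))  # bit and the restored original block
```

The reversibility hinges on the decoder seeing only gaps of 1 (bit 0), 2 (bit 1), or more than 2 (shifted, no data), which is the invariant the embedder maintains.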

26 citations


Journal ArticleDOI
TL;DR: This study reduces the range of the position values and re-encodes them to reduce distortion, which yields better hiding payload and image quality.

24 citations


Proceedings ArticleDOI
15 May 2018
TL;DR: This work introduces a new multi-view method that performs inpainting in intermediate, locally common planes, which results in correct perspective and multi-view coherence of inpainting results, and presents a fast planar region extraction method operating on small image clusters.
Abstract: Image-Based Rendering (IBR) allows high-fidelity free-viewpoint navigation using only a set of photographs and 3D reconstruction as input. It is often necessary or convenient to remove objects from the captured scenes, allowing a form of scene editing for IBR. This requires multi-view inpainting of the input images. Previous methods suffer from several major limitations: they lack true multi-view coherence, resulting in artifacts such as blur, they do not preserve perspective during inpainting, provide inaccurate depth completion and can only handle scenes with a few tens of images. Our approach addresses these limitations by introducing a new multi-view method that performs inpainting in intermediate, locally common planes. Use of these planes results in correct perspective and multi-view coherence of inpainting results. For efficient treatment of large scenes, we present a fast planar region extraction method operating on small image clusters. We adapt the resolution of inpainting to that required in each input image of the multi-view dataset, and carefully handle image resampling between the input images and rectified planes. We show results on large indoors and outdoors environments.

Proceedings ArticleDOI
14 Apr 2018
TL;DR: This paper generalizes the formulation of the guided image filter by using the idea of window functions from image signal processing to represent arbitrary kernel shapes, and reveals the relationship between guided image filtering and its variants.
Abstract: In this paper, we propose an extension of guided image filtering that supports arbitrary window functions. Guided image filtering is a fast edge-preserving filter based on a local linearity assumption. The filter supports not only image smoothing but also edge enhancement and image interpolation. The guided image filter assumes that the input image is a local linear transformation of a guidance image within a local finite region. To realize this assumption, guided image filtering consists of a stack of box filters. A limitation of guided image filtering is the lack of flexibility in kernel shape. Therefore, we generalize the formulation of the guided image filter by using the idea of window functions from image signal processing to represent arbitrary kernel shapes. We also reveal the relationship between guided image filtering and its variants.

Journal ArticleDOI
TL;DR: This paper investigates the problem of image resampling detection based on the linear parametric model, exposes the periodic artifact of a one-dimensional (1-D) resampled signal, and proposes a practical Likelihood Ratio Test (LRT).
Abstract: Resampling forgery generally refers to the technique that utilizes an interpolation algorithm to maliciously geometrically transform a digital image or a portion of an image. This paper investigates the problem of image resampling detection based on the linear parametric model. First, we expose the periodic artifact of a one-dimensional (1-D) resampled signal. After dealing with the nuisance parameters, together with Bayes' rule, the detector is designed based on the probability of the residual noise extracted from the resampled signal using the linear parametric model. Subsequently, we study the characteristics of a resampled image; we propose to estimate the probability of the pixels' noise and establish a practical Likelihood Ratio Test (LRT). In comparison with state-of-the-art tests, numerical experiments show the relevance of our proposed algorithm for detecting uncompressed and compressed resampled images.

Journal ArticleDOI
TL;DR: This study uses diverse compute unified device architecture (CUDA) optimization strategies to make full use of the graphics processing unit (GPU) (NVIDIA Tesla K80), including a shared memory and register and multi-GPU optimization, and modify the training window to obtain a more concise matrix operation.
Abstract: With the growth in the consumer electronics industry, it is vital to develop an algorithm for ultrahigh definition products that is more effective and has lower time complexity. Image interpolation, which is based on an autoregressive model, has achieved significant improvements compared with the traditional algorithm with respect to image reconstruction, including a better peak signal-to-noise ratio (PSNR) and improved subjective visual quality of the reconstructed image. However, the time-consuming computation involved has become a bottleneck in those autoregressive algorithms. Because of the high time cost, image autoregressive-based interpolation algorithms are rarely used in industry for actual production. In this study, in order to meet the requirements of real-time reconstruction, we use diverse compute unified device architecture (CUDA) optimization strategies to make full use of the graphics processing unit (GPU) (NVIDIA Tesla K80), including a shared memory and register and multi-GPU optimization. To be more suitable for the GPU-parallel optimization, we modify the training window to obtain a more concise matrix operation. Experimental results show that, while maintaining a high PSNR and subjective visual quality and taking into account the I/O transfer time, our algorithm achieves a high speedup of 147.3 times for a Lena image and 174.8 times for a 720p video, compared to the original single-threaded C CPU code with -O2 compiling optimization.

Journal ArticleDOI
TL;DR: The unified coordinate system algorithm (UCSA), improving on the polar format algorithm (PFA), is proposed for THz ViSAR image formation; it is based on a more accurate wavenumber domain signal model derived by second-order Taylor expansion.
Abstract: Video synthetic aperture radar (ViSAR) is a novel imaging mode in which the radar system forms a sequence of SAR images while the radar platform is flying. To achieve a high frame rate, ViSAR systems typically operate in the terahertz (THz) band. In this paper, the unified coordinate system algorithm (UCSA), improving on the polar format algorithm (PFA), is proposed for THz ViSAR image formation; it is based on a more accurate wavenumber domain signal model derived by second-order Taylor expansion. Considering the small synthetic angle of THz ViSAR, the UCSA divides two-dimensional resampling into range and azimuth resampling using the chirp z-transform to reduce processing complexity in ViSAR imaging. The analysis in this paper shows that THz PFA images are free from quadratic phase errors. However, the PFA images are oriented in different directions and exhibit image distortion caused by linear phase errors (LPE). Through an image resampling, the UCSA corrects the LPE and rotates the images to the unified coordinate system simultaneously. The effectiveness of the UCSA is validated by a ground-based experiment performed with a 0.3 THz radar system.

Proceedings ArticleDOI
02 Nov 2018
TL;DR: In this article, the rescaling bilinear interpolant's pixels is used to estimate the new pixel value, to be assigned at the empty locations of the destination image, to evaluate the effect of rescaling on image interpolation quality.
Abstract: Rescaling bilinear (RB) interpolant's pixels is a novel image interpolation scheme. In the current study, we investigate its effects on the quality of interpolated images. RB determines lower and upper bounds using the standard deviation of the four nearest pixels to find the new interval, or range, that is used to rescale the bilinear interpolant's pixels. The products of the rescaled pixels and the corresponding distance-based weights are added to estimate the new pixel value to be assigned at the empty locations of the destination image. The effects of RB on image interpolation quality were investigated using standard full-reference and non-reference objective image quality metrics, particularly those focusing on interpolated image features and distortion similarities. Furthermore, variance- and mean-based metrics were also employed to further investigate the effects in terms of contrast and intensity increments or decrements. Matlab-based simulations demonstrated generally superior performance of RB compared to the traditional bilinear (TB) interpolation algorithm. The studied scheme's major drawbacks were a higher processing time and a tendency to rely on the image type and/or a specific interpolation scaling ratio to achieve superior performance. Potential applications of rescaling-based bilinear interpolation may include ultrasound scan conversion in cardiac ultrasound, endoscopic ultrasound, etc.
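From the abstract's description, one RB output pixel might be computed as below. The exact bound formula (here mean plus or minus one standard deviation) is our assumption from the wording, not the authors' published definition:

```python
import statistics

def rescaled_bilinear_pixel(p, wts):
    """One output pixel of the rescaling-bilinear (RB) idea, as described.

    The four nearest pixels `p` are min-max rescaled into the interval
    [mean - std, mean + std] before the usual distance-based bilinear
    weighting `wts` is applied. The bound formula is an assumption made
    for illustration; the weights are the ordinary bilinear ones."""
    mean, std = statistics.mean(p), statistics.pstdev(p)
    lo, hi = mean - std, mean + std
    pmin, pmax = min(p), max(p)
    span = pmax - pmin
    if span == 0:
        r = list(p)  # flat patch: nothing to rescale
    else:
        r = [lo + (v - pmin) * (hi - lo) / span for v in p]
    return sum(v * w for v, w in zip(r, wts))

# At the patch center all four bilinear weights are 0.25.
print(rescaled_bilinear_pixel([0, 100, 100, 200], [0.25] * 4))
```

Note that rescaling toward the local mean compresses outliers, which is one plausible mechanism behind the contrast/intensity effects the study measures.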

Journal ArticleDOI
TL;DR: This paper proposes an isophote-constrained AR (ICAR) model to perform AR-flavored interpolation within an identified joint stable region and further develop an AR interpolation with an adaptive window extension and introduces a weighted ICAR model.
Abstract: The autoregressive (AR) model is widely used in image interpolations. Traditional AR models consider utilizing the dependence between pixels to model the image signal. However, they ignore the valuable patch-level information for image modeling. In this paper, we propose to integrate both the pixel-level and patch-level information to depict the relationship between high-resolution and low-resolution pixels and obtain better image interpolation results. In particular, we propose an isophote-constrained AR (ICAR) model to perform AR-flavored interpolation within an identified joint stable region and further develop an AR interpolation with an adaptive window extension. Considering the smoothness along the isophote curve, the ICAR model searches only several successive similar patches along the isophote curve over a large region to construct an adaptive window. These overlapped patches, representing the patch-level structure similarity, are used to construct a joint AR model. To better characterize the piecewise stationarity and determine whether a pixel is suitable for AR estimation, we further propose pixel-level and patch-level similarity metrics and embed them into the ICAR model, introducing a weighted ICAR model. Comprehensive experiments demonstrate that our method can effectively reconstruct the edge structures and suppress jaggy or ringing artifacts. In the objective quality evaluation, our method achieves the best results in terms of both peak signal-to-noise ratio and structural similarity for both simple size doubling (two times) and for arbitrary scale enlargements.

Posted Content
TL;DR: In this paper, an end-to-end deep learning codec is proposed for video compression, which is based on repeated image interpolation; it outperforms H.261 and MPEG-4 Part 2 and performs on par with H.264.
Abstract: An ever increasing amount of our digital communication, media consumption, and content creation revolves around videos. We share, watch, and archive many aspects of our lives through them, all of which are powered by strong video compression. Traditional video compression is laboriously hand designed and hand optimized. This paper presents an alternative in an end-to-end deep learning codec. Our codec builds on one simple idea: Video compression is repeated image interpolation. It thus benefits from recent advances in deep image interpolation and generation. Our deep video codec outperforms today's prevailing codecs, such as H.261, MPEG-4 Part 2, and performs on par with H.264.

Journal ArticleDOI
TL;DR: It is observed that the proposed scheme provides better performance than other existing data hiding schemes in terms of data embedding capacity, visual quality and security.
Abstract: In this paper, a new reversible data hiding scheme has been proposed using Lagrange's interpolating polynomial on interpolated sub-sampled images. First, we generate sub-sampled images from the original image and enlarge them using image interpolation. We then convert the secret message using a Lagrange interpolating polynomial to generate a new secret message. The new secret message is divided and stored within interleaved pixels of each interpolated sub-sampled image. At the receiver end, the new secret message is extracted from the interleaved pixels of each sub-sampled stego image, and Lagrange interpolation is then applied to recover the original secret message. Security is enhanced by the distributive nature of the hidden data across multiple images. The original pixels are not affected during data embedding, which assures reversibility. The proposed scheme provides average embedding capacity with good visual quality, measured by a peak signal-to-noise ratio (PSNR) greater than 50 dB. It is observed that the proposed scheme provides better performance than other existing data hiding schemes in terms of data embedding capacity, visual quality, and security. We have analyzed our stego images through RS analysis and calculated the relative entropy, standard deviation, and correlation coefficient of the original and stego images to show robustness under various steganographic attacks.
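The recovery step rests on ordinary Lagrange interpolation. A minimal sketch that reconstructs a polynomial's value at zero from sample points follows; how the paper encodes the message into polynomial points and distributes them across sub-sampled images is its own detail, not shown here:

```python
def lagrange_at_zero(points):
    """Reconstruct P(0) from sample points (x_i, y_i) by Lagrange interpolation.

    With the secret stored as the constant term of a degree-(k-1)
    polynomial, any k points recover it: P(0) = sum_i y_i * prod_{j!=i}
    (0 - x_j) / (x_i - x_j). Distributing points across images is what
    gives the scheme its distributive security."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        li = 1.0
        for j, (xj, _) in enumerate(points):
            if i != j:
                li *= (0 - xj) / (xi - xj)
        total += yi * li
    return total

# P(x) = 3 + 2x sampled at x = 1 and x = 2; the constant term 3 is the secret.
print(lagrange_at_zero([(1, 5), (2, 7)]))
```

A production scheme would work over a finite field rather than floats to avoid rounding, exactly as in Shamir-style secret sharing.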

Journal ArticleDOI
Guowen Chen1, Cong Ma1, Zhencheng Fan1, Xiwen Cui1, Hongen Liao1 
TL;DR: To the best of the knowledge, the proposed LBR method outperforms state-of-the-art CGIP algorithms relative to rendering speed and image quality with the authors' super-multiview hardware configurations.
Abstract: We propose a computer generated integral photography (CGIP) method that employs a lens based rendering (LBR) algorithm for super-multiview displays to achieve higher frame rates and better image quality without pixel resampling or view interpolation. The algorithm can utilize both fixed and programmable graphics pipelines to accelerate CGIP rendering and inter-perspective antialiasing. Two hardware prototypes were fabricated with two high-resolution liquid crystal displays and micro-lens arrays (MLA). Qualitative and quantitative experiments were performed to evaluate the feasibility of the proposed algorithm. To the best of our knowledge, the proposed LBR method outperforms state-of-the-art CGIP algorithms relative to rendering speed and image quality with our super-multiview hardware configurations. A demonstration experiment was also conducted to reveal the interactivity of a super-multiview display utilizing the proposed algorithm.

Proceedings ArticleDOI
01 Feb 2018
TL;DR: This work focuses on the problem of detecting packaged food products in indoor refrigerator environments and analyzes the impact of various data augmentation strategies on the overall accuracy of the object detection and increases the overall mean average precision (mAP).
Abstract: Training of deep Convolutional Neural Networks (CNNs) for object detection tasks requires a huge amount of annotated data which is expensive, difficult and time-consuming to produce. This requirement can be fulfilled by automating the process of dataset generation. We utilize the approach of training deep CNNs using completely synthetically rendered data, with the focus of improving the overall transfer learning performance through online and offline data augmentation techniques. We focus on the problem of detecting packaged food products in indoor refrigerator environments. We analyze the impact of various data augmentation strategies, such as randomized cropping, pixel shifting, image scaling, image rotation, oversaturation, Gaussian blurring, noise addition, and color inversion, on the overall accuracy of the object detection and increase the overall mean average precision (mAP). It is found that a combination of data augmentation techniques performs best, with the highest mAP of 20.54 obtained with combinations of linear augmentation techniques, namely scaling with shifting and scaling with rotation.

Proceedings ArticleDOI
02 Nov 2018
TL;DR: A novel image interpolation algorithm that uses a preliminary pixel kernel and extrapolated pixel adjustment is proposed; it demonstrated generally higher performance than the state-of-the-art algorithms mentioned in objective evaluations, and comparable performance in subjective evaluations.
Abstract: Unlike traditional linear interpolation algorithms, which compute all kernel pixel locations, a novel image interpolation algorithm that uses a preliminary pixel kernel and extrapolated pixel adjustment is proposed for interpolation operations. The proposed algorithm is mainly based on the weighting functions of the preliminary interpolation kernel and linearly extrapolated pixel adjustments. Experimentally, the proposed method demonstrated generally higher performance than the state-of-the-art algorithms mentioned in objective evaluations, as well as comparable performance in subjective evaluations. Potential applications may include ultrasound scan conversion for displaying the sectored image.

Journal Article
TL;DR: An image steganography method for the nearest-neighbor interpolation method in image scaling is proposed, which can resist both scaling attack and statistical detection.
Abstract: Current image steganography algorithms mainly focus on anti-detectability rather than robustness to scaling attacks; it is therefore difficult to extract the secret messages correctly after stego images are subjected to scaling attacks. To this end, an image steganography method targeting nearest-neighbor interpolation in image scaling is proposed, which can resist both scaling attacks and statistical detection. First, the principle of nearest-neighbor interpolation is analyzed and summarized: the resized pixel's coordinates are used to find the adjacent original pixels, the weights of those pixels are obtained from their distances to the resized pixel, and the resized pixel value is then calculated. Second, the scaling-invariant pixels are extracted using this principle to generate a new cover image. Then the distortion functions in WOW, S-UNIWARD, and MiPOD are used to calculate the new cover's distortion for minimum-distortion embedding using STC coding. Finally, the stego image is resized to the original size. Steganalysis experiments based on the BossBase-1.01 image library and the SPAM and maxSRM features demonstrate that the proposed method has good resistance to scaling attacks and statistical detection under various scaling factors and payloads.
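The "scaling invariant pixels" idea can be illustrated classically: list which source indices a nearest-neighbor resize actually samples, then intersect over the scaling factors of interest. The half-pixel mapping convention below is an assumption (implementations differ):

```python
def nn_source_indices(src_len, dst_len):
    """Source indices that a 1-D nearest-neighbor resize samples.

    Uses the common half-pixel-center mapping; other libraries may round
    differently, so treat this convention as an assumption."""
    return [min(int((d + 0.5) * src_len / dst_len), src_len - 1)
            for d in range(dst_len)]

def scaling_invariant(src_len, factors):
    """Source indices sampled by nearest-neighbor under *every* factor.

    Pixels in this set survive all the considered resizes, so data hidden
    in them (the method's new cover) can still be read after a scaling
    attack with any of those factors."""
    keep = set(range(src_len))
    for f in factors:
        keep &= set(nn_source_indices(src_len, int(src_len * f)))
    return sorted(keep)

# Halving a 4-pixel row keeps only indices 1 and 3, so those are the
# invariant positions under factors {1.0, 0.5}.
print(scaling_invariant(4, [1.0, 0.5]))
```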

Posted Content
TL;DR: This work first design two modules to extract the intrinsic image structures from the data-driven and knowledge-based perspectives, respectively, and introduces a collaborative methodology to cascade these modules, which can strictly prove the convergence of image propagations to a deblurring-related optimal solution.
Abstract: Blind image deblurring plays a very important role in many vision and multimedia applications. Most existing works tend to introduce complex priors to estimate the sharp image structures for blur kernel estimation. However, it has been verified that directly optimizing these models is challenging and prone to degenerate solutions. Although several experience-based heuristic inference strategies, including trained networks and designed iterations, have been developed, it is still hard to obtain theoretically guaranteed accurate solutions. In this work, a collaborative learning framework is established to address these issues. Specifically, we first design two modules, named Generator and Corrector, to extract the intrinsic image structures from the data-driven and knowledge-based perspectives, respectively. By introducing a collaborative methodology to cascade these modules, we can strictly prove the convergence of our image propagations to a deblurring-related optimal solution. As a nontrivial byproduct, we also apply the proposed method to other related tasks, such as image interpolation and edge-preserved smoothing. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches on both synthetic and real datasets.

Proceedings ArticleDOI
07 Nov 2018
TL;DR: Preliminary experiments demonstrated promising effects of the AAI metric against state-of-the-art non-reference metrics mentioned, and a new study may further develop the studied metric for potential applications in image quality adaptation and/or monitoring in medical imaging.
Abstract: A preliminary study of a non-reference aliasing artefact index (AAI) metric is presented in this paper. We focus on the effect of combining a full-reference metric with an interpolation algorithm. The nearest neighbor algorithm (NN) is used as the gold standard against which test algorithms are judged in terms of aliased structures. The structural similarity index (SSIM) metric is used to compare a test image (i.e. a test algorithm's output) with a reference image (i.e. the NN's output). Preliminary experiments demonstrated promising performance of the AAI metric against state-of-the-art non-reference metrics. A further study may develop the metric for potential applications in image quality adaptation and/or monitoring in medical imaging.
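The core comparison (SSIM between a test algorithm's output and the NN gold standard) can be sketched as below. Note two assumptions of ours, not the paper's: this is a single-window SSIM over flattened pixel sequences rather than the windowed SSIM used in practice, and reading the aliasing index as `1 - SSIM` is a hypothetical simplification:

```python
def ssim_global(x, y, peak=255):
    """Single-window SSIM between two equal-length pixel sequences
    (standard SSIM averages over local sliding windows; this simplifies)."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

def aliasing_index(test_img, nn_img):
    """Hypothetical AAI reading: dissimilarity from the NN gold standard."""
    return 1.0 - ssim_global(test_img, nn_img)
```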

Proceedings ArticleDOI
01 Dec 2018
TL;DR: A general survey of the available multi-frame super-resolution approaches is presented, and several image quality metrics are discussed for measuring the similarity between the reconstructed image and the original image.
Abstract: One of the primary measures of image quality is image resolution. High-resolution images are often required and desired for most applications, as they embody supplementary information. However, increasing image pixel density through better image sensors and optical technologies is usually restrictive and expensive. Therefore, the effective use of image processing techniques to generate a high-resolution image from low-resolution images is an inexpensive and powerful alternative. This kind of image improvement is called image super-resolution. This paper investigates the current super-resolution approaches adopted to generate a high-resolution image and highlights the strengths and limitations of these approaches. In addition, several image quality metrics are discussed for measuring the similarity between the reconstructed image and the original image.
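Of the quality metrics such surveys discuss, peak signal-to-noise ratio (PSNR) is the simplest: it is derived directly from the mean squared error between the reconstruction and the original. A minimal sketch over flattened pixel sequences:

```python
import math

def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio (dB) between two flattened images."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float('inf')  # identical images: unbounded PSNR
    return 10 * math.log10(peak * peak / mse)
```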

Journal ArticleDOI
TL;DR: The proposed image interpolation technique is found to be a valuable scheme for problems related to digital image interpolation.
Abstract: A cubic trigonometric B-spline representation with two parameters is constructed in this work. A soft computing technique, the genetic algorithm, is used to find the optimal values of the parameters in the B-spline description so that the sum of squared errors is minimised. The newly constructed B-spline is then utilised to interpolate two-dimensional digital images. The image quality metrics peak signal-to-noise ratio, structural SIMilarity index, multi-scale structural SIMilarity index and feature SIMilarity index are used to investigate the quality of the interpolated digital images. Comparison with existing image interpolation schemes shows that the proposed technique is a valuable scheme for problems related to digital image interpolation.
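The paper's use of a genetic algorithm to tune the two spline parameters can be illustrated with a toy real-coded GA. The operators below (truncation selection, blend crossover, Gaussian mutation, two-elite survival) and all names are our own illustrative choices, not the paper's configuration; the loss would in practice be the sum of squared interpolation errors:

```python
import random

def ga_minimise(loss, bounds, pop_size=30, generations=60, seed=0):
    """Toy real-coded genetic algorithm minimising `loss` over `bounds`."""
    rng = random.Random(seed)
    popn = [[rng.uniform(lo, hi) for lo, hi in bounds]
            for _ in range(pop_size)]
    for _ in range(generations):
        popn.sort(key=loss)
        nxt = popn[:2]                            # elitism: keep the two best
        while len(nxt) < pop_size:
            pa, pb = rng.sample(popn[:10], 2)     # parents from the fittest
            child = [(a + b) / 2 + rng.gauss(0, 0.1) for a, b in zip(pa, pb)]
            child = [min(max(v, lo), hi)          # clamp to the search bounds
                     for v, (lo, hi) in zip(child, bounds)]
            nxt.append(child)
        popn = nxt
    return min(popn, key=loss)
```

For example, with a stand-in quadratic loss whose optimum is at (1, -2), the GA converges to within mutation noise of that point.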

Proceedings ArticleDOI
15 Oct 2018
TL;DR: Zhang et al. propose a collaborative learning framework for deblurring that first designs two modules, named Generator and Corrector, to extract the intrinsic image structures from the data-driven and knowledge-based perspectives, respectively.
Abstract: Blind image deblurring plays a very important role in many vision and multimedia applications. Most existing works tend to introduce complex priors to estimate the sharp image structures for blur kernel estimation. However, it has been verified that directly optimizing these models is challenging and prone to degenerate solutions. Although several experience-based heuristic inference strategies, including trained networks and designed iterations, have been developed, it is still hard to obtain theoretically guaranteed accurate solutions. In this work, a collaborative learning framework is established to address these issues. Specifically, we first design two modules, named Generator and Corrector, to extract the intrinsic image structures from the data-driven and knowledge-based perspectives, respectively. By introducing a collaborative methodology to cascade these modules, we can strictly prove the convergence of our image propagations to a deblurring-related optimal solution. As a nontrivial byproduct, we also apply the proposed method to other related tasks, such as image interpolation and edge-preserving smoothing. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches on both synthetic and real datasets.

Journal ArticleDOI
Fengwei An1, Xiangyu Zhang1, Aiwen Luo1, Lei Chen1, Hans Jurgen Mattausch1 
TL;DR: A dual-feature-based object recognition coprocessor that exploits both histogram of oriented gradient and Haar-like descriptors with a cell-based parallel sliding-window recognition mechanism enables a hardware-friendly implementation of the binary classification for pedestrian detection with improved accuracy.
Abstract: Many computer-vision and machine-learning applications in robotics, mobile, wearable devices, and automotive domains are constrained by their real-time performance requirements. This paper reports a dual-feature-based object recognition coprocessor that exploits both histogram of oriented gradient (HOG) and Haar-like descriptors with a cell-based parallel sliding-window recognition mechanism. The feature extraction circuitry for the HOG and Haar-like descriptors is implemented by a pixel-based pipelined architecture, which synchronizes to the pixel frequency of the image sensor. After extracting each cell feature vector, a cell-based sliding-window scheme enables parallelized recognition for all windows that contain this cell. A nearest-neighbor-search classifier is applied separately to the HOG and Haar-like feature spaces. The complementary aspects of the two feature domains enable a hardware-friendly implementation of binary classification for pedestrian detection with improved accuracy. A proof-of-concept prototype chip fabricated in a 65-nm SOI CMOS process, having thin gate-oxide and buried-oxide layers (SOTB CMOS), with a 3.22-mm2 core area achieves an energy efficiency of 1.52 nJ/pixel and a processing speed of 30 fps for $1024\times 1616$ -pixel image frames at a 200-MHz recognition working frequency and 1-V supply voltage. Furthermore, multiple chips can implement image scaling, since the designed chip has image-size flexibility attributable to the pixel-based architecture.
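The nearest-neighbor-search classification step the coprocessor implements in hardware can be sketched in software as plain 1-NN over feature vectors. The feature values and labels below are made-up stand-ins for the chip's HOG/Haar-like descriptors:

```python
def nn_classify(query, train):
    """1-nearest-neighbour: return the label of the training feature vector
    closest to the query under squared Euclidean distance."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(train, key=lambda pair: dist2(query, pair[0]))[1]
```

Here `train` is a list of `(feature_vector, label)` pairs; the hardware's advantage is evaluating this search in parallel across all sliding windows that share a cell.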

Book
02 Jul 2018
TL;DR: Introduction -- Linear PDE-based Image Denoising Schemes -- Nonlinear Diffusion-based image Restoration Models -- Variational and PDE Models for Image Interpolation -- Conclusions.
Abstract: Introduction -- Linear PDE-based Image Denoising Schemes -- Nonlinear Diffusion-based Image Restoration Models -- Variational and PDE Models for Image Interpolation -- Conclusions.