
Showing papers on "Image gradient published in 2018"


Journal ArticleDOI
TL;DR: This work introduces an effective technique to enhance underwater images degraded by medium scattering and absorption, building on the blending of two images directly derived from a color-compensated and white-balanced version of the original degraded image.
Abstract: We introduce an effective technique to enhance images captured underwater and degraded by medium scattering and absorption. Our method is a single-image approach that requires neither specialized hardware nor knowledge about the underwater conditions or scene structure. It builds on the blending of two images that are directly derived from a color-compensated and white-balanced version of the original degraded image. The two images to be fused, as well as their associated weight maps, are defined to promote the transfer of edges and color contrast to the output image. To prevent sharp weight-map transitions from creating artifacts in the low-frequency components of the reconstructed image, we also adopt a multiscale fusion strategy. Our extensive qualitative and quantitative evaluation reveals that our enhanced images and videos are characterized by better exposedness of the dark regions, improved global contrast, and edge sharpness. Our validation also shows that our algorithm is reasonably independent of the camera settings and improves the accuracy of several image processing applications, such as image segmentation and keypoint matching.
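To make the fusion step concrete, here is a minimal sketch (not the authors' code) of two-input multiscale fusion: each input's Laplacian pyramid is blended with the Gaussian pyramid of its weight map, so that sharp weight-map transitions cannot inject artifacts into the low-frequency bands. Pyramid depth and smoothing sigma are illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def gaussian_pyramid(img, levels, sigma=1.0):
    pyr = [img]
    for _ in range(levels - 1):
        img = gaussian_filter(img, sigma)[::2, ::2]  # smooth, then decimate
        pyr.append(img)
    return pyr

def laplacian_pyramid(img, levels, sigma=1.0):
    gp = gaussian_pyramid(img, levels, sigma)
    lp = []
    for i in range(levels - 1):
        factors = np.array(gp[i].shape) / np.array(gp[i + 1].shape)
        lp.append(gp[i] - zoom(gp[i + 1], factors, order=1))  # band-pass detail
    lp.append(gp[-1])  # coarsest residual
    return lp

def multiscale_fusion(inputs, weights, levels=5):
    wsum = np.sum(weights, axis=0) + 1e-12
    weights = [w / wsum for w in weights]  # normalize weight maps per pixel
    fused = None
    for img, w in zip(inputs, weights):
        blended = [l * g for l, g in
                   zip(laplacian_pyramid(img, levels), gaussian_pyramid(w, levels))]
        fused = blended if fused is None else [f + b for f, b in zip(fused, blended)]
    out = fused[-1]
    for lvl in reversed(fused[:-1]):  # collapse the fused pyramid coarse-to-fine
        out = zoom(out, np.array(lvl.shape) / np.array(out.shape), order=1) + lvl
    return out
```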

601 citations


Journal ArticleDOI
TL;DR: The results show that the proposed ℓ0TDL method outperforms other competing methods, such as total variation (TV) minimization, TV with low rank (TV+LR), and TDL methods.

110 citations


Journal ArticleDOI
TL;DR: An energy-efficient architecture of the Canny edge detector for advanced mobile vision applications is presented, exploiting the rank characteristic of the convolution kernels of the Gaussian smoothing and Sobel gradient filters to reduce the number of additions and multiplications.
Abstract: In this paper, we present an energy-efficient architecture of the Canny edge detector for advanced mobile vision applications. Three key techniques for reducing the computational complexity of the Canny edge detector are presented. First, by exploiting the rank characteristic of the convolution kernels of the Gaussian smoothing and Sobel gradient filters, common computations are identified and shared in the image filter design to reduce the number of additions and multiplications. For the gradient magnitude/direction computation, only three directions of neighboring pixels are considered to reduce computation energy with minor degradation in conformance performance (CP). For the adaptive threshold selection, an interesting observation is that the mean values of gradient magnitudes show small variations depending on the classified block types. Thus, the threshold selection process can be simplified to multiplying the mean value of the local block by pre-decided constants. The proposed low-complexity Canny edge detector has been implemented using both field-programmable gate arrays (FPGAs) and a 65-nm standard-cell library. The FPGA implementation with Xilinx Virtex-5 (XC5VSX240T) shows that our edge detector achieves 48% area savings and 73% execution-time savings over the conventional architecture without seriously sacrificing detection performance. The proposed edge detector implemented with the 65-nm standard-cell library can easily support real-time ultrahigh-definition video processing (50 frames/s) with a power consumption of 5.48 mW (108.84 µJ/frame).
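As a hedged sketch of the simplified threshold selection described above: the high and low Canny thresholds for each block are obtained by multiplying the block's mean gradient magnitude by pre-decided constants. The constants and block size below are illustrative, not the paper's values.

```python
import numpy as np
from scipy.ndimage import sobel

def block_adaptive_thresholds(gray, block=64, c_hi=2.5, c_lo=1.0):
    gx = sobel(gray, axis=1)
    gy = sobel(gray, axis=0)
    mag = np.hypot(gx, gy)  # gradient magnitude
    thresholds = {}
    for r in range(0, mag.shape[0], block):
        for c in range(0, mag.shape[1], block):
            m = mag[r:r + block, c:c + block].mean()
            thresholds[(r, c)] = (c_hi * m, c_lo * m)  # (high, low) per block
    return thresholds
```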

66 citations


Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed a gradient direction-based hierarchical adaptive sparse and low-rank (GD-HASLR) model to solve the real-world occluded face recognition problem.

45 citations


Journal ArticleDOI
TL;DR: This paper presents a joint trilateral filtering (JTF) algorithm for depth image SR that integrates local gradient information of the depth image, which allows the prediction and refinement of HR depth image outputs without artifacts like textural copies or edge discontinuities.
Abstract: Compared to color images, the associated depth images captured by RGB-D sensors typically have lower resolution. The task of depth map super-resolution (SR) aims at increasing the resolution of the range data by utilizing the high-resolution (HR) color image, while properly preserving the details of the depth information. In this paper, we present a joint trilateral filtering (JTF) algorithm for depth image SR. The proposed JTF first observes context information from the HR color image. In addition to the extracted spatial and range information of local pixels, our JTF further integrates local gradient information of the depth image, which allows the prediction and refinement of HR depth image outputs without artifacts like texture copying or edge discontinuities. Quantitative and qualitative experimental results demonstrate the effectiveness and robustness of our approach over prior depth map upsampling works.
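A minimal sketch, under our own assumptions, of the joint trilateral weighting the abstract describes: each neighbor's weight combines a spatial Gaussian, a range Gaussian on the guiding HR color image (treated here as single-channel for brevity), and a third Gaussian on the local depth gradient. All parameter values are illustrative.

```python
import numpy as np

def joint_trilateral_filter(depth, guide, radius=3,
                            sigma_s=2.0, sigma_r=0.1, sigma_g=0.05):
    gy, gx = np.gradient(depth)
    grad = np.hypot(gx, gy)  # local depth-gradient magnitude
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))
    out = depth.copy()
    H, W = depth.shape
    for i in range(radius, H - radius):
        for j in range(radius, W - radius):
            sl = np.s_[i - radius:i + radius + 1, j - radius:j + radius + 1]
            w = (spatial
                 * np.exp(-(guide[sl] - guide[i, j])**2 / (2 * sigma_r**2))
                 * np.exp(-(grad[sl] - grad[i, j])**2 / (2 * sigma_g**2)))
            out[i, j] = (w * depth[sl]).sum() / w.sum()
    return out
```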

44 citations


Journal ArticleDOI
TL;DR: In this article, a new measure called normalized total gradient is proposed for multispectral image registration, which is based on the assumption that the gradient of the difference between two aligned band images is sparser than that between two misaligned ones.
Abstract: Image registration is a fundamental issue in multispectral image processing, and is challenged by two main characteristics of multispectral images. First, the regional intensities can be essentially different between band images. Second, the local contrasts of two different band images can be inconsistent or even reversed. Conventional measures can align images with different regional intensity levels, but may fail under severe local intensity variation. In this paper, a new measure called the normalized total gradient is proposed for multispectral image registration. The measure is based on the key assumption (observation) that the gradient of the difference between two aligned band images is sparser than that between two misaligned ones. A registration framework, which incorporates an image pyramid and global/local optimization, is further introduced for the affine transform. Experimental results validate that the proposed method is not only effective for multispectral image registration, but also applicable to general unimodal/multimodal image registration tasks. It performs better than, or comparably to, the existing methods, both quantitatively and qualitatively.
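The following short sketch captures our reading of the normalized total gradient: the total gradient of the difference image, normalized by the total gradients of the two inputs, so that the measure shrinks as the images come into alignment. The exact normalization is inferred from the abstract, not copied from the paper.

```python
import numpy as np

def normalized_total_gradient(a, b, eps=1e-12):
    def total_gradient(img):
        gy, gx = np.gradient(img)
        return np.abs(gx).sum() + np.abs(gy).sum()
    # Small when a and b are aligned (sparse difference gradient).
    return total_gradient(a - b) / (total_gradient(a) + total_gradient(b) + eps)
```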

42 citations


Journal ArticleDOI
29 Jan 2018-PLOS ONE
TL;DR: A novel two-stage image segmentation method using an edge-scaled energy functional based on local and global information for intensity-inhomogeneous image segmentation, which not only increases accuracy but also eliminates the initial-contour problem of traditional local segmentation methods.
Abstract: This paper presents a novel two-stage image segmentation method using an edge-scaled energy functional based on local and global information for intensity-inhomogeneous image segmentation. In the first stage, we integrate a global intensity term with a geodesic edge term, which produces a preliminary rough segmentation result. Thereafter, taking the final contour of the first stage as the initial contour, the second stage integrates a local intensity term with the geodesic edge term to obtain the final segmentation result. Due to the suitable initialization from the first stage, the second stage precisely achieves the desired segmentation for inhomogeneous images. The two-stage technique not only increases accuracy but also eliminates the initial-contour problem that exists in traditional local segmentation methods. The energy function of the proposed method combines the global and local terms with a compact geodesic edge term in an additive fashion, using image gradient information to delineate obscured object boundaries inside an image. A Gaussian kernel is adopted to regularize the level set function and to avoid an expensive re-initialization. The experiments were carried out on synthetic and real images. Quantitative validations were performed on the Multimodal Brain Tumor Image Segmentation Benchmark (BRATS) 2015 and the PH2 skin lesion database. The visual and quantitative comparisons demonstrate the efficiency of the proposed method.
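Schematically, the additive energy described above can be written as a region term plus a geodesic edge term; the notation below is ours, with the standard gradient-based edge indicator g, and is only meant to illustrate how the image gradient enters the functional.

```latex
E(\phi) = E_{\text{region}}(\phi)
        + \lambda \int_\Omega g\bigl(|\nabla I|\bigr)\,\delta(\phi)\,|\nabla\phi|\,dx,
\qquad
g\bigl(|\nabla I|\bigr) = \frac{1}{1 + |\nabla (G_\sigma * I)|^{2}}
```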

39 citations



Journal ArticleDOI
TL;DR: This paper proposes a single-image super-resolution scheme that introduces a gradient field sharpening transform, converting the blurry gradient field of an upsampled low-resolution (LR) image into a much sharper gradient field; the transform is fast and suitable for low-complexity applications.
Abstract: This paper proposes a single-image super-resolution scheme by introducing a gradient field sharpening transform that converts the blurry gradient field of an upsampled low-resolution (LR) image into the much sharper gradient field of the original high-resolution (HR) image. Different from existing methods that need to figure out the whole gradient profile structure and locate the edge points, we derive a new approach that sharpens the gradient field adaptively based only on the pixels in a small neighborhood. To maintain image contrast, the image gradient is adaptively scaled to keep the integral of the gradient field stable. Finally, the HR image is reconstructed by fusing the LR image with the sharpened HR gradient field. Experimental results demonstrate that the proposed algorithm can generate a more accurate gradient field and produce super-resolved images with better objective and visual quality. Another advantage is that the proposed gradient sharpening transform is very fast and suitable for low-complexity applications.
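A minimal sketch, under our own assumptions, of the contrast-preservation step: after an (illustrative, power-law) sharpening of the gradient magnitudes, the field is globally rescaled so that its integral, here the sum of magnitudes, matches that of the original field.

```python
import numpy as np

def sharpen_gradient_field(gx, gy, p=1.5, eps=1e-12):
    mag = np.hypot(gx, gy)
    sharp = mag ** p                              # illustrative sharpening transform
    scale = mag.sum() / (sharp.sum() + eps)       # keep the gradient integral stable
    ratio = scale * sharp / (mag + eps)
    return gx * ratio, gy * ratio
```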

34 citations


Journal ArticleDOI
TL;DR: In this article, generalized Bregman distances and infimal convolutions with regard to the well-known total variation functional are used for PET-MRI joint reconstruction, which is superior to the respective separate reconstructions and other joint reconstruction methods.
Abstract: Joint reconstruction has recently attracted a lot of attention, especially in the field of medical multi-modality imaging such as PET-MRI. Most of the developed methods rely on the comparison of image gradients, or more precisely their location, direction and magnitude, to make use of structural similarities between the images. A challenge, and still an open issue for most of the methods, is to handle images at entirely different scales, i.e. different magnitudes of gradients that cannot be dealt with by a global scaling of the data. We propose the use of generalized Bregman distances, and infimal convolutions thereof, with regard to the well-known total variation functional. Using a total variation subgradient, or rather the involved vector field, instead of an image gradient naturally excludes the magnitudes of gradients, which in particular solves the scaling problem. Additionally, the presented method features a weighting that allows control of the amount of interaction between channels. We give insights into the general behavior of the method before tailoring it to a particular application, namely PET-MRI joint reconstruction. To do so, we compute joint reconstruction results from blurry Poisson data for PET and undersampled Fourier data for MRI, and show that we can gain a mutual benefit for both modalities. In particular, the results are superior to the respective separate reconstructions and to other joint reconstruction methods.
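For reference, the generalized Bregman distance underlying the coupling term has the standard form below (our transcription, not the paper's full model): for the total variation functional J and a subgradient p of J at v,

```latex
D_J^{p}(u, v) = J(u) - J(v) - \langle p,\, u - v \rangle,
\qquad p \in \partial J(v).
```

Since p encodes only the direction of the vector field in the TV subdifferential, coupling the channels through such distances is insensitive to gradient magnitudes, which is exactly the scaling property the method exploits.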

33 citations


Journal ArticleDOI
TL;DR: In this article, a computer vision-based thresholding method was proposed to detect cracks in an iron pipe with two internal cracks and surface abrasion using a thermal camera and two flash lamps as stimulus.
Abstract: Visual inspection and assessment of the condition of metal structures are essential for safety. Pulse thermography produces visible infrared images, which have been widely applied to detect and characterize defects in structures and materials. When active thermography, a non-destructive testing tool, is applied, the need for considerable manual checking can be avoided. However, detecting an internal crack with active thermography remains difficult, since it is usually invisible in the collected sequence of infrared images, which makes the automatic detection of internal cracks even harder. In addition, the detection of an internal crack can be hindered by a complicated inspection environment. With the purpose of putting forward a robust and automatic visual inspection method, a computer vision-based thresholding method is proposed. In this paper, the image signals are a sequence of infrared images collected from an experimental setup with a thermal camera and two flash lamps as stimulus. The contrast of the pixels in each frame is enhanced by the Canny operator and then reconstructed by a triple-threshold system. Two features, the mean value in the time domain and the maximal amplitude in the frequency domain, are extracted from the reconstructed signal to help distinguish crack pixels from others. Finally, a binary image indicating the location of the internal crack is generated by a K-means clustering method. The proposed procedure has been applied to an iron pipe containing two internal cracks and surface abrasion, and offers several improvements over existing computer vision-based automatic crack detection methods. In the future, the proposed method can be applied to automatically detect internal cracks from large numbers of infrared images in industrial settings.
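A hedged sketch of the final clustering step: each pixel is described by the mean of its reconstructed signal over time and by the maximal amplitude of its spectrum, and a two-class K-means separates crack pixels from the rest. The feature choices follow the abstract; everything else (normalization, which cluster is "crack") is illustrative.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def crack_mask(signal):
    """signal: (T, H, W) stack of reconstructed infrared frames."""
    _, H, W = signal.shape
    mean_t = signal.mean(axis=0)                            # time-domain mean
    amp = np.abs(np.fft.rfft(signal, axis=0)).max(axis=0)   # max spectral amplitude
    feats = np.stack([mean_t.ravel(), amp.ravel()], axis=1)
    feats = (feats - feats.mean(0)) / (feats.std(0) + 1e-12)
    _, labels = kmeans2(feats, 2, minit='++', seed=0)
    # Cluster labels are arbitrary; here we take the smaller cluster as cracks.
    crack_label = np.argmin(np.bincount(labels))
    return (labels == crack_label).reshape(H, W)
```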

Journal ArticleDOI
TL;DR: Simulation results indicate that the proposed PSO optimized contrast enhancement improves overall image contrast and enriches the information present in the image.
Abstract: In this paper, we propose an optimized contrast enhancement algorithm for color images that improves the visual perception of information. As color is an important cue in many application areas, to prevent unwanted color artifacts our proposed method translates the color image into the de-correlated lαβ color space, based on the statistics of cone responses to natural images. A color is defined in the lαβ space by an achromatic channel (brightness l), the red-green chrominance channel (α) and the yellow-blue chrominance channel (β). In order to avoid oversaturation and annoying artifacts, our method is applied to the luminance component of the image, and α and β are kept constant. The key contribution of this paper is the use of an adaptive gamma correction factor chosen by particle swarm optimization (PSO) to improve the entropy and enhance the details of the image. Gamma correction is a well-established technique that preserves the mean brightness of an image and produces more natural-looking images when an optimal gamma factor is chosen. In the proposed method, the edge content and entropy are used as the objective function for each particle, since a color image with good visual contrast includes many strong edges. Since edges play a primary role in image understanding, one good way to enhance the contrast is to enhance the edges. Simulation results indicate that the proposed PSO-optimized contrast enhancement improves overall image contrast and enriches the information present in the image. The proposed method is suitable for many real-time image processing applications.
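A minimal sketch of the per-particle fitness implied above: the entropy of the gamma-corrected luminance plus its edge content (mean gradient magnitude). The equal weighting of the two terms, and the PSO loop that searches over gamma, are omitted or illustrative.

```python
import numpy as np

def gamma_fitness(lum, gamma, bins=256):
    """lum: luminance channel scaled to [0, 1]; returns a score to maximize."""
    lum = np.clip(lum, 0.0, 1.0) ** gamma
    hist, _ = np.histogram(lum, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()
    gy, gx = np.gradient(lum)
    edge_content = np.hypot(gx, gy).mean()
    return entropy + edge_content   # illustrative equal weighting
```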

Journal ArticleDOI
Shifei Ding, Xingyu Zhao, Hui Xu, Qiangbo Zhu, Yu Xue
TL;DR: The algorithm presented in this study outperforms other multi-scale decomposition-based image fusion methods and other improved NSCT-PCNN algorithms in terms of objective criteria and visual appearance.
Abstract: The pulse coupled neural network (PCNN) is widely used in image processing because of its unique biological characteristics, which make it well suited to image fusion. When combined with the non-subsampled contourlet transform (NSCT) model, PCNN helps overcome the difficulty of selecting coefficients for the subbands of the NSCT model. However, in the original model only the grey values of image pixels are used as input, without considering that the subjective vision of the human eye lacks sensitivity to the local factors of the image. In this study, the improved pulse coupled neural network model replaces the grey-scale value of the image with the weighted product of the image gradient strength and the local phase coherence as the model input. Finally, compared with other multi-scale decomposition-based image fusion methods and other improved NSCT-PCNN algorithms, the algorithm presented in this study outperforms them in terms of objective criteria and visual appearance.

Proceedings ArticleDOI
Joowan Kim, Younggun Cho, Ayoung Kim
21 May 2018
TL;DR: A novel metric for image information measure is introduced that can estimate the optimal exposure value for vision-based approaches with minimal cost, and an effective exposure control scheme is proposed that covers a wide range of light conditions.
Abstract: Under- and oversaturation can cause severe image degradation in many vision-based robotic applications. To control camera exposure in dynamic lighting conditions, we introduce a novel metric for image information measure. Measuring an image gradient is typical when evaluating its level of image detail. However, emphasizing more informative pixels substantially improves the measure within an image. By using this entropy-weighted image gradient, we introduce an optimal exposure value for vision-based approaches. Using this new metric, we also propose an effective exposure control scheme that covers a wide range of light conditions. When evaluating the function (e.g., an image frame grab) is expensive, the next best estimation needs to be carefully considered. Through Bayesian optimization, the algorithm can estimate the optimal exposure value with minimal cost. We validated the proposed image information measure and exposure control scheme via a series of thorough experiments using various exposure conditions.
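A hedged sketch of the image-information measure named above: gradient magnitudes weighted per pixel by an entropy-style information weight, so rarer (more informative) intensities contribute more. The exact weighting function is our illustrative choice, not the authors' formula.

```python
import numpy as np

def entropy_weighted_gradient(gray, bins=256):
    """gray: image scaled to [0, 1]; larger return value = more image information."""
    hist, _ = np.histogram(gray, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    info = -np.log2(np.clip(p, 1e-12, None))          # rarer intensity = higher weight
    idx = np.clip((gray * (bins - 1)).astype(int), 0, bins - 1)
    gy, gx = np.gradient(gray)
    return (info[idx] * np.hypot(gx, gy)).sum()
```

In an exposure control loop, one would grab frames at candidate exposures and keep the one maximizing this measure; the paper replaces an exhaustive search with Bayesian optimization because each frame grab is expensive.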

Journal ArticleDOI
Yingxuan Chen, Fang-Fang Yin, Yawei Zhang, You Zhang, Lei Ren
TL;DR: PCTV preserved edge information as well as reduced streak artifacts and noise in low-dose CBCT reconstruction, and can potentially improve localization accuracy in radiation therapy.
Abstract: Purpose: Compressed sensing reconstruction using total variation (TV) tends to over-smooth edge information by uniformly penalizing the image gradient. The goal of this study is to develop a novel prior contour based TV (PCTV) method to enhance edge information in compressed sensing reconstruction for CBCT. Methods: The edge information is extracted from the prior planning-CT via edge detection. The prior CT is first registered with the on-board CBCT reconstructed with the TV method through rigid or deformable registration. The edge contours in the prior CT are then mapped to the CBCT and used as the weight map for TV regularization to enhance edge information in the CBCT reconstruction. The PCTV method was evaluated using the extended-cardiac-torso (XCAT) phantom, a physical CatPhan phantom, and brain patient data. Results were compared with both the TV and edge-preserving TV (EPTV) methods, which are commonly used for limited-projection CBCT reconstruction. Relative error was used to quantify pixel value differences, and edge cross-correlation was defined as the similarity of edge information between reconstructed images and the ground truth in the quantitative evaluation. Results: Compared to TV and EPTV, PCTV enhanced the edge information of bone, lung vessels and tumor in the XCAT reconstruction and of complex bony structures in the brain patient CBCT. In the XCAT study using 45 half-fan CBCT projections, compared with the ground truth, relative errors were 1.5%, 0.7% and 0.3% and edge cross-correlations were 0.66, 0.72 and 0.78 for TV, EPTV and PCTV, respectively. PCTV is more robust to reduction of the projection number. Edge enhancement was reduced slightly with noisy projections, but PCTV was still superior to the other methods. PCTV can maintain resolution while reducing noise in the low-mAs CatPhan reconstruction. Low-contrast edges were preserved better with PCTV than with TV and EPTV. Conclusion: PCTV preserved edge information as well as reduced streak artifacts and noise in low-dose CBCT reconstruction. PCTV is superior to the TV and EPTV methods in edge enhancement, which can potentially improve localization accuracy in radiation therapy.
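Schematically, prior-contour-weighted TV can be written as below; this is our own notation for the idea of relaxing the TV penalty on pixels mapped from the prior-CT edge contours, not the paper's exact formulation.

```latex
\min_{x \ge 0}\ \tfrac{1}{2}\,\|Ax - b\|_2^2
  + \lambda \sum_{i} w_i\,\|(\nabla x)_i\|_2,
\qquad
w_i = \begin{cases} w_e < 1, & i \in \text{mapped prior edge set},\\ 1, & \text{otherwise,} \end{cases}
```

so edges identified in the prior CT are penalized less and survive the reconstruction.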

Proceedings ArticleDOI
TL;DR: Li et al. as mentioned in this paper proposed a perceptual similarity index (PerSIM) to model visual system characteristics and chroma similarity in the perceptually uniform color domain (Lab), which outperforms all the compared metrics in the overall databases in terms of ranking, monotonic behavior and linearity.
Abstract: An average observer perceives the world in color rather than in black and white. Moreover, the visual system focuses on structures and segments instead of individual pixels. Based on these observations, we propose a full-reference objective image quality metric that models visual system characteristics and chroma similarity in the perceptually uniform color domain (Lab). Laplacian-of-Gaussian features are obtained in the L channel to model the retinal ganglion cells in the human visual system, and color similarity is calculated over the a and b channels. In the proposed perceptual similarity index (PerSIM), a multi-resolution approach is followed to mimic the hierarchical nature of the human visual system. The LIVE and TID2013 databases are used in the validation, and PerSIM outperforms all the compared metrics on both databases overall in terms of ranking, monotonic behavior and linearity.
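A rough sketch of the two ingredients named in the abstract: Laplacian-of-Gaussian responses on the L channel and a pointwise chroma similarity on the a/b channels. The SSIM-style pooling into similarity maps is our assumption; the multi-resolution fusion into a single PerSIM score is not reproduced here.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def log_similarity(L_ref, L_test, sigma=1.5, c=1e-3):
    r = gaussian_laplace(L_ref, sigma)   # retinal-ganglion-cell-like response
    t = gaussian_laplace(L_test, sigma)
    return (2 * r * t + c) / (r**2 + t**2 + c)

def chroma_similarity(a_ref, b_ref, a_test, b_test, c=1e-3):
    num = 2 * (a_ref * a_test + b_ref * b_test) + c
    den = a_ref**2 + a_test**2 + b_ref**2 + b_test**2 + c
    return num / den
```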

Journal ArticleDOI
TL;DR: A novel adaptive rendering approach is proposed to remove Monte Carlo noise while preserving image details through a feature-based reconstruction that outperforms state-of-the-art methods in terms of visual image quality and numerical error.
Abstract: In this study, a novel adaptive rendering approach is proposed to remove Monte Carlo noise while preserving image details through a feature-based reconstruction. First, noise in the additional features is removed using a guided image filter that reduces the impact of noisy features involving strong motion blur or depth of field. The Sobel operator is then employed to recognize the geometric structures by robustly computing a gradient buffer for each feature. Given the gradient information for high-dimensional features, we compute the optimal filter parameters using a data-driven method. Finally, an error analysis is derived through a two-step smoothing strategy to produce a smooth image and guide the adaptive sampling process. Experimental results indicate that our approach outperforms state-of-the-art methods in terms of visual image quality and numerical error.

Journal ArticleDOI
TL;DR: An automatic image registration approach through line-support region segmentation and geometrical outlier removal is proposed to address the problems associated with the registration of images with affine deformations and inconsistent content, such as remote sensing images with different spectral content or noise interference, or map images with inconsistent annotations.
Abstract: The implementation of automatic image registration is still difficult in various applications. In this paper, an automatic image registration approach through line-support region segmentation and geometrical outlier removal is proposed. This new approach is designed to address the problems associated with the registration of images with affine deformations and inconsistent content, such as remote sensing images with different spectral content or noise interference, or map images with inconsistent annotations. To begin with, line-support regions, namely straight regions whose points share roughly the same image gradient angle, are extracted to address the issue of inconsistent content in images. To alleviate the incompleteness of line segments, an iterative multi-resolution strategy is employed to preserve global structures that are masked at full resolution by image details or noise. Then, geometrical outlier removal is developed to provide reliable feature point matching, based on affine-invariant geometrical classifications of corresponding matches initialized by the scale-invariant feature transform. Candidate outliers are selected by comparing the disparity of accumulated classifications among all matches, instead of relying only on local geometrical relations as conventional methods do. Various image sets are considered for the evaluation of the proposed approach, including aerial images with simulated affine deformations, remote sensing optical and synthetic aperture radar images taken under different conditions (multispectral, multisensor, and multitemporal), and map images with inconsistent annotations. Experimental results demonstrate the superior performance of the proposed method over existing approaches on the whole data set.
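A simplified, illustrative sketch of line-support region extraction as defined above: pixels are grouped into connected components when their gradient orientations fall in the same angular bin, yielding regions whose points share roughly the same gradient angle. Real detectors grow regions greedily from seeds; this binned connected-components version only illustrates the definition.

```python
import numpy as np
from scipy.ndimage import sobel, label

def line_support_regions(gray, angle_bin_deg=22.5, mag_thresh=0.05):
    gx = sobel(gray, axis=1)
    gy = sobel(gray, axis=0)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0      # orientation modulo pi
    regions = np.zeros(gray.shape, dtype=int)
    next_id = 1
    for lo in np.arange(0.0, 180.0, angle_bin_deg):
        sel = (mag > mag_thresh) & (ang >= lo) & (ang < lo + angle_bin_deg)
        lab, n = label(sel)                            # connected components per bin
        regions[sel] = lab[sel] + (next_id - 1)
        next_id += n
    return regions                                     # 0 = background
```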

Proceedings ArticleDOI
01 Oct 2018
TL;DR: This paper proposes an edge-based robust RGB-D visual odometry using 2-D edge divergence minimization in the manner of an iteratively reweighted least squares (IRLS) motion estimation, and proposes a robust edge matching criterion based on image gradient vectors.
Abstract: This paper proposes an edge-based robust RGB-D visual odometry (VO) using 2-D edge divergence minimization. Our approach focuses on enabling the VO to operate in more general environments subject to low texture and changing brightness, by employing image edge regions and their image gradient vectors within the iterative closest point (ICP) framework. For more robust and stable ICP-based optimization, we propose a robust edge matching criterion based on image gradient vectors. In addition, to reduce the adverse effect of outlier residuals, we propose an improved edge registration formulation of 2-D edge divergence minimization in the manner of an iteratively reweighted least squares (IRLS) motion estimation. To accelerate the proposed approach, a pixel sub-sampling method is employed. We evaluate the estimation performance of our method under changing brightness conditions and in low-textured scenes. Our approach shows more robust motion estimation than state-of-the-art methods while maintaining comparable accuracy on challenging image sequences at real-time (25 Hz) operation.
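As a generic sketch of the IRLS machinery the paper builds on (the residual and Jacobian of its 2-D edge divergence are placeholders here): residuals are reweighted each iteration, with Huber weights in this example, so outlier edge correspondences lose influence on the Gauss-Newton step.

```python
import numpy as np

def irls(residual_fn, jacobian_fn, x0, iters=10, delta=1.0):
    """residual_fn(x) -> (N,); jacobian_fn(x) -> (N, d); x0: initial motion params."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual_fn(x)
        J = jacobian_fn(x)
        w = np.where(np.abs(r) <= delta, 1.0, delta / np.abs(r))  # Huber weights
        JW = J * w[:, None]
        # Weighted Gauss-Newton step with a small damping term.
        x = x - np.linalg.solve(JW.T @ J + 1e-9 * np.eye(x.size), JW.T @ r)
    return x
```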

Journal ArticleDOI
TL;DR: Extensive experiments on various image classification tasks, such as face recognition, texture classification, object categorization, and palmprint recognition, show that MOLP achieves performance competitive with state-of-the-art methods.
Abstract: Local feature descriptors play a key role in different image classification applications. Some of these methods, such as the local binary pattern and image gradient orientations, have been proven effective to some extent. However, such traditional descriptors, which only utilize single-type features, are insufficient to capture the edge and orientation information and the intrinsic structure information of images. In this paper, we propose a kernel embedding multiorientation local pattern (MOLP) to address this problem. A given image is first transformed by gradient operators in local regions, which generates multiorientation gradient images containing edge and orientation information in different directions. Then the histogram feature, which takes into account both the sign component and the magnitude component, is extracted from each orientation gradient image to form the refined feature. The refined feature captures more of the intrinsic structure information and is effective for image representation and classification. Finally, the multiorientation refined features are automatically fused in the kernel embedding discriminant subspace learning model. Extensive experiments on various image classification tasks, such as face recognition, texture classification, object categorization, and palmprint recognition, show that MOLP achieves performance competitive with state-of-the-art methods.

Posted Content
TL;DR: A mask is introduced to separate the image into low- and high-frequency parts based on image gradient magnitude, and then a gradient-sensitive loss is devised to capture the structures in the image well without sacrificing the recovery of low-frequency content.
Abstract: Deep neural networks have exhibited promising performance in image super-resolution (SR) due to their power in learning the non-linear mapping from low-resolution (LR) images to high-resolution (HR) images. However, most deep learning methods employ feed-forward architectures, and thus the dependencies between LR and HR images are not fully exploited, leading to limited learning performance. Moreover, most deep learning based SR methods apply the pixel-wise reconstruction error as the loss, which may fail to capture high-frequency information and produce perceptually unsatisfying results, while the recent perceptual loss relies on a pre-trained deep model and may not generalize well. In this paper, we introduce a mask that separates the image into low- and high-frequency parts based on the image gradient magnitude, and then devise a gradient-sensitive loss that captures the structures in the image without sacrificing the recovery of low-frequency content. Moreover, by investigating the duality in SR, we develop a dual reconstruction network (DRN) to improve the SR performance. We provide theoretical analysis of the generalization performance of our method and demonstrate its effectiveness and superiority with thorough experiments.
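A hedged numpy sketch of the gradient-sensitive loss idea: a mask from the gradient magnitude of the HR target splits the image into low- and high-frequency parts, and the reconstruction errors on the two parts are weighted separately. The threshold and weights below are illustrative, not the paper's values.

```python
import numpy as np

def gradient_sensitive_loss(sr, hr, tau=0.1, w_high=2.0, w_low=1.0):
    gy, gx = np.gradient(hr)
    mask = (np.hypot(gx, gy) > tau).astype(float)   # 1 on high-frequency pixels
    err = (sr - hr) ** 2
    return (w_high * (mask * err).mean()
            + w_low * ((1.0 - mask) * err).mean())
```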

Posted Content
TL;DR: A deep network-based unsupervised visual odometry system for 6-DoF camera pose estimation and dense depth map estimation from a monocular view is presented, and is shown to provide superior performance in depth and ego-motion estimation compared to existing state-of-the-art methods.
Abstract: This paper presents an unsupervised deep learning framework called UnDEMoN for estimating a dense depth map and 6-DoF camera pose information directly from monocular images. The proposed network is trained using unlabeled stereo image pairs and is shown to provide superior performance in depth and ego-motion estimation compared to the existing state of the art. These improvements are achieved by introducing a new objective function that minimizes spatial as well as temporal reconstruction losses simultaneously. These losses are defined using a bilinear sampling kernel and penalized using the Charbonnier penalty function. The objective function thus created provides robustness to image gradient noise, thereby improving the overall estimation accuracy without resorting to the coarse-to-fine strategies currently prevalent in the literature. Another novelty lies in the fact that we combine a disparity-based depth estimation network with a pose estimation network to obtain an absolute scale-aware 6-DoF camera pose and a superior depth map. The effectiveness of the proposed approach is demonstrated through performance comparisons with existing supervised and unsupervised methods on the KITTI driving dataset.
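For reference, the Charbonnier penalty mentioned above is the standard smooth approximation of the absolute value, differentiable at zero (epsilon is a small constant):

```latex
\rho(x) = \sqrt{x^{2} + \epsilon^{2}}
```

Applied to the reconstruction residuals, it behaves like an L1 penalty on large errors while remaining smooth near zero, which is what gives the robustness to image gradient noise claimed in the abstract.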

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed constrained optimization approach for image gradient enhancement can significantly improve the subjective image quality by enhancing both the contrast and the image gradient.
Abstract: The human visual system is not very sensitive to the absolute luminance of an image, but rather responds to local luminance changes, i.e., the gradient of an image. In this paper, we propose a constrained optimization approach for image gradient enhancement. The gradient strength of the enhanced image can be controlled directly using a target gradient strength parameter in the cost function. To suppress artifacts and ensure that contrast improves, a novel constraint is included in the optimization. Because the number of variables in optimization-based image enhancement techniques equals the number of gray scales, we quantize the image using a k-means clustering-based histogram mergence (KCHM) method before enhancement. KCHM can significantly reduce the number of image gray scales while effectively preserving the subjective quality. This is useful because reducing the number of variables simplifies the optimization and lowers the computation cost. Experimental results demonstrate that the proposed method can significantly improve subjective image quality by enhancing both the contrast and the image gradient.
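A brief sketch of the KCHM idea as we read it: cluster the image's gray levels with k-means and map each pixel to its cluster centre, so the number of distinct gray scales (and hence optimization variables) drops to k. The value of k is illustrative.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def kchm_quantize(gray, k=32, seed=0):
    vals = gray.reshape(-1, 1).astype(float)
    centers, labels = kmeans2(vals, k, minit='++', seed=seed)
    return centers[labels].reshape(gray.shape)   # each pixel -> its cluster centre
```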

Journal ArticleDOI
TL;DR: This study describes a modified non-blind and blind deconvolution model by introducing a regularisation parameter that incorporates the spatial image information and uses a weighted total variation term, where the weight is a spatially adaptive parameter based on the image gradient.
Abstract: In this study, the authors describe modified non-blind and blind deconvolution models by introducing a regularisation parameter that incorporates spatial image information. Specifically, they use a weighted total variation term, where the weight is a spatially adaptive parameter based on the image gradient. The proposed models are solved by the split Bregman method. To handle the discrete convolution transform adequately and in moderate time, the fast Fourier transform is used. Tests are conducted on several images, and for assessing the results, appropriate weighted versions of two standard image quality metrics are defined. These new weighted metrics clearly highlight the advantage of the spatially adaptive approach.
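Schematically, the weighted-TV deconvolution model has the form below; the edge-stopping weight w shown is a common illustrative choice for "a spatially adaptive parameter based on the image gradient", not necessarily the authors' exact definition (k is the blur kernel, f the observed image).

```latex
\min_{u}\ \frac{1}{2}\,\|k * u - f\|_2^2
  + \lambda \int_\Omega w(x)\,|\nabla u(x)|\,dx,
\qquad
w(x) = \frac{1}{1 + \kappa\,|\nabla (G_\sigma * f)(x)|^{2}}
```

With this choice, the TV penalty is weakened near strong gradients, so edges are smoothed less than flat regions during deconvolution.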

Journal ArticleDOI
TL;DR: An automatic estimation of the standard deviation of the Rician noise is designed to enhance the quality of denoised results, and the alternating direction method of multipliers (ADMM) is adopted to implement the numerical scheme, yielding a robust solution.
Abstract: This paper presents an improved variational model for denoising MR images degraded by Rician noise. An additional data-fidelity term based on the image gradient is introduced, which retains more details and edges. In this model, an automatic estimation of the standard deviation of the Rician noise is designed to enhance the quality of the denoised results. The alternating direction method of multipliers (ADMM) is adopted to implement the numerical scheme and obtains a robust solution. Experimental results demonstrate that the proposed method is efficient and has better denoising capability than state-of-the-art models.

Journal ArticleDOI
TL;DR: An edge-based superpixel similarity measurement, which globally evaluates the similarity between superpixels by binary edge maps, and demonstrates that the proposed global similarity measurement can improve the clustering accuracy in terms of larger intersection-over-union-criterion-based values.
Abstract: This paper proposes an edge-based superpixel similarity measurement, which globally evaluates the similarity between superpixels using binary edge maps. The basic idea is to assess whether the superpixels are surrounded by the same edges. To this end, we first describe the edge spatial distributions by directional regions, and then use the directional regions to represent the surrounding relationships of superpixels and edges through their traverse relationships, which form the histogram feature. Finally, the similarity is simply calculated from the distances between the features. To verify the proposed similarity measurement, we use our global similarity measurement to perform superpixel clustering. Two clustering methods, directed graph clustering (DGC) and spectral clustering (ultrametric contour map), are combined to achieve the clustering process. We highlight the combination of our global similarity measurement with DGC, which forms a new three-layer superpixel generation method that can quickly generate superpixels from edge maps. We verify the global similarity measurement on the BSDS500 dataset. The experimental results demonstrate that the proposed global similarity measurement can improve the clustering accuracy in terms of larger intersection-over-union-criterion-based values. The code can be downloaded from https://github.com/FanmanMeng/Superpixel-Similarity-Measurement .

Proceedings ArticleDOI
21 May 2018
TL;DR: This paper shows how to fuse multiple different cues under the same convolutional network framework, adopting a pre-trained ResNet-101 to extract feature maps from RGB images and connecting it with three extra deconvolution layers.
Abstract: Road detection from images is a key task in autonomous driving. The recent advent of deep learning (and in particular, convolutional neural networks, or CNNs) has greatly improved the performance of road detection algorithms. In this paper, we show how to fuse multiple different cues under the same convolutional network framework. Specifically, we adopt a pre-trained ResNet-101 to extract feature maps from RGB images; we then connect it with three extra deconvolution layers. These deconvolution layers are trained conditioned on appropriate image cues; in our case they are a height image (i.e., an elevation map obtained by, e.g., a Lidar scanner), the image gradient, and a position map. We also design two skip layers to speed up convergence. Experiments on the KITTI benchmark show the competitive performance of our new networks.

Posted Content
TL;DR: In this article, a new gradient-based cloud image segmentation technique is developed using tools from image processing, which integrates morphological image gradient magnitudes to separate cloud systems and patch boundaries.
Abstract: Being able to effectively identify clouds and monitor their evolution is one important step toward more accurate quantitative precipitation estimation and forecasting. In this study, a new gradient-based cloud-image segmentation technique is developed using tools from image processing. This method integrates morphological image gradient magnitudes to separate cloud systems and patch boundaries. A varying-scale kernel is implemented to reduce the sensitivity of the image segmentation to noise and to capture objects with edges of varying fineness in remote-sensing images. The proposed method is flexible and extendable from single- to multi-spectral imagery. Case studies were carried out to validate the algorithm by applying the proposed segmentation algorithm to synthetic radiances for channels of the Geostationary Operational Environmental Satellites (GOES-R), simulated by a high-resolution weather prediction model. The proposed method compares favorably with the existing cloud-patch-based segmentation technique implemented in the PERSIANN-CCS (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Network - Cloud Classification System) rainfall retrieval algorithm. Evaluation of event-based images indicates that the proposed algorithm has the potential to improve rain detection and estimation skills, with an average gain of more than 45% compared to the segmentation technique used in PERSIANN-CCS, and identifies cloud regions as objects with accuracy rates up to 98%.
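A small sketch of the morphological image gradient driving the segmentation: dilation minus erosion, with the structuring-element size acting as the scale. The multi-scale combination below is our illustrative stand-in for the paper's varying scale-kernel.

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def morphological_gradient(img, size=3):
    return (grey_dilation(img, size=(size, size))
            - grey_erosion(img, size=(size, size)))

def multiscale_gradient(img, sizes=(3, 5, 9)):
    # Averaging across structuring-element sizes reduces noise sensitivity
    # while still responding to edges of varying fineness.
    return np.mean([morphological_gradient(img, s) for s in sizes], axis=0)
```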

Journal ArticleDOI
TL;DR: A novel transformation between MGGDs is proposed, consisting of an optimal transportation of the second-order statistics and a stochastic-based shape parameter transformation, which is employed for a color transfer and a gradient transfer between images.
Abstract: Multivariate generalized Gaussian distributions (MGGDs) have aroused great interest in the image processing community thanks to their ability to accurately describe various image features, such as image gradient fields. However, so far their applicability has been limited by the lack of a transformation between two of these parametric distributions. In this paper, we propose a novel transformation between MGGDs, consisting of an optimal transportation of the second-order statistics and a stochastic-based shape parameter transformation. We employ the proposed transformation between MGGDs for color transfer and gradient transfer between images. We also propose a new simultaneous transfer of color and gradient, which we apply to image color correction.
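For the second-order part of the transformation, the classical closed-form optimal transport map between two zero-mean Gaussians N(0, Σ1) and N(0, Σ2) is the natural reference point; the paper extends this second-order transport to MGGDs and adds a stochastic shape-parameter transformation, which is not shown here.

```latex
T(x) = \Sigma_1^{-1/2}\left(\Sigma_1^{1/2}\,\Sigma_2\,\Sigma_1^{1/2}\right)^{1/2}\Sigma_1^{-1/2}\,x
```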

Proceedings ArticleDOI
23 Jul 2018
TL;DR: It is empirically shown that subsampling with the same step size leads to very similar accuracy changes for different classifiers, and AdaSkip, where the row sampling resolution is adaptively changed based on the image gradient, is proposed.
Abstract: Today's mobile devices are equipped with cameras capable of taking very high-resolution pictures. For computer vision tasks that require relatively low resolution, such as image classification, subsampling is desirable to reduce the unnecessary power consumption of the image sensor. In this paper, we study the relationship between subsampling and the performance degradation of image classifiers based on deep neural networks (DNNs). We empirically show that subsampling with the same step size leads to very similar accuracy changes for different classifiers. In particular, we could achieve over 15x energy savings just by subsampling while suffering almost no accuracy loss. For even better energy-accuracy trade-offs, we propose AdaSkip, where the row sampling resolution is adaptively changed based on the image gradient. We implement AdaSkip on an FPGA and report its energy consumption.
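An illustrative sketch, under our own assumptions, of gradient-adaptive row subsampling in the spirit of AdaSkip: rows in regions with small vertical gradient are skipped more aggressively. The mapping from gradient to step size is our own choice.

```python
import numpy as np

def adaptive_row_subsample(img, base_step=4, tau=0.05):
    """img: (H, W) grayscale in [0, 1]; returns the retained rows."""
    H = img.shape[0]
    rows, r = [], 0
    while r < H:
        rows.append(r)
        row_grad = np.abs(img[min(r + 1, H - 1)] - img[r]).mean()
        r += 1 if row_grad > tau else base_step   # skip more in flat regions
    return img[np.array(rows)]
```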