Showing papers in "IEEE Transactions on Image Processing in 2011"
TL;DR: A novel feature similarity (FSIM) index for full reference IQA is proposed based on the fact that the human visual system (HVS) understands an image mainly according to its low-level features.
Abstract: Image quality assessment (IQA) aims to use computational models to measure the image quality consistently with subjective evaluations. The well-known structural similarity index brings IQA from the pixel-based to the structure-based stage. In this paper, a novel feature similarity (FSIM) index for full reference IQA is proposed based on the fact that the human visual system (HVS) understands an image mainly according to its low-level features. Specifically, the phase congruency (PC), which is a dimensionless measure of the significance of a local structure, is used as the primary feature in FSIM. Considering that PC is contrast invariant while contrast information does affect the HVS's perception of image quality, the image gradient magnitude (GM) is employed as the secondary feature in FSIM. PC and GM play complementary roles in characterizing the image local quality. After obtaining the local quality map, we use PC again as a weighting function to derive a single quality score. Extensive experiments performed on six benchmark IQA databases demonstrate that FSIM can achieve much higher consistency with the subjective evaluations than state-of-the-art IQA metrics.
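The similarity and pooling computations described above are straightforward to illustrate. Below is a minimal NumPy sketch of FSIM-style pooling, assuming the phase congruency maps pc1, pc2 and gradient magnitude maps g1, g2 of the reference and distorted images have already been computed; the stabilizing constants T1 and T2 are illustrative placeholders rather than values taken from the paper.

```python
import numpy as np

def fsim_pool(pc1, pc2, g1, g2, T1=0.85, T2=160.0):
    """FSIM-style pooling from precomputed phase congruency (PC) and
    gradient magnitude (GM) maps; T1, T2 are small stabilizing constants."""
    s_pc = (2 * pc1 * pc2 + T1) / (pc1**2 + pc2**2 + T1)  # PC similarity
    s_g = (2 * g1 * g2 + T2) / (g1**2 + g2**2 + T2)       # GM similarity
    s_l = s_pc * s_g                                      # local quality map
    pc_m = np.maximum(pc1, pc2)                           # PC as the pooling weight
    return float(np.sum(s_l * pc_m) / np.sum(pc_m))
```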
TL;DR: Efficiency figures show that the proposed technique for motion detection outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate.
Abstract: This paper presents a technique for motion detection that incorporates several innovative mechanisms. For example, our proposed technique stores, for each pixel, a set of values taken in the past at the same location or in the neighborhood. It then compares this set to the current pixel value in order to determine whether that pixel belongs to the background, and adapts the model by choosing randomly which values to substitute from the background model. This approach differs from those based upon the classical belief that the oldest values should be replaced first. Finally, when the pixel is found to be part of the background, its value is propagated into the background model of a neighboring pixel. We describe our method in full detail (including pseudo-code and the parameter values used) and compare it to other background subtraction techniques. Efficiency figures show that our method outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate. We also analyze the performance of a version of our algorithm downscaled to the absolute minimum of one comparison and one byte of memory per pixel. It appears that even such a simplified version of our algorithm performs better than mainstream techniques.
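The per-pixel test and the random update policy described above can be sketched as follows; the sample count, matching radius, required match count, and subsampling factor are commonly quoted defaults and should be read as illustrative rather than as the paper's prescribed values.

```python
import random
import numpy as np

def classify_and_update(pixel, samples, radius=20, min_matches=2, subsample=16):
    """samples: 1-D array of past values stored for this pixel location.
    Returns True if the pixel is classified as background (updating the model in place)."""
    matches = np.count_nonzero(np.abs(samples.astype(int) - int(pixel)) < radius)
    if matches < min_matches:
        return False                                   # foreground: model left untouched
    # background: with probability 1/subsample, overwrite a randomly chosen sample
    if random.randrange(subsample) == 0:
        samples[random.randrange(len(samples))] = pixel
    # with the same probability the value would also be propagated into the model
    # of a randomly chosen neighboring pixel (omitted from this single-pixel sketch)
    return True
```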
TL;DR: DIIVINE is capable of assessing the quality of a distorted image across multiple distortion categories, unlike most NR IQA algorithms, which are distortion-specific in nature, and is statistically superior to the often used measure of peak signal-to-noise ratio (PSNR) and statistically equivalent to the popular structural similarity index (SSIM).
Abstract: Our approach to blind image quality assessment (IQA) is based on the hypothesis that natural scenes possess certain statistical properties which are altered in the presence of distortion, rendering them unnatural, and that by characterizing this unnaturalness using scene statistics, one can identify the distortion afflicting the image and perform no-reference (NR) IQA. Based on this theory, we propose an NR/blind algorithm, the Distortion Identification-based Image Verity and INtegrity Evaluation (DIIVINE) index, that assesses the quality of a distorted image without the need for a reference image. DIIVINE is based on a 2-stage framework involving distortion identification followed by distortion-specific quality assessment. DIIVINE is capable of assessing the quality of a distorted image across multiple distortion categories, unlike most NR IQA algorithms, which are distortion-specific in nature. DIIVINE is based on natural scene statistics which govern the behavior of natural images. In this paper, we detail the principles underlying DIIVINE, the statistical features extracted and their relevance to perception, and thoroughly evaluate the algorithm on the popular LIVE IQA database. Further, we compare the performance of DIIVINE against leading full-reference (FR) IQA algorithms and demonstrate that DIIVINE is statistically superior to the often used measure of peak signal-to-noise ratio (PSNR) and statistically equivalent to the popular structural similarity index (SSIM). A software release of DIIVINE has been made available online: http://live.ece.utexas.edu/research/quality/DIIVINE_release.zip for public use and evaluation.
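The two-stage framework lends itself to a compact sketch: a probabilistic classifier estimates how likely each distortion class is from the scene-statistic features, per-distortion regressors map the same features to quality, and the final score is the probability-weighted combination. The scikit-learn estimators and the feature/label variables below are placeholders of this sketch, not the paper's actual features or learners.

```python
import numpy as np
from sklearn.svm import SVC, SVR

def train_two_stage(X, dist_labels, dmos):
    """X: per-image NSS feature vectors; dist_labels: distortion class of each
    training image; dmos: subjective quality scores."""
    clf = SVC(probability=True).fit(X, dist_labels)                 # stage 1: distortion identification
    regs = {k: SVR().fit(X[dist_labels == k], dmos[dist_labels == k])
            for k in np.unique(dist_labels)}                        # stage 2: per-distortion quality
    return clf, regs

def blind_quality(x, clf, regs):
    p = clf.predict_proba(x.reshape(1, -1))[0]                      # P(distortion k | features)
    q = np.array([regs[k].predict(x.reshape(1, -1))[0] for k in clf.classes_])
    return float(p @ q)                                             # probability-weighted score
```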
TL;DR: Extensive experiments on image deblurring and super-resolution validate that by using adaptive sparse domain selection and adaptive regularization, the proposed method achieves much better results than many state-of-the-art algorithms in terms of both PSNR and visual perception.
Abstract: As a powerful statistical image modeling technique, sparse representation has been successfully used in various image restoration applications. The success of sparse representation owes to the development of l1-norm optimization techniques and the fact that natural images are intrinsically sparse in some domains. The image restoration quality largely depends on whether the employed sparse domain can represent the underlying image well. Considering that the contents can vary significantly across different images or different patches in a single image, we propose to learn various sets of bases from a precollected dataset of example image patches, and then, for a given patch to be processed, one set of bases is adaptively selected to characterize the local sparse domain. We further introduce two adaptive regularization terms into the sparse representation framework. First, a set of autoregressive (AR) models are learned from the dataset of example image patches. The AR models that best fit a given patch are adaptively selected to regularize the image local structures. Second, the image nonlocal self-similarity is introduced as another regularization term. In addition, the sparsity regularization parameter is adaptively estimated for better image restoration performance. Extensive experiments on image deblurring and super-resolution validate that by using adaptive sparse domain selection and adaptive regularization, the proposed method achieves much better results than many state-of-the-art algorithms in terms of both PSNR and visual perception.
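The adaptive selection of a sparse domain can be sketched very simply: assign each patch to the nearest cluster of example patches and code it with that cluster's learned basis. The nearest-centroid rule and the plain least-squares coding below are simplifications used for illustration, not the paper's actual selection and sparse-coding procedure.

```python
import numpy as np

def select_and_code(patch, centroids, bases):
    """patch: flattened image patch; centroids[k]: cluster centers learned from
    example patches; bases[k]: sub-dictionary (columns) learned for cluster k."""
    k = int(np.argmin([np.linalg.norm(patch - c) for c in centroids]))  # adaptive domain selection
    B = bases[k]
    coeff, *_ = np.linalg.lstsq(B, patch, rcond=None)                   # code in the selected basis
    return k, coeff, B @ coeff                                          # cluster index, coefficients, reconstruction
```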
TL;DR: A novel region-based method for image segmentation is proposed that simultaneously segments the image and estimates the bias field; the estimated bias field can be used for intensity inhomogeneity correction (or bias correction).
Abstract: Intensity inhomogeneity often occurs in real-world images and presents a considerable challenge in image segmentation. The most widely used image segmentation algorithms are region-based and typically rely on the homogeneity of the image intensities in the regions of interest; they therefore often fail to provide accurate segmentation results in the presence of intensity inhomogeneity. This paper proposes a novel region-based method for image segmentation, which is able to deal with intensity inhomogeneities in the segmentation. First, based on the model of images with intensity inhomogeneities, we derive a local intensity clustering property of the image intensities, and define a local clustering criterion function for the image intensities in a neighborhood of each point. This local clustering criterion function is then integrated with respect to the neighborhood center to give a global criterion of image segmentation. In a level set formulation, this criterion defines an energy in terms of the level set functions that represent a partition of the image domain and a bias field that accounts for the intensity inhomogeneity of the image. Therefore, by minimizing this energy, our method is able to simultaneously segment the image and estimate the bias field, and the estimated bias field can be used for intensity inhomogeneity correction (or bias correction). Our method has been validated on synthetic images and real images of various modalities, with desirable performance in the presence of intensity inhomogeneities. Experiments show that our method is more robust to initialization, faster, and more accurate than the well-known piecewise smooth model. As an application, our method has been used for segmentation and bias correction of magnetic resonance (MR) images with promising results.
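Written schematically from the description above, the data term of such an energy takes the following form, where K_sigma is a localized kernel defining the neighborhood, c_i are the constants associated with the regions, b is the bias field, and M_i(phi) are region membership functions defined by the level set function; the symbols and the omitted length/regularization terms are notational assumptions of this sketch.

$$
\mathcal{E}(\phi, \mathbf{c}, b) \;=\; \int \sum_{i=1}^{N} \left( \int K_{\sigma}(\mathbf{y}-\mathbf{x})\, \bigl| I(\mathbf{x}) - b(\mathbf{y})\, c_i \bigr|^{2} \, d\mathbf{y} \right) M_i\bigl(\phi(\mathbf{x})\bigr)\, d\mathbf{x} \;+\; \text{(regularization terms)}.
$$

Minimizing alternately over the level set function, the region constants, and the bias field yields the segmentation and the estimated bias field simultaneously.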
TL;DR: This paper aims to test the hypothesis that when viewing natural images, the optimal perceptual weights for pooling should be proportional to local information content, which can be estimated in units of bits using advanced statistical models of natural images.
Abstract: Many state-of-the-art perceptual image quality assessment (IQA) algorithms share a common two-stage structure: local quality/distortion measurement followed by pooling. While significant progress has been made in measuring local image quality/distortion, the pooling stage is often done in ad hoc ways, lacking theoretical principles and reliable computational models. This paper aims to test the hypothesis that when viewing natural images, the optimal perceptual weights for pooling should be proportional to local information content, which can be estimated in units of bits using advanced statistical models of natural images. Our extensive studies based upon six publicly available subject-rated image databases concluded with three useful findings. First, information content weighting leads to consistent improvement in the performance of IQA algorithms. Second, surprisingly, with information content weighting, even the widely criticized peak signal-to-noise ratio can be converted to a competitive perceptual quality measure when compared with state-of-the-art algorithms. Third, the best overall performance is achieved by combining information content weighting with multiscale structural similarity measures.
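The pooling step itself is a one-liner once a local quality/distortion map and a local information map are available; the sketch below simply replaces uniform averaging with information-content weighting, assuming both maps (however they are estimated) are passed in.

```python
import numpy as np

def information_weighted_pool(local_quality, local_info):
    """local_quality: per-pixel (or per-patch) quality/distortion values;
    local_info: estimated local information content (e.g., in bits), used as weights."""
    w = np.asarray(local_info, dtype=float)
    return float(np.sum(w * np.asarray(local_quality)) / np.sum(w))
```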
TL;DR: In this article, an augmented Lagrangian method is proposed to deal with a variety of imaging ill-posed linear inverse problems, including deconvolution and reconstruction from compressive observations (such as MRI), using either total variation or wavelet-based regularization.
Abstract: We propose a new fast algorithm for solving one of the standard approaches to ill-posed linear inverse problems (IPLIP), where a (possibly nonsmooth) regularizer is minimized under the constraint that the solution explains the observations sufficiently well. Although the regularizer and constraint are usually convex, several particular features of these problems (huge dimensionality, nonsmoothness) preclude the use of off-the-shelf optimization tools and have stimulated a considerable amount of research. In this paper, we propose a new efficient algorithm to handle one class of constrained problems (often known as basis pursuit denoising) tailored to image recovery applications. The proposed algorithm, which belongs to the family of augmented Lagrangian methods, can be used to deal with a variety of imaging IPLIP, including deconvolution and reconstruction from compressive observations (such as MRI), using either total-variation or wavelet-based (or, more generally, frame-based) regularization. The proposed algorithm is an instance of the so-called alternating direction method of multipliers, for which sufficient conditions for convergence are known; we show that these conditions are satisfied by the proposed algorithm. Experiments on a set of image restoration and reconstruction benchmark problems show that the proposed algorithm is a strong contender for the state of the art.
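For reference, the alternating direction method of multipliers to which the algorithm belongs takes the following generic (scaled-dual) form for a problem split as minimizing f(x) + g(z) subject to z = Hx; the splitting, the penalty parameter mu, and the dual variable d are written here schematically and do not reproduce the paper's specific formulation.

$$
\begin{aligned}
x^{k+1} &= \arg\min_{x}\; f(x) + \tfrac{\mu}{2}\,\bigl\| H x - z^{k} + d^{k} \bigr\|_2^2,\\
z^{k+1} &= \arg\min_{z}\; g(z) + \tfrac{\mu}{2}\,\bigl\| H x^{k+1} - z + d^{k} \bigr\|_2^2,\\
d^{k+1} &= d^{k} + H x^{k+1} - z^{k+1}.
\end{aligned}
$$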
TL;DR: A graph-based algorithm, called graph regularized sparse coding, is proposed to learn sparse representations that explicitly take into account the local manifold structure of the data.
Abstract: Sparse coding has received an increasing amount of interest in recent years. It is an unsupervised learning algorithm, which finds a basis set capturing high-level semantics in the data and learns sparse coordinates in terms of the basis set. Originally applied to modeling the human visual cortex, sparse coding has been shown useful for many applications. However, most of the existing approaches to sparse coding fail to consider the geometrical structure of the data space. In many real applications, the data is more likely to reside on a low-dimensional submanifold embedded in the high-dimensional ambient space. It has been shown that the geometrical information of the data is important for discrimination. In this paper, we propose a graph-based algorithm, called graph regularized sparse coding, to learn sparse representations that explicitly take into account the local manifold structure of the data. By using the graph Laplacian as a smoothing operator, the obtained sparse representations vary smoothly along the geodesics of the data manifold. The extensive experimental results on image classification and clustering have demonstrated the effectiveness of our proposed algorithm.
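The graph-regularized objective can be written schematically as follows, where X collects the data points, D is the learned basis (dictionary), S holds the sparse codes s_i, and L is the graph Laplacian of a nearest-neighbor graph on the data; lambda and alpha trade off sparsity against smoothness along the manifold. The notation is illustrative rather than copied from the paper.

$$
\min_{D,\,S}\;\; \bigl\| X - D S \bigr\|_F^{2} \;+\; \alpha\, \operatorname{Tr}\!\bigl( S\, L\, S^{\top} \bigr) \;+\; \lambda \sum_{i} \bigl\| s_i \bigr\|_1 .
$$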
TL;DR: A survey of many recent developments and state-of-the-art methods in computational color constancy, including a taxonomy of existing algorithms in which methods are separated into three groups: static methods, gamut-based methods, and learning-based methods.
Abstract: Computational color constancy is a fundamental prerequisite for many computer vision applications. This paper presents a survey of many recent developments and state-of-the-art methods. Several criteria are proposed that are used to assess the approaches. A taxonomy of existing algorithms is proposed and methods are separated into three groups: static methods, gamut-based methods, and learning-based methods. Further, the experimental setup is discussed, including an overview of publicly available datasets. Finally, various freely available methods, of which some are considered to be state of the art, are evaluated on two datasets.
TL;DR: The PEE technique is further investigated and an efficient reversible watermarking scheme is proposed by incorporating two new strategies into PEE, namely, adaptive embedding and pixel selection, which outperforms conventional PEE.
Abstract: Prediction-error expansion (PEE) is an important technique of reversible watermarking which can embed large payloads into digital images with low distortion. In this paper, the PEE technique is further investigated and an efficient reversible watermarking scheme is proposed by incorporating two new strategies into PEE, namely, adaptive embedding and pixel selection. Unlike conventional PEE, which embeds data uniformly, we propose to adaptively embed 1 or 2 bits into each expandable pixel according to the local complexity. This avoids expanding pixels with large prediction errors, and thus it reduces the embedding impact by decreasing the maximum modification to pixel values. Meanwhile, adaptive PEE allows a very large payload in a single embedding pass, and it improves the capacity limit of conventional PEE. We also propose to select pixels of smooth areas for data embedding and leave rough pixels unchanged. In this way, compared with conventional PEE, a more sharply distributed prediction-error histogram is obtained and a better visual quality of the watermarked image is observed. With these improvements, our method outperforms conventional PEE. Its superiority over other state-of-the-art methods is also demonstrated experimentally.
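To make the basic mechanism concrete, the sketch below embeds one bit into a single pixel by plain prediction-error expansion: expandable errors (small magnitude) are doubled and carry the bit, while larger errors are merely shifted so the histogram bins remain separable and the process stays invertible. The predictor, the threshold T, and the single-bit embedding are simplifications; the adaptive 1-vs-2-bit rule and the pixel-selection strategy of the paper are not reproduced.

```python
def pee_embed_pixel(x, pred, bit, T=2):
    """Embed one bit into pixel value x given its predicted value pred.
    |d| < T: expandable (carries the bit); otherwise the error is shifted by T."""
    d = x - pred
    if abs(d) < T:
        d_marked = 2 * d + bit      # expansion: error doubled, bit appended
    elif d >= T:
        d_marked = d + T            # shift positive errors out of the expansion range
    else:
        d_marked = d - T            # shift negative errors symmetrically
    return pred + d_marked
```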
TL;DR: The results show that at high SNR, the multiple description encoder does not need to fine-tune the optimization parameters of the system due to the correlated nature of the subcarriers, and FEC-based multiple description coding without temporal coding provides a greater advantage for smaller description sizes.
Abstract: Recently, multiple description source coding has emerged as an attractive framework for robust multimedia transmission over packet erasure channels. In this paper, we mathematically analyze the performance of n-channel symmetric FEC-based multiple description coding for a progressive mode of transmission over orthogonal frequency division multiplexing (OFDM) networks in a frequency-selective slowly-varying Rayleigh faded environment. We derive the expressions for the bounds of the throughput and distortion performance of the system in an explicit closed form, whereas the exact performance is given by an expression in the form of a single integration. Based on this analysis, the performance of the system can be numerically evaluated. Our results show that at high SNR, the multiple description encoder does not need to fine-tune the optimization parameters of the system due to the correlated nature of the subcarriers. It is also shown that, despite the bursty nature of the errors in a slow fading environment, FEC-based multiple description coding without temporal coding provides a greater advantage for smaller description sizes.
TL;DR: This paper presents a no-reference image blur metric that is based on the study of human blur perception for varying contrast values that utilizes a probabilistic model to estimate the probability of detecting blur at each edge in the image.
Abstract: This paper presents a no-reference image blur metric that is based on the study of human blur perception for varying contrast values. The metric utilizes a probabilistic model to estimate the probability of detecting blur at each edge in the image, and then the information is pooled by computing the cumulative probability of blur detection (CPBD). The performance of the metric is demonstrated by comparing it with existing no-reference sharpness/blurriness metrics on various publicly available image databases.
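The probabilistic pooling described above can be summarized as follows; w(e_i) denotes the measured width of edge e_i, w_JNB(e_i) the just-noticeable-blur width for the local contrast, beta a fitting exponent, and P_JNB the detection probability at the just noticeable blur. Their numerical values come from the cited blur-perception study and are not reproduced here.

$$
P_{\mathrm{BLUR}}(e_i) = 1 - \exp\!\left( -\left| \frac{w(e_i)}{w_{\mathrm{JNB}}(e_i)} \right|^{\beta} \right),
\qquad
\mathrm{CPBD} = \Pr\bigl( P_{\mathrm{BLUR}} \le P_{\mathrm{JNB}} \bigr),
$$

i.e., the metric is the fraction of edges whose blur is unlikely to be detected, so a higher CPBD indicates a sharper image.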
TL;DR: The proposed algorithm, as opposed to existing methods, does not consider video restoration as a sequence of image restoration problems; rather, it treats a video sequence as a space-time volume and poses a space-time total variation regularization to enhance the smoothness of the solution.
Abstract: This paper presents a fast algorithm for restoring video sequences. The proposed algorithm, as opposed to existing methods, does not consider video restoration as a sequence of image restoration problems. Rather, it treats a video sequence as a space-time volume and poses a space-time total variation regularization to enhance the smoothness of the solution. The optimization problem is solved by transforming the original unconstrained minimization problem to an equivalent constrained minimization problem. An augmented Lagrangian method is used to handle the constraints, and an alternating direction method is used to iteratively find solutions to the subproblems. The proposed algorithm has a wide range of applications, including video deblurring and denoising, video disparity refinement, and hot-air turbulence effect reduction.
TL;DR: The denoising process is expressed as a linear expansion of thresholds (LET) that is optimized by relying on a purely data-adaptive unbiased estimate of the mean-squared error (MSE) derived in a non-Bayesian framework (PURE: Poisson-Gaussian unbiased risk estimate).
Abstract: We propose a general methodology (PURE-LET) to design and optimize a wide class of transform-domain thresholding algorithms for denoising images corrupted by mixed Poisson-Gaussian noise. We express the denoising process as a linear expansion of thresholds (LET) that we optimize by relying on a purely data-adaptive unbiased estimate of the mean-squared error (MSE), derived in a non-Bayesian framework (PURE: Poisson-Gaussian unbiased risk estimate). We provide a practical approximation of this theoretical MSE estimate for the tractable optimization of arbitrary transform-domain thresholding. We then propose a pointwise estimator for undecimated filterbank transforms, which consists of subband-adaptive thresholding functions with signal-dependent thresholds that are globally optimized in the image domain. We finally demonstrate the potential of the proposed approach through extensive comparisons with state-of-the-art techniques that are specifically tailored to the estimation of Poisson intensities. We also present denoising results obtained on real images of low-count fluorescence microscopy.
TL;DR: A hybrid approach to robustly detect and localize texts in natural scene images is presented, using a text region detector, a conditional random field model, and a learning-based energy minimization method.
Abstract: Text detection and localization in natural scene images is important for content-based image analysis. This problem is challenging due to the complex background, nonuniform illumination, and variations in text font, size, and line orientation. In this paper, we present a hybrid approach to robustly detect and localize texts in natural scene images. A text region detector is designed to estimate the text existence confidence and scale information in an image pyramid, which helps segment candidate text components by local binarization. To efficiently filter out the non-text components, a conditional random field (CRF) model considering unary component properties and binary contextual component relationships, with supervised parameter learning, is proposed. Finally, text components are grouped into text lines/words with a learning-based energy minimization method. Since all three stages are learning-based, there are very few parameters requiring manual tuning. Experimental results on the ICDAR 2005 competition dataset show that our approach yields higher precision and recall than state-of-the-art methods. We also evaluated our approach on a multilingual image dataset with promising results.
TL;DR: An algorithm that enhances the contrast of an input image using interpixel contextual information is proposed and produces enhanced images that are better than or comparable to those of four state-of-the-art algorithms.
Abstract: This paper proposes an algorithm that enhances the contrast of an input image using interpixel contextual information. The algorithm uses a 2-D histogram of the input image constructed from the mutual relationship between each pixel and its neighboring pixels. A smooth 2-D target histogram is obtained by minimizing the sum of the Frobenius norms of its differences from the input histogram and from the uniformly distributed histogram. The enhancement is achieved by mapping the diagonal elements of the input histogram to the diagonal elements of the target histogram. Experimental results show that the algorithm produces enhanced images that are better than or comparable to those of four state-of-the-art algorithms.
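To make the use of interpixel context concrete, the sketch below builds the kind of 2-D histogram described (counts of pixel/neighbor gray-level pairs) and extracts the cumulative distribution of its diagonal. The Frobenius-norm optimization of the smooth target histogram and the exact diagonal mapping are not reproduced; the helper names and the simple right/bottom neighborhood are assumptions of this illustration.

```python
import numpy as np

def joint_histogram(img, levels=256):
    """2-D histogram of (pixel, neighbor) gray-level pairs for an integer-valued image,
    using the right and bottom neighbors as the contextual pixels."""
    h = np.zeros((levels, levels), dtype=np.int64)
    for a, b in ((img[:, :-1], img[:, 1:]), (img[:-1, :], img[1:, :])):
        np.add.at(h, (a.ravel(), b.ravel()), 1)
    return h

def diagonal_cdf(hist2d):
    """Normalized cumulative distribution of the histogram's diagonal, from which a
    gray-level mapping (e.g., by CDF matching against a target diagonal) can be built."""
    d = np.diag(hist2d).astype(float)
    return np.cumsum(d) / max(d.sum(), 1.0)
```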
TL;DR: Quantitative and visual results show the superiority of the proposed technique over conventional and state-of-the-art image resolution enhancement techniques.
Abstract: In this correspondence, the authors propose an image resolution enhancement technique based on interpolation of the high frequency subband images obtained by the discrete wavelet transform (DWT) and the input image. The edges are enhanced by introducing an intermediate stage using the stationary wavelet transform (SWT). DWT is applied in order to decompose an input image into different subbands. Then the high frequency subbands as well as the input image are interpolated. The estimated high frequency subbands are then modified using the high frequency subbands obtained through SWT. All these subbands are combined to generate a new high resolution image by using the inverse DWT (IDWT). Quantitative and visual results show the superiority of the proposed technique over conventional and state-of-the-art image resolution enhancement techniques.
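A simplified 2x version of such a pipeline, using PyWavelets and SciPy, is sketched below: the DWT detail subbands are interpolated back to the input size, corrected by adding the SWT detail subbands (which are already input-sized), and recombined by the inverse DWT with the input image acting as the low-frequency band. The wavelet choice, interpolation order, and the additive correction step are assumptions of this sketch rather than the paper's settings.

```python
import pywt
from scipy.ndimage import zoom

def dwt_swt_upscale_2x(img, wavelet='db1'):
    """img: 2-D float array with even height and width; returns a 2x enlarged image."""
    _, (lh, hl, hh) = pywt.dwt2(img, wavelet)                   # half-size detail subbands
    _, (slh, shl, shh) = pywt.swt2(img, wavelet, level=1)[0]    # full-size detail subbands
    details = []
    for d, s in zip((lh, hl, hh), (slh, shl, shh)):
        d_up = zoom(d, 2, order=3)                              # interpolate back to input size
        details.append(d_up[:img.shape[0], :img.shape[1]] + s)  # SWT-based correction
    # the input image itself serves as the approximation band of the 2x result
    return pywt.idwt2((img, tuple(details)), wavelet)
```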
TL;DR: A new framework to detect text strings with arbitrary orientations in complex natural scene images that outperforms state-of-the-art results on the public Robust Reading Dataset, which contains text only in horizontal orientation.
Abstract: Text information in natural scene images serves as important clues for many image-based applications such as scene understanding, content-based image retrieval, assistive navigation, and automatic geocoding. However, locating text in a complex background with multiple colors is a challenging task. In this paper, we explore a new framework to detect text strings with arbitrary orientations in complex natural scene images. Our proposed framework of text string detection consists of two steps: 1) image partition to find text character candidates based on local gradient features and color uniformity of character components and 2) character candidate grouping to detect text strings based on joint structural features of text characters in each text string, such as character size differences, distances between neighboring characters, and character alignment. By assuming that a text string has at least three characters, we propose two algorithms of text string detection: 1) an adjacent character grouping method and 2) a text line grouping method. The adjacent character grouping method calculates the sibling groups of each character candidate as string segments and then merges the intersecting sibling groups into text strings. The text line grouping method performs a Hough transform to fit text lines to the centroids of text candidates. Each fitted text line describes the orientation of a potential text string. The detected text string is represented by a rectangular region covering all characters whose centroids are cascaded in its text line. To improve efficiency and accuracy, our algorithms are carried out at multiple scales. The proposed methods outperform the state-of-the-art results on the public Robust Reading Dataset, which contains text only in horizontal orientation. Furthermore, the effectiveness of our methods in detecting text strings with arbitrary orientations is evaluated on the Oriented Scene Text Dataset collected by ourselves, which contains text strings in nonhorizontal orientations.
TL;DR: This work introduces optimal inverses for the Anscombe transformation, in particular the exact unbiased inverse, a maximum likelihood (ML) inverse, and a more sophisticated minimum mean square error (MMSE) inverse.
Abstract: The removal of Poisson noise is often performed through the following three-step procedure. First, the noise variance is stabilized by applying the Anscombe root transformation to the data, producing a signal in which the noise can be treated as additive Gaussian with unitary variance. Second, the noise is removed using a conventional denoising algorithm for additive white Gaussian noise. Third, an inverse transformation is applied to the denoised signal, obtaining the estimate of the signal of interest. The choice of the proper inverse transformation is crucial in order to minimize the bias error which arises when the nonlinear forward transformation is applied. We introduce optimal inverses for the Anscombe transformation, in particular the exact unbiased inverse, a maximum likelihood (ML) inverse, and a more sophisticated minimum mean square error (MMSE) inverse. We then present an experimental analysis using a few state-of-the-art denoising algorithms and show that the estimation can be consistently improved by applying the exact unbiased inverse, particularly in the low-count regime. This results in a very efficient filtering solution that is competitive with some of the best existing methods for Poisson image denoising.
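The three-step procedure is easy to sketch: a generic additive-Gaussian denoiser is wrapped between the forward Anscombe transformation and an inverse. For brevity the inverse shown is only the simple asymptotically unbiased one, (D/2)^2 - 1/8; the exact unbiased inverse advocated in the paper is defined through an expectation under the Poisson model and would replace that final line. The Gaussian smoother stands in for any AWGN denoiser.

```python
import numpy as np
from scipy.ndimage import gaussian_filter   # stand-in for any additive-Gaussian denoiser

def anscombe(z):
    return 2.0 * np.sqrt(np.maximum(z, 0) + 3.0 / 8.0)

def poisson_denoise(z, denoiser=lambda d: gaussian_filter(d, sigma=1.0)):
    d = anscombe(z)                          # noise is now approximately N(0, 1)
    d_hat = denoiser(d)                      # step 2: conventional AWGN denoising
    return (d_hat / 2.0) ** 2 - 1.0 / 8.0    # asymptotically unbiased inverse (sketch only)
```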
TL;DR: Numerical results demonstrate that the proposed method can outperform robust rotational-invariant PCAs based on the L1 norm when outliers occur; it requires no zero-mean assumption on the data and can estimate the data mean during optimization.
Abstract: Principal component analysis (PCA) minimizes the mean square error (MSE) and is sensitive to outliers. In this paper, we present a new rotational-invariant PCA based on the maximum correntropy criterion (MCC). A half-quadratic optimization algorithm is adopted to compute the correntropy objective. At each iteration, the complex optimization problem is reduced to a quadratic problem that can be efficiently solved by a standard optimization method. The proposed method exhibits the following benefits: 1) it is robust to outliers through the mechanism of MCC, which can be more theoretically solid than a heuristic rule based on MSE; 2) it requires no zero-mean assumption on the data and can estimate the data mean during optimization; and 3) its optimal solution consists of principal eigenvectors of a robust covariance matrix corresponding to the largest eigenvalues. In addition, kernel techniques are further introduced in the proposed method to deal with nonlinearly distributed data. Numerical results demonstrate that the proposed method can outperform robust rotational-invariant PCAs based on the L1 norm when outliers occur.
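In half-quadratic form the procedure amounts to iteratively reweighted PCA: each sample receives a correntropy-induced weight based on its current reconstruction error, and the mean and principal directions are re-estimated from the weighted data. The kernel-width heuristic and fixed iteration count below are assumptions of this sketch, not the paper's settings.

```python
import numpy as np

def mcc_pca(X, n_components, n_iter=20):
    """X: (n_samples, n_features). Robust PCA via correntropy-style reweighting."""
    w = np.ones(len(X))
    for _ in range(n_iter):
        mu = np.average(X, axis=0, weights=w)              # weighted mean (estimated, not assumed zero)
        Xc = X - mu
        C = (Xc * w[:, None]).T @ Xc / w.sum()             # weighted (robust) covariance
        _, vecs = np.linalg.eigh(C)
        U = vecs[:, -n_components:]                        # leading eigenvectors
        err = np.sum((Xc - Xc @ U @ U.T) ** 2, axis=1)     # per-sample reconstruction error
        sigma2 = np.median(err) + 1e-12                    # heuristic kernel width
        w = np.exp(-err / (2 * sigma2))                    # correntropy-induced weights
    return mu, U
```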
TL;DR: A method to detect co-saliency from an image pair that may have some objects in common is introduced; it employs a normalized single-pair SimRank algorithm to compute the similarity scores between graph nodes.
Abstract: In this paper, we introduce a method to detect co-saliency from an image pair that may have some objects in common. The co-saliency is modeled as a linear combination of the single-image saliency map (SISM) and the multi-image saliency map (MISM). The first term is designed to describe the local attention, which is computed by using three saliency detection techniques available in the literature. To compute the MISM, a co-multilayer graph is constructed by dividing the image pair into a spatial pyramid representation. Each node in the graph is described by two types of visual descriptors that capture aspects of local appearance, e.g., color and texture properties. In order to evaluate the similarity between two nodes, we employ a normalized single-pair SimRank algorithm to compute the similarity score. Experimental evaluation on a number of image pairs demonstrates the good performance of the proposed method on the co-saliency detection task.
TL;DR: Manifold regularization and margin maximization are introduced into NMF to obtain the manifold regularized discriminative NMF (MD-NMF), which overcomes the aforementioned problems.
Abstract: Nonnegative matrix factorization (NMF) has become a popular data-representation method and has been widely used in image processing and pattern-recognition problems. This is because the learned bases can be interpreted as a natural parts-based representation of data, and this interpretation is consistent with the psychological intuition of combining parts to form a whole. For practical classification tasks, however, NMF ignores both the local geometry of data and the discriminative information of different classes. In addition, existing research results show that the learned basis is not necessarily parts-based, because there is neither an explicit nor an implicit constraint to ensure that the representation is parts-based. In this paper, we introduce manifold regularization and margin maximization into NMF and obtain the manifold regularized discriminative NMF (MD-NMF) to overcome the aforementioned problems. The multiplicative update rule (MUR) can be applied to optimize MD-NMF, but it converges slowly. In this paper, we propose a fast gradient descent (FGD) method to optimize MD-NMF. FGD contains a Newton method that searches for the optimal step length, and thus FGD converges much faster than MUR. In addition, FGD includes MUR as a special case and can be applied to optimize NMF and its variants. For a problem with 165 samples in R^1600, FGD converges in 28 s, while MUR requires 282 s. We also apply FGD to a variant of MD-NMF, and experimental results confirm its efficiency. Experimental results on several face image datasets suggest the effectiveness of MD-NMF.
TL;DR: An iterative reconstruction algorithm for discrete tomography, called the discrete algebraic reconstruction technique (DART), which is capable of computing more accurate reconstructions from a small number of projection images, or from a small angular range, than alternative methods.
Abstract: In this paper, we present an iterative reconstruction algorithm for discrete tomography, called discrete algebraic reconstruction technique (DART). DART can be applied if the scanned object is known to consist of only a few different compositions, each corresponding to a constant gray value in the reconstruction. Prior knowledge of the gray values for each of the compositions is exploited to steer the current reconstruction towards a reconstruction that contains only these gray values. Based on experiments with both simulated CT data and experimental μCT data, it is shown that DART is capable of computing more accurate reconstructions from a small number of projection images, or from a small angular range, than alternative methods. It is also shown that DART can deal effectively with noisy projection data and that the algorithm is robust with respect to errors in the estimation of the gray values.
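The alternation between algebraic updates and segmentation to the known gray values can be sketched as follows. The continuous update here is a plain Landweber/SIRT-style gradient step on ||Ax - p||^2, the boundary test uses a 4-neighborhood on a square pixel grid, and the fraction of randomly kept free pixels is illustrative; none of these choices are taken from the paper.

```python
import numpy as np

def dart(A, p, shape, grays, n_outer=10, n_inner=20, free_frac=0.15):
    """A: projection matrix (n_rays x n_pixels), p: measured projections,
    shape: (h, w) of the image, grays: sorted list of known gray values."""
    rng = np.random.default_rng(0)
    step = 1.0 / np.linalg.norm(A, 2) ** 2                 # safe Landweber step size
    grays = np.asarray(grays, dtype=float)
    thresholds = (grays[:-1] + grays[1:]) / 2.0
    x = np.zeros(A.shape[1])
    free = np.ones(x.size, dtype=bool)                     # initially every pixel is updated
    for _ in range(n_outer):
        for _ in range(n_inner):                           # SIRT-like updates of the free pixels only
            grad = A.T @ (A @ x - p)
            x[free] -= step * grad[free]
        seg = grays[np.digitize(x, thresholds)]            # segment to the known gray values
        img = seg.reshape(shape)
        pad = np.pad(img, 1, mode='edge')
        boundary = ((pad[:-2, 1:-1] != img) | (pad[2:, 1:-1] != img) |
                    (pad[1:-1, :-2] != img) | (pad[1:-1, 2:] != img)).ravel()
        free = boundary | (rng.random(x.size) < free_frac) # boundary pixels plus a random fraction stay free
        x[~free] = seg.ravel()[~free]                      # remaining pixels are fixed to their gray value
    return grays[np.digitize(x, thresholds)].reshape(shape)
```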
TL;DR: A novel generic image prior, the gradient profile prior, is proposed, which captures prior knowledge of natural image gradients; based on it, a gradient field transformation is proposed to constrain the gradient fields of the high resolution image and the enhanced image when performing single image super-resolution and sharpness enhancement.
Abstract: In this paper, we propose a novel generic image prior, the gradient profile prior, which captures prior knowledge of natural image gradients. In this prior, the image gradients are represented by gradient profiles, which are 1-D profiles of gradient magnitudes perpendicular to image structures. We model the gradient profiles by a parametric gradient profile model. Using this model, the prior knowledge of the gradient profiles is learned from a large collection of natural images; we call this the gradient profile prior. Based on this prior, we propose a gradient field transformation to constrain the gradient fields of the high resolution image and the enhanced image when performing single image super-resolution and sharpness enhancement. With this simple but very effective approach, we are able to produce state-of-the-art results. The reconstructed high resolution images or the enhanced images are sharp and exhibit few ringing or jaggy artifacts.
TL;DR: The proposed generalized unsharp masking algorithm using the exploratory data model as a unified framework is designed to address three issues: simultaneously enhancing contrast and sharpness by means of individual treatment of the model component and the residual, reducing the halo effect by means of an edge-preserving filter, and solving the out-of-range problem by means of log-ratio and tangent operations.
Abstract: Enhancement of contrast and sharpness of an image is required in many applications. Unsharp masking is a classical tool for sharpness enhancement. We propose a generalized unsharp masking algorithm using the exploratory data model as a unified framework. The proposed algorithm is designed to address three issues: 1) simultaneously enhancing contrast and sharpness by means of individual treatment of the model component and the residual, 2) reducing the halo effect by means of an edge-preserving filter, and 3) solving the out-of-range problem by means of log-ratio and tangent operations. We also present a study of the properties of the log-ratio operations and reveal a new connection between the Bregman divergence and the generalized linear systems. This connection not only provides a novel insight into the geometrical property of such systems, but also opens a new pathway for system development. We present a new system called the tangent system which is based upon a specific Bregman divergence. Experimental results, which are comparable to recently published results, show that the proposed algorithm is able to significantly improve the contrast and sharpness of an image. In the proposed algorithm, the user can adjust the two parameters controlling the contrast and sharpness to produce the desired results. This makes the proposed algorithm practically useful.
TL;DR: A fast model-based iterative reconstruction algorithm using spatially nonhomogeneous ICD (NH-ICD) optimization that accelerates the reconstructions by roughly a factor of three on average for typical 3-D multislice geometries is presented.
Abstract: Recent applications of model-based iterative reconstruction (MBIR) algorithms to multislice helical CT reconstructions have shown that MBIR can greatly improve image quality by increasing resolution as well as reducing noise and some artifacts. However, high computational cost and long reconstruction times remain as a barrier to the use of MBIR in practical applications. Among the various iterative methods that have been studied for MBIR, iterative coordinate descent (ICD) has been found to have relatively low overall computational requirements due to its fast convergence. This paper presents a fast model-based iterative reconstruction algorithm using spatially nonhomogeneous ICD (NH-ICD) optimization. The NH-ICD algorithm speeds up convergence by focusing computation where it is most needed. The NH-ICD algorithm has a mechanism that adaptively selects voxels for update. First, a voxel selection criterion (VSC) determines the voxels in greatest need of update. Then a voxel selection algorithm (VSA) selects the order of successive voxel updates based upon the need for repeated updates of some locations, while retaining characteristics for global convergence. In order to speed up each voxel update, we also propose a fast 1-D optimization algorithm that uses a quadratic substitute function to upper bound the local 1-D objective function, so that a closed form solution can be obtained rather than using a computationally expensive line search algorithm. We examine the performance of the proposed algorithm using several clinical data sets of various anatomy. The experimental results show that the proposed method accelerates the reconstructions by roughly a factor of three on average for typical 3-D multislice geometries.
TL;DR: The Bayesian framework infers an approximate representation for the noise statistics while simultaneously inferring the low-rank and sparse-outlier contributions; the model is robust to a broad range of noise levels, without having to change model hyperparameter settings.
Abstract: A hierarchical Bayesian model is considered for decomposing a matrix into low-rank and sparse components, assuming the observed matrix is a superposition of the two. The matrix is assumed noisy, with unknown and possibly non-stationary noise statistics. The Bayesian framework infers an approximate representation for the noise statistics while simultaneously inferring the low-rank and sparse-outlier contributions; the model is robust to a broad range of noise levels, without having to change model hyperparameter settings. In addition, the Bayesian framework allows exploitation of additional structure in the matrix. For example, in video applications each row (or column) corresponds to a video frame, and we introduce a Markov dependency between consecutive rows in the matrix (corresponding to consecutive frames in the video). The properties of this Markov process are also inferred based on the observed matrix, while simultaneously denoising and recovering the low-rank and sparse components. We compare the Bayesian model to a state-of-the-art optimization-based implementation of robust PCA; considering several examples, we demonstrate competitive performance of the proposed model.
TL;DR: This paper addresses the super resolution (SR) problem from a set of degraded low resolution (LR) images to obtain a high resolution (HR) image and proposes novel super resolution methods where the HR image and the motion parameters are estimated simultaneously.
Abstract: In this paper, we address the super resolution (SR) problem from a set of degraded low resolution (LR) images to obtain a high resolution (HR) image. Accurate estimation of the sub-pixel motion between the LR images significantly affects the performance of the reconstructed HR image. In this paper, we propose novel super resolution methods where the HR image and the motion parameters are estimated simultaneously. Utilizing a Bayesian formulation, we model the unknown HR image, the acquisition process, the motion parameters and the unknown model parameters in a stochastic sense. Employing a variational Bayesian analysis, we develop two novel algorithms which jointly estimate the distributions of all unknowns. The proposed framework has the following advantages: 1) Through the incorporation of uncertainty of the estimates, the algorithms prevent the propagation of errors between the estimates of the various unknowns; 2) the algorithms are robust to errors in the estimation of the motion parameters; and 3) using a fully Bayesian formulation, the developed algorithms simultaneously estimate all algorithmic parameters along with the HR image and motion parameters, and therefore they are fully-automated and do not require parameter tuning. We also show that the proposed motion estimation method is a stochastic generalization of the classical Lucas-Kanade registration algorithm. Experimental results demonstrate that the proposed approaches are very effective and compare favorably to state-of-the-art SR algorithms.
TL;DR: A novel probabilistic model-based fusion technique for multi-exposure images that aims to achieve an optimal balance between two quality measures, i.e., local contrast and color consistency, while combining the scene details revealed under different exposures.
Abstract: A single captured image of a real-world scene is usually insufficient to reveal all the details due to under- or over-exposed regions. To solve this problem, images of the same scene can be first captured under different exposure settings and then combined into a single image using image fusion techniques. In this paper, we propose a novel probabilistic model-based fusion technique for multi-exposure images. Unlike previous multi-exposure fusion methods, our method aims to achieve an optimal balance between two quality measures, i.e., local contrast and color consistency, while combining the scene details revealed under different exposures. A generalized random walks framework is proposed to calculate a globally optimal solution subject to the two quality measures by formulating the fusion problem as probability estimation. Experiments demonstrate that our algorithm generates high-quality images at low computational cost. Comparisons with a number of other techniques show that our method generates better results in most cases.
TL;DR: In this article, trigonometric range kernels are proposed to realize the bilateral filter in constant time, generalizing the idea, presented by Porikli, of using polynomial kernels.
Abstract: It is well known that spatial averaging can be realized (in space or frequency domain) using algorithms whose complexity does not scale with the size or shape of the filter. These fast algorithms are generally referred to as constant-time or O(1) algorithms in the image-processing literature. Along with the spatial filter, the edge-preserving bilateral filter involves an additional range kernel. This is used to restrict the averaging to those neighborhood pixels whose intensities are similar or close to that of the pixel of interest. The range kernel operates by acting on the pixel intensities. This makes the averaging process nonlinear and computationally intensive, particularly when the spatial filter is large. In this paper, we show how the O(1) averaging algorithms can be leveraged for realizing the bilateral filter in constant time, by using trigonometric range kernels. This is done by generalizing the idea presented by Porikli, i.e., using polynomial kernels. The class of trigonometric kernels turns out to be sufficiently rich, allowing for the approximation of the standard Gaussian bilateral filter. The attractive feature of our approach is that, for a fixed number of terms, the quality of approximation achieved using trigonometric kernels is much superior to that obtained by Porikli using polynomials.
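The shiftability that makes a constant-time realization possible is easy to see for a single cosine term: since cos(gamma*(f(i) - f(j))) = cos(gamma*f(i))cos(gamma*f(j)) + sin(gamma*f(i))sin(gamma*f(j)), the nonlinear range weighting separates into ordinary spatial convolutions of a few auxiliary images, each computable independently of the kernel size. The sketch below implements this one-term case with a Gaussian spatial filter; approximating a Gaussian range kernel, as described above, would sum several such terms (e.g., with raised-cosine/binomial weights), which is an extension not shown here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def bilateral_one_cosine(f, sigma_s, gamma):
    """Bilateral-type filter with range kernel cos(gamma * (f(i) - f(j))), realized with
    plain spatial Gaussian convolutions. gamma must be small enough that the cosine
    stays nonnegative over the dynamic range of f."""
    c, s = np.cos(gamma * f), np.sin(gamma * f)
    num = c * gaussian_filter(c * f, sigma_s) + s * gaussian_filter(s * f, sigma_s)
    den = c * gaussian_filter(c, sigma_s) + s * gaussian_filter(s, sigma_s)
    return num / np.maximum(den, 1e-12)
```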