Showing papers in "IEEE Transactions on Image Processing in 2003"


Journal ArticleDOI
TL;DR: The performance of this method for removing noise from digital images substantially surpasses that of previously published methods, both visually and in terms of mean squared error.
Abstract: We describe a method for removing noise from digital images, based on a statistical model of the coefficients of an overcomplete multiscale oriented basis. Neighborhoods of coefficients at adjacent positions and scales are modeled as the product of two independent random variables: a Gaussian vector and a hidden positive scalar multiplier. The latter modulates the local variance of the coefficients in the neighborhood, and is thus able to account for the empirically observed correlation between the coefficient amplitudes. Under this model, the Bayesian least squares estimate of each coefficient reduces to a weighted average of the local linear estimates over all possible values of the hidden multiplier variable. We demonstrate through simulations with images contaminated by additive white Gaussian noise that the performance of this method substantially surpasses that of previously published methods, both visually and in terms of mean squared error.

2,439 citations
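
As a rough illustration of the estimator described above, the sketch below computes the Bayes least-squares estimate of one coefficient neighborhood under a Gaussian scale mixture by averaging local Wiener estimates over a grid of hidden-multiplier values. The covariances, the z-grid and its prior are caller-supplied placeholders; the paper's actual implementation operates on an overcomplete oriented pyramid and is not reproduced here.

```python
import numpy as np

def bls_gsm_estimate(y, C_u, C_w, z_grid, prior_z):
    """Toy Bayes least-squares estimate of a coefficient neighborhood y under
    a Gaussian scale mixture x = sqrt(z) * u, u ~ N(0, C_u), noise ~ N(0, C_w).
    z_grid and prior_z discretize the hidden multiplier and its prior; all
    inputs are illustrative assumptions, not the paper's settings."""
    log_like = np.empty(len(z_grid))
    cond_mean = np.empty((len(z_grid), y.size))
    for i, z in enumerate(z_grid):
        C_y = z * C_u + C_w                       # covariance of y given z
        _, logdet = np.linalg.slogdet(C_y)
        sol = np.linalg.solve(C_y, y)
        log_like[i] = -0.5 * (logdet + y @ sol)   # log N(y; 0, C_y) up to a constant
        cond_mean[i] = z * C_u @ sol              # E[x | y, z]: local Wiener estimate
    post = np.exp(log_like - log_like.max()) * prior_z
    post /= post.sum()                            # posterior p(z | y) on the grid
    return post @ cond_mean                       # E[x | y]; in practice keep the centre coefficient
```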


Journal ArticleDOI
TL;DR: An expectation-maximization (EM) algorithm for image restoration (deconvolution) based on a penalized likelihood formulated in the wavelet domain is introduced, and it is shown that under mild conditions the algorithm converges to a globally optimal restoration.
Abstract: This paper introduces an expectation-maximization (EM) algorithm for image restoration (deconvolution) based on a penalized likelihood formulated in the wavelet domain. Regularization is achieved by promoting a reconstruction with low-complexity, expressed in the wavelet coefficients, taking advantage of the well known sparsity of wavelet representations. Previous works have investigated wavelet-based restoration but, except for certain special cases, the resulting criteria are solved approximately or require demanding optimization methods. The EM algorithm herein proposed combines the efficient image representation offered by the discrete wavelet transform (DWT) with the diagonalization of the convolution operator obtained in the Fourier domain. Thus, it is a general-purpose approach to wavelet-based image restoration with computational complexity comparable to that of standard wavelet denoising schemes or of frequency domain deconvolution methods. The algorithm alternates between an E-step based on the fast Fourier transform (FFT) and a DWT-based M-step, resulting in an efficient iterative process requiring O(NlogN) operations per iteration. The convergence behavior of the algorithm is investigated, and it is shown that under mild conditions the algorithm converges to a globally optimal restoration. Moreover, our new approach performs competitively with, in some cases better than, the best existing methods in benchmark tests.

1,260 citations
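
A minimal sketch of this alternation is given below, assuming a blur specified by its 2-D frequency response and using PyWavelets for the DWT; the step size, threshold, wavelet, and iteration count are illustrative assumptions rather than the paper's choices.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def em_wavelet_deconv(y, h_fft, n_iter=50, thresh=0.05, wavelet='db4', level=3):
    """Toy EM-style wavelet deconvolution: an FFT-based E-step (a Landweber-like
    correction diagonalized in the Fourier domain) followed by a DWT
    soft-thresholding M-step.  h_fft is the blur's frequency response, same
    shape as y; parameter values are illustrative."""
    x = y.astype(float).copy()
    alpha = np.max(np.abs(h_fft)) ** 2                 # keeps the E-step non-expansive
    for _ in range(n_iter):
        # E-step: z = x + H^T (y - H x) / alpha, computed with FFTs
        resid_fft = np.fft.fft2(y) - h_fft * np.fft.fft2(x)
        z = x + np.real(np.fft.ifft2(np.conj(h_fft) * resid_fft)) / alpha
        # M-step: denoise z by soft-thresholding its wavelet detail coefficients
        coeffs = pywt.wavedec2(z, wavelet, level=level)
        coeffs = [coeffs[0]] + [
            tuple(pywt.threshold(c, thresh, mode='soft') for c in band)
            for band in coeffs[1:]
        ]
        x = pywt.waverec2(coeffs, wavelet)[:y.shape[0], :y.shape[1]]
    return x
```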


Journal ArticleDOI
TL;DR: The novel contribution of this paper is the combination of three previously developed components, namely image decomposition, inpainting, and texture synthesis, which permits the simultaneous use of filling-in algorithms suited to different image characteristics.
Abstract: An algorithm for the simultaneous filling-in of texture and structure in regions of missing image information is presented in this paper. The basic idea is to first decompose the image into the sum of two functions with different basic characteristics, and then reconstruct each one of these functions separately with structure and texture filling-in algorithms. The first function used in the decomposition is of bounded variation, representing the underlying image structure, while the second function captures the texture and possible noise. The region of missing information in the bounded variation image is reconstructed using image inpainting algorithms, while the same region in the texture image is filled-in with texture synthesis techniques. The original image is then reconstructed by adding back these two sub-images. The novel contribution of this paper is the combination of three previously developed components, namely image decomposition, inpainting, and texture synthesis, which permits the simultaneous use of filling-in algorithms that are suited to different image characteristics. Examples on real images show the advantages of the proposed approach.

1,024 citations


Journal ArticleDOI
TL;DR: The proposed framework includes some novel low-level processing algorithms, such as dominant color region detection, robust shot boundary detection, and shot classification, as well as some higher-level algorithms for goal detection, referee detection, and penalty-box detection.
Abstract: We propose a fully automatic and computationally efficient framework for analysis and summarization of soccer videos using cinematic and object-based features. The proposed framework includes some novel low-level processing algorithms, such as dominant color region detection, robust shot boundary detection, and shot classification, as well as some higher-level algorithms for goal detection, referee detection, and penalty-box detection. The system can output three types of summaries: i) all slow-motion segments in a game; ii) all goals in a game; iii) slow-motion segments classified according to object-based features. The first two types of summaries are based on cinematic features only for speedy processing, while the summaries of the last type contain higher-level semantics. The proposed framework is efficient, effective, and robust. It is efficient in the sense that there is no need to compute object-based features when cinematic features are sufficient for the detection of certain events, e.g., goals in soccer. It is effective in the sense that the framework can also employ object-based features when needed to increase accuracy (at the expense of more computation). The efficiency, effectiveness, and robustness of the proposed framework are demonstrated over a large data set, consisting of more than 13 hours of soccer video, captured in different countries and under different conditions.

943 citations
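
Of the low-level components listed above, dominant color region detection is the simplest to illustrate: the fraction of "grass-colored" pixels in a frame separates long shots of the field from close-ups and out-of-field shots. The sketch below assumes OpenCV-style HSV values (hue in [0, 180)) and placeholder thresholds; the paper learns the dominant field color adaptively rather than fixing a range.

```python
import numpy as np

def grass_ratio(frame_hsv, hue_range=(35, 85), min_sat=40, min_val=40):
    """Fraction of pixels falling inside an assumed 'grass green' HSV range;
    a crude stand-in for the dominant-color-region detection used for shot
    classification.  Thresholds are illustrative, not the paper's values."""
    h, s, v = frame_hsv[..., 0], frame_hsv[..., 1], frame_hsv[..., 2]
    mask = (h >= hue_range[0]) & (h <= hue_range[1]) & (s >= min_sat) & (v >= min_val)
    return float(mask.mean())

# A high ratio suggests a long shot of the field; a low ratio suggests a
# close-up, a crowd shot, or an out-of-field view.
```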


Journal ArticleDOI
TL;DR: A new method for image smoothing based on a fourth-order PDE model that demonstrates good noise suppression without destruction of important anatomical or functional detail, even at poor signal-to-noise ratio is introduced.
Abstract: We introduce a new method for image smoothing based on a fourth-order PDE model. The method is tested on a broad range of real medical magnetic resonance images, both in space and time, as well as on nonmedical synthesized test images. Our algorithm demonstrates good noise suppression without destruction of important anatomical or functional detail, even at poor signal-to-noise ratio. We have also compared our method with related PDE models.

883 citations
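
The abstract does not state the PDE itself. For orientation only, a widely cited fourth-order model of this family, due to You and Kaveh, is shown below; the authors' formulation (and the related models they compare against) differs in detail but shares the fourth-order diffusion structure.

```latex
% Representative fourth-order diffusion (You--Kaveh form), for illustration only:
\frac{\partial u}{\partial t} \;=\; -\,\nabla^{2}\!\bigl[\, c\bigl(|\nabla^{2}u|\bigr)\,\nabla^{2}u \,\bigr],
\qquad
c(s) \;=\; \frac{1}{1 + (s/k)^{2}} .
```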


Journal ArticleDOI
TL;DR: This work proposes an orthonormal version of the ridgelet transform for discrete and finite-size images and uses the finite Radon transform (FRAT) as a building block to overcome the periodization effect of a finite transform.
Abstract: The ridgelet transform was introduced as a sparse expansion for functions on continuous spaces that are smooth away from discontinuities along lines. We propose an orthonormal version of the ridgelet transform for discrete and finite-size images. Our construction uses the finite Radon transform (FRAT) as a building block. To overcome the periodization effect of a finite transform, we introduce a novel ordering of the FRAT coefficients. We also analyze the FRAT as a frame operator and derive the exact frame bounds. The resulting finite ridgelet transform (FRIT) is invertible, nonredundant and computed via fast algorithms. Furthermore, this construction leads to a family of directional and orthonormal bases for images. Numerical results show that the FRIT is more effective than the wavelet transform in approximating and denoising images with straight edges.

734 citations
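
For a p x p image with p prime, the FRAT sums pixel values over p + 1 families of "lines" defined modulo p. A direct sketch is given below; the novel coefficient ordering and the 1-D wavelet transform applied along each projection (which together yield the FRIT) are not shown, and the normalization is an assumption.

```python
import numpy as np

def frat(f):
    """Finite Radon transform of a p x p image with p prime: sums of pixel
    values over the 'lines' {(i, (k*i + l) mod p)} plus the p lines of
    constant i.  This is the building block described in the paper; the
    1/sqrt(p) normalization here is an assumption of this sketch."""
    p = f.shape[0]
    r = np.zeros((p + 1, p))
    i = np.arange(p)
    for k in range(p):                      # slopes k = 0 .. p-1
        for l in range(p):                  # intercepts
            r[k, l] = f[i, (k * i + l) % p].sum()
    r[p, :] = f.sum(axis=1)                 # the remaining 'vertical' direction
    return r / np.sqrt(p)

# The FRIT is then obtained by taking a 1-D discrete wavelet transform along
# each of the p+1 projections (rows of r), after the reordering the paper describes.
```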


Journal ArticleDOI
TL;DR: It is shown that the Krawtchouk moments can be employed to extract local features of an image, unlike other orthogonal moments, which generally capture the global features.
Abstract: A new set of orthogonal moments based on the discrete classical Krawtchouk polynomials is introduced. The Krawtchouk polynomials are scaled to ensure numerical stability, thus creating a set of weighted Krawtchouk polynomials. The set of proposed Krawtchouk moments is then derived from the weighted Krawtchouk polynomials. The orthogonality of the proposed moments ensures minimal information redundancy. No numerical approximation is involved in deriving the moments, since the weighted Krawtchouk polynomials are discrete. These properties make the Krawtchouk moments well suited as pattern features in the analysis of two-dimensional images. It is shown that the Krawtchouk moments can be employed to extract local features of an image, unlike other orthogonal moments, which generally capture the global features. The computational aspects of the moments using the recursive and symmetry properties are discussed. The theoretical framework is validated by an experiment on image reconstruction using Krawtchouk moments, and the results are compared to those of Zernike, pseudo-Zernike, Legendre, and Tchebyscheff moments. Krawtchouk moment invariants are constructed using a linear combination of geometric moment invariants; an object recognition experiment shows that Krawtchouk moment invariants perform significantly better than Hu's moment invariants in both noise-free and noisy conditions.

610 citations


Journal ArticleDOI
TL;DR: Simulation results with the chosen feature set and well-known watermarking and steganographic techniques indicate that the proposed approach is able to distinguish between cover and stego images with reasonable accuracy.
Abstract: We present techniques for steganalysis of images that have been potentially subjected to steganographic algorithms, both within the passive warden and active warden frameworks. Our hypothesis is that steganographic schemes leave statistical evidence that can be exploited for detection with the aid of image quality features and multivariate regression analysis. To this effect, image quality metrics have been identified, based on the analysis of variance (ANOVA) technique, as feature sets to distinguish between cover images and stego images. The classifier between cover and stego images is built using multivariate regression on the selected quality metrics and is trained on an estimate of the original image. Simulation results with the chosen feature set and well-known watermarking and steganographic techniques indicate that our approach is able to distinguish between cover and stego images with reasonable accuracy.

610 citations


Journal ArticleDOI
TL;DR: A new method for contrast enhancement based on the curvelet transform is presented, which outperforms other enhancement methods on noisy images, but on noiseless or near-noiseless images curvelet-based enhancement is not remarkably better than wavelet-based enhancement.
Abstract: We present a new method for contrast enhancement based on the curvelet transform. The curvelet transform represents edges better than wavelets, and is therefore well-suited for multiscale edge enhancement. We compare this approach with enhancement based on the wavelet transform, and the multiscale retinex. In a range of examples, we use edge detection and segmentation, among other processing applications, to provide for quantitative comparative evaluation. Our findings are that curvelet based enhancement out-performs other enhancement methods on noisy images, but on noiseless or near noiseless images curvelet based enhancement is not remarkably better than wavelet based enhancement.

532 citations


Journal ArticleDOI
TL;DR: An algorithm for fast elastic multidimensional intensity-based image registration with a parametric model of the deformation that is computationally more efficient than other alternatives and capable of accepting expert hints in the form of soft landmark constraints.
Abstract: We present an algorithm for fast elastic multidimensional intensity-based image registration with a parametric model of the deformation. It is fully automatic in its default mode of operation. In the case of hard real-world problems, it is capable of accepting expert hints in the form of soft landmark constraints. Many fewer landmarks are needed and the results are far superior compared to pure landmark registration. Particular attention has been paid to the factors influencing the speed of this algorithm. The B-spline deformation model is shown to be computationally more efficient than other alternatives. The algorithm has been successfully used for several two-dimensional (2-D) and three-dimensional (3-D) registration tasks in the medical domain, involving MRI, SPECT, CT, and ultrasound image modalities. We also present experiments in a controlled environment, permitting an exact evaluation of the registration accuracy. Test deformations are generated automatically using a random hierarchical fractional wavelet-based generator.

526 citations


Journal ArticleDOI
TL;DR: A biologically motivated method to improve contour detection in machine vision, called nonclassical receptive field (non-CRF) inhibition (more generally, surround inhibition or suppression), is proposed; it is more useful for contour-based object recognition tasks than traditional edge detectors, which do not distinguish between contour and texture edges.
Abstract: We propose a biologically motivated method, called nonclassical receptive field (non-CRF) inhibition (more generally, surround inhibition or suppression), to improve contour detection in machine vision. Non-CRF inhibition is exhibited by 80% of the orientation-selective neurons in the primary visual cortex of monkeys and has been shown to influence human visual perception as well. Essentially, the response of an edge detector at a certain point is suppressed by the responses of the operator in the region outside the supported area. We combine classical edge detection with isotropic and anisotropic inhibition, both of which have counterparts in biology. We also use a biologically motivated method (the Gabor energy operator) for edge detection. The resulting operator responds strongly to isolated lines, edges, and contours, but exhibits weak or no response to edges that are part of texture. We use natural images with associated ground truth contour maps to assess the performance of the proposed operator for detecting contours while suppressing texture edges. Our method enhances contour detection in cluttered visual scenes more effectively than classical edge detectors used in machine vision (Canny edge detector). Therefore, the proposed operator is more useful for contour-based object recognition tasks, such as shape comparison, than traditional edge detectors, which do not distinguish between contour and texture edges. Traditional edge detection algorithms can, however, also be extended with surround suppression. This study contributes also to the understanding of inhibitory mechanisms in biology.
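
A single-orientation sketch of the idea is given below: a Gabor energy operator provides the edge response, and an isotropic surround term (energy averaged with a difference-of-Gaussians weight over an annular neighborhood) is subtracted from it. Filter sizes, the suppression strength alpha, and the single fixed orientation are illustrative assumptions; the paper combines multiple orientations and also describes an anisotropic variant.

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_energy(img, freq=0.1, theta=0.0, sigma=4.0, size=25):
    """Gabor energy at a single orientation: responses of a quadrature pair of
    Gabor filters combined as a local energy.  Parameters are illustrative."""
    img = np.asarray(img, dtype=float)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    even = env * np.cos(2 * np.pi * freq * xr)
    odd = env * np.sin(2 * np.pi * freq * xr)
    return np.sqrt(convolve(img, even) ** 2 + convolve(img, odd) ** 2)

def surround_suppressed_energy(img, alpha=1.0, sigma=4.0, size=49):
    """Isotropic surround suppression: the Gabor energy is reduced by an
    alpha-weighted average of the energy in an annular surround, modelled by a
    half-wave rectified difference-of-Gaussians weight, and clipped at zero.
    Texture regions, whose surround is full of edge energy, are suppressed far
    more strongly than isolated contours."""
    energy = gabor_energy(img, sigma=sigma)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g1 = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    g4 = np.exp(-(x ** 2 + y ** 2) / (2 * (4 * sigma) ** 2))
    dog = np.clip(g4 / g4.sum() - g1 / g1.sum(), 0, None)
    dog /= dog.sum()
    surround = convolve(energy, dog)
    return np.clip(energy - alpha * surround, 0, None)
```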

Journal ArticleDOI
Zhigang Fan, R. L. de Queiroz
TL;DR: A fast and efficient method is provided to determine whether an image has been previously JPEG compressed, and a method for the maximum likelihood estimation of JPEG quantization steps is developed.
Abstract: Sometimes image processing units inherit images in raster bitmap format only, so that processing must be carried out without knowledge of past operations that may compromise image quality (e.g., compression). To carry out further processing, it is useful not only to know whether the image has been previously JPEG compressed, but also to learn what quantization table was used. This is the case, for example, if one wants to remove JPEG artifacts or for JPEG re-compression. In this paper, a fast and efficient method is provided to determine whether an image has been previously JPEG compressed. After detecting a compression signature, we estimate the compression parameters. Specifically, we develop a method for the maximum likelihood estimation of JPEG quantization steps. The quantizer estimation method is very robust: only sporadically is an estimated quantizer step size off, and when it is, it is off by only one value.
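
The detection idea rests on the fact that the block-DCT coefficients of a previously JPEG-compressed image cluster around multiples of the quantization step. The sketch below applies a crude periodicity test per DCT frequency; it only illustrates the idea and is not the maximum-likelihood estimator derived in the paper. Block offsets are assumed to be aligned with the original 8 x 8 grid.

```python
import numpy as np
from scipy.fft import dctn

def estimate_q_step(img, u=1, v=1, q_max=32):
    """Rough estimate of the JPEG quantization step for DCT frequency (u, v)
    of a decompressed greyscale image: take the 8x8 block DCT (level-shifted
    by 128, as in JPEG), round the coefficients, and pick the step q whose
    multiples capture the most coefficients relative to chance.  This simple
    periodicity test is an illustrative stand-in for the paper's ML estimator."""
    h, w = img.shape
    c = []
    for i in range(0, h - 7, 8):
        for j in range(0, w - 7, 8):
            block = dctn(img[i:i+8, j:j+8].astype(float) - 128.0, norm='ortho')
            c.append(block[u, v])
    c = np.round(np.asarray(c)).astype(int)
    best_q, best_score = 1, -np.inf
    for q in range(2, q_max + 1):
        hit_rate = np.mean(c % q == 0)          # fraction landing on multiples of q
        score = hit_rate - 1.0 / q              # subtract the chance level for step q
        if score > best_score:
            best_q, best_score = q, score
    return best_q
```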

Journal ArticleDOI
TL;DR: The proposed demosaicking method consists of an interpolation step that estimates missing color values by exploiting spatial and spectral correlations among neighboring pixels, and a post-processing step that suppresses noticeable demosaicking artifacts by adaptive median filtering.
Abstract: Single-sensor digital cameras capture imagery by covering the sensor surface with a color filter array (CFA) such that each sensor pixel only samples one of three primary color values. To render a full-color image, an interpolation process, commonly referred to as CFA demosaicking, is required to estimate the other two missing color values at each pixel. In this paper, we present two contributions to CFA demosaicking: a new and improved CFA demosaicking method for producing high quality color images, and new image measures for quantifying the performance of demosaicking methods. The proposed demosaicking method consists of two successive steps: an interpolation step that estimates missing color values by exploiting spatial and spectral correlations among neighboring pixels, and a post-processing step that suppresses noticeable demosaicking artifacts by adaptive median filtering. Moreover, in recognition of the limitations of current image measures, we propose two types of image measures to quantify the performance of different demosaicking methods; the first type evaluates the fidelity of demosaicked images by computing the peak signal-to-noise ratio and CIELAB ΔE*_ab for edge and smooth regions separately, and the second type accounts for one major demosaicking artifact, the zipper effect. We gauge the proposed demosaicking method and image measures using several existing methods as benchmarks, and demonstrate their efficacy using a variety of test images.
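
The post-processing step can be illustrated with a much-simplified variant: median-filter the color-difference planes and rebuild the red and blue channels from the filtered differences, which attenuates zipper-like artifacts. Unlike the paper's adaptive filter, this sketch applies the median unconditionally, and the window size is an assumption.

```python
import numpy as np
from scipy.ndimage import median_filter

def suppress_demosaic_artifacts(rgb, size=3):
    """Simplified post-processing in the spirit of the paper's second step:
    median-filter the colour-difference planes (R-G, B-G) and rebuild R and B
    from the filtered differences.  The paper's filter is adaptive (applied
    only where artifacts are detected); this unconditional version is a sketch."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    r_new = g + median_filter(r - g, size)
    b_new = g + median_filter(b - g, size)
    out = np.stack([r_new, g, b_new], axis=-1)
    return np.clip(out, 0, 255)
```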

Journal ArticleDOI
TL;DR: This paper proposes a simple and efficient automatic gait recognition algorithm using statistical shape analysis that implicitly uses the action of walking to capture the structural characteristics of gait, especially the shape cues of body biometrics.
Abstract: Gait recognition has recently gained significant attention from computer vision researchers. This interest is strongly motivated by the need for automated person identification systems at a distance in visual surveillance and monitoring applications. The paper proposes a simple and efficient automatic gait recognition algorithm using statistical shape analysis. For each image sequence, an improved background subtraction procedure is used to extract moving silhouettes of a walking figure from the background. Temporal changes of the detected silhouettes are then represented as an associated sequence of complex vector configurations in a common coordinate frame, and are further analyzed using the Procrustes shape analysis method to obtain mean shape as gait signature. Supervised pattern classification techniques, based on the full Procrustes distance measure, are adopted for recognition. This method does not directly analyze the dynamics of gait, but implicitly uses the action of walking to capture the structural characteristics of gait, especially the shape cues of body biometrics. The algorithm is tested on a database consisting of 240 sequences from 20 different subjects walking at 3 viewing angles in an outdoor environment. Experimental results are included to demonstrate the encouraging performance of the proposed algorithm.
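
Classification in this method rests on the full Procrustes distance between mean shapes represented as complex vectors. A sketch of that distance for two centred configurations follows; silhouette extraction, boundary resampling, and the computation of the Procrustes mean shape are omitted.

```python
import numpy as np

def full_procrustes_distance(z1, z2):
    """Full Procrustes distance between two centred complex shape
    configurations (boundary points written as x + iy), the kind of measure
    used to compare mean gait shapes.  The centring and the distance formula
    follow standard statistical shape analysis; the rest of the paper's
    pipeline is not shown."""
    z1 = z1 - z1.mean()
    z2 = z2 - z2.mean()
    num = np.abs(np.vdot(z1, z2)) ** 2
    den = np.vdot(z1, z1).real * np.vdot(z2, z2).real
    return np.sqrt(max(0.0, 1.0 - num / den))
```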

Journal ArticleDOI
TL;DR: This work proposes to transfer the super-resolution reconstruction from pixel domain to a lower dimensional face space, and shows that face-space super- Resolution is more robust to registration errors and noise than pixel-domain super- resolution because of the addition of model-based constraints.
Abstract: Face images that are captured by surveillance cameras usually have a very low resolution, which significantly limits the performance of face recognition systems. In the past, super-resolution techniques have been proposed to increase the resolution by combining information from multiple images. These techniques use super-resolution as a preprocessing step to obtain a high-resolution image that is later passed to a face recognition system. Considering that most state-of-the-art face recognition systems use an initial dimensionality reduction method, we propose to transfer the super-resolution reconstruction from pixel domain to a lower dimensional face space. Such an approach has the advantage of a significant decrease in the computational complexity of the super-resolution reconstruction. The reconstruction algorithm no longer tries to obtain a visually improved high-quality image, but instead constructs the information required by the recognition system directly in the low dimensional domain without any unnecessary overhead. In addition, we show that face-space super-resolution is more robust to registration errors and noise than pixel-domain super-resolution because of the addition of model-based constraints.

Journal ArticleDOI
TL;DR: This work investigates central issues such as invertibility, stability, synchronization, and frequency characteristics for nonlinear wavelet transforms built using the lifting framework and describes how earlier families of nonlinear filter banks can be extended through the use of prediction functions operating on a causal neighborhood of pixels.
Abstract: We investigate central issues such as invertibility, stability, synchronization, and frequency characteristics for nonlinear wavelet transforms built using the lifting framework. The nonlinearity comes from adaptively choosing between a class of linear predictors within the lifting framework. We also describe how earlier families of nonlinear filter banks can be extended through the use of prediction functions operating on a causal neighborhood of pixels. Preliminary compression results for model and real-world images demonstrate the promise of our techniques.
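
The sketch below shows the kind of adaptivity involved in one 1-D prediction (lifting) step: the predictor for each odd sample is selected using only the even samples, so a decoder can repeat the selection and invert the step exactly. The predictor class and the switching rule here are toy assumptions, not the paper's.

```python
import numpy as np

def adaptive_predict_step(x):
    """One adaptive prediction (lifting) step on a 1-D signal of even length.
    The predictor for each odd sample is chosen from {left even neighbour,
    average of the two even neighbours} using only the even samples, so the
    synthesis side can repeat the choice and invert the step.  This is a toy
    version of the adaptivity discussed in the paper, not its predictor class."""
    even = x[0::2].astype(float)
    odd = x[1::2].astype(float)
    left = even
    right = np.roll(even, -1)                  # periodic extension for simplicity
    grad = np.abs(right - left)
    # near a strong even-sample discontinuity, predict from one side only
    pred = np.where(grad > 4 * np.median(grad),
                    left,                      # assumed one-sided choice at edges
                    0.5 * (left + right))      # smooth regions: average
    detail = odd - pred
    return even, detail                        # coarse samples and detail signal

# Inversion: recompute pred from `even` exactly as above, then odd = detail + pred.
```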

Journal ArticleDOI
TL;DR: The decomposition of the anisotropic Gaussian into a one-dimensional (1-D) Gaussian filter in the x-direction followed by a 1-D filter in a nonorthogonal direction φ is derived.
Abstract: We derive the decomposition of the anisotropic Gaussian into a one-dimensional (1-D) Gaussian filter in the x-direction followed by a 1-D filter in a nonorthogonal direction φ. Thus, the anisotropic Gaussian, too, can be decomposed by dimension, which is extremely efficient from a computing perspective. An implementation scheme for normal convolution and for recursive filtering is proposed. Directed derivative filters are also demonstrated. For the recursive implementation, filtering a 512 × 512 image is performed within 40 ms on a current state-of-the-art PC, gaining over 3 times in performance for a typical filter, independent of the standard deviations and orientation of the filter. Accuracy of the filters is still reasonable when compared to truncation error or recursive approximation error. The anisotropic Gaussian filtering method allows fast calculation of edge and ridge maps, with high spatial and angular accuracy. For tracking applications, the normal anisotropic convolution scheme is more advantageous, with applications in the detection of dashed lines in engineering drawings. The recursive implementation is more attractive in feature detection applications, for instance in affine-invariant edge and ridge detection in computer vision. The proposed computational filtering method enables the practical applicability of orientation scale-space analysis.

Journal ArticleDOI
TL;DR: A class of hue-preserving, contrast-enhancing transformations is proposed; they generalize existing grey scale contrast intensification techniques to color images and are seen to bypass the above mentioned color coordinate transformations for image enhancement.
Abstract: The first step in many techniques for processing intensity and saturation in color images while keeping hue unaltered is the transformation of the image data from RGB space to other color spaces such as LHS, HSI, YIQ, HSV, etc. Transforming from one space to another and processing in these spaces usually generate a gamut problem, i.e., the values of the variables may not stay in their respective intervals. We study enhancement techniques for color images theoretically in a generalized setup. A principle is suggested to make the transformations free of the gamut problem. Using the same principle, a class of hue-preserving, contrast-enhancing transformations is proposed; they generalize existing grey-scale contrast intensification techniques to color images. These transformations are also seen to bypass the above-mentioned color coordinate transformations for image enhancement. The developed principle is used to generalize the histogram equalization scheme for grey-scale images to color images.
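
One schematic reading of the hue-preserving, gamut-free principle is sketched below: a grey-scale transformation f is applied to the intensity, and the RGB vector is then scaled either towards black (when intensity decreases) or towards white (when it increases). Both operations scale the channel differences by a common positive factor, so hue is unchanged and no channel leaves [0, 1]. This is an illustration of the principle, not the paper's exact algebra.

```python
import numpy as np

def hue_preserving_enhance(rgb, f):
    """Apply a grey-scale contrast transformation f to the intensity of an RGB
    image (float in [0, 1]) while keeping hue fixed and staying inside the
    gamut: when the new intensity is lower, scale the RGB vector towards
    black; when it is higher, scale its complement (distance to white).
    Schematic sketch of the principle described in the abstract."""
    rgb = rgb.astype(float)
    l = rgb.mean(axis=-1, keepdims=True)              # intensity (R+G+B)/3
    l_new = f(l)
    shrink = np.divide(l_new, l, out=np.ones_like(l), where=l > 0)
    stretch = np.divide(1 - l_new, 1 - l, out=np.ones_like(l), where=l < 1)
    scaled_down = rgb * shrink                        # l_new <= l: scale towards black
    scaled_up = 1 - (1 - rgb) * stretch               # l_new  > l: scale towards white
    return np.where(l_new <= l, scaled_down, scaled_up)

# Example use with a simple global contrast stretch (an illustrative f):
# out = hue_preserving_enhance(img, lambda l: np.clip(1.5 * (l - 0.5) + 0.5, 0, 1))
```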

Journal ArticleDOI
TL;DR: A decision-based, signal-adaptive median filtering algorithm for removal of impulse noise, which achieves accurate noise detection and high SNR measures without smearing the fine details and edges in the image.
Abstract: We propose a decision-based, signal-adaptive median filtering algorithm for removal of impulse noise. Our algorithm achieves accurate noise detection and high SNR measures without smearing the fine details and edges in the image. The notion of homogeneity level is defined for pixel values based on their global and local statistical properties. The cooccurrence matrix technique is used to represent the correlations between a pixel and its neighbors, and to derive the upper and lower bound of the homogeneity level. Noise detection is performed at two stages: noise candidates are first selected using the homogeneity level, and then a refining process follows to eliminate false detections. The noise detection scheme does not use a quantitative decision measure, but uses qualitative structural information, and it is not subject to burdensome computations for optimization of the threshold values. Empirical results indicate that our scheme performs significantly better than other median filters, in terms of noise suppression and detail preservation.
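
The overall structure, detect first and then filter only the detected pixels, can be shown with a much simpler detector than the paper's (which builds homogeneity levels from co-occurrence statistics and adds a refinement stage). In the sketch below a pixel is flagged as an impulse when it differs from the local median by more than an assumed threshold.

```python
import numpy as np
from scipy.ndimage import median_filter

def switching_median(img, threshold=40, size=3):
    """Switching median filter: replace a pixel with the local median only if
    it is judged to be an impulse (here: far from the local median), leaving
    all other pixels untouched so that fine detail is preserved.  The detector
    and threshold are simplified stand-ins for the paper's scheme."""
    med = median_filter(img, size=size)
    noisy = np.abs(img.astype(int) - med.astype(int)) > threshold
    out = img.copy()
    out[noisy] = med[noisy]
    return out
```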

Journal ArticleDOI
TL;DR: This work introduces a registration algorithm that combines a simple yet powerful search strategy based on a stochastic gradient with two similarity measures, correlation and mutual information, together with a wavelet-based multiresolution pyramid, and shows that mutual information may be better suited for sub-pixel registration than correlation.
Abstract: Image registration is the process by which we determine a transformation that provides the most accurate match between two images. The search for the matching transformation can be automated with the use of a suitable metric, but it can be very time-consuming and tedious. We introduce a registration algorithm that combines a simple yet powerful search strategy based on a stochastic gradient with two similarity measures, correlation and mutual information, together with a wavelet-based multiresolution pyramid. We limit our study to pairs of images, which are misaligned by rotation and/or translation, and present two main results. First, we demonstrate that, in our application, mutual information may be better suited for sub-pixel registration as it produces consistently sharper optimum peaks than correlation. Then, we show that the stochastic gradient search combined with either measure produces accurate results when applied to synthetic data, as well as to multitemporal or multisensor collections of satellite data. Mutual information is generally found to optimize with one-third the number of iterations required by correlation. Results also show that a multiresolution implementation of the algorithm yields significant improvements in terms of both speed and robustness over a single-resolution implementation.
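
The mutual-information similarity measure at the core of the search can be estimated from a joint grey-level histogram, as sketched below; the stochastic-gradient search, the correlation alternative, and the wavelet pyramid are not shown, and the bin count is an assumption.

```python
import numpy as np

def mutual_information(a, b, bins=64):
    """Mutual information of two equally sized images estimated from their
    joint grey-level histogram; this is the similarity measure the
    registration search would maximize over candidate rotations/translations."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz]))
```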

Journal ArticleDOI
TL;DR: It is shown that from any initial value the sequence generated by the SART converges to a weighted least square solution under the condition that coefficients of the linear imaging system are non-negative.
Abstract: Computed tomography (CT) has been extensively studied for years and widely used in the modern society. Although the filtered back-projection algorithm is the method of choice by manufacturers, efforts are being made to revisit iterative methods due to their unique advantages, such as superior performance with incomplete noisy data. In 1984, the simultaneous algebraic reconstruction technique (SART) was developed as a major refinement of the algebraic reconstruction technique (ART). However, the convergence of the SART has never been established since then. In this paper, the convergence is proved under the condition that coefficients of the linear imaging system are nonnegative. It is shown that from any initial guess the sequence generated by the SART converges to a weighted least square solution.
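
For reference, a dense-matrix sketch of the SART iteration whose convergence the paper establishes is given below; practical CT code computes the projections on the fly instead of storing A, and the relaxation parameter here is an assumption.

```python
import numpy as np

def sart(A, b, n_iter=20, lam=1.0, x0=None):
    """Simultaneous algebraic reconstruction technique for Ax ~ b with a
    nonnegative system matrix A (the condition under which the paper proves
    convergence to a weighted least-squares solution).  Dense-matrix sketch;
    row and column sums of A are assumed strictly positive."""
    m, n = A.shape
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    row_sums = A.sum(axis=1)            # sum_j a_ij
    col_sums = A.sum(axis=0)            # sum_i a_ij
    for _ in range(n_iter):
        residual = (b - A @ x) / row_sums       # normalized ray residuals
        x = x + lam * (A.T @ residual) / col_sums
    return x
```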

Journal ArticleDOI
TL;DR: A new framework for highly scalable video compression is proposed, using a lifting-based invertible motion adaptive transform (LIMAT) and a compact representation for the motion parameters, having motion overhead comparable to that of motion-compensated predictive coders.
Abstract: We propose a new framework for highly scalable video compression, using a lifting-based invertible motion adaptive transform (LIMAT). We use motion-compensated lifting steps to implement the temporal wavelet transform, which preserves invertibility, regardless of the motion model. By contrast, the invertibility requirement has restricted previous approaches to either block-based or global motion compensation. We show that the proposed framework effectively applies the temporal wavelet transform along a set of motion trajectories. An implementation demonstrates high coding gain from a finely embedded, scalable compressed bit-stream. Results also demonstrate the effectiveness of temporal wavelet kernels other than the simple Haar, and the benefits of complex motion modeling, using a deformable triangular mesh. These advances are either incompatible or difficult to achieve with previously proposed strategies for scalable video compression. Video sequences reconstructed at reduced frame-rates, from subsets of the compressed bit-stream, demonstrate the visually pleasing properties expected from low-pass filtering along the motion trajectories. The paper also describes a compact representation for the motion parameters, having motion overhead comparable to that of motion-compensated predictive coders. Our experimental results compare favorably to others reported in the literature; however, our principal objective is to motivate a new framework for highly scalable video compression.
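
The key structural point, that motion-compensated lifting stays invertible for any motion model, can be seen in a single Haar-style temporal step, sketched below. warp_fwd and warp_bwd are hypothetical caller-supplied motion-compensation operators; whatever they compute, repeating them in the synthesis step undoes the analysis step exactly.

```python
import numpy as np

def mc_haar_lift(frame_even, frame_odd, warp_fwd, warp_bwd):
    """One motion-compensated Haar-style lifting step on a frame pair: the
    predict step removes the motion-compensated even frame from the odd frame,
    and the update step adds half of the back-warped detail to the even frame.
    warp_fwd/warp_bwd are hypothetical warping callables; invertibility does
    not depend on what they do."""
    h = frame_odd - warp_fwd(frame_even)       # high-pass (detail) frame
    l = frame_even + 0.5 * warp_bwd(h)         # low-pass (temporal average) frame
    return l, h

def mc_haar_unlift(l, h, warp_fwd, warp_bwd):
    """Exact inverse of mc_haar_lift, using the same warping operators."""
    frame_even = l - 0.5 * warp_bwd(h)
    frame_odd = h + warp_fwd(frame_even)
    return frame_even, frame_odd
```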

Journal ArticleDOI
TL;DR: This work addresses two problems that are often encountered in object recognition: object segmentation, for which a distance sets shape filter is formulated, and shape matching, which is illustrated on printed and handwritten character recognition and detection of traffic signs in complex scenes.
Abstract: We introduce a novel rich local descriptor of an image point, we call the (labeled) distance set, which is determined by the spatial arrangement of image features around that point. We describe a two-dimensional (2D) visual object by the set of (labeled) distance sets associated with the feature points of that object. Based on a dissimilarity measure between (labeled) distance sets and a dissimilarity measure between sets of (labeled) distance sets, we address two problems that are often encountered in object recognition: object segmentation, for which we formulate a distance sets shape filter, and shape matching. The use of the shape filter is illustrated on printed and handwritten character recognition and detection of traffic signs in complex scenes. The shape comparison procedure is illustrated on handwritten character classification, COIL-20 database object recognition and MPEG-7 silhouette database retrieval.

Journal ArticleDOI
TL;DR: An approach for filling-in blocks of missing data in wireless image transmission is presented, which aims to reconstruct the lost data using correlation between the lost block and its neighbors.
Abstract: An approach for filling-in blocks of missing data in wireless image transmission is presented. When compression algorithms such as JPEG are used as part of the wireless transmission process, images are first tiled into blocks of 8 × 8 pixels. When such images are transmitted over fading channels, the effects of noise can destroy entire blocks of the image. Instead of using common retransmission query protocols, we aim to reconstruct the lost data using correlation between the lost block and its neighbors. If the lost block contained structure, it is reconstructed using an image inpainting algorithm, while texture synthesis is used for the textured blocks. The switch between the two schemes is done in a fully automatic fashion based on the surrounding available blocks. The performance of this method is tested for various images and combinations of lost blocks. The viability of this method for image compression, in association with lossy JPEG, is also discussed.

Journal ArticleDOI
TL;DR: This Part I of a two-part paper addresses a number of fundamental issues of data hiding in image and video and proposes general solutions to them and proposes an adaptive solution switching between using constant embedding rate with shuffling and using variable embedding rates with embedded control bits.
Abstract: We address a number of fundamental issues of data hiding in image and video and propose general solutions to them. We begin with a review of two major types of embedding, based on which we propose a new multilevel embedding framework to allow the amount of extractable data to be adaptive according to the actual noise condition. We then study the issues of hiding multiple bits through a comparison of various modulation and multiplexing techniques. Finally, the nonstationary nature of visual signals leads to highly uneven distribution of embedding capacity and causes difficulty in data hiding. We propose an adaptive solution switching between using constant embedding rate with shuffling and using variable embedding rate with embedded control bits. We verify the effectiveness of our proposed solutions through analysis and simulation.

Journal ArticleDOI
TL;DR: A new feedback approach with progressive learning capability combined with a novel method for the feature subspace extraction based on a Bayesian classifier that treats positive and negative feedback examples with different strategies to improve the retrieval accuracy.
Abstract: Research has been devoted in the past few years to relevance feedback as an effective solution to improve performance of content-based image retrieval (CBIR). In this paper, we propose a new feedback approach with progressive learning capability combined with a novel method for the feature subspace extraction. The proposed approach is based on a Bayesian classifier and treats positive and negative feedback examples with different strategies. Positive examples are used to estimate a Gaussian distribution that represents the desired images for a given query; while the negative examples are used to modify the ranking of the retrieved candidates. In addition, feature subspace is extracted and updated during the feedback process using a principal component analysis (PCA) technique and based on user's feedback. That is, in addition to reducing the dimensionality of feature spaces, a proper subspace for each type of features is obtained in the feedback process to further improve the retrieval accuracy. Experiments demonstrate that the proposed method increases the retrieval speed, reduces the required memory and improves the retrieval accuracy significantly.

Journal ArticleDOI
TL;DR: An asymptotically optimal detector is constructed based on well-known results from detection theory, and experimental results prove the superiority of the proposed detector over the correlation detector.
Abstract: Most of the watermarking schemes that have been proposed until now employ a correlation detector (matched filter). The current paper proposes a new detector scheme that can be applied in the case of additive watermarking in the DCT (discrete cosine transform) or DWT (discrete wavelet transform) domain. Certain properties of the probability density function of the coefficients in these domains are exploited. Thus, an asymptotically optimal detector is constructed based on well-known results from detection theory. Experimental results prove the superiority of the proposed detector over the correlation detector.
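
As an illustration of the kind of detector meant here, the sketch below computes a log-likelihood-ratio statistic for an additive watermark under an i.i.d. generalized Gaussian model of the transform coefficients; the shape and scale parameters are placeholders (in practice they would be estimated from the data), and the decision threshold is set for a target false-alarm rate.

```python
import numpy as np

def gg_detector_statistic(coeffs, watermark, alpha, c=0.8, s=1.0):
    """Log-likelihood-ratio statistic for detecting an additive watermark
    alpha*watermark in transform coefficients modelled as i.i.d. generalized
    Gaussian with shape c and scale s (the kind of non-Gaussian model that
    motivates replacing the correlation detector).  c and s are placeholders."""
    y = coeffs.ravel().astype(float)
    w = watermark.ravel().astype(float)
    # log p(y | watermark) - log p(y | no watermark) under the GG model
    return np.sum((np.abs(y) ** c - np.abs(y - alpha * w) ** c) / (s ** c))

# Declare "watermark present" when the statistic exceeds a threshold chosen
# for a target false-alarm rate (e.g. estimated on unwatermarked images).
```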

Journal ArticleDOI
TL;DR: A foveation scalable video coding (FSVC) algorithm which supplies good quality-compression performance as well as effective rate scalability, and is adaptable to different applications, such as knowledge-based video coding and video communications over time-varying, multiuser and interactive networks.
Abstract: Image and video coding is an optimization problem. A successful image and video coding algorithm delivers a good tradeoff between visual quality and other coding performance measures, such as compression, complexity, scalability, robustness, and security. In this paper, we follow two recent trends in image and video coding research. One is to incorporate human visual system (HVS) models to improve the current state-of-the-art of image and video coding algorithms by better exploiting the properties of the intended receiver. The other is to design rate scalable image and video codecs, which allow the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream. Specifically, we propose a foveation scalable video coding (FSVC) algorithm which supplies good quality-compression performance as well as effective rate scalability. The key idea is to organize the encoded bitstream to provide the best decoded video at an arbitrary bit rate in terms of foveated visual quality measurement. A foveation-based HVS model plays an important role in the algorithm. The algorithm is adaptable to different applications, such as knowledge-based video coding and video communications over time-varying, multiuser and interactive networks.

Journal ArticleDOI
TL;DR: A filter selection algorithm is proposed to maximize classification performance of a given dataset and the spectral histogram representation provides a robust feature statistic for textures and generalizes well.
Abstract: Based on a local spatial/frequency representation, we employ a spectral histogram as a feature statistic for texture classification. The spectral histogram consists of marginal distributions of responses of a bank of filters and encodes implicitly the local structure of images through the filtering stage and the global appearance through the histogram stage. The distance between two spectral histograms is measured using the χ² statistic. The spectral histogram with the associated distance measure exhibits several properties that are necessary for texture classification. A filter selection algorithm is proposed to maximize classification performance of a given dataset. Our classification experiments using natural texture images reveal that the spectral histogram representation provides a robust feature statistic for textures and generalizes well. Comparisons show that our method produces a marked improvement in classification performance. Finally, we point out the relationships between existing texture features and the spectral histogram, suggesting that the latter may provide a unified texture feature.
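
A small sketch of the representation and the distance follows. The filter bank here (intensity, two Sobel gradients, a Laplacian-of-Gaussian) is a placeholder for the paper's selected filters, and for real use the histogram bin edges must be fixed across images so that feature vectors are comparable.

```python
import numpy as np
from scipy.ndimage import sobel, gaussian_laplace

def spectral_histogram(img, bins=15):
    """Spectral histogram of an image patch: marginal histograms of the
    responses of a small filter bank, concatenated into one feature vector.
    The bank and bin count are illustrative assumptions."""
    img = np.asarray(img, dtype=float)
    responses = [img,
                 sobel(img, axis=0),
                 sobel(img, axis=1),
                 gaussian_laplace(img, sigma=1.5)]
    hists = []
    for r in responses:
        h, _ = np.histogram(r, bins=bins)   # fix the bin range in practice
        hists.append(h / h.sum())
    return np.concatenate(hists)

def chi2_distance(h1, h2, eps=1e-12):
    """Chi-square distance between two spectral histograms."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
```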

Journal ArticleDOI
TL;DR: A set of theoretical criteria for a subclass of region-growing algorithms that are insensitive to the selection of the initial growing points are defined and this class of algorithms, referred to as symmetric region growing algorithms, leads to a single-pass region- growing algorithm applicable to any dimensionality of images.
Abstract: Of the many proposed image segmentation methods, region growing has been one of the most popular. Research on region growing, however, has focused primarily on the design of feature measures and on growing and merging criteria. Most of these methods have an inherent dependence on the order in which the points and regions are examined. This weakness implies that a desired segmented result is sensitive to the selection of the initial growing points. We define a set of theoretical criteria for a subclass of region-growing algorithms that are insensitive to the selection of the initial growing points. This class of algorithms, referred to as symmetric region growing algorithms, leads to a single-pass region-growing algorithm applicable to any dimensionality of images. Furthermore, they lead to region-growing algorithms that are both memory- and computation-efficient. Results illustrate the method's efficiency and its application to 3D medical image segmentation.
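
The order-independence property can be illustrated with the simplest symmetric criterion, a pairwise intensity-difference test between neighbors: the grown region is then just the connected component of the seed under that relation, regardless of the order in which points are visited. The threshold, 4-connectivity, and the 2-D setting below are assumptions of this sketch, not the paper's general formulation.

```python
import numpy as np
from collections import deque

def symmetric_region_grow(img, seed, t=10):
    """Region growing with a symmetric, pairwise criterion: neighbouring
    pixels p, q are merged whenever |img[p] - img[q]| <= t.  Because the
    criterion depends only on the unordered pair, the result is independent
    of the visiting order, which is the property the paper formalizes.
    2-D, 4-connected sketch; seed is an (i, j) tuple."""
    h, w = img.shape
    grown = np.zeros((h, w), dtype=bool)
    grown[seed] = True
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w and not grown[ni, nj]:
                if abs(int(img[ni, nj]) - int(img[i, j])) <= t:
                    grown[ni, nj] = True
                    queue.append((ni, nj))
    return grown
```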