
Showing papers in "IEEE Transactions on Image Processing in 2005"


Journal ArticleDOI
TL;DR: A "true" two-dimensional transform that can capture the intrinsic geometrical structure that is key in visual information is pursued and it is shown that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for piecewise smooth functions with discontinuities along twice continuously differentiable curves.
Abstract: The limitations of commonly used separable extensions of one-dimensional transforms, such as the Fourier and wavelet transforms, in capturing the geometry of image edges are well known. In this paper, we pursue a "true" two-dimensional transform that can capture the intrinsic geometrical structure that is key in visual information. The main challenge in exploring geometry in images comes from the discrete nature of the data. Thus, unlike other approaches, such as curvelets, that first develop a transform in the continuous domain and then discretize for sampled data, our approach starts with a discrete-domain construction and then studies its convergence to an expansion in the continuous domain. Specifically, we construct a discrete-domain multiresolution and multidirection expansion using nonseparable filter banks, in much the same way that wavelets were derived from filter banks. This construction results in a flexible multiresolution, local, and directional image expansion using contour segments, and, thus, it is named the contourlet transform. The discrete contourlet transform has a fast iterated filter bank algorithm that requires on the order of N operations for N-pixel images. Furthermore, we establish a precise link between the developed filter bank and the associated continuous-domain contourlet expansion via a directional multiresolution analysis framework. We show that with parabolic scaling and sufficient directional vanishing moments, contourlets achieve the optimal approximation rate for piecewise smooth functions with discontinuities along twice continuously differentiable curves. Finally, we show some numerical experiments demonstrating the potential of contourlets in several image processing applications.
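
As a concrete reading of the rate claim above, the bound reported for this class of images (C² regions separated by C² curves) can be summarized as follows; the notation is a standard paraphrase rather than a quotation from the paper.

```latex
% Best M-term nonlinear approximation with contourlets, assuming parabolic
% scaling and sufficient directional vanishing moments, for an image f that
% is C^2 away from C^2 discontinuity curves (\hat{f}_M keeps the M largest
% coefficients):
\| f - \hat{f}_M \|_2^2 \;\le\; C\,(\log M)^3\, M^{-2}
% versus the O(M^{-1}) decay attainable with separable 2-D wavelet bases.
```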

3,948 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a systematic survey of the common processing steps and core decision rules in modern change detection algorithms, including significance and hypothesis testing, predictive models, the shading model, and background modeling.
Abstract: Detecting regions of change in multiple images of the same scene taken at different times is of widespread interest due to a large number of applications in diverse disciplines, including remote sensing, surveillance, medical diagnosis and treatment, civil infrastructure, and underwater sensing. This paper presents a systematic survey of the common processing steps and core decision rules in modern change detection algorithms, including significance and hypothesis testing, predictive models, the shading model, and background modeling. We also discuss important preprocessing methods, approaches to enforcing the consistency of the change mask, and principles for evaluating and comparing the performance of change detection algorithms. It is hoped that our classification of algorithms into a relatively small number of categories will provide useful guidance to the algorithm designer.

1,693 citations


Journal ArticleDOI
TL;DR: This paper proposes a novel information fidelity criterion that is based on natural scene statistics and derives a novel QA algorithm that provides clear advantages over the traditional approaches and outperforms current methods in testing.
Abstract: Measurement of visual quality is of fundamental importance to numerous image and video processing applications. The goal of quality assessment (QA) research is to design algorithms that can automatically assess the quality of images or videos in a perceptually consistent manner. Traditionally, image QA algorithms interpret image quality as fidelity or similarity with a "reference" or "perfect" image in some perceptual space. Such "full-reference" QA methods attempt to achieve consistency in quality prediction by modeling salient physiological and psychovisual features of the human visual system (HVS), or by arbitrary signal fidelity criteria. In this paper, we approach the problem of image QA by proposing a novel information fidelity criterion that is based on natural scene statistics. QA systems are invariably involved with judging the visual quality of images and videos that are meant for "human consumption". Researchers have developed sophisticated models to capture the statistics of natural signals, that is, pictures and videos of the visual environment. Using these statistical models in an information-theoretic setting, we derive a novel QA algorithm that provides clear advantages over the traditional approaches. In particular, it is parameterless and outperforms current methods in our testing. We validate the performance of our algorithm with an extensive subjective study involving 779 images. We also show that, although our approach distinctly departs from traditional HVS-based methods, it is functionally similar to them under certain conditions, yet it outperforms them due to improved modeling. The code and the data from the subjective study are available at [1].
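
As a rough illustration of the information-theoretic flavor of such a criterion (not the paper's actual natural-scene-statistics model), the sketch below scores a distorted wavelet subband against its reference through a simple "gain plus additive Gaussian noise" channel; the function name and block size are hypothetical.

```python
import numpy as np

def gaussian_channel_fidelity(ref_band, dist_band, block=8, eps=1e-8):
    """Toy information-fidelity score for one wavelet subband.

    Assumes a per-block Gaussian source and a distortion channel
    D = g*C + V (gain plus additive noise), a simplification of the
    GSM-based model used in the paper.  Returns accumulated mutual
    information in bits; higher means higher fidelity.
    """
    score = 0.0
    h, w = ref_band.shape
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            c = ref_band[i:i + block, j:j + block].ravel().astype(np.float64)
            d = dist_band[i:i + block, j:j + block].ravel().astype(np.float64)
            var_c = c.var() + eps
            # Least-squares estimates of the channel gain and noise variance.
            g = np.dot(c - c.mean(), d - d.mean()) / (c.size * var_c)
            var_v = (d - g * c).var() + eps
            # Mutual information of the per-block Gaussian channel.
            score += 0.5 * np.log2(1.0 + g * g * var_c / var_v)
    return score
```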

1,334 citations


Journal ArticleDOI
TL;DR: The proposed two-phase scheme can remove salt-and-pepper noise with a noise level as high as 90%, and the restored images show a significant improvement over those produced by nonlinear filters or regularization methods alone.
Abstract: This paper proposes a two-phase scheme for removing salt-and-pepper impulse noise. In the first phase, an adaptive median filter is used to identify pixels which are likely to be contaminated by noise (noise candidates). In the second phase, the image is restored using a specialized regularization method that applies only to those selected noise candidates. In terms of edge preservation and noise suppression, our restored images show a significant improvement compared to those restored by nonlinear filters or regularization methods alone. Our scheme can remove salt-and-pepper noise with a noise level as high as 90%.
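
A minimal sketch of the first phase, assuming a standard adaptive median filter as the impulse detector; the exact detector and all second-phase regularization details are in the paper, and the function name and window cap below are illustrative.

```python
import numpy as np

def detect_noise_candidates(img, w_max=7):
    """Flag likely salt-and-pepper pixels with an adaptive median test.

    A pixel is kept only if a grown window finds a non-impulse median and
    the pixel itself is not a local extreme; otherwise it is marked as a
    noise candidate for the second (regularization) phase.
    """
    h, w = img.shape
    pad = w_max // 2
    padded = np.pad(img, pad, mode='reflect')
    candidates = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            for win in range(3, w_max + 1, 2):
                r = win // 2
                block = padded[y + pad - r:y + pad + r + 1,
                               x + pad - r:x + pad + r + 1]
                zmin, zmed, zmax = block.min(), np.median(block), block.max()
                if zmin < zmed < zmax:              # median is not an impulse
                    candidates[y, x] = not (zmin < img[y, x] < zmax)
                    break                           # decision made
            else:                                   # window cap reached
                candidates[y, x] = True
    return candidates
```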

1,078 citations


Journal ArticleDOI
TL;DR: In this paper, a generalization of the well-known least significant bit (LSB) modification is proposed as the data-embedding method, which introduces additional operating points on the capacity-distortion curve.
Abstract: We present a novel lossless (reversible) data-embedding technique, which enables the exact recovery of the original host signal upon extraction of the embedded information. A generalization of the well-known least significant bit (LSB) modification is proposed as the data-embedding method, which introduces additional operating points on the capacity-distortion curve. Lossless recovery of the original is achieved by compressing portions of the signal that are susceptible to embedding distortion and transmitting these compressed descriptions as a part of the embedded payload. A prediction-based conditional entropy coder which utilizes unaltered portions of the host signal as side-information improves the compression efficiency and, thus, the lossless data-embedding capacity.
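
A minimal sketch of the generalized-LSB idea described above, assuming a quantization level L; the residues that must be losslessly compressed and folded into the payload for exact recovery are returned explicitly here, and the function names are illustrative.

```python
import numpy as np

def glsb_embed(pixels, symbols, L=4):
    """Replace the lowest quantization level of each pixel with a payload
    symbol w in {0, ..., L-1}; L=2 reduces to ordinary LSB substitution.

    In the full scheme, the residues (pixels % L) would be compressed and
    embedded as part of the payload so the host can be restored exactly.
    """
    pixels = np.asarray(pixels, dtype=np.int64)
    residues = pixels % L
    watermarked = (pixels // L) * L + (np.asarray(symbols) % L)
    return watermarked, residues

def glsb_extract(watermarked, residues, L=4):
    """Recover the payload symbols and restore the original pixels."""
    watermarked = np.asarray(watermarked, dtype=np.int64)
    symbols = watermarked % L
    original = (watermarked // L) * L + residues
    return symbols, original
```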

1,058 citations


Journal ArticleDOI
TL;DR: A novel method for separating images into texture and piecewise smooth (cartoon) parts, exploiting both the variational and the sparsity mechanisms is presented, combining the basis pursuit denoising (BPDN) algorithm and the total-variation (TV) regularization scheme.
Abstract: The separation of image content into semantic parts plays a vital role in applications such as compression, enhancement, restoration, and more. In recent years, several pioneering works suggested basing such a separation on a variational formulation, while others used independent component analysis and sparsity. This paper presents a novel method for separating images into texture and piecewise smooth (cartoon) parts, exploiting both the variational and the sparsity mechanisms. The method combines the basis pursuit denoising (BPDN) algorithm and the total-variation (TV) regularization scheme. The basic idea presented in this paper is the use of two appropriate dictionaries, one for the representation of textures and the other for the natural scene parts assumed to be piecewise smooth. Both dictionaries are chosen such that they lead to sparse representations over one type of image content (either texture or piecewise smooth). The use of BPDN with the two amalgamated dictionaries leads to the desired separation, along with noise removal as a by-product. As choosing proper dictionaries is generally hard, TV regularization is employed to better direct the separation process and reduce ringing artifacts. We present a highly efficient numerical scheme to solve the combined optimization problem posed by our model and show several experimental results that validate the algorithm's performance.
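
A schematic way to write the combined objective implied by the abstract, assuming T_t and T_n denote the texture and cartoon dictionaries and alpha_t, alpha_n their coefficient vectors; the exact weighting and constraints follow the paper.

```latex
% Schematic combined BPDN + TV objective for separating an observed image f
% into texture (T_t * alpha_t) and cartoon (T_n * alpha_n) parts:
\{\hat{\alpha}_t,\hat{\alpha}_n\}
   = \arg\min_{\alpha_t,\,\alpha_n}
     \|\alpha_t\|_1 + \|\alpha_n\|_1
     + \lambda\,\| f - T_t\alpha_t - T_n\alpha_n \|_2^2
     + \gamma\, \mathrm{TV}(T_n\alpha_n)
```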

1,032 citations


Journal ArticleDOI
TL;DR: A new class of bases is introduced, called bandelet bases, which decompose the image along multiscale vectors that are elongated in the direction of a geometric flow, leading to optimal approximation rates for geometrically regular images.
Abstract: This paper introduces a new class of bases, called bandelet bases, which decompose the image along multiscale vectors that are elongated in the direction of a geometric flow. This geometric flow indicates directions in which the image gray levels have regular variations. The image decomposition in a bandelet basis is implemented with a fast subband-filtering algorithm. Bandelet bases lead to optimal approximation rates for geometrically regular images. For image compression and noise removal applications, the geometric flow is optimized with fast algorithms so that the resulting bandelet basis produces minimum distortion. Comparisons are made with wavelet image compression and noise-removal algorithms.

922 citations


Journal ArticleDOI
TL;DR: It is claimed that natural scenes contain nonlinear dependencies that are disturbed by the compression process, and that this disturbance can be quantified and related to human perceptions of quality.
Abstract: Measurement of image or video quality is crucial for many image-processing algorithms, such as acquisition, compression, restoration, enhancement, and reproduction. Traditionally, image quality assessment (QA) algorithms interpret image quality as similarity with a "reference" or "perfect" image. The obvious limitation of this approach is that the reference image or video may not be available to the QA algorithm. The field of blind, or no-reference, QA, in which image quality is predicted without the reference image or video, has been largely unexplored, with algorithms focussing mostly on measuring the blocking artifacts. Emerging image and video compression technologies can avoid the dreaded blocking artifact by using various mechanisms, but they introduce other types of distortions, specifically blurring and ringing. In this paper, we propose to use natural scene statistics (NSS) to blindly measure the quality of images compressed by JPEG2000 (or any other wavelet based) image coder. We claim that natural scenes contain nonlinear dependencies that are disturbed by the compression process, and that this disturbance can be quantified and related to human perceptions of quality. We train and test our algorithm with data from human subjects, and show that reasonably comprehensive NSS models can help us in making blind, but accurate, predictions of quality. Our algorithm performs close to the limit imposed on useful prediction by the variability between human subjects.

612 citations


Journal ArticleDOI
TL;DR: The result is a new filter capable of effectively reducing both Gaussian and impulse noise in corrupted images, which performs remarkably well in terms of both quantitative measures of signal restoration and qualitative judgments of image quality.
Abstract: We introduce a local image statistic for identifying noise pixels in images corrupted with random-valued impulse noise. The statistic quantifies how different in intensity a pixel is from its most similar neighbors. We then demonstrate how this statistic may be incorporated into a filter designed to remove additive Gaussian noise. The result is a new filter capable of effectively reducing both Gaussian and impulse noise in corrupted images, which performs remarkably well in terms of both quantitative measures of signal restoration and qualitative judgments of image quality. Our approach is extended to automatically remove any mix of Gaussian and impulse noise.
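
A minimal sketch of one plausible form of the local statistic described above (sum of the absolute differences to the m most similar of the eight neighbors); the function name and the default m are illustrative, and the thresholding and filter integration steps are omitted.

```python
import numpy as np

def impulse_statistic(img, m=4):
    """For each pixel, sum the m smallest absolute intensity differences to
    its eight neighbors.  Pixels that differ even from their most similar
    neighbors (large values) are likely impulse-corrupted.
    """
    img = img.astype(np.float64)
    h, w = img.shape
    padded = np.pad(img, 1, mode='reflect')
    diffs = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            diffs.append(np.abs(shifted - img))
    diffs = np.sort(np.stack(diffs, axis=0), axis=0)   # (8, H, W), ascending
    return diffs[:m].sum(axis=0)
```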

592 citations


Journal ArticleDOI
TL;DR: This software, implemented as a Java plug-in for the public-domain ImageJ software, is used to track the movement of chromosomal loci within nuclei of budding yeast cells to reveal different classes of constraints on mobility of telomeres, reflecting differences in nuclear envelope association.
Abstract: We present a new, robust, computational procedure for tracking fluorescent markers in time-lapse microscopy. The algorithm is optimized for finding the time-trajectory of single particles in very noisy dynamic (two- or three-dimensional) image sequences. It proceeds in three steps. First, the images are aligned to compensate for the movement of the biological structure under investigation. Second, the particle's signature is enhanced by applying a Mexican hat filter, which we show to be the optimal detector of a Gaussian-like spot in 1/ω² noise. Finally, the optimal trajectory of the particle is extracted by applying a dynamic programming optimization procedure. We have used this software, which is implemented as a Java plug-in for the public-domain ImageJ software, to track the movement of chromosomal loci within nuclei of budding yeast cells. Besides reducing trajectory analysis time by several hundredfold, we achieve high reproducibility and accuracy of tracking. The application of the method to yeast chromatin dynamics reveals different classes of constraints on mobility of telomeres, reflecting differences in nuclear envelope association. The generic nature of the software allows application to a variety of similar biological imaging tasks that require the extraction and quantitation of a moving particle's trajectory.
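
For the enhancement step, a Mexican-hat response can be computed with a Laplacian-of-Gaussian filter; the short sketch below is a generic illustration only (the sigma value is arbitrary, and the alignment and dynamic-programming stages are not shown).

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def enhance_spots(frame, sigma=2.0):
    """Mexican-hat (Laplacian-of-Gaussian) enhancement of one frame.

    gaussian_laplace responds negatively to bright Gaussian-like spots of
    width ~sigma, so the sign is flipped to turn particles into peaks for
    the subsequent detection and trajectory-linking stages.
    """
    return -gaussian_laplace(np.asarray(frame, dtype=np.float64), sigma=sigma)
```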

463 citations


Journal ArticleDOI
TL;DR: The proposed demosaicing algorithm estimates missing pixels by interpolating in the direction with fewer color artifacts, and the aliasing problem is addressed by applying filterbank techniques to 2-D directional interpolation.
Abstract: A cost-effective digital camera uses a single-image sensor, applying alternating patterns of red, green, and blue color filters to each pixel location. A way to reconstruct a full three-color representation of color images by estimating the missing pixel components in each color plane is called a demosaicing algorithm. This paper presents three inherent problems often associated with demosaicing algorithms that incorporate two-dimensional (2-D) directional interpolation: misguidance color artifacts, interpolation color artifacts, and aliasing. The level of misguidance color artifacts present in two images can be compared using metric neighborhood modeling. The proposed demosaicing algorithm estimates missing pixels by interpolating in the direction with fewer color artifacts. The aliasing problem is addressed by applying filterbank techniques to 2-D directional interpolation. The interpolation artifacts are reduced using a nonlinear iterative procedure. Experimental results using digital images confirm the effectiveness of this approach.

Journal ArticleDOI
TL;DR: This paper investigates high-capacity lossless data-embedding methods that allow one to embed large amounts of data into digital images in such a way that the original image can be reconstructed from the watermarked image.
Abstract: The proliferation of digital information in our society has enticed a lot of research into data-embedding techniques that add information to digital content, like images, audio, and video. In this paper, we investigate high-capacity lossless data-embedding methods that allow one to embed large amounts of data into digital images (or video) in such a way that the original image can be reconstructed from the watermarked image. We present two new techniques: one based on least significant bit prediction and Sweldens' lifting scheme and another that is an improvement of Tian's technique of difference expansion. The new techniques are then compared with various existing embedding methods by looking at capacity-distortion behavior and capacity control.
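
For context, the sketch below shows Tian's basic difference-expansion step that the second technique builds on and improves; overflow handling and the location map, which are central to the improvements, are deliberately omitted, and the function names are illustrative.

```python
def de_embed_pair(x, y, bit):
    """Embed one bit into a pixel pair by expanding their difference.

    The integer average l is preserved, the difference h is expanded to
    2*h + bit, and the pair is re-formed (Tian-style difference expansion).
    """
    l = (x + y) // 2
    h = x - y
    h_exp = 2 * h + bit
    return l + (h_exp + 1) // 2, l - h_exp // 2

def de_extract_pair(x_w, y_w):
    """Recover the embedded bit and the original pixel pair exactly."""
    l = (x_w + y_w) // 2
    h_exp = x_w - y_w
    bit, h = h_exp & 1, h_exp // 2
    return bit, l + (h + 1) // 2, l - h // 2
```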

Journal ArticleDOI
TL;DR: The experimental results show that the presented color demosaicking technique outperforms the existing methods both in PSNR measure and visual perception.
Abstract: Digital cameras sample scenes using a color filter array of mosaic pattern (e.g., the Bayer pattern). The demosaicking of the color samples is critical to the image quality. This paper presents a new color demosaicking technique of optimal directional filtering of the green-red and green-blue difference signals. Under the assumption that the primary difference signals (PDS) between the green and red/blue channels are low pass, the missing green samples are adaptively estimated in both horizontal and vertical directions by the linear minimum mean square-error estimation (LMMSE) technique. These directional estimates are then optimally fused to further improve the green estimates. Finally, guided by the demosaicked full-resolution green channel, the other two color channels are reconstructed from the LMMSE filtered and fused PDS. The experimental results show that the presented color demosaicking technique outperforms the existing methods both in PSNR measure and visual perception.
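
The fusion of the two directional estimates can be summarized schematically as an inverse-variance weighting; the notation below is a paraphrase, with the weights derived in the paper from the directional LMMSE error statistics.

```latex
% Schematic fusion of the horizontal and vertical estimates \hat{d}_h,
% \hat{d}_v of the green/red (or green/blue) difference signal d, weighted
% by the inverse of their estimation-error variances \sigma_h^2, \sigma_v^2:
\hat{d} \;=\; w_h\,\hat{d}_h + w_v\,\hat{d}_v,
\qquad
w_h = \frac{\sigma_v^2}{\sigma_h^2 + \sigma_v^2},
\quad
w_v = \frac{\sigma_h^2}{\sigma_h^2 + \sigma_v^2}
```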

Journal ArticleDOI
TL;DR: Results on images returned by Google's Image Search reveal the potential of applying CLUE to real-world image data and integrating CLUE as a part of the interface for keyword-based image retrieval systems.
Abstract: In a typical content-based image retrieval (CBIR) system, target images (images in the database) are sorted by feature similarities with respect to the query. Similarities among target images are usually ignored. This paper introduces a new technique, cluster-based retrieval of images by unsupervised learning (CLUE), for improving user interaction with image retrieval systems by fully exploiting the similarity information. CLUE retrieves image clusters by applying a graph-theoretic clustering algorithm to a collection of images in the vicinity of the query. Clustering in CLUE is dynamic. In particular, clusters formed depend on which images are retrieved in response to the query. CLUE can be combined with any real-valued symmetric similarity measure (metric or nonmetric). Thus, it may be embedded in many current CBIR systems, including relevance feedback systems. The performance of an experimental image retrieval system using CLUE is evaluated on a database of around 60,000 images from COREL. Empirical results demonstrate improved performance compared with a CBIR system using the same image similarity measure. In addition, results on images returned by Google's Image Search reveal the potential of applying CLUE to real-world image data and integrating CLUE as a part of the interface for keyword-based image retrieval systems.

Journal ArticleDOI
TL;DR: This model shows that visual artifacts after demosaicing are due to aliasing between luminance and chrominance and could be solved using a preprocessing filter, and gives new insights for the representation of single-color per spatial location images.
Abstract: There is an analogy between single-chip color cameras and the human visual system in that these two systems acquire only one limited wavelength sensitivity band per spatial location. We have exploited this analogy, defining a model that characterizes a one-color per spatial position image as a coding into luminance and chrominance of the corresponding three colors per spatial position image. Luminance is defined with full spatial resolution while chrominance contains subsampled opponent colors. Moreover, luminance and chrominance follow a particular arrangement in the Fourier domain, allowing for demosaicing by spatial frequency filtering. This model shows that visual artifacts after demosaicing are due to aliasing between luminance and chrominance and could be solved using a preprocessing filter. This approach also gives new insights for the representation of single-color per spatial location images and enables formal and controllable procedures to design demosaicing algorithms that perform well compared to concurrent approaches, as demonstrated by experiments.

Journal ArticleDOI
TL;DR: A fully automatic segmentation and tracking method designed to enable quantitative analyses of cellular shape and motion from dynamic three-dimensional microscopy data; its main advantages are robustness to low signal-to-noise ratios and the ability to handle multiple cells that may touch, divide, enter, or leave the observation volume.
Abstract: Cell migrations and deformations play essential roles in biological processes, such as parasite invasion, immune response, embryonic development, and cancer. We describe a fully automatic segmentation and tracking method designed to enable quantitative analyses of cellular shape and motion from dynamic three-dimensional microscopy data. The method uses multiple active surfaces with or without edges, coupled by a penalty for overlaps, and a volume conservation constraint that improves outlining of cell/cell boundaries. Its main advantages are robustness to low signal-to-noise ratios and the ability to handle multiple cells that may touch, divide, enter, or leave the observation volume. We give quantitative validation results based on synthetic images and show two examples of applications to real biological data.

Journal ArticleDOI
TL;DR: A new formulation of the regularized image up-sampling problem that incorporates models of the image acquisition and display processes is presented, giving a new analytic perspective that justifies the use of total-variation regularization on signal processing grounds.
Abstract: This paper presents a new formulation of the regularized image up-sampling problem that incorporates models of the image acquisition and display processes. We give a new analytic perspective that justifies the use of total-variation regularization from a signal processing perspective, based on an analysis that specifies the requirements of edge-directed filtering. This approach leads to a new data fidelity term that has been coupled with a total-variation regularizer to yield our objective function. This objective function is minimized using a level-sets motion that is based on the level-set method, with two types of motion that interact simultaneously. A new choice of these motions leads to a stable solution scheme that has a unique minimum. One aspect of the human visual system, perceptual uniformity, is treated in accordance with the linear nature of the data fidelity term. The method was implemented and has been verified to provide improved results, yielding crisp edges without introducing ringing or other artifacts.
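
In schematic form, an objective of the kind described (a data fidelity term built on an acquisition/display model plus a total-variation regularizer) can be written as follows, with A standing for the assumed linear acquisition operator, e.g. blurring followed by down-sampling.

```latex
% u: high-resolution image to recover, g: observed low-resolution image,
% A: linear model of the acquisition/display chain, lambda: fidelity weight.
\hat{u} \;=\; \arg\min_{u}\;
   \int_{\Omega} |\nabla u|\, dx
   \;+\; \frac{\lambda}{2}\, \| A u - g \|_2^2
```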

Journal ArticleDOI
TL;DR: This paper solves the information-theoretic optimization problem by deriving the associated gradient flows and applying curve evolution techniques and uses level-set methods to implement the resulting evolution.
Abstract: In this paper, we present a new information-theoretic approach to image segmentation. We cast the segmentation problem as the maximization of the mutual information between the region labels and the image pixel intensities, subject to a constraint on the total length of the region boundaries. We assume that the probability densities associated with the image pixel intensities within each region are completely unknown a priori, and we formulate the problem based on nonparametric density estimates. Due to the nonparametric structure, our method does not require the image regions to have a particular type of probability distribution and does not require the extraction and use of a particular statistic. We solve the information-theoretic optimization problem by deriving the associated gradient flows and applying curve evolution techniques. We use level-set methods to implement the resulting evolution. The experimental results based on both synthetic and real images demonstrate that the proposed technique can solve a variety of challenging image segmentation problems. Furthermore, our method, which does not require any training, performs as well as methods based on training.
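
The criterion described above can be summarized schematically as a mutual-information term traded off against boundary length; the notation is a paraphrase of the abstract, not a quotation.

```latex
% C: segmenting curve, L_C(X): region label induced by C at pixel X,
% G(X): pixel intensity, |C|: total boundary length, mu > 0: trade-off.
\hat{C} \;=\; \arg\max_{C}\; I\bigl(G(X);\, L_C(X)\bigr) \;-\; \mu\,|C|
```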

Journal ArticleDOI
TL;DR: A trainable system for analyzing videos of developing C. elegans embryos that automatically detects, segments, and locates cells and nuclei in microscopic images and contains a set of elastic models of the embryo at various stages of development that are matched to the label images.
Abstract: We describe a trainable system for analyzing videos of developing C. elegans embryos. The system automatically detects, segments, and locates cells and nuclei in microscopic images. The system was designed as the central component of a fully automated phenotyping system. The system contains three modules: 1) a convolutional network trained to classify each pixel into five categories: cell wall, cytoplasm, nucleus membrane, nucleus, outside medium; 2) an energy-based model, which cleans up the output of the convolutional network by learning local consistency constraints that must be satisfied by label images; 3) a set of elastic models of the embryo at various stages of development that are matched to the label images.

Journal ArticleDOI
TL;DR: A novel super-resolution method for hyperspectral images that fuses information from multiple observations and spectral bands to improve spatial resolution and reconstruct the spectrum of the observed scene as a combination of a small number of spectral basis functions.
Abstract: Hyperspectral images are used for aerial and space imagery applications, including target detection, tracking, agricultural, and natural resource exploration. Unfortunately, atmospheric scattering, secondary illumination, changing viewing angles, and sensor noise degrade the quality of these images. Improving their resolution has a high payoff, but applying super-resolution techniques separately to every spectral band is problematic for two main reasons. First, the number of spectral bands can be in the hundreds, which increases the computational load excessively. Second, considering the bands separately does not make use of the information that is present across them. Furthermore, separate band super resolution does not make use of the inherent low dimensionality of the spectral data, which can effectively be used to improve the robustness against noise. In this paper, we introduce a novel super-resolution method for hyperspectral images. An integral part of our work is to model the hyperspectral image acquisition process. We propose a model that enables us to represent the hyperspectral observations from different wavelengths as weighted linear combinations of a small number of basis image planes. Then, a method for applying super resolution to hyperspectral images using this model is presented. The method fuses information from multiple observations and spectral bands to improve spatial resolution and reconstruct the spectrum of the observed scene as a combination of a small number of spectral basis functions.

Journal ArticleDOI
TL;DR: Two watermarking approaches that are robust to geometric distortions are presented, one based on image normalization, and the other based on a watermark resynchronization scheme aimed to alleviate the effects of random bending attacks.
Abstract: In this paper, we present two watermarking approaches that are robust to geometric distortions. The first approach is based on image normalization, in which both watermark embedding and extraction are carried out with respect to an image normalized to meet a set of predefined moment criteria. We propose a new normalization procedure, which is invariant to affine transform attacks. The resulting watermarking scheme is suitable for public watermarking applications, where the original image is not available for watermark extraction. The second approach is based on a watermark resynchronization scheme aimed to alleviate the effects of random bending attacks. In this scheme, a deformable mesh is used to correct the distortion caused by the attack. The watermark is then extracted from the corrected image. In contrast to the first scheme, the latter is suitable for private watermarking applications, where the original image is necessary for watermark detection. In both schemes, we employ a direct-sequence code division multiple access approach to embed a multibit watermark in the discrete cosine transform domain of the image. Numerical experiments demonstrate that the proposed watermarking schemes are robust to a wide range of geometric attacks.

Journal ArticleDOI
TL;DR: The proposed approach combines knowledge of human perception with an understanding of signal characteristics in order to segment natural scenes into perceptually/semantically uniform regions to convey semantic information that can be used for content-based retrieval.
Abstract: We propose a new approach for image segmentation that is based on low-level features for color and texture. It is aimed at segmentation of natural scenes, in which the color and texture of each segment does not typically exhibit uniform statistical characteristics. The proposed approach combines knowledge of human perception with an understanding of signal characteristics in order to segment natural scenes into perceptually/semantically uniform regions. The proposed approach is based on two types of spatially adaptive low-level features. The first describes the local color composition in terms of spatially adaptive dominant colors, and the second describes the spatial characteristics of the grayscale component of the texture. Together, they provide a simple and effective characterization of texture that the proposed algorithm uses to obtain robust and, at the same time, accurate and precise segmentations. The resulting segmentations convey semantic information that can be used for content-based retrieval. The performance of the proposed algorithms is demonstrated in the domain of photographic images, including low-resolution, degraded, and compressed images.

Journal ArticleDOI
Xin Li
TL;DR: The major contributions of this work include a new iterative demosaicing algorithm in the color difference domain and a spatially adaptive stopping criterion for suppressing color misregistration and zipper artifacts in the demosaiced images.
Abstract: In this paper, we present a fast and high-performance algorithm for color filter array (CFA) demosaicing. CFA demosaicing is formulated as a problem of reconstructing correlated signals from their downsampled versions with an opposite phase. The major contributions of this work include 1) a new iterative demosaicing algorithm in the color difference domain and 2) a spatially adaptive stopping criterion for suppressing color misregistration and zipper artifacts in the demosaiced images. We have compared the proposed demosaicing algorithm with two current state-of-the-art techniques reported in the literature. Ours outperforms both of them on demosaicing performance and computational cost.

Journal ArticleDOI
TL;DR: A design-based method to fuse Gabor filter and grey level co-occurrence probability (GLCP) features for improved texture recognition is presented and is advocated as a means for improving texture segmentation performance.
Abstract: A design-based method to fuse Gabor filter and grey level co-occurrence probability (GLCP) features for improved texture recognition is presented. The fused feature set utilizes both the Gabor filter's capability of accurately capturing lower and mid-frequency texture information and the GLCP's capability of capturing texture information relevant to higher frequency components. Evaluation methods include comparing feature space separability and comparing image segmentation classification rates. The fused feature sets are demonstrated to produce higher feature space separations, as well as higher segmentation accuracies relative to the individual feature sets. Fused feature sets also outperform individual feature sets for noisy images, across different noise magnitudes. The curse of dimensionality is demonstrated not to affect segmentation using the proposed 48-dimensional fused feature set. Gabor magnitude responses produce higher segmentation accuracies than linearly normalized Gabor magnitude responses. Feature reduction using principal component analysis is acceptable for maintaining the segmentation performance, but feature reduction using the feature contrast method dramatically reduced the segmentation accuracy. Overall, the designed fused feature set is advocated as a means for improving texture segmentation performance.

Journal ArticleDOI
TL;DR: A novel technique to recover large similarity transformations (rotation/scale/translation) and moderate perspective deformations among image pairs and achieves subpixel accuracy through the use of nonlinear least squares optimization.
Abstract: This paper describes a novel technique to recover large similarity transformations (rotation/scale/translation) and moderate perspective deformations among image pairs. We introduce a hybrid algorithm that features log-polar mappings and nonlinear least squares optimization. The use of log-polar techniques in the spatial domain is introduced as a preprocessing module to recover large scale changes (e.g., at least four-fold) and arbitrary rotations. Although log-polar techniques are used in the Fourier-Mellin transform to accommodate rotation and scale in the frequency domain, their use in registering images subjected to very large scale changes has not yet been exploited in the spatial domain. In this paper, we demonstrate the superior performance of the log-polar transform in featureless image registration in the spatial domain. We achieve subpixel accuracy through the use of nonlinear least squares optimization. The registration process yields the eight parameters of the perspective transformation that best aligns the two input images. Extensive testing was performed on uncalibrated real images and an array of 10,000 image pairs with known transformations derived from the Corel Stock Photo Library of royalty-free photographic images.
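
A minimal sketch of the spatial-domain log-polar resampling that makes large scale changes and rotations tractable: in log-polar coordinates a scaling of the input becomes a shift along the radial axis and a rotation a shift along the angular axis, so both can be recovered with a translation estimator. The function name, sampling densities, and interpolation order are illustrative.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def log_polar(img, center=None, n_rho=256, n_theta=256):
    """Resample a grayscale image onto a log-polar grid around 'center'."""
    h, w = img.shape
    if center is None:
        center = (h / 2.0, w / 2.0)
    r_max = np.hypot(h / 2.0, w / 2.0)
    rho = np.exp(np.linspace(0.0, np.log(r_max), n_rho))   # log-spaced radii
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(rho, theta, indexing='ij')
    rows = center[0] + rr * np.sin(tt)
    cols = center[1] + rr * np.cos(tt)
    # Bilinear interpolation; samples outside the image are set to zero.
    return map_coordinates(img, [rows, cols], order=1, mode='constant')
```

Roughly, estimating the 2-D shift between the log-polar images of the two inputs (e.g. by normalized cross-correlation) then gives the rotation from the angular shift and the scale from the exponential of the radial shift times the radial grid spacing.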

Journal ArticleDOI
TL;DR: A generic framework for segmentation evaluation is introduced and a metric based on the distance between segmentation partitions is proposed to overcome some of the limitations of existing approaches.
Abstract: Image segmentation plays a major role in a broad range of applications. Evaluating the adequacy of a segmentation algorithm for a given application is a requisite both to allow the appropriate selection of segmentation algorithms as well as to tune their parameters for optimal performance. However, objective segmentation quality evaluation is far from being a solved problem. In this paper, a generic framework for segmentation evaluation is introduced after a brief review of previous work. A metric based on the distance between segmentation partitions is proposed to overcome some of the limitations of existing approaches. Symmetric and asymmetric distance metric alternatives are presented to meet the specificities of a wide class of applications. Experimental results confirm the potential of the proposed measures.

Journal ArticleDOI
TL;DR: This paper presents a novel multipurpose digital image watermarking method based on the multistage vector quantizer structure, which can be applied to image authentication and copyright protection.
Abstract: The rapid growth of digital multimedia and Internet technologies has made copyright protection, copy protection, and integrity verification three important issues in the digital world. To solve these problems, the digital watermarking technique has been presented and widely researched. Traditional watermarking algorithms are mostly based on discrete transform domains, such as the discrete cosine transform, discrete Fourier transform (DFT), and discrete wavelet transform (DWT). Most of these algorithms are good for only one purpose. Recently, some multipurpose digital watermarking methods have been presented, which can achieve the goal of content authentication and copyright protection simultaneously. However, they are based on DWT or DFT. Lately, several robust watermarking schemes based on vector quantization (VQ) have been presented, but they can only be used for copyright protection. In this paper, we present a novel multipurpose digital image watermarking method based on the multistage vector quantizer structure, which can be applied to image authentication and copyright protection. In the proposed method, the semi-fragile watermark and the robust watermark are embedded in different VQ stages using different techniques, and both of them can be extracted without the original image. Simulation results demonstrate the effectiveness of our algorithm in terms of robustness and fragility.

Journal ArticleDOI
TL;DR: A new numerical measure for visual attention's modulatory aftereffects, the perceptual quality significance map (PQSM), is proposed, and experiments demonstrate the performance improvement on two PQSM-modulated visual sensitivity models and two PQSM-based visual quality metrics.
Abstract: With the fast development of visual noise-shaping related applications (visual compression, error resilience, watermarking, encryption, and display), there is an increasingly significant demand on incorporating perceptual characteristics into these applications for improved performance. In this paper, a very important mechanism of the human brain, visual attention, is introduced for visual sensitivity and visual quality evaluation. Based upon the analysis, a new numerical measure for visual attention's modulatory aftereffects, perceptual quality significance map (PQSM), is proposed. To a certain extent, the PQSM reflects the processing ability of the human brain on local visual contents statistically. The PQSM is generated with the integration of local perceptual stimuli from color contrast, texture contrast, motion, as well as cognitive features (skin color and face in this study). Experimental results with subjective viewing demonstrate the performance improvement on two PQSM-modulated visual sensitivity models and two PQSM-based visual quality metrics.

Journal ArticleDOI
TL;DR: This paper presents a decoupled, as well as a coupled, version of the classical Gauss-Seidel solver, and develops several multigrid implementations based on a discretization coarse grid approximation that take advantage of intergrid transfer operators that allow for nondyadic grid hierarchies.
Abstract: This paper investigates the usefulness of bidirectional multigrid methods for variational optical flow computations. Although these numerical schemes are among the fastest methods for solving equation systems, they are rarely applied in the field of computer vision. We demonstrate how to employ those numerical methods for the treatment of variational optical flow formulations and show that the efficiency of this approach even allows for real-time performance on standard PCs. As a representative for variational optic flow methods, we consider the recently introduced combined local-global method. It can be considered as a noise-robust generalization of the Horn and Schunck technique. We present a decoupled, as well as a coupled, version of the classical Gauss-Seidel solver, and we develop several multigrid implementations based on a discretization coarse grid approximation. In contrast to standard bidirectional multigrid algorithms, we take advantage of intergrid transfer operators that allow for nondyadic grid hierarchies. As a consequence, no restrictions concerning the image size or the number of traversed levels have to be imposed. In the experimental section, we juxtapose the developed multigrid schemes and demonstrate their superior performance when compared to unidirectional multigrid methods and nonhierarchical solvers. For the well-known 316×252 Yosemite sequence, we succeeded in computing the complete set of dense flow fields in three quarters of a second on a 3.06-GHz Pentium 4 PC. This corresponds to a frame rate of 18 flow fields per second, which outperforms the widely-used Gauss-Seidel method by almost three orders of magnitude.

Journal ArticleDOI
TL;DR: A new optical-flow-based method for estimating heart motion from two-dimensional echocardiographic sequences is presented; it uses a wavelet-like algorithm for computing B-spline-weighted inner products and moments at dyadic scales to increase computational efficiency.
Abstract: The quantitative assessment of cardiac motion is a fundamental concept to evaluate ventricular malfunction. We present a new optical-flow-based method for estimating heart motion from two-dimensional echocardiographic sequences. To account for typical heart motions, such as contraction/expansion and shear, we analyze the images locally by using a local-affine model for the velocity in space and a linear model in time. The regional motion parameters are estimated in the least-squares sense inside a sliding spatiotemporal B-spline window. Robustness and spatial adaptability is achieved by estimating the model parameters at multiple scales within a coarse-to-fine multiresolution framework. We use a wavelet-like algorithm for computing B-spline-weighted inner products and moments at dyadic scales to increase computational efficiency. In order to characterize myocardial contractility and to simplify the detection of myocardial dysfunction, the radial component of the velocity with respect to a reference point is color coded and visualized inside a time-varying region of interest. The algorithm was first validated on synthetic data sets that simulate a beating heart with a speckle-like appearance of echocardiograms. The ability to estimate motion from real ultrasound sequences was demonstrated by a rotating phantom experiment. The method was also applied to a set of in vivo echocardiograms from an animal study. Motion estimation results were in good agreement with the expert echocardiographic reading.