
Showing papers on "Image processing published in 2004"


Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, which can be applied to both subjective ratings and objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
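As an illustration of the index itself, here is a minimal single-window sketch in Python/NumPy. The published method computes the index over a sliding Gaussian-weighted window and averages the resulting local map; the constants k1 = 0.01, k2 = 0.03 and the dynamic range L follow the values usually quoted for SSIM.

```python
import numpy as np

def ssim_global(x, y, L=255, k1=0.01, k2=0.03):
    """Single-window SSIM between two grayscale images (float arrays).

    Minimal sketch: statistics are computed over the whole image; the
    published method uses a sliding Gaussian window and averages the
    resulting local SSIM map.
    """
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den
```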

40,609 citations


Journal ArticleDOI
TL;DR: 40 selected thresholding methods from various categories are compared in the context of nondestructive testing applications as well as for document images, and the thresholding algorithms that perform uniformly better over nondestructive testing and document image applications are identified.
Abstract: We conduct an exhaustive survey of image thresholding methods, categorize them, express their formulas under a uniform notation, and finally carry out their performance comparison. The thresholding methods are categorized according to the information they are exploiting, such as histogram shape, measurement space clustering, entropy, object attributes, spatial correlation, and local gray-level surface. 40 selected thresholding methods from various categories are compared in the context of nondestructive testing applications as well as for document images. The comparison is based on the combined performance measures. We identify the thresholding algorithms that perform uniformly better over nondestructive testing and document image applications. © 2004 SPIE and IS&T. (DOI: 10.1117/1.1631316)
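For concreteness, here is one representative member of the survey's "measurement space clustering" category, Otsu's method, sketched in Python/NumPy; it is not the survey's own algorithm, just an example of the kind of method being compared.

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's clustering-based threshold for an 8-bit grayscale image.

    Picks the threshold that maximizes the between-class variance of the
    two resulting gray-level classes.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                 # class-0 probability
    mu = np.cumsum(prob * np.arange(256))   # class-0 cumulative mean
    mu_t = mu[-1]                           # global mean

    # Between-class variance for every candidate threshold.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b2 = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b2 = np.nan_to_num(sigma_b2)
    return int(np.argmax(sigma_b2))
```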

4,543 citations


Proceedings ArticleDOI
27 Jun 2004
TL;DR: This paper examines (and improves upon) the local image descriptor used by SIFT, and demonstrates that the PCA-based local descriptors are more distinctive, more robust to image deformations, and more compact than the standard SIFT representation.
Abstract: Stable local feature detection and representation is a fundamental component of many image registration and object recognition algorithms. Mikolajczyk and Schmid (June 2003) recently evaluated a variety of approaches and identified the SIFT [D. G. Lowe, 1999] algorithm as being the most resistant to common image deformations. This paper examines (and improves upon) the local image descriptor used by SIFT. Like SIFT, our descriptors encode the salient aspects of the image gradient in the feature point's neighborhood; however, instead of using SIFT's smoothed weighted histograms, we apply principal components analysis (PCA) to the normalized gradient patch. Our experiments demonstrate that the PCA-based local descriptors are more distinctive, more robust to image deformations, and more compact than the standard SIFT representation. We also present results showing that using these descriptors in an image retrieval application results in increased accuracy and faster matching.
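A rough sketch of the descriptor construction is given below in Python/NumPy. It assumes the gradient patches around keypoints have already been extracted and normalized to unit length; the patch size and the target dimensionality are illustrative (the paper uses roughly 41x41 patches and projects the gradient vector down to a few tens of components).

```python
import numpy as np

def fit_pca_projection(grad_patches, n_components=20):
    """Learn a PCA projection from flattened, normalized gradient patches.

    grad_patches: (N, D) array; each row is the concatenated x/y gradients
    of the patch around one training keypoint, already normalized to unit
    length (assumption).  n_components is a free parameter here.
    """
    mean = grad_patches.mean(axis=0)
    centered = grad_patches - mean
    # SVD gives the principal directions without forming the covariance matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]               # (n_components, D)
    return mean, basis

def pca_descriptor(grad_patch, mean, basis):
    """Project one normalized gradient patch onto the learned PCA basis."""
    return basis @ (grad_patch - mean)
```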

3,325 citations


Journal ArticleDOI
TL;DR: The simultaneous propagation of texture and structure information is achieved by a single, efficient algorithm that combines the advantages of two approaches: exemplar-based texture synthesis and block-based sampling process.
Abstract: A new algorithm is proposed for removing large objects from digital images. The challenge is to fill in the hole that is left behind in a visually plausible way. In the past, this problem has been addressed by two classes of algorithms: 1) "texture synthesis" algorithms for generating large image regions from sample textures and 2) "inpainting" techniques for filling in small image gaps. The former has been demonstrated for "textures" (repeating two-dimensional patterns with some stochasticity); the latter focus on linear "structures" which can be thought of as one-dimensional patterns, such as lines and object contours. This paper presents a novel and efficient algorithm that combines the advantages of these two approaches. We first note that exemplar-based texture synthesis contains the essential process required to replicate both texture and structure; the success of structure propagation, however, is highly dependent on the order in which the filling proceeds. We propose a best-first algorithm in which the confidence in the synthesized pixel values is propagated in a manner similar to the propagation of information in inpainting. The actual color values are computed using exemplar-based synthesis. In this paper, the simultaneous propagation of texture and structure information is achieved by a single, efficient algorithm. Computational efficiency is achieved by a block-based sampling process. A number of examples on real and synthetic images demonstrate the effectiveness of our algorithm in removing large occluding objects, as well as thin scratches. Robustness with respect to the shape of the manually selected target region is also demonstrated. Our results compare favorably to those obtained by existing techniques.
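The sketch below illustrates the best-first fill order in Python/NumPy, using only the confidence term. The paper's priority additionally multiplies in a data term that favours front pixels where strong isophotes meet the fill boundary, and the selected patch is then filled by copying the best-matching source patch (by sum-of-squared-differences over known pixels), which is omitted here.

```python
import numpy as np

def fill_order_step(confidence, mask, patch=9):
    """One step of a simplified best-first fill order.

    mask: True where the region is still missing.  The priority here is only
    the confidence term C(p), the mean confidence of the known pixels in the
    patch around a front pixel; the full priority also includes a data term
    driven by image isophotes.
    """
    h = patch // 2
    # Fill front: missing pixels with at least one known 4-neighbour.
    front = mask & ~(
        np.roll(mask, 1, 0) & np.roll(mask, -1, 0)
        & np.roll(mask, 1, 1) & np.roll(mask, -1, 1)
    )
    best, best_p = -1.0, None
    for y, x in zip(*np.nonzero(front)):
        ys = slice(max(y - h, 0), y + h + 1)
        xs = slice(max(x - h, 0), x + h + 1)
        c = (confidence[ys, xs] * ~mask[ys, xs]).sum() / (patch * patch)
        if c > best:
            best, best_p = c, (y, x)
    # The caller would now copy the best-matching source patch into best_p's
    # patch and set the confidence of the newly filled pixels to `best`.
    return best_p, best
```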

3,066 citations


Proceedings ArticleDOI
19 Jul 2004
TL;DR: This paper proposes a novel method for solving single-image super-resolution problems, given a low-resolution image as input, and recovers its high-resolution counterpart using a set of training examples, inspired by recent manifold learning methods.
Abstract: In this paper, we propose a novel method for solving single-image super-resolution problems. Given a low-resolution image as input, we recover its high-resolution counterpart using a set of training examples. While this formulation resembles other learning-based methods for super-resolution, our method has been inspired by recent manifold learning methods, particularly locally linear embedding (LLE). Specifically, small image patches in the low- and high-resolution images form manifolds with similar local geometry in two distinct feature spaces. As in LLE, local geometry is characterized by how a feature vector corresponding to a patch can be reconstructed by its neighbors in the feature space. Besides using the training image pairs to estimate the high-resolution embedding, we also enforce local compatibility and smoothness constraints between patches in the target high-resolution image through overlapping. Experiments show that our method is very flexible and gives good empirical results.
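The core neighbor-embedding step can be sketched as below in Python/NumPy: the LLE-style reconstruction weights of a low-resolution patch with respect to its K nearest training patches are solved from the local Gram matrix, then reused to combine the corresponding high-resolution patches. The regularization constant is an assumption, not a value from the paper.

```python
import numpy as np

def neighbor_embedding_weights(x, neighbors, reg=1e-4):
    """LLE reconstruction weights of a low-resolution patch feature x.

    neighbors: (K, D) array of the K nearest low-resolution training
    features.  The weights are constrained to sum to one, as in LLE.
    """
    diff = neighbors - x                     # (K, D)
    gram = diff @ diff.T                     # local Gram matrix
    gram += reg * np.trace(gram) * np.eye(len(neighbors))  # regularize
    w = np.linalg.solve(gram, np.ones(len(neighbors)))
    return w / w.sum()                       # enforce sum-to-one constraint

def synthesize_hr_patch(w, hr_neighbors):
    """Apply the same weights to the high-resolution counterparts."""
    return w @ hr_neighbors                  # (K,) @ (K, D_hr) -> (D_hr,)
```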

1,951 citations


Journal ArticleDOI
TL;DR: A novel skull-stripping algorithm based on a hybrid approach that combines watershed algorithms and deformable surface models is presented, resulting in a robust and automated procedure that outperforms other publicly available skull-stripping tools.

1,947 citations


Journal ArticleDOI
TL;DR: OsiriX was designed for display and interpretation of large sets of multidimensional and multimodality images such as combined PET-CT studies and ensures that all new developments in image processing that could emerge from other academic institutions using these libraries can be directly ported to the OsiriX program.
Abstract: A multidimensional image navigation and display software was designed for display and interpretation of large sets of multidimensional and multimodality images such as combined PET-CT studies. The software is developed in Objective-C on a Macintosh platform under the MacOS X operating system using the GNUstep development environment. It also benefits from the extremely fast and optimized 3D graphic capabilities of the OpenGL graphic standard widely used for computer games optimized for taking advantage of any hardware graphic accelerator boards available. In the design of the software special attention was given to adapt the user interface to the specific and complex tasks of navigating through large sets of image data. An interactive jog-wheel device widely used in the video and movie industry was implemented to allow users to navigate in the different dimensions of an image set much faster than with a traditional mouse or on-screen cursors and sliders. The program can easily be adapted for very specific tasks that require a limited number of functions, by adding and removing tools from the program’s toolbar and avoiding an overwhelming number of unnecessary tools and functions. The processing and image rendering tools of the software are based on the open-source libraries ITK and VTK. This ensures that all new developments in image processing that could emerge from other academic institutions using these libraries can be directly ported to the OsiriX program. OsiriX is provided free of charge under the GNU open-source licensing agreement at http://homepage.mac.com/rossetantoine/osirix.

1,741 citations


Proceedings ArticleDOI
17 May 2004
TL;DR: This work proposes an information fidelity criterion that quantifies the Shannon information that is shared between the reference and distorted images relative to the information contained in the reference image itself, and demonstrates the performance of the algorithm by testing it on a data set of 779 images.
Abstract: Measurement of image quality is crucial for many image-processing algorithms. Traditionally, image quality assessment algorithms predict visual quality by comparing a distorted image against a reference image, typically by modeling the human visual system (HVS), or by using arbitrary signal fidelity criteria. We adopt a new paradigm for image quality assessment. We propose an information fidelity criterion that quantifies the Shannon information that is shared between the reference and distorted images relative to the information contained in the reference image itself. We use natural scene statistics (NSS) modeling in concert with an image degradation model and an HVS model. We demonstrate the performance of our algorithm by testing it on a data set of 779 images, and show that our method is competitive with state of the art quality assessment methods, and outperforms them in our simulations.
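As a toy illustration only: the published criterion is computed in the wavelet domain with a Gaussian scale mixture source model, but the flavor of "information shared with the reference, relative to the information in the reference itself" can be sketched in the spatial domain by fitting a local gain-plus-noise channel per block. The block size and the visual-noise floor sigma_n2 below are arbitrary assumptions, not values from the paper.

```python
import numpy as np

def toy_information_fidelity(ref, dist, block=8, sigma_n2=0.1):
    """Toy spatial-domain information-fidelity ratio (not the paper's model).

    Each block of the distorted image is modeled as gain * reference + noise;
    we accumulate the Gaussian-channel information of that fit and divide by
    the information the reference carries over an assumed visual-noise floor.
    """
    ref = ref.astype(np.float64)
    dist = dist.astype(np.float64)
    h, w = ref.shape
    info, ref_info = 0.0, 0.0
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            r = ref[y:y + block, x:x + block].ravel()
            d = dist[y:y + block, x:x + block].ravel()
            var_r = r.var() + 1e-12
            g = np.cov(r, d)[0, 1] / var_r                 # local gain estimate
            var_v = max(d.var() - g * g * var_r, 1e-12)    # residual noise
            info += 0.5 * np.log2(1.0 + g * g * var_r / (var_v + sigma_n2))
            ref_info += 0.5 * np.log2(1.0 + var_r / sigma_n2)
    return info / ref_info
```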

1,349 citations


Journal ArticleDOI
TL;DR: This tutorial performs a synthesis between the multiscale-decomposition-based image fusion approach, the ARSIS concept, and a multisensor scheme based on wavelet decomposition, i.e., a multiresolution image fusion approach.

1,187 citations


Journal ArticleDOI
TL;DR: Results indicate that the spatial, quad-based algorithm developed for color images allows for hiding the largest payload at the highest signal-to-noise ratio.
Abstract: A reversible watermarking algorithm with very high data-hiding capacity has been developed for color images. The algorithm allows the watermarking process to be reversed, which restores the exact original image. The algorithm hides several bits in the difference expansion of vectors of adjacent pixels. The required general reversible integer transform and the necessary conditions to avoid underflow and overflow are derived for any vector of arbitrary length. Also, the potential payload size that can be embedded into a host image is discussed, and a feedback system for controlling this size is developed. In addition, to maximize the amount of data that can be hidden into an image, the embedding algorithm can be applied recursively across the color components. Simulation results using spatial triplets, spatial quads, cross-color triplets, and cross-color quads are presented and compared with the existing reversible watermarking algorithms. These results indicate that the spatial, quad-based algorithm allows for hiding the largest payload at the highest signal-to-noise ratio.
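A minimal pairwise sketch of the difference-expansion idea in Python is shown below. The paper generalizes the reversible integer transform to vectors of arbitrary length and applies it across color components, and it derives the overflow/underflow conditions and payload-control feedback that this sketch omits.

```python
def embed_bit_pair(x, y, bit):
    """Hide one bit in a pixel pair by difference expansion (sketch).

    The integer average l is preserved; the difference h is expanded to
    2*h + bit.  Overflow/underflow of the new pixel values is not checked.
    """
    l = (int(x) + int(y)) // 2          # integer average (kept unchanged)
    h = int(x) - int(y)                 # difference
    h2 = 2 * h + bit                    # expand the difference, append the bit
    x2 = l + (h2 + 1) // 2
    y2 = l - h2 // 2
    return x2, y2

def extract_bit_pair(x2, y2):
    """Recover the hidden bit and restore the original pair exactly."""
    l = (int(x2) + int(y2)) // 2
    h2 = int(x2) - int(y2)
    bit = h2 & 1
    h = h2 >> 1                         # floor division by 2
    x = l + (h + 1) // 2
    y = l - h // 2
    return x, y, bit
```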

1,149 citations


Book
12 Feb 2004
TL;DR: This book summarizes the findings of the Human Neuroscanning Project on parametric image registration and non-parametric image registration and describes the setting, methodology, and results that were obtained.
Abstract: 1. Introduction 2. The Human Neuroscanning Project 3. The mathematical setting I PARAMETRIC IMAGE REGISTRATION 4. Landmark based registration 5. Principal axes based registration 6. Optimal linear registration 7. Summarizing parametric image registration II NON-PARAMETRIC IMAGE REGISTRATION 8. Non-parametric image registration 9. Elastic registration 10. Fluid registration 11. Diffusion registration 12. Curvature registration 13. Concluding remarks
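As a small example of the "landmark based registration" chapter's parametric setting, the sketch below fits a 2-D affine transform to corresponding landmark pairs by least squares in Python/NumPy; the point format and the choice of an affine model are illustrative assumptions, not the book's specific formulation.

```python
import numpy as np

def affine_from_landmarks(src, dst):
    """Least-squares 2-D affine transform mapping source landmarks to targets.

    src, dst: (N, 2) arrays of corresponding points, N >= 3.  Solves
    [x y 1] A = [x' y'] for the 3x2 coefficient matrix and returns it as a
    2x3 affine matrix.
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    ones = np.ones((len(src), 1))
    X = np.hstack([src, ones])                    # (N, 3) design matrix
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)   # (3, 2)
    return A.T                                    # 2x3 affine matrix
```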

Journal ArticleDOI
01 Aug 2004
TL;DR: A system level realization of CLAHE is proposed, which is suitable for VLSI or FPGA implementation and the goal for this realization is to minimize the latency without sacrificing precision.
Abstract: Acquired real-time image sequences, in their original form may not have good viewing quality due to lack of proper lighting or inherent noise. For example, in X-ray imaging, when continuous exposure is used to obtain an image sequence or video, usually low-level exposure is administered until the region of interest is identified. In this case, and many other similar situations, it is desired to improve the image quality in real-time. One particular method of interest, which extensively is used for enhancement of still images, is Contrast Limited Adaptive Histogram Equalization (CLAHE) proposed in [1] and summarized in [2]. This approach is computationally extensive and it is usually used for off-line image enhancement. Because of its performance, hardware implementation of this algorithm for enhancement of real-time image sequences is sought. In this paper, a system level realization of CLAHE is proposed, which is suitable for VLSI or FPGA implementation. The goal for this realization is to minimize the latency without sacrificing precision.
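The per-tile core of CLAHE can be sketched as below in Python/NumPy: the tile histogram is clipped and the excess redistributed before the equalization mapping is built. The clip limit is an illustrative value, and the tiling plus bilinear interpolation of neighbouring tile mappings (as well as the paper's hardware-oriented restructuring) are omitted.

```python
import numpy as np

def clip_limited_equalization(tile, clip_limit=40, n_bins=256):
    """Clip-limited histogram equalization of one 8-bit tile (CLAHE core).

    The histogram is clipped at `clip_limit` counts and the excess is
    redistributed uniformly across all bins before building the mapping,
    which limits the contrast amplification of near-uniform regions.
    """
    hist = np.bincount(tile.ravel(), minlength=n_bins).astype(np.float64)
    excess = np.maximum(hist - clip_limit, 0).sum()
    hist = np.minimum(hist, clip_limit) + excess / n_bins  # redistribute excess
    cdf = np.cumsum(hist)
    cdf = (n_bins - 1) * cdf / cdf[-1]                     # normalize mapping
    return cdf[tile].astype(np.uint8)                       # apply the mapping
```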

Journal ArticleDOI
TL;DR: In this article, the significant elements of a computer vision system are presented, emphasising the important aspects of the image processing techniques, coupled with a review of the most recent developments throughout the food industry.

Journal ArticleDOI
TL;DR: A lensless optical security system based on double random-phase encoding in the Fresnel domain is proposed, which can encrypt a primary image to random noise by use of two statistically independent random- phase masks in the input and transform planes, respectively.
Abstract: A lensless optical security system based on double random-phase encoding in the Fresnel domain is proposed. This technique can encrypt a primary image to random noise by use of two statistically independent random-phase masks in the input and transform planes, respectively. In this system the positions of the significant planes and the operation wavelength, as well as the phase codes, are used as keys to encrypt and recover the primary image. Therefore higher security is achieved. The sensitivity of the decrypted image to shifting along the propagation direction and to the wavelength are also investigated.
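A compact numerical sketch of the encryption path in Python/NumPy is given below, using angular-spectrum Fresnel propagation between the two random-phase masks. The wavelength, sampling pitch, and distances are illustrative assumptions, and decryption (propagating back with the conjugate keys) is omitted.

```python
import numpy as np

def fresnel_propagate(field, z, wavelength, dx):
    """Propagate a complex field a distance z with the angular-spectrum method."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)
    fy = np.fft.fftfreq(ny, d=dx)
    fxx, fyy = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * fxx) ** 2 - (wavelength * fyy) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0.0))
    h = np.where(arg > 0, np.exp(1j * kz * z), 0.0)   # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * h)

def lensless_encrypt(img, z1, z2, wavelength=633e-9, dx=10e-6, seed=0):
    """Lensless double random-phase encoding in the Fresnel domain (sketch).

    Two statistically independent random-phase masks are applied in the input
    plane and in an intermediate plane; the distances z1, z2, the wavelength,
    and the masks act as keys.  Sampling parameters are assumptions.
    """
    rng = np.random.default_rng(seed)
    phase1 = np.exp(2j * np.pi * rng.random(img.shape))
    phase2 = np.exp(2j * np.pi * rng.random(img.shape))
    u = fresnel_propagate(img.astype(np.float64) * phase1, z1, wavelength, dx)
    return fresnel_propagate(u * phase2, z2, wavelength, dx)
```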

Journal ArticleDOI
TL;DR: A statistical basis for a process often described in computer vision: image segmentation by region merging following a particular order in the choice of regions is explored, leading to a fast segmentation algorithm tailored to processing images described using most common numerical pixel attribute spaces.
Abstract: This paper explores a statistical basis for a process often described in computer vision: image segmentation by region merging following a particular order in the choice of regions. We exhibit a particular blend of algorithmics and statistics whose segmentation error is, as we show, limited from both the qualitative and quantitative standpoints. This approach can be efficiently approximated in linear time/space, leading to a fast segmentation algorithm tailored to processing images described using most common numerical pixel attribute spaces. The conceptual simplicity of the approach makes it simple to modify and cope with hard noise corruption, handle occlusion, authorize the control of the segmentation scale, and process unconventional data such as spherical images. Experiments on gray-level and color images, obtained with a short readily available C-code, display the quality of the segmentations obtained.
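A simplified Python/NumPy rendering of the procedure is sketched below: 4-neighbour edges are visited in order of increasing intensity difference and regions are merged with a union-find structure when their means are close relative to their sizes. The merging bound here is a simplified stand-in for the paper's exact statistical predicate; Q plays the same scale-control role.

```python
import numpy as np

def region_merge(gray, Q=32, delta=None):
    """Simplified statistical region merging on a grayscale image (sketch)."""
    h, w = gray.shape
    n = h * w
    if delta is None:
        delta = 1.0 / (6.0 * n * n)
    parent = np.arange(n)
    size = np.ones(n)
    total = gray.astype(np.float64).ravel().copy()   # per-region intensity sum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]            # path halving
            i = parent[i]
        return i

    # Collect 4-neighbour edges and sort by absolute intensity difference.
    g = gray.astype(np.float64)
    idx = np.arange(n).reshape(h, w)
    edges = []
    for a, b in [(idx[:, :-1], idx[:, 1:]), (idx[:-1, :], idx[1:, :])]:
        d = np.abs(g.ravel()[a.ravel()] - g.ravel()[b.ravel()])
        edges.append(np.stack([d, a.ravel(), b.ravel()], axis=1))
    edges = np.concatenate(edges)
    edges = edges[np.argsort(edges[:, 0])]

    for _, a, b in edges:
        ra, rb = find(int(a)), find(int(b))
        if ra == rb:
            continue
        ma, mb = total[ra] / size[ra], total[rb] / size[rb]
        # Simplified bound: shrinks as regions grow; 256 = gray-level range.
        bound = 256.0 * np.sqrt(np.log(1.0 / delta) / (2.0 * Q)
                                * (1.0 / size[ra] + 1.0 / size[rb]))
        if abs(ma - mb) <= bound:
            parent[rb] = ra
            size[ra] += size[rb]
            total[ra] += total[rb]
    return np.array([find(i) for i in range(n)]).reshape(h, w)
```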

Proceedings ArticleDOI
27 Jun 2004
TL;DR: An approach is proposed for including contextual features when labeling images, in which each pixel is assigned to one of a finite set of labels; the features are incorporated into a probabilistic framework which combines the outputs of several components.
Abstract: We propose an approach to include contextual features for labeling images, in which each pixel is assigned to one of a finite set of labels. The features are incorporated into a probabilistic framework, which combines the outputs of several components. Components differ in the information they encode. Some focus on the image-label mapping, while others focus solely on patterns within the label field. Components also differ in their scale, as some focus on fine-resolution patterns while others on coarser, more global structure. A supervised version of the contrastive divergence algorithm is applied to learn these features from labeled image data. We demonstrate performance on two real-world image databases and compare it to a classifier and a Markov random field.

Journal ArticleDOI
TL;DR: The findings show that atlas selection is an important issue in atlas-based segmentation and that, in particular, multi-classifier techniques can substantially increase the segmentation accuracy.

BookDOI
TL;DR: An extension to the semi-supervised aligned cluster analysis algorithm (SSACA), a temporal clustering algorithm that incorporates pairwise constraints in the form of must-link and cannot-link is proposed that incorporates an exhaustive constraint propagation mechanism to further improve the clustering process.
Abstract: In this paper, we investigate applying semi-supervised clustering to audio-visual emotion analysis, a complex problem that is traditionally solved using supervised methods. We propose an extension to the semi-supervised aligned cluster analysis algorithm (SSACA), a temporal clustering algorithm that incorporates pairwise constraints in the form of must-link and cannot-link. We incorporate an exhaustive constraint propagation mechanism to further improve the clustering process. To validate the proposed method, we apply it to emotion analysis on a multimodal naturalistic emotion database. Results show substantial improvements compared to the original aligned clustering analysis algorithm (ACA) and to our previously proposed semi-supervised approach.

Journal ArticleDOI
TL;DR: This work develops the Retinex computation into a full scale automatic image enhancement algorithm, the multiscale Retinex with color restoration (MSRCR), which combines color constancy with local contrast/lightness enhancement to transform digital images into renditions that approach the realism of direct scene observation.
Abstract: There has been a revivification of interest in the Retinex computation in the last six or seven years, especially in its use for image enhancement. In his last published concept (1986) for a Retinex computation, Land introduced a center/surround spatial form, which was inspired by the receptive field structures of neurophysiology. With this as our starting point, we develop the Retinex concept into a full scale automatic image enhancement algorithm, the multiscale Retinex with color restoration (MSRCR), which combines color constancy with local contrast/lightness enhancement to transform digital images into renditions that approach the realism of direct scene observation. Recently, we have been exploring the fundamental scientific questions raised by this form of image processing. 1. Is the linear representation of digital images adequate in visual terms in capturing the wide scene dynamic range? 2. Can visual quality measures using the MSRCR be developed? 3. Is there a canonical, i.e., statistically ideal, visual image? The answers to these questions can serve as the basis for automating visual assessment schemes, which, in turn, are a primitive first step in bringing visual intelligence to computers. © 2004 SPIE and IS&T.
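A compact sketch of the MSRCR computation in Python/NumPy is shown below. The three surround scales and the alpha/beta color-restoration constants are values commonly quoted for the MSRCR rather than taken from this paper, and the final gain/offset mapping to display range is omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def msrcr(img, sigmas=(15, 80, 250), alpha=125.0, beta=46.0, eps=1.0):
    """Multiscale Retinex with color restoration (compact sketch).

    img: float RGB image with values in [0, 255].  The single-scale Retinex
    at scale sigma is log(I) - log(Gaussian_sigma * I); the multiscale output
    averages over the scales, and the color restoration factor re-weights
    each channel by its share of the pixel's total intensity.
    """
    img = img.astype(np.float64) + eps
    msr = np.zeros_like(img)
    for sigma in sigmas:
        for c in range(3):
            surround = gaussian_filter(img[..., c], sigma)
            msr[..., c] += np.log(img[..., c]) - np.log(surround + eps)
    msr /= len(sigmas)

    # Color restoration: emphasize channels that dominate the local chromaticity.
    total = img.sum(axis=2, keepdims=True)
    crf = beta * (np.log(alpha * img) - np.log(total))
    return crf * msr
```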

Journal ArticleDOI
TL;DR: Recent advances in image processing techniques for food quality evaluation are reviewed, which include charge coupled device camera, ultrasound, magnetic resonance imaging, computed tomography, and electrical tomography for image acquisition; pixel and local pre-processing approaches for image pre-processing; and thresholding-based, gradient-based, region-based, and classification-based methods for image segmentation.
Abstract: Image processing techniques have been applied increasingly for food quality evaluation in recent years. This paper reviews recent advances in image processing techniques for food quality evaluation, which include charge coupled device camera, ultrasound, magnetic resonance imaging, computed tomography, and electrical tomography for image acquisition; pixel and local pre-processing approaches for image pre-processing; thresholding-based, gradient-based, region-based, and classification-based methods for image segmentation; size, shape, colour, and texture features for object measurement; and statistical, fuzzy logic, and neural network methods for classification. The promise of image processing techniques for food quality evaluation is demonstrated, and some issues which need to be resolved or investigated further to expedite the application of image processing technologies for food quality evaluation are also discussed.

Journal ArticleDOI
TL;DR: A precise definition of the image foresting transform is given, together with a procedure to compute it (a generalization of Dijkstra's algorithm) with a proof of correctness.
Abstract: The image foresting transform (IFT) is a graph-based approach to the design of image processing operators based on connectivity. It naturally leads to correct and efficient implementations and to a better understanding of how different operators relate to each other. We give here a precise definition of the IFT, and a procedure to compute it (a generalization of Dijkstra's algorithm) with a proof of correctness. We also discuss implementation issues and illustrate the use of the IFT in a few applications.
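The generalized Dijkstra procedure can be sketched as below in Python for a 4-connected image graph with an additive path-cost function and seed pixels; other path-cost functions (for example, max-arc costs for watershed-like operators) drop into the same loop. This is a minimal sketch, not the paper's full specification.

```python
import heapq
import numpy as np

def image_foresting_transform(cost_img, seeds):
    """IFT with an additive path cost on a 4-connected image graph (sketch).

    Every pixel receives the minimum path cost from the nearest seed and the
    label of that seed, so the result is a seeded partition of the image.
    seeds: list of (row, col) pixel coordinates.
    """
    h, w = cost_img.shape
    path_cost = np.full((h, w), np.inf)
    label = np.full((h, w), -1, dtype=int)
    heap = []
    for lbl, (y, x) in enumerate(seeds):
        path_cost[y, x] = 0.0
        label[y, x] = lbl
        heapq.heappush(heap, (0.0, y, x))

    while heap:
        c, y, x = heapq.heappop(heap)
        if c > path_cost[y, x]:
            continue                              # stale heap entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                new_cost = c + float(cost_img[ny, nx])   # additive path cost
                if new_cost < path_cost[ny, nx]:
                    path_cost[ny, nx] = new_cost
                    label[ny, nx] = label[y, x]
                    heapq.heappush(heap, (new_cost, ny, nx))
    return path_cost, label
```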

Journal ArticleDOI
TL;DR: A novel approach to multiresolution signal-level image fusion is presented for accurately transferring visual information from any number of input image signals, into a single fused image without loss of information or the introduction of distortion.
Abstract: A novel approach to multiresolution signal-level image fusion is presented for accurately transferring visual information from any number of input image signals, into a single fused image without loss of information or the introduction of distortion. The proposed system uses a "fuse-then-decompose" technique realized through a novel, fusion/decomposition system architecture. In particular, information fusion is performed on a multiresolution gradient map representation domain of image signal information. At each resolution, input images are represented as gradient maps and combined to produce new, fused gradient maps. Fused gradient map signals are processed, using gradient filters derived from high-pass quadrature mirror filters to yield a fused multiresolution pyramid representation. The fused output image is obtained by applying, on the fused pyramid, a reconstruction process that is analogous to that of conventional discrete wavelet transform. This new gradient fusion significantly reduces the amount of distortion artefacts and the loss of contrast information usually observed in fused images obtained from conventional multiresolution fusion schemes. This is because fusion in the gradient map domain significantly improves the reliability of the feature selection and information fusion processes. Fusion performance is evaluated through informal visual inspection and subjective psychometric preference tests, as well as objective fusion performance measurements. Results clearly demonstrate the superiority of this new approach when compared to conventional fusion systems.
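The central gradient-map selection step can be illustrated as below in Python/NumPy: at each pixel, the input with the strongest gradient wins. This is only the selection rule; the paper applies it at every level of a multiresolution representation and reconstructs the fused image with gradient filters derived from quadrature mirror filters, which is not shown here.

```python
import numpy as np

def fuse_gradient_maps(images):
    """Per-pixel selection of the strongest gradient across input images.

    images: sequence of 2-D arrays of identical shape.  Returns the fused
    x- and y-gradient maps; reconstruction back to an image is omitted.
    """
    gx = np.stack([np.gradient(im.astype(np.float64), axis=1) for im in images])
    gy = np.stack([np.gradient(im.astype(np.float64), axis=0) for im in images])
    mag = np.hypot(gx, gy)                    # (K, H, W) gradient magnitudes
    pick = mag.argmax(axis=0)                 # index of the strongest input
    rows, cols = np.indices(pick.shape)
    return gx[pick, rows, cols], gy[pick, rows, cols]
```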

Reference BookDOI
27 Jul 2004
TL;DR: This book reviews the state-of-the-art advances in skew-elliptical distributions and provides many new developments in a single volume, collecting theoretical results and applications previously scattered throughout the literature.
Abstract: This book reviews the state-of-the-art advances in skew-elliptical distributions and provides many new developments in a single volume, collecting theoretical results and applications previously scattered throughout the literature. The main goal of this research area is to develop flexible parametric classes of distributions beyond the classical normal distribution. The book is divided into two parts. The first part discusses theory and inference for skew-elliptical distribution. The second part presents applications and case studies in areas such as economics, finance, oceanography, climatology, environmetrics, engineering, image processing, astronomy, and biomedical science.
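As a small concrete example of the simplest family treated in the book, the sketch below draws samples from a standard skew-normal distribution using its classical stochastic representation.

```python
import numpy as np

def sample_skew_normal(alpha, size, rng=None):
    """Samples from a standard skew-normal distribution with shape alpha.

    Uses the classical representation: with independent standard normals
    U0, U1 and delta = alpha / sqrt(1 + alpha^2),
    X = delta * |U0| + sqrt(1 - delta^2) * U1 is skew-normal(alpha).
    """
    rng = np.random.default_rng() if rng is None else rng
    delta = alpha / np.sqrt(1.0 + alpha ** 2)
    u0 = rng.standard_normal(size)
    u1 = rng.standard_normal(size)
    return delta * np.abs(u0) + np.sqrt(1.0 - delta ** 2) * u1
```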

Proceedings ArticleDOI
24 Oct 2004
TL;DR: Experimental results show that the developed technique can achieve fully automatic surveillance of fire accident with a lower false alarm rate and thus is very attractive for the important military, social security, commercial applications, and so on, at a general cost.
Abstract: The paper presents an early fire-alarm raising method based on video processing. The basic idea of the proposed fire-detection method is to adopt an RGB (red, green, blue) model based chromatic and disorder measurement for extracting fire-pixels and smoke-pixels. The decision function for fire-pixels is mainly deduced from the intensity and saturation of the R component. The extracted fire-pixels are then verified as real fire by the dynamics of growth and disorder, and further by smoke. Based on iterative checking of the growing ratio of flames, a fire-alarm is given when the alarm-raising condition is met. Experimental results show that the developed technique can achieve fully automatic surveillance of fire accidents with a lower false alarm rate and thus is very attractive for important military, social security, commercial applications, and so on, at a general cost.
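A sketch of the chromatic screening stage in Python/NumPy is given below. The rule structure (red dominance plus a saturation bound that tightens for less-red pixels) follows the kind of decision function the paper describes, but the threshold values are assumptions, and the dynamic growth/disorder and smoke verification stages are not shown.

```python
import numpy as np

def candidate_fire_pixels(rgb, r_thresh=125, s_thresh=60):
    """Chromatic test for candidate fire pixels (simplified sketch).

    rgb: (H, W, 3) uint8 image.  Thresholds are illustrative; a real system
    would further verify candidates with temporal growth/disorder checks.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    maxc = rgb.max(axis=2)
    minc = rgb.min(axis=2)
    saturation = np.where(maxc > 0, 255.0 * (maxc - minc) / maxc, 0.0)

    rule1 = r > r_thresh                      # red channel above a threshold
    rule2 = (r >= g) & (g > b)                # red dominates green dominates blue
    rule3 = saturation >= (255.0 - r) * s_thresh / r_thresh
    return rule1 & rule2 & rule3
```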

Journal ArticleDOI
TL;DR: A general overview of a new generation of Xmipp that has been re-engineered to maximize flexibility and modularity, potentially facilitating its integration in future standardization efforts in the field.

Journal ArticleDOI
TL;DR: In this paper, the authors present the method Heliosat-2 that converts observations made by geostationary meteorological satellites into estimates of the global irradiation at ground level.

Journal ArticleDOI
TL;DR: Experimental results of 3D image sensing and volume pixel reconstruction are presented to test and verify the performance of the algorithm and the imaging system.
Abstract: We propose a three-dimensional (3D) imaging technique that can sense a 3D scene and computationally reconstruct it as a 3D volumetric image. Sensing of the 3D scene is carried out by obtaining elemental images optically using a pickup microlens array and a detector array. Reconstruction of volume pixels of the scene is accomplished by computationally simulating optical reconstruction according to ray optics. The entire pixels of the recorded elemental images contribute to volumetric reconstruction of the 3D scene. Image display planes at arbitrary distances from the display microlens array are computed and reconstructed by back propagating the elemental images through a computer synthesized pinhole array based on ray optics. We present experimental results of 3D image sensing and volume pixel reconstruction to test and verify the performance of the algorithm and the imaging system. The volume pixel values can be used for 3D image surface reconstruction.

Journal ArticleDOI
TL;DR: The phantom is capable of producing realistic molecular imaging data from which imaging devices and techniques can be evaluated and can be used in the development of new imaging instrumentation, image acquisition strategies, and image processing and reconstruction methods.
Abstract: Purpose We develop a realistic and flexible 4-D digital mouse phantom and investigate its usefulness in molecular imaging research. Methods Organ shapes were modeled with non-uniform rational B-spline (NURBS) surfaces based on high-resolution 3-D magnetic resonance microscopy (MRM) data. Cardiac and respiratory motions were modeled based on gated magnetic resonance imaging (MRI) data obtained from normal mice. Pilot simulation studies in single-photon emission computed tomography (SPECT) and X-ray computed tomography (CT) were performed to demonstrate the utility of the phantom. Results NURBS are an efficient and flexible way to accurately model the anatomy and cardiac and respiratory motions for a realistic 4-D digital mouse phantom. The phantom is capable of producing realistic molecular imaging data from which imaging devices and techniques can be evaluated. Conclusion The phantom provides a unique and useful tool in molecular imaging research. It can be used in the development of new imaging instrumentation, image acquisition strategies, and image processing and reconstruction methods.

Patent
15 Nov 2004
TL;DR: In this article, a hand-supportable Digital Imaging-Based Bar Code Symbol Reading Device comprises: an IR-based Object Presence and Range Detection Subsystem; a Multi-Mode Area-Type Image Formation and Detection Subsystem having narrow-area and wide-area image capture modes of operation; a Multi-Mode LED-Based Illumination Subsystem having narrow-area and wide-area illumination modes of operation; an Automatic Light Exposure Measurement and Illumination Control Subsystem; and an Image Capturing and Buffering Subsystem.
Abstract: A hand-supportable Digital Imaging-Based Bar Code Symbol Reading Device comprises: an IR-based Object Presence and Range Detection Subsystem; a Multi-Mode Area-type Image Formation and Detection Subsystem having narrow-area and wide area image capture modes of operation; a Multi-Mode LED-based Illumination Subsystem having narrow-area and wide area illumination modes of operation; an Automatic Light Exposure Measurement and Illumination Control Subsystem; an Image Capturing and Buffering Subsystem; a Multi-Mode Image-Processing Bar Code Symbol Reading Subsystem; an Input/Output Subsystem; a manually-activatable trigger switch; a System Mode Configuration Parameter Table; and a System Control Subsystem integrated with each of the above-described subsystems. The bar code reading device can be configured and operated in numerous programmable modes of system operation to automatically read 1D and 2D bar code symbologies in a high-speed manner using advanced modes of image processing on captured images.

Journal ArticleDOI
TL;DR: Results of theoretical and experimental investigations of artifacts that can arise in spectral-domain optical coherence tomography (SD-OCT) and optical frequency domain imaging (OFDI) as a result of sample or probe beam motion are described.
Abstract: We describe results of theoretical and experimental investigations of artifacts that can arise in spectral-domain optical coherence tomography (SD-OCT) and optical frequency domain imaging (OFDI) as a result of sample or probe beam motion. While SD-OCT and OFDI are based on similar spectral interferometric principles, the specifics of motion effects are quite different because of distinct signal acquisition methods. These results provide an understanding of motion artifacts such as signal fading, spatial distortion and blurring, and emphasize the need for fast image acquisition in biomedical applications.