
Showing papers on "Pixel published in 2005"


Journal ArticleDOI
TL;DR: A real-time algorithm for foreground-background segmentation is presented; it can handle scenes containing moving backgrounds or illumination variations and achieves robust detection for different types of videos.
Abstract: We present a real-time algorithm for foreground-background segmentation. Sample background values at each pixel are quantized into codebooks which represent a compressed form of the background model for a long image sequence. This allows us to capture structural background variation due to periodic-like motion over a long period of time under limited memory. The codebook representation is efficient in memory and speed compared with other background modeling techniques. Our method can handle scenes containing moving backgrounds or illumination variations, and it achieves robust detection for different types of videos. We compared our method with other multimode modeling techniques. In addition to the basic algorithm, two features improving the algorithm are presented: layered modeling/detection and adaptive codebook updating. For performance evaluation, we have applied perturbation detection rate analysis to four background subtraction algorithms and two videos of different types of scenes.
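
For illustration, a minimal single-channel sketch of the codebook idea follows; the paper's codewords also carry color-distortion bounds, access frequency, and staleness statistics, so the intensity-only ranges and the tolerance `eps` below are simplifying assumptions:

```python
import numpy as np

def build_codebook(samples, eps=10.0):
    """Quantize a pixel's training samples (1-D intensities here, for
    brevity) into codewords stored as [low, high] ranges."""
    codebook = []
    for v in samples:
        for cw in codebook:
            if cw[0] - eps <= v <= cw[1] + eps:  # v matches this codeword
                cw[0] = min(cw[0], v)            # widen its range
                cw[1] = max(cw[1], v)
                break
        else:
            codebook.append([v, v])              # start a new codeword
    return codebook

def is_background(v, codebook, eps=10.0):
    return any(cw[0] - eps <= v <= cw[1] + eps for cw in codebook)

# A pixel alternating between two background modes (periodic-like motion),
# then a foreground value: both modes are absorbed into the codebook.
train = np.array([50, 52, 51, 120, 118, 121, 49, 122], dtype=float)
cb = build_codebook(train)
print(is_background(51, cb), is_background(119, cb))  # True True
print(is_background(200, cb))                         # False -> foreground
```

Because only a handful of codewords survive per pixel, the model stays compact even for long training sequences, which is the memory advantage the abstract refers to.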

1,552 citations


Proceedings ArticleDOI
20 Jun 2005
TL;DR: A framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks, developed using a Products-of-Experts framework.
Abstract: We develop a framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks. The approach extends traditional Markov random field (MRF) models by learning potential functions over extended pixel neighborhoods. Field potentials are modeled using a Products-of-Experts framework that exploits nonlinear functions of many linear filter responses. In contrast to previous MRF approaches, all parameters, including the linear filters themselves, are learned from training data. We demonstrate the capabilities of this Field of Experts model with two example applications, image denoising and image inpainting, which are implemented using a simple, approximate inference scheme. While the model is trained on a generic image database and is not tuned toward a specific application, we obtain results that compete with and even outperform specialized techniques.

1,167 citations


Journal ArticleDOI
TL;DR: For automated segmentation of the vasculature in retinal images, a Bayesian classifier with class-conditional probability density functions described as Gaussian mixtures is used, yielding fast classification while being able to model complex decision surfaces.
Abstract: We present a method for automated segmentation of the vasculature in retinal images. The method produces segmentations by classifying each image pixel as vessel or non-vessel, based on the pixel's feature vector. Feature vectors are composed of the pixel's intensity and continuous two-dimensional Morlet wavelet transform responses taken at multiple scales. The Morlet wavelet is capable of tuning to specific frequencies, thus allowing noise filtering and vessel enhancement in a single step. We use a Bayesian classifier with class-conditional probability density functions (likelihoods) described as Gaussian mixtures, yielding fast classification while being able to model complex decision surfaces, and compare its performance with the linear minimum squared error classifier. The probability distributions are estimated based on a training set of labeled pixels obtained from manual segmentations. The method's performance is evaluated on the publicly available DRIVE and STARE databases of manually labeled non-mydriatic images. On the DRIVE database, it achieves an area under the receiver operating characteristic (ROC) curve of 0.9598, slightly superior to that presented by the method of Staal et al.
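
A sketch of the classification stage is below, using scikit-learn's `GaussianMixture` to model the class-conditional likelihoods; the two-dimensional random blobs stand in for the paper's real features (intensity plus multiscale Morlet responses), and the component count is an arbitrary choice:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in feature vectors for the two classes (synthetic, for the demo).
vessel     = rng.normal(loc=[2.0, 1.0], scale=0.5, size=(500, 2))
background = rng.normal(loc=[0.0, 0.0], scale=0.7, size=(2000, 2))

# One Gaussian mixture per class models the class-conditional likelihood.
gmm_v = GaussianMixture(n_components=3, random_state=0).fit(vessel)
gmm_b = GaussianMixture(n_components=3, random_state=0).fit(background)
prior_v = len(vessel) / (len(vessel) + len(background))  # class priors
prior_b = 1.0 - prior_v

def classify(x):
    """Bayes rule in the log domain: the larger posterior wins."""
    post_v = gmm_v.score_samples(x) + np.log(prior_v)
    post_b = gmm_b.score_samples(x) + np.log(prior_b)
    return post_v > post_b  # True -> vessel

print(classify(np.array([[2.1, 0.9], [0.1, -0.2]])))  # [ True False]
```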

859 citations


Journal ArticleDOI
TL;DR: This paper presents a comprehensive framework, the general image fusion (GIF) method, which makes it possible to categorize, compare, and evaluate the existing image fusion methods.
Abstract: There are many image fusion methods that can be used to produce high-resolution multispectral images from a high-resolution panchromatic image and low-resolution multispectral images. Starting from the physical principle of image formation, this paper presents a comprehensive framework, the general image fusion (GIF) method, which makes it possible to categorize, compare, and evaluate the existing image fusion methods. Using the GIF method, it is shown that the pixel values of the high-resolution multispectral images are determined by the corresponding pixel values of the low-resolution panchromatic image, the approximation of the high-resolution panchromatic image at the low-resolution level. Many of the existing image fusion methods, including, but not limited to, intensity-hue-saturation, Brovey transform, principal component analysis, high-pass filtering, high-pass modulation, the à trous algorithm-based wavelet transform, and multiresolution analysis-based intensity modulation (MRAIM), are evaluated and found to be particular cases of the GIF method. The performance of each image fusion method is theoretically analyzed based on how the corresponding low-resolution panchromatic image is computed and how the modulation coefficients are set. An experiment based on IKONOS images shows that there is consistency between the theoretical analysis and the experimental results and that the MRAIM method synthesizes the images closest to those the corresponding multisensors would observe at the high-resolution level.
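
As a concrete instance, here is a hedged sketch of one special case of the GIF framework, high-pass modulation (HPM), where each fused band is the upsampled multispectral band modulated by the ratio of the panchromatic image to its low-pass approximation; the uniform-filter approximation of the low-resolution panchromatic image is an illustrative choice, not the paper's prescription:

```python
import numpy as np
from scipy.ndimage import uniform_filter, zoom

def hpm_fusion(ms_low, pan, ratio=4):
    """High-pass modulation: MS_high = MS_up * (Pan / Pan_low)."""
    # Upsample each multispectral band to the panchromatic grid.
    ms_up = np.stack([zoom(b, ratio, order=1) for b in ms_low])
    # Approximate the low-resolution panchromatic image by smoothing.
    pan_low = uniform_filter(pan, size=ratio)
    gain = pan / np.maximum(pan_low, 1e-6)   # high-frequency modulation
    return ms_up * gain

rng = np.random.default_rng(1)
ms = rng.uniform(0.2, 0.8, size=(3, 16, 16))  # 3 low-resolution bands
pan = rng.uniform(0.2, 0.8, size=(64, 64))    # high-resolution pan image
print(hpm_fusion(ms, pan).shape)              # (3, 64, 64)
```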

793 citations


Posted Content
TL;DR: In this article, a general presentation of the extraction of displacement fields from the knowledge of pictures taken at different instants of an experiment is given, and different strategies can be followed to achieve a sub-pixel uncertainty.
Abstract: The current development of digital image correlation, whose displacement uncertainty is well below the pixel value, enables one to better characterise the behaviour of materials and the response of structures to external loads. A general presentation of the extraction of displacement fields from the knowledge of pictures taken at different instants of an experiment is given. Different strategies can be followed to achieve a sub-pixel uncertainty. From these measurements, new identification procedures are devised making use of full-field measures. A priori or a posteriori routes can be followed. They are illustrated on the analysis of a Brazilian test.
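
One standard route to the sub-pixel uncertainty mentioned above is to locate the cross-correlation peak between an image pair and refine it with parabolic interpolation; the following sketch illustrates that idea on synthetic data (the paper surveys several such strategies, and this is only one of them):

```python
import numpy as np
from scipy.signal import correlate2d
from scipy.ndimage import shift as subpixel_shift

def subpixel_peak(c):
    """Parabolic interpolation around the integer correlation peak."""
    i, j = np.unravel_index(np.argmax(c), c.shape)
    fit = lambda m1, m0, p1: 0.5 * (m1 - p1) / (m1 - 2.0 * m0 + p1)
    return (i + fit(c[i-1, j], c[i, j], c[i+1, j]),
            j + fit(c[i, j-1], c[i, j], c[i, j+1]))

rng = np.random.default_rng(2)
ref = rng.normal(size=(32, 32))                  # reference speckle pattern
cur = subpixel_shift(ref, (1.3, -0.6), order=3)  # known sub-pixel motion

c = correlate2d(cur, ref, mode='same')           # peak sits at the shift
pi, pj = subpixel_peak(c)
print(pi - 16, pj - 16)  # roughly (1.3, -0.6)
```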

772 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the utility of high spectral and spatial resolution imagery for the automated species-level classification of individual tree crowns (ITCs) in a tropical rain forest (TRF).

714 citations


Journal ArticleDOI
TL;DR: An object detection scheme with three innovations over existing approaches: the background is modeled as a single probability density over a joint domain-range representation, temporal persistence is used as a detection criterion, and the posterior function is maximized efficiently by finding the minimum cut of a capacitated graph.
Abstract: Accurate detection of moving objects is an important precursor to stable tracking or recognition. In this paper, we present an object detection scheme that has three innovations over existing approaches. First, the model of the intensities of image pixels as independent random variables is challenged and it is asserted that useful correlation exists in intensities of spatially proximal pixels. This correlation is exploited to sustain high levels of detection accuracy in the presence of dynamic backgrounds. By using a nonparametric density estimation method over a joint domain-range representation of image pixels, multimodal spatial uncertainties and complex dependencies between the domain (location) and range (color) are directly modeled. We propose a model of the background as a single probability density. Second, temporal persistence is proposed as a detection criterion. Unlike previous approaches to object detection, which detect objects by building adaptive models of the background, the foreground is modeled to augment the detection of objects (without explicit tracking), since objects detected in the preceding frame contain substantial evidence for detection in the current frame. Finally, the background and foreground models are used competitively in a MAP-MRF decision framework, stressing spatial context as a condition of detecting interesting objects; the posterior function is maximized efficiently by finding the minimum cut of a capacitated graph. Experimental validation of the proposed method is performed and presented on a diverse set of dynamic scenes.
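
A hedged sketch of the joint domain-range idea: the background likelihood of a pixel is a kernel density estimate over background samples living in the joint (x, y, r, g, b) space, so nearby samples with similar color dominate. The Gaussian kernel and the bandwidths below are illustrative assumptions:

```python
import numpy as np

def background_likelihood(pixel, samples, h_space=8.0, h_color=12.0):
    """KDE over the joint domain (x, y) and range (r, g, b)."""
    d_sp = np.sum((samples[:, :2] - pixel[:2]) ** 2, axis=1) / h_space**2
    d_co = np.sum((samples[:, 2:] - pixel[2:]) ** 2, axis=1) / h_color**2
    return np.mean(np.exp(-0.5 * (d_sp + d_co)))

rng = np.random.default_rng(3)
xy = rng.uniform(0, 100, size=(5000, 2))   # sample locations
rgb = rng.normal(128, 5, size=(5000, 3))   # gray-ish background colors
bg = np.hstack([xy, rgb])

gray = background_likelihood(np.array([50, 50, 128, 128, 128]), bg)
red = background_likelihood(np.array([50, 50, 255, 0, 0]), bg)
print(gray > 1000 * red)  # True: the red pixel is clearly foreground
```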

685 citations


Journal ArticleDOI
TL;DR: In this Letter, a new image encryption scheme is presented, in which shuffling the positions and changing the grey values of image pixels are combined to confuse the relationship between the cipher-image and the plain-image.
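
The Letter's exact maps are not reproduced in this listing, so here is only a generic sketch of the shuffle-then-mask family it belongs to, using a logistic map both to permute pixel positions and to generate a key stream that changes the grey values; every parameter below is an illustrative assumption:

```python
import numpy as np

def logistic_sequence(n, x0, mu=3.99):
    """Chaotic logistic map iterates, a common key-stream source."""
    xs = np.empty(n)
    for i in range(n):
        x0 = mu * x0 * (1.0 - x0)
        xs[i] = x0
    return xs

def encrypt(img, key=0.3567):
    flat = img.ravel()
    n = flat.size
    perm = np.argsort(logistic_sequence(n, key))            # shuffle positions
    stream = (logistic_sequence(n, key * 0.5) * 256).astype(np.uint8)
    return (flat[perm] ^ stream).reshape(img.shape), perm, stream

def decrypt(cipher, perm, stream):
    flat = cipher.ravel() ^ stream                          # undo grey change
    out = np.empty_like(flat)
    out[perm] = flat                                        # undo shuffling
    return out.reshape(cipher.shape)

img = np.arange(64, dtype=np.uint8).reshape(8, 8)
cipher, perm, stream = encrypt(img)
assert np.array_equal(decrypt(cipher, perm, stream), img)   # round-trips
```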

644 citations


Journal ArticleDOI
TL;DR: Improvements to the nonlocal means image denoising method introduced by Buades et al. are presented and filters that eliminate unrelated neighborhoods from the weighted average are introduced.
Abstract: In this letter, improvements to the nonlocal means image denoising method introduced by Buades et al. are presented. The original nonlocal means method replaces a noisy pixel by the weighted average of pixels with related surrounding neighborhoods. While producing state-of-the-art denoising results, this method is computationally impractical. In order to accelerate the algorithm, we introduce filters that eliminate unrelated neighborhoods from the weighted average. These filters are based on local average gray values and gradients, preclassifying neighborhoods and thereby reducing the original quadratic complexity to a linear one and reducing the influence of less-related areas in the denoising of a given pixel. We present the underlying framework and experimental results for gray level and color images as well as for video.
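
A single-pixel sketch of the accelerated scheme: before the expensive patch comparison, candidate neighborhoods whose local average gray value differs too much from the reference are discarded (the paper also prefilters on gradients; window sizes and thresholds here are illustrative):

```python
import numpy as np

def nlm_pixel(img, i, j, f=2, t=7, h=0.15, mean_tol=0.1):
    """Nonlocal means for one pixel with a mean-value preclassifier."""
    patch = img[i-f:i+f+1, j-f:j+f+1]
    m0 = patch.mean()
    num = den = 0.0
    for y in range(max(f, i-t), min(img.shape[0]-f, i+t+1)):
        for x in range(max(f, j-t), min(img.shape[1]-f, j+t+1)):
            cand = img[y-f:y+f+1, x-f:x+f+1]
            if abs(cand.mean() - m0) > mean_tol:  # preclassification:
                continue                          # skip unrelated regions
            w = np.exp(-np.sum((patch - cand)**2) / h**2)
            num += w * img[y, x]
            den += w
    return num / den

rng = np.random.default_rng(4)
noisy = 0.5 + rng.normal(0, 0.05, (32, 32))
print(noisy[16, 16], nlm_pixel(noisy, 16, 16))  # denoised toward 0.5
```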

562 citations


Journal ArticleDOI
TL;DR: It is shown that it is possible to calibrate an immersive virtual environment with 16 cameras in less than 60 minutes reaching about 1/5 pixel reprojection error.
Abstract: Virtual immersive environments or telepresence setups often consist of multiple cameras that have to be calibrated, We present a convenient method for doing this. The minimum is three cameras, but there is no upper limit. The method is fully automatic and a freely moving bpdght spot is the only calibration object, A set of virtual 3D points is made by waving the bright spot through the working volume. Its projections are found with subpixel precision and verified by a robust RANSAC analysis. The cameras do not have to see all points; only reasonable overlap between camera subgroups is necessary. Projective structures are computed via rank-4 factorization and the Euclidean stratification is done by imposing geometric constraints. This linear estimate initializes a postprocessing computation of nonlinear distortion, which is also fully automatic. We suggest a trick on how to use a very ordinary laser pointer as the calibration object. We show that it is possible to calibrate an immersive virtual environment with 16 cameras in less than 60 minutes reaching about 1/5 pixel reprojection error, The method has been successfully tested on numerous multicamera environments using varying numbers of cameras of varying quality.

560 citations


Journal ArticleDOI
TL;DR: The purpose of this article is to provide a nuts and bolts procedure for calculating scale factors used for reconstructing images directly in SNR units and to validate the method for SNR measurement with phantom data.
Abstract: The method for phased array image reconstruction of uniform noise images may be used in conjunction with proper image scaling as a means of reconstructing images directly in SNR units. This facilitates accurate and precise SNR measurement on a per pixel basis. This method is applicable to root-sum-of-squares magnitude combining, B1-weighted combining, and parallel imaging such as SENSE. A procedure for image reconstruction and scaling is presented, and the method for SNR measurement is validated with phantom data. Alternative methods that rely on noise-only regions are not appropriate for parallel imaging, where the noise level is highly variable across the field-of-view. The purpose of this article is to provide a nuts-and-bolts procedure for calculating scale factors used for reconstructing images directly in SNR units. The procedure includes scaling for noise equivalent bandwidth of digital receivers, FFTs and associated window functions (raw data filters), and array combining.
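
A hedged numerical sketch of the core idea: prewhiten the coil data with the measured noise covariance so every virtual channel has unit-variance noise, then root-sum-of-squares combine, after which pixel values are approximately in SNR units. The exact scale factors (receiver bandwidth, FFT windowing, the sqrt(2) convention for complex noise) are precisely what the article's procedure tracks, and they are omitted here:

```python
import numpy as np

rng = np.random.default_rng(5)
n_coils, ny, nx = 4, 8, 8

# Synthetic coil images: a bright object seen through per-coil
# sensitivities, plus complex noise with a known coil covariance.
signal = np.zeros((ny, nx)); signal[3:5, 3:5] = 20.0
sens = rng.uniform(0.5, 1.0, n_coils)
cov = np.diag(rng.uniform(0.5, 2.0, n_coils))   # coil noise covariance
L = np.linalg.cholesky(cov)
noise = np.einsum('cd,dyx->cyx', L,
                  rng.normal(size=(n_coils, ny, nx)) +
                  1j * rng.normal(size=(n_coils, ny, nx)))
imgs = sens[:, None, None] * signal + noise

# Prewhiten, then root-sum-of-squares combine.
white = np.einsum('cd,dyx->cyx', np.linalg.inv(L), imgs)
snr_image = np.sqrt(np.sum(np.abs(white) ** 2, axis=0))

print(snr_image[4, 4])  # object pixel: large value (its approximate SNR)
print(snr_image[0, 0])  # noise-only pixel: near the noise floor
```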

Journal ArticleDOI
TL;DR: The proposed demosaicing algorithm estimates missing pixels by interpolating in the direction with fewer color artifacts, and the aliasing problem is addressed by applying filterbank techniques to 2-D directional interpolation.
Abstract: A cost-effective digital camera uses a single-image sensor, applying alternating patterns of red, green, and blue color filters to each pixel location. A way to reconstruct a full three-color representation of color images by estimating the missing pixel components in each color plane is called a demosaicing algorithm. This paper presents three inherent problems often associated with demosaicing algorithms that incorporate two-dimensional (2-D) directional interpolation: misguidance color artifacts, interpolation color artifacts, and aliasing. The level of misguidance color artifacts present in two images can be compared using metric neighborhood modeling. The proposed demosaicing algorithm estimates missing pixels by interpolating in the direction with fewer color artifacts. The aliasing problem is addressed by applying filterbank techniques to 2-D directional interpolation. The interpolation artifacts are reduced using a nonlinear iterative procedure. Experimental results using digital images confirm the effectiveness of this approach.
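
The directional-interpolation core is easy to state in code. The sketch below estimates a missing green sample by interpolating along the axis with the smaller gradient; the full algorithm layers the artifact metric, the filterbank anti-aliasing, and the nonlinear iteration on top of this kernel:

```python
import numpy as np

def interpolate_green(bayer, i, j):
    """Missing green at a red/blue site: interpolate along the direction
    with the smaller gradient (i.e., along the edge, not across it)."""
    dh = abs(bayer[i, j-1] - bayer[i, j+1])  # horizontal gradient
    dv = abs(bayer[i-1, j] - bayer[i+1, j])  # vertical gradient
    if dh < dv:
        return (bayer[i, j-1] + bayer[i, j+1]) / 2.0
    return (bayer[i-1, j] + bayer[i+1, j]) / 2.0

# Toy mosaic with a vertical edge: left half dark, right half bright.
bayer = np.zeros((6, 6)); bayer[:, 3:] = 100.0
# At (2, 2) the vertical direction is chosen, so the two sides don't mix.
print(interpolate_green(bayer, 2, 2))  # 0.0 rather than 50.0
```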

Patent
22 Dec 2005
TL;DR: In this paper, an imaging array sensor is used to capture an image of a scene occurring exteriorly of the vehicle and a control algorithmically processes the image data set to extract information from the reduced image data sets.
Abstract: An imaging system for a vehicle includes an imaging array sensor and a control. The image array sensor comprises a plurality of photo-sensing pixels and is positioned at the vehicle with a field of view exteriorly of the vehicle. The imaging array sensor is operable to capture an image of a scene occurring exteriorly of the vehicle. The captured image comprises an image data set representative of the exterior scene. The control algorithmically processes the image data set to a reduced image data set of the image data set. The control processes the reduced image data set to extract information from the reduced image data set. The control selects the reduced image data set based on a steering angle of the vehicle.

Journal ArticleDOI
01 Sep 2005
TL;DR: A robust segmentation technique based on an extension to the traditional fuzzy c-means (FCM) clustering algorithm is proposed and a neighborhood attraction, which is dependent on the relative location and features of neighboring pixels, is shown to improve the segmentation performance dramatically.
Abstract: Image segmentation is an indispensable process in the visualization of human tissues, particularly during clinical analysis of magnetic resonance (MR) images. Unfortunately, MR images always contain a significant amount of noise caused by operator performance, equipment, and the environment, which can lead to serious inaccuracies with segmentation. A robust segmentation technique based on an extension to the traditional fuzzy c-means (FCM) clustering algorithm is proposed in this paper. A neighborhood attraction, which is dependent on the relative location and features of neighboring pixels, is shown to improve the segmentation performance dramatically. The degree of attraction is optimized by a neural-network model. Simulated and real brain MR images with different noise levels are segmented to demonstrate the superiority of the proposed technique compared to other FCM-based methods. This segmentation method is a key component of an MR image-based classification system for brain tumors, currently being developed.
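
A simplified sketch of FCM with a neighborhood attraction term is below: each pixel's squared distance to a cluster is shrunk in proportion to how strongly its 4-neighbors already belong to that cluster. The fixed attraction strength `alpha` replaces the paper's neural-network-optimized degree of attraction, and wrap-around neighbors are tolerated for brevity:

```python
import numpy as np

def fcm_segment(img, k=2, m=2.0, alpha=0.3, iters=30):
    """Fuzzy c-means on intensities with a neighborhood attraction term."""
    h, w = img.shape
    x = img.ravel().astype(float)
    u = np.random.default_rng(6).dirichlet(np.ones(k), size=x.size)
    for _ in range(iters):
        c = (u**m).T @ x / (u**m).sum(axis=0)        # cluster centers
        d2 = (x[:, None] - c[None, :])**2 + 1e-12    # squared distances
        # Attraction: average membership of the four spatial neighbors.
        un = u.reshape(h, w, k)
        nbr = (np.roll(un, 1, 0) + np.roll(un, -1, 0) +
               np.roll(un, 1, 1) + np.roll(un, -1, 1)) / 4.0
        d2 *= 1.0 - alpha * nbr.reshape(-1, k)       # pull toward neighbors
        u = 1.0 / d2 ** (1.0 / (m - 1.0))            # FCM membership update
        u /= u.sum(axis=1, keepdims=True)
    return u.argmax(axis=1).reshape(h, w)

img = np.zeros((16, 16)); img[:, 8:] = 1.0           # two-region phantom
img += np.random.default_rng(7).normal(0, 0.2, img.shape)
print(fcm_segment(img)[0])  # one row: left/right labels despite the noise
```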

Journal ArticleDOI
TL;DR: Evaluations by comparison with the results of polarizing filters demonstrate the effectiveness of the proposed method, which is based solely on colors, particularly chromaticity, without requiring any geometrical information.
Abstract: In inhomogeneous objects, highlights are linear combinations of diffuse and specular reflection components. To our knowledge, all methods that use a single input image require explicit color segmentation to deal with multicolored surfaces. Unfortunately, current color segmentation algorithms still struggle to segment complex textured images correctly. Consequently, a method without explicit color segmentation becomes indispensable, and this work presents such a method. The method is based solely on colors, particularly chromaticity, without requiring any geometrical information. One of the basic ideas is to iteratively compare the intensity logarithmic differentiation of an input image and its specular-free image. A specular-free image is an image that has exactly the same geometrical profile as the diffuse component of the input image and that can be generated by shifting each pixel's intensity and maximum chromaticity nonlinearly. Unlike existing methods using a single image, all processes in the proposed method are done locally, involving a maximum of only two neighboring pixels. This local operation is useful for handling textured objects with complex multicolored scenes. Evaluations by comparison with the results of polarizing filters demonstrate the effectiveness of the proposed method.
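
For intuition, a crude specular-free proxy can be built by subtracting the per-pixel minimum channel, which removes an (assumed white) specular component under the dichromatic model; note this is a well-known simplification, not the paper's construction, which shifts intensity and maximum chromaticity nonlinearly while preserving the diffuse geometric profile exactly:

```python
import numpy as np

def crude_specular_free(img):
    """Subtract the per-pixel minimum channel (assumes white specularity)."""
    return img - img.min(axis=2, keepdims=True)

# Diffuse red surface with a white highlight added in one corner.
img = np.zeros((4, 4, 3)); img[..., 0] = 0.6
img[0, 0] += 0.3                         # equal-RGB specular contribution
print(crude_specular_free(img)[0, 0])    # [0.6 0.  0. ]: highlight removed
```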

Proceedings ArticleDOI
20 Jun 2005
TL;DR: This work uses a new expectation-maximization (EM) scheme to impose both spatial and color smoothness to infer natural connectivity among pixels, and demonstrates results on a variety of applications including image deblurring, enhanced color transfer, and colorizing gray scale images.
Abstract: We address the problem of regional color transfer between two natural images by probabilistic segmentation. We use a new expectation-maximization (EM) scheme to impose both spatial and color smoothness to infer natural connectivity among pixels. Unlike previous work, our method takes local color information into consideration and segments the image with soft region boundaries for seamless color transfer and compositing. Our modified EM method has two advantages in color manipulation: first, subject to different levels of color smoothness in image space, our algorithm produces an optimal number of regions upon convergence, where the color statistics in each region can be adequately characterized by a component of a Gaussian mixture model (GMM). Second, we allow a pixel to fall in several regions according to our estimated probability distribution in the EM step, resulting in a transparency-like ratio for compositing different regions seamlessly. Hence, natural color transition across regions can be achieved, where the necessary intra-region and inter-region smoothness are enforced without losing original details. We demonstrate results on a variety of applications including image deblurring, enhanced color transfer, and colorizing gray scale images. Comparisons with previous methods are also presented.

Proceedings ArticleDOI
17 Oct 2005
TL;DR: This paper combines the segmentation and matting problem together and proposes a unified optimization approach based on belief propagation, which is more efficient to extract high quality mattes for foregrounds with significant semitransparent regions.
Abstract: Separating a foreground object from the background in a static image involves determining both full and partial pixel coverages, also known as extracting a matte. Previous approaches require the input image to be presegmented into three regions: foreground, background and unknown, which together are called a trimap. Partial opacity values are then computed only for pixels inside the unknown region. This presegmentation-based approach fails for images with large portions of semitransparent foreground where the trimap is difficult to create even manually. In this paper, we combine the segmentation and matting problems together and propose a unified optimization approach based on belief propagation. We iteratively estimate the opacity value for every pixel in the image, based on a small sample of foreground and background pixels marked by the user. Experimental results show that, compared with previous approaches, our method is more efficient at extracting high quality mattes for foregrounds with significant semitransparent regions.

Journal ArticleDOI
09 Jul 2005
TL;DR: The AdaBoost based classifiers presented here achieve over 93% accuracy; these match or surpass the accuracies of the SVM-based classifiers, and yield performance that is 50 times faster.
Abstract: This paper presents a method based on AdaBoost to identify the sex of a person from a low resolution grayscale picture of their face. The method described here is implemented in a system that will process well over 10^9 images. The goal of this work is to create an efficient system that is both simple to implement and maintain; the methods described here are extremely fast and have straightforward implementations. We achieve 80% accuracy in sex identification with less than 10 pixel comparisons and 90% accuracy with less than 50 pixel comparisons. The best classifiers published to date use Support Vector Machines; we match their accuracies with as few as 500 comparison operations on a 20×20 pixel image. The AdaBoost based classifiers presented here achieve over 93% accuracy; these match or surpass the accuracies of the SVM-based classifiers, and yield performance that is 50 times faster.
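
A runnable sketch of the weak-learner design: features are signed differences of random pixel pairs, and boosted decision stumps (scikit-learn's default base estimator for `AdaBoostClassifier`) each threshold one such comparison. The synthetic 20x20 "faces" below stand in for real labeled data:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(8)

def make_faces(n, bright):
    """Synthetic 20x20 images; one class has a brighter patch."""
    x = rng.uniform(0, 1, size=(n, 20, 20))
    if bright:
        x[:, 5:8, 5:8] += 0.4
    return x.reshape(n, -1)

X = np.vstack([make_faces(300, True), make_faces(300, False)])
y = np.array([1] * 300 + [0] * 300)

# Each feature is one pixel comparison: I(p) - I(q) for a random pair.
pairs = rng.integers(0, 400, size=(200, 2))
F = X[:, pairs[:, 0]] - X[:, pairs[:, 1]]

clf = AdaBoostClassifier(n_estimators=50).fit(F, y)
print(clf.score(F, y))  # high training accuracy from few comparisons
```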

Proceedings ArticleDOI
20 Jun 2005
TL;DR: A novel algorithm aiming to estimate the 3D shape, the texture of a human face, along with the3D pose and the light direction from a single photograph by recovering the parameters of a 3D morphable model is presented.
Abstract: We present a novel algorithm aiming to estimate the 3D shape and texture of a human face, along with the 3D pose and the light direction, from a single photograph by recovering the parameters of a 3D morphable model. Generally, the algorithms tackling the problem of 3D shape estimation from image data use only the pixel intensity as input to drive the estimation process. This was previously achieved either by using a simple model, such as the Lambertian reflectance model, leading to a linear fitting algorithm, or by using a more precise model and minimizing a non-convex cost function with many local minima. One way to reduce the local minima problem is to use a stochastic optimization algorithm. However, the convergence properties (such as the radius of convergence) of such algorithms are limited. Here, as well as the pixel intensity, we use various image features such as the edges or the location of the specular highlights. The 3D shape, texture and imaging parameters are then estimated by maximizing the posterior of the parameters given these image features. The overall cost function obtained is smoother and, hence, a stochastic optimization algorithm is not needed to avoid the local minima problem. This leads to the multi-features fitting algorithm that has a wider radius of convergence and a higher level of precision. This is shown on some example photographs, and on a recognition experiment performed on the CMU-PIE image database.

Journal ArticleDOI
TL;DR: A novel coarse-to-fine algorithm that is able to locate text lines even against complex backgrounds is proposed, and experimental results show that this approach can quickly and robustly detect text lines under various conditions.

Proceedings ArticleDOI
29 Jun 2005
TL;DR: A new method for colorizing grayscale images by transferring color from a segmented example image is presented, rather than relying on a series of independent pixel-level decisions, that attempts to account for the higher-level context of each pixel.
Abstract: We present a new method for colorizing grayscale images by transferring color from a segmented example image. Rather than relying on a series of independent pixel-level decisions, we develop a new strategy that attempts to account for the higher-level context of each pixel. The colorizations generated by our approach exhibit a much higher degree of spatial consistency, compared to previous automatic color transfer methods [WAM02]. We also demonstrate that our method requires considerably less manual effort than previous user-assisted colorization methods [LLW04]. Given a grayscale image to colorize, we first determine for each pixel which example segment it should learn its color from. This is done automatically using a robust supervised classification scheme that analyzes the low-level feature space defined by small neighborhoods of pixels in the example image. Next, each pixel is assigned a color from the appropriate region using a neighborhood matching metric, combined with spatial filtering for improved spatial coherence. Each color assignment is associated with a confidence value, and pixels with a sufficiently high confidence level are provided as "micro-scribbles" to the optimization-based colorization algorithm of Levin et al. [LLW04], which produces the final complete colorization of the image.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: This work re-examines the use of dynamic programming for stereo correspondence by applying it to a tree structure, as opposed to the individual scanlines, and concludes that this algorithm is truly a global optimization method because disparity estimate at one pixel depends on the disparity estimates at all the other pixels, unlike the scanline based methods.
Abstract: Dynamic programming on a scanline is one of the oldest and still popular methods for stereo correspondence. While efficient, its performance is far from the state of the art because the vertical consistency between the scanlines is not enforced. We re-examine the use of dynamic programming for stereo correspondence by applying it to a tree structure, as opposed to the individual scanlines. The nodes of this tree are all the image pixels, but only the "most important" edges of the 4-connected neighbourhood system are included. Thus our algorithm is truly a global optimization method because the disparity estimate at one pixel depends on the disparity estimates at all the other pixels, unlike the scanline based methods. We evaluate our algorithm on the benchmark Middlebury database. The algorithm is very fast; it takes only a fraction of a second for a typical image. The results are considerably better than that of the scanline based methods. While the results are not the state of the art, our algorithm offers a good trade off in terms of accuracy and computational efficiency.
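
For reference, the scanline recurrence that the paper lifts onto a tree looks like this; the smoothness weight `lam` and the toy signals are arbitrary choices (a plain linear smoothness term is used for brevity):

```python
import numpy as np

def dp_scanline(left, right, max_d=3, lam=0.2):
    """DP stereo on one scanline: minimize |L(x) - R(x-d)| plus a
    smoothness penalty lam*|d - d'| between neighboring pixels."""
    n, D = len(left), max_d + 1
    match = np.full((n, D), 1e9)
    for x in range(n):
        for d in range(D):
            if x - d >= 0:
                match[x, d] = abs(left[x] - right[x - d])
    cost, back = match.copy(), np.zeros((n, D), dtype=int)
    for x in range(1, n):
        for d in range(D):
            trans = cost[x-1] + lam * np.abs(np.arange(D) - d)
            back[x, d] = np.argmin(trans)
            cost[x, d] = match[x, d] + trans[back[x, d]]
    disp = np.zeros(n, dtype=int)              # backtrack the optimum
    disp[-1] = np.argmin(cost[-1])
    for x in range(n - 1, 0, -1):
        disp[x-1] = back[x, disp[x]]
    return disp

left = np.array([0, 0, 5, 9, 5, 0, 0, 3, 7, 3, 0, 0], dtype=float)
right = np.roll(left, -2)                      # true disparity is 2
print(dp_scanline(left, right))                # mostly 2s
```

In the tree version, the same recurrence runs as message passing from the leaves to the root of a spanning tree over all pixels, which is what makes the optimization global.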

Journal ArticleDOI
01 Jul 2005
TL;DR: The Color2Gray results offer viewers salient information missing from previous grayscale image creation methods.
Abstract: Visually important image features often disappear when color images are converted to grayscale. The algorithm introduced here reduces such losses by attempting to preserve the salient features of the color image. The Color2Gray algorithm is a 3-step process: 1) convert RGB inputs to a perceptually uniform CIE L*a*b* color space, 2) use chrominance and luminance differences to create grayscale target differences between nearby image pixels, and 3) solve an optimization problem designed to selectively modulate the grayscale representation as a function of the chroma variation of the source image. The Color2Gray results offer viewers salient information missing from previous grayscale image creation methods.
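
A toy version of the pipeline on a 6x6 isoluminant image: signed target differences between neighboring pixels are taken from whichever of the luminance or chrominance differences dominates, and the gray values solving the resulting least-squares problem are recovered directly (the real algorithm works in CIE L*a*b* with larger neighborhoods and a chroma-variation parameter):

```python
import numpy as np

h, w = 6, 6
L = np.full((h, w), 50.0)                 # constant luminance everywhere
a = np.zeros((h, w)); a[:, 3:] = 40.0     # chroma edge down the middle

def target(p, q):
    dL = L[p] - L[q]
    dC = a[p] - a[q]                      # chrominance difference
    return dL if abs(dL) > abs(dC) else dC

rows, rhs = [], []
idx = lambda i, j: i * w + j
for i in range(h):
    for j in range(w):
        for di, dj in ((0, 1), (1, 0)):   # 4-neighbor pairs
            if i + di < h and j + dj < w:
                r = np.zeros(h * w)
                r[idx(i, j)], r[idx(i + di, j + dj)] = 1.0, -1.0
                rows.append(r); rhs.append(target((i, j), (i + di, j + dj)))
rows.append(np.ones(h * w) / (h * w)); rhs.append(50.0)  # fix mean gray
g = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
print(g.reshape(h, w).round(1))  # ~30 left, ~70 right: chroma edge kept
```

A plain luminance conversion would map both halves to the same gray; the optimization keeps the edge visible.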

Proceedings ArticleDOI
17 Oct 2005
TL;DR: A robust parallax filtering scheme is proposed to accumulate the geometric constraint errors within a sliding window and estimate a likelihood map for pixel classification, which is integrated into the tracking framework based on the spatio-temporal joint probability data association filter (JPDAF).
Abstract: We present a novel approach to detect and track independently moving regions in a 3D scene observed by a moving camera in the presence of strong parallax. Detected moving pixels are classified into independently moving regions or parallax regions by analyzing two geometric constraints: the commonly used epipolar constraint, and the structure consistency constraint. The second constraint is implemented within a "plane+parallax" framework and represented by a bilinear relationship which relates the image points to their relative depths. This newly derived relationship is related to trilinear tensor, but can be enforced into more than three frames. It does not assume a constant reference plane in the scene and therefore eliminates the need for manual selection of reference plane. Then, a robust parallax filtering scheme is proposed to accumulate the geometric constraint errors within a sliding window and estimate a likelihood map for pixel classification. The likelihood map is integrated into our tracking framework based on the spatio-temporal joint probability data association filter (JPDAF). This tracking approach infers the trajectory and bounding box of the moving objects by searching the optimal path with maximum joint probability within a fixed size of buffer. We demonstrate the performance of the proposed approach on real video sequences where parallax effects are significant.

Patent
10 Mar 2005
TL;DR: In this paper, a liquid crystal display with a plurality of thin-film transistors is presented, where each pixel consists of two thin film transistors and one pixel electrode, and the gate electrodes are connected to two adjoining scanning lines respectively.
Abstract: A liquid crystal display is suitable for displaying images with rapid motions, and comprises an active matrix substrate equipped with a plurality of thin film transistors. The active matrix substrate comprises a plurality of pixels that are placed at the areas encircled by a plurality of scanning lines and a plurality of data lines. Each pixel consists of two thin film transistors and one pixel electrode. The data-line-connected electrodes of the two thin film transistors are connected to two adjoining data lines respectively, whereas the pixel-connected electrodes of the two thin film transistors are together connected to the pixel electrode. The gate electrodes of the two thin film transistors are connected to two adjoining scanning lines respectively.

Journal ArticleDOI
01 Jul 2005
TL;DR: This work enhances underexposed, low dynamic range videos by adaptively and independently varying the exposure at each photoreceptor in a post-process, which is a dynamic function of both the spatial neighborhood and temporal history at each pixel.
Abstract: We enhance underexposed, low dynamic range videos by adaptively and independently varying the exposure at each photoreceptor in a post-process. This virtual exposure is a dynamic function of both the spatial neighborhood and temporal history at each pixel. Temporal integration enables us to expand the image's dynamic range while simultaneously reducing noise. Our non-linear exposure variation and denoising filters smoothly transition from temporal to spatial for moving scene elements. Our virtual exposure framework also supports temporally coherent per frame tone mapping. Our system outputs restored video sequences with significantly reduced noise, increased exposure time of dark pixels, intact motion, and improved details.

Journal ArticleDOI
TL;DR: In this paper, four algorithms from the two main groups of segmentation algorithms (boundary-based and region-based) were evaluated and compared using empirical discrepancy evaluation methods.
Abstract: Since 1999, very high spatial resolution satellite data represent the surface of the Earth with more detail. However, information extraction by per pixel multispectral classification techniques proves to be very complex owing to the internal variability increase in land-cover units and to the weakness of spectral resolution. Image segmentation before classification was proposed as an alternative approach, but a large variety of segmentation algorithms were developed during the last 20 years, and a comparison of their implementation on very high spatial resolution images is necessary. In this study, four algorithms from the two main groups of segmentation algorithms (boundary-based and region-based) were evaluated and compared. In order to compare the algorithms, an evaluation of each algorithm was carried out with empirical discrepancy evaluation methods. This evaluation is carried out with a visual segmentation of Ikonos panchromatic images. The results show that the choice of parameters is very important and has a great influence on the segmentation results. The selected boundary-based algorithms are sensitive to the noise or texture. Better results are obtained with region-based algorithms, but a problem with the transition zones between the contrasted objects can be present.

Proceedings Article
01 Sep 2005
TL;DR: A method for smoke detection in video: since smoke is semi-transparent, edges in image frames lose their sharpness, leading to a decrease in the high frequency content of the image; periodic behavior in smoke boundaries and the convexity of smoke regions are also analyzed.
Abstract: A method for smoke detection in video is proposed. It is assumed the camera monitoring the scene is stationary. Since the smoke is semi-transparent, edges of image frames start losing their sharpness and this leads to a decrease in the high frequency content of the image. To determine the smoke in the field of view of the camera, the background of the scene is estimated and the decrease of high frequency energy of the scene is monitored using the spatial wavelet transforms of the current and the background images. Edges of the scene are especially important because they produce local extrema in the wavelet domain. A decrease in values of local extrema is also an indicator of smoke. In addition, the scene becomes grayish when there is smoke and this leads to a decrease in chrominance values of pixels. Periodic behavior in smoke boundaries and convexity of smoke regions are also analyzed. All of these clues are combined to reach a final decision.
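
The high-frequency test at the heart of the method can be sketched with a one-level Haar transform: the detail-band energy of the current frame is compared against that of the background estimate. The 50% threshold and the synthetic veil below are illustrative:

```python
import numpy as np

def highfreq_energy(img):
    """Energy of the first-level Haar detail bands (LH, HL, HH)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    lh, hl, hh = a - b + c - d, a + b - c - d, a - b - c + d
    return np.sum(lh**2 + hl**2 + hh**2)

rng = np.random.default_rng(10)
background = rng.uniform(0, 1, size=(64, 64))   # sharp, textured scene
smoke = 0.7 * 0.5 + 0.3 * background            # semi-transparent veil
print(highfreq_energy(smoke) < 0.5 * highfreq_energy(background))  # True
```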

Proceedings ArticleDOI
06 Jul 2005
TL;DR: The separable implementation of the bilateral filter offers equivalent adaptive filtering capability at a fraction of execution time compared to the traditional filter.
Abstract: Bilateral filtering is an edge-preserving filtering technique that employs both geometric closeness and photometric similarity of neighboring pixels to construct its filter kernel. Multi-dimensional bilateral filtering is computationally expensive because the adaptive kernel has to be recomputed at every pixel. In this paper, we present a separable implementation of the bilateral filter. The separable implementation offers equivalent adaptive filtering capability at a fraction of execution time compared to the traditional filter. Because of this efficiency, the separable bilateral filter can be used for fast preprocessing of images and videos. Experiments show that better image quality and higher compression efficiency is achievable if the original video is preprocessed with the separable bilateral filter.
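
A sketch of the separable approximation: one 1-D bilateral pass over the rows, then one over the columns, instead of recomputing a full 2-D adaptive kernel at every pixel. Parameter values are illustrative:

```python
import numpy as np

def bilateral_1d(line, sigma_s=2.0, sigma_r=0.1, radius=4):
    """1-D bilateral filter: spatial closeness times photometric similarity."""
    out = np.empty_like(line)
    g_s = np.exp(-np.arange(-radius, radius + 1)**2 / (2 * sigma_s**2))
    for i in range(len(line)):
        lo, hi = max(0, i - radius), min(len(line), i + radius + 1)
        win = line[lo:hi]
        w = g_s[lo - i + radius:hi - i + radius] * \
            np.exp(-(win - line[i])**2 / (2 * sigma_r**2))
        out[i] = np.sum(w * win) / np.sum(w)
    return out

def separable_bilateral(img):
    """Rows first, then columns: a fraction of the 2-D kernel's cost."""
    tmp = np.apply_along_axis(bilateral_1d, 1, img)
    return np.apply_along_axis(bilateral_1d, 0, tmp)

rng = np.random.default_rng(11)
step = np.zeros((32, 32)); step[:, 16:] = 1.0      # sharp edge
out = separable_bilateral(step + rng.normal(0, 0.05, step.shape))
print(out[16, 14].round(2), out[16, 18].round(2))  # noise smoothed, edge kept
```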

Proceedings ArticleDOI
Kuk-Jin Yoon, In So Kweon
20 Jun 2005
TL;DR: A new area-based method for visual correspondence search that focuses on the dissimilarity computation, which successfully produces piecewise smooth disparity maps while preserving sharp depth discontinuities accurately.
Abstract: In this paper, we present a new area-based method for visual correspondence search that focuses on the dissimilarity computation. Local and area-based matching methods generally measure the similarity (or dissimilarity) between the image pixels using local support windows. In this approach, an appropriate support window should be selected adaptively for each pixel to make the measure reliable and certain. Finding the optimal support window with an arbitrary shape and size is, however, very difficult and generally known as an NP-hard problem. For this reason, unlike the existing methods that try to find an optimal support window, we adjusted the support-weight of each pixel in a given support window. The adaptive support-weight of a pixel is computed based on the photometric and geometric relationship with the pixel under consideration. Dissimilarity is then computed using the raw matching costs and support-weights of both support windows, and the correspondence is finally selected by the WTA (winner-takes-all) method. The experimental results for the rectified real images show that the proposed method successfully produces piecewise smooth disparity maps while preserving sharp depth discontinuities accurately.
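
The support-weight computation can be sketched directly from the description above; the grayscale stand-in for color similarity and the parameter values are assumptions (the paper works in a perceptual color space):

```python
import numpy as np

def support_weights(win, gamma_c=7.0, gamma_p=9.0):
    """w = exp(-dc/gamma_c - dp/gamma_p): color similarity and spatial
    proximity to the window's center pixel."""
    r = win.shape[0] // 2
    dc = np.abs(win - win[r, r])                 # gray-level difference
    yy, xx = np.mgrid[:win.shape[0], :win.shape[1]]
    dp = np.hypot(yy - r, xx - r)                # distance to center
    return np.exp(-dc / gamma_c - dp / gamma_p)

def asw_cost(wl, wr):
    """Raw absolute differences aggregated with the product of both
    windows' support weights (the WTA step then picks the best disparity)."""
    w = support_weights(wl) * support_weights(wr)
    return np.sum(w * np.abs(wl - wr)) / np.sum(w)

rng = np.random.default_rng(12)
patch = rng.uniform(0, 255, size=(9, 9))
print(asw_cost(patch, patch))             # 0.0: perfect correspondence
print(asw_cost(patch, np.flipud(patch)))  # > 0: mismatching windows
```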