
Showing papers on "Pixel published in 2007"


Proceedings ArticleDOI
29 Jul 2007
TL;DR: In this article, seam carving is used for content-aware image resizing, for both reduction and expansion; a seam is an optimal 8-connected path of pixels on a single image from top to bottom, or left to right, where optimality is defined by an image energy function.
Abstract: Effective resizing of images should not only use geometric constraints, but consider the image content as well. We present a simple image operator called seam carving that supports content-aware image resizing for both reduction and expansion. A seam is an optimal 8-connected path of pixels on a single image from top to bottom, or left to right, where optimality is defined by an image energy function. By repeatedly carving out or inserting seams in one direction we can change the aspect ratio of an image. By applying these operators in both directions we can retarget the image to a new size. The selection and order of seams protect the content of the image, as defined by the energy function. Seam carving can also be used for image content enhancement and object removal. We support various visual saliency measures for defining the energy of an image, and can also include user input to guide the process. By storing the order of seams in an image we create multi-size images that are able to continuously change in real time to fit a given size.

1,652 citations
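The dynamic-programming core of seam carving is compact enough to sketch. Below is a minimal Python/NumPy illustration (not the authors' code) that removes one minimal-energy vertical seam using an L1 gradient-magnitude energy; the function name and defaults are our own.

```python
import numpy as np

def remove_vertical_seam(gray):
    """Remove one minimal-energy vertical seam (an 8-connected path of
    pixels, one per row) from a 2D grayscale image."""
    h, w = gray.shape
    # Energy: L1 gradient magnitude.
    gy, gx = np.gradient(gray.astype(float))
    energy = np.abs(gx) + np.abs(gy)

    # Dynamic programming: M[i, j] = energy[i, j] + min of the three
    # reachable neighbors in the row above.
    M = energy.copy()
    back = np.zeros((h, w), dtype=int)  # backtracking offsets (-1, 0, +1)
    for i in range(1, h):
        for j in range(w):
            lo, hi = max(j - 1, 0), min(j + 1, w - 1)
            k = lo + np.argmin(M[i - 1, lo:hi + 1])
            back[i, j] = k - j
            M[i, j] += M[i - 1, k]

    # Backtrack from the minimal entry in the last row.
    seam = np.zeros(h, dtype=int)
    seam[-1] = np.argmin(M[-1])
    for i in range(h - 2, -1, -1):
        seam[i] = seam[i + 1] + back[i + 1, seam[i + 1]]

    # Remove the seam pixel from each row.
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    return gray[mask].reshape(h, w - 1)
```

Repeating this (or the corresponding horizontal version) changes the aspect ratio; recording the seam order yields the multi-size images described above.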


Proceedings ArticleDOI
17 Jun 2007
TL;DR: An approach for measuring similarity between visual entities (images or videos) based on matching internal self-similarities, measured densely throughout the image/video, at multiple scales, while accounting for local and global geometric distortions is presented.
Abstract: We present an approach for measuring similarity between visual entities (images or videos) based on matching internal self-similarities. What is correlated across images (or across video sequences) is the internal layout of local self-similarities (up to some distortions), even though the patterns generating those local self-similarities are quite different in each of the images/videos. These internal self-similarities are efficiently captured by a compact local "self-similarity descriptor," measured densely throughout the image/video, at multiple scales, while accounting for local and global geometric distortions. This gives rise to matching capabilities of complex visual data, including detection of objects in real cluttered images using only rough hand-sketches, handling textured objects with no clear boundaries, and detecting complex actions in cluttered video data with no prior learning. We compare our measure to commonly used image-based and video-based similarity measures, and demonstrate its applicability to object detection, retrieval, and action detection.

1,162 citations
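A rough sketch of how such a local self-similarity descriptor can be computed at one pixel is given below in Python/NumPy. Patch, region, and binning parameters are illustrative, the caller must keep (y, x) away from the image border, and the paper's exact normalization details are omitted.

```python
import numpy as np

def self_similarity_descriptor(img, y, x, patch=5, region=41,
                               n_angles=8, n_radii=4, var_noise=25.0**2):
    """Simplified local self-similarity descriptor at pixel (y, x)."""
    pr, rr = patch // 2, region // 2
    center = img[y - pr:y + pr + 1, x - pr:x + pr + 1].astype(float)

    # SSD "correlation surface" between the central patch and every
    # patch in the surrounding region.
    surf = np.empty((region, region))
    for dy in range(-rr, rr + 1):
        for dx in range(-rr, rr + 1):
            cand = img[y + dy - pr:y + dy + pr + 1,
                       x + dx - pr:x + dx + pr + 1].astype(float)
            surf[dy + rr, dx + rr] = np.sum((center - cand) ** 2)
    surf = np.exp(-surf / var_noise)  # SSD -> similarity in [0, 1]

    # Max-pool the surface into log-polar bins, which gives robustness
    # to small local geometric distortions.
    ys, xs = np.mgrid[-rr:rr + 1, -rr:rr + 1]
    radius = np.hypot(ys, xs)
    angle = np.mod(np.arctan2(ys, xs), 2 * np.pi)
    rad_edges = np.logspace(0, np.log10(rr), n_radii + 1)
    desc = np.zeros((n_radii, n_angles))
    for i in range(n_radii):
        for j in range(n_angles):
            sel = ((radius >= rad_edges[i]) & (radius < rad_edges[i + 1]) &
                   (angle >= j * 2 * np.pi / n_angles) &
                   (angle < (j + 1) * 2 * np.pi / n_angles))
            if sel.any():
                desc[i, j] = surf[sel].max()
    return desc.ravel()  # in practice, computed densely and normalized
```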


Journal ArticleDOI
TL;DR: This work presents a technique for real-time adaptive thresholding using the integral image of the input, an extension of a previous method that is more robust to illumination changes in the image.
Abstract: Image thresholding is a common task in many computer vision and graphics applications. The goal of thresholding an image is to classify pixels as either "dark" or "light." Adaptive thresholding is a form of thresholding that takes into account spatial variations in illumination. We present a technique for real-time adaptive thresholding using the integral image of the input. Our technique is an extension of a previous method. However, our solution is more robust to illumination changes in the image. Additionally, our method is simple and easy to implement. Our technique is suitable for processing live video streams at a real-time frame-rate, making it a valuable tool for interactive applications such as augmented reality. Source code is available online.

1,041 citations
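The method lends itself to a very short implementation. The following loop-based Python/NumPy sketch follows the integral-image idea (a pixel is marked dark when it falls a fraction t below its local window mean); the window size and t are illustrative defaults, not values from the paper.

```python
import numpy as np

def adaptive_threshold(gray, window=15, t=0.15):
    """Adaptive thresholding with an integral image: each pixel is
    compared against the mean of a window centered on it; the integral
    image makes every window sum an O(1) lookup."""
    gray = gray.astype(np.float64)
    h, w = gray.shape
    # Integral image padded with a zero row/column for easy window sums.
    ii = np.pad(gray.cumsum(0).cumsum(1), ((1, 0), (1, 0)))

    r = window // 2
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        y0, y1 = max(y - r, 0), min(y + r + 1, h)
        for x in range(w):
            x0, x1 = max(x - r, 0), min(x + r + 1, w)
            n = (y1 - y0) * (x1 - x0)
            s = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
            # "Light" (255) unless the pixel is t fraction below the mean.
            out[y, x] = 0 if gray[y, x] * n <= s * (1.0 - t) else 255
    return out
```

Because the per-pixel work is constant regardless of window size, this structure is what makes real-time operation on live video plausible.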


Proceedings ArticleDOI
17 Jun 2007
TL;DR: A novel algorithm for calibrated multi-view stereopsis that outputs a (quasi) dense set of rectangular patches covering the surfaces visible in the input images, which is currently the top performer in terms of both coverage and accuracy for four of the six benchmark datasets presented in [20].
Abstract: This paper proposes a novel algorithm for calibrated multi-view stereopsis that outputs a (quasi) dense set of rectangular patches covering the surfaces visible in the input images. This algorithm does not require any initialization in the form of a bounding volume, and it detects and discards automatically outliers and obstacles. It does not perform any smoothing across nearby features, yet is currently the top performer in terms of both coverage and accuracy for four of the six benchmark datasets presented in [20]. The keys to its performance are effective techniques for enforcing local photometric consistency and global visibility constraints. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints, and repeatedly expanding these to nearby pixel correspondences before using visibility constraints to filter away false matches. A simple but effective method for turning the resulting patch model into a mesh appropriate for image-based modeling is also presented. The proposed approach is demonstrated on various datasets including objects with fine surface details, deep concavities, and thin structures, outdoor scenes observed from a restricted set of viewpoints, and "crowded" scenes where moving obstacles appear in different places in multiple images of a static structure of interest.

996 citations


Journal ArticleDOI
TL;DR: In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed and two segmentation methods are considered.
Abstract: In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed. A line detector, previously used in mammography, is applied to the green channel of the retinal image. It is based on the evaluation of the average grey level along lines of fixed length passing through the target pixel at different orientations. Two segmentation methods are considered. The first uses the basic line detector whose response is thresholded to obtain unsupervised pixel classification. As a further development, we employ two orthogonal line detectors along with the grey level of the target pixel to construct a feature vector for supervised classification using a support vector machine. The effectiveness of both methods is demonstrated through receiver operating characteristic analysis on two publicly available databases of color fundus images.

819 citations
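In spirit, the basic line detector can be sketched as follows in Python/NumPy: for each pixel, take the maximum average grey level over lines of fixed length at several orientations, and subtract the local window mean. Parameter values are illustrative, and since vessels are darker than the background, the paper applies the detector to the inverted green channel.

```python
import numpy as np

def line_detector_response(green, length=15, n_orient=12, window=15):
    """Basic line-detector response: max average grey level along a line
    of fixed length through each pixel, minus the local window mean.
    Thresholding the response gives an unsupervised vessel map."""
    img = green.astype(float)
    h, w = img.shape
    r = length // 2
    pad = np.pad(img, r, mode='reflect')

    # Precompute integer pixel offsets along each orientation.
    lines = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        ts = np.arange(-r, r + 1)
        dy = np.round(ts * np.sin(theta)).astype(int)
        dx = np.round(ts * np.cos(theta)).astype(int)
        lines.append((dy, dx))

    # Best (maximal) line average over all orientations, per pixel.
    best = np.full((h, w), -np.inf)
    for dy, dx in lines:
        acc = np.zeros((h, w))
        for oy, ox in zip(dy, dx):
            acc += pad[r + oy:r + oy + h, r + ox:r + ox + w]
        best = np.maximum(best, acc / length)

    # Local window mean via a box filter with the same reflect padding.
    rw = window // 2
    padw = np.pad(img, rw, mode='reflect')
    mean = np.zeros((h, w))
    for oy in range(-rw, rw + 1):
        for ox in range(-rw, rw + 1):
            mean += padw[rw + oy:rw + oy + h, rw + ox:rw + ox + w]
    mean /= window * window

    return best - mean  # high response on line-like (vessel) pixels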


Journal ArticleDOI
TL;DR: The proposed technique would also allow precise coregistration of images for the measurement of surface displacements due to ice-flow or geomorphic processes, or for any other change detection applications.
Abstract: We describe a procedure to accurately measure ground deformations from optical satellite images. Precise orthorectification is obtained owing to an optimized model of the imaging system, where look directions are linearly corrected to compensate for attitude drifts, and sensor orientation uncertainties are accounted for. We introduce a new computation of the inverse projection matrices for which a rigorous resampling is proposed. The irregular resampling problem is explicitly addressed to avoid introducing aliasing in the ortho-rectified images. Image registration and correlation are achieved with a new iterative unbiased processor that estimates the phase plane in the Fourier domain for subpixel shift detection. Without using supplementary data, raw images are warped onto the digital elevation model and coregistered with a 1/50 pixel accuracy. The procedure applies to images from any pushbroom imaging system. We analyze its performance using Satellite pour l'Observation de la Terre (SPOT) images in the case of a null test (no coseismic deformation) and in the case of large coseismic deformations due to the Mw 7.1 Hector Mine, California, earthquake of 1999. The proposed technique would also allow precise coregistration of images for the measurement of surface displacements due to ice-flow or geomorphic processes, or for any other change detection applications. A complete software package, the Coregistration of Optically Sensed Images and Correlation, is available for download from the Caltech Tectonics Observatory website.

777 citations
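For intuition, a standard phase-correlation estimator with parabolic sub-pixel refinement is sketched below in Python/NumPy. Note this is a simpler stand-in for the paper's iterative, unbiased phase-plane estimation in the Fourier domain, not the published processor.

```python
import numpy as np

def phase_correlation_shift(a, b, eps=1e-12):
    """Estimate the translation between two same-sized images by phase
    correlation, refining the correlation peak with a 1D parabolic fit
    along each axis. Returns (dy, dx) such that b(x) ~ a(x - shift)."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.abs(R) + eps                   # keep only the phase
    corr = np.fft.ifft2(R).real

    peak = np.unravel_index(np.argmax(corr), corr.shape)
    shift = []
    for axis, p in enumerate(peak):
        n = corr.shape[axis]
        # Peak neighbors along this axis (with wrap-around).
        idx = list(peak)
        idx[axis] = (p - 1) % n
        c_m = corr[tuple(idx)]
        idx[axis] = (p + 1) % n
        c_p = corr[tuple(idx)]
        c_0 = corr[peak]
        # Parabolic interpolation of the peak position.
        denom = c_m - 2 * c_0 + c_p
        delta = 0.5 * (c_m - c_p) / denom if denom != 0 else 0.0
        s = p + delta
        if s > n / 2:                      # map to a signed shift
            s -= n
        shift.append(s)
    return tuple(shift)
```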


Journal ArticleDOI
TL;DR: A new decision-based algorithm is proposed for restoration of images that are highly corrupted by impulse noise; it removes the noise effectively even at noise levels as high as 90% and preserves the edges without any loss up to a noise level of 80%.
Abstract: A new decision-based algorithm is proposed for restoration of images that are highly corrupted by impulse noise. The new algorithm shows significantly better image quality than a standard median filter (SMF), adaptive median filters (AMF), a threshold decomposition filter (TDF), cascade, and recursive nonlinear filters. The proposed method, unlike other nonlinear filters, replaces only corrupted pixels, using the median value or a neighboring pixel value. As a result, the proposed method removes the noise effectively even at noise levels as high as 90% and preserves the edges without any loss up to a noise level of 80%. The proposed algorithm (PA) is tested on different images and is found to produce better results in terms of the qualitative and quantitative measures of the image.

679 citations
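A minimal Python/NumPy sketch of the decision-based idea follows, assuming salt-and-pepper noise so that corrupted pixels take the extreme values 0 or 255; the window size and fallback rule are simplified relative to the paper.

```python
import numpy as np

def decision_based_filter(img):
    """Decision-based impulse-noise removal: only pixels at the extreme
    values (assumed salt-and-pepper corruption) are replaced, by the
    median of uncorrupted neighbors, or by an already-processed neighbor
    when the whole window is corrupted. Clean pixels are left untouched,
    which is what preserves edges at high noise densities."""
    out = img.astype(np.int32).copy()
    h, w = out.shape
    noisy = (img == 0) | (img == 255)
    for y in range(h):
        for x in range(w):
            if not noisy[y, x]:
                continue
            y0, y1 = max(y - 1, 0), min(y + 2, h)
            x0, x1 = max(x - 1, 0), min(x + 2, w)
            win = img[y0:y1, x0:x1]
            good = win[(win != 0) & (win != 255)]
            if good.size:
                out[y, x] = int(np.median(good))
            elif x > 0:
                out[y, x] = out[y, x - 1]   # previously processed neighbor
            elif y > 0:
                out[y, x] = out[y - 1, x]
            # else: top-left pixel with no clean neighbors is left as-is
    return out.astype(np.uint8)
```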


Proceedings ArticleDOI
29 Jul 2007
TL;DR: A novel design to reconstruct the 4D light field from a 2D camera image without any additional refractive elements as required by previous light field cameras is presented.
Abstract: We describe a theoretical framework for reversibly modulating 4D light fields using an attenuating mask in the optical path of a lens based camera. Based on this framework, we present a novel design to reconstruct the 4D light field from a 2D camera image without any additional refractive elements as required by previous light field cameras. The patterned mask attenuates light rays inside the camera instead of bending them, and the attenuation recoverably encodes the rays on the 2D sensor. Our mask-equipped camera focuses just as a traditional camera to capture conventional 2D photos at full sensor resolution, but the raw pixel values also hold a modulated 4D light field. The light field can be recovered by rearranging the tiles of the 2D Fourier transform of sensor values into 4D planes, and computing the inverse Fourier transform. In addition, one can also recover the full resolution image information for the in-focus parts of the scene. We also show how a broadband mask placed at the lens enables us to compute refocused images at full sensor resolution for layered Lambertian scenes. This partial encoding of 4D ray-space data enables editing of image contents by depth, yet does not require computational recovery of the complete 4D light field.

660 citations


Journal ArticleDOI
TL;DR: A new similarity measure for automatic change detection in multitemporal synthetic aperture radar images based on the evolution of the local statistics of the image between two dates, which allows a multiscale approach in the change detection for operational use.
Abstract: In this paper, we present a new similarity measure for automatic change detection in multitemporal synthetic aperture radar images. This measure is based on the evolution of the local statistics of the image between two dates. The local statistics are estimated by using a cumulant-based series expansion, which approximates probability density functions in the neighborhood of each pixel in the image. The degree of evolution of the local statistics is measured using the Kullback-Leibler divergence. An analytical expression for this detector is given, allowing a simple computation which depends only on the first four statistical moments of the pixels inside the analysis window. The proposed change indicator is compared to the classical mean ratio detector and also to other model-based approaches. Tests on the simulated and real data show that our detector outperforms all the others. The fast computation of the proposed detector allows a multiscale approach in the change detection for operational use. The so-called multiscale change profile (MCP) is introduced to yield change information on a wide range of scales and to better characterize the appropriate scale. Two simple yet useful examples of applications show that the MCP allows the design of change indicators, which provide better results than a monoscale analysis.

500 citations
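The detector's structure can be illustrated with a simplified sketch. The paper approximates local probability density functions with a cumulant-based (Edgeworth) series expansion; the Python/NumPy code below substitutes a plain Gaussian approximation instead, computing the symmetric Kullback-Leibler divergence between the local statistics of the two dates. The window size is illustrative.

```python
import numpy as np

def kl_change_map(img1, img2, half=8, eps=1e-6):
    """Local-statistics change detector between two co-registered images,
    using the symmetric KL divergence between local Gaussian fits
    (a simplification of the paper's Edgeworth-expansion densities)."""
    img1 = img1.astype(float)
    img2 = img2.astype(float)
    h, w = img1.shape
    out = np.zeros((h, w))
    for y in range(half, h - half):
        for x in range(half, w - half):
            w1 = img1[y - half:y + half + 1, x - half:x + half + 1]
            w2 = img2[y - half:y + half + 1, x - half:x + half + 1]
            m1, v1 = w1.mean(), w1.var() + eps
            m2, v2 = w2.mean(), w2.var() + eps
            # KL(p1||p2) + KL(p2||p1) for two univariate Gaussians.
            out[y, x] = 0.5 * ((v1 / v2 + v2 / v1 - 2)
                               + (m1 - m2) ** 2 * (1 / v1 + 1 / v2))
    return out  # high values indicate likely change
```

Running this at several window sizes and stacking the responses gives the flavor of the multiscale change profile described above.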


Journal ArticleDOI
TL;DR: This paper addresses unsupervised change detection by proposing a proper framework for a formal definition and a theoretical study of the change vector analysis (CVA) technique and the results obtained confirm the interest of the proposed framework and the validity of the related theoretical analysis.
Abstract: This paper addresses unsupervised change detection by proposing a proper framework for a formal definition and a theoretical study of the change vector analysis (CVA) technique. This framework, which is based on the representation of the CVA in polar coordinates, aims at: 1) introducing a set of formal definitions in the polar domain (which are linked to the properties of the data) for a better general description (and thus understanding) of the information present in spectral change vectors; 2) analyzing from a theoretical point of view the distributions of changed and unchanged pixels in the polar domain (also according to possible simplifying assumptions); 3) driving the implementation of proper preprocessing procedures to be applied to multitemporal images on the basis of the results of the theoretical study on the distributions; and 4) defining a solid background for the development of advanced and accurate automatic change-detection algorithms in the polar domain. The findings derived from the theoretical analysis on the statistical models of classes have been validated on real multispectral and multitemporal remote sensing images according to both qualitative and quantitative analyses. The results obtained confirm the interest of the proposed framework and the validity of the related theoretical analysis.

486 citations
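The polar representation at the heart of the framework is straightforward to compute. The Python/NumPy sketch below derives the magnitude and direction of the spectral change vectors per pixel; the two-band direction computation is a common simplification, not the paper's full formulation.

```python
import numpy as np

def cva_polar(x1, x2, band_a=0, band_b=1):
    """Change vector analysis in polar coordinates for two co-registered
    multispectral images of shape (bands, H, W): rho is the magnitude of
    the spectral change vector, theta its direction from two chosen bands."""
    diff = x2.astype(float) - x1.astype(float)   # spectral change vectors
    rho = np.sqrt((diff ** 2).sum(axis=0))       # change intensity
    theta = np.mod(np.arctan2(diff[band_b], diff[band_a]), 2 * np.pi)
    # Unchanged pixels cluster at low rho; changed pixels separate into
    # annular sectors of the (rho, theta) polar domain.
    return rho, theta
```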


Journal ArticleDOI
TL;DR: In this paper, morphological image processing is used to classify spatial patterns at the pixel level on binary land-cover maps; patterns are labeled as perforated, edge, patch, or core.
Abstract: We use morphological image processing for classifying spatial patterns at the pixel level on binary land-cover maps. Land-cover pattern is classified as 'perforated,' 'edge,' 'patch,' and 'core' with higher spatial precision and thematic accuracy compared to a previous approach based on image convolution, while retaining the capability to label these features at the pixel level for any scale of observation. The implementation of morphological image processing is explained and then demonstrated, with comparisons to results from image convolution, for a forest map of the Val Grande National Park in northern Italy.
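A simplified Python sketch of pixel-level pattern classification in this spirit is shown below, using SciPy's morphology operations. The class definitions here (core via erosion, perforated vs. edge via interior holes vs. exterior background, patch as core-free components) approximate the idea and differ from the published method in detail.

```python
import numpy as np
from scipy import ndimage as ndi

def classify_pattern(forest, size=3):
    """Pixel-level pattern classes on a binary land-cover map:
    0 background, 1 edge, 2 perforated, 3 core, 4 patch."""
    forest = forest.astype(bool)
    struct = np.ones((size, size), dtype=bool)

    core = ndi.binary_erosion(forest, structure=struct)  # interior forest

    # Split background into exterior (touches the map border) and holes.
    bg = ~forest
    lbl, _ = ndi.label(bg)
    border_labels = np.unique(np.concatenate(
        [lbl[0], lbl[-1], lbl[:, 0], lbl[:, -1]]))
    holes = bg & ~np.isin(lbl, border_labels)

    # Non-core forest bordering a hole is 'perforated'; bordering the
    # exterior background it is 'edge'.
    boundary = forest & ~core
    near_hole = ndi.binary_dilation(holes, structure=struct)
    perforated = boundary & near_hole
    edge = boundary & ~near_hole

    # Forest components containing no core pixels at all are 'patch'.
    flbl, _ = ndi.label(forest)
    patch = forest & ~np.isin(flbl, np.unique(flbl[core]))

    out = np.zeros(forest.shape, dtype=np.uint8)
    out[edge] = 1
    out[perforated] = 2
    out[core] = 3
    out[patch] = 4  # patch overrides edge labels in core-free components
    return out
```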

Proceedings ArticleDOI
17 Jun 2007
TL;DR: This work analyzes the weaknesses of previous matting approaches, proposes a new robust matting algorithm, and presents an extensive and quantitative comparison between the algorithm and a number of previous approaches in hopes of providing a benchmark for future matting research.
Abstract: Image matting is the problem of determining for each pixel in an image whether it is foreground, background, or the mixing parameter, "alpha", for those pixels that are a mixture of foreground and background. Matting is inherently an ill-posed problem. Previous matting approaches either use naive color sampling methods to estimate foreground and background colors for unknown pixels, or use propagation-based methods to avoid color sampling under weak assumptions about image statistics. We argue that neither method itself is enough to generate good results for complex natural images. We analyze the weaknesses of previous matting approaches, and propose a new robust matting algorithm. In our approach we also sample foreground and background colors for unknown pixels, but more importantly, analyze the confidence of these samples. Only high confidence samples are chosen to contribute to the matting energy function which is minimized by a Random Walk. The energy function we define also contains a neighborhood term to enforce the smoothness of the matte. To validate the approach, we present an extensive and quantitative comparison between our algorithm and a number of previous approaches in hopes of providing a benchmark for future matting research.
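The per-pixel building block, estimating alpha from one foreground/background sample pair and scoring the pair's confidence by how well it explains the observed color, can be sketched in a few lines of Python/NumPy. The confidence formula here is a simplified stand-in for the paper's sample-confidence analysis, and sigma is an illustrative parameter.

```python
import numpy as np

def alpha_from_samples(c, f, b, sigma=0.1):
    """Estimate alpha for observed color c given a foreground sample f and
    a background sample b (RGB in [0, 1]), plus a confidence for the pair.
    Alpha is the standard projection onto the F-B color line; confidence
    decays with the residual of the compositing equation, normalized by
    the F-B separation."""
    c, f, b = (np.asarray(v, dtype=float) for v in (c, f, b))
    fb = f - b
    denom = fb @ fb
    if denom < 1e-12:
        return 0.0, 0.0                    # degenerate pair: no confidence
    alpha = np.clip((c - b) @ fb / denom, 0.0, 1.0)
    # Residual of the compositing equation c = alpha*f + (1 - alpha)*b.
    residual = np.linalg.norm(c - (alpha * f + (1 - alpha) * b))
    confidence = np.exp(-(residual ** 2) / (sigma ** 2 * denom))
    return alpha, confidence
```

In the paper, only high-confidence pairs contribute data terms to the energy that is then minimized with a Random Walk; low-confidence pixels are resolved mostly by the smoothness term.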

Journal ArticleDOI
TL;DR: In this paper, the authors used multiple endmember spectral mixture analysis (MESMA) to map the physical components of urban land cover for the city of Manaus, Brazil, using Landsat Enhanced Thematic Mapper (ETM+) imagery.

Proceedings ArticleDOI
26 Dec 2007
TL;DR: A viewpoint-based approach for the quick fusion of multiple stereo depth maps by selecting depth estimates for each pixel that minimize violations of visibility constraints and thus remove errors and inconsistencies from the depth maps to produce a consistent surface.
Abstract: We present a viewpoint-based approach for the quick fusion of multiple stereo depth maps. Our method selects depth estimates for each pixel that minimize violations of visibility constraints and thus remove errors and inconsistencies from the depth maps to produce a consistent surface. We advocate a two-stage process in which the first stage generates potentially noisy, overlapping depth maps from a set of calibrated images and the second stage fuses these depth maps to obtain an integrated surface with higher accuracy, suppressed noise, and reduced redundancy. We show that by dividing the processing into two stages we are able to achieve a very high throughput because we are able to use a computationally cheap stereo algorithm and because this architecture is amenable to hardware-accelerated (GPU) implementations. A rigorous formulation based on the notion of stability of a depth estimate is presented first. It aims to determine the validity of a depth estimate by rendering multiple depth maps into the reference view as well as rendering the reference depth map into the other views in order to detect occlusions and free-space violations. We also present an approximate alternative formulation that selects and validates only one hypothesis based on confidence. Both formulations enable us to perform video-based reconstruction at up to 25 frames per second. We show results on the multi-view stereo evaluation benchmark datasets and several outdoors video sequences. Extensive quantitative analysis is performed using an accurately surveyed model of a real building as ground truth.

Proceedings ArticleDOI
17 Jun 2007
TL;DR: This paper introduces a regularized subspace learning model using a Laplacian penalty to constrain the coefficients to be spatially smooth; existing subspace learning algorithms fit into this model and produce spatially smooth subspaces that are better for image representation than their original versions, with results shown on face recognition.
Abstract: Subspace learning based face recognition methods have attracted considerable interest in recent years, including principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projection (LPP), neighborhood preserving embedding (NPE), marginal Fisher analysis (MFA) and local discriminant embedding (LDE). These methods consider an n1×n2 image as a vector in R^(n1×n2), and the pixels of each image are considered as independent. However, an image represented in the plane is intrinsically a matrix, and pixels spatially close to each other may be correlated. Even though we have n1×n2 pixels per image, this spatial correlation suggests that the true number of degrees of freedom is far smaller. In this paper, we introduce a regularized subspace learning model using a Laplacian penalty to constrain the coefficients to be spatially smooth. All these existing subspace learning algorithms can fit into this model and produce a spatially smooth subspace which is better for image representation than their original version. Recognition, clustering and retrieval can then be performed in the image subspace. Experimental results on face recognition demonstrate the effectiveness of our method.

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the software tools for reconstruction and analysis of tomographic data that are being developed at the UGCT, and the analysis of the 3D data focuses primarily on the characterization of pore structures, but will be extended to other applications.
Abstract: The technique of X-ray microtomography using X-ray tube radiation offers an interesting tool for the non-destructive investigation of a wide range of materials. A major challenge lies in the analysis and quantification of the resulting data, allowing for a full characterization of the sample under investigation. In this paper, we discuss the software tools for reconstruction and analysis of tomographic data that are being developed at the UGCT. The tomographic reconstruction is performed using Octopus, a high-performance and user-friendly software package. The reconstruction process transforms the raw acquisition data into a stack of 2D cross-sections through the sample, resulting in a 3D data set. A number of artifact and noise reduction algorithms are integrated to reduce ring artifacts, beam hardening artifacts, COR misalignment, detector or stage tilt, pixel non-linearities, etc. These corrections are very important to facilitate the analysis of the 3D data. The analysis of the 3D data focuses primarily on the characterization of pore structures, but will be extended to other applications. A first package for the analysis of pore structures in three dimensions was developed under Matlab®. A new package, called Morpho+, is being developed in a C++ environment, with optimizations and extensions of the previously used algorithms. The current status of this project will be discussed. Examples of pore analysis can be found in pharmaceuticals, material science, geology and numerous other fields.

Proceedings ArticleDOI
17 Jun 2007
TL;DR: A closed-form expression for the motion error in order to apply motion compensation on a pixel level is developed and the resulting scanning system can capture accurate depth maps of complex dynamic scenes at 17 fps and can cope with both rigid and deformable objects.
Abstract: We present a novel 3D scanning system combining stereo and active illumination based on phase-shift for robust and accurate scene reconstruction. Stereo overcomes the traditional phase discontinuity problem and allows for the reconstruction of complex scenes containing multiple objects. Due to the sequential recording of three patterns, motion will introduce artifacts in the reconstruction. We develop a closed-form expression for the motion error in order to apply motion compensation on a pixel level. The resulting scanning system can capture accurate depth maps of complex dynamic scenes at 17 fps and can cope with both rigid and deformable objects.

Patent
10 May 2007
TL;DR: In this paper, an image sensor for capturing a color image is disclosed having a two-dimensional array having first and second groups of pixels wherein pixels from the first group of pixels have narrower spectral photoresponses than pixels from a second group.
Abstract: An image sensor for capturing a color image is disclosed having a two-dimensional array having first and second groups of pixels wherein pixels from the first group of pixels have narrower spectral photoresponses than pixels from the second group of pixels and wherein the first group of pixels has individual pixels that have spectral photoresponses that correspond to a set of at least two colors, with the placement of the first and second groups of pixels defining a pattern that has a minimal repeating unit including at least six pixels with at least some rows or columns of the minimal repeating unit composed only of pixels from the second group of pixels, and including ways to combine similarly positioned pixels from at least two adjacent minimal repeating units.

Patent
Yi-Ren Ng
30 Nov 2007
TL;DR: In this article, a set of images is computed corresponding to the digital photographic image and focused at different depths, and at least a portion of the image is refocused at a desired refocus depth determined from the look-up table.
Abstract: A method is performed to refocus a digital photographic image comprising a plurality of pixels. In the method, a set of images is computed corresponding to the digital photographic image and focused at different depths. Refocus depths for at least a subset of the pixels are identified and stored in a look-up table. At least a portion of the digital photographic image is refocused at a desired refocus depth determined from the look-up table.

Patent
Tatsuro Yamazaki, Naoto Abe, Eisaku Tatsumi, Makiko Mori, Muneki Ando, Takeshi Ikeda
20 Feb 2007
TL;DR: In this paper, a drive circuit of an image display apparatus has a correction circuit for outputting driving data that is corrected on the basis of a correction value that corrects variation of brightness of a plurality of pixels.
Abstract: A drive circuit of an image display apparatus has a correction circuit for outputting driving data that is corrected on the basis of a correction value. The correction value is a correction value that corrects variation of brightness of a plurality of pixels. The correction on the basis of the correction value is a correction such that the number of pixels to be darkened by the correction when the driving data inputted for the plurality of pixels have a common first value is fewer than the number of pixels to be darkened by the correction when the driving data inputted for the plurality of pixels have a common second value larger than the first value.

Proceedings ArticleDOI
17 Jun 2007
TL;DR: A novel colour-based affine covariant region detector is introduced, with an edge significance measure based on a Poisson image noise model that performs better than the commonly used Euclidean distance; the paper also extends the state of the art in feature repeatability tests.
Abstract: This paper introduces a novel colour-based affine covariant region detector. Our algorithm is an extension of the maximally stable extremal region (MSER) to colour. The extension to colour is done by looking at successive time-steps of an agglomerative clustering of image pixels. The selection of time-steps is stabilised against intensity scalings and image blur by modelling the distribution of edge magnitudes. The algorithm contains a novel edge significance measure based on a Poisson image noise model, which we show performs better than the commonly used Euclidean distance. We compare our algorithm to the original MSER detector and a competing colour-based blob feature detector, and show through a repeatability test that our detector performs better. We also extend the state of the art in feature repeatability tests, by using scenes consisting of two planes where one is piecewise transparent. This new test is able to evaluate how stable a feature is against changing backgrounds.

Proceedings ArticleDOI
17 Jun 2007
TL;DR: A robust multi-layer background subtraction technique is proposed that takes advantage of local texture features represented by local binary patterns (LBP) and photometric invariant color measurements in RGB color space, and that implicitly smooths detection results over regions of similar intensity while preserving object boundaries.
Abstract: In this paper, we propose a robust multi-layer background subtraction technique which takes advantage of local texture features represented by local binary patterns (LBP) and photometric invariant color measurements in RGB color space. LBP works robustly with respect to illumination variation on rich texture regions but not so efficiently on uniform regions; in the latter case, color information should overcome LBP's limitation. Due to the illumination invariance of both the LBP feature and the selected color feature, the method is able to handle local illumination changes such as cast shadows from moving objects. Due to the use of a simple layer-based strategy, the approach can model moving background pixels with quasi-periodic flickering as well as background scenes which may vary over time due to the addition and removal of long-time stationary objects. Finally, the use of a cross-bilateral filter makes it possible to implicitly smooth detection results over regions of similar intensity and preserve object boundaries. Numerical and qualitative experimental results on both simulated and real data demonstrate the robustness of the proposed method.
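For reference, the basic 8-neighbor LBP code used as the texture feature can be computed as below in Python/NumPy; the paper builds per-region histograms of such codes and maintains multiple background layers, which this sketch omits.

```python
import numpy as np

def lbp_8_1(gray):
    """Basic 8-neighbor local binary pattern for each interior pixel:
    each neighbor contributes one bit, set when its value is >= the
    center value. Returns a code image with values in [0, 255]."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]
    # Neighbor offsets in clockwise order starting at the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.int32) << bit
    return code
```

Because the code depends only on the ordering of intensities around each pixel, it is invariant to monotonic illumination changes, which is what makes it attractive for background modeling.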

Proceedings ArticleDOI
TL;DR: A framework for compressive classification that operates directly on the compressive measurements without first reconstructing the image is proposed, and the effectiveness of the smashed filter for target classification using very few measurements is demonstrated.
Abstract: The theory of compressive sensing (CS) enables the reconstruction of a sparse or compressible image or signal from a small set of linear, non-adaptive (even random) projections. However, in many applications, including object and target recognition, we are ultimately interested in making a decision about an image rather than computing a reconstruction. We propose here a framework for compressive classification that operates directly on the compressive measurements without first reconstructing the image. We dub the resulting dimensionally reduced matched filter the smashed filter. The first part of the theory maps traditional maximum likelihood hypothesis testing into the compressive domain; we find that the number of measurements required for a given classification performance level does not depend on the sparsity or compressibility of the images but only on the noise level. The second part of the theory applies the generalized maximum likelihood method to deal with unknown transformations such as the translation, scale, or viewing angle of a target object. We exploit the fact that the set of transformed images forms a low-dimensional, nonlinear manifold in the high-dimensional image space. We find that the number of measurements required for a given classification performance level grows linearly in the dimensionality of the manifold but only logarithmically in the number of pixels/samples and image classes. Using both simulations and measurements from a new single-pixel compressive camera, we demonstrate the effectiveness of the smashed filter for target classification using very few measurements.
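The first part of the theory reduces to a very small amount of code when the transformation is known: project each class hypothesis through the same measurement matrix and pick the closest one. The Python/NumPy sketch below illustrates this; the names, shapes, and toy data are ours, not the paper's.

```python
import numpy as np

def smashed_filter_classify(y, phi, templates):
    """Smashed-filter classification: a matched filter applied directly to
    compressive measurements y = phi @ x. Under white Gaussian noise, the
    maximum-likelihood decision picks the class whose projected template
    is closest to the measurement vector."""
    projected = templates @ phi.T              # hypotheses in measurement space
    d = np.linalg.norm(projected - y, axis=1)  # distance of each class to y
    return int(np.argmin(d))

# Toy usage: 3 classes, 4096-pixel images, only 64 random measurements.
rng = np.random.default_rng(0)
n_pixels, n_meas = 4096, 64
phi = rng.standard_normal((n_meas, n_pixels)) / np.sqrt(n_meas)
templates = rng.standard_normal((3, n_pixels))
x = templates[1] + 0.05 * rng.standard_normal(n_pixels)  # noisy class-1 image
y = phi @ x
assert smashed_filter_classify(y, phi, templates) == 1
```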

Patent
12 Oct 2007
TL;DR: In this paper, a display device is described that is capable of compensating for unevenness in brightness caused by physical restrictions of the display device, or for degradation in image quality caused by a partial reduction in contrast occurring in local dimming technology, using human visual characteristics.
Abstract: Disclosed is a display device capable of compensating for unevenness in brightness caused by physical restrictions of a display device, or for degradation in image quality caused by a partial reduction in contrast occurring in local dimming technology, using human visual characteristics. A liquid crystal panel (101) modulates illuminating light in accordance with the transmittance, and displays images on a screen. A backlight (102) emits the illuminating light to the liquid crystal panel (101) such that amounts of the illuminating light differ for each light emitting area of the screen. A backlight control unit (106) controls emission brightness of the backlight (102) for each light emitting area. A local gradation converting unit (104) performs gradation conversion on an image signal, and acquires a brightness value for each pixel after the conversion. A backlight driving unit (107) controls the transmittance for each pixel on the basis of the acquired brightness values after the conversion. The local gradation converting unit sets conversion characteristics for pixels to be processed in the image signal such that the brightness values of the pixels to be processed become lower as the lightness of the periphery of the pixels to be processed becomes higher, and performs gradation conversion using the set conversion characteristics.

Patent
01 Mar 2007
TL;DR: In this paper, a method for segmenting video data into foreground and background (324) portions utilizes statistical modeling of the pixels Λ statistical model of the background is built for each pixel, and each pixel in an incoming video frame is compared with the background statistical model for that pixel.
Abstract: A method for segmenting video data into foreground and background (324) portions utilizes statistical modeling of the pixels Λ statistical model of the background is built for each pixel, and each pixel in an incoming video frame is compared (326) with the background statistical model for that pixel. Pixels are determined to be foreground or background based on the comparisons. The method for segmenting video data may be further incorporated into a method for implementing an intelligent video surveillance system The method for segmenting video data may be implemented in hardware.
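A minimal per-pixel statistical background model of the kind the patent describes can be sketched as a running Gaussian per pixel, as below in Python/NumPy; the learning rate, threshold, and update rule are illustrative choices, not the patent's specification.

```python
import numpy as np

class GaussianBackground:
    """Per-pixel running Gaussian background model: each incoming frame is
    compared against the per-pixel statistics, and the model is updated
    with pixels judged to be background."""
    def __init__(self, first_frame, alpha=0.02, k=2.5, min_std=4.0):
        self.mean = first_frame.astype(float)
        self.var = np.full(first_frame.shape, 15.0 ** 2)
        self.alpha, self.k, self.min_var = alpha, k, min_std ** 2

    def segment(self, frame):
        frame = frame.astype(float)
        d = frame - self.mean
        # Foreground where the pixel deviates more than k std devs.
        fg = d ** 2 > (self.k ** 2) * self.var
        # Update the statistics with background pixels only.
        bg = ~fg
        a = self.alpha
        self.mean[bg] += a * d[bg]
        self.var[bg] = (1 - a) * self.var[bg] + a * d[bg] ** 2
        self.var = np.maximum(self.var, self.min_var)
        return fg  # boolean foreground mask
```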

Journal ArticleDOI
TL;DR: In this article, a stereo method for image-based rendering is proposed, which relies on over-segmenting the source images and computing match values over entire segments rather than single pixels.
Abstract: In this paper, we propose a stereo method specifically designed for image-based rendering. For effective image-based rendering, the interpolated views need only be visually plausible. The implication is that the extracted depths do not need to be correct, as long as the recovered views appear to be correct. Our stereo algorithm relies on over-segmenting the source images. Computing match values over entire segments rather than single pixels provides robustness to noise and intensity bias. Color-based segmentation also helps to more precisely delineate object boundaries, which is important for reducing boundary artifacts in synthesized views. The depths of the segments for each image are computed using loopy belief propagation within a Markov random field framework. Neighboring MRFs are used for occlusion reasoning and for ensuring that neighboring depth maps are consistent. We tested our stereo algorithm on several stereo pairs from the Middlebury data set, and show rendering results based on two of these data sets. We also show results for video-based rendering.

Journal ArticleDOI
TL;DR: A computationally simple super-resolution algorithm using a type of adaptive Wiener filter that produces an improved resolution image from a sequence of low-resolution video frames with overlapping field of view and lends itself to parallel implementation.
Abstract: A computationally simple super-resolution algorithm using a type of adaptive Wiener filter is proposed. The algorithm produces an improved resolution image from a sequence of low-resolution (LR) video frames with overlapping field of view. The algorithm uses subpixel registration to position each LR pixel value on a common spatial grid that is referenced to the average position of the input frames. The positions of the LR pixels are not quantized to a finite grid as with some previous techniques. The output high-resolution (HR) pixels are obtained using a weighted sum of LR pixels in a local moving window. Using a statistical model, the weights for each HR pixel are designed to minimize the mean squared error and they depend on the relative positions of the surrounding LR pixels. Thus, these weights adapt spatially and temporally to changing distributions of LR pixels due to varying motion. Both a global and spatially varying statistical model are considered here. Since the weights adapt with distribution of LR pixels, it is quite robust and will not become unstable when an unfavorable distribution of LR pixels is observed. For translational motion, the algorithm has a low computational complexity and may be readily suitable for real-time and/or near real-time processing applications. With other motion models, the computational complexity goes up significantly. However, regardless of the motion model, the algorithm lends itself to parallel implementation. The efficacy of the proposed algorithm is demonstrated here in a number of experimental results using simulated and real video sequences. A computational analysis is also presented.

Proceedings ArticleDOI
25 Jun 2007
TL;DR: An interactive system is presented for users to easily colorize natural images of complex scenes; it employs a smoothness map to guide the incorporation of intensity-continuity and texture-similarity constraints in the design of the labeling algorithm.
Abstract: In this paper, we present an interactive system for users to easily colorize the natural images of complex scenes. In our system, the colorization procedure is explicitly separated into two stages: color labeling and color mapping. Pixels that should roughly share similar colors are grouped into coherent regions in the color labeling stage, and the color mapping stage is then introduced to further fine-tune the colors in each coherent region. To handle textures commonly seen in natural images, we propose a new color labeling scheme that groups not only neighboring pixels with similar intensity but also remote pixels with similar texture. Motivated by the insight into the complementary nature possessed by the highly contrastive locations and the smooth locations, we employ a smoothness map to guide the incorporation of intensity-continuity and texture-similarity constraints in the design of our labeling algorithm. Within each coherent region obtained from the color labeling stage, the color mapping is applied to generate vivid colorization effect by assigning colors to a few pixels in the region. A set of intuitive interface tools is designed for labeling, coloring and modifying the result. We demonstrate compelling results of colorizing natural images using our system, with only a modest amount of user input.

Journal ArticleDOI
TL;DR: This paper designs algorithms that use random grids to encrypt secret gray-level and color images in such a way that neither of the two encrypted shares alone leaks the information of the secret image, whereas the secret can be seen when the two shares are superimposed.
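For the binary case, the classic random-grid construction underlying such schemes fits in a few lines. The Python/NumPy sketch below follows the Kafri-Keren style scheme for a binary secret; the paper's extensions to gray-level and color images are not covered here.

```python
import numpy as np

def random_grid_shares(secret, rng=None):
    """Random-grid encryption of a binary secret image (True = black).
    Share 1 is pure noise; share 2 copies it where the secret is white
    and inverts it where the secret is black. Each share alone is
    uniformly random, so it leaks nothing about the secret."""
    rng = rng or np.random.default_rng()
    secret = secret.astype(bool)
    share1 = rng.integers(0, 2, size=secret.shape).astype(bool)
    share2 = np.where(secret, ~share1, share1)
    return share1, share2

# Superimposing transparencies is a pixel-wise OR of black:
#   stacked = share1 | share2
# Stacked pixels are all black where the secret is black, and 50% black
# noise where it is white, so the secret becomes visible to the eye.
```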

Journal ArticleDOI
TL;DR: A real-time fire detector is proposed that combines foreground object information with color pixel statistics of fire, using a generic statistical model for refined fire-pixel classification.