
Showing papers on "Pixel published in 2009"


Proceedings ArticleDOI
01 Sep 2009
TL;DR: This paper proposes a unified framework for combining the classical multi-image super-resolution and example-based super-resolution, and shows how this combined approach can be applied to obtain super-resolution from as little as a single image (with no database or prior examples).
Abstract: Methods for super-resolution can be broadly classified into two families of methods: (i) The classical multi-image super-resolution (combining images obtained at subpixel misalignments), and (ii) Example-Based super-resolution (learning correspondence between low and high resolution image patches from a database). In this paper we propose a unified framework for combining these two families of methods. We further show how this combined approach can be applied to obtain super resolution from as little as a single image (with no database or prior examples). Our approach is based on the observation that patches in a natural image tend to redundantly recur many times inside the image, both within the same scale, as well as across different scales. Recurrence of patches within the same image scale (at subpixel misalignments) gives rise to the classical super-resolution, whereas recurrence of patches across different scales of the same image gives rise to example-based super-resolution. Our approach attempts to recover at each pixel its best possible resolution increase based on its patch redundancy within and across scales.
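
The cross-scale patch recurrence that drives the example-based half of this framework can be illustrated in a few lines of NumPy. This is a minimal sketch, not the paper's algorithm: the function names are invented, the search is brute force, and only a single 2x downscaling is considered.

```python
import numpy as np

def downscale(img, factor=2):
    """Box-filter downscaling by an integer factor."""
    h = (img.shape[0] // factor) * factor
    w = (img.shape[1] // factor) * factor
    img = img[:h, :w]
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def best_cross_scale_match(img, y, x, psize=5):
    """Brute-force search for the patch in a 2x-downscaled copy of `img`
    closest (in SSD) to the patch at (y, x) of the original image."""
    patch = img[y:y + psize, x:x + psize]
    small = downscale(img, 2)
    best_ssd, best_pos = np.inf, None
    for sy in range(small.shape[0] - psize + 1):
        for sx in range(small.shape[1] - psize + 1):
            ssd = float(np.sum((small[sy:sy + psize, sx:sx + psize] - patch) ** 2))
            if ssd < best_ssd:
                best_ssd, best_pos = ssd, (sy, sx)
    return best_pos, best_ssd

rng = np.random.default_rng(0)
img = rng.random((32, 32))
pos, ssd = best_cross_scale_match(img, 4, 4)
```

A match found this way pairs a low-resolution patch with its higher-resolution origin inside the same image, which is exactly the low/high example pair that example-based methods normally take from an external database.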

1,923 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: A prototype-based model that successfully combines local and global discriminative information is proposed, and is shown to significantly outperform a state-of-the-art classifier on the indoor scene recognition task.
Abstract: Indoor scene recognition is a challenging open problem in high level vision. Most scene recognition models that work well for outdoor scenes perform poorly in the indoor domain. The main difficulty is that while some indoor scenes (e.g. corridors) can be well characterized by global spatial properties, others (e.g. bookstores) are better characterized by the objects they contain. More generally, to address the indoor scene recognition problem we need a model that can exploit local and global discriminative information. In this paper we propose a prototype based model that can successfully combine both sources of information. To test our approach we created a dataset of 67 indoor scene categories (the largest available) covering a wide range of domains. The results show that our approach can significantly outperform a state-of-the-art classifier for the task.

1,517 citations


Proceedings ArticleDOI
01 Sep 2009
TL;DR: A novel algorithm and variants for visibility restoration from a single image, fast enough to be applied for the first time within real-time processing applications such as sign, lane-marking, and obstacle detection from an in-vehicle camera.
Abstract: One source of difficulties when processing outdoor images is the presence of haze, fog or smoke which fades the colors and reduces the contrast of the observed objects. We introduce a novel algorithm and variants for visibility restoration from a single image. The main advantage of the proposed algorithm compared with others is its speed: its complexity is a linear function of the number of image pixels only. This speed allows visibility restoration to be applied for the first time within real-time processing applications such as sign, lane-marking and obstacle detection from an in-vehicle camera. Another advantage is the ability to handle both color and gray-level images, since the ambiguity between the presence of fog and objects with low color saturation is resolved by assuming that only small objects can have colors with low saturation. The algorithm is controlled by only a few parameters and consists of three steps: atmospheric veil inference, image restoration and smoothing, and tone mapping. A comparative study and quantitative evaluation against a few other state-of-the-art algorithms demonstrate that similar or better quality results are obtained. Finally, an application to lane-marking extraction in gray-level images is presented, illustrating the usefulness of the approach.
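
The first two pipeline steps (veil inference, restoration) can be illustrated with a deliberately simplified sketch. Hedges: the paper infers the veil with a fast filter and handles color and gray-level images separately; here the veil is just a fraction of the per-pixel channel minimum, the atmospheric light is normalized to 1, and the function name is invented.

```python
import numpy as np

def restore_visibility(img, p=0.9):
    """Fog reduction sketch: infer an atmospheric veil V as a fraction p
    of the per-pixel minimum over the color channels, then invert the
    fog model I = R * (1 - V) + V (atmospheric light normalized to 1)."""
    whiteness = img.min(axis=2)              # min over R, G, B per pixel
    veil = np.clip(p * whiteness, 0.0, 0.99) # inferred atmospheric veil
    restored = (img - veil[..., None]) / (1.0 - veil[..., None])
    return np.clip(restored, 0.0, 1.0)

foggy = np.full((4, 4, 3), 0.7)              # uniform grey "fog"
clear = restore_visibility(foggy)
```

Every operation above is a per-pixel arithmetic pass, which is the source of the linear-time claim in the abstract.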

1,219 citations


01 Jan 2009
TL;DR: This thesis builds a human-assisted motion annotation system to obtain ground-truth motion, missing in the literature, for natural video sequences, and proposes SIFT flow, a new framework for image parsing by transferring the metadata information from the images in a large database to an unknown query image.
Abstract: The focus of motion analysis has been on estimating a flow vector for every pixel by matching intensities. In my thesis, I will explore motion representations beyond the pixel level and new applications to which these representations lead. I first focus on analyzing motion from video sequences. Traditional motion analysis suffers from the inappropriate modeling of the grouping relationship of pixels and from a lack of ground-truth data. Using layers as the interface for humans to interact with videos, we build a human-assisted motion annotation system to obtain ground-truth motion, missing in the literature, for natural video sequences. Furthermore, we show that with the layer representation, we can detect and magnify small motions to make them visible to human eyes. Then we move to a contour representation to analyze the motion for textureless objects under occlusion. We demonstrate that simultaneous boundary grouping and motion analysis can handle challenging data where traditional pixel-wise motion analysis fails. In the second part of my thesis, I will show the benefits of matching local image structures instead of intensity values. We propose SIFT flow, which establishes dense, semantically meaningful correspondence between two images across scenes by matching pixel-wise SIFT features. Using SIFT flow, we develop a new framework for image parsing by transferring metadata information, such as annotation, motion and depth, from the images in a large database to an unknown query image. We demonstrate this framework using new applications such as predicting motion from a single image and motion synthesis via object transfer. Based on SIFT flow, we introduce a nonparametric scene parsing system using label transfer, with very promising experimental results suggesting that our system outperforms state-of-the-art techniques based on training classifiers.

899 citations


Journal ArticleDOI
TL;DR: Among the best costs are BilSub, which performs consistently very well for low radiometric differences; HMI, which is slightly better as pixelwise matching cost in some cases and for strong image noise; and Census, which showed the best and most robust overall performance.
Abstract: Stereo correspondence methods rely on matching costs for computing the similarity of image locations. We evaluate the insensitivity of different costs for passive binocular stereo methods with respect to radiometric variations of the input images. We consider both pixel-based and window-based variants like the absolute difference, the sampling-insensitive absolute difference, and normalized cross correlation, as well as their zero-mean versions. We also consider filters like LoG, mean, and bilateral background subtraction (BilSub) and nonparametric measures like Rank, SoftRank, Census, and Ordinal. Finally, hierarchical mutual information (HMI) is considered as pixelwise cost. Using stereo data sets with ground-truth disparities taken under controlled changes of exposure and lighting, we evaluate the costs with a local, a semiglobal, and a global stereo method. We measure the performance of all costs in the presence of simulated and real radiometric differences, including exposure differences, vignetting, varying lighting, and noise. Overall, the ranking of methods across all data sets and experiments appears to be consistent. Among the best costs are BilSub, which performs consistently very well for low radiometric differences; HMI, which is slightly better as pixelwise matching cost in some cases and for strong image noise; and Census, which showed the best and most robust overall performance.
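
Census, the most robust cost in this evaluation, is easy to reproduce in toy form. The sketch below (invented function names, wrap-around borders via np.roll) builds the census bit strings and a pixelwise Hamming-distance cost; the transform is invariant to any monotonically increasing intensity change, which is why it tolerates radiometric differences.

```python
import numpy as np

def census_transform(img, r=1):
    """Census transform: per pixel, a bit vector encoding whether each
    neighbor in the (2r+1)x(2r+1) window is darker than the center.
    Borders wrap around via np.roll (acceptable for a sketch)."""
    bits = []
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue
            neighbor = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
            bits.append(neighbor < img)
    return np.stack(bits, axis=-1)          # shape (h, w, 8) for r=1

def census_cost(left, right, d):
    """Pixelwise Hamming distance between census codes of the left image
    and the right image shifted by disparity d."""
    cl = census_transform(left)
    cr = census_transform(np.roll(right, d, axis=1))
    return (cl != cr).sum(axis=-1)

left = np.arange(25, dtype=float).reshape(5, 5)
cost = census_cost(left, left, 0)           # identical images, zero disparity
```

Because only the ordering of intensities matters, applying a gain and offset to one image leaves its census codes, and hence the cost, unchanged.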

765 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present results from efforts to map the global distribution of urban land use at 500 m spatial resolution using remotely sensed data from the Moderate Resolution Imaging Spectroradiometer (MODIS).
Abstract: Although urban areas cover only a small percentage of global land, they significantly alter climate, biogeochemistry, and hydrology at local, regional, and global scales. To understand the impact of urban areas on these processes, high quality, regularly updated information on the urban environment—including maps that monitor location and extent—is essential. Here we present results from efforts to map the global distribution of urban land use at 500 m spatial resolution using remotely sensed data from the Moderate Resolution Imaging Spectroradiometer (MODIS). Our approach uses a supervised decision tree classification algorithm that we process using region-specific parameters. An accuracy assessment based on sites from a stratified random sample of 140 cities shows that the new map has an overall accuracy of 93% (κ = 0.65) at the pixel level and a high level of agreement at the city scale (R² = 0.90). Our results (available at http://sage.wisc.edu/urbanenvironment.html) also reveal that the land footprint of cities occupies less than 0.5% of the Earth’s total land area.

743 citations


Journal ArticleDOI
TL;DR: A new spectral-spatial classification scheme for hyperspectral images is proposed that improves the classification accuracies and provides classification maps with more homogeneous regions, when compared to pixel-wise classification.
Abstract: A new spectral-spatial classification scheme for hyperspectral images is proposed. The method combines the results of a pixel-wise support vector machine classification and the segmentation map obtained by partitional clustering, using majority voting. The ISODATA algorithm and Gaussian mixture resolving techniques are used for image clustering. Experimental results are presented for two hyperspectral airborne images. The developed classification scheme improves the classification accuracies and provides classification maps with more homogeneous regions, when compared to pixel-wise classification. The proposed method performs particularly well for classification of images with large spatial structures and when different classes have dissimilar spectral responses and a comparable number of pixels.
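
The majority-voting step that combines the pixel-wise classification with the segmentation map is simple enough to sketch directly. The SVM classification and the clustering are assumed to have already produced the two input label maps; the function name is invented.

```python
import numpy as np

def majority_vote(pixel_labels, segments):
    """Combine a pixel-wise classification map with a segmentation map:
    every pixel receives the majority class of its segment."""
    out = np.empty_like(pixel_labels)
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        classes, counts = np.unique(pixel_labels[mask], return_counts=True)
        out[mask] = classes[np.argmax(counts)]
    return out

# One segment whose pixel-wise labels are mostly class 1, with one noisy 0
labels = np.array([[1, 1], [0, 1]])
segments = np.array([[7, 7], [7, 7]])
smoothed = majority_vote(labels, segments)
```

The isolated class-0 pixel is outvoted by its segment, which is how the scheme produces the more homogeneous classification maps described above.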

704 citations


Journal ArticleDOI
TL;DR: The proposed scheme explores the similarity of neighboring pixels using a prediction technique and hides the secret data in the residual histogram of the host image's prediction errors, achieving a higher hiding capacity while preserving a good-quality stego-image.

584 citations


Patent
18 Mar 2009
TL;DR: In this paper, a liquid crystal display (LCD) includes an LCD panel, a scan driver, a timing controller and a data driver, and the data driver outputs adjusted pixel voltage to the target pixel according to the adjusted pixel data.
Abstract: A liquid crystal display (LCD) includes an LCD panel, a scan driver, a timing controller, and a data driver. The LCD panel includes first and second pixel rows. The timing controller determines a correction voltage index according to an absolute difference between the average of the original pixel voltages corresponding to the original pixel data of all pixels of the first pixel row and the average of the original pixel voltages corresponding to the original pixel data of all pixels of the second pixel row, determines a correction voltage according to the correction voltage index, determines an adjusted pixel voltage for a target pixel according to the original pixel voltage of the target pixel in the second pixel row and the correction voltage, and outputs adjusted pixel data corresponding to the adjusted pixel voltage. The data driver outputs the adjusted pixel voltage to the target pixel according to the adjusted pixel data.

517 citations


Journal ArticleDOI
TL;DR: An area-based local stereo matching algorithm for accurate disparity estimation across all image regions; it is among the best-performing local stereo methods according to the benchmark Middlebury stereo evaluation.
Abstract: We propose an area-based local stereo matching algorithm for accurate disparity estimation across all image regions. A well-known challenge to local stereo methods is to decide an appropriate support window for the pixel under consideration, adapting the window shape or the pixelwise support weight to the underlying scene structures. Our stereo method tackles this problem with two key contributions. First, for each anchor pixel an upright cross local support skeleton is adaptively constructed, with four varying arm lengths decided on color similarity and connectivity constraints. Second, given the local cross-decision results, we dynamically construct a shape-adaptive full support region on the fly, merging horizontal segments of the crosses in the vertical neighborhood. Approximating image structures accurately, the proposed method is among the best performing local stereo methods according to the benchmark Middlebury stereo evaluation. Additionally, it reduces memory consumption significantly thanks to our compact local cross representation. To accelerate matching cost aggregation performed in an arbitrarily shaped 2-D region, we also propose an orthogonal integral image technique, yielding a speedup factor of 5-15 over the straightforward integration.
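
The first contribution, the upright cross skeleton, can be sketched for a grayscale image as follows. This is an illustrative simplification: the paper uses color similarity in RGB, and the parameter values here (tau, max_arm) and the function names are invented.

```python
import numpy as np

def arm_length(img, y, x, dy, dx, tau=0.1, max_arm=5):
    """Extend an arm from (y, x) in direction (dy, dx) while pixels stay
    inside the image and within color distance tau of the anchor."""
    h, w = img.shape
    L = 0
    while L < max_arm:
        ny, nx = y + (L + 1) * dy, x + (L + 1) * dx
        if not (0 <= ny < h and 0 <= nx < w):
            break
        if abs(img[ny, nx] - img[y, x]) > tau:
            break
        L += 1
    return L

def cross_skeleton(img, y, x):
    """Upright cross skeleton: four arm lengths (up, down, left, right)."""
    return tuple(arm_length(img, y, x, dy, dx)
                 for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)))

img = np.zeros((9, 9))
img[:, 5:] = 1.0                      # vertical intensity edge at column 5
arms = cross_skeleton(img, 4, 4)      # the right arm stops at the edge
```

Merging the horizontal segments of the crosses in a pixel's vertical arm range then yields the shape-adaptive full support region described in the abstract.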

511 citations


Proceedings ArticleDOI
16 Apr 2009
TL;DR: This paper presents a new approach to lightfield capture and image rendering that interprets the microlens array as an imaging system focused on the focal plane of the main camera lens, allowing for high resolution images that meet the expectations of modern photographers.
Abstract: Plenoptic cameras, constructed with internal microlens arrays, focus those microlenses at infinity in order to sample the 4D radiance directly at the microlenses. The consequent assumption is that each microlens image is completely defocused with respect to the image created by the main camera lens and the outside object. As a result, only a single pixel in the final image can be rendered from it, resulting in disappointingly low resolution. In this paper, we present a new approach to lightfield capture and image rendering that interprets the microlens array as an imaging system focused on the focal plane of the main camera lens. This approach captures a lightfield with significantly higher spatial resolution than the traditional approach, allowing us to render high resolution images that meet the expectations of modern photographers. Although the new approach samples the lightfield with reduced angular density, analysis and experimental results demonstrate that there is sufficient parallax to completely support lightfield manipulation algorithms such as refocusing and novel views.

Journal ArticleDOI
TL;DR: This paper introduces a novel framework, based on multidimensional kernel regression, for adaptive enhancement and spatiotemporal upscaling of videos containing complex activities, without an explicit need for accurate motion estimation; it significantly widens the applicability of super-resolution methods to a broad variety of video sequences containing complex motions.
Abstract: The need for precise (subpixel accuracy) motion estimates in conventional super-resolution has limited its applicability to only video sequences with relatively simple motions such as global translational or affine displacements. In this paper, we introduce a novel framework for adaptive enhancement and spatiotemporal upscaling of videos containing complex activities without explicit need for accurate motion estimation. Our approach is based on multidimensional kernel regression, where each pixel in the video sequence is approximated with a 3-D local (Taylor) series, capturing the essential local behavior of its spatiotemporal neighborhood. The coefficients of this series are estimated by solving a local weighted least-squares problem, where the weights are a function of the 3-D space-time orientation in the neighborhood. As this framework is fundamentally based upon the comparison of neighboring pixels in both space and time, it implicitly contains information about the local motion of the pixels across time, therefore rendering unnecessary an explicit computation of motions of modest size. The proposed approach not only significantly widens the applicability of super-resolution methods to a broad variety of video sequences containing complex motions, but also yields improved overall performance. Using several examples, we illustrate that the developed algorithm has super-resolution capabilities that provide improved optical resolution in the output, while being able to work on general input video with essentially arbitrary motion.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: Evaluating different features and classifiers in a sliding-window framework indicates that incorporating motion information improves detection performance significantly and the combination of multiple and complementary feature types can also help improve performance.
Abstract: Various powerful people detection methods exist. Surprisingly, most approaches rely on static image features only, despite the obvious potential of motion information for people detection. This paper systematically evaluates different features and classifiers in a sliding-window framework. First, our experiments indicate that incorporating motion information improves detection performance significantly. Second, the combination of multiple and complementary feature types can also help improve performance. And third, the choice of the classifier-feature combination and several implementation details are crucial to reach best performance. In contrast to many recent papers, experimental results are reported for four different datasets rather than using a single one. Three of them are taken from the literature, allowing for direct comparison. The fourth dataset is newly recorded using an onboard camera driving through an urban environment. Consequently, this dataset is more realistic and more challenging than any currently available dataset.

Journal ArticleDOI
TL;DR: An image-processing-based method that identifies the visual symptoms of plant diseases from an analysis of coloured images; the developed algorithm was able to identify a diseased region even when that region was represented by a wide range of intensities.

Journal ArticleDOI
TL;DR: A planar homographic occupancy constraint is developed that fuses foreground likelihood information from multiple views, to resolve occlusions and localize people on a reference scene plane in the framework of plane to plane homologies.
Abstract: Occlusion and lack of visibility in crowded and cluttered scenes make it difficult to track individual people correctly and consistently, particularly in a single view. We present a multi-view approach to solving this problem. In our approach we neither detect nor track objects from any single camera or camera pair; rather, evidence is gathered from all the cameras into a synergistic framework, and detection and tracking results are propagated back to each view. Unlike other multi-view approaches that require fully calibrated views, our approach is purely image-based and uses only 2D constructs. To this end we develop a planar homographic occupancy constraint that fuses foreground likelihood information from multiple views to resolve occlusions and localize people on a reference scene plane. For greater robustness, this process is extended to multiple planes parallel to the reference plane in the framework of plane-to-plane homologies. Our fusion methodology also models scene clutter using the Schmieder and Weathersby clutter measure, which acts as a confidence prior to assign higher fusion weight to views with less clutter. Detection and tracking are performed simultaneously by graph cuts segmentation of tracks in the space-time occupancy likelihood data. Experimental results, with detailed qualitative and quantitative analysis, are demonstrated in challenging multi-view, crowded scenes.

Journal ArticleDOI
TL;DR: In this article, a new vision sensor-based fire-detection method for an early warning fire-monitoring system was proposed, where candidate fire regions were detected using modified versions of previous related methods, such as the detection of moving regions and fire-colored pixels.

Proceedings ArticleDOI
07 Sep 2009
TL;DR: A method for detection of steganographic methods that embed in the spatial domain by adding a low-amplitude independent stego signal, an example of which is least significant bit (LSB) matching.
Abstract: This paper presents a novel method for detection of steganographic methods that embed in the spatial domain by adding a low-amplitude independent stego signal, an example of which is LSB matching. First, arguments are provided for modeling differences between adjacent pixels using first-order and second-order Markov chains. Subsets of sample transition probability matrices are then used as features for a steganalyzer implemented by support vector machines. The accuracy of the presented steganalyzer is evaluated on LSB matching and four different databases. The steganalyzer achieves superior accuracy with respect to prior art and provides stable results across various cover sources. Since the feature set based on second-order Markov chain is high-dimensional, we address the issue of curse of dimensionality using a feature selection algorithm and show that the curse did not occur in our experiments.
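
The first-order feature extraction can be sketched as follows: model differences of adjacent pixels as a Markov chain and use the (clipped) transition probability matrix as the feature vector. The clipping threshold T and the horizontal-only differences are simplifications; the paper also uses second-order chains, and the function name is invented.

```python
import numpy as np

def transition_features(img, T=3):
    """First-order Markov-chain features: the transition probability
    matrix of horizontally adjacent pixel differences, clipped to
    [-T, T], flattened into a (2T+1)^2 feature vector."""
    d = np.diff(img.astype(int), axis=1)        # differences of adjacent pixels
    d = np.clip(d, -T, T) + T                   # shift to indices 0..2T
    a, b = d[:, :-1].ravel(), d[:, 1:].ravel()  # consecutive difference pairs
    M = np.zeros((2 * T + 1, 2 * T + 1))
    np.add.at(M, (a, b), 1)                     # count transitions
    M /= np.maximum(M.sum(axis=1, keepdims=True), 1)  # row-normalize
    return M.ravel()

rng = np.random.default_rng(1)
cover = rng.integers(0, 256, size=(16, 16))
features = transition_features(cover)
```

Embedding a low-amplitude stego signal perturbs these transition probabilities slightly, and it is such shifts that the support vector machine steganalyzer is trained to detect.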

Journal ArticleDOI
TL;DR: This work seeks the projection that, by minimizing entropy, produces a type of intrinsic image containing reflectance information only, independent of lighting, and from there goes on to remove shadows as in previous work; the quadratic entropy is used rather than Shannon's definition.
Abstract: Recently, a method for removing shadows from colour images was developed (Finlayson et al. in IEEE Trans. Pattern Anal. Mach. Intell. 28:59---68, 2006) that relies upon finding a special direction in a 2D chromaticity feature space. This "invariant direction" is that for which particular colour features, when projected into 1D, produce a greyscale image which is approximately invariant to intensity and colour of scene illumination. Thus shadows, which are in essence a particular type of lighting, are greatly attenuated. The main approach to finding this special angle is a camera calibration: a colour target is imaged under many different lights, and the direction that best makes colour patch images equal across illuminants is the invariant direction. Here, we take a different approach. In this work, instead of a camera calibration we aim at finding the invariant direction from evidence in the colour image itself. Specifically, we recognize that producing a 1D projection in the correct invariant direction will result in a 1D distribution of pixel values that have smaller entropy than projecting in the wrong direction. The reason is that the correct projection results in a probability distribution spike, for pixels all the same except differing by the lighting that produced their observed RGB values and therefore lying along a line with orientation equal to the invariant direction. Hence we seek that projection which produces a type of intrinsic image, independent of lighting and containing reflectance information only, by minimizing entropy, and from there go on to remove shadows as before. To be able to develop an effective description of the entropy-minimization task, we go over to the quadratic entropy, rather than Shannon's definition. Replacing the observed pixels with a kernel density probability distribution, the quadratic entropy can be written as a very simple formulation, and can be evaluated using the efficient Fast Gauss Transform. The entropy, written in this formulation, has the advantage that it is less sensitive to quantization than the usual definition. The resulting algorithm is quite reliable, and the shadow removal step produces good shadow-free colour image results whenever strong shadow edges are present in the image. In most cases studied, entropy has a strong minimum for the invariant direction, revealing a new property of image formation.
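
The entropy-minimization idea is easy to demonstrate on synthetic data. The sketch below uses the Shannon entropy of a fixed-range histogram rather than the paper's kernel-density quadratic entropy, and the function names are invented; points spread along one direction collapse to a spike, and hence to minimal entropy, when projected onto the perpendicular direction.

```python
import numpy as np

def projection_entropy(chroma, theta, bins=32):
    """Shannon entropy of 2D chromaticity points projected onto direction
    theta. A fixed histogram range stands in for the paper's data scaling;
    the paper itself uses quadratic entropy with a kernel density estimate."""
    proj = chroma @ np.array([np.cos(theta), np.sin(theta)])
    hist, _ = np.histogram(proj, bins=bins, range=(-4.0, 4.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def invariant_direction(chroma, n_angles=180):
    """Brute-force search for the projection angle of minimum entropy."""
    thetas = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    entropies = [projection_entropy(chroma, t) for t in thetas]
    return float(thetas[int(np.argmin(entropies))])

# Synthetic points spread only along the 90-degree direction (lighting
# variation); projecting onto 0 degrees collapses them to a spike.
rng = np.random.default_rng(2)
chroma = np.column_stack([np.zeros(500), rng.normal(0.0, 1.0, 500)])
theta_star = invariant_direction(chroma)
```

The recovered angle is the direction perpendicular to the simulated lighting variation, mirroring how the calibration-free method finds the invariant direction from the image alone.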

Journal ArticleDOI
TL;DR: The proposed method, combined with four differential chaotic systems and pixel shuffling, can fully banish the outlines of the original image, disorder the distributive characteristics of RGB levels, and dramatically decrease the probability of exhaustive attacks.

Journal ArticleDOI
TL;DR: The G0 distribution, which can model multilook SAR images within an extensive range of degrees of homogeneity, is adopted as the statistical model of clutter in this paper; the resulting algorithm is shown to perform well and to be highly practical.
Abstract: An adaptive and fast constant false alarm rate (CFAR) algorithm based on automatic censoring (AC) is proposed for target detection in high-resolution synthetic aperture radar (SAR) images. First, an adaptive global threshold is selected to obtain an index matrix which labels whether each pixel of the image is a potential target pixel or not. Second, by using the index matrix, the clutter environment can be determined adaptively to prescreen the clutter pixels in the sliding window used for detection. The G0 distribution, which can model multilook SAR images within an extensive range of degrees of homogeneity, is adopted as the statistical model of clutter in this paper. With the introduction of AC, the proposed algorithm achieves good CFAR detection performance for homogeneous regions, clutter edges, and multitarget situations. Meanwhile, the corresponding fast algorithm greatly reduces the computational load. Finally, target clustering is implemented to obtain more accurate target regions. According to the theoretical performance analysis and the experimental results on typical real SAR images, the proposed algorithm is shown to perform well and to be highly practical.
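
The sliding-window detection step can be illustrated with a plain cell-averaging CFAR (no automatic censoring, no G0 clutter model; function name and threshold values invented), which already shows the window geometry: a guard ring around the cell under test is excluded from the clutter estimate.

```python
import numpy as np

def ca_cfar(img, guard=1, train=2, scale=3.0):
    """Cell-averaging CFAR sketch: a pixel is declared a target if it
    exceeds `scale` times the mean of a training ring around it; a guard
    ring plus the cell under test is excluded from the estimate."""
    h, w = img.shape
    r = guard + train
    det = np.zeros((h, w), dtype=bool)
    for y in range(r, h - r):
        for x in range(r, w - r):
            window = img[y - r:y + r + 1, x - r:x + r + 1].astype(float)
            # blank the guard region and the cell under test
            window[r - guard:r + guard + 1, r - guard:r + guard + 1] = np.nan
            clutter = np.nanmean(window)
            det[y, x] = img[y, x] > scale * clutter
    return det

scene = np.ones((15, 15))
scene[7, 7] = 20.0                 # one bright point target in unit clutter
hits = ca_cfar(scene)
```

The automatic censoring described in the abstract additionally removes potential target pixels (flagged by the index matrix) from the training ring, so that nearby targets do not inflate the clutter estimate in multitarget situations.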

Patent
05 Mar 2009
TL;DR: In this paper, an exemplary liquid crystal display device is described that includes pixel units, each comprising a first sub-pixel unit associated with one of three primary colors and a second sub-pixel unit associated with one of the corresponding complementary colors.
Abstract: An exemplary liquid crystal display device includes pixel units. Each pixel unit includes a first sub pixel unit and a second sub pixel unit. The first sub pixel unit is associated with one of three primary colors, and the second sub pixel unit is associated with one of three complementary colors of the three primary colors. An exemplary method for driving the liquid crystal display device is also provided.

Journal ArticleDOI
27 Jul 2009
TL;DR: A new family of second-generation wavelets constructed using a robust data-prediction lifting scheme that achieves results, previously computed by solving an inhomogeneous Laplace equation, through an explicit computation, avoiding the difficulties in solving large and poorly-conditioned systems of equations.
Abstract: We propose a new family of second-generation wavelets constructed using a robust data-prediction lifting scheme. The support of these new wavelets is constructed based on the edge content of the image and avoids having pixels from both sides of an edge. Multi-resolution analysis, based on these new edge-avoiding wavelets, shows a better decorrelation of the data compared to common linear translation-invariant multi-resolution analyses. The reduced inter-scale correlation allows us to avoid halo artifacts in band-independent multi-scale processing without taking any special precautions. We thus achieve nonlinear data-dependent multi-scale edge-preserving image filtering and processing at computation times which are linear in the number of image pixels. The new wavelets encode, in their shape, the smoothness information of the image at every scale. We use this to derive a new edge-aware interpolation scheme that achieves results, previously computed by solving an inhomogeneous Laplace equation, through an explicit computation. We thus avoid the difficulties in solving large and poorly-conditioned systems of equations. We demonstrate the effectiveness of the new wavelet basis for various computational photography applications such as multi-scale dynamic-range compression, edge-preserving smoothing and detail enhancement, and image colorization.

Journal ArticleDOI
TL;DR: This paper presents bi-histogram equalization with a plateau level (BHEPL) as an option for systems that require short-processing-time image enhancement; it shows better enhancement results compared with some multi-section mean-brightness-preserving histogram equalization methods.
Abstract: Many histogram equalization based methods have been introduced for use in consumer electronics in recent years. Yet, many of these methods are relatively complicated to implement, and mostly require a high computational time. Furthermore, some of the methods require several predefined parameters from the user, which means optimal results cannot be obtained automatically. Therefore, this paper presents bi-histogram equalization with a plateau level (BHEPL) as an option for systems that require short-processing-time image enhancement. First, BHEPL divides the input histogram into two independent sub-histograms, in order to maintain the mean brightness. Then, these sub-histograms are clipped based on the calculated plateau value; by doing this, excessive enhancement can be avoided. Experimental results show that this method requires only 34.20 ms, on average, to process images of size 3648 × 2736 pixels (i.e. 10-megapixel images). The proposed method also gives better enhancement results compared with some multi-section mean-brightness-preserving histogram equalization methods.
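
The two steps (mean-based split, plateau clipping) can be sketched as follows. This is a schematic reading of the method with assumptions: the plateau is taken as the mean of the non-empty bins of each sub-histogram, each half is equalized over its own intensity range, and the function name is invented; it is not the authors' exact formulation.

```python
import numpy as np

def bhepl(img, levels=256):
    """Bi-histogram equalization with a plateau limit (sketch): split the
    histogram at the mean intensity, clip each sub-histogram at its mean
    non-empty bin count (the plateau), then equalize each half over its
    own intensity range."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    mean = int(img.mean())
    out_map = np.zeros(levels)
    for lo, hi in ((0, mean + 1), (mean + 1, levels)):
        sub = hist[lo:hi].copy()
        if sub.sum() == 0:
            continue
        plateau = sub[sub > 0].mean()        # plateau level
        sub = np.minimum(sub, plateau)       # clip excessive bins
        cdf = np.cumsum(sub) / sub.sum()
        out_map[lo:hi] = lo + cdf * (hi - 1 - lo)
    return out_map[img].astype(np.uint8)

img = np.array([[10, 10, 10, 200], [10, 10, 200, 200]], dtype=np.uint8)
eq = bhepl(img)
```

Keeping each half of the histogram on its own side of the mean is what preserves the mean brightness, while the plateau clip prevents a dominant bin from grabbing most of the output range.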

Proceedings ArticleDOI
07 Nov 2009
TL;DR: By enforcing connectivity with the geodesic distance transform, the proposed algorithm is the top performer among local stereo methods at the current state of the art in local stereo matching.
Abstract: Local stereo matching has recently experienced large progress by the introduction of new support aggregation schemes. These approaches estimate a pixel's support region via color segmentation. Our contribution lies in an improved method for accomplishing this segmentation. Inside a square support window, we compute the geodesic distance from all pixels to the window's center pixel. Pixels of low geodesic distance are given high support weights and therefore large influence in the matching process. In contrast to previous work, we enforce connectivity by using the geodesic distance transform. For obtaining a high support weight, a pixel must have a path to the center point along which the color does not change significantly. This connectivity property leads to improved segmentation results and consequently to improved disparity maps. The success of our geodesic approach is demonstrated on the Middlebury images. According to the Middlebury benchmark, the proposed algorithm is the top performer among local stereo methods at the current state-of-the-art.
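
The geodesic weighting can be sketched with a plain Dijkstra search inside the support window (invented function name and falloff parameter gamma; the paper uses a faster computation). A pixel gets a high weight only if some path connects it to the window center along which the color changes little, which is exactly the connectivity property described above.

```python
import heapq
import numpy as np

def geodesic_weights(window, gamma=5.0):
    """Support weights for a square grayscale window: geodesic distance
    from the center pixel, where stepping between 4-neighbors costs their
    color difference; weights fall off as exp(-d / gamma)."""
    h, w = window.shape
    cy, cx = h // 2, w // 2
    dist = np.full((h, w), np.inf)
    dist[cy, cx] = 0.0
    heap = [(0.0, cy, cx)]
    while heap:                               # Dijkstra over the window
        d, y, x = heapq.heappop(heap)
        if d > dist[y, x]:
            continue
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w:
                nd = d + abs(window[ny, nx] - window[y, x])
                if nd < dist[ny, nx]:
                    dist[ny, nx] = nd
                    heapq.heappush(heap, (nd, ny, nx))
    return np.exp(-dist / gamma)

win = np.zeros((5, 5))
win[:, 3:] = 100.0                 # strong edge inside the support window
wts = geodesic_weights(win)
```

Pixels on the center's side of the edge keep full weight, while pixels across the edge are effectively excluded from cost aggregation, even if their color happens to match the center.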

Journal ArticleDOI
TL;DR: It is demonstrated that the full-parallax CGH, calculated by the proposed method and fabricated by a laser lithography system, reconstructs a fine 3D image accompanied by a strong sensation of depth.
Abstract: A large-scale full-parallax computer-generated hologram (CGH) with four billion (2^16 × 2^16) pixels is created to reconstruct a fine true 3D image of a scene, with occlusions. The polygon-based method numerically generates the object field of a surface object, whose shape is provided by a set of vertex data of polygonal facets, while the silhouette method makes it possible to reconstruct the occluded scene. A novel technique using the segmented frame buffer is presented for handling and propagating large wave fields even in the case where the whole wave field cannot be stored in memory. We demonstrate that the full-parallax CGH, calculated by the proposed method and fabricated by a laser lithography system, reconstructs a fine 3D image accompanied by a strong sensation of depth.

Journal ArticleDOI
TL;DR: The proposed algorithm has been tested using moderate resolution imaging spectrometer images for destriping and China-Brazil Earth Resource Satellite and QuickBird images for simulated inpainting and the results and quantitative analyses verify the efficacy of this algorithm.
Abstract: Remotely sensed images often suffer from the common problems of stripe noise and random dead pixels. The techniques to recover a good image from the contaminated one are called image destriping (for stripes) and image inpainting (for dead pixels). This paper presents a maximum a posteriori (MAP)-based algorithm for both destriping and inpainting problems. The main advantage of this algorithm is that it can constrain the solution space according to a priori knowledge during the destriping and inpainting processes. In the MAP framework, the likelihood probability density function (PDF) is constructed based on a linear image observation model, and a robust Huber-Markov model is used as the prior PDF. The gradient descent optimization method is employed to produce the desired image. The proposed algorithm has been tested using moderate resolution imaging spectrometer images for destriping and China-Brazil Earth Resource Satellite and QuickBird images for simulated inpainting. The experimental results and quantitative analyses verify the efficacy of this algorithm.
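The MAP estimation described above (a quadratic data-fidelity term from the linear observation model plus a Huber-Markov prior on neighbouring-pixel differences, minimized by gradient descent) can be sketched in a toy form. The hyper-parameters `lam`, `delta`, and `step` are illustrative assumptions, and the observation model here is simplified to the identity.

```python
import numpy as np

def huber_deriv(t, delta):
    # Derivative of the Huber penalty: 2t inside |t| <= delta,
    # saturating at +/- 2*delta outside (the robust, linear region).
    return 2.0 * np.clip(t, -delta, delta)

def map_restore(g, lam=0.5, delta=1.0, step=0.1, iters=200):
    """Toy sketch of the MAP idea: data fidelity to the observed
    image g plus a Huber-Markov smoothness prior on horizontal and
    vertical pixel differences, minimized by gradient descent."""
    f = g.astype(float).copy()
    for _ in range(iters):
        grad = 2.0 * (f - g)              # likelihood (data) term
        dy = np.diff(f, axis=0)           # vertical differences
        dx = np.diff(f, axis=1)           # horizontal differences
        hy = huber_deriv(dy, delta)
        hx = huber_deriv(dx, delta)
        # prior gradient: divergence of the clipped differences
        grad[:-1, :] -= lam * hy
        grad[1:, :] += lam * hy
        grad[:, :-1] -= lam * hx
        grad[:, 1:] += lam * hx
        f -= step * grad
    return f
```

The saturation in `huber_deriv` is what makes the prior "robust": a stripe edge larger than `delta` is penalized only linearly, so genuine structure is smoothed far less than a quadratic prior would.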

Patent
23 Jul 2009
TL;DR: In this article, a method and a graphical user interface for modifying a depth map for a digital monoscopic color image is presented, which includes interactively selecting a region of the depth map based on color of a target region in the color image, and modifying depth values in the thereby selected region using a depth modification rule.
Abstract: The invention relates to a method and a graphical user interface for modifying a depth map for a digital monoscopic color image. The method includes interactively selecting a region of the depth map based on color of a target region in the color image, and modifying depth values in the thereby selected region of the depth map using a depth modification rule. The color-based pixel selection rules for the depth map and the depth modification rule selected based on one color image from a video sequence may be saved and applied to automatically modify depth maps of other color images from the same sequence.
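The core operation claimed above, selecting depth-map pixels by colour similarity and then applying a modification rule to the selection, can be sketched as below. The tolerance `tol` and the additive rule are illustrative assumptions, not values from the patent.

```python
import numpy as np

def modify_depth(color, depth, ref_rgb, tol=30, rule=lambda d: d + 20):
    """Sketch of the patent's idea: select depth-map pixels whose
    colour lies within `tol` (per channel) of a reference colour,
    then apply a depth-modification rule to that region only."""
    diff = np.abs(color.astype(int) - np.asarray(ref_rgb)).max(axis=-1)
    mask = diff <= tol                    # colour-based pixel selection
    out = depth.copy()
    out[mask] = np.clip(rule(out[mask]), 0, 255)
    return out, mask
```

Because the selection depends only on `ref_rgb`, `tol`, and the rule, the same triple can be reapplied to every frame of a sequence, which is the batch behaviour the abstract describes.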

Journal ArticleDOI
TL;DR: A study of the accuracy of five supervised classification methods using multispectral and pan-sharpened QuickBird imagery to verify whether remote sensing offers the ability to efficiently identify crops and agro-environmental measures in a typical agricultural Mediterranean area characterized by dry conditions.

Proceedings ArticleDOI
06 May 2009
TL;DR: Experimental results show that the proposed hole filling method provides improved rendering quality both objectively and subjectively.
Abstract: Depth image-based rendering (DIBR) is generally used to synthesize virtual view images in free viewpoint television (FTV) and three-dimensional (3-D) video. One of the main problems in DIBR is how to fill the holes caused by disocclusion regions and inaccurate depth values. In this paper, we propose a new hole filling method using a depth based in-painting technique. Experimental results show that the proposed hole filling method provides improved rendering quality both objectively and subjectively.
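A minimal sketch of depth-guided hole filling in the spirit described above: each hole pixel is filled from its nearest valid left or right neighbour, preferring the background side, since disocclusions should be covered with background texture. This is a simplification of the paper's in-painting technique, and the convention that a larger depth value means farther from the camera is an assumption.

```python
import numpy as np

def fill_holes(color, depth, hole):
    """Sketch of depth-guided hole filling for DIBR: fill each hole
    pixel from its nearest valid left/right neighbour, preferring
    the one that lies farther away (background)."""
    out = color.copy()
    h, w = hole.shape
    for y in range(h):
        xs = np.where(~hole[y])[0]        # valid columns in this row
        if xs.size == 0:
            continue
        for x in np.where(hole[y])[0]:
            left = xs[xs < x]             # valid columns to the left
            right = xs[xs > x]            # valid columns to the right
            cands = [c for c in (left[-1] if left.size else None,
                                 right[0] if right.size else None)
                     if c is not None]
            # prefer the background side (assumed: larger depth = farther)
            src = max(cands, key=lambda c: depth[y, c])
            out[y, x] = color[y, src]
    return out
```

Simple averaging of both sides would bleed foreground colours into the disocclusion; choosing the background neighbour avoids exactly the artifact the abstract attributes to naive filling.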

Book ChapterDOI
03 Sep 2009
TL;DR: The stixel-world turns out to be a compact but flexible representation of the three-dimensional traffic situation that can be used as the common basis for the scene understanding tasks of driver assistance and autonomous systems.
Abstract: Ambitious driver assistance for complex urban scenarios demands a complete awareness of the situation, including all moving and stationary objects that limit the free space. Recent progress in real-time dense stereo vision provides precise depth information for nearly every pixel of an image. This rises new questions: How can one efficiently analyze half a million disparity values of next generation imagers? And how can one find all relevant obstacles in this huge amount of data in real-time? In this paper we build a medium-level representation named "stixel-world". It takes into account that the free space in front of vehicles is limited by objects with almost vertical surfaces. These surfaces are approximated by adjacent rectangular sticks of a certain width and height. The stixel-world turns out to be a compact but flexible representation of the three-dimensional traffic situation that can be used as the common basis for the scene understanding tasks of driver assistance and autonomous systems.