
Showing papers on "Kernel (image processing) published in 2010"


Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work proposes a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation, and shows how to jointly optimize the dimension reduction and the indexing algorithm.
Abstract: We address the problem of image search on a very large scale, where three constraints have to be considered jointly: the accuracy of the search, its efficiency, and the memory usage of the representation. We first propose a simple yet efficient way of aggregating local image descriptors into a vector of limited dimension, which can be viewed as a simplification of the Fisher kernel representation. We then show how to jointly optimize the dimension reduction and the indexing algorithm, so that it best preserves the quality of vector comparison. The evaluation shows that our approach significantly outperforms the state of the art: the search accuracy is comparable to the bag-of-features approach for an image representation that fits in 20 bytes. Searching a 10 million image dataset takes about 50ms.
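The aggregation step described above (later popularized as VLAD) can be sketched in a few lines of NumPy: assign each local descriptor to its nearest codeword, accumulate the residuals, and L2-normalize. This is an illustrative reimplementation under those assumptions, not the authors' code:

```python
import numpy as np

def aggregate_descriptors(descriptors, codebook):
    """Aggregate local descriptors into one compact vector (VLAD-style).

    For each descriptor, find the nearest codeword and accumulate the
    residual (descriptor - codeword); concatenate per-codeword sums and
    L2-normalize the result.
    """
    k, d = codebook.shape
    # nearest-codeword assignment by squared Euclidean distance
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)
    v = np.zeros((k, d))
    for i, a in enumerate(assign):
        v[a] += descriptors[i] - codebook[a]
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v
```

The output dimension is k*d, which the paper then shrinks further with dimension reduction and indexing.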

2,782 citations


Proceedings ArticleDOI
02 Nov 2010
TL;DR: This work considers a standard non-spatial representation in which the frequencies but not the locations of quantized image features are used to discriminate between classes analogous to how words are used for text document classification without regard to their order of occurrence, and considers two spatial extensions.
Abstract: We investigate bag-of-visual-words (BOVW) approaches to land-use classification in high-resolution overhead imagery. We consider a standard non-spatial representation in which the frequencies but not the locations of quantized image features are used to discriminate between classes, analogous to how words are used for text document classification without regard to their order of occurrence. We also consider two spatial extensions: the established spatial pyramid match kernel, which considers the absolute spatial arrangement of the image features, as well as a novel method, which we term the spatial co-occurrence kernel, that considers the relative arrangement. These extensions are motivated by the importance of spatial structure in geographic data. The methods are evaluated using a large ground truth image dataset of 21 land-use classes. In addition to comparisons with standard approaches, we perform extensive evaluation of different configurations such as the size of the visual dictionaries used to derive the BOVW representations and the scale at which the spatial relationships are considered. We show that even though BOVW approaches do not necessarily perform better than the best standard approaches overall, they represent a robust alternative that is more effective for certain land-use classes. We also show that extending the BOVW approach with our proposed spatial co-occurrence kernel consistently improves performance.
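The non-spatial representation is simple to state in code: quantize each local feature against the visual vocabulary and count word occurrences, discarding locations. A minimal sketch (function names are mine):

```python
import numpy as np

def bovw_histogram(features, vocabulary):
    """Quantize local features against a visual vocabulary and count
    word occurrences; feature locations are discarded, as in the
    non-spatial BOVW representation."""
    dists = ((features[:, None, :] - vocabulary[None, :, :]) ** 2).sum(-1)
    words = dists.argmin(axis=1)                 # nearest visual word
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()                     # normalize to frequencies
```

The spatial pyramid and co-occurrence kernels then reintroduce absolute and relative arrangement on top of histograms like this one.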

1,896 citations


Journal ArticleDOI
TL;DR: This work considers factorizations of the form X = FG^T, and focuses on algorithms in which G is restricted to containing nonnegative entries while allowing the data matrix X to have mixed signs, thus extending the applicable range of NMF methods.
Abstract: We present several new variations on the theme of nonnegative matrix factorization (NMF). Considering factorizations of the form X = FG^T, we focus on algorithms in which G is restricted to containing nonnegative entries while allowing the data matrix X to have mixed signs, thus extending the applicable range of NMF methods. We also consider algorithms in which the basis vectors of F are constrained to be convex combinations of the data points. This is used for a kernel extension of NMF. We provide algorithms for computing these new factorizations and we provide supporting theoretical analysis. We also analyze the relationships between our algorithms and clustering algorithms, and consider the implications for sparseness of solutions. Finally, we present experimental results that explore the properties of these new methods.
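A compact sketch of the semi-NMF variant (X = FG^T with G >= 0 and mixed-sign X): F has a closed-form least-squares solution, while G uses the multiplicative update of the form published for semi-NMF. The function name, iteration count, and epsilon guard are my own choices, not the paper's:

```python
import numpy as np

def semi_nmf(X, r, iters=500, seed=0):
    """Semi-NMF sketch: X ~ F @ G.T with G >= 0, X allowed mixed signs.
    F is solved in closed form; G uses a multiplicative update built
    from the positive/negative parts of X.T @ F and F.T @ F."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    G = rng.random((m, r))                     # nonnegative initialization
    eps = 1e-9
    pos = lambda A: (np.abs(A) + A) / 2        # elementwise positive part
    neg = lambda A: (np.abs(A) - A) / 2        # elementwise negative part
    for _ in range(iters):
        F = X @ G @ np.linalg.pinv(G.T @ G)    # unconstrained basis update
        XtF, FtF = X.T @ F, F.T @ F
        G *= np.sqrt((pos(XtF) + G @ neg(FtF) + eps)
                     / (neg(XtF) + G @ pos(FtF) + eps))
    return F, G
```

Because G is initialized nonnegative and multiplied by nonnegative factors, it stays nonnegative throughout.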

1,226 citations


Book ChapterDOI
05 Sep 2010
TL;DR: It is found that strong edges do not always benefit kernel estimation, but can under certain circumstances degrade it; this leads to a new metric to measure the usefulness of image edges in motion deblurring and a gradient selection process to mitigate their possible adverse effect.
Abstract: We discuss a few new motion deblurring problems that are significant to kernel estimation and non-blind deconvolution. We found that strong edges do not always benefit kernel estimation, but can under certain circumstances degrade it. This finding leads to a new metric to measure the usefulness of image edges in motion deblurring and a gradient selection process to mitigate their possible adverse effect. We also propose an efficient and high-quality kernel estimation method based on the spatial prior and iterative support detection (ISD) kernel refinement, which avoids hard thresholding of the kernel elements to enforce sparsity. We employ the TV-l1 deconvolution model, solved with a new variable substitution scheme to robustly suppress noise.

1,056 citations


Journal ArticleDOI
TL;DR: Compared with existing algorithms, KRR leads to a better generalization than simply storing the examples as has been done in existing example-based algorithms and results in much less noisy images.
Abstract: This paper proposes a framework for single-image super-resolution. The underlying idea is to learn a map from input low-resolution images to target high-resolution images based on example pairs of input and output images. Kernel ridge regression (KRR) is adopted for this purpose. To reduce the time complexity of training and testing for KRR, a sparse solution is found by combining the ideas of kernel matching pursuit and gradient descent. As a regularized solution, KRR leads to a better generalization than simply storing the examples as has been done in existing example-based algorithms and results in much less noisy images. However, this may introduce blurring and ringing artifacts around major edges as sharp changes are penalized severely. A prior model of a generic image class which takes into account the discontinuity property of images is adopted to resolve this problem. Comparison with existing algorithms shows the effectiveness of the proposed method.
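The regularized solution at the core of this approach has a well-known closed form: alpha = (K + lambda*I)^{-1} y. A minimal Gaussian-kernel sketch (the paper additionally sparsifies the solution via kernel matching pursuit, which this illustration omits):

```python
import numpy as np

def krr_fit_predict(X, y, X_test, gamma=30.0, lam=1e-8):
    """Kernel ridge regression with a Gaussian kernel.

    Solves (K + lam*I) alpha = y on the training set, then predicts
    test targets as K(test, train) @ alpha.
    """
    def kern(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    K = kern(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return kern(X_test, X) @ alpha
```

The regularizer lam controls the trade-off the abstract mentions: larger values generalize more smoothly but penalize sharp changes, which is what blurs major edges.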

938 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: This article shows why the Fisher representation is well-suited to the retrieval problem (it describes an image by what makes it different from other images) and why Fisher vectors should be compressed to reduce their memory footprint and speed up retrieval.
Abstract: The problem of large-scale image search has been traditionally addressed with the bag-of-visual-words (BOV). In this article, we propose to use as an alternative the Fisher kernel framework. We first show why the Fisher representation is well-suited to the retrieval problem: it describes an image by what makes it different from other images. One drawback of the Fisher vector is that it is high-dimensional and, as opposed to the BOV, it is dense. The resulting memory and computational costs do not make Fisher vectors directly amenable to large-scale retrieval. Therefore, we compress Fisher vectors to reduce their memory footprint and speed up the retrieval. We compare three binarization approaches: a simple approach devised for this representation and two standard compression techniques. We show on two publicly available datasets that compressed Fisher vectors perform very well using as little as a few hundred bits per image, and significantly better than a very recent compressed BOV approach.
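One generic way to binarize a dense descriptor is to keep one sign bit per dimension and compare codes by Hamming distance. This is an illustrative compression sketch in that spirit, not the paper's specific binarization scheme:

```python
import numpy as np

def binarize_sign(v):
    """Compress a dense real-valued descriptor to one bit per
    dimension by keeping only the sign of each component."""
    return (np.asarray(v) > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary codes; a cheap
    stand-in for distance between the original dense vectors."""
    return int(np.count_nonzero(a != b))
```

A D-dimensional float vector thus shrinks from 32*D bits to D bits, which is how representations of "a few hundred bits per image" become feasible.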

860 citations


Journal ArticleDOI
TL;DR: A novel region-based active contour model (ACM) using a selective binary and Gaussian filtering regularized level set (SBGFRLS), which has the property of selective local or global segmentation and is more efficient to construct than the widely used signed distance function (SDF).

710 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: A novel method for face recognition from image sets that combines the kernel trick with robust methods for discarding input points far from the fitted model, thus handling complex and nonlinear manifolds of face images.
Abstract: We introduce a novel method for face recognition from image sets. In our setting each test and training example is a set of images of an individual's face, not just a single image, so recognition decisions need to be based on comparisons of image sets. Methods for this have two main aspects: the models used to represent the individual image sets; and the similarity metric used to compare the models. Here, we represent images as points in a linear or affine feature space and characterize each image set by a convex geometric region (the affine or convex hull) spanned by its feature points. Set dissimilarity is measured by geometric distances (distances of closest approach) between convex models. To reduce the influence of outliers we use robust methods to discard input points that are far from the fitted model. The kernel trick allows the approach to be extended to implicit feature mappings, thus handling complex and nonlinear manifolds of face images. Experiments on two public face datasets show that our proposed methods outperform a number of existing state-of-the-art ones.
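The distance of closest approach between affine hulls can be computed with one least-squares solve: represent each set by its mean and an orthonormal basis of directions, then minimize the distance between points on the two hulls. This sketch covers the unconstrained affine (not convex) variant, without the paper's robust outlier removal or kernelization:

```python
import numpy as np

def affine_hull_distance(S1, S2):
    """Distance of closest approach between the affine hulls of two
    sets of feature vectors (one vector per row)."""
    def hull(S):
        mu = S.mean(axis=0)
        # orthonormal basis of the directions spanned by the set
        U, s, Vt = np.linalg.svd(S - mu, full_matrices=False)
        return mu, Vt[s > 1e-10].T
    mu1, B1 = hull(S1)
    mu2, B2 = hull(S2)
    # min ||(mu1 + B1 a) - (mu2 + B2 b)|| via least squares on [B1, -B2]
    A = np.hstack([B1, -B2])
    coef, *_ = np.linalg.lstsq(A, mu2 - mu1, rcond=None)
    p1 = mu1 + B1 @ coef[:B1.shape[1]]
    p2 = mu2 + B2 @ coef[B1.shape[1]:]
    return np.linalg.norm(p1 - p2)
```

The convex-hull version additionally constrains the combination coefficients, which turns the least-squares solve into a small quadratic program.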

504 citations


Journal ArticleDOI
TL;DR: This work reviews the evolution of nonparametric regression modeling in imaging, from the local Nadaraya-Watson kernel estimate to the nonlocal means and further to transform-domain filtering based on nonlocal block-matching.
Abstract: We review the evolution of the nonparametric regression modeling in imaging from the local Nadaraya-Watson kernel estimate to the nonlocal means and further to transform-domain filtering based on nonlocal block-matching. The considered methods are classified mainly according to two main features: local/nonlocal and pointwise/multipoint. Here nonlocal is an alternative to local, and multipoint is an alternative to pointwise. These alternatives, though obvious simplifications, allow us to impose a fruitful and transparent classification of the basic ideas in the advanced techniques. Within this framework, we introduce a novel single- and multiple-model transform domain nonlocal approach. The Block Matching and 3-D Filtering (BM3D) algorithm, which is currently one of the best performing denoising algorithms, is treated as a special case of the latter approach.

382 citations


Book ChapterDOI
05 Sep 2010
TL;DR: KSR is essentially the sparse coding technique in a high-dimensional feature space mapped by an implicit mapping function; it outperforms sparse coding and EMK, and achieves state-of-the-art performance for image classification and face recognition on publicly available datasets.
Abstract: Recent research has shown the effectiveness of using sparse coding (Sc) to solve many computer vision problems. Motivated by the fact that the kernel trick can capture the nonlinear similarity of features, which may reduce the feature quantization error and boost the sparse coding performance, we propose Kernel Sparse Representation (KSR). KSR is essentially the sparse coding technique in a high-dimensional feature space mapped by an implicit mapping function. We apply KSR to both image classification and face recognition. By incorporating KSR into Spatial Pyramid Matching (SPM), we propose KSRSPM for image classification. KSRSPM can further reduce the information loss in the feature quantization step compared with Spatial Pyramid Matching using Sparse Coding (ScSPM). KSRSPM can be regarded both as a generalization of the Efficient Match Kernel (EMK) and as an extension of ScSPM. Compared with sparse coding, KSR can learn more discriminative sparse codes for face recognition. Extensive experimental results show that KSR outperforms sparse coding and EMK, and achieves state-of-the-art performance for image classification and face recognition on publicly available datasets.

377 citations


Proceedings Article
06 Dec 2010
TL;DR: This work highlights the kernel view of orientation histograms, and shows that they are equivalent to a certain type of match kernels over image patches, and designs a family of kernel descriptors which provide a unified and principled framework to turn pixel attributes into compact patch-level features.
Abstract: The design of low-level image features is critical for computer vision algorithms. Orientation histograms, such as those in SIFT [16] and HOG [3], are the most successful and popular features for visual object and scene recognition. We highlight the kernel view of orientation histograms, and show that they are equivalent to a certain type of match kernels over image patches. This novel view allows us to design a family of kernel descriptors which provide a unified and principled framework to turn pixel attributes (gradient, color, local binary pattern, etc.) into compact patch-level features. In particular, we introduce three types of match kernels to measure similarities between image patches, and construct compact low-dimensional kernel descriptors from these match kernels using kernel principal component analysis (KPCA) [23]. Kernel descriptors are easy to design and can turn any type of pixel attribute into patch-level features. They outperform carefully tuned and sophisticated features including SIFT and deep belief networks. We report superior performance on standard image classification benchmarks: Scene-15, Caltech-101, CIFAR10 and CIFAR10-ImageNet.
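The KPCA step used to build compact kernel descriptors operates on a precomputed kernel matrix: center it in feature space, eigendecompose, and scale the leading eigenvectors. A minimal sketch of that step in isolation (the match-kernel construction itself is omitted):

```python
import numpy as np

def kpca_project(K, n_components):
    """Kernel PCA on a precomputed kernel matrix K: double-center K in
    feature space, eigendecompose, and return the low-dimensional
    projections of the training points."""
    n = K.shape[0]
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one     # feature-space centering
    vals, vecs = np.linalg.eigh(Kc)                # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]    # keep the largest
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.maximum(vals, 0))     # scaled projections
```

With all nonzero components retained, the projections P satisfy P @ P.T = Kc, i.e. they reproduce the centered kernel exactly; truncating to fewer components gives the compact descriptors.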


Proceedings ArticleDOI
09 Jan 2010
TL;DR: An analytical model to predict the performance of general-purpose applications on a GPU architecture that captures full system complexity and shows high accuracy in predicting the performance trends of different optimized kernel implementations is presented.
Abstract: This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information to an auto-tuning compiler and assist it in narrowing down the search to the more promising implementations. It can also be incorporated into a tool to help programmers better assess the performance bottlenecks in their code. We analyze each GPU kernel and identify how the kernel exercises major GPU microarchitecture features. To identify the performance bottlenecks accurately, we introduce an abstract interpretation of a GPU kernel, work flow graph, based on which we estimate the execution time of a GPU kernel. We validated our performance model on the NVIDIA GPUs using CUDA (Compute Unified Device Architecture). For this purpose, we used data parallel benchmarks that stress different GPU microarchitecture events such as uncoalesced memory accesses, scratch-pad memory bank conflicts, and control flow divergence, which must be accurately modeled but represent challenges to the analytical performance models. The proposed model captures full system complexity and shows high accuracy in predicting the performance trends of different optimized kernel implementations. We also describe our approach to extracting the performance model automatically from a kernel code.
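To give a flavor of analytical GPU performance modeling, the crudest possible first-order bound treats a kernel as limited by either its compute throughput or its memory bandwidth. This toy estimate is emphatically not the paper's model (which tracks warp-level parallelism and microarchitectural events), only an illustration of the general idea:

```python
def kernel_time_lower_bound(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Roofline-style lower bound on kernel execution time: a kernel
    cannot finish faster than its arithmetic takes at peak FLOP rate,
    nor faster than its data moves at peak memory bandwidth."""
    compute_time = flops / peak_flops
    memory_time = bytes_moved / peak_bandwidth
    return max(compute_time, memory_time)
```

Real analytical models, like the one in this paper, refine such bounds by modeling how memory and compute operations overlap across concurrently executing warps.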

Proceedings ArticleDOI
13 Jun 2010
TL;DR: Experimental results on challenging real-world datasets show that the feature combination capability of the proposed algorithm is competitive to the state-of-the-art multiple kernel learning methods.
Abstract: We address the problem of computing joint sparse representation of visual signal across multiple kernel-based representations. Such a problem arises naturally in supervised visual recognition applications where one aims to reconstruct a test sample with multiple features from as few training subjects as possible. We cast the linear version of this problem into a multi-task joint covariate selection model [15], which can be very efficiently optimized via a kernelizable accelerated proximal gradient method. Furthermore, two kernel-view extensions of this method are provided to handle the situations where descriptors and similarity functions are in the form of kernel matrices. We then investigate two applications of our algorithm to feature combination: 1) fusing gray-level and LBP features for face recognition, and 2) combining multiple kernels for object categorization. Experimental results on challenging real-world datasets show that the feature combination capability of our proposed algorithm is competitive to the state-of-the-art multiple kernel learning methods.

Journal ArticleDOI
TL;DR: A novel, principled and unified technique for pattern analysis and generation is presented that ensures computational efficiency, enables a straightforward incorporation of domain knowledge, and has the potential to reduce computational time significantly.
Abstract: The advent of multiple-point geostatistics (MPS) gave rise to the integration of complex subsurface geological structures and features into the model by the concept of training images. Initial algorithms generate geologically realistic realizations by using these training images to obtain conditional probabilities needed in a stochastic simulation framework. More recent pattern-based geostatistical algorithms attempt to improve the accuracy of the training image pattern reproduction. In these approaches, the training image is used to construct a pattern database. Consequently, sequential simulation will be carried out by selecting a pattern from the database and pasting it onto the simulation grid. One of the shortcomings of the present algorithms is the lack of a unifying framework for classifying and modeling the patterns from the training image. In this paper, an entirely different approach will be taken toward geostatistical modeling. A novel, principled and unified technique for pattern analysis and generation that ensures computational efficiency and enables a straightforward incorporation of domain knowledge will be presented. In the developed methodology, patterns scanned from the training image are represented as points in a Cartesian space using multidimensional scaling. The idea behind this mapping is to use distance functions as a tool for analyzing variability between all the patterns in a training image. These distance functions can be tailored to the application at hand. Next, by significantly reducing the dimensionality of the problem and using kernel space mapping, an improved pattern classification algorithm is obtained. This paper discusses the various implementation details to accomplish these ideas. Several examples are presented and a qualitative comparison is made with previous methods. An improved pattern continuity and data-conditioning capability is observed in the generated realizations for both continuous and categorical variables. We show how the proposed methodology is much less sensitive to the user-provided parameters, and at the same time has the potential to reduce computational time significantly.

Journal ArticleDOI
TL;DR: Experiments carried out in multi- and hyperspectral, contextual, and multisource remote sensing data classification confirm the capability of the method in ranking the relevant features and show the computational efficiency of the proposed strategy.
Abstract: The increase in spatial and spectral resolution of the satellite sensors, along with the shortening of the time-revisiting periods, has provided high-quality data for remote sensing image classification. However, the high-dimensional feature space induced by using many heterogeneous information sources precludes the use of simple classifiers; thus, proper feature selection is required for discarding irrelevant features and adapting the model to the specific problem. This paper proposes to classify the images and simultaneously to learn the relevant features in such high-dimensional scenarios. The proposed method is based on the automatic optimization of a linear combination of kernels dedicated to different meaningful sets of features. Such sets can be groups of bands, contextual or textural features, or bands acquired by different sensors. The combination of kernels is optimized through gradient descent on the support vector machine objective function. Even though the combination is linear, the ranked relevance takes into account the intrinsic nonlinearity of the data through kernels. Since a naive selection of the free parameters of the multiple-kernel method is computationally demanding, we propose an efficient model selection procedure based on the kernel alignment. The result is a weight (learned from the data) for each kernel where both relevant and meaningless image features automatically emerge after training the model. Experiments carried out in multi- and hyperspectral, contextual, and multisource remote sensing data classification confirm the capability of the method in ranking the relevant features and show the computational efficiency of the proposed strategy.
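The kernel alignment used for cheap model selection is simply the normalized Frobenius inner product of two kernel matrices. A one-function sketch:

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Empirical alignment between two kernel matrices: the Frobenius
    inner product <K1, K2>_F normalized by the Frobenius norms, so the
    value lies in [0, 1] for positive semidefinite kernels."""
    num = (K1 * K2).sum()
    return num / (np.linalg.norm(K1) * np.linalg.norm(K2))
```

Aligning each candidate kernel against an ideal target kernel built from the labels gives a ranking of kernels (and hence feature sets) without retraining the SVM for every parameter choice.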

Journal ArticleDOI
TL;DR: Improved resolution and contrast-versus-noise properties can be achieved with the proposed method at a computation time similar to the conventional approach; comparison of the measured spatially variant and invariant reconstructions revealed similar performance under conventional image metrics.
Abstract: Accurate system modeling in tomographic image reconstruction has been shown to reduce the spatial variance of resolution and improve quantitative accuracy. System modeling can be improved through analytic calculations, Monte Carlo simulations, and physical measurements. The purpose of this work is to improve clinical fully-3-D reconstruction without substantially increasing computation time. We present a practical method for measuring the detector blurring component of a whole-body positron emission tomography (PET) system to form an approximate system model for use with fully-3-D reconstruction. We employ Monte Carlo simulations to show that a non-collimated point source is acceptable for modeling the radial blurring present in a PET tomograph and we justify the use of a Na-22 point source for collecting these measurements. We measure the system response on a whole-body scanner, simplify it to a 2-D function, and incorporate a parameterized version of this response into a modified fully-3-D OSEM algorithm. Empirical testing of the signal-versus-noise benefits reveals roughly a 15% improvement in spatial resolution and 10% improvement in contrast at matched image noise levels. Convergence analysis demonstrates that improved resolution and contrast-versus-noise properties can be achieved with the proposed method at a computation time similar to the conventional approach. Comparison of the measured spatially variant and invariant reconstructions revealed similar performance with conventional image metrics. Edge artifacts, which are a common artifact of resolution-modeled reconstruction methods, were less apparent in the spatially variant method than in the invariant method. With the proposed and other resolution-modeled reconstruction methods, edge artifacts need to be studied in more detail to determine the optimal tradeoff of resolution/contrast enhancement and edge fidelity.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: Two contributions are made: a local blur cue that measures the likelihood of a small neighborhood being blurred by a candidate blur kernel; and an algorithm that, given an image, simultaneously selects a motion blur kernel and segments the region that it affects.
Abstract: Blur is caused by a pixel receiving light from multiple scene points, and in many cases, such as object motion, the induced blur varies spatially across the image plane. However, the seemingly straightforward task of estimating spatially-varying blur from a single image has proved hard to accomplish reliably. This work considers such blur and makes two contributions: a local blur cue that measures the likelihood of a small neighborhood being blurred by a candidate blur kernel; and an algorithm that, given an image, simultaneously selects a motion blur kernel and segments the region that it affects. The methods are shown to perform well on a diversity of images.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This paper considers the problem of learning image categorizers on large image sets (e.g. > 100k images) using bag-of-visual-words (BOV) image representations and Support Vector Machine classifiers and experiments with three approaches to BOV embedding.
Abstract: Kernel machines rely on an implicit mapping of the data such that non-linear classification in the original space corresponds to linear classification in the new space. As kernel machines are difficult to scale to large training sets, it has been proposed to perform an explicit mapping of the data and to learn linear classifiers directly in the new space. In this paper, we consider the problem of learning image categorizers on large image sets (e.g. > 100k images) using bag-of-visual-words (BOV) image representations and Support Vector Machine classifiers. We experiment with three approaches to BOV embedding: 1) kernel PCA (kPCA) [15], 2) a modified kPCA we propose for additive kernels and 3) random projections for shift-invariant kernels [14]. We report experiments on 3 datasets: Caltech101, VOC07 and ImageNet. An important conclusion is that simply square-rooting BOV vectors – which corresponds to an exact mapping for the Bhattacharyya kernel – already leads to large improvements, often quite close to the best results obtained with additive kernels. Another conclusion is that, although it is possible to go beyond additive kernels, the embedding comes at a much higher cost.
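The square-rooting trick highlighted in the conclusion is a one-liner: L1-normalize the BOV histogram and take the elementwise square root. A plain dot product between two embedded histograms then equals their Bhattacharyya coefficient, so a linear classifier in the embedded space behaves like a Bhattacharyya-kernel machine:

```python
import numpy as np

def sqrt_embed(hist):
    """Explicit feature map for the Bhattacharyya kernel: L1-normalize
    the histogram, then take the elementwise square root. Dot products
    of embedded vectors equal the Bhattacharyya coefficient."""
    h = np.asarray(hist, dtype=float)
    return np.sqrt(h / h.sum())
```

Each embedded vector has unit L2 norm by construction, since its squared entries sum to one.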

Proceedings ArticleDOI
13 Jun 2010
TL;DR: A new parametrized geometric model of the blurring process in terms of the rotational velocity of the camera during exposure is proposed, which makes it possible to model and remove a wider class of blurs than previous approaches, including uniform blur as a special case.
Abstract: Blur from camera shake is mostly due to the 3D rotation of the camera, resulting in a blur kernel that can be significantly non-uniform across the image. However, most current deblurring methods model the observed image as a convolution of a sharp image with a uniform blur kernel. We propose a new parametrized geometric model of the blurring process in terms of the rotational velocity of the camera during exposure. We apply this model to two different algorithms for camera shake removal: the first one uses a single blurry image (blind deblurring), while the second one uses both a blurry image and a sharp but noisy image of the same scene. We show that our approach makes it possible to model and remove a wider class of blurs than previous approaches, including uniform blur as a special case, and demonstrate its effectiveness with experiments on real images.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This paper derives an efficient algorithm to solve a large kernel matting Laplacian and uses adaptive kernel sizes by a KD-tree trimap segmentation technique to reduce running time.
Abstract: Image matting is of great importance in both computer vision and graphics applications. Most existing state-of-the-art techniques rely on large sparse matrices such as the matting Laplacian [12]. However, solving these linear systems is often time-consuming, which is undesirable for user interaction. In this paper, we propose a fast method for high quality matting. We first derive an efficient algorithm to solve a large kernel matting Laplacian. A large kernel propagates information more quickly and may improve the matte quality. To further reduce running time, we also use adaptive kernel sizes by a KD-tree trimap segmentation technique. A variety of experiments show that our algorithm provides high quality results and is 5 to 20 times faster than previous methods.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work develops an effective shape-based kernel for upper-body pose similarity and proposes a leave-one-out loss function for learning a sparse subset of exemplars for kernel regression estimates from a learned sparse set of exemplar.
Abstract: Pictorial structure (PS) models are extensively used for part-based recognition of scenes, people, animals and multi-part objects. To achieve tractability, the structure and parameterization of the model is often restricted, for example, by assuming tree dependency structure and unimodal, data-independent pairwise interactions. These expressivity restrictions fail to capture important patterns in the data. On the other hand, local methods such as nearest-neighbor classification and kernel density estimation provide non-parametric flexibility but require large amounts of data to generalize well. We propose a simple semi-parametric approach that combines the tractability of pictorial structure inference with the flexibility of non-parametric methods by expressing a subset of model parameters as kernel regression estimates from a learned sparse set of exemplars. This yields query-specific, image-dependent pose priors. We develop an effective shape-based kernel for upper-body pose similarity and propose a leave-one-out loss function for learning a sparse subset of exemplars for kernel regression. We apply our techniques to two challenging datasets of human figure parsing and advance the state-of-the-art (from 80% to 86% on the Buffy dataset [8]), while using only 15% of the training data as exemplars.

Journal ArticleDOI
TL;DR: The proposed method operates using a single example of an object of interest to find similar matches, does not require prior knowledge about objects being sought, anddoes not require any preprocessing step or segmentation of a target image.
Abstract: We present a generic detection/localization algorithm capable of searching for a visual object of interest without training. The proposed method operates using a single example of an object of interest to find similar matches, does not require prior knowledge (learning) about objects being sought, and does not require any preprocessing step or segmentation of a target image. Our method is based on the computation of local regression kernels as descriptors from a query, which measure the likeness of a pixel to its surroundings. Salient features are extracted from said descriptors and compared against analogous features from the target image. This comparison is done using a matrix generalization of the cosine similarity measure. We illustrate optimality properties of the algorithm using a naive-Bayes framework. The algorithm yields a scalar resemblance map, indicating the likelihood of similarity between the query and all patches in the target image. By employing nonparametric significance tests and nonmaxima suppression, we detect the presence and location of objects similar to the given query. The approach is extended to account for large variations in scale and rotation. High performance is demonstrated on several challenging data sets, indicating successful detection of objects in diverse contexts and under different imaging conditions.

Journal ArticleDOI
TL;DR: Experimental results prove that the proposed kernel-based nonparametric regression method is effective and adaptable to small-target detection under a complex background.
Abstract: Small-target detection in infrared imagery with a complex background is always an important task in remote-sensing fields. Complex clutter background usually results in serious false alarms in target detection due to the low contrast of infrared imagery. In this letter, a kernel-based nonparametric regression method is proposed for background prediction and clutter removal, and is further applied to target detection. First, a linear mixture model is used to represent each pixel of the observed infrared imagery. Second, adaptive detection is performed on local regions in the infrared image by means of kernel-based nonparametric regression and a two-parameter constant false alarm rate (CFAR) detector. Kernel regression, which is one of the nonparametric regression approaches, is adopted to estimate the complex clutter background. Then, CFAR detection is performed on the “pure” target-like region after estimation and removal of the clutter background. Experimental results prove that the proposed algorithm is effective and adaptable to small-target detection under a complex background.
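The two-parameter CFAR stage reduces to thresholding the background-subtracted residual at mean + k * std of the clutter estimate. A minimal sketch (computed globally here for brevity; the paper estimates the statistics on local regions):

```python
import numpy as np

def cfar_detect(residual, k=3.0):
    """Two-parameter CFAR detection on a background-subtracted residual:
    flag pixels exceeding mean + k * std of the clutter statistics."""
    mu, sigma = residual.mean(), residual.std()
    return residual > mu + k * sigma
```

The multiplier k sets the constant false alarm rate: under a Gaussian clutter assumption, larger k trades missed detections for fewer false alarms.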

Proceedings Article
31 Mar 2010
TL;DR: This paper introduces the concept of variational inducing functions to handle potential non-smooth functions involved in the kernel CP construction and considers an alternative approach to approximate inference based on variational methods.
Abstract: Interest in multioutput kernel methods is increasing, whether under the guise of multitask learning, multisensor networks or structured output data. From the Gaussian process perspective a multioutput Mercer kernel is a covariance function over correlated output functions. One way of constructing such kernels is based on convolution processes (CP). A key problem for this approach is efficient inference. Alvarez and Lawrence recently presented a sparse approximation for CPs that enabled efficient inference. In this paper, we extend this work in two directions: we introduce the concept of variational inducing functions to handle potential non-smooth functions involved in the kernel CP construction and we consider an alternative approach to approximate inference based on variational methods, extending the work by Titsias (2009) to the multiple output case. We demonstrate our approaches on prediction of school marks, compiler performance and financial time series.
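The convolution-process construction can be sketched in closed form for the special case of Gaussian smoothing kernels and a Gaussian latent covariance, where the two convolutions collapse into another Gaussian whose variance is the sum of the squared widths; the lengthscales below are illustrative, and the sketch omits the paper's variational inducing-function machinery entirely:

```python
import numpy as np

def cp_cross_cov(x1, x2, l1, l2, lu, s1=1.0, s2=1.0):
    """Cross-covariance between outputs d and d' of a convolution process:
    convolving normalized Gaussian smoothing kernels (widths l1, l2) with a
    latent GP whose covariance is a normalized Gaussian of width lu yields
    a Gaussian with variance l1^2 + l2^2 + lu^2."""
    v = l1**2 + l2**2 + lu**2
    d = x1[:, None] - x2[None, :]
    return s1 * s2 * np.exp(-d**2 / (2 * v)) / np.sqrt(2 * np.pi * v)

x = np.linspace(0, 1, 20)
ls = [0.1, 0.3]                       # per-output smoothing widths (assumed)
K = np.block([[cp_cross_cov(x, x, ls[i], ls[j], lu=0.2)
               for j in range(2)] for i in range(2)])
```

Because each block comes from a valid latent-function construction, the stacked multi-output covariance matrix is positive semidefinite by design.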

Proceedings Article
21 Jun 2010
TL;DR: An accurate and scalable Nystrom scheme that first samples a large column subset from the input matrix, but then only performs an approximate SVD on the inner submatrix by using the recent randomized low-rank matrix approximation algorithms.
Abstract: The Nystrom method is an efficient technique for the eigenvalue decomposition of large kernel matrices. However, in order to ensure an accurate approximation, a sufficiently large number of columns has to be sampled. On very large data sets, the SVD step on the resultant data submatrix will soon dominate the computations and become prohibitive. In this paper, we propose an accurate and scalable Nystrom scheme that first samples a large column subset from the input matrix, but then only performs an approximate SVD on the inner submatrix by using recent randomized low-rank matrix approximation algorithms. Theoretical analysis shows that the proposed algorithm is as accurate as the standard Nystrom method that directly performs a large SVD on the inner submatrix. On the other hand, its time complexity is only as low as that of performing a small SVD. Experiments are performed on a number of large-scale data sets for low-rank approximation and spectral embedding. In particular, spectral embedding of an MNIST data set with 3.3 million examples takes less than an hour on a standard PC with 4 GB of memory.
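A hedged sketch of the proposed scheme: sample m columns C of the kernel matrix, then replace the exact SVD of the inner m x m submatrix W with a randomized rank-k SVD; the sampling strategy, rank, and oversampling parameter below are illustrative choices, not the paper's exact settings:

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :])**2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_randsvd(C, m, k, p=5, seed=0):
    """Nystrom approximation K ~= C W_k^+ C^T, where the rank-k
    pseudo-inverse of the inner m x m submatrix W comes from a
    randomized SVD instead of a full SVD."""
    rng = np.random.default_rng(seed)
    W = C[:m, :]                                # inner submatrix
    Omega = rng.standard_normal((m, k + p))     # random test matrix
    Q, _ = np.linalg.qr(W @ Omega)              # approximate range of W
    B = Q.T @ W
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub[:, :k]
    Winv = Vt[:k].T @ np.diag(1.0 / s[:k]) @ U.T
    return C @ Winv @ C.T

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 1))
m = 60                                # number of sampled columns
K = rbf(X, X)
K_hat = nystrom_randsvd(K[:, :m], m, k=10)
err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

The randomized SVD of W costs O(m^2 k) rather than O(m^3), which is what lets the scheme sample many columns without the inner decomposition dominating.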

Patent
30 Nov 2010
TL;DR: In this paper, a super-resolved demosaicing technique for rendering focused plenoptic camera data performs super-resolution and demosaicing simultaneously, rendering a high-resolution output image from separate microimages at a specified depth of focus.
Abstract: A super-resolved demosaicing technique for rendering focused plenoptic camera data performs simultaneous super-resolution and demosaicing. The technique renders a high-resolution output image from a plurality of separate microimages in an input image at a specified depth of focus. For each point on an image plane of the output image, the technique determines a line of projection through the microimages in optical phase space according to the current point and angle of projection determined from the depth of focus. For each microimage, the technique applies a kernel centered at a position on the current microimage intersected by the line of projection to accumulate, from pixels at each microimage covered by the kernel at the respective position, values for each color channel weighted according to the kernel. A value for a pixel at the current point in the output image is computed from the accumulated values for the color channels.
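The per-pixel accumulation step can be sketched as follows; the function name, the small separable kernel, and the intersection positions are illustrative assumptions, not the patent's exact procedure:

```python
import numpy as np

def accumulate_channels(microimages, positions, kernel):
    """Accumulate kernel-weighted color values from each microimage at the
    position where the line of projection intersects it, then normalize.
    microimages: list of (H, W, 3) arrays; positions: (y, x) per microimage;
    kernel: (kh, kw) weight window centered on the intersection."""
    kh, kw = kernel.shape
    acc = np.zeros(3)
    wsum = 0.0
    for img, (py, px) in zip(microimages, positions):
        y0, x0 = int(py) - kh // 2, int(px) - kw // 2
        for dy in range(kh):
            for dx in range(kw):
                y, x = y0 + dy, x0 + dx
                if 0 <= y < img.shape[0] and 0 <= x < img.shape[1]:
                    acc += kernel[dy, dx] * img[y, x]
                    wsum += kernel[dy, dx]
    return acc / wsum if wsum > 0 else acc

g = np.outer([1, 2, 1], [1, 2, 1]) / 16.0    # small separable kernel
imgs = [np.full((8, 8, 3), c, float) for c in (0.2, 0.4)]
pixel = accumulate_channels(imgs, [(4, 4), (4, 4)], g)   # -> ~[0.3, 0.3, 0.3]
```

In the full technique a per-channel mask would restrict the kernel to pixels of the matching Bayer color, which is how demosaicing and super-resolution happen in the same accumulation pass.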

Proceedings ArticleDOI
18 Dec 2010
TL;DR: Experimental evaluation validates that the proposed kernel fusion method reduces energy consumption without performance loss for several typical kernels; an effective method to reduce shared-memory usage and coordinate the thread spaces of the fused kernels is also proposed.
Abstract: As one of the most popular accelerators, the Graphics Processing Unit (GPU) has demonstrated high computing power in several application fields. On the other hand, the GPU also draws high power and has become one of the largest power consumers in desktop and supercomputer systems. However, software power-optimization methods targeted at GPUs have not been well studied. In this work, we propose a kernel fusion method to reduce energy consumption and improve power efficiency on GPU architectures. By fusing two or more independent kernels, kernel fusion achieves higher utilization and a much more balanced demand for hardware resources, which provides greater potential for power optimizations such as dynamic voltage and frequency scaling (DVFS). Based on the CUDA programming model, this paper also gives several fusion methods targeted at different situations. To make a judicious fusion strategy, we formulate the fusion of multiple independent kernels as a dynamic programming problem, which can be solved with many existing tools and simply embedded into a compiler or runtime system. To reduce the overhead introduced by kernel fusion, we also propose an effective method to reduce shared-memory usage and coordinate the thread spaces of the kernels to be fused. Detailed experimental evaluation validates that the proposed kernel fusion method reduces energy consumption without performance loss for several typical kernels.
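If the fusion decision is posed as a dynamic program over which kernels run alone and which are fused in pairs (one plausible reading of the formulation above), a bitmask DP sketch looks like this; the energy numbers are toy values, not measurements:

```python
from functools import lru_cache

def min_energy(solo, fused):
    """Bitmask DP over independent kernels: solo[i] is the energy of
    kernel i run alone, fused[(i, j)] the (measured or modeled) energy
    of running i and j as one fused kernel."""
    n = len(solo)

    @lru_cache(maxsize=None)
    def dp(mask):
        if mask == 0:
            return 0.0
        i = (mask & -mask).bit_length() - 1      # lowest remaining kernel
        best = solo[i] + dp(mask & ~(1 << i))    # option: run i alone
        for j in range(i + 1, n):                # option: fuse i with j
            if mask >> j & 1 and (i, j) in fused:
                best = min(best,
                           fused[(i, j)] + dp(mask & ~(1 << i) & ~(1 << j)))
        return best

    return dp((1 << n) - 1)

# Toy instance: fusing kernels 0 and 1 saves energy, kernel 2 runs alone.
cost = min_energy([10.0, 10.0, 10.0], {(0, 1): 15.0})   # -> 25.0
```

With n kernels the DP visits O(2^n) subsets, which is tractable for the handful of independent kernels typically co-resident in a runtime, and fits the "simply embedded into a compiler or runtime system" claim.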

Journal ArticleDOI
TL;DR: A semisupervised support vector machine classifier based on the combination of clustering and the mean map kernel is proposed, which reinforces samples in the same cluster belonging to the same class by combining sample and cluster similarities implicitly in the kernel space.
Abstract: Remote sensing image classification constitutes a challenging problem since very few labeled pixels are typically available from the analyzed scene. In such situations, labeled data extracted from other images modeling similar problems might be used to improve the classification accuracy. However, when training and test samples follow even slightly different distributions, classification is very difficult. This problem is known as sample selection bias. In this paper, we propose a new method to combine labeled and unlabeled pixels to increase classification reliability and accuracy. A semisupervised support vector machine classifier based on the combination of clustering and the mean map kernel is proposed. The method reinforces samples in the same cluster belonging to the same class by combining sample and cluster similarities implicitly in the kernel space. A soft version of the method is also proposed where only the most reliable training samples, in terms of likelihood of the image data distribution, are used. Capabilities of the proposed method are illustrated in a cloud screening application using data from the MEdium Resolution Imaging Spectrometer (MERIS) instrument onboard the European Space Agency ENVISAT satellite. Cloud screening constitutes a clear example of sample selection bias since cloud features change to a great extent depending on the cloud type, thickness, transparency, height, and background. Good results are obtained and show that the method is particularly well suited for situations where the available labeled information does not adequately describe the classes in the test data.
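One plausible reading of the combined kernel is a convex mixture of the sample kernel and a cluster-level mean map kernel, where the mean map entry is the average pairwise kernel value between the two samples' clusters; the mixing weight mu and the RBF choice below are illustrative assumptions:

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :])**2).sum(-1)
    return np.exp(-gamma * d2)

def cluster_mean_map_kernel(X, labels, mu=0.5, gamma=1.0):
    """Combine a sample kernel with a mean map kernel between the clusters
    the samples belong to: K_cluster(i, j) is the average pairwise kernel
    value between cluster(i) and cluster(j), i.e. the inner product of the
    clusters' mean embeddings in the RKHS."""
    Ks = rbf(X, X, gamma)
    clusters = np.unique(labels)
    M = np.array([[Ks[np.ix_(labels == a, labels == b)].mean()
                   for b in clusters] for a in clusters])
    idx = np.searchsorted(clusters, labels)
    Kc = M[np.ix_(idx, idx)]
    return (1 - mu) * Ks + mu * Kc

X = np.vstack([np.zeros((3, 2)), np.ones((3, 2)) * 3])
labels = np.array([0, 0, 0, 1, 1, 1])
K = cluster_mean_map_kernel(X, labels)
```

Both components are valid kernels, so the mixture is too; the cluster term is what "reinforces samples in the same cluster belonging to the same class" by raising within-cluster similarity.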

Proceedings ArticleDOI
13 Jun 2010
TL;DR: Evaluation on the INRIA benchmark database and an experimental study on a real-world intersection using multi-spectral hypothesis generation confirm state-of-the-art classification and real-time performance.
Abstract: We present a real-time multi-sensor architecture for video-based pedestrian detection used within a road side unit for intersection assistance. The entire system is implemented on available PC hardware, combining a frame grabber board with embedded FPGA and a graphics card into a powerful processing network. Giving classification performance top priority, we use HOG descriptors with a Gaussian kernel support vector machine. In order to achieve real-time performance, we propose a hardware architecture that incorporates FPGA-based feature extraction and GPU-based classification. The FPGA-GPU pipeline is managed by a multi-core CPU that further performs sensor data fusion. Evaluation on the INRIA benchmark database and an experimental study on a real-world intersection using multi-spectral hypothesis generation confirm state-of-the-art classification and real-time performance.
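Omitting the FPGA-side HOG extraction, the GPU classification stage evaluates a Gaussian-kernel SVM decision function per detection window; the sketch below shows only that evaluation, with toy support vectors and coefficients standing in for a trained model:

```python
import numpy as np

def gaussian_svm_decision(x, support, alpha_y, b, gamma):
    """Decision value of a trained Gaussian-kernel SVM:
    f(x) = sum_i alpha_i * y_i * exp(-gamma * ||x - x_i||^2) + b."""
    k = np.exp(-gamma * ((support - x)**2).sum(axis=1))
    return float(alpha_y @ k + b)

# Toy "trained" model: two support vectors of opposite class.
support = np.array([[0.0, 0.0], [2.0, 2.0]])
alpha_y = np.array([1.0, -1.0])            # alpha_i * y_i
score_pos = gaussian_svm_decision(np.array([0.1, 0.0]), support, alpha_y, 0.0, 1.0)
score_neg = gaussian_svm_decision(np.array([2.0, 2.1]), support, alpha_y, 0.0, 1.0)
```

Each window's score is independent of the others, which is why this stage maps naturally onto the GPU while the CPU handles fusion and scheduling.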