Showing papers on "Kernel (image processing)" published in 2014

Proceedings Article•DOI•

Convolutional Neural Networks for No-Reference Image Quality Assessment

Le Kang¹, Peng Ye¹, Yi Li², David Doermann¹•Institutions (2)

University of Maryland, College Park¹, NICTA²

23 Jun 2014

TL;DR: A Convolutional Neural Network is described that accurately predicts image quality without a reference image; it achieves state-of-the-art performance on the LIVE dataset and shows excellent generalization ability in cross-dataset experiments.

Abstract: In this work we describe a Convolutional Neural Network (CNN) to accurately predict image quality without a reference image. Taking image patches as input, the CNN works in the spatial domain without using hand-crafted features that are employed by most previous methods. The network consists of one convolutional layer with max and min pooling, two fully connected layers and an output node. Within the network structure, feature learning and regression are integrated into one optimization process, which leads to a more effective model for estimating image quality. This approach achieves state of the art performance on the LIVE dataset and shows excellent generalization ability in cross dataset experiments. Further experiments on images with local distortions demonstrate the local quality estimation ability of our CNN, which is rarely reported in previous literature.
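
The architecture in this abstract (one convolutional layer, max and min pooling, two fully connected layers, one output node) can be sketched as a forward pass. This is a minimal illustration, not the authors' implementation: the patch size, number of kernels, kernel size, hidden width, ReLU activations, and pooling over each whole feature map are all assumptions, and the weights are random rather than trained.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def init_params(rng, n_kernels=8, kernel_size=7, hidden=16):
    """Random, untrained weights; every size here is an illustrative assumption."""
    return {
        "kernels": rng.standard_normal((n_kernels, kernel_size, kernel_size)) * 0.1,
        "conv_bias": np.zeros(n_kernels),
        "W1": rng.standard_normal((hidden, 2 * n_kernels)) * 0.1,
        "b1": np.zeros(hidden),
        "W2": rng.standard_normal((hidden, hidden)) * 0.1,
        "b2": np.zeros(hidden),
        "w_out": rng.standard_normal(hidden) * 0.1,
        "b_out": 0.0,
    }


def conv2d_valid(patch, kernels):
    """'Valid' cross-correlation of one 2D patch with a bank of K square kernels."""
    k = kernels.shape[1]
    windows = sliding_window_view(patch, (k, k))  # (H-k+1, W-k+1, k, k)
    return np.einsum("hwij,kij->khw", windows, kernels)


def predict_quality(patch, params):
    """Map one grayscale patch to a scalar quality score."""
    maps = conv2d_valid(patch, params["kernels"]) + params["conv_bias"][:, None, None]
    # Max and min pooling: each feature map contributes two numbers (assumed global pooling).
    pooled = np.concatenate([maps.max(axis=(1, 2)), maps.min(axis=(1, 2))])
    h1 = np.maximum(0.0, params["W1"] @ pooled + params["b1"])  # first fully connected layer
    h2 = np.maximum(0.0, params["W2"] @ h1 + params["b2"])      # second fully connected layer
    return float(params["w_out"] @ h2 + params["b_out"])        # single output node


rng = np.random.default_rng(0)
params = init_params(rng)
patch = rng.standard_normal((32, 32))  # hypothetical 32x32 patch
score = predict_quality(patch, params)
```

Because feature extraction and regression sit in one differentiable chain, a single loss on the output node would update both the kernels and the fully connected weights, which is the joint optimization the abstract refers to.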

942 citations

Journal Article•DOI•

A Multi-View Embedding Space for Modeling Internet Images, Tags, and Their Semantics

Yunchao Gong¹, Qifa Ke², Michael Isard², Svetlana Lazebnik³•Institutions (3)

University of North Carolina at Chapel Hill¹, Microsoft², University of Illinois at Urbana–Champaign³

01 Jan 2014-International Journal of Computer Vision

TL;DR: This paper starts with canonical correlation analysis (CCA), a popular and successful approach for mapping visual and textual features to the same latent space, and incorporates a third view capturing high-level image semantics, represented either by a single category or multiple non-mutually-exclusive concepts.

Abstract: This paper investigates the problem of modeling Internet images and associated text or tags for tasks such as image-to-image search, tag-to-image search, and image-to-tag search (image annotation). We start with canonical correlation analysis (CCA), a popular and successful approach for mapping visual and textual features to the same latent space, and incorporate a third view capturing high-level image semantics, represented either by a single category or multiple non-mutually-exclusive concepts. We present two ways to train the three-view embedding: supervised, with the third view coming from ground-truth labels or search keywords; and unsupervised, with semantic themes automatically obtained by clustering the tags. To ensure high accuracy for retrieval tasks while keeping the learning process scalable, we combine multiple strong visual features and use explicit nonlinear kernel mappings to efficiently approximate kernel CCA. To perform retrieval, we use a specially designed similarity function in the embedded space, which substantially outperforms the Euclidean distance. The resulting system produces compelling qualitative results and outperforms a number of two-view baselines on retrieval tasks on three large-scale Internet image datasets.

612 citations

Proceedings Article•DOI•

Medical image classification with convolutional neural network

Qing Li¹, Weidong Cai¹, Xiaogang Wang², Yun Zhou³, David Dagan Feng¹, Mei Chen⁴•Institutions (4)

University of Sydney¹, The Chinese University of Hong Kong², Johns Hopkins University School of Medicine³, Carnegie Mellon University⁴

01 Jan 2014

TL;DR: A customized convolutional neural network (CNN) with a shallow convolution layer is designed to classify lung image patches with interstitial lung disease; the same architecture can be generalized to other medical image or texture classification tasks.

Abstract: Image patch classification is an important task in many different medical imaging applications. In this work, we have designed a customized Convolutional Neural Network (CNN) with a shallow convolution layer to classify lung image patches with interstitial lung disease (ILD). While many feature descriptors have been proposed over the past years, they can be quite complicated and domain-specific. Our customized CNN framework can, on the other hand, automatically and efficiently learn the intrinsic image features from lung image patches that are most suitable for the classification purpose. The same architecture can be generalized to perform other medical image or texture classification tasks.

551 citations

Proceedings Article•DOI•

Deblurring Text Images via L0-Regularized Intensity and Gradient Prior

Jinshan Pan¹, Zhe Hu², Zhixun Su¹, Ming-Hsuan Yang²•Institutions (2)

Dalian University of Technology¹, University of California, Merced²

23 Jun 2014

TL;DR: A simple yet effective L0-regularized prior on intensity and gradient is proposed for text image deblurring, together with an efficient optimization method that generates reliable intermediate results for kernel estimation.

Abstract: We propose a simple yet effective L0-regularized prior based on intensity and gradient for text image deblurring. The proposed image prior is motivated by observing distinct properties of text images. Based on this prior, we develop an efficient optimization method to generate reliable intermediate results for kernel estimation. The proposed method does not require any complex filtering strategies to select salient edges which are critical to the state-of-the-art deblurring algorithms. We discuss the relationship with other deblurring algorithms based on edge selection and provide insight on how to select salient edges in a more principled way. In the final latent image restoration step, we develop a simple method to remove artifacts and render better deblurred images. Experimental results demonstrate that the proposed algorithm performs favorably against the state-of-the-art text image deblurring methods. In addition, we show that the proposed method can be effectively applied to deblur low-illumination images.

400 citations

Book Chapter•DOI•

Blind Deblurring Using Internal Patch Recurrence

Tomer Michaeli¹, Michal Irani¹•Institutions (1)

Weizmann Institute of Science¹

06 Sep 2014

TL;DR: This paper exploits deviations from ideal patch recurrence as a cue for recovering the underlying (unknown) blur kernel k, such that if its effect is “undone” (if the blurry image is deconvolved with k), the patch similarity across scales of the image will be maximized.

Abstract: Recurrence of small image patches across different scales of a natural image has been previously used for solving ill-posed problems (e.g. super-resolution from a single image). In this paper we show how this multi-scale property can also be used for “blind-deblurring”, namely, removal of an unknown blur from a blurry image. While patches repeat ‘as is’ across scales in a sharp natural image, this cross-scale recurrence significantly diminishes in blurry images. We exploit these deviations from ideal patch recurrence as a cue for recovering the underlying (unknown) blur kernel. More specifically, we look for the blur kernel k, such that if its effect is “undone” (if the blurry image is deconvolved with k), the patch similarity across scales of the image will be maximized. We report extensive experimental evaluations, which indicate that our approach compares favorably to state-of-the-art blind deblurring methods, and in particular, is more robust than them.

394 citations

Journal Article•DOI•

On Bayesian Adaptive Video Super Resolution

Ce Liu¹, Deqing Sun²•Institutions (2)

Microsoft¹, Harvard University²

01 Feb 2014-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper proposes a Bayesian approach to adaptive video super resolution via simultaneously estimating underlying motion, blur kernel, and noise level while reconstructing the original high-resolution frames and confirms empirical observations that an intermediate size blur kernel achieves the optimal image reconstruction results.

Abstract: Although multiframe super resolution has been extensively studied in past decades, super resolving real-world video sequences still remains challenging. In existing systems, either the motion models are oversimplified or important factors such as blur kernel and noise level are assumed to be known. Such models cannot capture the intrinsic characteristics that may differ from one sequence to another. In this paper, we propose a Bayesian approach to adaptive video super resolution via simultaneously estimating underlying motion, blur kernel, and noise level while reconstructing the original high-resolution frames. As a result, our system not only produces very promising super resolution results outperforming the state of the art, but also adapts to a variety of noise levels and blur kernels. To further analyze the effect of noise and blur kernel, we perform a two-step analysis using the Cramer-Rao bounds. We study how blur kernel and noise influence motion estimation with aliasing signals, how noise affects super resolution with perfect motion, and finally how blur kernel and noise influence super resolution with unknown motion. Our analysis results confirm empirical observations, in particular that an intermediate size blur kernel achieves the optimal image reconstruction results.

370 citations

Proceedings Article•

Convolutional Kernel Networks

Julien Mairal¹, Piotr Koniusz¹, Zaid Harchaoui¹, Cordelia Schmid¹•Institutions (1)

University of Grenoble¹

08 Dec 2014

TL;DR: In this paper, a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel is proposed, which can learn to approximate the kernel feature map on training data.

Abstract: An important goal in visual recognition is to devise image representations that are invariant to particular transformations. In this paper, we address this goal with a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel. Unlike traditional approaches where neural networks are learned either to represent data or for solving a classification task, our network learns to approximate the kernel feature map on training data. Such an approach enjoys several benefits over classical ones. First, by teaching CNNs to be invariant, we obtain simple network architectures that achieve a similar accuracy to more complex ones, while being easy to train and robust to overfitting. Second, we bridge a gap between the neural network literature and kernels, which are natural tools to model invariance. We evaluate our methodology on visual recognition tasks where CNNs have proven to perform well, e.g., digit recognition with the MNIST dataset, and the more challenging CIFAR-10 and STL-10 datasets, where our accuracy is competitive with the state of the art.

317 citations

Posted Content•

Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition

Vadim Lebedev¹,², Yaroslav Ganin¹, Maksim Rakhuba¹,³, Ivan V. Oseledets¹, Victor Lempitsky¹•Institutions (3)

Skolkovo Institute of Science and Technology¹, Yandex², Moscow Institute of Physics and Technology³

19 Dec 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: A simple two-step approach for speeding up convolution layers within large convolutional neural networks based on tensor decomposition and discriminative fine-tuning is proposed, leading to higher obtained CPU speedups at the cost of lower accuracy drops for the smaller of the two networks.

Abstract: We propose a simple two-step approach for speeding up convolution layers within large convolutional neural networks based on tensor decomposition and discriminative fine-tuning. Given a layer, we use non-linear least squares to compute a low-rank CP-decomposition of the 4D convolution kernel tensor into a sum of a small number of rank-one tensors. At the second step, this decomposition is used to replace the original convolutional layer with a sequence of four convolutional layers with small kernels. After such replacement, the entire network is fine-tuned on the training data using the standard backpropagation process. We evaluate this approach on two CNNs and show that it is competitive with previous approaches, leading to higher obtained CPU speedups at the cost of lower accuracy drops for the smaller of the two networks. Thus, for the 36-class character classification CNN, our approach obtains an 8.5x CPU speedup of the whole network with only a minor accuracy drop (1%, from 91% to 90%). For the standard ImageNet architecture (AlexNet), the approach speeds up the second convolution layer by a factor of 4x at the cost of a 1% increase in the overall top-5 classification error.
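
The replacement step described in this abstract can be checked numerically: if a 4D kernel tensor has an exact rank-R CP form, one convolution with it equals a chain of four cheaper convolutions (1x1, vertical, horizontal, 1x1). The sketch below is an illustration under assumptions: the factor names (Kt, Ks, Ky, Kx), tensor sizes, and rank are hypothetical, and the kernel is built from random factors instead of being fitted by non-linear least squares as in the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv_full_kernel(U, K):
    """Direct 'valid' convolution layer. U: (S, H, W) input, K: (T, S, d, d) kernel."""
    d = K.shape[2]
    win = sliding_window_view(U, (d, d), axis=(1, 2))  # (S, H-d+1, W-d+1, d, d)
    return np.einsum("shwij,tsij->thw", win, K)


def conv_cp_sequence(U, Kt, Ks, Ky, Kx):
    """The same layer as four small convolutions built from the CP factors."""
    d = Ky.shape[0]
    z1 = np.einsum("sr,shw->rhw", Ks, U)            # 1x1 conv: S input channels -> R
    win_y = sliding_window_view(z1, d, axis=1)      # (R, H-d+1, W, d)
    z2 = np.einsum("rhwi,ir->rhw", win_y, Ky)       # d x 1 vertical conv, per channel
    win_x = sliding_window_view(z2, d, axis=2)      # (R, H-d+1, W-d+1, d)
    z3 = np.einsum("rhwj,jr->rhw", win_x, Kx)       # 1 x d horizontal conv, per channel
    return np.einsum("tr,rhw->thw", Kt, z3)         # 1x1 conv: R -> T output channels


rng = np.random.default_rng(0)
T, S, d, R = 6, 4, 3, 5                              # hypothetical layer sizes and rank
Kt, Ks = rng.standard_normal((T, R)), rng.standard_normal((S, R))
Ky, Kx = rng.standard_normal((d, R)), rng.standard_normal((d, R))
K = np.einsum("tr,sr,ir,jr->tsij", Kt, Ks, Ky, Kx)   # exactly rank-R kernel tensor

U = rng.standard_normal((S, 10, 10))
direct = conv_full_kernel(U, K)
factored = conv_cp_sequence(U, Kt, Ks, Ky, Kx)

full_params = T * S * d * d        # weights in the original layer
cp_params = R * (T + S + 2 * d)    # weights in the four-layer replacement
```

For a real trained layer the CP fit is only approximate, which is why the abstract follows the replacement with fine-tuning of the whole network.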

300 citations

Proceedings Article•DOI•

Multi-view Super Vector for Action Recognition

Zhuowei Cai, Limin Wang, Xiaojiang Peng, Yu Qiao

23 Jun 2014

TL;DR: This paper proposes a new global representation, Multi-View Super Vector (MVSV), which is composed of relatively independent components derived from a pair of descriptors, and outperforms FV and VLAD with descriptor concatenation or kernel average fusion strategy.

Abstract: Images and videos are often characterized by multiple types of local descriptors such as SIFT, HOG and HOF, each of which describes certain aspects of object feature. Recognition systems benefit from fusing multiple types of these descriptors. Two widely applied fusion pipelines are descriptor concatenation and kernel average. The first one is effective when different descriptors are strongly correlated, while the second one is probably better when descriptors are relatively independent. In practice, however, different descriptors are neither fully independent nor fully correlated, and previous fusion methods may not be satisfying. In this paper, we propose a new global representation, Multi-View Super Vector (MVSV), which is composed of relatively independent components derived from a pair of descriptors. Kernel average is then applied on these components to produce recognition result. To obtain MVSV, we develop a generative mixture model of probabilistic canonical correlation analyzers (M-PCCA), and utilize the hidden factors and gradient vectors of M-PCCA to construct MVSV for video representation. Experiments on video based action recognition tasks show that MVSV achieves promising results, and outperforms FV and VLAD with descriptor concatenation or kernel average fusion strategy.

229 citations

Journal Article•DOI•

Spectral-Spatial Classification of Hyperspectral Image Based on Kernel Extreme Learning Machine

Chen Chen¹, Wei Li, Hongjun Su, Kui Liu•Institutions (1)

University of Texas at Dallas¹

19 Jun 2014-Remote Sensing

TL;DR: This paper proposes to integrate spectral-spatial information for hyperspectral image classification and exploit the benefits of using spatial features for the kernel based ELM (KELM) classifier and demonstrates that the proposed methods outperform the conventional pixel-wise classifiers as well as Gabor-filtering-based support vector machine (SVM) and MH-prediction-based SVM in challenging small training sample size conditions.

Abstract: Extreme learning machine (ELM) is a single-layer feedforward neural network based classifier that has attracted significant attention in computer vision and pattern recognition due to its fast learning speed and strong generalization. In this paper, we propose to integrate spectral-spatial information for hyperspectral image classification and exploit the benefits of using spatial features for the kernel based ELM (KELM) classifier. Specifically, Gabor filtering and multihypothesis (MH) prediction preprocessing are two approaches employed for spatial feature extraction. Gabor features have currently been successfully applied for hyperspectral image analysis due to the ability to represent useful spatial information. MH prediction preprocessing makes use of the spatial piecewise-continuous nature of hyperspectral imagery to integrate spectral and spatial information. The proposed Gabor-filtering-based KELM classifier and MH-prediction-based KELM classifier have been validated on two real hyperspectral datasets. Classification results demonstrate that the proposed methods outperform the conventional pixel-wise classifiers as well as Gabor-filtering-based support vector machine (SVM) and MH-prediction-based SVM in challenging small training sample size conditions.

212 citations

Posted Content•

Fast Convolutional Nets With fbfft: A GPU Performance Evaluation

Nicolas Vasilache¹, Jeff Johnson¹, Michael Mathieu¹, Soumith Chintala¹, Serkan Piantino¹, Yann LeCun¹•Institutions (1)

Facebook¹

24 Dec 2014-arXiv: Learning

TL;DR: In this article, two new Fast Fourier Transform (FFT) convolution implementations are introduced: one based on NVIDIA's cuFFT library, and another based on a Facebook-authored FFT implementation, fbfft.

Abstract: We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units. We introduce two new Fast Fourier Transform convolution implementations: one based on NVIDIA's cuFFT library, and another based on a Facebook authored FFT implementation, fbfft, that provides significant speedups over cuFFT (over 1.5x) for whole CNNs. Both of these convolution implementations are available in open source, and are faster than NVIDIA's cuDNN implementation for many common convolutional layers (up to 23.5x for some synthetic kernel configurations). We discuss different performance regimes of convolutions, comparing areas where straightforward time domain convolutions outperform Fourier frequency domain convolutions. Details on algorithmic applications of NVIDIA GPU hardware specifics in the implementation of fbfft are also provided.
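
The two regimes this abstract compares, time-domain and Fourier-domain convolution, compute the same result. A minimal CPU sketch using NumPy's FFT (not fbfft, cuFFT, or cuDNN; the function names and sizes are illustrative) shows the equivalence for a full 2D linear convolution:

```python
import numpy as np


def conv2d_direct(image, kernel):
    """Full 2D linear convolution by explicit summation in the spatial domain."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H + kh - 1, W + kw - 1))
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap adds a shifted, scaled copy of the image.
            out[i:i + H, j:j + W] += kernel[i, j] * image
    return out


def conv2d_fft(image, kernel):
    """The same convolution as a pointwise product in the frequency domain."""
    H, W = image.shape
    kh, kw = kernel.shape
    shape = (H + kh - 1, W + kw - 1)  # zero-pad so circular convolution equals linear
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    return np.fft.irfft2(spectrum, s=shape)


rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
kernel = rng.standard_normal((5, 5))
spatial = conv2d_direct(image, kernel)
spectral = conv2d_fft(image, kernel)
```

The direct version costs on the order of H·W·kh·kw multiply-adds, while the FFT version costs on the order of N log N for the padded size N regardless of kernel size, which is one reason small kernels tend to favor the time domain and larger ones the frequency domain.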

Proceedings Article•DOI•

Paraprox: pattern-based approximation for data parallel applications

Mehrzad Samadi¹, Davoud Anoushe Jamshidi¹, Janghaeng Lee¹, Scott Mahlke¹•Institutions (1)

University of Michigan¹

24 Feb 2014

TL;DR: This paper proposes a software-only system, Paraprox, for realizing transparent approximation of data-parallel programs that operates on commodity hardware systems and yields an average performance gain of 2.7x on a NVIDIA GTX 560 GPU and 2.5x on an Intel Core i7 quad-core processor.

Abstract: Approximate computing is an approach where reduced accuracy of results is traded off for increased speed, throughput, or both. Loss of accuracy is not permissible in all computing domains, but there are a growing number of data-intensive domains where the output of programs need not be perfectly correct to provide useful results or even noticeable differences to the end user. These soft domains include multimedia processing, machine learning, and data mining/analysis. An important challenge with approximate computing is transparency to insulate both software and hardware developers from the time, cost, and difficulty of using approximation. This paper proposes a software-only system, Paraprox, for realizing transparent approximation of data-parallel programs that operates on commodity hardware systems. Paraprox starts with a data-parallel kernel implemented using OpenCL or CUDA and creates a parameterized approximate kernel that is tuned at runtime to maximize performance subject to a target output quality (TOQ) that is supplied by the user. Approximate kernels are created by recognizing common computation idioms found in data-parallel programs (e.g., Map, Scatter/Gather, Reduction, Scan, Stencil, and Partition) and substituting approximate implementations in their place. Across a set of 13 soft data-parallel applications with at most 10% quality degradation, Paraprox yields an average performance gain of 2.7x on a NVIDIA GTX 560 GPU and 2.5x on an Intel Core i7 quad-core processor compared to accurate execution on each platform.
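
The idea of a parameterized approximate kernel tuned against a target output quality (TOQ) can be illustrated on the Reduction idiom. The sketch below is a hypothetical stand-in, not Paraprox itself: it runs in NumPy instead of OpenCL or CUDA, the sampling-and-rescaling approximation, the quality metric (one minus relative error), and all function names are assumptions, and the tuner compares against the exact result directly, which a real runtime would presumably do only on calibration inputs.

```python
import numpy as np


def exact_sum(x):
    """Accurate reduction."""
    return float(np.sum(x))


def approx_sum(x, stride):
    """Approximate reduction: sum every `stride`-th element, then rescale."""
    sample = x[::stride]
    return float(np.sum(sample) * (len(x) / len(sample)))


def tune_stride(x, target_quality, candidates=(16, 8, 4, 2, 1)):
    """Pick the most aggressive stride whose output quality meets the target."""
    exact = exact_sum(x)
    denom = abs(exact) or 1.0  # guard against division by zero
    for stride in candidates:  # most aggressive (cheapest) setting first
        approx = approx_sum(x, stride)
        quality = 1.0 - abs(approx - exact) / denom
        if quality >= target_quality:
            return stride, approx
    return 1, exact  # fall back to accurate execution


rng = np.random.default_rng(0)
data = rng.uniform(1.0, 2.0, size=10_000)
stride, result = tune_stride(data, target_quality=0.90)  # at most 10% degradation
```

The stride plays the role of the tuning knob: a larger stride touches less data and runs faster, and the tuner backs off toward stride 1 (exact execution) only when the quality target is not met.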

Proceedings Article•DOI•

SAR target recognition based on deep learning

Sizhe Chen¹, Haipeng Wang¹•Institutions (1)

Fudan University¹

01 Oct 2014

TL;DR: This paper attempts to adapt the optical camera-oriented CNN to its microwave counterpart, i.e. synthetic aperture radar (SAR). As a preliminary study, a single layer of convolutional neural network is used to automatically learn features from SAR images.

Abstract: Deep learning algorithms such as convolutional neural networks (CNN) have been successfully applied in computer vision. This paper attempts to adapt the optical camera-oriented CNN to its microwave counterpart, i.e. synthetic aperture radar (SAR). As a preliminary study, a single layer of convolutional neural network is used to automatically learn features from SAR images. Instead of using the classical backpropagation algorithm, the convolution kernel is trained on randomly sampled image patches using an unsupervised sparse auto-encoder. After convolution and pooling, an input SAR image is then transformed into a series of feature maps. These feature maps are then used to train a final softmax classifier. Initial experiments on the MSTAR public data set show that an accuracy of 90.1% can be achieved on a three-target classification task, and an accuracy of 84.7% is achievable on a ten-target classification task.

Journal Article•DOI•

Classification of Alzheimer Disease Based on Structural Magnetic Resonance Imaging by Kernel Support Vector Machine Decision Tree

Yudong Zhang, Shuihua Wang, Zhengchao Dong

01 Jan 2014-Progress in Electromagnetics Research-pier

TL;DR: The results show that the proposed kSVM-DT achieves 80% classification accuracy, better than 74% of the method without kernel, and the PSO exceeds the random selection method in choosing the parameters of the classifier.

Abstract: In this paper we proposed a novel classification system to distinguish among elderly subjects with Alzheimer's disease (AD), mild cognitive impairment (MCI), and normal controls (NC). The method employed the magnetic resonance imaging (MRI) data of 178 subjects consisting of 97 NCs, 57 MCIs, and 24 ADs. First, all these three-dimensional (3D) MRI images were preprocessed with atlas-registered normalization. Then, gray matter images were extracted and the 3D images were under-sampled. Afterwards, principal component analysis was applied for feature extraction. In total, 20 principal components (PC) were extracted from 3D MRI data using the singular value decomposition (SVD) algorithm, and 2 PCs were extracted from additional information (consisting of demographics, clinical examination, and derived anatomic volumes) using alternating least squares (ALS). On the basis of the 22 features, we constructed a kernel support vector machine decision tree (kSVM-DT). The error penalty parameter C and kernel parameter σ were determined by Particle Swarm Optimization (PSO). The weights ω and biases b were still obtained by the quadratic programming method. 5-fold cross validation was employed to obtain the out-of-sample estimate. The results show that the proposed kSVM-DT achieves 80% classification accuracy, better than 74% of the method without kernel. Besides, the PSO exceeds the random selection method in choosing the parameters of the classifier. The computation time to predict a new patient is only 0.022s.

Proceedings Article•DOI•

Matching People across Camera Views using Kernel Canonical Correlation Analysis

Giuseppe Lisanti¹, Iacopo Masi¹, Alberto Del Bimbo¹•Institutions (1)

University of Florence¹

04 Nov 2014

TL;DR: This paper addresses the problem of person re-identification across disjoint cameras by proposing an efficient but robust kernel descriptor to encode the appearance of a person, and improves matching with a learning technique based on Kernel Canonical Correlation Analysis (KCCA).

Abstract: Matching people across views is still an open problem in computer vision and in video surveillance systems. In this paper we address the problem of person re-identification across disjoint cameras by proposing an efficient but robust kernel descriptor to encode the appearance of a person. The matching is then improved by applying a learning technique based on Kernel Canonical Correlation Analysis (KCCA) which finds a common subspace between the proposed descriptors extracted from disjoint cameras, projecting them into a new description space. This common description space is then used to identify a person from one camera to another with a standard nearest-neighbor voting method. We evaluate our approach on two publicly available datasets for re-identification (VIPeR and PRID), demonstrating that our method yields state-of-the-art performance with respect to recent techniques proposed for the re-identification task.

Journal Article•DOI•

Kernel Collaborative Representation With Tikhonov Regularization for Hyperspectral Image Classification

Wei Li¹, Qian Du², Mingming Xiong¹•Institutions (2)

Peking University¹, Mississippi State University²

10 Jun 2014-IEEE Geoscience and Remote Sensing Letters

TL;DR: Experimental results on two hyperspectral datasets show that the proposed kernel collaborative representation with Tikhonov regularization technique outperforms the traditional support vector machines with composite kernels and other state-of-the-art classifiers, such as the kernel sparse representation classifier and the kernel collaborative representation classifier.

Abstract: In this letter, kernel collaborative representation with Tikhonov regularization (KCRT) is proposed for hyperspectral image classification. The original data is projected into a high-dimensional kernel space by using a nonlinear mapping function to improve the class separability. Moreover, spatial information at neighboring locations is incorporated in the kernel space. Experimental results on two hyperspectral datasets show that our proposed technique outperforms the traditional support vector machines with composite kernels and other state-of-the-art classifiers, such as the kernel sparse representation classifier and the kernel collaborative representation classifier.

Journal Article•DOI•

Data-driven realizations of kernel and image representations and their application to fault detection and control system design

Steven X. Ding¹, Ying Yang², Yong Zhang¹, Linlin Li¹•Institutions (2)

University of Duisburg-Essen¹, Peking University²

01 Oct 2014-Automatica

TL;DR: The definitions of the data-driven forms of kernel and image representations are introduced and their identification is studied in the context of a fault-tolerant architecture.

Journal Article•DOI•

A General Framework for Regularized, Similarity-Based Image Restoration

Amin Kheradmand¹, Peyman Milanfar¹•Institutions (1)

University of California, Santa Cruz¹

08 Oct 2014-IEEE Transactions on Image Processing

TL;DR: An iterative graph-based framework for image restoration is developed based on a new definition of the normalized graph Laplacian; the algorithm comprises outer and inner iterations, where in each outer iteration the similarity weights are recomputed using the previous estimate and the updated objective function is minimized using inner conjugate gradient iterations.

Abstract: Any image can be represented as a function defined on a weighted graph, in which the underlying structure of the image is encoded in kernel similarity and associated Laplacian matrices. In this paper, we develop an iterative graph-based framework for image restoration based on a new definition of the normalized graph Laplacian. We propose a cost function, which consists of a new data fidelity term and regularization term derived from the specific definition of the normalized graph Laplacian. The normalizing coefficients used in the definition of the Laplacian and associated regularization term are obtained using fast symmetry preserving matrix balancing. This results in some desired spectral properties for the normalized Laplacian such as being symmetric, positive semidefinite, and returning zero vector when applied to a constant image. Our algorithm comprises outer and inner iterations, where in each outer iteration, the similarity weights are recomputed using the previous estimate and the updated objective function is minimized using inner conjugate gradient iterations. This procedure improves the performance of the algorithm for image deblurring, where we do not have access to a good initial estimate of the underlying image. In addition, the specific form of the cost function allows us to render the spectral analysis for the solutions of the corresponding linear equations. Moreover, the proposed approach is general in the sense that we have shown its effectiveness for different restoration problems, including deblurring, denoising, and sharpening. Experimental results verify the effectiveness of the proposed algorithm on both synthetic and real examples.
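
The three spectral properties listed in this abstract (symmetric, positive semidefinite, zero response to a constant image) follow when the kernel similarity matrix is balanced to be doubly stochastic before forming the Laplacian. The sketch below is an assumption-laden illustration: it uses a plain Sinkhorn iteration followed by symmetrization as a stand-in for the paper's fast symmetry-preserving balancing, a Gaussian similarity on scalar sample values of a 1D signal instead of image patches, and takes the Laplacian as identity minus the balanced matrix, which may differ from the paper's exact definition.

```python
import numpy as np


def gaussian_similarity(signal, bandwidth):
    """Kernel similarity between all pairs of samples (hypothetical 1D stand-in)."""
    diff = signal[:, None] - signal[None, :]
    return np.exp(-(diff ** 2) / (bandwidth ** 2))


def sinkhorn_balance(K, n_iter=500):
    """Scale rows and columns of a positive matrix until all sums are one."""
    r = np.ones(K.shape[0])
    c = np.ones(K.shape[0])
    for _ in range(n_iter):
        r = 1.0 / (K @ c)      # fix row sums
        c = 1.0 / (K.T @ r)    # fix column sums
    W = r[:, None] * K * c[None, :]
    return 0.5 * (W + W.T)     # remove residual asymmetry from finite iterations


def normalized_laplacian(K):
    """Laplacian built from the balanced (doubly stochastic) similarity matrix."""
    return np.eye(K.shape[0]) - sinkhorn_balance(K)


rng = np.random.default_rng(0)
signal = rng.uniform(0.0, 1.0, size=12)
K = gaussian_similarity(signal, bandwidth=0.5)
L = normalized_laplacian(K)
constant_response = L @ np.ones(12)      # close to the zero vector
smallest_eig = np.linalg.eigvalsh(L).min()  # non-negative up to round-off
```

Because the balanced matrix is symmetric with unit row sums and non-negative entries, its eigenvalues lie in [-1, 1], so the Laplacian's lie in [0, 2]; a standard row-normalized Laplacian keeps the zero response to constants but loses symmetry.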

Posted Content•

Convolutional Kernel Networks

Julien Mairal¹, Piotr Koniusz¹, Zaid Harchaoui¹, Cordelia Schmid¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

12 Jun 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes a new type of convolutional neural network (CNN) whose invariance is encoded by a reproducing kernel, and bridges a gap between the neural network literature and kernels, which are natural tools to model invariance.

Journal Article•DOI•

One-class kernel subspace ensemble for medical image classification

Yungang Zhang¹,², Bailing Zhang², Frans Coenen³, Jimin Xiao², Wenjin Lu²•Institutions (3)

Yunnan Normal University¹, Xi'an Jiaotong-Liverpool University², University of Liverpool³

07 Feb 2014-EURASIP Journal on Advances in Signal Processing

TL;DR: The proposed classification scheme obtained promising results on the two medical image sets and was evaluated on the UCI breast cancer dataset (diagnostic), and a competitive result was obtained.

Abstract: Classification of medical images is an important issue in computer-assisted diagnosis. In this paper, a classification scheme based on a one-class kernel principal component analysis (KPCA) model ensemble has been proposed for the classification of medical images. The ensemble consists of one-class KPCA models trained using different image features from each image class, and a proposed product combining rule was used for combining the KPCA models to produce classification confidence scores for assigning an image to each class. The effectiveness of the proposed classification scheme was verified using a breast cancer biopsy image dataset and a 3D optical coherence tomography (OCT) retinal image set. The combination of different image features exploits the complementary strengths of these different feature extractors. The proposed classification scheme obtained promising results on the two medical image sets. The proposed method was also evaluated on the UCI breast cancer dataset (diagnostic), and a competitive result was obtained.

Journal Article•DOI•

Walsh–Hadamard Transform Kernel-Based Feature Vector for Shot Boundary Detection

G. G. Lakshmi Priya¹, S. Domnic•Institutions (1)

VIT University¹

09 Oct 2014-IEEE Transactions on Image Processing

TL;DR: A new SBD method is proposed that uses color, edge, texture, and motion strength as a feature vector, extracted by projecting frames on selected basis vectors of the Walsh–Hadamard transform (WHT) kernel and WHT matrix; the results show that WHT-based features perform better than the other existing methods.

Abstract: Video shot boundary detection (SBD) is the first step of video analysis, summarization, indexing, and retrieval. In the SBD process, videos are segmented into basic units called shots. In this paper, a new SBD method is proposed using color, edge, texture, and motion strength as a vector of features (feature vector). Features are extracted by projecting the frames on selected basis vectors of the Walsh–Hadamard transform (WHT) kernel and WHT matrix. After extracting the features, weights are calculated based on the significance of the features. The weighted features are combined to form a single continuity signal, used as input for the Procedure Based shot transition Identification process (PBI). Using the procedure, shot transitions are classified into abrupt and gradual transitions. Experimental results are examined using the large-scale test sets provided by TRECVID 2007, which evaluated hard cut and gradual transition detection. To evaluate the robustness of the proposed method, a system evaluation is performed. The proposed method yields an F1-score of 97.4% for cut, 78% for gradual, and 96.1% for overall transitions. We have also evaluated the proposed feature vector with a support vector machine classifier. The results show that WHT-based features perform better than the other existing methods. In addition, a few more video sequences are taken from the Openvideo project, and the performance of the proposed method is compared with the recent existing SBD method.
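
The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the block size, the natural (Sylvester) ordering of the Hadamard matrix, and the helper names `hadamard` and `wht_features` are all assumptions made for illustration, and the listing does not say which basis vectors the paper selects.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

def wht_features(block, rows, cols):
    """Project a square block on selected 2-D Walsh-Hadamard basis images.

    Basis image (r, c) is the outer product of Hadamard rows r and c
    (hypothetical selection; natural ordering, not sequency ordering).
    """
    n = block.shape[0]
    h = hadamard(n)
    coeffs = h @ block @ h.T / n  # full 2-D transform with orthonormal scaling
    return coeffs[rows, cols]

# A constant block has energy only in the DC basis image (0, 0).
flat = np.ones((4, 4))
features = wht_features(flat, [0, 1], [0, 1])
```

Here `features` holds the coefficients for basis images (0, 0) and (1, 1); a real feature vector would be built from frames (or frame blocks) of a video rather than from a constant array.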

Journal Article•

Revisiting Bayesian blind deconvolution

David Wipf¹, Haichao Zhang²•Institutions (2)

Microsoft¹, Northwestern Polytechnical University²

01 Jan 2014-Journal of Machine Learning Research

TL;DR: In this paper, the VB methodology is recast as an unconventional MAP problem with a very particular penalty/prior that conjoins the image, blur kernel, and noise level in a principled way.

Abstract: Blind deconvolution involves the estimation of a sharp signal or image given only a blurry observation. Because this problem is fundamentally ill-posed, strong priors on both the sharp image and blur kernel are required to regularize the solution space. While this naturally leads to a standard MAP estimation framework, performance is compromised by unknown trade-off parameter settings, optimization heuristics, and convergence issues stemming from non-convexity and/or poor prior selections. To mitigate some of these problems, a number of authors have recently proposed substituting a variational Bayesian (VB) strategy that marginalizes over the high-dimensional image space leading to better estimates of the blur kernel. However, the underlying cost function now involves both integrals with no closed-form solution and complex, function-valued arguments, thus losing the transparency of MAP. Beyond standard Bayesian-inspired intuitions, it thus remains unclear by exactly what mechanism these methods are able to operate, rendering understanding, improvements and extensions more difficult. To elucidate these issues, we demonstrate that the VB methodology can be recast as an unconventional MAP problem with a very particular penalty/prior that conjoins the image, blur kernel, and noise level in a principled way. This unique penalty has a number of useful characteristics pertaining to relative concavity, local minima avoidance, normalization, and scale-invariance that allow us to rigorously explain the success of VB including its existing implementational heuristics and approximations. It also provides strict criteria for learning the noise level and choosing the optimal image prior that, perhaps counter-intuitively, need not reflect the statistics of natural scenes. In so doing we challenge the prevailing notion of why VB is successful for blind deconvolution while providing a transparent platform for introducing enhancements and extensions. Moreover, the underlying insights carry over to a wide variety of other bilinear models common in the machine learning literature such as independent component analysis, dictionary learning/sparse coding, and non-negative matrix factorization.
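
One reason the problem is ill-posed can be shown with a small sketch, assuming the usual convolutional forward model (observation = blur kernel convolved with sharp signal, plus noise). The 1-D signal, the kernel values, and the helper name `blur` are hypothetical; this illustrates the scale ambiguity of the bilinear model, not the VB method from the paper.

```python
import numpy as np

def blur(x, k, noise=None):
    """Forward model: convolve a sharp 1-D signal x with a blur kernel k."""
    y = np.convolve(x, k, mode="full")
    return y if noise is None else y + noise

x = np.array([0.0, 1.0, 3.0, 1.0, 0.0])  # hypothetical sharp signal
k = np.array([0.25, 0.5, 0.25])          # hypothetical blur kernel

y1 = blur(x, k)
y2 = blur(x / 2.0, 2.0 * k)              # rescaled (signal, kernel) pair
same_observation = np.allclose(y1, y2)
```

Both pairs explain the same blurry observation, so the data alone cannot separate the signal from the kernel; priors or normalization constraints are needed to pick one solution.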

Proceedings Article•DOI•

Scalable kernel fusion for memory-bound GPU applications

Mohamed Wahib, Naoya Maruyama

16 Nov 2014

TL;DR: Results show that using the proposed scalable method for kernel fusion improved the performance of two real-world applications containing tens of kernels by 1.35x and 1.2x.

Abstract: GPU implementations of HPC applications relying on finite difference methods can include tens of kernels that are memory-bound. Kernel fusion can improve performance by reducing data traffic to off-chip memory: kernels that share data arrays are fused into larger kernels, where on-chip cache is used to hold the data reused by instructions originating from different kernels. The main challenges are a) searching for the optimal kernel fusions while constrained by data dependencies and kernels' precedences and b) effectively applying kernel fusion to achieve speedup. This paper introduces a problem definition and proposes a scalable method for searching the space of possible kernel fusions to identify optimal kernel fusions for large problems. The paper also proposes a codeless performance upper-bound projection model to achieve effective fusions. Results show that using the proposed scalable method for kernel fusion improved the performance of two real-world applications containing tens of kernels by 1.35x and 1.2x.

Journal Article•

Recent Progress in Image Deblurring

Ruxin Wang¹, Dacheng Tao¹•Institutions (1)

University of Technology, Sydney¹

24 Sep 2014-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper comprehensively reviews the recent development of image deblurring, including nonblind/blind, spatially invariant/variant deblurring techniques, and provides a holistic understanding and deep insight into image deblurring.

Abstract: This paper comprehensively reviews the recent development of image deblurring, including nonblind/blind, spatially invariant/variant deblurring techniques. Indeed, these techniques share the same objective of inferring a latent sharp image from one or several corresponding blurry images, while the blind deblurring techniques are also required to derive an accurate blur kernel. Considering the critical role of image restoration in modern imaging systems to provide high-quality images under complex environments such as motion, undesirable lighting conditions, and imperfect system components, image deblurring has attracted growing attention in recent years. From the viewpoint of how to handle the ill-posedness which is a crucial issue in deblurring tasks, existing methods can be grouped into five categories: Bayesian inference framework, variational methods, sparse representation-based methods, homography-based modeling, and region-based methods. In spite of achieving a certain level of development, image deblurring, especially the blind case, is limited in its success by complex application conditions which make the blur kernel hard to obtain and be spatially variant. We provide a holistic understanding and deep insight into image deblurring in this review. An analysis of the empirical evidence for representative methods, practical issues, as well as a discussion of promising future directions are also presented.

Journal Article•DOI•

Group Sparse Multiview Patch Alignment Framework With View Consistency for Image Classification

Jie Gui¹, Dacheng Tao², Zhenan Sun¹, Yong Luo³, Xinge You⁴, Yuan Yan Tang⁵•Institutions (5)

Chinese Academy of Sciences¹, University of Technology, Sydney², Peking University³, Huazhong University of Science and Technology⁴, University of Macau⁵

01 Jul 2014-IEEE Transactions on Image Processing

TL;DR: A group sparse multiview patch alignment framework (GSM-PAF) is developed that considers not only the complementary properties of different views, but also view consistency, which models the correlations between all possible combinations of any two kinds of view.

Abstract: No single feature can satisfactorily characterize the semantic concepts of an image. Multiview learning aims to unify different kinds of features to produce a consensual and efficient representation. This paper redefines part optimization in the patch alignment framework (PAF) and develops a group sparse multiview patch alignment framework (GSM-PAF). The new part optimization considers not only the complementary properties of different views, but also view consistency. In particular, view consistency models the correlations between all possible combinations of any two kinds of view. In contrast to conventional dimensionality reduction algorithms that perform feature extraction and feature selection independently, GSM-PAF enjoys joint feature extraction and feature selection by exploiting the $\ell_{2,1}$-norm on the projection matrix to achieve row sparsity, which leads to the simultaneous selection of relevant features and learning transformation, and thus makes the algorithm more discriminative. Experiments on two real-world image data sets demonstrate the effectiveness of GSM-PAF for image classification.
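
The row-sparsity idea can be made concrete with a short sketch of the $\ell_{2,1}$-norm, taken here in its common form as the sum of the Euclidean norms of the rows of a matrix. The matrix values and the function name `l21_norm` are hypothetical and are not drawn from the paper.

```python
import numpy as np

def l21_norm(w):
    """Sum of the Euclidean (l2) norms of the rows of w."""
    return float(np.sum(np.linalg.norm(w, axis=1)))

# Hypothetical projection matrix: one row per input feature.
w = np.array([[3.0, 4.0],
              [0.0, 0.0],   # an all-zero row means this feature is unused
              [0.0, 2.0]])

value = l21_norm(w)                                  # 5 + 0 + 2
selected = np.flatnonzero(np.linalg.norm(w, axis=1) > 0)
```

Penalizing this quantity pushes whole rows toward zero, so the features whose rows survive (`selected`) are chosen at the same time as the projection is learned.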

Journal Article•DOI•

Class-Specific Kernel Fusion of Multiple Descriptors for Face Verification Using Multiscale Binarised Statistical Image Features

Shervin Rahimzadeh Arashloo¹, Josef Kittler²•Institutions (2)

Urmia University¹, University of Surrey²

01 Dec 2014-IEEE Transactions on Information Forensics and Security

TL;DR: A nonlinear binary class-specific kernel discriminant analysis classifier (CS-KDA) based on spectral regression kernel discriminant analysis is proposed, which offers a number of desirable properties such as specificity of the transformation for each subject, computational efficiency, simplicity of training, isolation of the enrolment of each client from others and increased speed in probe testing.

Abstract: This paper addresses face verification in unconstrained settings. For this purpose, first, a nonlinear binary class-specific kernel discriminant analysis classifier (CS-KDA) based on spectral regression kernel discriminant analysis is proposed. By virtue of the two-class formulation, the proposed CS-KDA approach offers a number of desirable properties such as specificity of the transformation for each subject, computational efficiency, simplicity of training, isolation of the enrolment of each client from others and increased speed in probe testing. Using the proposed CS-KDA approach, a regional discriminative face image representation based on a multiscale variant of the binarized statistical image features is proposed next. The proposed component-based representation when coupled with the dense pixel-wise alignments provided by a symmetric MRF matching model reduces the sensitivity to misalignments and pose variations, gauging the similarity more effectively. Finally, the discriminative representation is combined with two other effective image descriptors, namely the multiscale local binary patterns and the multiscale local phase quantization histograms via a kernel fusion approach to further enhance system accuracy. The experimental evaluation of the proposed methodology on challenging databases demonstrates its advantage over other methods.

Journal Article•DOI•

Kernelet: High-Throughput GPU Kernel Executions with Dynamic Slicing and Scheduling

Jianlong Zhong¹, Bingsheng He¹•Institutions (1)

Nanyang Technological University¹

01 Jun 2014-IEEE Transactions on Parallel and Distributed Systems

TL;DR: Kernelet is a runtime system that divides a GPU kernel into multiple sub-kernels (slices); each slice has tunable occupancy to allow co-scheduling with other slices for high GPU utilization.

Abstract: Graphics processors, or GPUs, have recently been widely used as accelerators in shared environments such as clusters and clouds. In such shared environments, many kernels are submitted to GPUs from different users, and throughput is an important metric for performance and total ownership cost. Despite recently improved runtime support for concurrent GPU kernel executions, the GPU can be severely underutilized, resulting in suboptimal throughput. In this paper, we propose Kernelet, a runtime system to improve the throughput of concurrent kernel executions on the GPU. Kernelet embraces transparent memory management and PCI-e data transfer techniques, and dynamic slicing and scheduling techniques for kernel executions. With slicing, Kernelet divides a GPU kernel into multiple sub-kernels (namely slices). Each slice has tunable occupancy to allow co-scheduling with other slices for high GPU utilization. We develop a novel Markov chain-based performance model to guide the scheduling decision. Our experimental results demonstrate up to 31 percent and 23 percent performance improvement on NVIDIA Tesla C2050 and GTX680 GPUs, respectively.

Proceedings Article•DOI•

Real-time scalable cortical computing at 46 giga-synaptic OPS/watt with ~100× speedup in time-to-solution and ~100,000× reduction in energy-to-solution

Andrew S. Cassidy¹, Rodrigo Alvarez-Icaza¹, Filipp Akopyan¹, Jun Sawada¹, John V. Arthur¹, Paul A. Merolla¹, Pallab Datta¹, Marc Gonzalez Tallada¹, Brian Taba¹, Alexander Andreopoulos¹, Arnon Amir¹, Steven K. Esser¹, Jeff Kusnitz¹, Rathinakumar Appuswamy¹, C. Haymes¹, Bernard Brezzo¹, Roger Moussalli¹, Ralph Bellofatto¹, Christian W. Baks¹, Michael Mastro¹, Kai Schleupen¹, Charles Edwin Cox¹, Ken Inoue¹, Steve Millman¹, Nabil Imam², Emmett McQuinn¹, Yutaka Nakamura¹, Ivan Vo¹, Chen Guok¹, Don Nguyen, Scott Lekuch¹, Sameh W. Asaad¹, Daniel Friedman¹, Bryan L. Jackson¹, Myron D. Flickner¹, William P. Risk¹, Rajit Manohar², Dharmendra S. Modha¹•Institutions (2)

IBM¹, Cornell University²

16 Nov 2014

TL;DR: True North is a 4,096 core, 1 million neuron, and 256 million synapse brain-inspired neurosynaptic processor, that consumes 65mW of power running at real-time and delivers performance of 46 Giga-Synaptic OPS/Watt.

Abstract: Drawing on neuroscience, we have developed a parallel, event-driven kernel for neurosynaptic computation, that is efficient with respect to computation, memory, and communication. Building on the previously demonstrated highly optimized software expression of the kernel, here, we demonstrate True North, a co-designed silicon expression of the kernel. True North achieves five orders of magnitude reduction in energy-to-solution and two orders of magnitude speedup in time-to-solution, when running computer vision applications and complex recurrent neural network simulations. Breaking path with the von Neumann architecture, True North is a 4,096 core, 1 million neuron, and 256 million synapse brain-inspired neurosynaptic processor, that consumes 65mW of power running at real-time and delivers performance of 46 Giga-Synaptic OPS/Watt. We demonstrate seamless tiling of True North chips into arrays, forming a foundation for cortex-like scalability. True North's unprecedented time-to-solution, energy-to-solution, size, scalability, and performance combined with the underlying flexibility of the kernel enable a broad range of cognitive applications.

Journal Article•DOI•

Multiple feature kernel hashing for large-scale visual search

Xianglong Liu¹, Junfeng He², Bo Lang¹•Institutions (2)

Beihang University¹, Facebook²

01 Feb 2014-Pattern Recognition

TL;DR: Experimental results show that the proposed multiple feature kernel hashing framework can achieve superior accuracy and efficiency over state-of-the-art methods, and that alternating optimization efficiently learns the hashing functions and the kernel space.

Journal Article•DOI•

A Kernel Clustering Algorithm With Fuzzy Factor: Application to SAR Image Segmentation

Deliang Xiang¹, Tao Tang¹, Canbin Hu¹, Yu Li¹, Yi Su¹•Institutions (1)

National University of Defense Technology¹

01 Jul 2014-IEEE Geoscience and Remote Sensing Letters

TL;DR: A kernel FCM algorithm with pixel intensity and location information is presented for SAR image segmentation; it incorporates into the objective function a weighted fuzzy factor that considers the spatial and intensity distances of all neighboring pixels simultaneously.

Abstract: The presence of multiplicative noise in synthetic aperture radar (SAR) images makes segmentation and classification difficult to handle. Although a fuzzy C-means (FCM) algorithm and its variants (e.g., the FCM_S, the fast generalized FCM, the fuzzy local information C-means, etc.) can achieve satisfactory segmentation results and are robust to Gaussian noise, uniform noise, and salt and pepper noise, they are not adaptable to SAR image speckle. This letter presents a kernel FCM algorithm with pixel intensity and location information for SAR image segmentation. We incorporate a weighted fuzzy factor into the objective function, which considers the spatial and intensity distances of all neighboring pixels simultaneously. In addition, the energy measures of SAR image wavelet decomposition are used to represent the texture information, and a kernel metric is adopted to measure the feature similarity. The weighted fuzzy factor and the kernel distance measure are both robust to speckle. Experimental results on synthetic and real SAR images demonstrate that the proposed algorithm is effective for SAR image segmentation.
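
A kernel metric of the kind mentioned above can be sketched as follows. The abstract does not say which kernel is used, so the Gaussian kernel, the bandwidth `sigma`, and the function names here are assumptions. For any kernel with K(x, x) = 1, the squared distance between two points in the induced feature space is 2(1 - K(x, v)).

```python
import numpy as np

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between feature vectors x and v (assumed kernel)."""
    diff = np.asarray(x, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def kernel_distance_sq(x, v, sigma=1.0):
    """Squared feature-space distance: K(x,x) + K(v,v) - 2K(x,v) = 2(1 - K(x,v))."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))

d_same = kernel_distance_sq([1.0, 2.0], [1.0, 2.0])   # identical vectors
d_far = kernel_distance_sq([0.0, 0.0], [100.0, 0.0])  # very distant vectors
```

Unlike the Euclidean distance, this distance is bounded above by 2, so a single extreme feature value cannot dominate a cluster update, which is one common motivation for kernel metrics in noisy data.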
