
Showing papers on "Multiple kernel learning published in 2015"


Proceedings ArticleDOI
01 Jan 2015
TL;DR: A novel way of extracting features from short texts, based on the activation values of an inner layer of a deep convolutional neural network, is presented and a parallelizable decision-level data fusion method is presented, which is much faster, though slightly less accurate.
Abstract: We present a novel way of extracting features from short texts, based on the activation values of an inner layer of a deep convolutional neural network. We use the extracted features in multimodal sentiment analysis of short video clips representing one sentence each. We use the combined feature vectors of textual, visual, and audio modalities to train a classifier based on multiple kernel learning, which is known to perform well on heterogeneous data. We obtain a 14% performance improvement over the state of the art and present a parallelizable decision-level data fusion method, which is much faster, though slightly less accurate.
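The parallelizable decision-level fusion mentioned above can be sketched as follows; this is a minimal illustration (weighted averaging of per-modality class probabilities), not the authors' implementation, and the modality names are hypothetical:

```python
import numpy as np

def decision_level_fusion(prob_list, weights=None):
    """Fuse per-modality class-probability matrices by (weighted) averaging.

    prob_list: list of (n_samples, n_classes) arrays, one per modality.
    Returns the fused class predictions.
    """
    probs = np.stack(prob_list)                    # (n_modalities, n, c)
    if weights is None:
        weights = np.ones(len(prob_list)) / len(prob_list)
    fused = np.tensordot(weights, probs, axes=1)   # weighted average -> (n, c)
    return fused.argmax(axis=1)

# Toy example: text modality is confident about class 0, audio about class 1.
text_p  = np.array([[0.9, 0.1], [0.4, 0.6]])
audio_p = np.array([[0.6, 0.4], [0.2, 0.8]])
print(decision_level_fusion([text_p, audio_p]))    # -> [0 1]
```

Because each modality's classifier can be trained and evaluated independently, this style of fusion parallelizes trivially, which is the speed advantage the abstract alludes to.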

449 citations


Journal ArticleDOI
TL;DR: An important characteristic of the presented approach is that it does not require any regularization parameters to control the weights of considered features so that different types of features can be efficiently exploited and integrated in a collaborative and flexible way.
Abstract: Hyperspectral image classification has been an active topic of research in recent years. In the past, many different types of features have been extracted (using both linear and nonlinear strategies) for classification problems. On the one hand, some approaches have exploited the original spectral information or other features linearly derived from such information in order to have classes which are linearly separable. On the other hand, other techniques have exploited features obtained through nonlinear transformations intended to reduce data dimensionality, to better model the inherent nonlinearity of the original data (e.g., kernels) or to adequately exploit the spatial information contained in the scene (e.g., using morphological analysis). Special attention has been given to techniques able to exploit a single kind of feature, such as composite kernel learning or multiple kernel learning, developed in order to deal with multiple kernels. However, few approaches have been designed to integrate multiple types of features extracted from both linear and nonlinear transformations. In this paper, we develop a new framework for the classification of hyperspectral scenes that pursues the combination of multiple features. The ultimate goal of the proposed framework is to be able to cope with linear and nonlinear class boundaries present in the data, thus following the two main mixing models considered for hyperspectral data interpretation. An important characteristic of the presented approach is that it does not require any regularization parameters to control the weights of considered features so that different types of features can be efficiently exploited and integrated in a collaborative and flexible way.
Our experimental results, conducted using a variety of input features and hyperspectral scenes, indicate that the proposed framework for multiple feature learning provides state-of-the-art classification results without significantly increasing computational complexity.

299 citations


Journal ArticleDOI
TL;DR: A general learning framework, termed multiple kernel extreme learning machines (MK-ELM), is proposed to address the lack of a general ELM framework for integrating multiple heterogeneous data sources for classification; it can achieve comparable or even better classification performance than state-of-the-art MKL algorithms while incurring much less computational cost.

160 citations


Journal ArticleDOI
TL;DR: The proposed method is compared with other baselines and three state-of-the-art MKL methods, showing that the approach is often superior; it is also shown empirically that the advantage of the proposed method becomes even clearer when noise features are added.

159 citations


Journal ArticleDOI
TL;DR: Current multiple-kernel-learning approaches for dimensionality reduction are applied and extended, and it is shown that one can even use several kernels per data type, thereby relieving the user of having to choose the best kernel functions and kernel parameters for each data type beforehand.
Abstract: Motivation: Despite ongoing cancer research, available therapies are still limited in quantity and effectiveness, and making treatment decisions for individual patients remains a hard problem. Established subtypes, which help guide these decisions, are mainly based on individual data types. However, the analysis of multidimensional patient data involving the measurements of various molecular features could reveal intrinsic characteristics of the tumor. Large-scale projects accumulate this kind of data for various cancer types, but we still lack the computational methods to reliably integrate this information in a meaningful manner. Therefore, we apply and extend current multiple-kernel-learning approaches for dimensionality reduction. On the one hand, we add a regularization term to avoid overfitting during the optimization procedure, and on the other hand, we show that one can even use several kernels per data type and thereby relieve the user of having to choose the best kernel functions and kernel parameters for each data type beforehand. Results: We have identified biologically meaningful subgroups for five different cancer types. Survival analysis has revealed significant differences between the survival times of the identified subtypes, with P-values comparable to or even better than those of state-of-the-art methods. Moreover, our resulting subtypes reflect combined patterns from the different data sources, and we demonstrate that input kernel matrices carrying little information have less impact on the integrated kernel matrix. Our subtypes show different responses to specific therapies, which could eventually assist in treatment decision making. Availability and implementation: An executable is available upon request. Contact: ed.gpm.fni-ipm@aron or ed.gpm.fni-ipm@refiefpn

148 citations


Journal ArticleDOI
TL;DR: It is observed that while the direction of returns is not predictable using either text or returns, their size is, with text features producing significantly better performance than historical returns alone.
Abstract: We show how text from news articles can be used to predict intraday price movements of financial assets using support vector machines. Multiple kernel learning is used to combine equity returns with text as predictive features to increase classification performance and we develop an analytic center cutting plane method to solve the kernel learning problem efficiently. We observe that while the direction of returns is not predictable using either text or returns, their size is, with text features producing significantly better performance than historical returns alone.
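The MKL step above, combining a text kernel and a returns kernel into one predictive kernel, amounts to a convex combination of base kernel matrices. A minimal sketch with illustrative feature matrices (the analytic center cutting plane method that actually learns the weights is omitted):

```python
import numpy as np

def combine_kernels(kernels, weights):
    """MKL combined kernel K = sum_m d_m K_m with d_m >= 0, sum_m d_m = 1.

    A convex combination of PSD base kernels is itself PSD, so the result
    can be plugged into any standard SVM solver.
    """
    d = np.asarray(weights, dtype=float)
    d = d / d.sum()
    return sum(dm * Km for dm, Km in zip(d, kernels))

# Hypothetical example: one linear kernel from text features, one from returns.
text_feats   = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
return_feats = np.array([[0.2], [0.1], [-0.3]])
K_text    = text_feats @ text_feats.T
K_returns = return_feats @ return_feats.T
K = combine_kernels([K_text, K_returns], [0.7, 0.3])
```

In the paper's setting, the weights themselves are optimized jointly with the SVM; here they are fixed purely for illustration.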

135 citations


Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors explored and exploited preference image pairs (PIPs), such as "the quality of image $I_a$ is better than that of image $I_b$", for training a robust blind image quality assessment (BIQA) model.
Abstract: Blind image quality assessment (BIQA) aims to predict perceptual image quality scores without access to reference images. State-of-the-art BIQA methods typically require subjects to score a large number of images to train a robust model. However, subjective quality scores are imprecise, biased, and inconsistent, and it is challenging to obtain a large-scale database, or to extend existing databases, because of the inconvenience of collecting images, training the subjects, conducting subjective experiments, and realigning human quality evaluations. To combat these limitations, this paper explores and exploits preference image pairs (PIPs) such as "the quality of image $I_a$ is better than that of image $I_b$" for training a robust BIQA model. The preference label, representing the relative quality of two images, is generally precise and consistent, and is not sensitive to image content, distortion type, or subject identity; such PIPs can be generated at a very low cost. The proposed BIQA method is one of learning to rank. We first formulate the problem of learning the mapping from the image features to the preference label as one of classification. In particular, we investigate the utilization of a multiple kernel learning algorithm based on group lasso to provide a solution. A simple but effective strategy to estimate perceptual image quality scores is then presented. Experiments show that the proposed BIQA method is highly effective and achieves a performance comparable with that of state-of-the-art BIQA algorithms. Moreover, the proposed method can be easily extended to new distortion categories.
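The reduction from preference pairs to binary classification described above can be sketched as follows; feature extraction and the group-lasso MKL solver are out of scope here, and the helper name is our own:

```python
import numpy as np

def make_pairwise_dataset(features, prefs):
    """Turn preference pairs (a preferred over b) into a binary classification set.

    features: (n_images, d) feature matrix.
    prefs: list of (a, b) index pairs meaning quality(a) > quality(b).
    Each pair yields two training examples: f[a]-f[b] -> +1 and f[b]-f[a] -> -1.
    """
    X, y = [], []
    for a, b in prefs:
        X.append(features[a] - features[b]); y.append(+1)
        X.append(features[b] - features[a]); y.append(-1)
    return np.array(X), np.array(y)
```

Any binary classifier trained on (X, y) then induces a ranking function over images, which is the "learning to rank" formulation the abstract describes.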

128 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed method performs well on the classification of insect species and outperforms state-of-the-art methods for generic insect categorization.

107 citations


Journal ArticleDOI
TL;DR: This work proposes a Multiple Kernel Learning (MKL) framework for sketch recognition, fusing several features common to sketches, and investigates the use of attributes as a high-level feature for sketches and shows how this complements low-level features for improving recognition performance under the MKL framework.

95 citations


Journal ArticleDOI
TL;DR: The novel Multiple Adaptive Reduced Kernel Extreme Learning Machine (MARK-ELM) is introduced, which combines Multiple Kernel Boosting and Multiclass KELM for network intrusion detection to improve detection efficacy on data that contains instances of multiple classes of attacks.
Abstract: Apply Multiple Kernel Boosting and Multiclass KELM to Network Intrusion Detection. Tested approach on several machine learning datasets and the KDD Cup 99 dataset. Utilized Fractional Polynomial Kernels for the Network ID problem for the first time. Requires no feature selection, minimal pre-processing and works on imbalanced data. Achieves superior detection rates and lower false alarm rates than other approaches. Detection of cyber-based attacks on computer networks continues to be a relevant and challenging area of research. Daily reports of incidents appear in public media including major exfiltrations of data for the purposes of stealing identities, credit card numbers, and intellectual property as well as to take control of network resources. Methods used by attackers constantly change in order to defeat techniques employed by information technology (IT) teams intended to discover or block intrusions. "Zero Day" attacks whose "signatures" are not yet in IT databases are continually being uncovered. Machine learning approaches have been widely used to increase the effectiveness of intrusion detection platforms. While some machine learning techniques are effective at detecting certain types of attacks, there are no known methods that can be applied universally and achieve consistent results for multiple attack types. The focus of our research is the development of a framework that combines the outputs of multiple learners in order to improve the efficacy of network intrusion detection on data that contains instances of multiple classes of attacks. We have chosen the Extreme Learning Machine (ELM) as the core learning algorithm due to recent research that suggests that ELMs are straightforward to implement, computationally efficient and have excellent learning performance characteristics on par with the Support Vector Machine (SVM), one of the most widely used and best performing machine learning platforms (Liu, Gao, & Li, 2012).
We introduce the novel Multiple Adaptive Reduced Kernel Extreme Learning Machine (MARK-ELM) which combines Multiple Kernel Boosting (Xia & Hoi, 2013) with the Multiple Classification Reduced Kernel ELM (Deng, Zheng, & Zhang, 2013). We tested this approach on several machine learning datasets as well as the KDD Cup 99 (Hettich & Bay, 1999) intrusion detection dataset. Our results indicate that MARK-ELM works well for the majority of University of California, Irvine (UCI) Machine Learning Repository small datasets and is scalable for larger datasets. For UCI datasets we achieved performance similar to the MKBoost Support Vector Machine (SVM) approach. In our experiments we demonstrate that MARK-ELM achieves superior detection rates and much lower false alarm rates than other approaches on intrusion detection data.
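The core KELM step underlying MARK-ELM has a closed-form solution. A minimal single-kernel sketch (toy data, not the MARK-ELM ensemble; `C` is the ridge regularization parameter):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian RBF kernel matrix between row-sample matrices A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_elm_train(K, T, C=10.0):
    """Closed-form kernel ELM output weights (Huang et al.):
    beta = (I/C + K)^{-1} T, where T holds one-hot class targets."""
    n = K.shape[0]
    return np.linalg.solve(np.eye(n) / C + K, T)

# Toy two-cluster problem (illustrative data, not from the paper).
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]])
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
beta = kernel_elm_train(rbf_kernel(X, X), T)
pred = (rbf_kernel(X, X) @ beta).argmax(axis=1)   # -> [0 0 1 1]
```

Because training reduces to one linear solve, the computational-efficiency claim for ELMs in the abstract follows directly; MARK-ELM boosts many such learners over multiple kernels.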

89 citations


Journal ArticleDOI
TL;DR: The basic idea of kernel alignment and its theoretical properties, as well as the extensions and improvements for specific learning problems, are introduced and the typical applications, including kernel parameter tuning, multiple kernel learning, spectral kernel learning and feature selection and extraction are reviewed.
Abstract: The success of kernel methods is very much dependent on the choice of kernel. Kernel design and learning a kernel from the data require evaluation measures to assess the quality of the kernel. In recent years, the notion of kernel alignment, which measures the degree of agreement between a kernel and a learning task, is widely used for kernel selection due to its effectiveness and low computational complexity. In this paper, we present an overview of the research progress of kernel alignment and its applications. We introduce the basic idea of kernel alignment and its theoretical properties, as well as the extensions and improvements for specific learning problems. The typical applications, including kernel parameter tuning, multiple kernel learning, spectral kernel learning and feature selection and extraction, are reviewed in the context of classification framework. The relationship between kernel alignment and other evaluation measures is also explored. Finally, concluding remarks and future directions are presented.
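The basic alignment quantities reviewed above are straightforward to compute. A minimal sketch of empirical kernel alignment and its centered variant against the ideal target kernel yy^T:

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Empirical alignment A(K1, K2) = <K1, K2>_F / (||K1||_F ||K2||_F)."""
    num = np.sum(K1 * K2)                        # Frobenius inner product
    return num / (np.linalg.norm(K1) * np.linalg.norm(K2))

def centered_alignment(K, y):
    """Alignment between a centered kernel and the target kernel y y^T."""
    n = len(y)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    Kc = H @ K @ H
    return kernel_alignment(Kc, np.outer(y, y))
```

Alignment of a kernel with itself is 1 by construction, and higher alignment with yy^T indicates a kernel better matched to the labels, which is why it serves as a cheap kernel-selection criterion.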

Journal ArticleDOI
TL;DR: This study presented a machine learning approach, named PredcircRNA, focused on distinguishing circularRNA from other lncRNAs using multiple kernel learning, and showed that the proposed method can classify circularRNA from other types of lncRNAs with an accuracy of 0.778.
Abstract: Recently circular RNA (circularRNA) has been discovered as an increasingly important type of long non-coding RNA (lncRNA), playing an important role in gene regulation, such as functioning as miRNA sponges. So it is very promising to identify circularRNA transcripts from de novo assembled transcripts obtained by high-throughput sequencing, such as RNA-seq data. In this study, we present a machine learning approach, named PredcircRNA, focused on distinguishing circularRNA from other lncRNAs using multiple kernel learning. First, we extracted different sources of discriminative features, including graph features, conservation information and sequence compositions, ALU and tandem repeats, SNP densities and open reading frames (ORFs) from transcripts. Second, to better integrate features from different sources, we proposed a computational approach based on a multiple kernel learning framework to fuse those heterogeneous features. Our preliminary 5-fold cross-validation result showed that our proposed method can classify circularRNA from other types of lncRNAs with an accuracy of 0.778, sensitivity of 0.781, specificity of 0.770, precision of 0.784 and MCC of 0.554 on our constructed gold-standard dataset. Our feature importance analysis based on Random Forest illustrated some discriminative features, such as conservation features and a GTAG sequence motif. Our PredcircRNA tool is available for download at https://github.com/xypan1232/PredcircRNA.

Journal ArticleDOI
01 May 2015
TL;DR: A novel framework for person-independent expression recognition by combining multiple types of facial features via multiple kernel learning (MKL) in multiclass support vector machines (SVM) that outperforms the state-of-the-art methods and the SimpleMKL-based multiclass-SVM for facial expression recognition.
Abstract: Automatic recognition of facial expressions is an interesting and challenging research topic in the field of pattern recognition due to applications such as human-machine interface design and developmental psychology. Designing classifiers for facial expression recognition with high reliability is a vital step in this research. This paper presents a novel framework for person-independent expression recognition by combining multiple types of facial features via multiple kernel learning (MKL) in multiclass support vector machines (SVM). Existing MKL-based approaches jointly learn the same kernel weights with $l_1$-norm constraint for all binary classifiers, whereas our framework learns one kernel weight vector per binary classifier in the multiclass-SVM with $l_p$-norm constraints ($p \ge 1$), which considers both sparse and non-sparse kernel combinations within MKL. We studied the effect of the $l_p$-norm MKL algorithm for learning the kernel weights and empirically evaluated the recognition results of six basic facial expressions and neutral faces with respect to the value of $p$. In our experiments, we combined two popular facial feature representations, histogram of oriented gradient and local binary pattern histogram, with two kernel functions, the heavy-tailed radial basis function and the polynomial function. Our experimental results on the CK+, MMI and GEMEP-FERA face databases as well as our theoretical justification show that this framework outperforms the state-of-the-art methods and the SimpleMKL-based multiclass-SVM for facial expression recognition.
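The lp-norm constraint on the kernel weights has a well-known closed-form update in wrapper-style MKL solvers (Kloft et al.); a sketch, assuming the per-kernel block norms ||w_m|| are supplied by the inner SVM solver:

```python
import numpy as np

def lp_norm_kernel_weights(block_norms, p=2.0):
    """Closed-form l_p-norm MKL weight update (Kloft et al. style).

    block_norms: ||w_m|| for each kernel block; p >= 1 controls sparsity
    (p = 1 yields sparse weights, larger p spreads weight over kernels).
    Returns weights d with ||d||_p = 1.
    """
    t = np.asarray(block_norms, dtype=float)
    d = t ** (2.0 / (p + 1.0))
    return d / np.linalg.norm(d, ord=p)
```

Alternating this update with an SVM solve on the combined kernel is the standard wrapper scheme; kernels whose blocks carry more of the decision function receive larger weights.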

Journal ArticleDOI
Yanfeng Gu1, Qingwang Wang1, Hong Wang2, Di You3, Ye Zhang1 
TL;DR: The proposed algorithms, especially the KNMF-based MKL, achieve outstanding performance for hyperspectral image classification with few labeled samples when compared with several state-of-the-art algorithms.
Abstract: In this paper, a novel multiple kernel learning (MKL) algorithm is proposed for the classification of hyperspectral images. The proposed MKL algorithm adopts a two-step strategy to learn a multiple kernel machine. In the first step, unsupervised learning is carried out to learn a combined kernel from the predefined base kernels. In our algorithms, low-rank nonnegative matrix factorization (NMF) is used to carry out the unsupervised learning and learn an optimal combined kernel. Furthermore, the kernel NMF (KNMF) is introduced to substitute NMF for enhancing the ability of the unsupervised learning with the predefined base kernels. In the second step, the optimal kernel is embedded into the standard optimization routine of support vector machine (SVM). In addition, we address a major challenge in hyperspectral data classification, i.e., using very few labeled samples in a high-dimensional space. Experiments are conducted on three real hyperspectral datasets, and the experimental results show that the proposed algorithms, especially the KNMF-based MKL, achieve outstanding performance for hyperspectral image classification with few labeled samples when compared with several state-of-the-art algorithms.
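The first (unsupervised) step, learning a combined kernel via low-rank NMF, can be sketched with a symmetric NMF on the averaged base kernels; this is an illustrative simplification of the paper's method, assuming entrywise-nonnegative base kernels:

```python
import numpy as np

def symnmf_combined_kernel(base_kernels, rank=2, iters=300, seed=0):
    """Unsupervised sketch: average the (entrywise nonnegative) base kernels,
    then approximate the result with a low-rank symmetric NMF K ~ W W^T via
    damped multiplicative updates. The factorized kernel W W^T would then be
    handed to a standard SVM in the second step."""
    K = sum(base_kernels) / len(base_kernels)
    rng = np.random.default_rng(seed)
    W = rng.random((K.shape[0], rank))
    for _ in range(iters):
        W *= 0.5 + 0.5 * (K @ W) / np.maximum(W @ (W.T @ W), 1e-12)
    return W @ W.T

# Illustrative base kernels built from a tiny nonnegative feature matrix.
F = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]])
K_lin = F @ F.T
K_rbf = np.exp(-((F[:, None, :] - F[None, :, :]) ** 2).sum(-1))
K_hat = symnmf_combined_kernel([K_lin, K_rbf], rank=2)
```

The damped update keeps W nonnegative, so the learned kernel stays symmetric and entrywise nonnegative while being restricted to low rank.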

Journal ArticleDOI
TL;DR: This study demonstrates that the evaluation criteria used to examine the effectiveness of a financial market price forecasting method should be the profit and profit-risk ratio, rather than errors in prediction.
Abstract: Our proposed prediction and learning method is a hybrid referred to as MKL-GA, which combines multiple kernel learning (MKL) for regression (MKR) and a genetic algorithm (GA) to construct the trading rules. In this study, we demonstrate that the evaluation criteria used to examine the effectiveness of a financial market price forecasting method should be the profit and profit-risk ratio, rather than errors in prediction. Thus, it is necessary to use a price prediction method and a trading rules learning method. We tested the proposed method on the foreign exchange market for the USD/JPY currency pair, where the features used for prediction were extracted from the trading history of the three main currency pairs with three different short-term horizons. MKR is essential for utilizing the information contained in many of the features derived from different information sources and for various representations of the same information source. The GA is essential for generating trading rules, which are described using a mixture of discrete structures and continuous parameters. First, the MKR predicts the change in the exchange rate based on technical indicators such as the moving average convergence and divergence of the three currency pairs. Next, the GA generates a trading rule by combining the results of the MKR with several commonly used overbought/oversold technical indicators. The experimental results show that the proposed hybrid method outperforms other baseline methods in terms of the returns and return-risk ratio. In addition, the kernel weights employed for different currency pairs and the different time horizons used in the MKR step, as well as the trading strategy generated in the GA step, should be beneficial during actual trading.

Journal ArticleDOI
TL;DR: Extensive experiments on a variety of real-world datasets show that, compared with a number of well-known related techniques, the proposed approach results in accurate and fast classification.

Proceedings ArticleDOI
21 Jul 2015
TL;DR: The capability of the proposed feature representation method in outperforming the state-of-the-art monogenic signal approach to solving the micro-expression recognition problem is demonstrated.
Abstract: A monogenic signal is a two-dimensional analytical signal that provides the local information of magnitude, phase, and orientation. While it has been applied on the field of face and expression recognition [1], [2], [3], there are no known usages for subtle facial micro-expressions. In this paper, we propose a feature representation method which succinctly captures these three low-level components at multiple scales. Riesz wavelet transform is employed to obtain multi-scale monogenic wavelets, which are formulated by quaternion representation. Instead of summing up the multi-scale monogenic representations, we consider all monogenic representations across multiple scales as individual features. For classification, two schemes were applied to integrate these multiple feature representations: a fusion-based method which combines the features efficiently and discriminatively using the ultra-fast, optimized Multiple Kernel Learning (UFO-MKL) algorithm; and a concatenation-based method where the features are combined into a single feature vector and classified by a linear SVM. Experiments carried out on a recent spontaneous micro-expression database demonstrated the capability of the proposed method in outperforming the state-of-the-art monogenic signal approach to solving the micro-expression recognition problem.


Proceedings ArticleDOI
10 Dec 2015
TL;DR: This paper presents a method for any-scenario multi-class weather classification based on multiple weather features and multiple kernel learning, and shows that the proposed method can efficiently recognize weather on the MWI dataset.
Abstract: Multi-class weather classification from single images is a fundamental operation in many outdoor computer vision applications. However, it remains difficult, and only limited work has addressed the difficulty. Moreover, existing methods are based on fixed scenes. In this paper we present a method for any-scenario multi-class weather classification based on multiple weather features and multiple kernel learning. Our approach extracts multiple weather features and processes them appropriately. By combining these features into high-dimensional vectors, we utilize multiple kernel learning to learn an adaptive classifier. We collect an outdoor image set that contains 20K images, called the MWI (Multi-class Weather Image) set. Experimental results show that the proposed method can efficiently recognize weather on the MWI dataset.

Proceedings Article
Peng Zhou1, Liang Du1, Lei Shi1, Hanmo Wang1, Yi-Dong Shen1 
25 Jul 2015
TL;DR: This paper proposes a novel method for learning a robust yet low-rank kernel for clustering tasks, observing that the noises of each kernel have specific structures, so it can make full use of them to clean multiple input kernels and then aggregate them into a robust, low- rank consensus kernel.
Abstract: Kernel-based methods, such as kernel k-means and kernel PCA, have been widely used in machine learning tasks. The performance of these methods critically depends on the selection of kernel functions; however, the challenge is that we usually do not know in advance which kernels are suitable for the given data and task. This has led to research on multiple kernel learning, i.e., learning a consensus kernel from multiple candidate kernels. Existing multiple kernel learning methods have difficulty in dealing with noise. In this paper, we propose a novel method for learning a robust yet low-rank kernel for clustering tasks. We observe that the noises of each kernel have specific structures, so we can make full use of them to clean multiple input kernels and then aggregate them into a robust, low-rank consensus kernel. The underlying optimization problem is hard to solve, and we show that it can be solved via alternating minimization, whose convergence is theoretically guaranteed. Experimental results on several benchmark data sets further demonstrate the effectiveness of our method.

Proceedings ArticleDOI
19 Apr 2015
TL;DR: This paper redefines multiple kernels using a deep architecture, where a global kernel is learned as a multi-layered linear combination of activation functions, each of which involves a combination of several elementary or intermediate functions on multiple features.
Abstract: It is commonly agreed that the success of support vector machines (SVMs) is highly dependent on the choice of particular similarity functions referred to as kernels. The latter are usually handcrafted or designed using appropriate optimization schemes. Multiple kernel learning (MKL) is one possible scheme that designs kernels as sparse or convex linear combinations of existing elementary functions. However, this results in shallow kernels, which are powerless to capture the right similarity between data, especially when the content of these data is highly semantic. In this paper, we redefine multiple kernels using a deep architecture. In this new formulation, a global kernel is learned as a multi-layered linear combination of activation functions, each of which involves a combination of several elementary or intermediate functions on multiple features. We propose three different settings to learn the weights of these kernel combinations: supervised, unsupervised and semi-supervised. When plugged into SVMs, the resulting deep multiple kernels show a gain, compared to shallow kernels, on the challenging task of image annotation using the ImageCLEF benchmark.
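The multi-layered kernel combination can be sketched as follows; the weights here are fixed for illustration rather than learned, and the entrywise exponential stands in for the activation function (it preserves positive semi-definiteness for nonnegative combination weights):

```python
import numpy as np

def deep_multiple_kernel(base_kernels, W1, w2):
    """Two-layer deep-MKL sketch. Layer 1 builds intermediate kernels as an
    entrywise-exponential activation of nonnegative linear combinations of
    the base kernels; layer 2 combines the intermediate kernels linearly."""
    hidden = [np.exp(sum(w * K for w, K in zip(row, base_kernels)))
              for row in W1]
    return sum(u * H for u, H in zip(w2, hidden))

# Two toy base kernels; W1 has one row per intermediate kernel.
K1 = np.array([[1.0, 0.5], [0.5, 1.0]])
K2 = np.eye(2)
K_deep = deep_multiple_kernel([K1, K2], W1=[[0.5, 0.5], [1.0, 0.0]], w2=[0.6, 0.4])
```

In the paper, the weights of every layer are optimized (supervised, unsupervised, or semi-supervised); the nesting of activations over kernel combinations is what makes the resulting kernel "deep" rather than a single shallow sum.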

Journal ArticleDOI
TL;DR: In this article, a generalized adaptive $\ell_p$-norm multiple kernel learning (GA-MKL) was proposed to learn a robust classifier based on multiple base kernels constructed from the new image features and multiple sets of prelearned classifiers from other classes.
Abstract: We present a framework for image classification that extends beyond the window sampling of fixed spatial pyramids and is supported by a new learning algorithm. Based on the observation that fixed spatial pyramids sample a rather limited subset of the possible image windows, we propose a method that accounts for a comprehensive set of windows densely sampled over location, size, and aspect ratio. A concise high-level image feature is derived to effectively deal with this large set of windows, and this higher level of abstraction offers both efficient handling of the dense samples and reduced sensitivity to misalignment. In addition to dense window sampling, we introduce generalized adaptive $\ell_p$-norm multiple kernel learning (GA-MKL) to learn a robust classifier based on multiple base kernels constructed from the new image features and multiple sets of prelearned classifiers from other classes. With GA-MKL, multiple levels of image features are effectively fused, and information is shared among different classifiers. Extensive evaluation on benchmark datasets for object recognition (Caltech256 and Caltech101) and scene recognition (15Scenes) demonstrate that the proposed method outperforms the state-of-the-art under a broad range of settings.

Journal ArticleDOI
TL;DR: The proposed ProMK iteratively alternates between learning optimal kernel weights and reducing the empirical loss of the multi-label classifier for each of the labels simultaneously, and performs better than previously proposed protein function prediction approaches that integrate multiple data sources and multi-label multiple kernel learning methods.
Abstract: High-throughput experimental techniques provide a wide variety of heterogeneous proteomic data sources. To exploit the information spread across multiple sources for protein function prediction, these data sources are transformed into kernels and then integrated into a composite kernel. Several methods first optimize the weights on these kernels to produce a composite kernel, and then train a classifier on the composite kernel. As such, these approaches result in an optimal composite kernel, but not necessarily in an optimal classifier. On the other hand, some approaches optimize the loss of binary classifiers and learn weights for the different kernels iteratively. For multi-class or multi-label data, these methods have to solve the problem of optimizing weights on these kernels for each of the labels, which is computationally expensive and ignores the correlation among labels. In this paper, we propose a method called Predicting Protein Function using Multiple Kernels (ProMK). ProMK iteratively alternates between learning optimal kernel weights and reducing the empirical loss of the multi-label classifier for each of the labels simultaneously. ProMK can integrate kernels selectively and downgrade the weights on noisy kernels. We investigate the performance of ProMK on several publicly available protein function prediction benchmarks and synthetic datasets. We show that the proposed approach performs better than previously proposed protein function prediction approaches that integrate multiple data sources and multi-label multiple kernel learning methods. The code of our proposed method is available at https://sites.google.com/site/guoxian85/promk.

Journal ArticleDOI
08 Jan 2015
TL;DR: An evaluation of a number of feature selection techniques for classification in a biomedical image texture dataset (2-DE gel images) finds that the best technique is SVM-RFE, with an AUROC score of (95.88 ± 0.39 %), but this method is not significantly better than RFE-TREE, RFE-RF and grouped MKL, whilst MKL uses a lower number of features, increasing the interpretability of the results.
Abstract: The interpretation of the results in a classification problem can be enhanced, especially in image texture analysis problems, by feature selection techniques, revealing which features contribute most to the classification performance. This paper presents an evaluation of a number of feature selection techniques for classification in a biomedical image texture dataset (2-DE gel images), with the aim of studying their performance and the stability in the selection of the features. We analyse three different techniques: subgroup-based multiple kernel learning (MKL), which can perform feature selection by down-weighting or eliminating subsets of features that share similar characteristics, and two conventional feature selection techniques: recursive feature elimination (RFE), with different classifiers (naive Bayes, support vector machines, bagged trees, random forest and linear discriminant analysis), and a genetic algorithm-based approach with an SVM as decision function. The different classifiers were compared using a ten times tenfold cross-validation model, and the best technique found is SVM-RFE, with an AUROC score of (95.88 ± 0.39 %). However, this method is not significantly better than RFE-TREE, RFE-RF and grouped MKL, whilst MKL uses a lower number of features, increasing the interpretability of the results. MKL always selects the same features, related to wavelet-based textures, while RFE methods focus especially on co-occurrence matrix-based features, but with high instability in the number of features selected.
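The RFE loop evaluated above can be sketched in a few lines; this minimal version uses a least-squares fit as a stand-in for the linear SVM of SVM-RFE, and the data, function names, and feature counts are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def rfe(X, y, n_keep, fit=None):
    """Recursive feature elimination: repeatedly fit a linear model
    and drop the feature with the smallest absolute weight.

    `fit` returns a weight vector; by default a least-squares fit
    stands in for the linear SVM used in SVM-RFE.
    """
    if fit is None:
        fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        w = fit(X[:, remaining], y)
        worst = int(np.argmin(np.abs(w)))
        remaining.pop(worst)              # eliminate the weakest feature
    return remaining

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 2] + 0.5 * rng.normal(size=100)   # only feature 2 matters
kept = rfe(X, y, n_keep=1)
```

The stability issue the paper measures shows up here directly: rerunning the loop on resampled data can change which features survive, which is why the authors report selection stability alongside AUROC.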

Journal ArticleDOI
Jun Shi, Xiao Liu, Yan Li, Qi Zhang, Yingjie Li, Shihui Ying
TL;DR: The two-stage multi-view learning based sleep staging framework outperforms all other classification methods compared in this work, while JCR is superior to JSR.

Journal ArticleDOI
TL;DR: This study shows that multivariate machine learning approaches integrating multi-modal and multisource imaging data can classify FEP patients with high accuracy, and specific grey matter structures and white matter bundles reach high classification reliability when using different imaging modalities and indices.
Abstract: Currently, most of the classification studies of psychosis focused on chronic patients and employed single machine learning approaches. To overcome these limitations, we here compare, to the best of our knowledge for the first time, different classification methods of first-episode psychosis (FEP) using multi-modal imaging data covering several cortical and subcortical structures and white matter fiber bundles. 23 FEP patients and 23 age-, gender-, and race-matched healthy participants were included in the study. An innovative multivariate approach based on multiple kernel learning (MKL) methods was implemented on structural MRI and diffusion tensor imaging. MKL provides the best classification performances in comparison with the more widely used support vector machine, enabling the definition of a reliable automatic decisional system based on the integration of multi-modal imaging information. Our results show a discrimination accuracy greater than 90 % between healthy subjects and patients with FEP. Regions with an accuracy greater than 70 % on different imaging sources and measures were middle and superior frontal gyrus, parahippocampal gyrus, uncinate fascicles, and cingulum. This study shows that multivariate machine learning approaches integrating multi-modal and multisource imaging data can classify FEP patients with high accuracy. Interestingly, specific grey matter structures and white matter bundles reach high classification reliability when using different imaging modalities and indices, potentially outlining a prefronto-limbic network impaired in FEP with particular regard to the right hemisphere.

Proceedings ArticleDOI
30 Nov 2015
TL;DR: This work proposes an ℓp-normed genetic algorithm MKL (GAMKLp), which uses a genetic algorithm to learn the weights of a set of pre-computed kernel matrices for use with MKL classification, and proves that this approach is equivalent to a previously proposed fuzzy integral aggregation of multiple kernels called fuzzy integral: genetic algorithm (FIGA).
Abstract: Kernel methods for classification is a well-studied area in which data are implicitly mapped from a lower-dimensional space to a higher-dimensional space to improve classification accuracy. However, for most kernel methods, one must still choose a kernel to use for the problem. Since there is, in general, no way of knowing which kernel is the best, multiple kernel learning (MKL) is a technique used to learn the aggregation of a set of valid kernels into a single (ideally) superior kernel. The aggregation can be done using weighted sums of the pre-computed kernels, but determining the summation weights is not a trivial task. A popular and successful approach to this problem is MKL-group lasso (MKLGL), where the weights and classification surface are simultaneously solved by iteratively optimizing a min-max optimization until convergence. In this work, we propose an ℓp-normed genetic algorithm MKL (GAMKLp), which uses a genetic algorithm to learn the weights of a set of pre-computed kernel matrices for use with MKL classification. We prove that this approach is equivalent to a previously proposed fuzzy integral aggregation of multiple kernels called fuzzy integral: genetic algorithm (FIGA). A second algorithm, which we call decision-level fuzzy integral MKL (DeFIMKL), is also proposed, where a fuzzy measure with respect to the fuzzy Choquet integral is learned via quadratic programming, and the decision value, viz. the class label, is computed using the fuzzy Choquet integral aggregation. Experiments on several benchmark data sets show that our proposed algorithms can outperform MKLGL when applied to support vector machine (SVM)-based classification.
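The decision-level aggregation in DeFIMKL rests on the discrete fuzzy Choquet integral; a minimal sketch of that integral follows (the toy decision values and the hand-picked additive measure are assumptions for illustration; the paper learns the measure via quadratic programming):

```python
import numpy as np

def choquet(values, measure):
    """Discrete Choquet integral of `values` w.r.t. fuzzy measure `measure`.

    `measure` maps frozensets of source indices to [0, 1], with
    measure(empty set) = 0 and measure(all sources) = 1. In DeFIMKL the
    sources would be per-kernel SVM decision values.
    """
    idx = np.argsort(values)[::-1]          # visit sources in descending order
    total, prev = 0.0, 0.0
    coalition = set()
    for i in idx:
        coalition.add(int(i))
        g = measure[frozenset(coalition)]
        total += values[i] * (g - prev)     # weight by the measure increment
        prev = g
    return total

# three sources; an additive measure reduces the integral to a weighted mean
g = {frozenset(): 0.0,
     frozenset({0}): 0.5, frozenset({1}): 0.3, frozenset({2}): 0.2,
     frozenset({0, 1}): 0.8, frozenset({0, 2}): 0.7, frozenset({1, 2}): 0.5,
     frozenset({0, 1, 2}): 1.0}
out = choquet(np.array([0.9, 0.2, 0.6]), g)
```

With an additive measure, as here, the Choquet integral reduces to a weighted mean of the sources; a non-additive measure lets it model interactions between kernels, which is the expressive power DeFIMKL exploits.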

Proceedings ArticleDOI
01 Dec 2015
TL;DR: This paper proposes a Multiple Kernel Learning (MKL) approach to learn different weights for different groups of features grouped by complexity, defines a notion of kernel complexity, namely Kernel Spectral Complexity, and shows how this complexity relates to the well-known Empirical Rademacher Complexity for a natural class of functions which includes SVMs.
Abstract: Kernels for structures, including graphs, generally suffer from the diagonally dominant Gram matrix issue, the effect by which the number of sub-structures, or features, shared between two instances is very small with respect to the number shared by an instance with itself. A parametric rule is typically used to reduce the weights of the largest (more complex) sub-structures. The particular rule which is adopted is in fact a strong external bias that may strongly affect the resulting predictive performance. Thus, in principle, the applied rule should be validated in addition to the other hyper-parameters of the kernel. Nevertheless, for the majority of graph kernels proposed in the literature, the parameters of the weighting rule are fixed a priori. The contribution of this paper is two-fold. Firstly, we propose a Multiple Kernel Learning (MKL) approach to learn different weights for different groups of features, grouped by complexity. Secondly, we define a notion of kernel complexity, namely Kernel Spectral Complexity, and we show how this complexity relates to the well-known Empirical Rademacher Complexity for a natural class of functions which includes SVMs. The proposed approach is applied to a recently defined graph kernel and evaluated on several real-world datasets. The obtained results show that our approach outperforms the original kernel on all the considered tasks.

Journal ArticleDOI
TL;DR: This work proposes a novel support vector machine MT-MKL framework that considers an implicitly defined set of conic combinations of task objectives and demonstrates that the framework is capable of achieving a better classification performance, when compared with other similar MTL approaches.
Abstract: A traditional and intuitively appealing Multitask Multiple Kernel Learning (MT-MKL) method is to optimize the sum (thus, the average) of objective functions with (partially) shared kernel function, which allows information sharing among the tasks. We point out that the obtained solution corresponds to a single point on the Pareto Front (PF) of a multiobjective optimization problem, which considers the concurrent optimization of all task objectives involved in the Multitask Learning (MTL) problem. Motivated by this last observation and arguing that the former approach is heuristic, we propose a novel support vector machine MT-MKL framework that considers an implicitly defined set of conic combinations of task objectives. We show that solving our framework produces solutions along a path on the aforementioned PF and that it subsumes the optimization of the average of objective functions as a special case. Using the algorithms we derived, we demonstrate through a series of experimental results that the framework is capable of achieving a better classification performance, when compared with other similar MTL approaches.
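The contrast the paper draws can be stated concretely: the traditional formulation minimizes the average of the task objectives, which is just one particular conic combination. A minimal sketch, with toy objective values and illustrative names (the paper's framework defines the admissible weight set implicitly rather than fixing it):

```python
import numpy as np

def conic_objective(task_losses, lam):
    """Conic combination of per-task objectives: sum_t lam_t * J_t.

    All lam_t must be non-negative; with uniform lam this reduces to
    the traditional average-of-objectives MT-MKL formulation.
    """
    lam = np.asarray(lam, dtype=float)
    assert np.all(lam >= 0.0), "weights must lie in the non-negative cone"
    return float(np.dot(lam, task_losses))

losses = np.array([0.4, 0.1, 0.3])               # toy per-task objective values
avg = conic_objective(losses, [1/3, 1/3, 1/3])   # the averaging special case
hard = conic_objective(losses, [0.7, 0.1, 0.2])  # emphasise the hardest task
```

Sweeping `lam` over the cone traces out solutions along a path on the Pareto Front, which is how the framework subsumes the averaged objective as a single point.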

Journal ArticleDOI
TL;DR: Two feature selection methods to deal with heterogeneous data that include continuous and categorical variables are introduced and shown to offer state-of-the-art performances on a variety of high-dimensional classification tasks.