Topic

Multiple kernel learning

About: Multiple kernel learning is a machine learning research topic concerned with learning a combination of kernel functions rather than selecting a single fixed kernel. Over the lifetime of the topic, 1,630 publications have been published, receiving 56,082 citations.


Papers
Posted Content
TL;DR: In this article, a machine learning approach is presented that ranked first in the Arabic Dialect Identification (ADI) Closed Shared Task of the 2018 VarDial Evaluation Campaign.
Abstract: We present a machine learning approach that ranked first in the Arabic Dialect Identification (ADI) Closed Shared Task of the 2018 VarDial Evaluation Campaign. The proposed approach combines several kernels using multiple kernel learning. While most of our kernels are based on character p-grams (also known as n-grams) extracted from speech or phonetic transcripts, we also use a kernel based on dialectal embeddings generated from audio recordings by the organizers. In the learning stage, we independently employ Kernel Discriminant Analysis (KDA) and Kernel Ridge Regression (KRR). Preliminary experiments indicate that KRR provides better classification results. Our approach is shallow and simple, but the empirical results obtained in the 2018 ADI Closed Shared Task show that it achieves the best performance. Furthermore, our top macro-F1 score (58.92%) is significantly better than the second-best score (57.59%) in the 2018 ADI Shared Task, according to the statistical significance test performed by the organizers. Moreover, we obtain even better post-competition results (a macro-F1 score of 62.28%) using the audio embeddings released by the organizers after the competition. With a very similar approach (which did not include phonetic features), we also ranked first in the ADI Closed Shared Task of the 2017 VarDial Evaluation Campaign, surpassing the second-best method by 4.62%. We therefore conclude that our multiple kernel learning method is the best approach to date for Arabic dialect identification.
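A minimal sketch of the general recipe described above: several character n-gram kernels are summed into a combined kernel and Kernel Ridge Regression is trained on the precomputed combination. The vectorizer settings, n-gram lengths and one-vs-rest decoding are assumptions for illustration, not the authors' exact implementation.

```python
# Hedged sketch: unweighted sum of character n-gram kernels + Kernel Ridge Regression.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.kernel_ridge import KernelRidge
from sklearn.preprocessing import LabelBinarizer, normalize

def ngram_kernels(train_texts, test_texts, n):
    """Linear kernel over L2-normalised character n-gram ('p-gram') counts."""
    vec = CountVectorizer(analyzer="char", ngram_range=(n, n))
    Xtr = normalize(vec.fit_transform(train_texts).astype(float))
    Xte = normalize(vec.transform(test_texts).astype(float))
    return (Xtr @ Xtr.T).toarray(), (Xte @ Xtr.T).toarray()

def combined_kernels(train_texts, test_texts, ns=(3, 4, 5)):
    """Multiple kernel learning in its simplest form: sum the base kernels."""
    pairs = [ngram_kernels(train_texts, test_texts, n) for n in ns]
    return sum(p[0] for p in pairs), sum(p[1] for p in pairs)

def krr_classify(K_train, y_train, K_test, alpha=1.0):
    """One-vs-rest Kernel Ridge Regression on precomputed kernels (multi-class labels assumed)."""
    lb = LabelBinarizer()
    Y = lb.fit_transform(y_train)                      # one indicator column per class
    model = KernelRidge(alpha=alpha, kernel="precomputed").fit(K_train, Y)
    return lb.classes_[np.argmax(model.predict(K_test), axis=1)]
```

In the same spirit, an additional kernel computed from the audio embeddings mentioned in the abstract could simply be added to the kernel sum before fitting.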

3 citations

Patent
25 Jan 2017
TL;DR: In this article, a hyperspectral image classification method based on morphological contour characteristics and nonlinear multiple kernel learning is proposed to overcome two shortcomings of existing HSI classification methods: spatial information is not fully exploited, and the useful information produced by nonlinear interactions among base kernels is not considered.
Abstract: The invention relates to a hyperspectral image classification method based on morphological contour characteristics and nonlinear multiple kernel learning, and aims to overcome the defects that existing hyperspectral image classification methods cannot fully exploit the spatial information of a hyperspectral image and do not consider the useful information produced by nonlinear interactions among base kernels. The method comprises the following concrete steps: firstly, extracting a principal component of the hyperspectral image using principal component analysis, and obtaining extended multi-structure-element morphological contour characteristics of the hyperspectral image on the basis of the principal component; secondly, constructing linear base kernels; thirdly, obtaining a nonlinear combined kernel; fourthly, substituting the nonlinear combined kernel into a support vector machine and obtaining optimal kernel weights by a gradient descent method; and fifthly, classifying the hyperspectral image. The method is applied to the field of image classification.
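As a rough illustration of the pipeline outlined in the abstract (not the patented method), the sketch below extracts a principal component, builds simple opening/closing morphological features, forms RBF base kernels, combines them together with their pairwise element-wise products so that interactions between base kernels contribute, and feeds the result to a precomputed-kernel SVM. The structuring-element sizes, kernel parameters and fixed weights stand in for the gradient-descent weight optimization.

```python
# Hedged sketch of a hyperspectral MKL pipeline; operators and weights are illustrative.
import numpy as np
from scipy.ndimage import grey_closing, grey_opening
from sklearn.decomposition import PCA
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

def morphological_features(cube, sizes=(3, 5, 7)):
    """First principal component of the spectral bands, plus its openings and closings."""
    h, w, b = cube.shape
    pc = PCA(n_components=1).fit_transform(cube.reshape(-1, b)).reshape(h, w)
    maps = [pc] + [grey_opening(pc, size=s) for s in sizes] \
                + [grey_closing(pc, size=s) for s in sizes]
    return np.stack(maps, axis=-1).reshape(h * w, -1)        # one feature row per pixel

def nonlinear_combined_kernel(feature_groups, weights, gamma=0.5):
    """Weighted sum of RBF base kernels plus their pairwise element-wise products,
    so that nonlinear interactions among base kernels are taken into account."""
    base = [rbf_kernel(F, gamma=gamma) for F in feature_groups]
    K = sum(w * k for w, k in zip(weights, base))
    cross = [base[i] * base[j] for i in range(len(base)) for j in range(i + 1, len(base))]
    K += sum(w * k for w, k in zip(weights[len(base):], cross))
    return K

# Usage sketch (weights would normally be optimized, e.g. by gradient descent on the SVM
# objective); spectral_feats and spatial_feats are hypothetical per-pixel feature matrices.
# K = nonlinear_combined_kernel([spectral_feats, spatial_feats], weights=np.full(3, 1 / 3))
# clf = SVC(kernel="precomputed").fit(K[np.ix_(train_idx, train_idx)], y_train)
```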

3 citations

Journal Article
TL;DR: The Tessellated Kernel (TK) class, as discussed by the authors, is a class of kernels that admits a linear parameterization using positive matrices, is dense in the set of all kernels, and consists entirely of universal kernels.
Abstract: The accuracy and complexity of kernel learning algorithms are determined by the set of kernels over which they are able to optimize. An ideal set of kernels should: admit a linear parameterization (tractability); be dense in the set of all kernels (accuracy); and contain only universal members, so that the hypothesis space is infinite-dimensional (scalability). Currently, there is no class of kernels that meets all three criteria - e.g. Gaussians are not tractable or accurate; polynomials are not scalable. We propose a new class that meets all three criteria - the Tessellated Kernel (TK) class. Specifically, the TK class: admits a linear parameterization using positive matrices; is dense in all kernels; and every element in the class is universal. This implies that using TK kernels for learning the kernel can obviate the need for selecting candidate kernels in algorithms such as SimpleMKL and parameters such as the bandwidth. Numerical testing on soft-margin Support Vector Machine (SVM) problems shows that algorithms using TK kernels outperform other kernel learning algorithms and neural networks. Furthermore, our results show that when the ratio of the number of training data to the number of features is high, the improvement of TK over MKL increases significantly.
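To make the "linear parameterization using positive matrices" idea concrete, here is a generic illustration (not the TK kernel itself): a kernel family k_P(x, z) = phi(x)^T P phi(z) with a fixed feature map phi is linear in the parameter P, and every positive semidefinite P yields a valid kernel, which is the property that makes such families amenable to convex optimization.

```python
# Hedged illustration (not the TK kernel itself): a kernel family linearly parameterized
# by a positive semidefinite matrix P, with k_P(x, z) = phi(x)^T P phi(z) for a fixed
# polynomial feature map phi. Every PSD P gives a valid kernel, and k_P is linear in P.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

def parameterised_kernel(X, Z, P, degree=2):
    """Evaluate k_P(x, z) = phi(x)^T P phi(z) between the rows of X and Z."""
    phi = PolynomialFeatures(degree)
    return phi.fit_transform(X) @ P @ phi.fit_transform(Z).T

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
d = PolynomialFeatures(2).fit_transform(X[:1]).shape[1]   # dimension of the feature map
A = rng.normal(size=(d, d))
P = A @ A.T                                               # PSD by construction
K = parameterised_kernel(X, X, P)
# The resulting Gram matrix is positive semidefinite, so k_P is a valid kernel.
assert np.linalg.eigvalsh((K + K.T) / 2).min() > -1e-6
```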

3 citations

Dissertation
28 Jul 2017
TL;DR: This work uses MKL to help build models that distinguish between malignant and benign findings, and adopts a strategy of weighting the benign and malignant cases in order to produce models that are more reliable and robust to the class distribution.
Abstract: Detecting breast cancer in mammograms can be a hard task even for the most experienced specialists. Several works in the literature have tried to build models that describe malignant or benign findings using BI-RADS annotated features or features automatically extracted from images. Some of the best models are based on Support Vector Machines (SVMs). Features from mammograms have heterogeneous types, and most methods handle them equally. Multiple Kernel Learning (MKL) can create models where each feature can be treated in a different way, which may improve the quality of the learned models. In this work, we use MKL to help build models that distinguish between malignant and benign findings. One of the problems with this domain is that the classes are unbalanced: fortunately, the number of malignant cases is much smaller than the number of benign cases. However, this imbalance may lead an MKL classifier to label most of the cases as benign. We improve on these models by adopting a strategy of weighting the benign and malignant cases in order to produce models that are more reliable and robust to the class distribution. Our results show that our weighted approach produces better-quality models for both balanced and unbalanced mammogram datasets.
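A minimal sketch of the class-weighting idea under assumed feature groups: one kernel per group of heterogeneous features is averaged into a combined kernel, and the SVM reweights the rare malignant class via class_weight="balanced". The group names and kernel choices below are illustrative, not the thesis's exact setup.

```python
# Hedged sketch: averaged kernels over heterogeneous feature groups + a class-weighted SVM.
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, rbf_kernel
from sklearn.svm import SVC

def combined_kernel(feature_groups):
    """One kernel per feature group (RBF for continuous features, linear for BI-RADS
    indicator features), averaged into a single combined kernel."""
    kernels = [rbf_kernel(X) if kind == "continuous" else linear_kernel(X)
               for X, kind in feature_groups]
    return np.mean(kernels, axis=0)

# 'balanced' scales each class weight by n_samples / (n_classes * class_count), so the
# rare malignant class is not drowned out by the benign majority during training.
# X_birads and X_image are hypothetical feature matrices indexing the same cases.
# K_train = combined_kernel([(X_birads, "categorical"), (X_image, "continuous")])
# clf = SVC(kernel="precomputed", class_weight="balanced").fit(K_train, y_train)
```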

3 citations

Book Chapter
01 Jan 2016
TL;DR: This chapter introduces a novel kernel for histograms of visual words, namely the PQ kernel, and also gives a proof that PQ is indeed a kernel.
Abstract: This chapter presents some improvements of the bag of visual words model for two applications, namely object recognition and facial expression recognition. In the bag of visual words approach, images are represented as histograms of visual words from a codebook that is usually obtained with a simple clustering method. Next, kernel methods are used to compare such histograms. The chapter introduces a novel kernel for histograms of visual words, namely the PQ kernel. A proof that PQ is indeed a kernel is also given in this chapter. Object recognition experiments are conducted to compare the PQ kernel with other state-of-the-art kernels on two benchmark data sets; the PQ kernel has the best performance on both data sets. Researchers have demonstrated that object recognition performance with the bag of visual words can be improved by including spatial information. A state-of-the-art approach is the spatial pyramid representation, which divides the image into spatial bins. In this chapter, another general approach that encodes the spatial information in a better and more efficient way is described: the spatial information is embedded into a kernel function termed the Spatial Non-Alignment Kernel (SNAK). For each visual word, the average position and the standard deviation are computed based on all occurrences of the visual word in the image. The pairwise similarity of two images is then computed by taking into account the difference between the average positions and the difference between the standard deviations of each visual word in the two images. In all the experiments, the SNAK framework shows better recognition accuracy than the spatial pyramid. Finally, the chapter presents a bag of visual words model based on local multiple kernel learning for facial expression recognition.
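A rough sketch of the per-visual-word spatial statistics described above, under the assumptions that image coordinates are normalized to [0, 1] and that an exponential of the squared differences turns them into a similarity; the exact SNAK formula may differ.

```python
# Hedged sketch of per-visual-word spatial statistics and a SNAK-like similarity.
import numpy as np

def spatial_stats(positions_per_word, vocab_size):
    """For each visual word, the mean (x, y) position and per-axis standard deviation of
    its occurrences; words absent from the image keep zero statistics in this sketch."""
    mu = np.zeros((vocab_size, 2))
    sigma = np.zeros((vocab_size, 2))
    for word_id, positions in positions_per_word.items():   # {word_id: list of (x, y)}
        pts = np.asarray(positions, dtype=float)
        mu[word_id] = pts.mean(axis=0)
        sigma[word_id] = pts.std(axis=0)
    return mu, sigma

def snak_like_similarity(stats_a, stats_b, gamma=1.0):
    """Similarity of two images from the differences between their per-word mean positions
    and standard deviations (assumed exponential weighting, not necessarily exact SNAK)."""
    (mu_a, sig_a), (mu_b, sig_b) = stats_a, stats_b
    d = np.sum((mu_a - mu_b) ** 2, axis=1) + np.sum((sig_a - sig_b) ** 2, axis=1)
    return float(np.sum(np.exp(-gamma * d)))                 # sum of per-word contributions
```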

3 citations


Network Information
Related Topics (5)
Convolutional neural network: 74.7K papers, 2M citations, 89% related
Deep learning: 79.8K papers, 2.1M citations, 89% related
Feature extraction: 111.8K papers, 2.1M citations, 87% related
Feature (computer vision): 128.2K papers, 1.7M citations, 87% related
Image segmentation: 79.6K papers, 1.8M citations, 86% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    21
2022    44
2021    72
2020    101
2019    113
2018    114