
Showing papers on "Multiple kernel learning published in 2022"


Journal ArticleDOI
TL;DR: A multiple kernel-based triple collaborative matrix factorization (MK-TCMF) method is proposed to predict DTIs; it regulates the weight of each kernel matrix according to the prediction error and shows better performance on four test data sets.
Abstract: Targeted drugs have been applied to cancer treatment on a large scale, with therapeutic benefit for some patients. Detecting drug-target interactions (DTIs) through biochemical experiments is time-consuming. At present, machine learning (ML) is widely applied in large-scale drug screening, but few methods fuse multiple sources of information. We propose a multiple kernel-based triple collaborative matrix factorization (MK-TCMF) method to predict DTIs. The multiple kernel matrices (containing chemical, biological and clinical information) are integrated via a multiple kernel learning (MKL) algorithm, and the original adjacency matrix of DTIs is decomposed into three matrices: the latent feature matrix of the drug space, the latent feature matrix of the target space, and a bi-projection matrix that joins the two feature spaces. To obtain better prediction performance, the MKL algorithm regulates the weight of each kernel matrix according to the prediction error; the weights of drug side-effects and target sequence turn out to be the highest. Compared with other computational methods, our model performs better on four test data sets.
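The factorization idea (Y ≈ A S Bᵀ with a fused drug kernel) can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' solver: the shapes and data are made up, the kernel weights are fixed rather than learned from the prediction error, and plain gradient descent stands in for the paper's update rules.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical): 8 drugs, 6 targets, rank-3 latent spaces.
n_drugs, n_targets, rank = 8, 6, 3
Y = (rng.random((n_drugs, n_targets)) > 0.7).astype(float)  # known-DTI adjacency

# Two toy drug kernels (say, chemical and side-effect similarity) fused with
# fixed weights; in MK-TCMF these weights are tuned from the prediction error.
K1 = rng.random((n_drugs, n_drugs)); K1 = K1 @ K1.T
K2 = rng.random((n_drugs, n_drugs)); K2 = K2 @ K2.T
w = np.array([0.7, 0.3])            # assumed weights, not learned here
K_drug = w[0] * K1 + w[1] * K2      # fused drug kernel (in the full model it
                                    # regularises the drug factors)

# Triple factorization Y ~ A S B^T: drug factors A, target factors B,
# and the bi-projection S joining the two latent spaces.
A = rng.standard_normal((n_drugs, rank))
B = rng.standard_normal((n_targets, rank))
S = rng.standard_normal((rank, rank))

err0 = np.linalg.norm(A @ S @ B.T - Y)
lr = 0.001
for _ in range(2000):               # plain gradient descent on 0.5*||R||_F^2
    R = A @ S @ B.T - Y             # residual
    A, B, S = (A - lr * (R @ B @ S.T),
               B - lr * (R.T @ A @ S),
               S - lr * (A.T @ R @ B))

err = np.linalg.norm(A @ S @ B.T - Y)
```

After a few thousand steps the reconstruction error drops well below its starting value; the paper's method additionally couples this objective to the kernel-weight updates.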

27 citations


Journal ArticleDOI
TL;DR: In this article, a sparsified version of manifold learning is proposed that aligns the latent spaces encoding each descriptor and weights the strength of the alignment for each pair of samples.

7 citations


Journal ArticleDOI
TL;DR: In this article, a novel multiple kernel learning (MKL) method was proposed for coronary artery disease (CAD) detection using the phonocardiogram (PCG).
Abstract: Conventional machine learning has paved the way for a simple, affordable, non-invasive approach to coronary artery disease (CAD) detection using the phonocardiogram (PCG). This leaves scope to improve performance metrics by fusing learned representations from deep learning. In this study, we propose a novel multiple kernel learning (MKL) method for this fusion using deep embeddings transferred from a pre-trained convolutional neural network (CNN). The proposed MKL finds the optimal kernel combination by maximizing similarity with the ideal kernel and minimizing redundancy with the other basis kernels. Experiments are performed on 960 PCG epochs collected from 40 CAD and 40 normal subjects. The transferred embeddings attain a maximum subject-level accuracy of 89.25% with a kappa of 0.7850. Their fusion with handcrafted features using the proposed MKL gives an accuracy of 91.19% and a kappa of 0.8238. The study shows the potential for developing a high-accuracy CAD detection system using the easy-to-acquire, non-invasive PCG signal.
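The kernel-selection criterion, maximising alignment with an ideal kernel, can be illustrated with a small sketch. The labels, base kernels, and the simple alignment-proportional weighting below are assumptions for illustration; the paper's objective additionally penalises redundancy between basis kernels.

```python
import numpy as np

rng = np.random.default_rng(1)

def alignment(Ka, Kb):
    """Empirical kernel alignment <Ka, Kb>_F / (||Ka||_F * ||Kb||_F)."""
    return np.sum(Ka * Kb) / (np.linalg.norm(Ka) * np.linalg.norm(Kb))

# Toy two-class problem; labels and base kernels are synthetic stand-ins.
y = np.array([1, 1, 1, -1, -1, -1])
K_ideal = np.outer(y, y)          # ideal kernel: +1 within class, -1 across classes

f = y + 0.1 * rng.standard_normal(6)
K_good = np.outer(f, f)           # informative: built from a noisy copy of y
Z = rng.standard_normal((6, 6))
K_noise = Z @ Z.T                 # uninformative: random PSD matrix

# Heuristic: weight each base kernel by its alignment with the ideal kernel
# (a simpler rule than the paper's redundancy-penalised objective).
a = np.array([max(alignment(K, K_ideal), 0.0) for K in (K_good, K_noise)])
w = a / a.sum()
K_combined = w[0] * K_good + w[1] * K_noise
```

As expected, the informative kernel receives most of the weight in the combination.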

7 citations


Journal ArticleDOI
TL;DR: In this article, the authors examined the efficacy of the multiple kernel learning (MKL) classifier and the Boruta feature selection method for single-subject classification of schizophrenia patients (SZ) and healthy controls (HC).
Abstract: Background The event-related potential (ERP) components P300 and mismatch negativity (MMN) have been linked to cognitive deficits in patients with schizophrenia. The diagnosis of schizophrenia could be improved by applying machine learning procedures to these objective neurophysiological biomarkers. Several studies have attempted to achieve this goal, but none has examined Multiple Kernel Learning (MKL) classifiers. This algorithm optimally finds a combination of kernel functions, integrating them in a meaningful manner, and thus could improve diagnosis. Objective This study aimed to examine the efficacy of the MKL classifier and the Boruta feature selection method for single-subject classification of schizophrenia patients (SZ) and healthy controls (HC). Methods A cohort of 54 SZ and 54 HC participants was studied. Three sets of features related to ERP signals were calculated: peak-related features, peak-to-peak-related features, and signal-related features. The Boruta algorithm was used to evaluate the impact of feature selection on classification performance. An MKL algorithm was applied to address schizophrenia detection. Results A classification accuracy of 83% was obtained using the whole dataset, and 86% after applying Boruta feature selection. The variables that contributed most to the classification were mainly related to the latency and amplitude of the auditory P300 paradigm. Conclusion This study showed that MKL can be useful in distinguishing between schizophrenia patients and controls when using ERP measures. Moreover, the Boruta algorithm provides an improvement in classification accuracy and computational cost.

5 citations


Journal ArticleDOI
TL;DR: In this paper, a support vector machine (SVM) classifier based on the MKL algorithm EasyMKL was proposed to investigate the feasibility of MKL algorithms in EEG-based emotion recognition problems.
Abstract: Emotion recognition based on electroencephalography (EEG) has a wide range of applications and great potential value, so it has received increasing attention from academia and industry in recent years. Meanwhile, multiple kernel learning (MKL) has been favored by researchers for its data-driven convenience and high accuracy. However, there is little research on MKL in EEG-based emotion recognition. This paper is therefore dedicated to exploring and promoting the application of MKL methods in this field. We propose a support vector machine (SVM) classifier based on the MKL algorithm EasyMKL to investigate the feasibility of MKL algorithms for EEG-based emotion recognition. We designed two data partition methods: random division, to verify the validity of the MKL method, and sequential division, to simulate practical applications. Tri-categorization experiments were then performed for neutral, negative and positive emotions on a commonly used dataset, the Shanghai Jiao Tong University emotional EEG dataset (SEED). The average classification accuracies for random division and sequential division were 92.25% and 74.37%, respectively, showing better classification performance than a traditional single-kernel SVM. The results show that the MKL method is clearly effective and that its application to EEG emotion recognition is worthy of further study. Through analysis of the experimental results, we discovered that simple mathematical operations on the features of symmetrical electrodes could not effectively integrate the spatial information of the EEG signals to obtain better performance. It is also confirmed that higher-frequency band information is more correlated with emotional state and contributes more to emotion recognition.
In summary, this paper explores research on MKL methods in the field of EEG emotion recognition and provides a new way of thinking for EEG-based emotion recognition research.
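As a rough illustration of the multi-kernel pipeline, the sketch below builds one kernel per (synthetic) frequency band, fuses them uniformly, and fits a kernel ridge classifier on the fused kernel. The band names, data, and uniform weights are stand-ins: EasyMKL would learn the kernel weights, and the paper uses an SVM rather than ridge regression.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for band-specific EEG features: one feature matrix per
# frequency band (names and shapes are illustrative, not from the paper).
n = 40
y = np.repeat([1.0, -1.0], n // 2)
bands = {
    "alpha": y[:, None] + 0.3 * rng.standard_normal((n, 4)),  # informative band
    "delta": rng.standard_normal((n, 4)),                     # uninformative band
}

def rbf(X, gamma=0.5):
    """RBF kernel matrix for the rows of X."""
    sq = np.sum(X**2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))

# One base kernel per band, fused with uniform weights (the simplest
# multi-kernel baseline; EasyMKL would learn these weights instead).
K = sum(rbf(X) for X in bands.values()) / len(bands)

# Kernel ridge "classifier" on the fused kernel as a lightweight stand-in
# for a precomputed-kernel SVM.
alpha = np.linalg.solve(K + 1e-2 * np.eye(n), y)
pred = np.sign(K @ alpha)
train_acc = np.mean(pred == y)
```

Because one band carries the label, the fused kernel separates the classes; the same plumbing extends to one kernel per band per electrode group.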

5 citations


Journal ArticleDOI
TL;DR: In this paper, a multiple kernel mutual learning method based on transfer learning of combined mid-level features is proposed for hyperspectral classification, where three-layer homogeneous superpixels are computed on the image formed by PCA and used for computing mid-level features.
Abstract: By training different models and averaging their predictions, the performance of a machine-learning algorithm can be improved. Jointly optimizing multiple models is expected to generalize well to further data, which requires transferring generalization knowledge between models. In this article, a multiple kernel mutual learning method based on transfer learning of combined mid-level features is proposed for hyperspectral classification. Three-layer homogeneous superpixels are computed on the image formed by PCA and used for computing mid-level features. The three mid-level features are: 1) the sparse reconstructed feature; 2) the combined mean feature; and 3) uniqueness. The sparse reconstructed feature is obtained by a joint sparse representation model under the constraint of three-scale superpixels' boundaries and regions. The combined mean features are computed as average values of spectra in multilayer superpixels, and uniqueness is obtained from the superposed manifold ranking values of multilayer superpixels. Next, three kernels of samples in different feature spaces are computed for mutual learning by minimizing their divergence. A combined kernel is then constructed to optimize the sample distance measurement and used in SVM training to build classifiers. Experiments are performed on real hyperspectral datasets, and the results demonstrate that the proposed method performs significantly better than several state-of-the-art competing algorithms based on MKL and deep learning.

4 citations


Journal ArticleDOI
TL;DR: It is found that the suggested algorithm, AFO combined with MKSVM (AFO-MKSVM), scales very well to high-dimensional datasets and outperforms the existing approach, Linear Discriminant Analysis-Support Vector Machine (LDA-SVM), in terms of performance.
Abstract: Data dimensionality has risen sharply in the last several decades. The "Dimensionality Curse" (DC) is a problem for conventional learning techniques when dealing with "Big Data (BD)" of high dimensionality: a learning model's performance degrades when a very large number of features is present. "Dimensionality Reduction (DR)" approaches are used to solve the DC issue, and "Machine Learning (ML)" research in this area is significant. "Feature Selection (FS)" is a prominent DR procedure: improved learning effectiveness, such as greater classification precision, cheaper processing costs, and improved model comprehensibility, is the typical outcome of selecting an optimal portion of the original features based on relevant assessment criteria. In this research, an "Adaptive Firefly Optimization (AFO)" technique based on the "MapReduce (MR)" platform is developed. During the initial phase (mapping stage), the whole large "DataSet (DS)" is first subdivided into blocks. The AFO technique is then used to choose features from each block. In the final phase (reduction stage), the partial results are combined into a single feature vector. A "Multi Kernel Support Vector Machine (MKSVM)" classifier then assigns the data to the appropriate class using the optimal features obtained from AFO for DR. We found that the suggested algorithm, AFO combined with MKSVM (AFO-MKSVM), scales very well to high-dimensional DSs and outperforms the existing "Linear Discriminant Analysis-Support Vector Machine (LDA-SVM)" approach. Evaluation metrics such as Information Ratio for Dimension Reduction, Accuracy, and Recall indicate that the AFO-MKSVM method achieves better outcomes than the LDA-SVM method.

4 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a multiple kernel learning (MKL) approach, named the Multi-Kernel Multi-Label (MKML) method, which contains three kernel modules to tackle two challenges.

4 citations


Proceedings ArticleDOI
23 May 2022
TL;DR: The Discrete Multi-kernel k-means with Diverse and Optimal Kernel Learning (DMK-DOK) model is proposed, which adaptively seeks a better kernel by residing in the base kernel neighborhood and negotiates kernel learning and clustering.
Abstract: Multiple Kernel k-means and its variants integrate a group of kernels to improve clustering performance, but they still have some drawbacks: 1) linearly combining base kernels to get the optimal one limits the kernel representability and cuts off the negotiation between kernel learning and clustering; 2) ignoring the correlation among kernels leads to kernel redundancy; and 3) solving the NP-hard cluster assignment problem by a two-stage strategy leads to information loss. In this paper, we propose the Discrete Multi-kernel k-means with Diverse and Optimal Kernel Learning (DMK-DOK) model, which adaptively seeks a better kernel by residing in the base kernel neighborhood and negotiates kernel learning and clustering. Moreover, it implicitly penalizes highly correlated kernels to enhance the kernel fusion with less redundancy and more diversity. What's more, it jointly learns discrete and relaxed labels in the same optimization objective, which avoids information loss. Lastly, extensive experiments conducted on real-world datasets illustrate the superiority of our model.
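The two-stage baseline the paper improves on, fusing base kernels with fixed weights and then running kernel k-means on the result, can be sketched as follows. The data, kernels, and uniform weights are toy assumptions; DMK-DOK itself learns the kernel and the discrete labels jointly, which this sketch does not do.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two well-separated toy clusters in 2-D.
X = np.vstack([rng.normal(0.0, 0.3, (10, 2)), rng.normal(3.0, 0.3, (10, 2))])
n, k = len(X), 2

# Two base kernels (linear and RBF) fused with fixed uniform weights.
K_lin = X @ X.T
sq = np.sum(X**2, axis=1)
K_rbf = np.exp(-0.5 * (sq[:, None] + sq[None, :] - 2.0 * K_lin))
K = 0.5 * K_lin + 0.5 * K_rbf

# Plain kernel k-means on the fused kernel, started from a fixed,
# deliberately mixed initial assignment.
labels = np.zeros(n, dtype=int)
labels[::3] = 1
for _ in range(20):
    D = np.full((n, k), np.inf)
    for c in range(k):
        idx = labels == c
        if idx.any():
            # ||phi(x_i) - mu_c||^2 = K_ii - 2*mean_j K_ij + mean_jl K_jl
            D[:, c] = (np.diag(K) - 2.0 * K[:, idx].mean(axis=1)
                       + K[np.ix_(idx, idx)].mean())
    labels = D.argmin(axis=1)

# The two blobs should end up in different clusters (up to label permutation).
same_first = bool((labels[:10] == labels[0]).all())
same_second = bool((labels[10:] == labels[10]).all())
```

The assignment step never needs explicit feature vectors: all distances to cluster means are expressed through kernel entries, which is what makes the kernel fusion step the only place the base kernels enter.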

4 citations


Journal ArticleDOI
TL;DR: The experimental results demonstrate that the segmentation accuracy of the proposed KLRR-SR method is superior to several existing methods under different indicators, and that the sparsity constraint on the coefficient matrix in the kernel space, integrated into the kernel low-rank model, is effective in preserving the local structure and details of brain tumours.
Abstract: Given the need for quantitative measurement and 3D visualisation of brain tumours, increasing attention has been paid to the automatic segmentation of tumour regions from brain tumour magnetic resonance (MR) images. In view of the uneven grey-level distribution of MR images and the fuzzy boundaries of brain tumours, a representation model based on the joint constraints of kernel low-rank and sparsity (KLRR-SR) is proposed to mine the characteristics and structural prior knowledge of brain tumour images in the spectral kernel space. In addition, an optimal kernel based on superpixel uniform regions and multikernel learning (MKL) is constructed to improve the accuracy of the pairwise similarity measurement of pixels in the kernel space. By introducing the optimal kernel into KLRR-SR, the coefficient matrix can be solved such that the brain tumour segmentation results conform to the spatial information of the image. The experimental results demonstrate that the segmentation accuracy of the proposed method is superior to several existing methods under different indicators, and that the sparsity constraint on the coefficient matrix in the kernel space, integrated into the kernel low-rank model, is effective in preserving the local structure and details of brain tumours.

4 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a multitask multiple kernel learning (MKL) algorithm with a clustering of tasks and developed a highly time-efficient solution approach for it based on the Benders decomposition and treating the clustering problem as finding a given number of tree structures in a graph.
Abstract: Multitask multiple kernel learning (MKL) algorithms combine the capabilities of incorporating different data sources into the prediction model and using the data from one task to improve the accuracy on others. However, these methods do not necessarily produce interpretable results. Restricting the solutions to the set of interpretable solutions increases the computational burden of the learning problem significantly, leading to computationally prohibitive run times for some important biomedical applications. That is why we propose a multitask MKL formulation with a clustering of tasks and develop a highly time-efficient solution approach for it. Our solution method is based on the Benders decomposition and treating the clustering problem as finding a given number of tree structures in a graph; hence, it is called the forest formulation. We use our method to discriminate early-stage and late-stage cancers using genomic data and gene sets and compare our algorithm against two other algorithms. The two other algorithms are based on different approaches for linearization of the problem while all algorithms make use of the cutting-plane method. Our results indicate that as the number of tasks and/or the number of desired clusters increase, the forest formulation becomes increasingly favorable in terms of computational performance.

Journal ArticleDOI
TL;DR: In this article, instead of directly calculating label correlations by cosine distance and the like, the authors introduce a kernel function and a manifold regularization to learn them by iterative updating.
Abstract: It is important to fully utilize label correlations in multi-label learning. If there is a strong positive correlation between label i and label j, an instance associated with label i likely has label j as well. Label correlations can therefore provide auxiliary information when predicting unseen instances. Some existing multi-label algorithms utilize label correlations to constrain model parameters in the training stage, while label correlations are ignored in the prediction stage. Moreover, it is difficult to obtain relatively accurate label correlations by directly observing data when some labels have few positive instances in the training data. In this paper, instead of directly calculating label correlations by cosine distance and the like, we introduce a kernel function and a manifold regularization to learn them by iterative updating. Meanwhile, we utilize them and local label information to aid label prediction. Ultimately, unseen instances are predicted by combining auxiliary label predictions and the model outputs. We compare the proposed algorithm with related algorithms on 10 data sets, and the experimental results validate its effectiveness. • Label correlations are used to train a model and predict labels simultaneously. • Label correlations and local label information are used to aid label prediction. • A kernel function and a constraint are introduced to learn label correlations. • Extensive experiments verify the effectiveness of the proposed method.

Journal ArticleDOI
TL;DR: This work takes a different approach to multiple kernel learning by exploring different divergence measures on the values in the kernel matrices and in the reproducing kernel Hilbert space (RKHS).
Abstract: Kernel theory is a demonstrated tool that has made its way into nearly all areas of machine learning. However, a serious limitation of kernel methods is knowing which kernel is needed in practice. Multiple kernel learning (MKL) is an attempt to learn a new tailored kernel through the aggregation of a set of valid known kernels. There are generally three approaches to MKL: fixed rules, heuristics, and optimization. Optimization is the most popular; however, a shortcoming of most optimization approaches is that they are tightly coupled with the underlying objective function and overfitting occurs. Herein, we take a different approach to MKL. Specifically, we explore different divergence measures on the values in the kernel matrices and in the reproducing kernel Hilbert space (RKHS). Experiments on benchmark datasets and a computer vision feature learning task in explosive hazard detection demonstrate the effectiveness and generalizability of our proposed methods.
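One simple instantiation of the idea, scoring each base kernel by a divergence between the distributions of its within-class and between-class entries and weighting accordingly, might look like the sketch below. The Jensen-Shannon measure, the histogramming, and the toy kernels are illustrative choices, not the specific divergences studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(4)

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two (unnormalised) histograms."""
    p = p + eps; q = q + eps
    p = p / p.sum(); q = q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Synthetic labels and base kernels; the paper's measures also act in the RKHS.
y = np.repeat([1, -1], 10)
f = y + 0.2 * rng.standard_normal(20)
K_good = np.outer(f, f)                 # within/between entries differ strongly
Z = rng.standard_normal((20, 20))
K_bad = Z @ Z.T                         # entries carry no label information

off_diag = ~np.eye(20, dtype=bool)
pos_pair = np.outer(y, y) > 0

def score(K):
    within = K[pos_pair & off_diag]     # same-class, off-diagonal entries
    between = K[~pos_pair]              # cross-class entries
    bins = np.linspace(K.min(), K.max(), 20)
    hw, _ = np.histogram(within, bins=bins)
    hb, _ = np.histogram(between, bins=bins)
    return js_divergence(hw.astype(float), hb.astype(float))

scores = np.array([score(K_good), score(K_bad)])
w = scores / scores.sum()               # fixed-rule weights from the divergences
```

A kernel whose within-class and between-class values are well separated gets a large divergence and hence a large weight, without ever touching a classifier objective, which is the decoupling the abstract argues for.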

Journal ArticleDOI
TL;DR: In this article, a graph-based heuristic approach for multiple kernel learning (MKL) is proposed, which assigns sample-specific kernel weights based on contribution to graph modularity.

Journal ArticleDOI
TL;DR: In this paper, a method based on spatial-spectral Schroedinger eigenmaps (SSSE) and multiple kernel learning (MKL) was proposed to classify hyperspectral images more efficiently while using a low number of training samples.
Abstract: The classification of hyperspectral images is one of the most popular fields in remote sensing applications. It should be noted that spectral and spatial features have critical roles in this research area. This paper proposes a method based on spatial-spectral Schroedinger eigenmaps (SSSE) and multiple kernel learning (MKL) to classify hyperspectral images more efficiently while using a low number of training samples. In the proposed method, first SSSE is applied to the spectral domain in order to extract significant features and reduce the dimension of the original image. Then MKL is utilized to enhance the feature learning process and obtain an optimum combination of some specified kernels. Finally, the classification is carried out by substituting the optimal kernel in the support vector machine (SVM) algorithm. Experimental results show that the proposed method improves classification accuracy significantly and provides highly efficient results in the case of a small number of training samples. Furthermore, the computation time of the proposed method is much lower than the state-of-the-art MKL methods.

Journal ArticleDOI
TL;DR: Wang et al. propose a new multi-kernel learning ensemble algorithm, called Ada-L1-MKL-WSVR, which can be regarded as an extension of multi-kernel learning (MKL) and weighted support vector regression (WSVR).
Abstract: This paper proposes a new multi-kernel learning ensemble algorithm, called Ada-L1-MKL-WSVR, which can be regarded as an extension of multi-kernel learning (MKL) and weighted support vector regression (WSVR). The first novelty is to add the L1 norm of the weights of the combined kernel function to the objective function of WSVR, which is used to adaptively select the optimal base models and their parameters. In addition, an accelerated method based on the fast iterative shrinkage-thresholding algorithm (FISTA) is developed to solve for the weights of the combined kernel function. The second novelty is an integrated learning framework based on AdaBoost, named Ada-L1-MKL-WSVR, in which FISTA is embedded into AdaBoost: at each iteration, the weights of the combined kernel function are optimized and the weights of the training samples are updated at the same time, and an ensemble of regression functions is output. Finally, two groups of experiments are designed to verify the performance of the algorithm. On the first group, covering eight datasets from the UCI machine learning repository, the MAEs and RMSEs of Ada-L1-MKL-WSVR are reduced by 11.14% and 9.08% on average, respectively. On the second group, covering COVID-19 epidemic datasets from eight countries, the MAEs and RMSEs of Ada-L1-MKL-WSVR are reduced by 31.19% and 29.98% on average, respectively.
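The L1 ingredient can be made concrete: FISTA handles the non-smooth L1 term through the soft-thresholding proximal operator, which drives small kernel weights exactly to zero. A minimal sketch follows (the weight vector is made up; the full Ada-L1-MKL-WSVR solver wraps this operator inside FISTA's momentum scheme and AdaBoost).

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||v||_1: shrink each entry towards zero, clip at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Hypothetical kernel weights after a gradient step; FISTA would apply this
# operator at every iteration.
w = np.array([0.9, 0.05, -0.4, 0.01])
w_sparse = soft_threshold(w, 0.1)   # small weights are driven exactly to zero
```

This exact zeroing is what lets the algorithm "adaptively select the optimal base models": kernels whose weights fall below the threshold drop out of the combination entirely.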

Posted ContentDOI
16 Jun 2022
TL;DR: In this paper, the Kreĭn-SVM was proposed and developed for AMP classification and prediction by employing the Levenshtein distance and local alignment score as sequence similarity functions.
Abstract: Antimicrobial peptides (AMPs) represent a potential solution to the growing problem of antimicrobial resistance, yet their identification through wet-lab experiments is a costly and time-consuming process. Accurate computational predictions would allow rapid in silico screening of candidate AMPs, thereby accelerating the discovery process. Kernel methods are a class of machine learning algorithms that utilise a kernel function to transform input data into a new representation. When appropriately normalised, the kernel function can be regarded as a notion of similarity between instances. However, many expressive notions of similarity are not valid kernel functions, meaning they cannot be used with standard kernel methods such as the support-vector machine (SVM). The Kreĭn-SVM represents a generalisation of the standard SVM that admits a much larger class of similarity functions. In this study, we propose and develop Kreĭn-SVM models for AMP classification and prediction by employing the Levenshtein distance and local alignment score as sequence similarity functions. Utilising two datasets from the literature, each containing more than 3000 peptides, we train models to predict general antimicrobial activity. Our best models achieve an AUC of 0.967 and 0.863 on the test sets of each respective dataset, outperforming the in-house and literature baselines in both cases. We also curate a dataset of experimentally validated peptides, measured against Staphylococcus aureus and Pseudomonas aeruginosa, in order to evaluate the applicability of our methodology in predicting microbe-specific activity. In this case, our best models achieve an AUC of 0.933 and 0.917, respectively. Models to predict both general and microbe-specific activities are made available as web applications.
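The similarity functions involved are easy to sketch: Levenshtein distance via dynamic programming, turned into a pairwise similarity matrix. The peptide-like strings below are invented; the point is that such similarity matrices need not be positive semidefinite, which is exactly why a Kreĭn-space SVM, rather than a standard SVM, is required.

```python
import numpy as np

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between strings a and b."""
    m, n = len(a), len(b)
    d = np.zeros((m + 1, n + 1), dtype=int)
    d[:, 0] = np.arange(m + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,      # deletion
                          d[i, j - 1] + 1,      # insertion
                          d[i - 1, j - 1] + cost)  # substitution
    return int(d[m, n])

# Toy peptide-like sequences (made up, not from the paper's datasets).
seqs = ["GLFDIVKKV", "GLFDIIKKI", "ACDEFGHIK", "KKLLPWTR"]

# Turn distances into similarities; S may have negative eigenvalues,
# i.e. it is an indefinite "kernel" living in a Krein space.
D = np.array([[levenshtein(s, t) for t in seqs] for s in seqs])
S = np.exp(-0.5 * D)
eigvals = np.linalg.eigvalsh(S)   # inspect the spectrum to check definiteness
```

A standard SVM assumes all eigenvalues of the Gram matrix are nonnegative; the Kreĭn-SVM decomposes the space along positive and negative spectral parts instead.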

Journal ArticleDOI
TL;DR: This article proposes an approach to estimate optimal individualized treatment rules by exploiting multiple kernel functions to describe the similarity of features between subjects both within and across data domains within the OWL framework, as opposed to preselecting a single kernel function to be used for all features in all domains.
Abstract: Abstract Individualized treatment rules (ITRs) recommend treatments that are tailored specifically according to each patient’s own characteristics. It can be challenging to estimate optimal ITRs when there are many features, especially when these features have arisen from multiple data domains (e.g., demographics, clinical measurements, neuroimaging modalities). Considering data from complementary domains and using multiple similarity measures to capture the potential complex relationship between features and treatment can potentially improve the accuracy of assigning treatments. Outcome weighted learning (OWL) methods that are based on support vector machines using a predetermined single kernel function have previously been developed to estimate optimal ITRs. In this article, we propose an approach to estimate optimal ITRs by exploiting multiple kernel functions to describe the similarity of features between subjects both within and across data domains within the OWL framework, as opposed to preselecting a single kernel function to be used for all features for all domains. Our method takes into account the heterogeneity of each data domain and combines multiple data domains optimally. Our learning process estimates optimal ITRs and also identifies the data domains that are most important for determining ITRs. This approach can thus be used to prioritize the collection of data from multiple domains, potentially reducing cost without sacrificing accuracy. The comparative advantage of our method is demonstrated by simulation studies and by an application to a randomized clinical trial for major depressive disorder that collected features from multiple data domains. Supplementary materials for this article are available online.

Journal ArticleDOI
01 Jul 2022
TL;DR: MRIs contain valuable information on future disease activity, especially in and around the LV, and MKL techniques for combining different data types can be used for the prediction of disease activity in a relatively small MS cohort.
Abstract: Background Lack of easy-to-interpret disease activity prediction methods in early MS can lead to worse patient prognosis. Objectives Using machine learning (multiple kernel learning, MKL) models, we assessed the prognostic value of various clinical and MRI measures for disease activity. Methods Early MS patients (n = 148) with at least two associated clinical and MRI visits were investigated. T2-weighted MRIs were cropped to contain mainly the lateral ventricles (LV). High disease activity was defined as surpassing the NEDA-3 criteria more than once per year. Clinical demographics, MRI-extracted image-derived phenotypes (IDP), and MRI data were used as inputs for separate kernels to predict future disease activity with MKL. Model performance was compared using bootstrapped effect size analysis of mean differences. Results A total of 681 visits were included, and 81 (55%) patients had high disease activity in a combined endpoint measure using all follow-up visits. MKL model discrimination performance was moderate (AUC ≥ 0.62); however, modelling with combined clinical and cropped-LV kernels gave the highest prediction performance (AUC = 0.70). Conclusions MRIs contain valuable information on future disease activity, especially in and around the LV, and MKL techniques for combining different data types can be used to predict disease activity in a relatively small MS cohort.

Journal ArticleDOI
TL;DR: Zhang et al. propose a deeper match framework for multi-view oriented kernel learning (DMMV), which brings deeper insight into kernel matching for similarity-based multi-view representation fusion.

Journal ArticleDOI
TL;DR: In this paper, a soft margin multiple kernel learning (MKL) method was proposed for short-term wind power prediction, where a kernel slack variable is introduced into each base kernel to solve the objective function.
Abstract: For short-term wind power prediction, a soft margin multiple kernel learning (MKL) method is proposed. To improve the predictive effect of the MKL method for wind power, a kernel slack variable is introduced into each base kernel to solve the objective function. Two kinds of soft margin MKL methods are obtained by selecting either the hinge loss function or the squared hinge loss function. The improved methods demonstrate good robustness and avoid the disadvantage of the hard margin MKL method, which selects only a few base kernels and discards other useful ones when solving the objective function, thereby achieving an effective yet sparse solution. To verify the effectiveness of the proposed method, the soft margin MKL method was applied to short-term wind power single-step prediction for the second wind farm of Tianfeng, Xinjiang, and single-step and multi-step predictions of short-term wind power were also carried out using measured data provided by the Alberta Electric System Operator (AESO). Compared with the support vector machine (SVM), extreme learning machine (ELM), kernel-based extreme learning machine (KELM) and SimpleMKL methods under the same conditions, the experimental results demonstrate that the soft margin MKL method with different loss functions achieves higher prediction accuracy and good generalization performance for short-term wind power prediction, confirming the effectiveness of the method.

Journal ArticleDOI
TL;DR: In this article, a stream-based active online multiple kernel learning (AMKL) method is proposed, in which a learner is allowed to label some selected data from an oracle according to a selection criterion.
Abstract: Online multiple kernel learning (OMKL) has provided attractive performance in nonlinear function learning tasks. Leveraging a random feature (RF) approximation, the major drawback of OMKL, known as the curse of dimensionality, has recently been alleviated. These advantages enable RF-based OMKL to be considered in practice. In this article, we introduce a new research problem, named stream-based active MKL (AMKL), in which a learner is allowed to label some selected data from an oracle according to a selection criterion. This is necessary for many real-world applications, as acquiring a true label is costly or time-consuming. We theoretically prove that the proposed AMKL achieves an optimal sublinear regret O(√T), as in OMKL, with little labeled data, implying that the proposed selection criterion indeed avoids unnecessary label requests. Furthermore, we present AMKL with an adaptive kernel selection (named AMKL-AKS) in which irrelevant kernels can be excluded from a kernel dictionary "on the fly." This approach improves the efficiency of active learning and the accuracy of function learning. Via numerical tests with real data sets, we verify the superiority of AMKL-AKS, which yields accuracy similar to its OMKL counterpart while using fewer labeled data.
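The random feature (RF) approximation that makes OMKL scalable can be sketched for a single RBF kernel: an explicit D-dimensional random cosine map whose inner products approximate the kernel, so no n × n kernel matrix is ever formed. The dimensions and data below are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(5)

# Random Fourier features for k(x, z) = exp(-||x - z||^2 / 2): draw frequencies
# from the kernel's spectral density (standard normal here) and random phases.
d, D = 3, 2000
W = rng.standard_normal((D, d))
b = rng.uniform(0.0, 2.0 * np.pi, D)

def z_map(x):
    """Explicit D-dimensional feature map; z_map(x) @ z_map(y) approximates k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
exact = np.exp(-0.5 * np.sum((x1 - x2) ** 2))
approx = float(z_map(x1) @ z_map(x2))
gap = abs(exact - approx)   # shrinks like O(1/sqrt(D))
```

With one such map per base kernel, an online learner only updates a weight vector per kernel against the D-dimensional features, which is the scalability RF-based OMKL (and hence AMKL) relies on.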

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed a hierarchical multi-kernel learning (hMKL) approach, a novel cancer molecular subtyping method to identify cancer subtypes by adopting a two-stage kernel learning strategy.
Abstract: Differentiating cancer subtypes is crucial to guide personalized treatment and improve patient prognosis. Integrating multi-omics data can offer a comprehensive landscape of cancer biological processes and provide promising avenues for cancer diagnosis and treatment. Taking the heterogeneity of different omics data types into account, we propose a hierarchical multi-kernel learning (hMKL) approach, a novel cancer molecular subtyping method that identifies cancer subtypes by adopting a two-stage kernel learning strategy. In stage 1, we obtain a composite kernel for each individual omics data type, borrowing the cancer integration via multi-kernel learning (CIMLR) idea to optimize the kernel parameters. In stage 2, we obtain a final fused kernel through a weighted linear combination of the individual kernels learned in stage 1, using an unsupervised multiple kernel learning method. Based on the final fused kernel, k-means clustering is applied to identify cancer subtypes. Simulation studies show that hMKL outperforms the one-stage CIMLR method when there is data heterogeneity, and that hMKL can estimate the number of clusters correctly, which is the key challenge in subtyping. Application to two real data sets shows that hMKL identifies meaningful subtypes and key cancer-associated biomarkers. The proposed method provides a novel toolkit for heterogeneous multi-omics data integration and cancer subtype identification.
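A minimal sketch of the two-stage idea follows, with toy stand-ins for the omics views and hand-fixed fusion weights; the paper instead learns the per-view kernels via CIMLR and the fusion weights via unsupervised MKL.

```python
import numpy as np
from sklearn.cluster import KMeans

def rbf(X, gamma):
    # RBF kernel over all sample pairs of one data view.
    d2 = ((X[:, None] - X[None, :])**2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
# Two toy "omics" views of 60 samples drawn from 2 latent groups.
labels = np.repeat([0, 1], 30)
view1 = labels[:, None] + 0.3 * rng.standard_normal((60, 10))
view2 = 2 * labels[:, None] + 0.5 * rng.standard_normal((60, 8))

# Stage 1: one kernel per omics view (a stand-in for CIMLR-optimized kernels).
K1, K2 = rbf(view1, 0.1), rbf(view2, 0.1)

# Stage 2: weighted linear fusion, then cluster in the kernel-induced space.
w = np.array([0.5, 0.5])      # unsupervised MKL would learn these weights
K = w[0] * K1 + w[1] * K2

# Embed samples via top eigenvectors of the fused kernel, then run k-means.
vals, vecs = np.linalg.eigh(K)
embedding = vecs[:, -2:] * np.sqrt(np.maximum(vals[-2:], 0))
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)
```

The eigen-embedding step is one common way to run k-means "in kernel space"; the paper's pipeline applies k-means to the final fused kernel in an analogous manner.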

Journal ArticleDOI
TL;DR: In this paper, the authors propose a fast and efficient multiple kernel learning (MKL) algorithm, particularly suited to large-scale data, that integrates kernel approximation and group Lasso formulations into a conjoint model.
Abstract: Dataset sizes in computational biology have increased drastically with improved data collection tools and growing patient cohorts. Kernel-based machine learning algorithms previously proposed for increased interpretability started to fail with large sample sizes, owing to their lack of scalability. To overcome this problem, we propose a fast and efficient multiple kernel learning (MKL) algorithm, particularly suited to large-scale data, that integrates kernel approximation and group Lasso formulations into a conjoint model. Our method extracts significant and meaningful information from genomic data while conjointly learning a model for out-of-sample prediction. It scales with increasing sample size by approximating, instead of calculating, distinct kernel matrices. To test our computational framework, namely Multiple Approximate Kernel Learning (MAKL), we ran experiments on three cancer datasets and showed that MAKL is capable of outperforming the baseline algorithm while using only a small fraction of the input features. We also report selection frequencies of approximated kernel matrices associated with feature subsets (i.e., gene sets/pathways), which helps assess their relevance for the given classification task. Our fast and interpretable MKL algorithm, producing sparse solutions, is promising for computational biology applications considering its scalability and the highly correlated structure of genomic datasets, and it can be used to discover new biomarkers and new therapeutic guidelines. MAKL is available at https://github.com/begumbektas/makl together with the scripts that replicate the reported experiments. MAKL is also available as an R package at https://cran.r-project.org/web/packages/MAKL. Supplementary data are available at Bioinformatics online.
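A rough sketch of the kernel-approximation-plus-sparsity recipe, using scikit-learn's Nystroem map for one hypothetical "gene set" feature group at a time and a plain Lasso as a stand-in for the sparse model. A true group Lasso, as in MAKL, would zero out entire per-kernel coefficient blocks at once; here we only inspect per-group coefficient norms as a proxy for kernel selection.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 200
# Three hypothetical "gene set" feature groups; only the first drives the label.
groups = [rng.standard_normal((n, 5)) for _ in range(3)]
y = groups[0] @ np.array([1.0, -1.0, 0.5, 0.0, 0.0]) + 0.1 * rng.standard_normal(n)

# Approximate one RBF kernel per group with Nystroem landmarks,
# then stack the approximate feature maps side by side.
maps = [Nystroem(gamma=0.2, n_components=20, random_state=0).fit_transform(G)
        for G in groups]
Z = np.hstack(maps)

# Sparse linear model over the stacked maps; a group Lasso would instead
# drop entire 20-column blocks (i.e. whole approximated kernels) jointly.
model = Lasso(alpha=0.01).fit(Z, y)
group_norms = [np.linalg.norm(model.coef_[i * 20:(i + 1) * 20]) for i in range(3)]
```

The per-group norms play the role of the selection frequencies the abstract mentions: groups whose approximated kernels carry no signal should shrink toward zero.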

Journal ArticleDOI
01 Aug 2022
TL;DR: The Multiple Kernel Transfer Clustering (MKTC) method as mentioned in this paper uses a weakly supervised multi-instance subset of the dataset, in which sets of data instances are jointly assigned labels.
Abstract: Multiple kernel clustering methods have been quite successful recently especially concerning the multi-view clustering of complex datasets. These methods simultaneously learn a multiple kernel metric while clustering in an unsupervised setting. With the motivation that some minimal supervision can potentially increase their effectiveness, we propose a Multiple Kernel Transfer Clustering (MKTC) method that can be described in terms of two tasks: a source task, where the multiple kernel metric is learned, and a target task where the multiple kernel metric is transferred to partition a dataset. In the source task, we create a weakly supervised multi-instance subset of the dataset, where a set of data instances are together provided some labels. We put forth a Multiple Kernel Multi-Instance $k$-Means (MKMIKM) method to simultaneously cluster the multi-instance subset while also learning a multiple kernel metric under weak supervision. In the target task, MKTC transfers the multiple kernel metric learned by MKMIKM to perform unsupervised single-instance clustering of the entire dataset in a single step. The advantage of using a multi-instance setup for the source task is that it requires reduced labeling effort to guide the learning of the multiple kernel metric. Our formulations lead to a significantly lower computational cost in comparison to the state-of-the-art multiple kernel clustering algorithms, making them more applicable to larger datasets. Experiments over benchmark computer vision datasets suggest that MKTC can achieve significant improvements in clustering performance in comparison to the state-of-the-art unsupervised multiple kernel clustering methods and other transfer clustering methods.
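The transfer idea can be caricatured as: learn kernel weights on a small labelled subset (source task), then reuse them to cluster the full set (target task). The sketch below uses kernel-target alignment on synthetic data as a simple stand-in for MKMIKM's multi-instance weight learning; everything here (data, bandwidths, subset) is invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def rbf(X, gamma):
    d2 = ((X[:, None] - X[None, :])**2).sum(-1)
    return np.exp(-gamma * d2)

def alignment(K, y):
    # Kernel-target alignment: how well K matches the ideal label kernel.
    Y = np.equal.outer(y, y).astype(float) * 2 - 1
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
X = np.hstack([2 * labels[:, None] + 0.3 * rng.standard_normal((100, 2)),
               0.5 * rng.standard_normal((100, 2))])  # 2 informative + 2 noise dims

gammas = [0.05, 0.5, 5.0]

# "Source task": a small labelled subset guides the kernel weights
# (a crude stand-in for MKMIKM's weakly supervised multi-instance learning).
idx = np.r_[0:15, 50:65]
aligns = np.array([alignment(rbf(X[idx], g), labels[idx]) for g in gammas])
w = np.clip(aligns, 0, None)
w /= w.sum()

# "Target task": transfer the learned weights to cluster the full dataset.
K = sum(wi * rbf(X, g) for wi, g in zip(w, gammas))
vals, vecs = np.linalg.eigh(K)
emb = vecs[:, -2:] * np.sqrt(np.maximum(vals[-2:], 0))
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)
```

The point of the sketch is the division of labour: the weight vector `w` is fit once on the cheap labelled subset and then applied unchanged to the unsupervised clustering of all samples.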


Proceedings ArticleDOI
01 Oct 2022
TL;DR: In this article, a method based on kernel density negative example learning is proposed to classify passing data effectively on the RoboCup2D simulation soccer platform; it extracts action chain feature data conforming to preset patterns by parsing game log files with kernel density analysis, and weakens the weights of features common across passing data through negative example learning to improve the classification of different kinds of passes.
Abstract: RoboCup2D simulation soccer is a multi-agent collaborative simulation experiment platform. On the RoboCup2D platform, the passing module is one of the key modules of a team, and an effective passing strategy improves overall team strength. Because the parameters of the passing module are complicated and the data distribution is heterogeneous, classifying passing data effectively is difficult. We therefore propose a method based on kernel density negative example learning: it extracts action chain feature data conforming to our presets by parsing the game log files with kernel density analysis, and weakens the weights of features common across passing data through negative example learning, improving the classification of different kinds of passes. By analyzing the similarities and differences of passing action chain features when facing different teams, we extract the key attacking areas to optimize the team's passing strategy and improve the average number of goals scored in tests against several teams.
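A minimal sketch of the density-based ingredient, assuming a synthetic 1-D pass feature: kernel density estimates fitted to positive and negative examples yield a log-density ratio that downweights regions common to the negative class. The paper's actual weighting over action chain features parsed from game logs is more involved; this only conveys the mechanism.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
# Hypothetical 1-D "pass feature" (e.g. pass distance) for successful passes
# and for negative examples (failed passes).
pos = rng.normal(12.0, 2.0, (300, 1))
neg = rng.normal(18.0, 3.0, (300, 1))

kde_pos = KernelDensity(bandwidth=1.0).fit(pos)
kde_neg = KernelDensity(bandwidth=1.0).fit(neg)

def score(x):
    # Log-density ratio: regions dense in negative examples get their
    # weight reduced, echoing "weakening features common to negatives".
    x = np.asarray(x, float).reshape(-1, 1)
    return kde_pos.score_samples(x) - kde_neg.score_samples(x)
```

A feature value typical of successful passes then scores positive, while one typical of the negative examples scores negative, giving a per-feature weight for the downstream classifier.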

Proceedings ArticleDOI
20 Jan 2022
TL;DR: In this paper, a new geometric feature fusion framework with score and feature fusion on 3D skeletal data was developed: geometric features such as relative distance and relative angle features are extracted from skeleton data, and various fusion models are adopted, with early fusion combining the features and late fusion combining the classifiers' output scores for both geometric feature types.
Abstract: The identification of human actions and behavior has been widely studied using computer vision and artificial intelligence. Many researchers have proposed action recognition strategies that extract action features in order to identify actions accurately. In the present article, we develop a new geometric feature fusion framework, with both score and feature fusion, on 3D skeletal data. First, geometric features such as relative distance features and relative angle features are extracted from the skeleton data. Next, various fusion models are adopted: early fusion, which combines the features, and late fusion, which combines the output scores of both geometric feature types after the classifier. The extracted features are trained with an adaptive kernel learning classifier to recognize actions. We evaluate our architecture against other architectures on our own PACE3DAction dataset and on benchmark skeleton action datasets such as PKU-MMD and NTU RGB-D.
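The two geometric feature types can be sketched directly from 3D joint coordinates; the 4-joint skeleton below is a hypothetical layout for illustration, not the PKU-MMD or NTU RGB-D joint set.

```python
import numpy as np

def relative_distances(joints):
    # Pairwise Euclidean distances between all joints of one skeleton frame.
    d = joints[:, None, :] - joints[None, :, :]
    D = np.sqrt((d**2).sum(-1))
    iu = np.triu_indices(len(joints), k=1)
    return D[iu]            # upper triangle as a flat feature vector

def joint_angle(a, b, c):
    # Angle at joint b formed by segments b->a and b->c, in radians.
    u, v = a - b, c - b
    cosang = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cosang, -1.0, 1.0))

# Toy 4-joint skeleton: hip, knee, ankle, toe (hypothetical layout).
frame = np.array([[0.0, 1.0, 0.0],   # hip
                  [0.0, 0.5, 0.1],   # knee
                  [0.0, 0.0, 0.0],   # ankle
                  [0.1, 0.0, 0.2]])  # toe

dist_feats = relative_distances(frame)                  # 4*3/2 = 6 distances
knee_angle = joint_angle(frame[0], frame[1], frame[2])  # angle at the knee
```

In an early-fusion setup the two vectors would simply be concatenated per frame; in late fusion, each feature type would feed its own classifier and the output scores would be combined afterwards.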

Posted ContentDOI
26 Apr 2022
TL;DR: In this article, a convex multiple kernel learning (MKL) based approach was proposed to detect unknown/unseen face presentation attacks in face spoofing datasets. The approach was also applied to general object image datasets, demonstrating its efficacy for abnormality and novelty detection.
Abstract: The paper studies face spoofing, a.k.a. presentation attack detection (PAD), in the demanding scenario of unknown attacks. While earlier studies have revealed the benefits of ensemble methods, and in particular of a multiple kernel learning (MKL) approach to the problem, one limitation of such techniques is that they typically treat the entire observation space uniformly and ignore any variability and local structure inherent to the data. This work studies this aspect of face presentation attack detection in the context of one-class multiple kernel learning, exploiting the intrinsic local structure in bona fide samples to adaptively weight each representation in the composite kernel. More concretely, inspired by the success of the one-class Fisher null formalism, we formulate a convex localised multiple kernel learning algorithm by imposing a joint matrix-norm constraint on the collection of local kernel weights, and infer locally adaptive weights for zero-shot one-class unseen attack detection. We present a theoretical study of the proposed localised MKL algorithm using Rademacher complexities to characterise its generalisation capability and demonstrate the advantages of the proposed technique over some other options. An assessment of the proposed approach on general object image datasets illustrates its efficacy for abnormality and novelty detection, while experiments on face PAD datasets verify its potential for detecting unknown/unseen face presentation attacks.
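The flavour of locally adaptive kernel weighting can be sketched as follows, with hand-fixed per-region weights standing in for the weights the paper infers under its matrix-norm constraint, and a one-class SVM standing in for the Fisher null formalism. All data, bandwidths, and weights are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

def rbf(X, Y, gamma):
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
# Bona fide training data with two local modes (two "regions").
train = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(3, 0.3, (100, 2))])

gammas = [0.5, 5.0]
centres = KMeans(n_clusters=2, n_init=10, random_state=0).fit(train).cluster_centers_

def local_weights(X):
    # Soft assignment of each sample to a local region; each region carries
    # its own kernel-weight vector (fixed by hand here as an illustration).
    d2 = ((X[:, None, :] - centres[None, :, :])**2).sum(-1)
    resp = np.exp(-(d2 - d2.min(1, keepdims=True)))   # numerically stable
    resp /= resp.sum(1, keepdims=True)
    region_w = np.array([[0.8, 0.2], [0.2, 0.8]])     # per-region kernel weights
    return resp @ region_w                            # (n_samples, n_kernels)

def composite(X, Y):
    # k(x, y) = sum_m sqrt(b_m(x) b_m(y)) K_m(x, y) stays positive semidefinite.
    bx, by = local_weights(X), local_weights(Y)
    return sum(np.sqrt(np.outer(bx[:, m], by[:, m])) * rbf(X, Y, g)
               for m, g in enumerate(gammas))

oc = OneClassSVM(kernel="precomputed", nu=0.1).fit(composite(train, train))
scores = oc.decision_function(composite(np.array([[0.0, 0.0], [10.0, 10.0]]), train))
```

An in-distribution point then scores higher than a far-away anomaly, while each region of the bona fide data contributes through its own kernel mixture rather than one global weighting.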

Journal ArticleDOI
08 Jun 2022
TL;DR: In this paper, the authors study kernel learning problems with ramp loss, a nonconvex but noise-resistant loss function, and show that the generalization bound for the empirical ramp risk minimizer is similar to that of convex surrogate losses, implying that kernel learning with such a loss function is not only noise-resistant but also statistically consistent.
Abstract: We study kernel learning problems with ramp loss, a nonconvex but noise-resistant loss function. In this work, we justify the validity of ramp loss under the classical kernel learning framework. In particular, we show that the generalization bound for the empirical ramp risk minimizer is similar to that of convex surrogate losses, which implies that kernel learning with such a loss function is not only noise-resistant but, more importantly, statistically consistent. To adapt to real-time data streams, we introduce PA-ramp, a heuristic online algorithm based on the passive-aggressive framework, to solve this learning problem. Empirically, this algorithm achieves robust performance with fewer support vectors in the tested noisy scenarios.
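A sketch of the ramp loss and a heuristic passive-aggressive update in its spirit. The paper's PA-ramp operates in a kernel setting and is specified differently; this toy linear version only conveys the key idea that points far below the margin, which are likely label noise, incur a bounded penalty and trigger no update.

```python
import numpy as np

def ramp_loss(margin, s=-1.0):
    # Ramp loss: the hinge clipped at 1 - s, so badly misclassified points
    # (margin < s) incur a bounded penalty instead of a linearly growing one.
    return np.minimum(1.0 - s, np.maximum(0.0, 1.0 - margin))

def pa_ramp_update(w, x, y, C=1.0, s=-1.0):
    # Heuristic passive-aggressive step in the spirit of PA-ramp:
    # update only when the margin lies in the "active" ramp region [s, 1),
    # ignoring points with margin < s as presumed label noise.
    margin = y * (w @ x)
    if s <= margin < 1.0:
        tau = min(C, (1.0 - margin) / (x @ x))
        w = w + tau * y * x
    return w

rng = np.random.default_rng(0)
w = np.zeros(2)
for _ in range(200):
    x = rng.normal(0, 1, 2)
    y = np.sign(x[0] + 0.5 * x[1])       # noiseless toy stream
    w = pa_ramp_update(w, x, y)
acc = np.mean([np.sign(w @ x) == np.sign(x[0] + 0.5 * x[1])
               for x in rng.normal(0, 1, (500, 2))])
```

The bounded loss is what gives the noise resistance discussed in the abstract: a hinge-based update would be dragged toward every mislabelled outlier, while the ramp region caps its influence.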