
Showing papers on "Multiple kernel learning" published in 2023


Journal ArticleDOI
TL;DR: In this paper, Kreĭn-SVM models are proposed and developed for AMP classification and prediction, employing the Levenshtein distance and local alignment score as sequence similarity functions.
Abstract: Antimicrobial peptides (AMPs) represent a potential solution to the growing problem of antimicrobial resistance, yet their identification through wet-lab experiments is a costly and time-consuming process. Accurate computational predictions would allow rapid in silico screening of candidate AMPs, thereby accelerating the discovery process. Kernel methods are a class of machine learning algorithms that utilise a kernel function to transform input data into a new representation. When appropriately normalised, the kernel function can be regarded as a notion of similarity between instances. However, many expressive notions of similarity are not valid kernel functions, meaning they cannot be used with standard kernel methods such as the support-vector machine (SVM). The Kreĭn-SVM represents a generalisation of the standard SVM that admits a much larger class of similarity functions. In this study, we propose and develop Kreĭn-SVM models for AMP classification and prediction by employing the Levenshtein distance and local alignment score as sequence similarity functions. Utilising two datasets from the literature, each containing more than 3000 peptides, we train models to predict general antimicrobial activity. Our best models achieve AUCs of 0.967 and 0.863 on the test sets of the respective datasets, outperforming the in-house and literature baselines in both cases. We also curate a dataset of experimentally validated peptides, measured against Staphylococcus aureus and Pseudomonas aeruginosa, in order to evaluate the applicability of our methodology in predicting microbe-specific activity. In this case, our best models achieve AUCs of 0.982 and 0.891, respectively. Models to predict both general and microbe-specific activities are made available as web applications.

2 citations
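
As a rough illustration of the idea in the Kreĭn-SVM entry above, the sketch below computes the Levenshtein distance between peptide strings and turns it into a similarity matrix. The peptide strings and the normalisation `1 - d / max_len` are hypothetical choices made for this example, not the ones used in the paper. A matrix built this way is not guaranteed to be positive semidefinite, which is the situation Kreĭn-space methods are meant to handle.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalisation: 1.0 for identical strings, lower for distant ones."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Made-up sequences, for illustration only.
peptides = ["GLFDIVKK", "GLFDIIKK", "KWKLFKKI"]
gram = [[similarity(p, q) for q in peptides] for p in peptides]
```

The resulting `gram` matrix is symmetric with ones on the diagonal; whether it is a valid kernel matrix would have to be checked separately, for example by inspecting its eigenvalues.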


Journal ArticleDOI
01 Apr 2023
TL;DR: In this paper, a multiple kernel learning (MKL) embedded multi-objective swarm intelligence technique is proposed to identify candidate biomarker genes from the transcriptomic profile of arsenicosis samples.
Abstract: Arsenic is a carcinogen, and long-term exposure to it may result in the development of multi-organ disease. Understanding the underlying intricate molecular network of toxicity and carcinogenicity is crucial for identifying a small set of differentially expressed biomarker genes to predict the risk of the exposed population. In this paper, a multiple kernel learning (MKL) embedded multi-objective swarm intelligence technique has been proposed to identify the candidate biomarker genes from the transcriptomic profile of arsenicosis samples. To achieve the optimal classification accuracy along with the minimum number of genes, a multi-objective random spatial local best particle swarm optimization (MO-RSplbestPSO) has been utilized. The proposed MO-RSplbestPSO also guides the multiple kernel learning mechanism which provides data specific classification. The proposed computational framework has been applied to the developed whole genome DNA microarray prepared using blood samples collected from a specific arsenic exposed area of the Indian state of West Bengal. A set of twelve biomarker genes, with four novel genes, are successfully identified for the classification of exposure to arsenic and its subcategories, which can be used as future prognostic biomarkers for screening of arsenic exposed populations. Also, the biological significance of each gene is detailed to delineate the complex molecular networking and mode of toxicity.

1 citation


Journal ArticleDOI
TL;DR: In this article, a one-stage RFF-based kernel learning method is proposed, in which a generative network built on random Fourier features (RFFs) is devised to implicitly learn the kernel, followed by a linear classifier parameterized as a fully connected layer.

1 citation
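
For readers unfamiliar with random Fourier features, the following is a minimal sketch of the standard, fixed-distribution construction that approximates a Gaussian (RBF) kernel. It does not reproduce the learned generative network described in the entry above; the function name `make_rff` and all parameter values are assumptions made for this example.

```python
import math
import random


def make_rff(dim: int, n_features: int, gamma: float, seed: int = 0):
    """Return a feature map z with z(x).z(y) ~ exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2*gamma*I), phases from U[0, 2*pi].
    scale = math.sqrt(2.0 * gamma)
    freqs = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def z(x):
        return [norm * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return z


# Compare the approximation with the exact kernel value on two made-up points.
gamma = 0.5
z = make_rff(dim=2, n_features=2000, gamma=gamma)
x, y = [0.3, -0.2], [0.1, 0.4]
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
```

Because the feature map is explicit, a linear classifier trained on `z(x)` behaves approximately like a kernel classifier; a kernel learning method would additionally adapt the distribution of the frequencies rather than fixing it as done here.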


Journal ArticleDOI
TL;DR: This paper proposes adding a matrix-induced regularization to localized SimpleMKKM (LI-SimpleMKKM-MR) to enhance the complementarity between base kernels.
Abstract: Multikernel clustering achieves clustering of linearly inseparable data by applying a kernel method to samples in multiple views. A localized SimpleMKKM (LI-SimpleMKKM) algorithm has recently been proposed to perform min-max optimization in multikernel clustering where each instance is only required to be aligned with a certain proportion of the relatively close samples. The method has improved the reliability of clustering by focusing on the more closely paired samples and dropping the more distant ones. Although LI-SimpleMKKM achieves remarkable success in a wide range of applications, the method keeps the sum of the kernel weights unchanged. Thus, it restricts kernel weights and does not consider the correlation between the kernel matrices, especially between paired instances. To overcome such limitations, we propose adding a matrix-induced regularization to localized SimpleMKKM (LI-SimpleMKKM-MR). Our approach addresses the kernel weight restrictions with the regularization term and enhances the complementarity between base kernels. Thus, it does not limit kernel weights and fully considers the correlation between paired instances. Extensive experiments on several publicly available multikernel datasets show that our method performs better than its counterparts.

1 citation


Book ChapterDOI
01 Jan 2023
TL;DR: This paper proposes a multiple scale multiple layer multiple kernel learning (MS-DKL) method that fuses deep and shallow representations of mineral image features for mineral recognition.
Abstract: Identifying sandstone images and judging the types of minerals play an important role in oil and gas reservoir exploration and evaluation. The multiple kernel learning (MKL) method has shown high performance in some practical applications. However, it has a shallow structure and cannot handle relatively complex problems well. With the development of deep learning in recent years, many researchers have proposed a deep multiple layer multiple kernel learning (DMLMKL) method based on a deep structure. The existing DMLMKL method, however, considers only the deep representation of the data and ignores the shallow representation. Therefore, this paper proposes a multiple scale multiple layer multiple kernel learning (MS-DKL) method that obtains “richer” feature data by fusing deep and shallow representations of mineral image features. Mineral recognition results show that the MS-DKL algorithm achieves higher accuracy than the MKL and DMLMKL methods.

Posted ContentDOI
13 Apr 2023
TL;DR: This paper provides a comprehensive overview of the application of kernel learning algorithms in survival analysis; its findings suggest that using multiple kernels instead of a single kernel can make decision functions more interpretable and can improve performance.
Abstract: Background: The time until an event happens is the outcome variable of interest in the statistical data analysis method known as survival analysis. Some researchers have created kernel statistics for various types of data and kernels that allow the association of a set of markers with survival data. Multiple Kernel Learning (MKL) is often considered a linear or convex combination of multiple kernels. This paper aims to provide a comprehensive overview of the application of kernel learning algorithms in survival analysis. Methods: We conducted a systematic review which involved an extensive search for relevant literature in the field of biomedicine. After using the keywords in literature searching, 435 articles were identified based on the title and abstract screening. Results: In this review, out of a total of 56 selected articles, only the 20 articles that used MKL for high-dimensional data were included. In most of these articles, the MKL method has been expanded and introduced as a novel method. In these studies, the extended MKL models, due to the nature of classification or regression, have been compared with SVM, Cox PH (Cox), Extreme Learning Machine (ELM), MKCox, Gradient Boosting (GBCox), Parametric Censored Regression Models (PCRM), Elastic-net Cox (EN-Cox), LASSO-Cox, Random Survival Forests (RSF), and Boosting Concordance Index (BoostCI). In most of these articles, the optimal model’s parameters are estimated by 10-fold cross-validation. In addition, the Concordance index (C-index) and the area under the ROC curve (AUC) were calculated to quantitatively measure the performance of all methods for validation. Predictive accuracy is improved by using kernels. Conclusion: Our findings suggest that using multiple kernels instead of a single kernel can make decision functions more interpretable and can improve performance.
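
The description of MKL as a linear or convex combination of kernels can be made concrete with a small sketch. The three base kernels, the data points, and the hand-picked weights below are assumptions made for this example; an actual MKL method would learn the weights from data.

```python
import math


def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def poly_kernel(x, y, degree=2):
    return (1.0 + linear_kernel(x, y)) ** degree


def combined_kernel(x, y, kernels, weights):
    """Convex combination of base kernels: weights are non-negative and sum to one."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * k(x, y) for w, k in zip(weights, kernels))


kernels = [linear_kernel, rbf_kernel, poly_kernel]
weights = [0.2, 0.5, 0.3]  # fixed by hand here; MKL would learn these
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.5]]
gram = [[combined_kernel(a, b, kernels, weights) for b in X] for a in X]
```

Since each base kernel is positive semidefinite and the weights are non-negative, the combined Gram matrix is positive semidefinite as well, so it can be passed to any standard kernel method.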


Proceedings ArticleDOI
27 Jan 2023
TL;DR: This paper explores the potential of ML to improve decision-making in OS kernel subsystems such as memory management and process and I/O scheduling, which currently rely on hand-tuned heuristics, and builds a system called LAKE for supporting ML and exposing accelerators in kernel space.
Abstract: The complexity of modern operating systems (OSes), rapid diversification of hardware, and steady evolution of machine learning (ML) motivate us to explore the potential of ML to improve decision-making in OS kernels. We conjecture that ML can better manage tradeoff spaces for subsystems such as memory management and process and I/O scheduling that currently rely on hand-tuned heuristics to provide reasonable average-case performance. We explore the replacement of heuristics with ML-driven decision-making in five kernel subsystems and consider the implications for kernel design, shared OS-level components, and access to hardware acceleration. We identify obstacles, address challenges and characterize tradeoffs for the benefits ML can provide that arise in kernel space. We find that use of specialized hardware such as GPUs is critical to absorbing the additional computational load required by ML decisioning, but that poor accessibility of accelerators in kernel space is a barrier to adoption. We also find that the benefits of ML and acceleration for OSes are subsystem-, workload- and hardware-dependent, suggesting that using ML in kernels will require frameworks to help kernel developers navigate new tradeoff spaces. We address these challenges by building a system called LAKE for supporting ML and exposing accelerators in kernel space. LAKE includes APIs for feature collection and management across abstraction layers and module boundaries. LAKE provides mechanisms for managing the variable profitability of acceleration, and interfaces for mitigating contention for resources between user and kernel space. We show that an ML-backed I/O latency predictor can have its inference time reduced by up to 96% with acceleration.

Journal ArticleDOI
TL;DR: This paper addresses asymmetric kernel-based learning in the framework of the least squares support vector machine, named AsK-LS, resulting in the first classification method that can utilize asymmetric kernels directly.
Abstract: Asymmetric kernels naturally exist in real life, e.g., for conditional probability and directed graphs. However, most of the existing kernel-based learning methods require kernels to be symmetric, which prevents the use of asymmetric kernels. This paper addresses asymmetric kernel-based learning in the framework of the least squares support vector machine, named AsK-LS, resulting in the first classification method that can utilize asymmetric kernels directly. We show that AsK-LS can learn with asymmetric features, namely source and target features, while the kernel trick remains applicable, i.e., the source and target features exist but are not necessarily known. Besides, the computational burden of AsK-LS is as low as that of dealing with symmetric kernels. Experimental results on various tasks, including Corel, PASCAL VOC, Satellite, directed graphs, and UCI database, all show that when asymmetric information is crucial, the proposed AsK-LS can learn with asymmetric kernels and performs much better than the existing kernel methods that rely on symmetrization to accommodate asymmetric kernels.

Journal ArticleDOI
TL;DR: In this paper, a multi-view kernel learning approach is proposed that aims to learn a consensus kernel, which efficiently captures the heterogeneous information of individual views and depicts the underlying inherent cluster structure.
Abstract: Gene expression data sets and protein-protein interaction (PPI) networks are two heterogeneous data sources that have been extensively studied, due to their ability to capture the co-expression patterns among genes and their topological connections. Although they depict different traits of the data, both of them tend to group co-functional genes together. This phenomenon agrees with the basic assumption of multi-view kernel learning, according to which different views of the data contain a similar inherent cluster structure. Based on this inference, a new multi-view kernel learning based disease gene identification algorithm, termed as DiGId, is put forward. A novel multi-view kernel learning approach is proposed that aims to learn a consensus kernel, which efficiently captures the heterogeneous information of individual views as well as depicts the underlying inherent cluster structure. Some low-rank constraints are imposed on the learned multi-view kernel, so that it can effectively be partitioned into $k$ or fewer clusters. The learned joint cluster structure is used to curate a set of potential disease genes. Moreover, a novel approach is put forward to quantify the importance of each view. In order to demonstrate the effectiveness of the proposed approach in capturing the relevant information depicted by individual views, an extensive analysis is performed on four different cancer-related gene expression data sets and PPI network, considering different similarity measures.

Journal ArticleDOI
TL;DR: This paper proposes three-factor penalized, non-negative matrix factorization-based multiple kernel learning with soft margin hinge loss (3PNMF-MKL) for multi-modal data integration followed by gene signature detection.
Abstract: In the current era, handling biomedical big data is a challenging task. In particular, the integration of multi-modal data, followed by significant feature mining (gene signature detection), becomes a daunting task. With this in mind, we propose a novel framework, namely, three-factor penalized, non-negative matrix factorization-based multiple kernel learning with soft margin hinge loss (3PNMF-MKL) for multi-modal data integration, followed by gene signature detection. In brief, limma, employing the empirical Bayes statistics, was initially applied to each individual molecular profile, and the statistically significant features were extracted, which was followed by the three-factor penalized non-negative matrix factorization method used for data/matrix fusion using the reduced feature sets. Multiple kernel learning models with soft margin hinge loss were deployed to estimate average accuracy scores and the area under the curve (AUC). Gene modules were identified by the consecutive analysis of average linkage clustering and dynamic tree cut. The best module containing the highest correlation was considered the potential gene signature. We utilized an acute myeloid leukemia cancer dataset from The Cancer Genome Atlas (TCGA) repository containing five molecular profiles. Our algorithm generated a 50-gene signature that achieved a high classification AUC score (viz., 0.827). We explored the functions of signature genes using pathway and Gene Ontology (GO) databases. Our method outperformed the state-of-the-art methods in terms of computing AUC. Furthermore, we included some comparative studies with other related methods to enhance the acceptability of our method. Finally, it can be noted that our algorithm can be applied to any multi-modal dataset for data integration, followed by gene module discovery.

Journal ArticleDOI
TL;DR: The Contrastive Multi-view Kernel (CMK) is a novel kernel function based on the emerging contrastive learning framework; it implicitly embeds the views into a joint semantic space where all of them resemble each other while promoting the learning of diverse views.
Abstract: The kernel method is a proven technique in multi-view learning. It implicitly defines a Hilbert space where samples can be linearly separated. Most kernel-based multi-view learning algorithms compute a kernel function aggregating and compressing the views into a single kernel. However, existing approaches compute the kernels independently for each view. This ignores complementary information across views and thus may result in a bad kernel choice. In contrast, we propose the Contrastive Multi-view Kernel, a novel kernel function based on the emerging contrastive learning framework. The Contrastive Multi-view Kernel implicitly embeds the views into a joint semantic space where all of them resemble each other while promoting the learning of diverse views. We validate the method's effectiveness in a large empirical study. It is worth noting that the proposed kernel functions share the types and parameters with traditional ones, making them fully compatible with existing kernel theory and application. On this basis, we also propose a contrastive multi-view clustering framework and instantiate it with multiple kernel $k$-means, achieving a promising performance. To the best of our knowledge, this is the first attempt to explore kernel generation in the multi-view setting and the first approach to use contrastive learning for multi-view kernel learning.

Journal ArticleDOI
TL;DR: In this article, a triple-kernel gated attention-based multiple instance learning model with contrastive learning is proposed to overcome the limitations of existing multiple instance learning approaches to medical image analysis.
Abstract: In machine learning, multiple instance learning is a method evolved from supervised learning algorithms, which defines a "bag" as a collection of multiple examples with a wide range of applications. In this paper, we propose a novel deep multiple instance learning model for medical image analysis, called triple-kernel gated attention-based multiple instance learning with contrastive learning. It can be used to overcome the limitations of the existing multiple instance learning approaches to medical image analysis. Our model consists of four steps. i) Extracting the representations by a simple convolutional neural network using contrastive learning for training. ii) Using three different kernel functions to obtain the importance of each instance from the entire image and forming an attention map. iii) Based on the attention map, aggregating the entire image together by attention-based MIL pooling. iv) Feeding the results into the classifier for prediction. The results on different datasets demonstrate that the proposed model outperforms state-of-the-art methods on binary and weakly supervised classification tasks. It can provide more efficient classification results for various disease models and additional explanatory information.


Journal ArticleDOI
TL;DR: In this paper, kernel similarity embeddings are proposed for dynamic texture synthesis; they not only mitigate the high dimensionality and small sample issues, but also have the advantage of modeling nonlinear feature relationships.
Abstract: Dynamic texture (DT) exhibits statistical stationarity in the spatial domain and stochastic repetitiveness in the temporal dimension, indicating that different frames of DT possess a high similarity correlation that is critical prior knowledge. However, existing methods cannot effectively learn a synthesis model for high-dimensional DT from a small number of training samples. In this article, we propose a novel DT synthesis method, which makes full use of similarity as prior knowledge to address this issue. Our method is based on the proposed kernel similarity embedding, which can not only mitigate the high dimensionality and small sample issues, but also has the advantage of modeling nonlinear feature relationships. Specifically, we first put forward two hypotheses that are essential for the DT model to generate new frames using similarity correlations. Then, we integrate kernel learning and the extreme learning machine into a unified synthesis model to learn kernel similarity embeddings for representing DTs. Extensive experiments on DT videos collected from the Internet and two benchmark datasets, i.e., Gatech Graphcut Textures and Dyntex, demonstrate that the learned kernel similarity embeddings can provide discriminative representations for DTs. Further, our method can preserve the long-term temporal continuity of the synthesized DT sequences with excellent sustainability and generalization. Meanwhile, it effectively generates realistic DT videos with higher speed and lower computation than the current state-of-the-art methods. The code and more synthesis videos are available at our project page https://shiming-chen.github.io/Similarity-page/Similarit.html .

Journal ArticleDOI
TL;DR: LatSim combines metric learning with a kernel similarity function and softmax aggregation to identify task-related similarities between subjects, which are used to improve performance on three prediction tasks using multi-paradigm fMRI data.
Abstract: Endophenotypes such as brain age and fluid intelligence are important biomarkers of disease status. However, brain imaging studies to identify these biomarkers often encounter limited numbers of subjects but high dimensional imaging features, hindering reproducibility. Therefore, we develop an interpretable, multivariate classification/regression algorithm, called Latent Similarity (LatSim), suitable for small sample size but high feature dimension datasets. LatSim combines metric learning with a kernel similarity function and softmax aggregation to identify task-related similarities between subjects. Inter-subject similarity is utilized to improve performance on three prediction tasks using multi-paradigm fMRI data. A greedy selection algorithm, made possible by LatSim's computational efficiency, is developed as an interpretability method. LatSim achieved significantly higher predictive accuracy at small sample sizes on the Philadelphia Neurodevelopmental Cohort (PNC) dataset. Connections identified by LatSim gave superior discriminative power compared to those identified by other methods. We identified 4 functional brain networks enriched in connections for predicting brain age, sex, and intelligence. We find that most information for a predictive task comes from only a few (1-5) connections. Additionally, we find that the default mode network is over-represented in the top connections of all predictive tasks. We propose a novel prediction algorithm for small sample, high feature dimension datasets and use it to identify connections in task fMRI data. Our work can lead to new insights in both algorithm design and neuroscience research.
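
The combination of a learned projection, a similarity function, and softmax aggregation can be sketched in a few lines. This is only an illustration of the general idea under stated assumptions, not LatSim's actual formulation: the projection matrix `A`, the inner-product similarity, the `temperature` parameter, and the toy data are all hypothetical, and in practice `A` would be learned rather than fixed.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(x, A):
    """Linear map into a latent space; A stands in for a learned metric."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def predict(x_new, train_X, train_y, A, temperature=1.0):
    """Similarity-weighted average of the training targets."""
    z_new = project(x_new, A)
    sims = [sum(p * q for p, q in zip(z_new, project(x, A))) / temperature
            for x in train_X]
    weights = softmax(sims)
    return sum(w * y for w, y in zip(weights, train_y))


# Toy data: two training subjects with two features each.
A = [[1.0, 0.0], [0.0, 1.0]]
train_X = [[1.0, 0.0], [0.0, 1.0]]
train_y = [10.0, 20.0]
pred = predict([1.0, 0.0], train_X, train_y, A)
```

The prediction is pulled towards the targets of the training subjects that are most similar in the latent space, which is why inter-subject similarity can help when there are few samples and many features.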

Posted ContentDOI
08 Mar 2023
TL;DR: In this paper, the authors formulate the multiple kernel learning problem for the support vector machine with the infamous $(0,1)$-loss function and give some first-order optimality conditions, which could be readily exploited to develop fast numerical solvers.
Abstract: We formulate the Multiple Kernel Learning (abbreviated as MKL) problem for the support vector machine with the infamous $(0,1)$-loss function. Some first-order optimality conditions are given, which could be readily exploited to develop fast numerical solvers, e.g., of the ADMM type.

Journal ArticleDOI
TL;DR: In this paper, a tensor multi-task learning (MTL) algorithm based on similarity measurement of the spatio-temporal variability of brain biomarkers is proposed to model AD progression.
Abstract: Machine learning approaches for predicting Alzheimer’s disease (AD) progression can substantially assist researchers and clinicians in developing effective AD preventive and treatment strategies. This study proposes a novel machine learning algorithm to predict AD progression utilising a multi-task ensemble learning approach. Specifically, we present a novel tensor multi-task learning (MTL) algorithm based on similarity measurement of spatio-temporal variability of brain biomarkers to model AD progression. In this model, the prediction of each patient sample in the tensor is set as one task, where all tasks share a set of latent factors obtained through tensor decomposition. Furthermore, as subjects have continuous records of brain biomarker testing, the model is extended to ensemble the subjects’ temporally continuous prediction results utilising a gradient boosting kernel to find more accurate predictions. We have conducted extensive experiments utilising data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) to evaluate the performance of the proposed algorithm and model. Results demonstrate that the proposed model has superior accuracy and stability in predicting AD progression compared to benchmarks and state-of-the-art multi-task regression methods in terms of the Mini Mental State Examination (MMSE) questionnaire and the Alzheimer’s Disease Assessment Scale-Cognitive Subscale (ADAS-Cog) cognitive scores. Brain biomarker correlation information can be utilised to identify variations in individual brain structures, and the model can be utilised to effectively predict the progression of AD with magnetic resonance imaging (MRI) data and cognitive scores of AD patients at different stages.

Journal ArticleDOI
TL;DR: In this article, a one-class multiple kernel learning (MKL) approach based on the Fisher null-space OCC principle is proposed for the one-class classification problem.
Abstract: We address the one-class classification (OCC) problem and advocate a one-class MKL (multiple kernel learning) approach for this purpose. To this aim, based on the Fisher null-space OCC principle, we present a multiple kernel learning algorithm where an $\ell_{p}$-norm regularisation ($p \geq 1$) is considered for kernel weight learning. We cast the proposed one-class MKL problem as a min-max saddle point Lagrangian optimisation task and propose an efficient approach to optimise it. An extension of the proposed approach is also considered where several related one-class MKL tasks are learned concurrently by constraining them to share common weights for kernels. An extensive evaluation of the proposed MKL approach on a range of data sets from different application domains confirms its merits against the baseline and several other algorithms.
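
For orientation, $\ell_p$-norm kernel weight learning is commonly written as a weighted sum of base kernels with a norm constraint on the weights. The block below states that generic form only; the symbols $K_m$, $\beta_m$ and $M$ are notation assumed here, and the paper's own Fisher null-space objective is not reproduced.

```latex
K_{\beta} = \sum_{m=1}^{M} \beta_m K_m,
\qquad \beta_m \ge 0,
\qquad \|\beta\|_p = \Big( \sum_{m=1}^{M} \beta_m^{\,p} \Big)^{1/p} \le 1,
\qquad p \ge 1 .
```

With $p = 1$ the constraint tends to produce sparse kernel weights, while larger values of $p$ spread the weight more evenly across the base kernels.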

Proceedings ArticleDOI
21 Mar 2023
TL;DR: In this article, a multiple kernel ensemble (MKE) learning framework is constructed from four SVM kernel functions and combined with gradient boosting decision tree (GBDT), genomic best linear unbiased prediction (GBLUP) and random forest (RF) to predict three economic traits, milk fat percentage (MFP), milk yield (MY), and somatic cell score (SCS), in German Holstein dairy cattle.
Abstract: Genomic selection (GS), which estimates genomic estimated breeding values (GEBVs) of individuals by using high-density molecular markers covering a genome-wide range combined with phenotypic records or pedigree information, has revolutionized animal and plant breeding. Support vector machines (SVM) have been shown to be an important method for implementing genomic selection, showing excellent prediction performance on a variety of traits, but the choice of hyperparameters and kernel functions has an important impact on the prediction performance. In this study, we integrated four kernel functions of SVM to construct a multiple kernel ensemble (MKE) learning framework and combined gradient boosting decision tree (GBDT), genomic best linear unbiased prediction (GBLUP) and random forest (RF) to predict GEBVs for three economic traits, milk fat percentage (MFP), milk yield (MY), and somatic cell score (SCS), in German Holstein dairy cattle. We also constructed an Optuna hyperparameter optimization (HO) framework and compared its prediction performance and time to find the optimal parameters with two commonly used methods, grid search and random search. The results show that the MKE framework outperforms the single kernel SVM as well as several other machine learning (ML) algorithms, with an average improvement of 10% in prediction accuracy for the three traits. Besides, the MKE framework with Optuna optimization has the best predictive performance on each trait. Therefore, we believe that MKE is an efficient and stable GS method for phenotype prediction.

Journal ArticleDOI
TL;DR: In this paper, manifold optimization based kernel preserving embedding (MOKPE) is proposed to project heterogeneous drug and target data into a unified embedding space by preserving drug-target interactions and drug-drug, target-target similarities simultaneously.
Abstract: In many applications of bioinformatics, data stem from distinct heterogeneous sources. One of the well-known examples is the identification of drug-target interactions (DTIs), which is of significant importance in drug discovery. In this paper, we propose a novel framework, manifold optimization based kernel preserving embedding (MOKPE), to efficiently solve the problem of modeling heterogeneous data. Our model projects heterogeneous drug and target data into a unified embedding space by preserving drug-target interactions and drug-drug, target-target similarities simultaneously. We performed ten replications of ten-fold cross validation on four different drug-target interaction network data sets for predicting DTIs for previously unseen drugs. The classification evaluation metrics showed better or comparable performance compared to previous similarity-based state-of-the-art methods. We also evaluated MOKPE on predicting unknown DTIs of a given network. Our implementation of the proposed algorithm in R together with the scripts that replicate the reported experiments is publicly available at https://github.com/ocbinatli/mokpe.