Cancer subtyping with heterogeneous multi-omics data via hierarchical multi-kernel learning

doi:10.1093/bib/bbac488

Yifang Wei, +6 more

- 25 Nov 2022 -

Briefings in Bioinformatics

- Vol. 24, Iss: 1

TLDR

The authors propose hierarchical multi-kernel learning (hMKL), a cancer molecular subtyping method that identifies cancer subtypes with a two-stage kernel learning strategy.

Abstract:

Differentiating cancer subtypes is crucial for guiding personalized treatment and improving patient prognosis. Integrating multi-omics data can offer a comprehensive landscape of cancer biological processes and provide promising avenues for cancer diagnosis and treatment. Taking the heterogeneity of different omics data types into account, we propose a hierarchical multi-kernel learning (hMKL) approach, a novel cancer molecular subtyping method that identifies cancer subtypes by adopting a two-stage kernel learning strategy. In stage 1, we obtain a composite kernel for each individual omics data type by optimizing its kernel parameters, borrowing the idea of cancer integration via multi-kernel learning (CIMLR). In stage 2, we obtain a final fused kernel through a weighted linear combination of the individual kernels learned in stage 1, using an unsupervised multiple kernel learning method. k-means clustering is then applied to the final fused kernel to identify cancer subtypes. Simulation studies show that hMKL outperforms the one-stage CIMLR method when there is data heterogeneity. hMKL can also estimate the number of clusters correctly, which is the key challenge in subtyping. Application to two real data sets shows that hMKL identified meaningful subtypes and key cancer-associated biomarkers. The proposed method provides a novel toolkit for heterogeneous multi-omics data integration and cancer subtype identification.
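The two-stage flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: stage 1 uses a single Gaussian kernel with a median-heuristic bandwidth instead of CIMLR-style parameter optimization, stage 2 uses uniform weights instead of weights learned by unsupervised multiple kernel learning, and k-means is run on a kernel-PCA embedding of the fused kernel. The function names and the toy data (`expression`, `methylation`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(X):
    """Stage 1 stand-in: one Gaussian kernel per omics matrix (samples x features).
    The bandwidth comes from the median squared distance (an assumption; the paper
    optimizes kernel parameters per omics type instead)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, 0.0)
    scale = np.median(d2[np.triu_indices_from(d2, k=1)])
    return np.exp(-d2 / scale)

def fuse_kernels(kernels, weights=None):
    """Stage 2 stand-in: weighted linear combination of per-omics kernels.
    Uniform weights are assumed here; the paper learns them without labels."""
    if weights is None:
        weights = np.ones(len(kernels))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

def cluster_from_kernel(K, n_clusters, n_iter=100):
    """k-means on the top eigenvectors of the centered fused kernel."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    vals, vecs = np.linalg.eigh(H @ K @ H)       # eigenvalues in ascending order
    Z = vecs[:, -n_clusters:] * np.sqrt(np.maximum(vals[-n_clusters:], 0.0))
    # Deterministic farthest-first initialization of the centers.
    centers = [Z[0]]
    for _ in range(1, n_clusters):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):                      # Lloyd iterations
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data: 40 samples in two well-separated groups, measured on two omics layers.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 20)
expression = rng.normal(size=(40, 50)) + 10.0 * groups[:, None]
methylation = rng.normal(size=(40, 30)) + 10.0 * groups[:, None]

fused = fuse_kernels([gaussian_kernel(expression), gaussian_kernel(methylation)])
labels = cluster_from_kernel(fused, n_clusters=2)
```

Building a separate kernel for each omics layer before fusing is what lets each data type keep its own scale and noise level. The fixed bandwidth and uniform weights used here are exactly the parts that hMKL replaces with learned quantities.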

