Journal ArticleDOI

FaRoC: Fast and Robust Supervised Canonical Correlation Analysis for Multimodal Omics Data

01 Apr 2018-IEEE Transactions on Systems, Man, and Cybernetics (IEEE)-Vol. 48, Iss: 4, pp 1229-1241
TL;DR: The formulation enables the proposed method to extract the required number of correlated features sequentially at lower computational cost than existing methods, and provides an efficient way to find the optimum regularization parameters employed in CCA.
Abstract: One of the main problems associated with high-dimensional multimodal real-life data sets is how to extract relevant and significant features. In this regard, a fast and robust feature extraction algorithm, termed FaRoC, is proposed, judiciously integrating the merits of canonical correlation analysis (CCA) and rough sets. The proposed method extracts new features sequentially from two multidimensional data sets by maximizing their relevance with respect to the class label and their significance with respect to already-extracted features. To generate canonical variables sequentially, an analytical formulation is introduced to establish the relation between the regularization parameters and CCA. The formulation enables the proposed method to extract the required number of correlated features sequentially at lower computational cost than existing methods. To compute both the significance and the relevance measure of a feature, the concept of the hypercuboid equivalence partition matrix of the rough hypercuboid approach is used. It also provides an efficient way to find the optimum regularization parameters employed in CCA. The efficacy of the proposed FaRoC algorithm, along with a comparison with other existing methods, is extensively established on several real-life data sets.
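As a rough illustration of the regularized-CCA machinery the abstract describes (a generic sketch, not the authors' FaRoC implementation; function and parameter names are illustrative), regularized CCA can be computed by diagonal loading of the within-set covariances followed by an SVD of the whitened cross-covariance:

```python
import numpy as np

def regularized_cca(X, Y, lam_x=1e-3, lam_y=1e-3, k=2):
    """Sketch of regularized CCA: ridge-load each within-set
    covariance, whiten both views, then SVD the cross-covariance."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + lam_x * np.eye(X.shape[1])  # diagonal loading
    Cyy = Y.T @ Y / n + lam_y * np.eye(Y.shape[1])  # diagonal loading
    Cxy = X.T @ Y / n
    # Whitening transforms from Cholesky factors: Wx.T @ Cxx @ Wx = I.
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx)).T
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    U, s, Vt = np.linalg.svd(Wx.T @ Cxy @ Wy)
    # Top-k canonical direction pairs and their canonical correlations.
    A = Wx @ U[:, :k]
    B = Wy @ Vt.T[:, :k]
    return A, B, s[:k]
```

The regularization terms lam_x and lam_y guarantee that the loaded covariances are invertible even when the data are high-dimensional relative to the sample size, which is the setting the abstract targets.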
Citations
Journal ArticleDOI
TL;DR: This paper proposes a unified objective function for CCPMVFGC and develops an iterative strategy to solve the formulated optimization problem, and provides the theoretical analysis of the proposed model, including convergence proof and computational complexity analysis.
Abstract: Graph clustering, which aims at discovering sets of related vertices in graph-structured data, plays a crucial role in various applications, such as social community detection and biological module discovery. With the huge increase in the volume of data in recent years, graph clustering is used in an increasing number of real-life scenarios. However, the classical and state-of-the-art methods, which consider only single-view features or a single vector concatenating features from different views and neglect the contextual correlation between pairwise features, are insufficient for the task, as features that characterize vertices in a graph are usually from multiple views and the contextual correlation between pairwise features may influence the cluster preference for vertices. To address this challenging problem, we introduce in this paper a novel graph clustering model, dubbed contextual correlation preserving multiview featured graph clustering (CCPMVFGC), for discovering clusters in graphs with multiview vertex features. Unlike most of the aforementioned approaches, CCPMVFGC is capable of learning a shared latent space from multiview features as the cluster preference for each vertex and making use of this latent space to model the inter-relationship between pairwise vertices. CCPMVFGC uses an effective method to compute the degree of contextual correlation between pairwise vertex features and utilizes a view-wise latent space representing the feature–cluster preference to model the computed correlation. Thus, the cluster preference learned by CCPMVFGC is jointly inferred by multiview features, view-wise correlations of pairwise features, and the graph topology. Accordingly, we propose a unified objective function for CCPMVFGC and develop an iterative strategy to solve the formulated optimization problem. We also provide the theoretical analysis of the proposed model, including a convergence proof and computational complexity analysis.
In our experiments, we extensively compare the proposed CCPMVFGC with both classical and state-of-the-art graph clustering methods on eight standard graph datasets (six multiview and two single-view datasets). The results show that CCPMVFGC achieves competitive performance on all eight datasets, which validates the effectiveness of the proposed model.

42 citations


Cites methods from "FaRoC: Fast and Robust Supervised C..."

  • ...Empirically, feature correlations can be captured via many approaches, such as canonical correlation analysis [30] and sparse dictionary learning [48], which are...


Proceedings ArticleDOI
07 Jul 2019
TL;DR: An overview of deep multimodal learning, especially the approaches proposed within the last decades, is presented to provide potential readers with advances, trends, and challenges, which can be very helpful to researchers in the field of machine learning, especially those studying multimodal deep learning.
Abstract: Representation learning is the basis of, and crucial for, consequent tasks such as classification, regression, and recognition. The goal of representation learning is to automatically learn good features with deep models. Multimodal representation learning is a special case of representation learning, which automatically learns good features from multiple modalities; these modalities are not independent, and there are correlations and associations among them. Furthermore, multimodal data are usually heterogeneous. Because of these characteristics, multimodal representation learning poses many difficulties: how to combine multimodal data from heterogeneous sources; how to jointly learn features from multimodal data; how to effectively describe the correlations and associations; etc. These difficulties have triggered great interest among researchers, and along with the upsurge of deep learning, many deep multimodal learning methods have been proposed. In this paper, we present an overview of deep multimodal learning, especially the approaches proposed within the last decades. We provide potential readers with advances, trends, and challenges, which can be very helpful to researchers in the field of machine learning, especially those engaged in the study of multimodal deep learning.

14 citations


Cites methods from "FaRoC: Fast and Robust Supervised C..."

  • ...For example, Mandal and Maji [40] proposed a feature extraction algorithm named FaRoC which integrates judiciously the merits of canonical correlation analysis (CCA) and rough sets....


Proceedings ArticleDOI
25 May 2021
TL;DR: In this paper, a partially-observed discrete dynamical systems (PODDS) model is introduced, where the state is a vector containing the information of different components of the system, and each component takes its value from a finite real-valued set.
Abstract: This paper introduces a new signal model called partially-observed discrete dynamical systems (PODDS). This signal model is a special case of the hidden Markov model (HMM) in which the state is a vector containing the information of different components of the system, and each component takes its value from a finite real-valued set. This signal model is currently treated as a finite-state HMM, where the maximum a posteriori (MAP) criterion is used for state estimation. This paper takes advantage of the discrete structure of the state variables in PODDS and develops the optimal componentwise MAP (CMAP) state estimator, which yields the MAP solution in each state variable. A fully recursive process is provided for computation of this optimal estimator, followed by a specific instance of the PODDS model suitable for regulatory networks observed through noisy time-series data. The high performance of the proposed estimator is demonstrated by numerical experiments with a PODDS model of random regulatory networks.
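The distinction the abstract draws between the usual joint MAP estimate and the componentwise MAP (CMAP) criterion can be shown on a toy posterior table. This is only an illustration of the two criteria, not the paper's recursive estimator:

```python
def joint_map(posterior):
    """MAP over the full joint state, as in the finite-state HMM treatment."""
    return max(posterior, key=posterior.get)

def componentwise_map(posterior, n_components):
    """CMAP: take the argmax of each component's marginal posterior."""
    estimate = []
    for i in range(n_components):
        marginal = {}
        for state, p in posterior.items():
            marginal[state[i]] = marginal.get(state[i], 0.0) + p
        estimate.append(max(marginal, key=marginal.get))
    return tuple(estimate)

# Toy posterior over two binary components: the two criteria disagree.
# joint_map → (0, 0), but componentwise_map → (0, 1), because the
# marginal of the second component puts 0.6 of the mass on value 1.
posterior = {(0, 0): 0.40, (0, 1): 0.25, (1, 0): 0.00, (1, 1): 0.35}
```

CMAP minimizes the expected number of wrongly estimated components, whereas joint MAP minimizes the probability that any component is wrong; on this posterior they give different answers.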

11 citations

Journal ArticleDOI
TL;DR: In this paper, a K-means-clustering-based kernel canonical correlation analysis algorithm is proposed for multimodal emotion recognition in human-robot interaction (HRI), which can address the heterogeneity among different modalities and make multiple modalities complementary.
Abstract: In this article, a K-means-clustering-based kernel canonical correlation analysis algorithm is proposed for multimodal emotion recognition in human–robot interaction (HRI). The multimodal features (gray pixels; time and frequency domain) extracted from facial expression and speech are fused based on kernel canonical correlation analysis. K-means clustering is used to select features from multiple modalities and reduce dimensionality. The proposed approach can address the heterogeneity among different modalities and make multiple modalities complementary, so as to promote multimodal emotion recognition. Experiments on two datasets, namely SAVEE and eNTERFACE'05, are conducted to evaluate the accuracy of the proposed method. The results show that the proposed method produces recognition rates higher than those of the methods without K-means clustering; more specifically, they are 2.77% higher on SAVEE and 4.7% higher on eNTERFACE'05.
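A minimal sketch of K-means-based feature selection of the kind described (cluster the feature columns, keep one representative per cluster). This is an assumption-laden stand-in, not the authors' exact pipeline: it uses a tiny hand-rolled Lloyd's iteration with a simple deterministic initialization.

```python
import numpy as np

def kmeans_feature_select(X, k, n_iter=20):
    """Cluster the columns (features) of X with a small Lloyd's
    k-means, then keep one representative feature per cluster:
    the column closest to its centroid."""
    F = np.asarray(X, dtype=float).T          # one row per feature
    centroids = F[:k].copy()                  # deterministic init: first k features
    labels = np.zeros(len(F), dtype=int)
    for _ in range(n_iter):
        d = np.linalg.norm(F[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)             # assign features to clusters
        for j in range(k):
            members = labels == j
            if members.any():
                centroids[j] = F[members].mean(axis=0)
    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size:
            dist = np.linalg.norm(F[members] - centroids[j], axis=1)
            selected.append(int(members[dist.argmin()]))
    return sorted(selected)
```

Selecting one representative per cluster reduces dimensionality while keeping one feature from each group of mutually redundant features, which is the role K-means plays in the pipeline the abstract describes.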

8 citations

Journal ArticleDOI
TL;DR: MSPL is presented, a robust supervised multi-omics data integration method that simultaneously identifies significant multi-omics signatures during the integration process and predicts the cancer subtypes, making multi-omics data integration more systematic and expanding its range of applications.
Abstract: Rapid advances in high-throughput sequencing technology have led to the generation of a large number of multi-omics biological datasets. Integrating data from different omics provides an unprecedented opportunity to gain insight into disease mechanisms from different perspectives. However, integrative analysis and predictive modeling from multi-omics data are facing three major challenges: i) heavy noise; ii) high dimensions compared to the small number of samples; iii) data heterogeneity. Current multi-omics data integration approaches have some limitations and are susceptible to heavy noise. In this paper, we present MSPL, a robust supervised multi-omics data integration method that simultaneously identifies significant multi-omics signatures during the integration process and predicts the cancer subtypes. The proposed method not only inherits the generalization performance of self-paced learning but also leverages the properties of multi-omics data containing correlated information to interactively recommend high-confidence samples for model training. We demonstrate the capabilities of MSPL using simulated data and five multi-omics biological datasets, integrating up to three omics to identify potential biological signatures, and evaluating the performance compared to state-of-the-art methods in binary and multi-class classification problems. Our proposed model makes multi-omics data integration more systematic and expands its range of applications.
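The self-paced learning idea MSPL inherits (admit only samples whose current loss falls below an "age" parameter, then grow that parameter so progressively harder samples enter training) can be sketched on a toy location-estimation problem. This is illustrative only, not the MSPL objective; all names are made up for the example.

```python
import numpy as np

def self_paced_weights(losses, lam):
    """Hard self-paced weighting: admit a sample only when its
    current loss is below the age parameter lam."""
    return (np.asarray(losses) < lam).astype(float)

def self_paced_mean(x, lam0=1.0, growth=1.5, n_rounds=5):
    """Toy self-paced estimate of a location parameter: alternate
    between weighting easy samples and refitting, growing lam."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)                  # robust starting point
    lam = lam0
    for _ in range(n_rounds):
        v = self_paced_weights((x - mu) ** 2, lam)
        if v.sum() > 0:
            mu = float((v * x).sum() / v.sum())
        lam *= growth                  # admit harder samples next round
    return mu
```

Because gross outliers never fall below the age parameter, they are never admitted and the estimate stays robust, which is the noise-resistance property the abstract attributes to self-paced learning.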

7 citations


Cites background from "FaRoC: Fast and Robust Supervised C..."

  • ...The problem of learning predictive models from multiomics data can be naturally considered a multimodal learning problem [13], [14]....


References
Book
Vladimir Vapnik1
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations


"FaRoC: Fast and Robust Supervised C..." refers methods in this paper

  • ...1 with respect to B.632+ error rate of the SVM considering 25 extracted features, while Table I compares the minimum error rates and average cosine distance obtained using different methods....


  • ...4 and Table I, it is seen that the performance of the FaRoC algorithm is significantly better than that of other SRCCA algorithms with respect to the B.632+ error rate of the SVM....


  • ...The support vector machine (SVM) [55] with linear kernels is used to compute this error....


  • ...Subsequent discussions analyze the results with respect to B.632+ error rate of the SVM....

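For context on the evaluation protocol these snippets mention, a bootstrap .632 error estimate can be sketched as follows. Note the assumptions: the paper uses the .632+ variant with a linear-kernel SVM, whereas this dependency-free sketch substitutes a nearest-centroid classifier and the plain .632 weighting.

```python
import numpy as np

def centroid_fit(X, y):
    """Nearest-centroid classifier: store one mean vector per class."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def centroid_predict(model, X):
    classes, cents = model
    d = np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=2)
    return classes[d.argmin(axis=1)]

def b632_error(X, y, n_boot=50, seed=0):
    """Bootstrap .632 error: 0.368 * resubstitution error plus
    0.632 * mean out-of-bag error over bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = len(y)
    err_resub = np.mean(centroid_predict(centroid_fit(X, y), X) != y)
    oob_errs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # bootstrap resample
        oob = np.setdiff1d(np.arange(n), idx)  # out-of-bag indices
        if oob.size == 0:
            continue
        model = centroid_fit(X[idx], y[idx])
        oob_errs.append(np.mean(centroid_predict(model, X[oob]) != y[oob]))
    return 0.368 * err_resub + 0.632 * float(np.mean(oob_errs))
```

The .632 weighting balances the optimistic resubstitution error against the pessimistic out-of-bag error; the .632+ variant used in the paper additionally adapts the weight to the degree of overfitting.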

Book
01 Nov 1996

8,608 citations


"FaRoC: Fast and Robust Supervised C..." refers methods in this paper

  • ...In the proposed method, each eigenvalue-eigenvector pair of H is calculated sequentially by using power method [52]....


  • ...Assuming K = min(p, q), K eigenvalue-eigenvector pairs can be calculated using Jacobi method [52]....

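The sequential eigen-solve these snippets mention can be illustrated with the power method plus deflation on a symmetric positive semidefinite matrix. This is a generic textbook sketch, not the paper's exact routine:

```python
import numpy as np

def power_method_eigs(H, k, n_iter=500, tol=1e-10):
    """Sequentially extract the k dominant eigenpairs of a symmetric
    PSD matrix: power iteration for the dominant pair, then deflation
    to expose the next one."""
    H = np.asarray(H, dtype=float).copy()
    vals, vecs = [], []
    for _ in range(k):
        v = np.ones(H.shape[0]) / np.sqrt(H.shape[0])
        for _ in range(n_iter):
            w = H @ v
            nrm = np.linalg.norm(w)
            if nrm < tol:          # H is (numerically) zero on this subspace
                break
            w /= nrm
            if np.linalg.norm(w - v) < tol:
                v = w
                break
            v = w
        lam = v @ H @ v            # Rayleigh quotient
        vals.append(lam)
        vecs.append(v)
        H -= lam * np.outer(v, v)  # deflate the found component
    return np.array(vals), np.column_stack(vecs)
```

Deflation subtracts the found component so the next power iteration converges to the next-largest eigenpair, matching the sequential extraction described in the snippets.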

Journal ArticleDOI
TL;DR: In this paper, an estimation procedure based on adding small positive quantities to the diagonal of X′X is proposed, along with the ridge trace, a method for showing in two dimensions the effects of nonorthogonality.
Abstract: In multiple regression it is shown that parameter estimates based on minimum residual sum of squares have a high probability of being unsatisfactory, if not incorrect, if the prediction vectors are not orthogonal. Proposed is an estimation procedure based on adding small positive quantities to the diagonal of X′X. Introduced is the ridge trace, a method for showing in two dimensions the effects of nonorthogonality. It is then shown how to augment X′X to obtain biased estimates with smaller mean square error.

8,091 citations


"FaRoC: Fast and Robust Supervised C..." refers background in this paper

  • ...It works by adding small positive quantities to the diagonals of Cxx and Cyy to guarantee their invertibility [31]....

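The diagonal loading in the snippet above is exactly Hoerl and Kennard's ridge device. A minimal sketch in the regression setting (variable names are illustrative): adding k > 0 to the diagonal of X′X guarantees invertibility and shrinks the estimate.

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimate: add k > 0 to the diagonal of X'X before
    solving the normal equations, trading a little bias for a
    large variance reduction when columns of X are near-collinear."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
```

The norm of the ridge estimate decreases monotonically as k grows, which is what the ridge trace plots; in FaRoC the same trick is applied to Cxx and Cyy rather than to a regression design matrix.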

Book
31 Oct 1991
TL;DR: Theoretical Foundations.
Abstract: Table of contents. Each chapter closes with a Summary, Exercises, and References.
I. Theoretical Foundations. 1. Knowledge: Introduction; Knowledge and Classification; Knowledge Base; Equivalence, Generalization and Specialization of Knowledge. 2. Imprecise Categories, Approximations and Rough Sets: Introduction; Rough Sets; Approximations of Set; Properties of Approximations; Approximations and Membership Relation; Numerical Characterization of Imprecision; Topological Characterization of Imprecision; Approximation of Classifications; Rough Equality of Sets; Rough Inclusion of Sets. 3. Reduction of Knowledge: Introduction; Reduct and Core of Knowledge; Relative Reduct and Relative Core of Knowledge; Reduction of Categories; Relative Reduct and Core of Categories. 4. Dependencies in Knowledge Base: Introduction; Dependency of Knowledge; Partial Dependency of Knowledge. 5. Knowledge Representation: Introduction; Examples; Formal Definition; Significance of Attributes; Discernibility Matrix. 6. Decision Tables: Introduction; Formal Definition and Some Properties; Simplification of Decision Tables. 7. Reasoning about Knowledge: Introduction; Language of Decision Logic; Semantics of Decision Logic Language; Deduction in Decision Logic; Normal Forms; Decision Rules and Decision Algorithms; Truth and Indiscernibility; Dependency of Attributes; Reduction of Consistent Algorithms; Reduction of Inconsistent Algorithms; Reduction of Decision Rules; Minimization of Decision Algorithms.
II. Applications. 8. Decision Making: Introduction; Optician's Decisions Table; Simplification of Decision Table; Decision Algorithm; The Case of Incomplete Information. 9. Data Analysis: Introduction; Decision Table as Protocol of Observations; Derivation of Control Algorithms from Observation; Another Approach; The Case of Inconsistent Data. 10. Dissimilarity Analysis: Introduction; The Middle East Situation; Beauty Contest; Pattern Recognition; Buying a Car. 11. Switching Circuits: Introduction; Minimization of Partially Defined Switching Functions; Multiple-Output Switching Functions. 12. Machine Learning: Introduction; Learning From Examples; The Case of an Imperfect Teacher; Inductive Learning.
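The lower and upper approximations at the core of Chapter 2, and of the rough-set machinery FaRoC builds on, are easy to state concretely. A minimal sketch over an explicit partition into equivalence classes (this illustrates classical Pawlak approximations, not FaRoC's rough hypercuboid construction):

```python
def approximations(partition, target):
    """Pawlak lower and upper approximations of `target` with respect
    to the equivalence classes in `partition`: the lower approximation
    collects blocks fully inside the target, the upper approximation
    collects blocks that merely intersect it."""
    target = set(target)
    lower, upper = set(), set()
    for block in partition:
        block = set(block)
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return lower, upper
```

The difference between the two approximations is the boundary region; a nonempty boundary is exactly what makes a set "rough" in Pawlak's sense.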

7,826 citations

Book ChapterDOI
TL;DR: As discussed by the authors, the concepts of correlation and regression may be applied not only to ordinary one-dimensional variates but also to variates of two or more dimensions; in the marksman example, only the correlation of the horizontal components is ordinarily discussed, whereas the complex consisting of horizontal and vertical deviations may be even more interesting.
Abstract: Concepts of correlation and regression may be applied not only to ordinary one-dimensional variates but also to variates of two or more dimensions. Marksmen side by side firing simultaneous shots at targets, so that the deviations are in part due to independent individual errors and in part to common causes such as wind, provide a familiar introduction to the theory of correlation; but only the correlation of the horizontal components is ordinarily discussed, whereas the complex consisting of horizontal and vertical deviations may be even more interesting. The wind at two places may be compared, using both components of the velocity in each place. A fluctuating vector is thus matched at each moment with another fluctuating vector. The study of individual differences in mental and physical traits calls for a detailed study of the relations between sets of correlated variates. For example the scores on a number of mental tests may be compared with physical measurements on the same persons. The questions then arise of determining the number and nature of the independent relations of mind and body shown by these data to exist, and of extracting from the multiplicity of correlations in the system suitable characterizations of these independent relations. As another example, the inheritance of intelligence in rats might be studied by applying not one but s different mental tests to N mothers and to a daughter of each

6,122 citations


"FaRoC: Fast and Robust Supervised C..." refers background in this paper

  • ...CCA [10] obtains a linear relationship between two multidimensional variables....


  • ...Canonical correlation analysis (CCA) [10] provides an efficient way of measuring the linear relationship between two multidimensional data sets....
