μHEM for identification of differentially expressed miRNAs using hypercuboid equivalence partition matrix

doi:10.1186/1471-2105-14-266

Citations

Journal Article

A comparison of machine learning classifiers for dementia with Lewy bodies using miRNA expression data.

Daichi Shigemizu, Shintaro Akiyama, Yuya Asanomi, Keith A. Boroevich, Alok Sharma, Tatsuhiko Tsunoda¹, Takashi Sakurai², Kouichi Ozaki, Takahiro Ochiya³, Shumpei Niida

Tokyo Medical and Dental University¹, Nagoya University², Tokyo Medical University³

30 Oct 2019-BMC Medical Genomics

TL;DR: The proposed prediction model provides an effective tool for DLB classification; candidate target genes predicted from the miRNAs include 6 functional genes in the DHA signaling pathway associated with DLB pathology.

Abstract: Dementia with Lewy bodies (DLB) is the second most common subtype of neurodegenerative dementia in humans following Alzheimer’s disease (AD). Present clinical diagnosis of DLB has high specificity and low sensitivity and finding potential biomarkers of prodromal DLB is still challenging. MicroRNAs (miRNAs) have recently received a lot of attention as a source of novel biomarkers. In this study, using serum miRNA expression of 478 Japanese individuals, we investigated potential miRNA biomarkers and constructed an optimal risk prediction model based on several machine learning methods: penalized regression, random forest, support vector machine, and gradient boosting decision tree. The final risk prediction model, constructed via a gradient boosting decision tree using 180 miRNAs and two clinical features, achieved an accuracy of 0.829 on an independent test set. We further predicted candidate target genes from the miRNAs. Gene set enrichment analysis of the miRNA target genes revealed 6 functional genes included in the DHA signaling pathway associated with DLB pathology. Two of them were further supported by gene-based association studies using a large number of single nucleotide polymorphism markers (BCL2L1: P = 0.012, PIK3R2: P = 0.021). Our proposed prediction model provides an effective tool for DLB classification. Also, a gene-based association test of rare variants revealed that BCL2L1 and PIK3R2 were statistically significantly associated with DLB.

24 citations

Cites methods from "μHEM for identification of differen..."

...This final risk prediction model using μHEM algorithm achieved an accuracy of 0.803 on an independent test set when pre-selecting the top-ranked 330 miRNAs and three clinical features....
[...]
...Paul S, Maji P. muHEM for identification of differentially expressed miRNAs using hypercuboid equivalence partition matrix....
[...]
...We also constructed a GBDT risk prediction model using another feature selection algorithm, μHEM [23], publicly available at http://www.isical.ac.in/~bibl/results/mihem/mihem.html, and investigated whether this feature selection methodology can further improve the predictive ability of our model....
[...]
...Hyperparameter values in the final GBDT model when using μHEM algorithm....
[...]

Journal Article

FaRoC: Fast and Robust Supervised Canonical Correlation Analysis for Multimodal Omics Data

Ankita Mandal¹, Pradipta Maji¹

Indian Statistical Institute¹

01 Apr 2018-IEEE Transactions on Systems, Man, and Cybernetics

TL;DR: The formulation enables the proposed method to extract the required number of correlated features sequentially at lower computational cost than existing methods, and provides an efficient way to find the optimum regularization parameters employed in CCA.

Abstract: One of the main problems associated with high-dimensional multimodal real-life data sets is how to extract relevant and significant features. In this regard, a fast and robust feature extraction algorithm, termed FaRoC, is proposed, judiciously integrating the merits of canonical correlation analysis (CCA) and rough sets. The proposed method extracts new features sequentially from two multidimensional data sets by maximizing their relevance with respect to the class label and their significance with respect to already-extracted features. To generate canonical variables sequentially, an analytical formulation is introduced to establish the relation between regularization parameters and CCA. The formulation enables the proposed method to extract the required number of correlated features sequentially at lower computational cost than existing methods. To compute both the significance and relevance measures of a feature, the concept of the hypercuboid equivalence partition matrix of the rough hypercuboid approach is used. It also provides an efficient way to find the optimum regularization parameters employed in CCA. The efficacy of the proposed FaRoC algorithm, along with a comparison with other existing methods, is extensively established on several real-life data sets.

23 citations

Cites methods from "μHEM for identification of differen..."

...It has been applied successfully for analyzing omics data [34], [45], [46]....
[...]

Journal Article

Gene expression and protein-protein interaction data for identification of colon cancer related genes using f-information measures

Sushmita Paul¹, Pradipta Maji²

University of Erlangen-Nuremberg¹, Indian Statistical Institute²

01 Sep 2016-Natural Computing

TL;DR: Results indicate that the integrated method presented is quite promising and may become a useful tool for identifying disease genes.

Abstract: One of the most important and challenging problems in functional genomics is how to select disease genes. In this regard, the paper presents a new computational method to identify disease genes. It judiciously integrates the information of gene expression profiles and shortest path analysis of protein-protein interaction networks. While the f-information based maximum relevance-maximum significance framework is used to select differentially expressed genes as disease genes using gene expression profiles, the functional protein association network is used to study the mechanism of diseases. An important finding is that some f-information measures are shown to be effective for selecting relevant and significant genes from microarray data. An extensive experimental study on colorectal cancer establishes that the genes identified by the integrated method include more colorectal cancer genes than those identified from the gene expression profiles alone, irrespective of the gene selection algorithm. Also, these genes have greater functional similarity with the reported colorectal cancer genes than the genes identified from the gene expression profiles alone. Enrichment analysis reveals that the obtained genes are associated with some of the important KEGG pathways. All these results indicate that the integrated method is quite promising and may become a useful tool for identifying disease genes.

12 citations

Cites methods from "μHEM for identification of differen..."

...The f-MRMS algorithm judiciously integrates the merits of maximum relevance-maximum significance (MRMS) criterion (Maji and Paul 2011; Paul and Maji 2013a, b) and f-information measures....
[...]

Journal Article

Identification of miRNA-mRNA Modules in Colorectal Cancer Using Rough Hypercuboid Based Supervised Clustering.

Sushmita Paul¹, Petra Lakatos², Arndt Hartmann², Regine Schneider-Stock², Julio Vera²

Indian Institute of Technology, Jodhpur¹, University of Erlangen-Nuremberg²

21 Feb 2017-Scientific Reports

TL;DR: This study presents an application of the RH-SAC algorithm on miRNA and mRNA expression data for identification of potential miRNA-mRNA modules and identified novel miRNA/mRNA interactions in colorectal cancer.

Abstract: Differences in the expression profiles of miRNAs and mRNAs have been reported in colorectal cancer. Nevertheless, information on important miRNA-mRNA regulatory modules in colorectal cancer is still lacking. In this regard, this study presents an application of the RH-SAC algorithm on miRNA and mRNA expression data for identification of potential miRNA-mRNA modules. First, a set of miRNA rules was generated using the RH-SAC algorithm. The mRNA targets of the selected miRNAs were identified using the miRTarBase database. Next, the expression values of the target mRNAs were used to generate mRNA rules using RH-SAC. Then all miRNA-mRNA rules were integrated to generate networks. The RH-SAC algorithm, unlike other existing methods, selects a group of co-expressed miRNAs and mRNAs that are also differentially expressed. In total, 17 miRNAs and 141 mRNAs were selected. The enrichment analysis of the selected mRNAs revealed that our method selected mRNAs that are significantly associated with colorectal cancer. We identified novel miRNA/mRNA interactions in colorectal cancer. Through experiment, we could confirm that one of our discovered miRNAs, hsa-miR-93-5p, was significantly up-regulated in 75.8% of CRC samples in comparison to their corresponding non-tumor samples. It could have the potential to examine colorectal cancer subtype-specific unique miRNA/mRNA interactions.

9 citations

Journal Article

Multimodal Omics Data Integration Using Max Relevance-Max Significance Criterion

Pradipta Maji¹, Ankita Mandal¹

Indian Statistical Institute¹

01 Aug 2017-IEEE Transactions on Biomedical Engineering

TL;DR: A novel supervised regularized canonical correlation analysis, termed CuRSaR, is presented to extract relevant and significant features from multimodal high-dimensional omics datasets by maximizing the relevance of the extracted features with respect to sample categories and the significance among them.

Abstract: Objective: This paper presents a novel supervised regularized canonical correlation analysis, termed as CuRSaR, to extract relevant and significant features from multimodal high dimensional omics datasets. Methods: The proposed method extracts a new set of features from two multidimensional datasets by maximizing the relevance of extracted features with respect to sample categories and significance among them. It integrates judiciously the merits of regularized canonical correlation analysis (RCCA) and rough hypercuboid approach. An analytical formulation, based on spectral decomposition, is introduced to establish the relation between canonical correlation analysis (CCA) and RCCA. The concept of hypercuboid equivalence partition matrix of rough hypercuboid is used to compute both relevance and significance of a feature. Significance: The analytical formulation makes the computational complexity of the proposed algorithm significantly lower than existing methods. The equivalence partition matrix offers an efficient way to find optimum regularization parameters employed in CCA. Results: The superiority of the proposed algorithm over other existing methods, in terms of computational complexity and classification accuracy, is established extensively on real life data.
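The abstract above names the hypercuboid equivalence partition matrix but does not define it. As a rough illustration only, the sketch below follows one common crisp formulation of the rough hypercuboid relevance measure, which is an assumption here and not taken from this listing: each class defines a [min, max] interval on a feature, the matrix marks which samples fall inside each class interval, and a sample counts as confused when it lies in more than one interval. The function names `partition_matrix` and `relevance` are hypothetical and are not the authors' code.

```python
def partition_matrix(values, labels):
    """Build a (classes x samples) 0/1 matrix for one feature.

    Assumed formulation: entry (i, j) is 1 when sample j's value lies
    inside the [min, max] interval spanned by class i on this feature.
    """
    classes = sorted(set(labels))
    matrix = []
    for c in classes:
        in_class = [v for v, y in zip(values, labels) if y == c]
        lo, hi = min(in_class), max(in_class)
        matrix.append([1 if lo <= v <= hi else 0 for v in values])
    return matrix


def relevance(values, labels):
    """Fraction of samples that fall inside exactly one class interval."""
    matrix = partition_matrix(values, labels)
    # Every sample lies in its own class interval, so each column sum is >= 1.
    confusion = [min(1, sum(column) - 1) for column in zip(*matrix)]
    return 1 - sum(confusion) / len(values)


# Toy data (made up for illustration): a separable and an overlapping feature.
separable = relevance([1, 2, 3, 7, 8, 9], [0, 0, 0, 1, 1, 1])
overlapping = relevance([1, 2, 5, 4, 8, 9], [0, 0, 0, 1, 1, 1])
```

On this toy data the separable feature has no sample inside both class intervals, while the overlapping feature has two of six samples (values 5 and 4) inside both, so it scores lower.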

9 citations

Cites methods from "μHEM for identification of differen..."

...It has been applied successfully to feature selection and clustering [27] as well as to omics data analysis [26]–[30]....
[...]
