Author

Gangadhara Rao Kancharla

Bio: Gangadhara Rao Kancharla is an academic researcher. The author has contributed to research in topics: Computer science & Algorithm. The author has an h-index of 1, co-authored 2 publications receiving 4 citations.

Papers
Journal ArticleDOI
TL;DR: This is the first work that utilizes an optimal feature selection model using QICO and QICO-RNN for effective classification of cancer data using gene expression data and this model outperforms the other heuristic-based feature selection and optimized RNN methods.
Abstract: Purpose: Gene selection is considered a fundamental process in the bioinformatics field. Existing methodologies for cancer classification are mostly clinically based, and their diagnostic capability is limited. Nowadays, significant problems of cancer diagnosis are solved by using gene expression data. Researchers have been introducing many possibilities to diagnose cancer appropriately and effectively. This paper aims to develop a cancer data classification model using gene expression data.
Design/methodology/approach: The proposed classification model involves three main phases: (1) feature extraction, (2) optimal feature selection and (3) classification. Initially, five benchmark gene expression datasets are collected. From the collected gene expression data, feature extraction is performed. To reduce the length of the feature vectors, optimal feature selection is performed, for which a new meta-heuristic algorithm termed the quantum-inspired immune clone optimization algorithm (QICO) is used. Once the relevant features are selected, classification is performed by a deep learning model, the recurrent neural network (RNN). Finally, the experimental analysis reveals that the proposed QICO-based feature selection model outperforms the other heuristic-based feature selection methods and that the optimized RNN outperforms the other machine learning methods.
Findings: The proposed QICO-RNN achieves the best outcomes at any learning percentage. At a learning percentage of 85, the accuracy of the proposed QICO-RNN was 3.2% better than RNN, 4.3% better than RF, 3.8% better than NB and 2.1% better than KNN for Dataset 1. For Dataset 2, at a learning percentage of 35, the accuracy of the proposed QICO-RNN was 13.3% better than RNN, 8.9% better than RF and 14.8% better than NB and KNN. Hence, the developed QICO algorithm performs well in accurately classifying cancer data using gene expression data.
Originality/value: This paper introduces a new optimal feature selection model using QICO and a QICO-based RNN for effective classification of cancer data using gene expression data. This is the first work that utilizes an optimal feature selection model using QICO and QICO-RNN for this purpose.

6 citations

Journal ArticleDOI
TL;DR: In this paper, a Spider Monkey Optimization (SMO) based approach is implemented to select a feature subset from high-dimensional gene expression data, using a well-defined, fast, heuristic-based preprocessing approach to reduce the number of features.
Abstract: In this paper, we present an algorithm for feature selection in gene expression data. A Spider Monkey Optimization (SMO) based approach is implemented to obtain a feature subset from high-dimensional gene expression data. Because high-dimensional data contain a large number of superfluous features, we use a well-defined, fast, heuristic-based preprocessing approach to reduce the features. The fitness function helps the SMO scheme handle the conflicting objectives, i.e., decreasing feature cardinality while preserving classification accuracy. Benchmark microarray gene expression datasets are used, and the results are demonstrated along with an in-depth experimental study using the k-NN classifier.

3 citations
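
The trade-off described in the abstract above (fewer features versus preserved k-NN accuracy) is often written as a single weighted score. The following is a minimal sketch of such a wrapper fitness, not the authors' implementation: the weighting `alpha`, the helper names `knn_predict`, `project` and `fitness`, and the choice to treat an empty subset as the worst case are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def project(rows, mask):
    """Keep only the columns whose mask entry is truthy."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


def fitness(mask, train_X, train_y, test_X, test_y, alpha=0.9, k=3):
    """Lower is better: alpha * error rate + (1 - alpha) * fraction of features kept.

    alpha is a hypothetical weighting, not a value taken from the paper.
    """
    if not any(mask):
        return 1.0  # assumption: an empty feature subset is the worst case
    tr, te = project(train_X, mask), project(test_X, mask)
    errors = sum(knn_predict(tr, train_y, x, k) != y
                 for x, y in zip(te, test_y))
    error_rate = errors / len(test_y)
    return alpha * error_rate + (1 - alpha) * sum(mask) / len(mask)
```

An optimizer such as SMO would then search over binary masks and keep the one with the lowest score; the search itself is not shown here.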

Journal ArticleDOI
TL;DR: The alignment of multiple sequences is examined using an improved, swarm-intelligence-based particle swarm optimization (PSO) system, which is compared with a few existing approaches such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) alignment (DIALIGN), PILEUP8, hidden Markov model training (HMMT), rubber band technique-genetic algorithm (RBT-GA) and ML-PIMA.
Abstract: In this article, the alignment of multiple sequences is examined using an improved, swarm-intelligence-based particle swarm optimization (PSO). PSO has recently been applied as a randomized heuristic technique for solving discrete optimization problems and realistic estimation. The PSO approach is a nature-inspired technique based on intelligence and swarm movement. Each solution is encoded as a “chromosome”, as in the genetic algorithm (GA). Based on the optimization of the objective function, the fitness function is designed to maximize the suitable components of the sequence and reduce the unsuitable components of the sequence. A public benchmark data set, BAliBASE, is used to assess the performance of the proposed system, with the potential for PSO to reveal problems in adapting to better performance. The proposed system is compared with a few existing approaches such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) alignment (DIALIGN), PILEUP8, hidden Markov model training (HMMT), rubber band technique-genetic algorithm (RBT-GA) and ML-PIMA. In many cases, the experimental results of the proposed system compare favorably with those of the other existing approaches.
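
The abstract above does not give the fitness function, only that it rewards suitable components and penalizes unsuitable ones. One common objective of that kind for multiple sequence alignment is the sum-of-pairs score, sketched below as an illustration. The function name `sum_of_pairs` and the scoring values (`match`, `mismatch`, `gap`) are assumptions, not values from the paper.

```python
from itertools import combinations


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment given as equal-length strings with '-' for gaps.

    Every pair of rows is compared column by column; matches add to the
    score, mismatches and residue-gap pairs subtract from it.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    score = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap-gap pairs are ignored in this sketch
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score
```

A PSO particle would encode gap positions, and a score like this one could rank candidate alignments; that encoding is outside the scope of the sketch.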

Cited by
Journal ArticleDOI
TL;DR: A spider monkey optimization (SMO) based DNN model was compared with principal component analysis (PCA)-based DNN and the classical DNN models, wherein the results justified the advantage of implementing the proposed model over other approaches.
Abstract: The enormous growth in internet usage has led to the development of different malicious software posing serious threats to computer security. The various computational activities carried out over the network are highly susceptible to tampering and manipulation, and this necessitates the emergence of efficient intrusion detection systems. Network attacks are also dynamic in nature, which increases the importance of developing appropriate models for classification and prediction. Machine learning (ML) and deep learning algorithms have been prevalent choices in the analysis of intrusion detection system (IDS) datasets. The issues pertaining to data quality and the handling of high-dimensional data are managed by the use of nature-inspired algorithms. The present study uses the NSL-KDD and KDD Cup 99 datasets collected from the Kaggle repository. The dataset was cleansed using the min-max normalization technique and passed through the 1-N encoding method to achieve homogeneity. A spider monkey optimization (SMO) algorithm was used for dimensionality reduction and the reduced dataset was fed into a deep neural network (DNN). The SMO-based DNN model generated classification results with 99.4% and 92% accuracy, 99.5% and 92.7% precision, 99.5% and 92.8% recall and 99.6% and 92.7% F1-score, utilizing minimal training time. The model was further compared with principal component analysis (PCA)-based DNN and the classical DNN models, wherein the results justified the advantage of implementing the proposed model over other approaches.

70 citations
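
The two preprocessing steps named in the abstract above can be sketched in a few lines. This is an illustrative sketch only: it assumes that "1-N encoding" refers to one-hot (1-of-N) encoding of categorical columns, and the helper names `min_max_scale` and `one_hot` are hypothetical.

```python
def min_max_scale(values):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: map everything to 0
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return the sorted categories and one 0/1 indicator row per label."""
    categories = sorted(set(labels))
    index = {c: i for i, c in enumerate(categories)}
    rows = [[1 if index[label] == i else 0 for i in range(len(categories))]
            for label in labels]
    return categories, rows
```

For example, a numeric column `[2, 4, 6]` becomes `[0.0, 0.5, 1.0]`, and a categorical column such as `["tcp", "udp", "tcp", "icmp"]` (illustrative values) becomes one indicator row per record.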

Journal ArticleDOI
TL;DR: In this paper, a multi-objective quadratic binary HHO (MOQBHHO) technique with KNN method as wrapper classifier is implemented for extracting the optimal feature subsets.

42 citations

Journal ArticleDOI
TL;DR: In this article, the authors propose a novel Binary Coronavirus Disease Optimization Algorithm (BCOVIDOA) for feature selection, where the Coronavirus Disease Optimization Algorithm mimics the replication mechanism used by Coronavirus when hijacking human cells. The experimental results demonstrate that the proposed BCOVIDOA significantly outperforms the existing algorithms in terms of accuracy, best cost, average cost (AVG), standard deviation (STD), and size of selected features.
Abstract: The increased use of digital tools such as smartphones, Internet of Things devices, cameras, and microphones has led to the production of big data. Large data dimensionality, redundancy, and irrelevance are inherent challenges when it comes to big data. Feature selection is a necessary process to select the optimal subset of features when addressing such problems. In this paper, the authors propose a novel Binary Coronavirus Disease Optimization Algorithm (BCOVIDOA) for feature selection, where the Coronavirus Disease Optimization Algorithm (COVIDOA) is a new optimization technique that mimics the replication mechanism used by Coronavirus when hijacking human cells. The performance of the proposed algorithm is evaluated using twenty-six standard benchmark datasets from the UCI Repository. The results are compared with nine recent wrapper feature selection algorithms. The experimental results demonstrate that the proposed BCOVIDOA significantly outperforms the existing algorithms in terms of accuracy, best cost, average cost (AVG), standard deviation (STD), and size of selected features. Additionally, the Wilcoxon rank-sum test is calculated to prove the statistical significance of the results.

9 citations

01 Jan 2018
TL;DR: In this paper, a unified statistical model for joint normalization and DE detection of RNA-seq data is proposed, where sample-specific normalization factors are modeled as unknown parameters in the gene-wise linear models and jointly estimated with the regression coefficients.
Abstract: RNA sequencing (RNA-seq) is becoming increasingly popular for quantifying gene expression levels. Since RNA-seq measurements are relative in nature, between-sample normalization is an essential step in differential expression (DE) analysis. The normalization step of existing DE detection algorithms is usually ad hoc and performed only once prior to DE detection, which may be suboptimal since ideally normalization should be based on non-DE genes only and thus coupled with DE detection. We propose a unified statistical model for joint normalization and DE detection of RNA-seq data. Sample-specific normalization factors are modeled as unknown parameters in the gene-wise linear models and jointly estimated with the regression coefficients. By imposing a sparsity-inducing L1 penalty (or a mixed L1/L2 penalty for multiple treatment conditions) on the regression coefficients, we formulate the problem as a penalized least-squares regression problem and apply the augmented Lagrangian method to solve it. Simulation and real data studies show that the proposed model and algorithms perform better than or comparably to existing methods in terms of detection power and false-positive rate. The performance gain increases with larger sample size or higher signal-to-noise ratio, and is more significant when a large proportion of genes are differentially expressed in an asymmetric manner.

6 citations
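
The penalized least-squares problem described in the abstract above can be written out for the simplest case of two conditions. The notation below is an assumption made for illustration and may differ from the paper's: $y_{ij}$ is the log-scale expression of gene $i$ in sample $j$, $x_j$ is the condition indicator of sample $j$, $\alpha_i$ is a gene-specific intercept, $d_j$ is the sample-specific normalization factor, $\beta_i$ is the regression coefficient of gene $i$, and $\lambda$ is the penalty weight.

```latex
\min_{\alpha,\,\beta,\,d}\;
  \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n}
    \bigl( y_{ij} - \alpha_i - d_j - x_j \beta_i \bigr)^2
  \;+\; \lambda \sum_{i=1}^{m} \lvert \beta_i \rvert
```

Under this sketch, a gene is called differentially expressed when its estimated $\beta_i$ is nonzero, and the $d_j$ are estimated jointly with the $\beta_i$ rather than fixed in a separate normalization step.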

Journal ArticleDOI
TL;DR: The results demonstrate that the proposed Memetic Cellular Genetic Algorithm (MCGA) can provide efficient solutions to find a minimal subset of the genes in cancer microarray datasets.
Abstract: Gene selection aims at identifying a small subset of informative genes from the initial data to obtain high predictive accuracy for classification in human cancers. Gene selection can be considered a combinatorial search problem and thus can be conveniently handled with optimization methods. This paper proposes a Memetic Cellular Genetic Algorithm (MCGA) to solve the feature selection problem of cancer microarray datasets. Benchmark gene expression datasets available in the literature, i.e., colon, lymphoma, and leukaemia, were used for experimentation. MCGA is compared with other well-known metaheuristic strategies. The results demonstrate that our proposal can provide efficient solutions to find a minimal subset of the genes.

5 citations