scispace - formally typeset
Search or ask a question
Author

Alok Sharma

Bio: Alok Sharma is an academic researcher from Griffith University. The author has contributed to research in topics: Transplantation & Feature selection. The author has an hindex of 36, co-authored 200 publications receiving 4636 citations. Previous affiliations of Alok Sharma include University of Erlangen-Nuremberg & Indian Institute of Science.


Papers
More filters
Journal ArticleDOI
TL;DR: The accuracy of the iterative use of predicted secondary structure and backbone torsion angles and dihedrals based on Cα atoms is higher than those of model structures from current state-of-the-art techniques, suggesting the potentially beneficial use of these predicted properties for model assessment and ranking.
Abstract: Direct prediction of protein structure from sequence is a challenging problem. An effective approach is to break it up into independent sub-problems. These sub-problems such as prediction of protein secondary structure can then be solved independently. In a previous study, we found that an iterative use of predicted secondary structure and backbone torsion angles can further improve secondary structure and torsion angle prediction. In this study, we expand the iterative features to include solvent accessible surface area and backbone angles and dihedrals based on Cα atoms. By using a deep learning neural network in three iterations, we achieved 82% accuracy for secondary structure prediction, 0.76 for the correlation coefficient between predicted and actual solvent accessible surface area, 19° and 30° for mean absolute errors of backbone φ and ψ angles, respectively, and 8° and 32° for mean absolute errors of Cα-based θ and τ angles, respectively, for an independent test dataset of 1199 proteins. The accuracy of the method is slightly lower for 72 CASP 11 targets but much higher than those of model structures from current state-of-the-art techniques. This suggests the potentially beneficial use of these predicted properties for model assessment and ranking.

334 citations

Journal ArticleDOI
TL;DR: This study proposes two segmentation based feature extraction methods and shows that by applying a Support Vector Machine (SVM) classifier to the extracted features, they are able to enhance Gram-positive and Gram-negative subcellular localization prediction accuracies by up to 6.4% better than previous studies including the studies that used GO for feature extraction.

222 citations

Journal ArticleDOI
TL;DR: The algorithm finds the desired number of leading eigenvectors with a computational cost that is much less than that from the eigenvalue decomposition (EVD) based PCA method.

198 citations

Journal ArticleDOI
TL;DR: The proposed algorithm first divides genes into subsets, the sizes of which are relatively small, then selects informative smaller subsets of genes from a subset and merges the chosen genes with another gene subset to update the gene subset.
Abstract: Most of the conventional feature selection algorithms have a drawback whereby a weakly ranked gene that could perform well in terms of classification accuracy with an appropriate subset of genes will be left out of the selection. Considering this shortcoming, we propose a feature selection algorithm in gene expression data analysis of sample classifications. The proposed algorithm first divides genes into subsets, the sizes of which are relatively small (roughly of size h), then selects informative smaller subsets of genes (of size r <; h) from a subset and merges the chosen genes with another gene subset (of size r) to update the gene subset. We repeat this process until all subsets are merged into one informative subset. We illustrate the effectiveness of the proposed algorithm by analyzing three distinct gene expression data sets. Our method shows promising classification accuracy for all the test data sets. We also show the relevance of the selected genes in terms of their biological functions.

186 citations

Book ChapterDOI
TL;DR: A newly developed method SPIDER2 is described that utilizes three iterations of deep learning neural networks to improve the prediction accuracy of several structural properties simultaneously and provides state-of-the-art, all-in-one accurate prediction of local structure and solvent accessible surface area.
Abstract: Predicting one-dimensional structure properties has played an important role to improve prediction of protein three-dimensional structures and functions. The most commonly predicted properties are secondary structure and accessible surface area (ASA) representing local and nonlocal structural characteristics, respectively. Secondary structure prediction is further complemented by prediction of continuous main-chain torsional angles. Here we describe a newly developed method SPIDER2 that utilizes three iterations of deep learning neural networks to improve the prediction accuracy of several structural properties simultaneously. For an independent test set of 1199 proteins SPIDER2 achieves 82 % accuracy for secondary structure prediction, 0.76 for the correlation coefficient between predicted and actual solvent accessible surface area, 19° and 30° for mean absolute errors of backbone φ and ψ angles, respectively, and 8° and 32° for mean absolute errors of Cα-based θ and τ angles, respectively. The method provides state-of-the-art, all-in-one accurate prediction of local structure and solvent accessible surface area. The method is implemented, as a webserver along with a standalone package that are available in our website: http://sparks-lab.org .

151 citations


Cited by
More filters
Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
10 Mar 1970

8,159 citations

01 Aug 2000
TL;DR: Assessment of medical technology in the context of commercialization with Bioentrepreneur course, which addresses many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations