
Book Chapter DOI

Application of support vector machines in viral biology

23 Nov 2019 - pp 361-403

TL;DR: This chapter provides a lucid and easy-to-understand account of Support Vector Machine (SVM) algorithms, a robust tool based rigorously on statistical learning theory, along with their applications in virology.

Abstract: Novel experimental and sequencing techniques have led to an exponential explosion of data in viral genomics. To analyse such data, rapidly gain information, and transform this information into knowledge, interdisciplinary approaches involving several different types of expertise are necessary. Machine learning has been at the forefront of providing models with increasing accuracy, owing to the development of newer paradigms with strong fundamental bases. The Support Vector Machine (SVM) is one such robust tool, based rigorously on statistical learning theory. SVM provides very high-quality and robust solutions to classification and regression problems. Several studies in virology employ high-performance tools, including SVM, for the identification of potentially important gene and protein functions, mainly because of the highly beneficial properties of SVM. In this chapter we briefly provide lucid and easy-to-understand details of SVM algorithms, along with applications in virology.
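The chapter's core object, an SVM classifier, can be illustrated with a minimal sketch. This toy example (invented data; scikit-learn assumed available; not a method from the chapter) trains a linear SVM on two separable classes:

```python
from sklearn.svm import SVC

# Two well-separated 2-D classes (toy data)
X = [[0, 0], [1, 0], [0, 1], [4, 4], [5, 4], [4, 5]]
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)
print(clf.predict([[0.5, 0.5], [4.5, 4.5]]))  # -> [0 1]
```

The fitted model places a maximum-margin hyperplane between the two clusters; new points are labelled by which side of the hyperplane they fall on.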




Citations
Posted Content DOI
26 Jul 2021 - bioRxiv
Abstract: The chemical basis of smell remains an unsolved problem, with ongoing studies mapping perceptual descriptor data from human participants to chemical structures using computational methods. These approaches are, however, limited by linguistic capabilities and inter-individual differences among participants. We use olfactory behaviour data from the nematode C. elegans, which has isogenic populations in a laboratory setting, and employ machine learning approaches for a binary classification task: predicting whether or not the worm will be attracted to a given monomolecular odorant. Among others, we use architectures based on Natural Language Processing methods applied to the SMILES representation of chemicals for molecular descriptor generation, and show that machine learning algorithms trained on the descriptors give robust prediction results. We further show, by data augmentation, that increasing the number of samples increases the accuracy of the models. From this detailed analysis, we are able to achieve accuracies comparable to those in human studies and infer that there exists a non-trivial relationship between the features of chemical structures and the nematode's behaviour.
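As a hedged illustration of the kind of pipeline this abstract describes: the molecules and labels below are hypothetical, and a deliberately crude character-frequency featurization stands in for the NLP-based descriptors the preprint actually uses.

```python
from collections import Counter
from sklearn.svm import SVC

# Crude featurization: counts of a few SMILES characters (illustrative only)
ALPHABET = "CNOc1()=#"

def smiles_features(smiles):
    counts = Counter(smiles)
    return [counts.get(ch, 0) for ch in ALPHABET]

# Hypothetical molecules and attraction labels (1 = attractive to the worm)
smiles = ["CCO", "CC(C)=O", "c1ccccc1", "CCCCCC", "CC(=O)C", "c1ccncc1"]
labels = [1, 1, 0, 1, 1, 0]

X = [smiles_features(s) for s in smiles]
clf = SVC(kernel="rbf", C=10.0).fit(X, labels)
pred = clf.predict(X)
```

A real study would replace `smiles_features` with learned molecular descriptors and evaluate on held-out odorants rather than the training set.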
Journal Article DOI
Abstract: Synchronization and bursting activity are intrinsic electrophysiological properties of in vivo and in vitro neural networks. During early development, cortical cultures exhibit a wide repertoire of synchronous bursting dynamics whose characterization may help to understand the parameters governing the transition from immature to mature networks. Here we used machine learning techniques to characterize and predict the developing spontaneous activity in mouse cortical neurons on microelectrode arrays (MEAs) during the first three weeks in vitro. Network activity at three stages of early development was defined by 18 electrophysiological features of spikes, bursts, synchrony, and connectivity. The variability of neuronal network activity during early development was investigated by applying k-means and self-organizing map (SOM) clustering analysis to features of bursts and synchrony. These electrophysiological features were predicted at the third week in vitro with high accuracy from those at earlier times using three machine learning models: Multivariate Adaptive Regression Splines, Support Vector Machines, and Random Forest. Our results indicate that initial patterns of electrical activity during the first week in vitro may already predetermine the final development of the neuronal network activity. The methodological approach used here may be applied to explore the biological mechanisms underlying the complex dynamics of spontaneous activity in developing neuronal cultures.
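A hedged sketch of the prediction setup described above, using two of the named model families on synthetic stand-in data (the 18 features and the target here are placeholders, not MEA recordings):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X_week1 = rng.normal(size=(50, 18))  # 18 early-stage features per culture
# Placeholder target: one week-3 feature, linearly related to an early feature plus noise
y_week3 = 2.0 * X_week1[:, 0] + rng.normal(scale=0.1, size=50)

svr = SVR(kernel="rbf").fit(X_week1, y_week3)
rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_week1, y_week3)
```

In the study itself, each model would be scored on held-out cultures to test whether week-1 activity really predetermines week-3 network dynamics.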

References
Journal Article DOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.

37,868 citations


"Application of support vector machi..." refers to methods in this paper

  • ...LIBSVM [87] was used to generate model with the radial basis function (RBF) as a kernel function....

    [...]

  • ...Again, LIBSVM [87] with RBF was used to develop model....

    [...]
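The usage quoted above can be approximated with scikit-learn, whose SVC class wraps LIBSVM internally; the data and hyperparameters below are placeholders, not those of the chapter.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two Gaussian blobs standing in for positive and negative examples
X = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(3.0, 1.0, (20, 5))])
y = np.array([0] * 20 + [1] * 20)

# RBF kernel, as in the quoted usage; C and gamma are defaults, not tuned values
model = SVC(kernel="rbf", gamma="scale", C=1.0).fit(X, y)
```

In practice C and gamma would be chosen by cross-validated grid search, as the LIBSVM authors recommend.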

Book
01 Jan 2000
TL;DR: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new-generation learning system based on recent advances in statistical learning theory; the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.
Abstract: From the publisher: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new-generation learning system based on recent advances in statistical learning theory. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, and biosequence analysis, and are now established as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and its applications. The concepts are introduced gradually in accessible and self-contained stages, while the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.

13,269 citations

Journal Article DOI
TL;DR: A least squares version of support vector machine (SVM) classifiers whose solution follows from solving a set of linear equations, instead of the quadratic programming required for classical SVMs.
Abstract: In this letter we discuss a least squares version of support vector machine (SVM) classifiers. Due to equality-type constraints in the formulation, the solution follows from solving a set of linear equations, instead of quadratic programming as for classical SVMs. The approach is illustrated on a two-spiral benchmark classification problem.
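The key point of this abstract, that the LS-SVM solution comes from one linear solve rather than quadratic programming, can be sketched in a few lines of NumPy. Toy data, a linear kernel, and an illustrative regularization constant `gamma` are assumptions, not details from the letter.

```python
import numpy as np

def lssvm_train(X, y, gamma=10.0):
    """Solve the LS-SVM KKT system (linear kernel); returns bias b and multipliers alpha."""
    Omega = np.outer(y, y) * (X @ X.T)  # Omega_ij = y_i * y_j * <x_i, x_j>
    n = len(y)
    # KKT system: [[0, y^T], [y, Omega + I/gamma]] @ [b; alpha] = [0; 1]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)  # one linear solve, no quadratic programming
    return sol[0], sol[1:]

def lssvm_predict(X_train, y_train, b, alpha, X_new):
    # Decision function: sign( sum_i alpha_i y_i <x_i, x> + b )
    return np.sign((alpha * y_train) @ (X_train @ X_new.T) + b)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
b, alpha = lssvm_train(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
```

Swapping the Gram matrix `X @ X.T` for an RBF kernel matrix gives the nonlinear variant; the equality constraints are what turn the usual QP into this single linear system.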

7,819 citations

Journal Article DOI
TL;DR: A technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments is presented.
Abstract: Demand has never been greater for revolutionary technologies that deliver fast, inexpensive and accurate genome information. This challenge has catalysed the development of next-generation sequencing (NGS) technologies. The inexpensive production of large volumes of sequence data is the primary advantage over conventional methods. Here, I present a technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments. I also outline the broad range of applications for NGS technologies, in addition to providing guidelines for platform selection to address biological questions of interest.

6,671 citations

Journal Article DOI
Vladimir Vapnik1
TL;DR: Demonstrates how abstract learning theory established conditions for generalization that are more general than those discussed in classical statistical paradigms, and how an understanding of these conditions inspired new algorithmic approaches to function estimation problems.
Abstract: Statistical learning theory was introduced in the late 1960s. Until the 1990s it was a purely theoretical analysis of the problem of function estimation from a given collection of data. In the middle of the 1990s, new types of learning algorithms (called support vector machines), based on the developed theory, were proposed. This made statistical learning theory not only a tool for theoretical analysis but also a tool for creating practical algorithms for estimating multidimensional functions. This article presents a very general overview of statistical learning theory, including both theoretical and algorithmic aspects. The goal of this overview is to demonstrate how the abstract learning theory established conditions for generalization which are more general than those discussed in classical statistical paradigms, and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems.

4,587 citations