Journal ArticleDOI

Time Complexity Analysis of Support Vector Machines (SVM) in LibSVM

01 Oct 2015-International Journal of Computer Applications (Foundation of Computer Science (FCS), NY, USA)-Vol. 128, Iss: 3, pp 28-34
TL;DR: The research shows that the complexity of SVM (LibSVM) is O(n³) and that the C++ implementation is faster than the Java implementation in both training and testing; in addition, growth in the dataset size increases the computation time.
Abstract: Support Vector Machines (SVM) is a machine learning method that can be used to perform classification tasks. Many researchers use SVM libraries to accelerate their research development, since using such a library saves time and avoids writing code from scratch. LibSVM is one SVM library that has been widely used by researchers to solve their problems; it is also integrated into WEKA, a popular data mining tool. This article contains the results of our work on the complexity analysis of Support Vector Machines, focusing on the SVM algorithm and its implementation in LibSVM. We use two popular programming languages, C++ and Java, with three different datasets to test our analysis and experiments. The results prove that the complexity of SVM (LibSVM) is O(n³) and that the C++ implementation is faster than the Java one in both training and testing; moreover, data growth increases the computation time.
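As a rough illustration of the kind of measurement the article describes, the sketch below times scikit-learn's SVC (which is backed by LibSVM) on synthetic datasets of increasing size and estimates the empirical growth exponent from a log-log fit. This is a minimal sketch, not the authors' benchmark: the dataset sizes, kernel, and parameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' benchmark): time LibSVM-backed training
# at increasing dataset sizes and estimate the empirical growth exponent.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC  # scikit-learn's SVC wraps LibSVM

sizes = [500, 1000, 2000, 4000]  # assumed sizes; the paper used its own datasets
times = []
for n in sizes:
    X, y = make_classification(n_samples=n, n_features=20, random_state=0)
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # assumed parameters
    t0 = time.perf_counter()
    clf.fit(X, y)
    times.append(time.perf_counter() - t0)

# The slope of log(time) vs. log(n) approximates the exponent k in O(n^k);
# values between 2 and 3 are consistent with the article's O(n³) analysis.
k = np.polyfit(np.log(sizes), np.log(times), 1)[0]
print(f"estimated growth exponent: {k:.2f}")
```

In practice the measured exponent depends on the kernel, the cache size, and how many points end up as support vectors, which is why empirical timings and the worst-case bound can differ.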


Citations
Book ChapterDOI
E.R. Davies
01 Jan 1990
TL;DR: This chapter introduces the subject of statistical pattern recognition (SPR) by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier.
Abstract: This chapter introduces the subject of statistical pattern recognition (SPR). It starts by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier. The concepts of an optimal number of features, representativeness of the training data, and the need to avoid overfitting to the training data are stressed. The chapter shows that methods such as the support vector machine and artificial neural networks are subject to these same training limitations, although each has its advantages. For neural networks, the multilayer perceptron architecture and back-propagation algorithm are described. The chapter distinguishes between supervised and unsupervised learning, demonstrating the advantages of the latter and showing how methods such as clustering and principal components analysis fit into the SPR framework. The chapter also defines the receiver operating characteristic, which allows an optimum balance between false positives and false negatives to be achieved.

1,189 citations

Journal ArticleDOI
TL;DR: This survey presents a recent, comprehensive overview, with comparisons as well as trends in development and usage, of cutting-edge Artificial Intelligence software capable of scaling computation effectively and efficiently in the era of Big Data.
Abstract: The combined impact of new computing resources and techniques, together with an increasing avalanche of large datasets, is transforming many research areas and may lead to technological breakthroughs usable by billions of people. In recent years, Machine Learning and especially its subfield Deep Learning have seen impressive advances. Techniques developed within these two fields are now able to analyze and learn from huge amounts of real-world examples in disparate formats. While the number of Machine Learning algorithms is extensive and growing, so is the number of their implementations through frameworks and libraries. Software development in this field is fast-paced, with a large number of open-source packages coming from academia, industry, start-ups, and wider open-source communities. This survey presents a recent comprehensive overview, with comparisons as well as trends in development and usage, of cutting-edge Artificial Intelligence software. It also provides an overview of massive parallelism support that is capable of scaling computation effectively and efficiently in the era of Big Data.

443 citations


Cites background from "Time Complexity Analysis of Support..."

  • ...It takes O(n³) time (Abdiansah and Wardoyo 2015) in the worst case and around O(n²) on typical cases, where n is the number of data points....

    [...]

Journal ArticleDOI
TL;DR: A comprehensive survey on the use of ML in MEC systems is provided, offering insight into the current progress of this research area. Helpful guidance is supplied by pointing out which MEC challenges can be solved by ML solutions, what the current trending algorithms in frontier ML research are, and how they could be used in MEC.
Abstract: Mobile Edge Computing (MEC) is considered an essential future service for the implementation of 5G networks and the Internet of Things, as it is the best method of delivering computation and communication resources to mobile devices. It is based on connecting users to servers located on the edge of the network, which is especially relevant for real-time applications that demand minimal latency. In order to guarantee a resource-efficient MEC (which could mean, for example, improved Quality of Service for users or lower costs for service providers), it is important to consider certain aspects of the service model, such as where to offload the tasks generated by the devices, how many resources to allocate to each user (especially in the wired or wireless device-server communication), and how to handle inter-server communication. However, in MEC scenarios with many and varied users, servers, and applications, these problems are characterized by parameters with exceedingly high dimensionality, resulting in too much data to process and complicating the task of finding efficient configurations. This will be particularly troublesome when 5G networks and the Internet of Things roll out, with their massive numbers of devices. To address this concern, the best solution is to utilize Machine Learning (ML) algorithms, which enable the computer to draw conclusions and make predictions based on existing data without human supervision, leading to quick, near-optimal solutions even in problems of high dimensionality. Indeed, in scenarios with too much data and too many parameters, ML algorithms are often the only feasible alternative. In this paper, a comprehensive survey on the use of ML in MEC systems is provided, offering insight into the current progress of this research area. Furthermore, helpful guidance is supplied by pointing out which MEC challenges can be solved by ML solutions, what the current trending algorithms in frontier ML research are, and how they could be used in MEC. These pieces of information should prove fundamental in encouraging future research that combines ML and MEC.

186 citations


Cites background from "Time Complexity Analysis of Support..."

  • ...with problem and implementation, so these are rough estimates [186]....

    [...]

Journal ArticleDOI
15 May 2020-Sensors
TL;DR: The present work proposes a cervical cancer prediction model (CCPM) that offers early prediction of cervical cancer using risk factors as inputs and employs random forest (RF) as a classifier.
Abstract: Globally, cervical cancer remains the most prevalent cancer in females. Hence, it is necessary to determine the importance of the risk factors of cervical cancer in order to classify potential patients. The present work proposes a cervical cancer prediction model (CCPM) that offers early prediction of cervical cancer using risk factors as inputs. The CCPM first removes outliers using outlier detection methods such as density-based spatial clustering of applications with noise (DBSCAN) and isolation forest (iForest), then increases the number of cases in the dataset in a balanced way, for example through the synthetic minority over-sampling technique (SMOTE) and SMOTE with Tomek links (SMOTETomek). Finally, it employs random forest (RF) as a classifier. CCPM thus comprises four scenarios: (1) DBSCAN + SMOTETomek + RF, (2) DBSCAN + SMOTE + RF, (3) iForest + SMOTETomek + RF, and (4) iForest + SMOTE + RF. A dataset of 858 potential patients was used to validate the performance of the proposed method. We found that the combinations of iForest with SMOTE and iForest with SMOTETomek provided better performance than those of DBSCAN with SMOTE and DBSCAN with SMOTETomek. We also observed that RF performed the best among several popular machine learning classifiers. Furthermore, the proposed CCPM showed better accuracy than previously proposed methods for forecasting cervical cancer. In addition, a mobile application that can collect cervical cancer risk-factor data and provide results from the CCPM was developed for instant and proper action at the initial stage of cervical cancer.
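One of the four scenarios (iForest + SMOTETomek + RF) can be sketched with scikit-learn and imbalanced-learn roughly as follows; this is a minimal illustration under assumed parameters, not the paper's exact configuration.

```python
# Illustrative sketch of one CCPM scenario (iForest + SMOTETomek + RF).
# Parameters and pipeline details are assumptions, not the paper's settings.
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from imblearn.combine import SMOTETomek  # pip install imbalanced-learn

def ccpm_scenario3(X, y, random_state=0):
    # 1. Remove outliers: IsolationForest labels inliers +1, outliers -1.
    inliers = IsolationForest(random_state=random_state).fit_predict(X) == 1
    X, y = X[inliers], y[inliers]
    # 2. Rebalance classes with SMOTE oversampling plus Tomek-link cleaning.
    X, y = SMOTETomek(random_state=random_state).fit_resample(X, y)
    # 3. Train the random forest classifier on the cleaned, balanced data.
    return RandomForestClassifier(random_state=random_state).fit(X, y)
```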

155 citations


Cites background from "Time Complexity Analysis of Support..."

  • ...Time complexity and space complexity are two types of computational complexity [81,82]....

    [...]

Journal ArticleDOI
TL;DR: This paper proposes a series of new centrality indices for links in the line graph and designs three supervised learning methods for link weight prediction in both single-layer and multilayer networks, which perform much better than several recently proposed baseline methods.
Abstract: Real-world networks feature weights of interactions, where link weights often represent physical attributes. In many situations, to recover missing data or predict the network evolution, we need to predict link weights in a network. In this paper, we first proposed a series of new centrality indices for links in the line graph. Then, utilizing these line-graph indices as well as a number of original-graph indices, we designed three supervised learning methods to realize link weight prediction in both single-layer and multilayer networks; these perform much better than several recently proposed baseline methods. We found that the resource allocation index (RA) plays a more important role in weight prediction than other topological properties, and that the line-graph indices are at least as important as the original-graph indices in link weight prediction. In particular, the successful application of our methods to the Yelp layered network suggests that we can indeed predict the offline co-foraging behaviors of users based only on their online social interactions, which may open a new direction for link weight prediction algorithms and meanwhile provide insights for designing better restaurant recommendation systems.
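The general idea of line-graph features can be sketched as follows: each link of the original graph G becomes a node of the line graph L(G), so ordinary node centralities computed on L(G) serve as per-link features. The choice of degree centrality below is an arbitrary illustration, not one of the paper's proposed indices.

```python
# Sketch of the line-graph idea: each link of G becomes a node of L(G),
# so node centralities on L(G) act as per-link features of G.
# Degree centrality is an illustrative choice, not the paper's index.
import networkx as nx

G = nx.karate_club_graph()
L = nx.line_graph(G)                    # nodes of L(G) are the edges of G
link_feature = nx.degree_centrality(L)  # one score per link of G

for edge in list(G.edges())[:5]:
    print(edge, round(link_feature[edge], 3))
```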

114 citations


Cites background from "Time Complexity Analysis of Support..."

  • ..., the time complexity for SVM is O(|E|³) [80], for RF and GBDT is O(T·F·|E|·log|E|),...

    [...]

  • ...For supervised methods, the time complexity is mainly determined by the training part, e.g., the time complexity for SVM is O(|E|³) [80], for RF and GBDT it is O(T·F·|E|·log|E|), where T denotes the number of trees and F denotes the number of attributes [81]....

    [...]

References
Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
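For context, a minimal training/prediction round trip through LibSVM's official Python bindings might look like the following; `heart_scale` is the sample dataset that ships with LibSVM, and the parameter string is an illustrative assumption, not a recommendation.

```python
# Minimal sketch using LibSVM's official Python bindings (pip install libsvm).
# 'heart_scale' ships with LibSVM; the parameters are illustrative only.
from libsvm.svmutil import svm_read_problem, svm_train, svm_predict

y, x = svm_read_problem("heart_scale")
model = svm_train(y[:200], x[:200], "-s 0 -t 2 -c 4")  # C-SVC, RBF kernel
p_label, p_acc, p_val = svm_predict(y[200:], x[200:], model)
```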

40,826 citations

Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations

01 Jan 2008
TL;DR: A simple procedure is proposed, which usually gives reasonable results and is suitable for beginners who are not familiar with SVM.
Abstract: Support vector machine (SVM) is a popular technique for classification. However, beginners who are not familiar with SVM often get unsatisfactory results, since they miss some easy but significant steps. In this guide, we propose a simple procedure which usually gives reasonable results.
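The guide's recommended procedure (scale the data, start with an RBF kernel, and cross-validate over C and gamma) can be rendered in scikit-learn roughly as follows; the grid ranges are illustrative assumptions, not the guide's exact values.

```python
# Rough scikit-learn rendering of the guide's procedure: scale features,
# use an RBF kernel, and cross-validate over C and gamma.
# The grid ranges below are illustrative assumptions.
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {
    "svc__C": [2.0**k for k in range(-5, 16, 4)],
    "svc__gamma": [2.0**k for k in range(-15, 4, 4)],
}
search = GridSearchCV(pipe, param_grid, cv=5)
# search.fit(X_train, y_train); search.best_params_ then gives (C, gamma).
```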

7,069 citations

Journal ArticleDOI
TL;DR: Decomposition implementations for two "all-together" multiclass SVM methods are given, and it is shown that for large problems, methods that consider all data at once generally need fewer support vectors.
Abstract: Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend them for multiclass classification is still an ongoing research issue. Several methods have been proposed in which a multiclass classifier is typically constructed by combining several binary classifiers. Some authors have also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods on large-scale problems have not been seriously conducted; especially for methods solving the multiclass SVM in one step, a much larger optimization problem is required, so experiments have until now been limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classifications: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems, methods considering all data at once generally need fewer support vectors.
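To make the comparison concrete: one-against-one trains k(k−1)/2 binary classifiers for a k-class problem, while one-against-all trains k. The sketch below contrasts the two strategies; the dataset and the linear base learner are arbitrary illustrative choices, not the paper's setup.

```python
# Sketch contrasting the two decomposition strategies on a k-class task:
# one-against-one fits k*(k-1)/2 binary classifiers, one-against-all fits k.
from sklearn.datasets import load_digits
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)  # k = 10 classes
ovo = OneVsOneClassifier(LinearSVC()).fit(X, y)
ovr = OneVsRestClassifier(LinearSVC()).fit(X, y)
print(len(ovo.estimators_), len(ovr.estimators_))  # 45 vs. 10
```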

6,562 citations

01 Jan 1998
TL;DR: The Structural Risk Minimisation (SRM) principle, as discussed by the authors, has been shown to be superior to the traditional Empirical Risk Minimisation (ERM) principle employed by conventional neural networks: SRM minimises an upper bound on the generalisation error, as opposed to ERM, which minimises the error on the training data.
Abstract: The foundations of Support Vector Machines (SVM) have been developed by Vapnik and are gaining popularity due to many attractive features and promising empirical performance. The formulation embodies the Structural Risk Minimisation (SRM) principle, which in our work has been shown to be superior to the traditional Empirical Risk Minimisation (ERM) principle employed by conventional neural networks. SRM minimises an upper bound on the generalisation error (a bound governed by the VC dimension), as opposed to ERM, which minimises the error on the training data. It is this difference which equips SVMs with a greater ability to generalise, which is our goal in statistical learning. SVMs were developed to solve the classification problem, but recently they have been extended to the domain of regression problems.
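For reference, the standard form of the bound that SRM minimises, as stated in the general VC-theory literature (this particular tutorial may phrase it differently), is the following: with probability at least 1 − η,

```latex
% Vapnik's VC generalisation bound: with probability at least 1 - \eta,
% the true risk R(f) is bounded by the empirical risk plus a capacity
% term that grows with the VC dimension h and shrinks with sample size n.
R(f) \;\le\; R_{\mathrm{emp}}(f)
      + \sqrt{\frac{h\left(\ln\tfrac{2n}{h} + 1\right) - \ln\tfrac{\eta}{4}}{n}}
```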

2,295 citations

Trending Questions (1)
How does the complexity of SVM compare to other algorithms?

The provided paper does not compare the complexity of SVM to other algorithms.