Journal ArticleDOI

Time Complexity Analysis of Support Vector Machines (SVM) in LibSVM

01 Oct 2015-International Journal of Computer Applications (Foundation of Computer Science (FCS), NY, USA)-Vol. 128, Iss: 3, pp 28-34
TL;DR: The research shows that the complexity of SVM (LibSVM) is O(n³) and that the C++ implementation is faster than the Java implementation in both training and testing; in addition, growth in the dataset size increases the computation time.
Abstract: Support Vector Machines (SVM) is a machine learning method that can be used to perform classification tasks. Many researchers use SVM libraries to accelerate their research development, since using such a library saves time and avoids writing code from scratch. LibSVM is one SVM library that has been widely used by researchers to solve their problems; it is also integrated into WEKA, a popular data mining tool. This article contains the results of our work on the complexity analysis of Support Vector Machines, focusing on the SVM algorithm and its implementation in LibSVM. We use two popular programming languages, C++ and Java, with three different datasets to test our analysis and experiments. The results prove that the complexity of SVM (LibSVM) is O(n³) and that the C++ implementation is faster than the Java one in both training and testing; moreover, data growth increases the computation time.
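As a rough illustration of the kind of measurement the article describes, the sketch below times scikit-learn's SVC (which is backed by LibSVM) on synthetic datasets of increasing size and estimates the empirical growth exponent from a log-log fit. This is a minimal sketch, not the authors' benchmark: the dataset sizes, kernel, and parameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' benchmark): time LibSVM-backed training
# at increasing dataset sizes and estimate the empirical growth exponent.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC  # scikit-learn's SVC wraps LibSVM

sizes = [500, 1000, 2000, 4000]  # assumed sizes; the paper used its own datasets
times = []
for n in sizes:
    X, y = make_classification(n_samples=n, n_features=20, random_state=0)
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # assumed parameters
    t0 = time.perf_counter()
    clf.fit(X, y)
    times.append(time.perf_counter() - t0)

# The slope of log(time) vs. log(n) approximates the exponent k in O(n^k);
# values between 2 and 3 are consistent with the article's O(n³) analysis.
k = np.polyfit(np.log(sizes), np.log(times), 1)[0]
print(f"estimated growth exponent: {k:.2f}")
```

In practice the measured exponent depends on the kernel, the cache size, and how many points end up as support vectors, which is why empirical timings and the worst-case bound can differ.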


Citations
Book ChapterDOI
E.R. Davies
01 Jan 1990
TL;DR: This chapter introduces the subject of statistical pattern recognition (SPR) by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier.
Abstract: This chapter introduces the subject of statistical pattern recognition (SPR). It starts by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier. The concepts of an optimal number of features, representativeness of the training data, and the need to avoid overfitting to the training data are stressed. The chapter shows that methods such as the support vector machine and artificial neural networks are subject to these same training limitations, although each has its advantages. For neural networks, the multilayer perceptron architecture and back-propagation algorithm are described. The chapter distinguishes between supervised and unsupervised learning, demonstrating the advantages of the latter and showing how methods such as clustering and principal components analysis fit into the SPR framework. The chapter also defines the receiver operating characteristic, which allows an optimum balance between false positives and false negatives to be achieved.

1,189 citations

Journal ArticleDOI
TL;DR: This survey presents a recent, comprehensive overview, with comparisons as well as trends in development and usage, of cutting-edge Artificial Intelligence software capable of scaling computation effectively and efficiently in the era of Big Data.
Abstract: The combined impact of new computing resources and techniques, together with an increasing avalanche of large datasets, is transforming many research areas and may lead to technological breakthroughs usable by billions of people. In recent years, Machine Learning and especially its subfield Deep Learning have seen impressive advances. Techniques developed within these two fields are now able to analyze and learn from huge amounts of real-world examples in disparate formats. While the number of Machine Learning algorithms is extensive and growing, so is the number of their implementations through frameworks and libraries. Software development in this field is fast-paced, with a large number of open-source packages coming from academia, industry, start-ups, and wider open-source communities. This survey presents a recent comprehensive overview, with comparisons as well as trends in development and usage, of cutting-edge Artificial Intelligence software. It also provides an overview of massive parallelism support that is capable of scaling computation effectively and efficiently in the era of Big Data.

443 citations


Cites background from "Time Complexity Analysis of Support..."

  • ...It takes O(n³) time (Abdiansah and Wardoyo 2015) in the worst case and around O(n²) on typical cases, where n is the number of data points....

    [...]

Journal ArticleDOI
TL;DR: A comprehensive survey on the use of ML in MEC systems is provided, offering insight into the current progress of this research area. Helpful guidance is supplied by pointing out which MEC challenges can be solved by ML solutions, what the current trending algorithms in frontier ML research are, and how they could be used in MEC.
Abstract: Mobile Edge Computing (MEC) is considered an essential future service for the implementation of 5G networks and the Internet of Things, as it is the best method of delivering computation and communication resources to mobile devices. It is based on connecting users to servers located on the edge of the network, which is especially relevant for real-time applications that demand minimal latency. In order to guarantee a resource-efficient MEC (which could mean, for example, improved Quality of Service for users or lower costs for service providers), it is important to consider certain aspects of the service model, such as where to offload the tasks generated by the devices, how many resources to allocate to each user (especially in the wired or wireless device-server communication), and how to handle inter-server communication. However, in MEC scenarios with many and varied users, servers, and applications, these problems are characterized by parameters with exceedingly high dimensionality, resulting in too much data to process and complicating the task of finding efficient configurations. This will be particularly troublesome when 5G networks and the Internet of Things roll out, with their massive numbers of devices. To address this concern, the best solution is to utilize Machine Learning (ML) algorithms, which enable the computer to draw conclusions and make predictions based on existing data without human supervision, leading to quick, near-optimal solutions even in problems of high dimensionality. Indeed, in scenarios with too much data and too many parameters, ML algorithms are often the only feasible alternative. In this paper, a comprehensive survey on the use of ML in MEC systems is provided, offering insight into the current progress of this research area. Furthermore, helpful guidance is supplied by pointing out which MEC challenges can be solved by ML solutions, what the current trending algorithms in frontier ML research are, and how they could be used in MEC. These pieces of information should prove fundamental in encouraging future research that combines ML and MEC.

186 citations


Cites background from "Time Complexity Analysis of Support..."

  • ...with problem and implementation, so these are rough estimates [186]....

    [...]

Journal ArticleDOI
15 May 2020-Sensors
TL;DR: The present work proposes a cervical cancer prediction model (CCPM) that offers early prediction of cervical cancer using risk factors as inputs and employs random forest (RF) as a classifier.
Abstract: Globally, cervical cancer remains the most prevalent cancer in females. Hence, it is necessary to determine the importance of the risk factors of cervical cancer in order to classify potential patients. The present work proposes a cervical cancer prediction model (CCPM) that offers early prediction of cervical cancer using risk factors as inputs. The CCPM first removes outliers using outlier detection methods such as density-based spatial clustering of applications with noise (DBSCAN) and isolation forest (iForest), then increases the number of cases in the dataset in a balanced way, for example through the synthetic minority over-sampling technique (SMOTE) and SMOTE with Tomek links (SMOTETomek). Finally, it employs random forest (RF) as a classifier. CCPM thus comprises four scenarios: (1) DBSCAN + SMOTETomek + RF, (2) DBSCAN + SMOTE + RF, (3) iForest + SMOTETomek + RF, and (4) iForest + SMOTE + RF. A dataset of 858 potential patients was used to validate the performance of the proposed method. We found that the combinations of iForest with SMOTE and iForest with SMOTETomek provided better performance than those of DBSCAN with SMOTE and DBSCAN with SMOTETomek. We also observed that RF performed the best among several popular machine learning classifiers. Furthermore, the proposed CCPM showed better accuracy than previously proposed methods for forecasting cervical cancer. In addition, a mobile application that can collect cervical cancer risk-factor data and provide results from the CCPM was developed for instant and proper action at the initial stage of cervical cancer.
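One of the four scenarios (iForest + SMOTETomek + RF) can be sketched with scikit-learn and imbalanced-learn roughly as follows; this is a minimal illustration under assumed parameters, not the paper's exact configuration.

```python
# Illustrative sketch of one CCPM scenario (iForest + SMOTETomek + RF).
# Parameters and pipeline details are assumptions, not the paper's settings.
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from imblearn.combine import SMOTETomek  # pip install imbalanced-learn

def ccpm_scenario3(X, y, random_state=0):
    # 1. Remove outliers: IsolationForest labels inliers +1, outliers -1.
    inliers = IsolationForest(random_state=random_state).fit_predict(X) == 1
    X, y = X[inliers], y[inliers]
    # 2. Rebalance classes with SMOTE oversampling plus Tomek-link cleaning.
    X, y = SMOTETomek(random_state=random_state).fit_resample(X, y)
    # 3. Train the random forest classifier on the cleaned, balanced data.
    return RandomForestClassifier(random_state=random_state).fit(X, y)
```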

155 citations


Cites background from "Time Complexity Analysis of Support..."

  • ...Time complexity and space complexity are two types of computational complexity [81,82]....

    [...]

Journal ArticleDOI
TL;DR: This paper proposes a series of new centrality indices for links in the line graph and designs three supervised learning methods for link weight prediction in both single-layer and multilayer networks, which perform much better than several recently proposed baseline methods.
Abstract: Real-world networks feature weights of interactions, where link weights often represent physical attributes. In many situations, to recover missing data or predict the network evolution, we need to predict link weights in a network. In this paper, we first proposed a series of new centrality indices for links in the line graph. Then, utilizing these line-graph indices as well as a number of original-graph indices, we designed three supervised learning methods to realize link weight prediction in both single-layer and multilayer networks; these perform much better than several recently proposed baseline methods. We found that the resource allocation index (RA) plays a more important role in weight prediction than other topological properties, and that the line-graph indices are at least as important as the original-graph indices in link weight prediction. In particular, the successful application of our methods to the Yelp layered network suggests that we can indeed predict the offline co-foraging behaviors of users based only on their online social interactions, which may open a new direction for link weight prediction algorithms and meanwhile provide insights for designing better restaurant recommendation systems.
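The general idea of line-graph features can be sketched as follows: each link of the original graph G becomes a node of the line graph L(G), so ordinary node centralities computed on L(G) serve as per-link features. The choice of degree centrality below is an arbitrary illustration, not one of the paper's proposed indices.

```python
# Sketch of the line-graph idea: each link of G becomes a node of L(G),
# so node centralities on L(G) act as per-link features of G.
# Degree centrality is an illustrative choice, not the paper's index.
import networkx as nx

G = nx.karate_club_graph()
L = nx.line_graph(G)                    # nodes of L(G) are the edges of G
link_feature = nx.degree_centrality(L)  # one score per link of G

for edge in list(G.edges())[:5]:
    print(edge, round(link_feature[edge], 3))
```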

114 citations


Cites background from "Time Complexity Analysis of Support..."

  • ..., the time complexity for SVM is O(|E|³) [80], for RF and GBDT is O(T·F·|E|·log|E|),...

    [...]

  • ...For supervised methods, the time complexity is mainly determined by the training part, e.g., the time complexity for SVM is O(|E|³) [80], for RF and GBDT it is O(T·F·|E|·log|E|), where T denotes the number of trees and F denotes the number of attributes [81]....

    [...]

References
Journal ArticleDOI
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
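For context, a minimal training/prediction round trip through LibSVM's official Python bindings might look like the following; `heart_scale` is the sample dataset that ships with LibSVM, and the parameter string is an illustrative assumption, not a recommendation.

```python
# Minimal sketch using LibSVM's official Python bindings (pip install libsvm).
# 'heart_scale' ships with LibSVM; the parameters are illustrative only.
from libsvm.svmutil import svm_read_problem, svm_train, svm_predict

y, x = svm_read_problem("heart_scale")
model = svm_train(y[:200], x[:200], "-s 0 -t 2 -c 4")  # C-SVC, RBF kernel
p_label, p_acc, p_val = svm_predict(y[200:], x[200:], model)
```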

40,826 citations

Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations

01 Jan 2008
TL;DR: A simple procedure is proposed, which usually gives reasonable results and is suitable for beginners who are not familiar with SVM.
Abstract: Support vector machine (SVM) is a popular technique for classification. However, beginners who are not familiar with SVM often get unsatisfactory results, since they miss some easy but significant steps. In this guide, we propose a simple procedure which usually gives reasonable results.
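The guide's recommended procedure (scale the data, start with an RBF kernel, and cross-validate over C and gamma) can be rendered in scikit-learn roughly as follows; the grid ranges are illustrative assumptions, not the guide's exact values.

```python
# Rough scikit-learn rendering of the guide's procedure: scale features,
# use an RBF kernel, and cross-validate over C and gamma.
# The grid ranges below are illustrative assumptions.
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {
    "svc__C": [2.0**k for k in range(-5, 16, 4)],
    "svc__gamma": [2.0**k for k in range(-15, 4, 4)],
}
search = GridSearchCV(pipe, param_grid, cv=5)
# search.fit(X_train, y_train); search.best_params_ then gives (C, gamma).
```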

7,069 citations

Journal ArticleDOI
TL;DR: Decomposition implementations for two "all-together" multiclass SVM methods are given, and it is shown that for large problems, methods that consider all data at once generally need fewer support vectors.
Abstract: Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend them for multiclass classification is still an ongoing research issue. Several methods have been proposed in which a multiclass classifier is typically constructed by combining several binary classifiers. Some authors have also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods on large-scale problems have not been seriously conducted; especially for methods solving the multiclass SVM in one step, a much larger optimization problem is required, so experiments have until now been limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classifications: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems, methods considering all data at once generally need fewer support vectors.
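To make the comparison concrete: one-against-one trains k(k−1)/2 binary classifiers for a k-class problem, while one-against-all trains k. The sketch below contrasts the two strategies; the dataset and the linear base learner are arbitrary illustrative choices, not the paper's setup.

```python
# Sketch contrasting the two decomposition strategies on a k-class task:
# one-against-one fits k*(k-1)/2 binary classifiers, one-against-all fits k.
from sklearn.datasets import load_digits
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)  # k = 10 classes
ovo = OneVsOneClassifier(LinearSVC()).fit(X, y)
ovr = OneVsRestClassifier(LinearSVC()).fit(X, y)
print(len(ovo.estimators_), len(ovr.estimators_))  # 45 vs. 10
```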

6,562 citations

01 Jan 1998
TL;DR: The Structural Risk Minimisation (SRM) principle, as discussed by the authors, has been shown to be superior to the traditional Empirical Risk Minimisation (ERM) principle employed by conventional neural networks: SRM minimises an upper bound on the generalisation error, as opposed to ERM, which minimises the error on the training data.
Abstract: The foundations of Support Vector Machines (SVM) have been developed by Vapnik and are gaining popularity due to many attractive features and promising empirical performance. The formulation embodies the Structural Risk Minimisation (SRM) principle, which in our work has been shown to be superior to the traditional Empirical Risk Minimisation (ERM) principle employed by conventional neural networks. SRM minimises an upper bound on the generalisation error (a bound governed by the VC dimension), as opposed to ERM, which minimises the error on the training data. It is this difference which equips SVMs with a greater ability to generalise, which is our goal in statistical learning. SVMs were developed to solve the classification problem, but recently they have been extended to the domain of regression problems.
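For reference, the standard form of the bound that SRM minimises, as stated in the general VC-theory literature (this particular tutorial may phrase it differently), is the following: with probability at least 1 − η,

```latex
% Vapnik's VC generalisation bound: with probability at least 1 - \eta,
% the true risk R(f) is bounded by the empirical risk plus a capacity
% term that grows with the VC dimension h and shrinks with sample size n.
R(f) \;\le\; R_{\mathrm{emp}}(f)
      + \sqrt{\frac{h\left(\ln\tfrac{2n}{h} + 1\right) - \ln\tfrac{\eta}{4}}{n}}
```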

2,295 citations

Trending Questions (1)
How does the complexity of SVM compare to other algorithms?

The provided paper does not compare the complexity of SVM to other algorithms.