Proceedings ArticleDOI

Algorithm selection for classification problems

13 Jul 2016 - pp. 203-211
TL;DR: The number of attributes, the number of instances, the number of classes, the maximum class probability, and the class entropy play a major role in classifier accuracy and algorithm selection for the thirty-eight datasets used for experimentation.
Abstract: Many algorithms are available in data mining, machine learning, and pattern recognition for solving the same kind of problem, but there is little guidance on which algorithm will give the best results for the problem at hand. This paper presents an approach to this problem based on meta-learning. Three types of data characteristics are used: simple, information-theoretic, and statistical. Results are generated by running nine different algorithms on thirty-eight benchmark datasets from the UCI repository. The proposed approach uses the K-nearest neighbor algorithm to suggest a suitable algorithm, with classifier accuracy as the basis for the recommendation. Meta-learning allows an accurate method to be recommended for the given data and reduces the cognitive overload of applying each method, comparing it with the others, and then selecting the most suitable one; it therefore supports adaptive learning methods. The experiments show that the predicted accuracies match the actual accuracies for more than 90% of the benchmark datasets. It is concluded that the number of attributes, the number of instances, the number of classes, the maximum class probability, and the class entropy play a major role in classifier accuracy and algorithm selection for the thirty-eight datasets used for experimentation.
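The recommendation pipeline described above, computing a few meta-features per dataset and then running K-nearest neighbor over them, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the meta-features are limited to the ones named in the abstract (number of attributes, number of instances, number of classes, maximum class probability, class entropy), and a meta-dataset of previously evaluated datasets with their best-performing classifier is assumed to be available.

```python
import numpy as np
from collections import Counter
from sklearn.neighbors import NearestNeighbors

def meta_features(X, y):
    """Simple and information-theoretic characteristics named in the abstract."""
    n_instances, n_attributes = X.shape
    counts = np.array(list(Counter(y).values()), dtype=float)
    probs = counts / counts.sum()
    return np.array([
        n_attributes,                      # number of attributes
        n_instances,                       # number of instances
        len(counts),                       # number of classes
        probs.max(),                       # maximum class probability
        -(probs * np.log2(probs)).sum(),   # class entropy
    ])

def recommend(meta_X, best_algorithm, new_X, new_y, k=3):
    """Recommend the algorithm most common among the k nearest datasets.

    meta_X: meta-features of previously evaluated datasets (n_datasets x 5)
    best_algorithm: name of the most accurate classifier on each of them
    """
    nn = NearestNeighbors(n_neighbors=k).fit(meta_X)
    _, idx = nn.kneighbors(meta_features(new_X, new_y).reshape(1, -1))
    votes = Counter(best_algorithm[i] for i in idx[0])
    return votes.most_common(1)[0][0]
```

In practice the meta-features would be scaled before the nearest-neighbor search, and the meta-dataset would typically store the accuracies of all nine candidate classifiers rather than only the single best one.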
Citations
Proceedings ArticleDOI
01 Aug 2018
TL;DR: A review of past and future application domains, sub-domains, and applications of machine learning and deep learning is presented in this paper.
Abstract: Machine learning is one of the major fields in the modern computing world, and plenty of research has been undertaken to make machines intelligent. Learning is a natural human behavior that has also been made an essential aspect of machines, and various techniques have been devised for it. Traditional machine learning algorithms have been applied in many application areas, and researchers have put much effort into improving the accuracy of those machine learning algorithms. Thinking along another dimension led to the concept of deep learning, which is a subset of machine learning. So far only a few applications of deep learning have been explored, and it will certainly help to solve problems in several new application domains and sub-domains. A review of these past and future application domains, sub-domains, and applications of machine learning and deep learning is presented in this paper.

216 citations


Cites methods from "Algorithm selection for classificat..."

  • ...Best Classifier algorithm from Naive Bayes, IBK, J48, Adaboost, LogitBoost, PART, Random Forest, Bagging, and SMO for a given dataset is found with the help of parameters Number of attributes, Number of instances, Number of classes, Kurtosis, Skewness, Maximum Probability, and Entropy [9]....

    [...]

Proceedings ArticleDOI
12 Mar 2020
TL;DR: An automated DDoS detector using machine learning that can run on any commodity hardware and can detect most types of DDoS attacks, such as ICMP flood, TCP flood, and UDP flood.
Abstract: One of the most relentless attacks is the distributed denial-of-service (DDoS) attack. The types of, and tools for, this attack increase day by day as technology advances, so the methodology for detecting DDoS must advance as well. For this purpose we created an automated DDoS detector using machine learning that can run on any commodity hardware; its results are 98.5% accurate. We use three classification algorithms, KNN, RF, and NB, to classify DDoS packets against normal packets using two features: delta time and packet size. The detector can detect most types of DDoS attacks, such as ICMP flood, TCP flood, and UDP flood. Older systems detect only some types of DDoS attacks, some require a large number of features to detect DDoS, and some work only with certain protocols. Our proposed model overcomes these drawbacks by detecting DDoS of any type, without the need for a specific protocol and while using a small number of features.
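The detector described above reduces to a two-feature classification problem. A minimal sketch of training the three classifiers it names (KNN, Random Forest, Naive Bayes) on delta time and packet size is shown below; the file names, labels, and hyperparameters are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# X: one row per packet, columns = [delta_time_seconds, packet_size_bytes]
# y: 1 for DDoS traffic, 0 for normal traffic (labels assumed to be available)
X = np.loadtxt("packets.csv", delimiter=",")                  # hypothetical file
y = np.loadtxt("labels.csv", delimiter=",").astype(int)       # hypothetical file

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("RF", RandomForestClassifier(n_estimators=100, random_state=0)),
                  ("NB", GaussianNB())]:
    clf.fit(X_tr, y_tr)
    print(name, "accuracy:", clf.score(X_te, y_te))
```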

33 citations


Cites background or methods from "Algorithm selection for classificat..."

  • ...The required data can be collected by using pyshark, an interface for TShark which allows a Python program to communicate with Wireshark directly [11]....

    [...]

  • ...From [11] it is useful to discover the various types of algorithms and their variations....

    [...]
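The first excerpt above mentions collecting the packet data with pyshark, a Python wrapper around TShark. A hedged sketch of extracting the two features used by that detector (inter-arrival delta time and packet size) from a capture file is given below; the capture file name is illustrative, and the pyshark calls used (FileCapture, sniff_time, length) follow its usual interface but should be checked against the installed version.

```python
import pyshark

def extract_features(pcap_path):
    """Return (delta_time, packet_size) pairs for each packet in a capture."""
    capture = pyshark.FileCapture(pcap_path)
    features, previous_time = [], None
    for packet in capture:
        timestamp = packet.sniff_time                   # capture time as a datetime
        delta = 0.0 if previous_time is None else (timestamp - previous_time).total_seconds()
        features.append((delta, int(packet.length)))    # frame length in bytes
        previous_time = timestamp
    capture.close()
    return features

rows = extract_features("traffic.pcap")                 # hypothetical capture file
```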

Proceedings ArticleDOI
10 Jan 2021
TL;DR: In this paper, the authors propose a classification-based machine learning approach for the detection of DDoS attacks in cloud computing, with the help of three classification machine learning algorithms: K Nearest Neighbor, Random Forest, and Naive Bayes.
Abstract: A distributed denial-of-service (DDoS) attack is a network security attack, and attackers have now intruded into almost every technology, such as cloud computing, IoT, and edge computing, to make themselves stronger. By the nature of DDoS, all available resources, such as memory, CPU, or even the entire network, are consumed by the attacker in order to shut down the victim's machine or server. Although plenty of defensive mechanisms have been proposed, they are not efficient, because attackers keep training themselves with newly available automated attacking tools. We therefore propose a classification-based machine learning approach for the detection of DDoS attacks in cloud computing. With the help of three classification machine learning algorithms, K Nearest Neighbor, Random Forest, and Naive Bayes, the mechanism can detect a DDoS attack with an accuracy of 99.76%.
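Since this work again compares K Nearest Neighbor, Random Forest, and Naive Bayes, a short sketch of choosing among them by cross-validated accuracy is given below. It is only an illustration of the model-selection step; the feature matrix, labels, and fold count are assumptions, not the authors' setup.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

def pick_best_classifier(X, y):
    """Return the name of the classifier with the highest mean 5-fold CV accuracy."""
    candidates = {
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
        "NaiveBayes": GaussianNB(),
    }
    scores = {name: cross_val_score(clf, X, y, cv=5).mean()
              for name, clf in candidates.items()}
    return max(scores, key=scores.get), scores
```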

20 citations

Journal ArticleDOI
TL;DR: Results on 5 performance measures indicate that the proposed link prediction-based recommendation method is more effective than the baseline classification algorithm recommendation method and can be used in practice.
Abstract: Recommending appropriate classification algorithm(s) for a given classification problem is of great significance and is also one of the challenging problems in the field of data mining; it is usually viewed as a meta-learning problem. Multi-label learning has been adopted and validated as an effective meta-learning method for classification algorithm recommendation. However, the multi-label learning method used in previous classification algorithm recommendation relies only on the relationship between data sets and their direct neighbours, ignoring the impact of other data sets. In this paper, a new classification algorithm recommendation method based on link prediction between data sets and classification algorithms is proposed. Taking advantage of link prediction in heterogeneous networks, this method considers the impact of all data sets and makes full use of the interactions between data sets as well as between data sets and algorithms. Firstly, meta data of the training data sets is collected. Then a heterogeneous network called the DAR (Data and Algorithm Relationship) Network is constructed from the meta data. Finally, the link prediction technique is adopted to recommend appropriate algorithm(s) for a given data set on the basis of the DAR Network. To evaluate the proposed link prediction-based recommendation method, extensive experiments with 131 data sets and 21 classification algorithms are conducted. Results on 5 performance measures indicate that the proposed method is more effective than the baseline classification algorithm recommendation method and can be used in practice.
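The DAR Network idea can be illustrated with a toy sketch: dataset nodes are linked to the algorithms that perform well on them and to other datasets with similar meta-features, and a candidate dataset-algorithm link is scored by counting short paths between the two nodes. This common-neighbour style heuristic over assumed inputs is only an illustration, not the link-prediction model evaluated in the paper.

```python
import networkx as nx

def build_dar_network(good_algorithms, similar_datasets):
    """good_algorithms: {dataset: [algorithms that perform well on it]}
    similar_datasets: [(dataset_a, dataset_b)] pairs with similar meta-features."""
    g = nx.Graph()
    for ds, algos in good_algorithms.items():
        for algo in algos:
            g.add_edge(("data", ds), ("algo", algo))      # dataset-algorithm links
    for a, b in similar_datasets:
        g.add_edge(("data", a), ("data", b))              # dataset-dataset links
    return g

def score_link(g, dataset, algorithm):
    """Count length-2 paths between a dataset node and an algorithm node."""
    ds_node, algo_node = ("data", dataset), ("algo", algorithm)
    if ds_node not in g or algo_node not in g:
        return 0
    return sum(1 for n in g.neighbors(ds_node) if g.has_edge(n, algo_node))
```

Ranking all algorithms by this score for a new dataset yields a crude recommendation list; the actual method applies a proper heterogeneous-network link-prediction technique over the same structure.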

19 citations

Journal ArticleDOI
TL;DR: The proposed EML method can automatically recommend different numbers of appropriate algorithms for different datasets, rather than specifying a fixed number of appropriate algorithm(s) as done by the ML-KNN, SLP-based and OBOE methods.
Abstract: With the mountains of classification algorithms proposed in the literature, the study of how to select suitable classifier(s) for a given problem is important and practical. Existing methods rely on a single learner built on one type of meta-features, or on a simple combination of several types of meta-features, to address this problem. In this paper, we propose a two-layer classification algorithm recommendation method called EML (Ensemble of ML-KNN for classification algorithm recommendation) to leverage the diversity of different sets of meta-features. The proposed method can automatically recommend different numbers of appropriate algorithms for different datasets, rather than specifying a fixed number of appropriate algorithm(s) as done by the ML-KNN, SLP-based and OBOE methods. Experimental results on 183 public datasets show the effectiveness of the EML method compared to the three baseline methods.
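A hedged sketch of the two-layer idea, one nearest-neighbour multi-label learner per meta-feature set whose votes are averaged and thresholded, is shown below. It uses plain k-NN voting in place of the actual ML-KNN implementation, and the 0.5 threshold is an assumption; its effect is that different datasets can receive different numbers of recommended algorithms.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def recommend_algorithms(feature_sets, labels, new_meta, k=5, threshold=0.5):
    """feature_sets: list of meta-feature matrices, one per meta-feature type.
    labels: binary matrix, labels[i, j] = 1 if algorithm j suits dataset i.
    new_meta: the new dataset's vector for each meta-feature type."""
    votes = np.zeros(labels.shape[1])
    for meta_X, x in zip(feature_sets, new_meta):
        nn = NearestNeighbors(n_neighbors=k).fit(meta_X)
        _, idx = nn.kneighbors(np.asarray(x).reshape(1, -1))
        votes += labels[idx[0]].mean(axis=0)   # fraction of neighbours recommending each algorithm
    votes /= len(feature_sets)                 # average over the meta-feature sets
    return np.flatnonzero(votes >= threshold)  # indices of recommended algorithms
```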

19 citations

References
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the splitting, and the ideas are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, 1996, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
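Two practical points from this abstract, that the number of features tried at each split matters and that internal (out-of-bag) estimates can monitor error and variable importance, can be illustrated with scikit-learn's random forest. The dataset and parameter values are illustrative assumptions, and the out-of-bag score is scikit-learn's internal error estimate rather than Breiman's original strength/correlation estimates.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Vary the number of features considered at each split and watch the
# out-of-bag (internal) accuracy estimate respond.
for max_features in (1, 2, "sqrt", None):
    forest = RandomForestClassifier(n_estimators=300, max_features=max_features,
                                    oob_score=True, random_state=0)
    forest.fit(X, y)
    print(max_features, "OOB accuracy:", round(forest.oob_score_, 3))

# Internal estimate of variable importance from the last fitted forest.
print("variable importances:", forest.feature_importances_)
```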

79,257 citations

Book
15 Oct 1992
TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.
Abstract: From the Publisher: Classifier systems play a major role in machine learning and knowledge-based systems, and Ross Quinlan's work on ID3 and C4.5 is widely acknowledged to have made some of the most significant contributions to their development. This book is a complete guide to the C4.5 system as implemented in C for the UNIX environment. It contains a comprehensive guide to the system's use, the source code (about 8,800 lines), and implementation notes. The source code and sample datasets are also available on a 3.5-inch floppy diskette for a Sun workstation. C4.5 starts with large sets of cases belonging to known classes. The cases, described by any mixture of nominal and numeric properties, are scrutinized for patterns that allow the classes to be reliably discriminated. These patterns are then expressed as models, in the form of decision trees or sets of if-then rules, that can be used to classify new cases, with emphasis on making the models understandable as well as accurate. The system has been applied successfully to tasks involving tens of thousands of cases described by hundreds of properties. The book starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting. Advantages and disadvantages of the C4.5 approach are discussed and illustrated with several case studies. This book and software should be of interest to developers of classification-based intelligent systems and to students in machine learning and expert systems courses.
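C4.5 itself is distributed as C source code, but the tree-to-readable-rules workflow it popularized can be approximated in a few lines with scikit-learn. Note this is only an approximation: scikit-learn's tree is CART-style with entropy or Gini splits, not C4.5's gain-ratio trees with rule post-pruning, so the sketch illustrates the idea rather than Quinlan's system.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
feature_names = ["sepal length", "sepal width", "petal length", "petal width"]

tree = DecisionTreeClassifier(criterion="entropy", min_samples_leaf=5, random_state=0)
tree.fit(X, y)

# Print the learned tree as nested if-then tests, reflecting the book's
# emphasis on models that are understandable as well as accurate.
print(export_text(tree, feature_names=feature_names))
```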

21,674 citations

Journal ArticleDOI
TL;DR: This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Abstract: More than twelve years have elapsed since the first public release of WEKA. In that time, the software has been rewritten entirely from scratch, has evolved substantially, and now accompanies a text on data mining [35]. These days, WEKA enjoys widespread acceptance in both academia and business, has an active community, and has been downloaded more than 1.4 million times since being placed on SourceForge in April 2000. This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

19,603 citations

Journal ArticleDOI
TL;DR: The nearest neighbor decision rule assigns to an unclassified sample point the classification of the nearest of a set of previously classified points, so it may be said that half the classification information in an infinite sample set is contained in the nearest neighbor.
Abstract: The nearest neighbor decision rule assigns to an unclassified sample point the classification of the nearest of a set of previously classified points. This rule is independent of the underlying joint distribution on the sample points and their classifications, and hence the probability of error R of such a rule must be at least as great as the Bayes probability of error R^{\ast}, the minimum probability of error over all decision rules taking the underlying probability structure into account. However, in a large-sample analysis, we will show in the M-category case that R^{\ast} \leq R \leq R^{\ast}\left(2 - \frac{M R^{\ast}}{M-1}\right), where these bounds are the tightest possible, for all suitably smooth underlying distributions. Thus for any number of categories, the probability of error of the nearest neighbor rule is bounded above by twice the Bayes probability of error. In this sense, it may be said that half the classification information in an infinite sample set is contained in the nearest neighbor.
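For reference, the bound stated in the abstract and its familiar two-class specialization are, in the same notation:

```latex
% Cover–Hart asymptotic bound for the M-category nearest neighbor rule
R^{*} \;\le\; R \;\le\; R^{*}\!\left(2 - \frac{M R^{*}}{M - 1}\right)

% For M = 2 classes this reduces to
R \;\le\; 2 R^{*}\bigl(1 - R^{*}\bigr) \;\le\; 2 R^{*}
```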

12,243 citations

Journal ArticleDOI
TL;DR: This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.
Abstract: Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data and then taking a weighted majority vote of the sequence of classifiers thus produced. For many classification algorithms, this simple strategy results in dramatic improvements in performance. We show that this seemingly mysterious phenomenon can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood. For the two-class problem, boosting can be viewed as an approximation to additive modeling on the logistic scale using maximum Bernoulli likelihood as a criterion. We develop more direct approximations and show that they exhibit nearly identical results to boosting. Direct multiclass generalizations based on multinomial likelihood are derived that exhibit performance comparable to other recently proposed multiclass generalizations of boosting in most situations, and far superior in some. We suggest a minor modification to boosting that can reduce computation, often by factors of 10 to 50. Finally, we apply these insights to produce an alternative formulation of boosting decision trees. This approach, based on best-first truncated tree induction, often leads to better performance, and can provide interpretable descriptions of the aggregate decision rule. It is also much faster computationally, making it more suitable to large-scale data mining applications.
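The additive-model-on-the-logistic-scale interpretation described above can be stated compactly. The following is the standard Friedman-Hastie-Tibshirani formulation for the two-class case, written here for reference rather than quoted from the paper.

```latex
% Additive model built by boosting, evaluated on the logistic scale
F(x) \;=\; \sum_{m=1}^{M} f_m(x),
\qquad
p(x) \;=\; \Pr(y = 1 \mid x) \;=\; \frac{e^{F(x)}}{e^{F(x)} + e^{-F(x)}}

% Each stage chooses f_m to (approximately) maximize the expected Bernoulli
% log-likelihood, with y^{*} = (y + 1)/2 \in \{0, 1\}:
\ell(F) \;=\; \mathbb{E}\!\left[\, y^{*} \log p(x) + (1 - y^{*}) \log\bigl(1 - p(x)\bigr) \right]
```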

6,598 citations