Open Access Proceedings Article

KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework

TLDR
The aim of this paper is to present three new aspects of KEEL: KEEL-dataset, a data set repository which includes the data set partitions in the KEEL format; some guidelines for including new algorithms in KEEL, helping researchers to compare their results with the many approaches already included within the KEEL software; and a module of statistical procedures for contrasting the results obtained in an experimental study.
Abstract
The KEEL (Knowledge Extraction based on Evolutionary Learning) tool is an open source software package that supports data management and includes a designer of experiments. KEEL pays special attention to the implementation of evolutionary learning and soft computing based techniques for Data Mining problems, including regression, classification, clustering, pattern mining and so on. The aim of this paper is to present three new aspects of KEEL: KEEL-dataset, a data set repository which includes the data set partitions in the KEEL format and shows some results of algorithms on these data sets; some guidelines for including new algorithms in KEEL, helping researchers to make their methods easily accessible to other authors and to compare the results of many approaches already included within the KEEL software; and a module of statistical procedures developed to provide the researcher with a suitable tool to contrast the results obtained in any experimental study. A case study is given to illustrate a complete application within this experimental analysis framework.
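The abstract describes the statistical module only at a high level. As a rough illustration of the kind of multiple-comparison correction such a module might provide, the Python sketch below implements Holm's step-down procedure for a set of p-values. It is not taken from KEEL: the choice of Holm's procedure, the function names `holm_adjust` and `holm_reject`, and the example p-values are all assumptions made for illustration.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values, in the same order as the input.

    The i-th smallest p-value (i = 0, 1, ...) is multiplied by (m - i),
    capped at 1, and forced to be non-decreasing along the sorted order.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: once a hypothesis is retained,
        # all hypotheses with larger p-values are retained too.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    return [p <= alpha for p in holm_adjust(p_values)]


# Hypothetical unadjusted p-values from comparing a control method
# against four other methods (illustrative numbers only).
example_p = [0.01, 0.04, 0.03, 0.005]
print(holm_reject(example_p))  # [True, False, False, True]
```

With these made-up values, only the two smallest p-values remain significant at the 0.05 level after correction, which shows why unadjusted pairwise p-values can overstate the number of significant differences.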

