Open Access Journal Article

Top 10 algorithms in data mining

Abstract
This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.


DATA MINING FOR THE DISCOVERY OF PATTERNS IN
RESPIRATORY DISEASES IN BOGOTÁ, COLOMBIA
ERIKA ANDREA ROJAS GUTIÉRREZ
JUAN SEBASTIÁN AGUILAR
UNIVERSIDAD CATÓLICA DE COLOMBIA
FACULTY OF ENGINEERING
SYSTEMS ENGINEERING PROGRAM
RESEARCH PROJECT MODALITY
BOGOTÁ D.C., COLOMBIA
2017

DATA MINING FOR THE DISCOVERY OF PATTERNS IN
RESPIRATORY DISEASES IN BOGOTÁ, COLOMBIA
ERIKA ANDREA ROJAS GUTIÉRREZ
JUAN SEBASTIÁN AGUILAR
UNDERGRADUATE THESIS SUBMITTED FOR THE DEGREE OF
SYSTEMS ENGINEER
Advisor:
JOHN ALEXANDER VELANDIA
Title (Prof. M.Sc. Eng.)
UNIVERSIDAD CATÓLICA DE COLOMBIA
FACULTY OF ENGINEERING
SYSTEMS ENGINEERING AND COMPUTING PROGRAM
RESEARCH PROJECT
BOGOTÁ D.C., COLOMBIA
2017

Acceptance Note
_______________________________________
_______________________________________
_______________________________________
_______________________________________
_______________________________________
_________________________________________
Juror
______________________________________
John Alexander Velandia Vega
Advisor
______________________________________
Methodological Reviewer
Bogotá, November 14, 2017

ACKNOWLEDGMENTS
To God, for guiding us in every goal we set for ourselves.
To our parents, for their constant and unconditional support throughout our
studies and in our lives.
To our thesis advisor, Professor John Alexander Velandia, for guiding us
during the development of this knowledge-generating project.
To our classmates and professors at the Universidad Católica de Colombia,
who have left lasting lessons in our lives.
To Professor Raúl Menéndez, for his help in gaining access to the data
sources.
