Book ChapterDOI

Cluster Analysis on Different Data Sets Using K-Modes and K-Prototype Algorithms

TLDR
This chapter implements algorithms that extend k-means to categorical domains, using a Modified k-modes algorithm, and to domains with mixed categorical and numerical values, using the k-prototypes algorithm.
Abstract
The k-means algorithm is well known for its efficiency in clustering large data sets, but it is restricted to numerical data. Real-world data, however, usually mix objects of several data types. In this paper we implement algorithms that extend k-means to categorical domains, using the Modified k-modes algorithm, and to domains with mixed categorical and numerical values, using the k-prototypes algorithm. The Modified k-modes algorithm makes three changes to k-means: it uses a simple matching dissimilarity measure for categorical data, it replaces the means of clusters with modes, and it uses a frequency-based method to find the modes. The k-prototypes algorithm is implemented by integrating the Incremental k-means and the Modified k-modes partition clustering algorithms. All of these algorithms reduce the value of the cost function.
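The three k-modes ingredients named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's implementation: it uses a plain batch update rather than the "Modified" or "Incremental" variants, whose details the abstract does not give. The function names and the `gamma` weight in `mixed_dissimilarity` are assumptions chosen for illustration; the weighted sum of squared Euclidean distance and matching dissimilarity follows the usual k-prototypes formulation.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Simple matching: number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Frequency-based mode: the most frequent category of each attribute."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(records, initial_modes, max_iter=100):
    """Batch k-modes sketch (assumed variant): assign, then recompute modes."""
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iter):
        # Assign each record to the nearest mode under simple matching.
        new_labels = [
            min(range(len(modes)),
                key=lambda j: matching_dissimilarity(r, modes[j]))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes are stable too
        labels = new_labels
        # Replace each cluster's mode by the per-attribute most frequent value.
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = cluster_mode(members)
    # Cost function: total dissimilarity of records to their cluster modes.
    cost = sum(matching_dissimilarity(r, modes[lab])
               for r, lab in zip(records, labels))
    return labels, modes, cost


def mixed_dissimilarity(num_a, cat_a, num_b, cat_b, gamma=1.0):
    """k-prototypes style distance: squared Euclidean on numeric attributes
    plus gamma times simple matching on categorical attributes.
    gamma is an assumed, user-chosen weight."""
    squared = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    return squared + gamma * matching_dissimilarity(cat_a, cat_b)


records = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
    ("blue", "medium", "square"),
]
labels, modes, cost = k_modes(records, [records[0], records[3]])
```

Each pass can only keep or lower the total matching dissimilarity, which is the sense in which such algorithms reduce the cost function. In practice the initial modes and `gamma` have to be chosen for the data at hand.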


Citations
Proceedings ArticleDOI

A Survey of Clustering Techniques for Big Data Analysis

TL;DR: This paper discusses some of the current clustering techniques for big data mining, analyzes them comprehensively, and identifies an appropriate clustering algorithm.
Journal ArticleDOI

Performance Analysis of Partition and Evolutionary Clustering Methods on Various Cluster Validation Criteria

TL;DR: Traditional clustering approaches (leader, K-means, ISODATA) and evolutionary approaches (genetic algorithm, particle swarm optimization, social group optimization) are implemented on a benchmark data set; the performance analysis shows that the convergence rate of the evolutionary clustering methods is better than that of the partition clustering methods.
Book ChapterDOI

Survey on Clustering Algorithms for Unstructured Data

TL;DR: This chapter studies existing clustering methods that are suitable for large, semi-structured, and unstructured data, and how the same algorithms can be applied in a distributed environment such as Hadoop.
Journal ArticleDOI

A mixed attributes oriented dynamic SOM fuzzy cluster algorithm for mobile user classification

TL;DR: The user mean membership threshold is defined which is an indicator to determine whether different groups need to be added in the mixed attribute variables of the D-SOMFCM-OMA clustering algorithm.
Book ChapterDOI

K-Means Clustering with Neural Networks for ATM Cash Repository Prediction

TL;DR: The main objective of this paper is to forecast cash demand on the NN5 data with neural networks; the root mean square error indicates that applying clustering before the neural network increases the precision of ATM cash repository forecasts.
References
Book

Data Mining: Concepts and Techniques

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Book

Data Mining: Practical Machine Learning Tools and Techniques

TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Book

Introduction to Data Mining

TL;DR: This book discusses data mining through the lens of cluster analysis, which examines the relationships between data, clusters, and algorithms, and some of the techniques used to solve these problems.
Journal ArticleDOI

Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values

TL;DR: Two algorithms which extend the k-means algorithm to categorical domains and domains with mixed numeric and categorical values are presented and are shown to be efficient when clustering large data sets, which is critical to data mining applications.