Open Access, Journal Article

Unsupervised K-Means Clustering Algorithm

Kristina P. Sinaga, +1 more
20 Apr 2020, Vol. 8, pp. 80716-80727
TLDR
An unsupervised learning schema is constructed for the k-means algorithm so that it requires neither initialization nor parameter selection and can simultaneously find an optimal number of clusters.
Abstract
The k-means algorithm is generally the best-known and most widely used clustering method, and various extensions of it have been proposed in the literature. Although k-means is regarded as an unsupervised approach to clustering in pattern recognition and machine learning, the algorithm and its extensions are always influenced by initialization and require the number of clusters to be specified a priori. That is, the k-means algorithm is not exactly an unsupervised clustering method. In this paper, we construct an unsupervised learning schema for the k-means algorithm so that it is free of initialization and parameter selection and can simultaneously find an optimal number of clusters. That is, we propose a novel unsupervised k-means (U-k-means) clustering algorithm that automatically finds an optimal number of clusters without requiring any initialization or parameter selection. The computational complexity of the proposed U-k-means clustering algorithm is also analyzed. Comparisons between the proposed U-k-means and other existing methods are made. The experimental results and comparisons demonstrate these advantages of the proposed U-k-means clustering algorithm.
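As a rough illustration of the dependence described above, the sketch below implements standard (Lloyd-style) k-means, not the proposed U-k-means. The function name `kmeans`, the `seed` parameter, and the toy data are assumptions made for illustration only; the point is simply that both the number of clusters `k` and the initial centers must be supplied before the algorithm can run.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed, iters=100):
    """Standard k-means sketch; k and the seed are chosen by the caller."""
    rng = random.Random(seed)
    # Initialization: pick k data points as starting centers (seed-dependent).
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(xs) / len(members) for xs in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Sum of squared distances from each point to its nearest center.
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Hypothetical toy data: three loose groups in the plane.
toy_points = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.3),
    (9.0, 0.2), (9.3, 0.0), (8.8, 0.4),
]

# The result depends on both user-supplied choices: k and the seed.
for k in (2, 3, 4):
    for seed in (0, 1, 2):
        _, sse = kmeans(toy_points, k, seed)
        print(f"k={k} seed={seed} sse={sse:.3f}")
```

Running the loop with different values of `k` and `seed` shows how the clustering objective changes with choices the user has to make up front, which is the limitation the abstract says U-k-means is designed to remove.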

