Unsupervised K-Means Clustering Algorithm
TLDR
An unsupervised learning schema is constructed for the k-means algorithm so that it requires neither initialization nor parameter selection and simultaneously finds an optimal number of clusters.
Abstract:
The k-means algorithm is generally the best-known and most widely used clustering method, and various extensions of it have been proposed in the literature. Although k-means is regarded as an unsupervised approach to clustering in pattern recognition and machine learning, the algorithm and its extensions are always affected by initialization and require the number of clusters to be given a priori. In that sense, k-means is not strictly an unsupervised clustering method. In this paper, we construct an unsupervised learning schema for the k-means algorithm so that it requires neither initialization nor parameter selection and simultaneously finds an optimal number of clusters. That is, we propose a novel unsupervised k-means (U-k-means) clustering algorithm that automatically finds an optimal number of clusters without any initialization or parameter selection. The computational complexity of the proposed U-k-means algorithm is also analyzed, and the algorithm is compared with other existing methods. The experimental results and comparisons demonstrate these advantages of the proposed U-k-means clustering algorithm.
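To make the dependence on initialization and on a preset number of clusters concrete, here is a minimal sketch of standard (Lloyd-style) k-means in plain Python. It is not the U-k-means algorithm proposed in the paper, whose details are not given in this abstract. The names `kmeans`, `sq_dist`, `mean`, and the toy `points` data are hypothetical and chosen only for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def kmeans(points, k, seed, max_iter=100):
    """Standard k-means: the caller must supply k (with k <= len(points))
    and a seed that determines the initial centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialization: k distinct data points
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [mean(cl) if cl else centers[j]
                       for j, cl in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    # Within-cluster sum of squared errors for the final centers.
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse

# Toy data (an assumption for this sketch): three tight groups of four points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (0, 10), (0, 11), (1, 10), (1, 11)]

# Different seeds can converge to different solutions for the same k,
# and the error keeps shrinking as k grows, so it cannot select k by itself.
for k in (1, 2, 3, 4):
    errors = [kmeans(points, k, seed)[1] for seed in range(5)]
    print(k, [round(e, 2) for e in errors])
```

Running the loop prints the final error for each `k` under several seeds. Any spread within a row reflects sensitivity to initialization, and the downward trend across rows is the reason the number of clusters normally has to be chosen by some external criterion.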
Citations
Journal Article
A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects
Ezugwu E. Absalom, Abiodun Motunrayo Ikotun, Olaide Nathaniel Oyelade, Laith Abualigah, Jeffrey O. Agushaka, Christopher Ifeanyi Eke, Andronicus Ayobami Akinyelu, et al.
TL;DR: Clustering is an essential tool in data mining research and applications, and it is the subject of active research in many fields of study, such as computer science, data science, statistics, pattern recognition, artificial intelligence, and machine learning.
Journal Article
Assessment of global health risk of antibiotic resistance genes
Zhenyan Zhang, Qi Zhang, Ting-zhang Wang, Nuohan Xu, Tao Liu, Wenjie Hong, Josep Peñuelas, Michael R. Gillings, Mei-Xia Wang, Wenwen Gao, Haifeng Qian
TL;DR: In this article, an analysis at the metagenomic level across 6 types of habitats (4572 samples) detects 2561 ARGs that collectively confer resistance to 24 classes of antibiotics.
Journal Article
From clustering to clustering ensemble selection: A review
TL;DR: Clustering ensemble is a knowledge reuse approach to the challenges inherent in clustering. It seeks results of high stability and robustness by combining the solutions computed by base clustering algorithms, without access to the features.
Journal Article
Block Hunter: Federated Learning for Cyber Threat Hunting in Blockchain-Based IIoT Networks
Abbas Yazdinejad, Ali Dehghantanha, Reza M. Parizi, Mohammad Hammoudeh, Hadis Karimipour, Gautam Srivastava
TL;DR: This article uses federated learning to build a threat hunting framework called Block Hunter that automatically hunts for attacks in blockchain-based IIoT networks, and reports that Block Hunter detects anomalous activities with high accuracy and minimal required bandwidth.
References
Some methods for classification and analysis of multivariate observations
TL;DR: The paper describes the k-means process, which partitions an N-dimensional population into k sets on the basis of a sample. The k-means concept generalizes the ordinary sample mean, and the process is shown to give partitions that are reasonably efficient in the sense of within-class variance.
Dissertation
Learning Multiple Layers of Features from Tiny Images
TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images, using a dataset of millions of tiny colour images.
Journal Article
Silhouettes: a graphical aid to the interpretation and validation of cluster analysis
TL;DR: A new graphical display is proposed for partitioning techniques, where each cluster is represented by a so-called silhouette, which is based on the comparison of its tightness and separation, and provides an evaluation of clustering validity.
Book
Finding Groups in Data: An Introduction to Cluster Analysis