Open Access
Review on determining number of Cluster in K-Means Clustering
TLDR
This paper explores six approaches to determining the right number of clusters in a dataset, addressing cluster number selection in the context of k-means, a simple and fast clustering technique.
Abstract:
Clustering is widely used in fields such as biology, psychology, and economics. Because the result of clustering varies as the number-of-clusters parameter changes, the main challenge of cluster analysis is that the number of clusters, or the number of model parameters, is seldom known and must be determined before clustering. Several clustering algorithms have been proposed; among them, the k-means method is a simple and fast clustering technique. We address the problem of cluster number selection using a k-means approach. End users could be asked to provide the number of clusters in advance, but this is not feasible because it requires the end user to have domain knowledge of each data set. Many methods are available to estimate the number of clusters, such as statistical indices, variance-based methods, information-theoretic methods, and goodness-of-fit methods. The paper explores six different approaches to determine the right number of clusters in a dataset.
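As an illustration of the variance-based family mentioned in the abstract, the sketch below runs k-means for several candidate values of k and looks for an "elbow" in the within-cluster sum of squares (WCSS). It is a minimal example under stated assumptions, not necessarily one of the six approaches the paper reviews: the synthetic three-blob data, the helper names (`make_blobs`, `kmeans_wcss`, `elbow_k`), and the largest-second-difference rule for locating the elbow are all choices made here for illustration.

```python
import random


def make_blobs(centers, n_per_blob, spread, rng):
    """Draw n_per_blob 2-D Gaussian points around each centre (synthetic data)."""
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n_per_blob)]


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def mean_point(cluster):
    """Centroid of a non-empty list of 2-D points."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


def kmeans_wcss(points, k, rng, restarts=20, max_iter=100):
    """Run Lloyd's k-means from several random starts; return the lowest WCSS found."""
    best = float("inf")
    for _ in range(restarts):
        centroids = rng.sample(points, k)  # start from k distinct data points
        for _ in range(max_iter):
            # Assignment step: attach each point to its nearest centroid.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                clusters[nearest].append(p)
            # Update step: move each centroid to the mean of its cluster
            # (an empty cluster keeps its previous centroid).
            new_centroids = [mean_point(c) if c else centroids[j]
                             for j, c in enumerate(clusters)]
            if new_centroids == centroids:
                break  # assignments no longer change
            centroids = new_centroids
        wcss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
        best = min(best, wcss)
    return best


def elbow_k(wcss_by_k):
    """Pick the k with the largest second difference of WCSS (a simple elbow heuristic).

    Assumes the keys of wcss_by_k are consecutive integers.
    """
    ks = sorted(wcss_by_k)
    scores = {k: wcss_by_k[k - 1] - 2 * wcss_by_k[k] + wcss_by_k[k + 1]
              for k in ks[1:-1]}
    return max(scores, key=scores.get)


rng = random.Random(0)  # fixed seed so the run is repeatable
points = make_blobs([(0, 0), (10, 0), (0, 10)], n_per_blob=30, spread=0.5, rng=rng)
wcss_by_k = {k: kmeans_wcss(points, k, rng) for k in range(1, 7)}
best_k = elbow_k(wcss_by_k)
for k in sorted(wcss_by_k):
    print(k, round(wcss_by_k[k], 1))
print("elbow at k =", best_k)
```

WCSS always tends to shrink as k grows, so the raw minimum is not informative; the heuristic instead looks for the k after which further increases stop paying off. On real data the elbow is often much less sharp than on well-separated synthetic blobs, which is one reason statistical indices and information-theoretic criteria are also used.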
Citations
Journal Article
Joint Communication, Computation, Caching, and Control in Big Data Multi-Access Edge Computing
Anselme Ndikumana, Nguyen H. Tran, Tai Manh Ho, Zhu Han, Walid Saad, Dusit Niyato, Choong Seon Hong
TL;DR: In this paper, the problem of joint computing, caching, communication, and control (4C) in big data MEC is formulated as an optimization problem whose goal is to jointly optimize a linear combination of the bandwidth consumption and network latency.
Journal Article
Forecasting across time series databases using recurrent neural networks on groups of similar series: A clustering approach
TL;DR: This article presents a prediction model that can be used with different types of RNN models on subgroups of similar time series, which are identified by time series clustering techniques.
Journal Article
Estimation of reference evapotranspiration in Brazil with limited meteorological data using ANN and SVM - a new approach.
Lucas Borges Ferreira, Fernando França da Cunha, Rubens Alves de Oliveira, Elpídio Inácio Fernandes Filho
TL;DR: In this article, the authors evaluated the performance of artificial neural network (ANN) and support vector machine (SVM) models for the estimation of daily ETo across the entirety of Brazil using measured data on temperature and relative humidity or only temperature.
Journal Article
Physics-guided convolutional neural network (PhyCNN) for data-driven seismic response modeling
TL;DR: In this paper, a physics-guided convolutional neural network (PhyCNN) is proposed to predict building seismic response in a data-driven fashion without the need for a physics-based analytical/numerical model.
Journal Article
Prediction of Blast-Induced Ground Vibration in an Open-Pit Mine by a Novel Hybrid Model Based on Clustering and Artificial Neural Network
TL;DR: The proposed HKM–ANN model was the best-performing model for estimating PPV caused by blasting operations in this study, and it offers the scientific community and practicing engineers a new computational model for predicting blast-induced PPV with high accuracy.