
Showing papers on "Davies–Bouldin index published in 2020"


Proceedings ArticleDOI
11 Mar 2020
TL;DR: The paper reports the findings of applying K-Means clustering to a cereal dataset and compares the outcomes across candidate cluster counts, to identify whether the ideal number of clusters is 3 or 5.
Abstract: Cereal grains have been a principal ingredient of the human diet for hundreds of years, and Indian cereal crops provide vital nutrients and energy. The purpose of this paper is to report the findings of applying K-Means clustering to a cereal dataset and to compare the outcomes across candidate cluster counts, in order to identify whether the ideal number of clusters is 3 or 5. This question is settled by applying a variety of clustering tests (also reproduced in the paper) and visualizations: exploratory analysis, then model fitting, followed by testing of the results, leading to a definite conclusion. The language used for the study is R.

25 citations
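The paper's analysis was done in R, but the comparison it describes (scoring candidate cluster counts on the same data) can be illustrated with a short sketch. The data, assignments, and centroids below are invented for illustration only; the score is the total within-cluster sum of squares, the quantity an elbow-style test compares across candidate numbers of clusters:

```python
def wss(points, centroids, labels):
    """Total within-cluster sum of squares for 1-D data: the quantity
    an elbow-style test compares across candidate cluster counts."""
    return sum((p - centroids[l]) ** 2 for p, l in zip(points, labels))

# Toy 1-D data with hypothetical assignments and centroids for k = 3:
points = [1.0, 1.5, 2.0, 5.0, 5.5, 9.0]
labels = [0, 0, 0, 1, 1, 2]
centroids = [1.5, 5.25, 9.0]
print(wss(points, centroids, labels))  # 0.625
```

Repeating this for each candidate k (here, 3 and 5) and plotting the scores is the standard way to visualize where adding more clusters stops paying off.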


Journal ArticleDOI
01 Jan 2020
TL;DR: The X-Means algorithm, evaluated with the Davies-Bouldin Index to determine the number of centroids, is modified to perform repeated centroid determination; it converges in 11 iterations and produces clusters whose members have a good level of similarity.
Abstract: Clustering groups data into clusters such that data within one cluster have maximum similarity while data in different clusters have minimum similarity. X-Means clustering addresses one of the main weaknesses of K-Means: the need for prior knowledge of the number of clusters (K). In X-Means, the value of K is estimated in an unsupervised way, based only on the dataset itself. In this study, the X-Means algorithm was combined with Davies-Bouldin Index (DBI) evaluation to determine the number of centroids; the method was modified to perform repeated centroid determination and converged in 11 iterations, producing clusters whose members have a good level of similarity. In determining the number of centroids, the Davies-Bouldin Index showed that a configuration of 2 clusters attains the minimum, with a DBI value close to 0.

16 citations
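The Davies-Bouldin Index used for evaluation here has a direct definition: for each cluster, take the worst-case ratio of summed intra-cluster scatter to between-centroid distance, then average over clusters. A minimal pure-Python sketch on invented toy data (not the study's dataset):

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def davies_bouldin(points, labels, k):
    """Davies-Bouldin index: mean over clusters i of the worst-case
    ratio (S_i + S_j) / M_ij, where S is the mean distance of a
    cluster's members to its centroid and M is the distance between
    centroids.  Lower values (closer to 0) indicate compact,
    well-separated clusters."""
    members = [[p for p, l in zip(points, labels) if l == i] for i in range(k)]
    centroids = [tuple(sum(dim) / len(m) for dim in zip(*m)) for m in members]
    scatter = [sum(dist(p, c) for p in m) / len(m)
               for m, c in zip(members, centroids)]
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                     for j in range(k) if j != i)
    return total / k

# Two tight, well-separated clusters give a DBI close to 0.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(davies_bouldin(pts, [0, 0, 0, 1, 1, 1], 2))
```

This is why the abstracts above report a DBI "close to 0" as the best outcome: the index is a ratio of scatter to separation, so it shrinks as clusters tighten and move apart.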


Journal ArticleDOI
TL;DR: This paper proposes to cluster and identify similar trajectories based on the paths traversed by moving objects, using a graph model with two phases: graph generation and clustering.

16 citations


Journal ArticleDOI
TL;DR: The improved clustering algorithm is more stable than the original K-Means because the labels of each dataset do not change, and it raises the precision, recall, and accuracy of the automatic learning-style detection model proposed in this study.
Abstract: A Learning Management System (LMS) may be well designed and operated by an excellent teaching team, yet it does not account for the needs and characteristics of each student's learning style. LMSs do not yet provide a feature to detect student diversity, but they do keep a record of student learning activities in log files. This study proposes a model for detecting students' learning styles from log-file data, consisting of four processes. The first is pre-processing, which extracts 29 features used as input to clustering. The second is clustering with a modified K-Means algorithm to label each test data set before classification. The third is detecting the learning style of each data set with the Naive Bayes classification algorithm, and the last is analyzing the performance of the proposed model. Test results using the Davies-Bouldin Index (DBI) validity measure show that the modified K-Means algorithm achieved a DBI of 2.54, against 2.39 for the original K-Means. Besides its validity, the modification makes the algorithm more stable than the original K-Means, because the labels of each dataset do not change. The improved clustering also raises the precision, recall, and accuracy of the proposed automatic learning-style detection model: average precision rises from 65.42% to 71.09%, recall from 72.09% to 80.23%, and accuracy from 67.06% to 71.60%.

6 citations
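The precision, recall, and accuracy figures reported above follow the standard confusion-matrix definitions; a minimal sketch with hypothetical counts (not the study's data):

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, and accuracy from confusion-matrix counts:
    tp/fp = true/false positives, fn/tn = false/true negatives."""
    precision = tp / (tp + fp)          # of predicted positives, how many were right
    recall = tp / (tp + fn)             # of actual positives, how many were found
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # overall fraction correct
    return precision, recall, accuracy

# Hypothetical counts for one detected learning-style class:
p, r, a = binary_metrics(tp=8, fp=2, fn=2, tn=8)
print(f"precision={p:.2%} recall={r:.2%} accuracy={a:.2%}")
```

In a multi-class setting such as learning-style detection, these per-class values are typically averaged across classes, which is presumably what the "average precision" figure above refers to.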


Journal ArticleDOI
17 Jul 2020
TL;DR: The results of this study indicate that K-Medoids clustering produces a good clustering, with a Davies-Bouldin index of 0.407029478.
Abstract: Higher-education institutions need to analyze their data every semester, because the data grows large and uncontrolled; on the other hand, most institutions do not yet have a data warehouse or big-data analytics, which makes it hard to maintain data quality and to estimate accreditation outcomes. The purpose of this study is to apply K-Medoids clustering, weighting the matrix according to higher-education accreditation criteria, to the last 3 years of data: length of study, average GPA, student-to-lecturer ratio, and number of lecturers per study program, so that cluster results can be predicted accurately. The results of this study indicate that K-Medoids clustering produces a good clustering, with a Davies-Bouldin index of 0.407029478.

4 citations


Proceedings ArticleDOI
01 Feb 2020
TL;DR: Experiments comparing the performance of the K-Medoids clustering algorithm under Euclidean, Manhattan, and Chebyshev distance functions show that Manhattan distance and Euclidean distance, with a Davies-Bouldin index of 0.050, are superior.
Abstract: The clustering task aims to assign each observation to a cluster in such a way that observations within a cluster are more homogeneous with one another than with those in other clusters. Its wide application across many research fields has motivated researchers to propose a plethora of clustering algorithms. K-Medoids is a prominent clustering algorithm that improves on its predecessor, K-Means. Despite being widely used and less sensitive to noise and outliers, the performance of K-Medoids is affected by the choice of distance function. This paper presents experimental findings comparing the performance of the K-Medoids clustering algorithm under the Euclidean, Manhattan, and Chebyshev distance functions. The algorithm was tested on the village status dataset from Gorontalo Province, Indonesia, with execution time and the Davies-Bouldin Index as performance metrics. Experiment results showed that Manhattan distance and Euclidean distance, with a Davies-Bouldin index of 0.050, were superior.

4 citations


Journal ArticleDOI
TL;DR: A novel adaptive graph convolution using a heat kernel model for attributed graph clustering (AGCHK) is proposed, which exploits the similarity among nodes under heat diffusion to flexibly restrict the neighborhood of the center node and enforce the graph smoothness.
Abstract: Attributed graphs contain a lot of node features and structural relationships, and how to utilize their inherent information sufficiently to improve graph clustering performance has attracted much attention. Although existing advanced methods exploit graph convolution to capture the global structure of an attributed graph and achieve obvious improvements for clustering results, they cannot determine the optimal neighborhood that reflects the relevant information of connected nodes in a graph. To address this limitation, we propose a novel adaptive graph convolution using a heat kernel model for attributed graph clustering (AGCHK), which exploits the similarity among nodes under heat diffusion to flexibly restrict the neighborhood of the center node and enforce the graph smoothness. Additionally, we take the Davies–Bouldin index (DBI), instead of the intra-cluster distance alone, as the selection criterion to adaptively determine the order of graph convolution. The clustering results of AGCHK on three benchmark datasets (Cora, Citeseer, and Pubmed) are all more than 1% higher than those of the current advanced model AGC, and 12% higher on the Wiki dataset, a state-of-the-art result in the task of attributed graph clustering.

1 citation


DOI
31 Aug 2020
TL;DR: The cluster validity index is used to evaluate each K-Means clustering result with k = 2, ..., kmax and each Fuzzy C-Means result with c = 2, ..., cmax.
Abstract: K-Means and Fuzzy C-Means clustering are unsupervised data-analysis methods that group data by partitioning. K-Means and Fuzzy C-Means will produce different clusters on the same dataset; a cluster validity index is a method that can be used to improve the results produced by a clustering method. This study applies a cluster validity index to the K-Means and Fuzzy C-Means algorithms, computing the validity of each K-Means result for k = 2, ..., kmax and each Fuzzy C-Means result for c = 2, ..., cmax (kmax and cmax are fixed at the outset). Using the cluster validity index, the most optimal clustering is obtained at two clusters, with a DBI of 0.45 for K-Means and 0.5 for Fuzzy C-Means, and the clustering results are consistent.

1 citation
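The k = 2, ..., kmax sweep described above can be sketched end to end: run a clustering for each candidate k, score it with the Davies-Bouldin index, and keep the k with the minimum score. The tiny Lloyd's k-means and the 1-D toy data below are invented for illustration (the seeding is naive and assumes no cluster ever empties):

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def dbi(points, labels, k):
    """Davies-Bouldin index: lower means more compact, better-separated clusters."""
    members = [[p for p, l in zip(points, labels) if l == i] for i in range(k)]
    cents = [tuple(sum(dim) / len(m) for dim in zip(*m)) for m in members]
    s = [sum(dist(p, c) for p in m) / len(m) for m, c in zip(members, cents)]
    return sum(max((s[i] + s[j]) / dist(cents[i], cents[j])
                   for j in range(k) if j != i) for i in range(k)) / k

def kmeans(points, k, iters=20):
    """Tiny Lloyd's k-means, seeded with the first k points (toy illustration;
    assumes no cluster becomes empty during the iterations)."""
    cents = list(points[:k])
    for _ in range(iters):
        labels = [min(range(k), key=lambda i: dist(p, cents[i])) for p in points]
        members = [[p for p, l in zip(points, labels) if l == i] for i in range(k)]
        cents = [tuple(sum(dim) / len(m) for dim in zip(*m)) for m in members]
    return labels

# Two obvious groups in 1-D; the sweep should prefer k = 2.
data = [(0.0,), (0.5,), (1.0,), (10.0,), (10.5,), (11.0,)]
scores = {k: dbi(data, kmeans(data, k), k) for k in (2, 3)}
best = min(scores, key=scores.get)
print(best, scores)  # k = 2 minimizes DBI on this toy data
```

The same loop structure applies to Fuzzy C-Means over c = 2, ..., cmax, after hardening the fuzzy memberships into labels before scoring.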