Comparison of Distance Measures in K-Medoids Clustering for Grouping ISPA Disease Cases

Abstract

K-Medoids is an unsupervised learning algorithm that uses a distance measure to group data. A distance measure quantifies how similar two records are across their variables, which lets the algorithm place similar records in the same group. Several studies have shown that choosing an appropriate distance measure can improve clustering performance. Euclidean and Chebyshev distance are two of the measures that can be used. In 2016, the Karawang Health Office reported that 175,891 Karawang residents were suffering from ISPA (acute respiratory infection). This figure continued to rise in the following years, and by 2019 the number of Karawang residents suffering from ISPA had reached 181,945. To assist the government in addressing this problem, clustering is used to group the areas of Karawang District according to the spread of ISPA. The areas are divided into three clusters: low, medium, and high. The distance measures are compared to find the best model based on the Davies-Bouldin Index (DBI). Euclidean distance produces a DBI score of 0.088, while Chebyshev distance produces a DBI score of 0.116. K-Medoids with Euclidean distance is therefore considered to perform better than K-Medoids with Chebyshev distance, because its DBI score is closer to 0.
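The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the tool, the initialization, or the exact DBI variant used. The sketch assumes a simple alternating K-Medoids seeded with the first k points, and a DBI that uses the medoids as cluster representatives with the same distance measure as the clustering (the usual formulation uses centroids). The data and all function names (`k_medoids`, `davies_bouldin`, `areas`) are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chebyshev(a, b):
    """Largest absolute difference along any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))


def assign(points, medoids, dist):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
            for p in points]


def k_medoids(points, k, dist, max_iter=100):
    """Alternating K-Medoids; the first k points seed the medoids (assumption)."""
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(points, medoids, dist)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            updated.append(min(members, key=lambda i: sum(
                dist(points[i], points[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(points, medoids, dist)


def davies_bouldin(points, medoids, labels, dist):
    """Medoid-based DBI (assumes non-empty clusters and distinct medoids)."""
    k = len(medoids)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        # Average distance from each member to its cluster's medoid.
        scatter.append(sum(dist(p, points[medoids[c]]) for p in members) / len(members))
    # For each cluster, take its worst (largest) similarity ratio to another cluster.
    worst = [max((scatter[i] + scatter[j]) / dist(points[medoids[i]], points[medoids[j]])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k


# Hypothetical two-variable records, not the Karawang ISPA data.
areas = [(1, 1), (10, 10), (20, 1), (1, 2), (10, 11), (21, 1)]

results = {}
for name, dist in (("euclidean", euclidean), ("chebyshev", chebyshev)):
    medoids, labels = k_medoids(areas, 3, dist)
    results[name] = (labels, davies_bouldin(areas, medoids, labels, dist))
    print(name, labels, round(results[name][1], 3))
```

A lower DBI indicates clusters that are more compact and better separated, which is the criterion the abstract uses to prefer one distance measure over the other. The scores printed for this toy data are not meant to reproduce the reported values.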