K-Linkage: A New Agglomerative Approach for Hierarchical Clustering


Yildirim P., Birant D.

ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, vol.17, no.4, pp.77-88, 2017 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 17 Issue: 4
  • Publication Date: 2017
  • DOI: 10.4316/aece.2017.04010
  • Journal Name: ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.77-88
  • Keywords: clustering, data mining, data processing, knowledge discovery, unsupervised learning
  • Affiliated with Dokuz Eylül University: Yes

Abstract

In agglomerative hierarchical clustering, the traditional approaches to computing the distance between clusters are single, complete, average and centroid linkage. However, the single-link and complete-link approaches cannot always reflect the true underlying relationship between clusters, because they consider only a single pair of observations between two clusters. This may promote the formation of spurious clusters. To overcome this problem, this paper proposes a novel approach, named k-Linkage, which calculates the distance by considering k observations from two clusters separately. This article also introduces two novel concepts: k-min linkage (the average of the k closest pairs) and k-max linkage (the average of the k farthest pairs). In the experimental studies, the improved hierarchical clustering algorithm based on k-Linkage was executed on five well-known benchmark datasets with varying k values to demonstrate its efficiency. The results show that the proposed k-Linkage method can often produce clusters with better accuracy than the single, complete, average and centroid linkages.
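
To make the two definitions concrete, the following is a minimal Python sketch of the k-min and k-max cluster distances. It is not the authors' implementation: it takes the parenthetical definitions literally and assumes that the pairs are drawn from all cross-cluster pairs of observations, that the distance is Euclidean, and that k is capped at the number of available pairs. The function names `k_min_linkage`, `k_max_linkage` and the helper `_sorted_pair_distances` are hypothetical, and the paper may select the k observations differently.

```python
import math
from itertools import product


def _sorted_pair_distances(cluster_a, cluster_b):
    """Euclidean distances of all cross-cluster pairs, in ascending order."""
    return sorted(math.dist(p, q) for p, q in product(cluster_a, cluster_b))


def k_min_linkage(cluster_a, cluster_b, k):
    """Average of the k closest cross-cluster pairs (assumed reading of k-min)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    distances = _sorted_pair_distances(cluster_a, cluster_b)
    k = min(k, len(distances))  # assumption: cap k at the number of pairs
    return sum(distances[:k]) / k


def k_max_linkage(cluster_a, cluster_b, k):
    """Average of the k farthest cross-cluster pairs (assumed reading of k-max)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    distances = _sorted_pair_distances(cluster_a, cluster_b)
    k = min(k, len(distances))  # assumption: cap k at the number of pairs
    return sum(distances[-k:]) / k


# Toy clusters on a line; the pairwise distances are 2, 3, 4 and 5.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(3.0, 0.0), (5.0, 0.0)]

k_min_2 = k_min_linkage(cluster_a, cluster_b, 2)  # mean of 2 and 3
k_max_2 = k_max_linkage(cluster_a, cluster_b, 2)  # mean of 4 and 5
```

Under these assumptions, k = 1 reduces k-min to single linkage and k-max to complete linkage, while setting k to the total number of pairs makes both equal to average linkage, so k interpolates between the traditional linkages.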