K-Linkage: A New Agglomerative Approach for Hierarchical Clustering

Yildirim, Pelin; Birant, Derya

doi:10.4316/aece.2017.04010

Yildirim P., Birant D.

ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, vol.17, no.4, pp.77-88, 2017 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 17 Issue: 4
Publication Date: 2017
DOI Number: 10.4316/aece.2017.04010
Journal Name: ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp.77-88
Keywords: clustering, data mining, data processing, knowledge discovery, unsupervised learning
Dokuz Eylül University Affiliated: Yes

Abstract

In agglomerative hierarchical clustering, the traditional approaches to computing the distance between clusters are the single, complete, average and centroid linkages. However, the single-link and complete-link approaches cannot always reflect the true underlying relationship between clusters, because they consider only a single pair of observations between two clusters. This may promote the formation of spurious clusters. To overcome this problem, this paper proposes a novel approach, named k-Linkage, which calculates the distance by considering k observations from two clusters separately. The paper also introduces two novel concepts: k-min linkage (the average of the k closest pairs) and k-max linkage (the average of the k farthest pairs). In the experimental studies, the improved hierarchical clustering algorithm based on k-Linkage was executed on five well-known benchmark datasets with varying k values to demonstrate its efficiency. The results show that the proposed k-Linkage method can often produce clusters with better accuracy than the single, complete, average and centroid linkages.
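
The two linkage definitions in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Euclidean distance, takes the candidate pairs to be all cross-cluster pairs of observations, and falls back to using every pair when fewer than k exist. The function name `k_linkage` and its `mode` parameter are hypothetical, introduced only for this illustration.

```python
import numpy as np


def k_linkage(cluster_a, cluster_b, k, mode="min"):
    """Illustrative k-min / k-max distance between two clusters.

    Assumptions (not taken from the paper): Euclidean distance, and the
    candidate pairs are all cross-cluster pairs of observations.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")

    a = np.asarray(cluster_a, dtype=float)
    b = np.asarray(cluster_b, dtype=float)

    # Distance between every observation in a and every observation in b,
    # flattened into one array and sorted in ascending order.
    pair_dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2).ravel()
    pair_dists.sort()

    # Use all pairs if fewer than k are available.
    k = min(k, pair_dists.size)

    # k-min averages the k closest pairs, k-max the k farthest pairs.
    chosen = pair_dists[:k] if mode == "min" else pair_dists[-k:]
    return float(chosen.mean())


if __name__ == "__main__":
    a = [[0.0, 0.0], [1.0, 0.0]]
    b = [[3.0, 0.0], [5.0, 0.0]]
    print(k_linkage(a, b, k=2, mode="min"))
    print(k_linkage(a, b, k=2, mode="max"))
```

Under these assumptions, k = 1 reduces k-min to single linkage and k-max to complete linkage, while setting k to the total number of cross-cluster pairs makes both equal to average linkage.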