K-Linkage: A New Agglomerative Approach for Hierarchical Clustering


Yildirim P., Birant D.

ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, vol.17, no.4, pp.77-88, 2017 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 17 Issue: 4
  • Publication Date: 2017
  • Doi Number: 10.4316/aece.2017.04010
  • Journal Name: ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.77-88
  • Keywords: clustering, data mining, data processing, knowledge discovery, unsupervised learning
  • Dokuz Eylül University Affiliated: Yes

Abstract

In agglomerative hierarchical clustering, the traditional approaches to computing the distance between clusters are single, complete, average and centroid linkage. However, the single-link and complete-link approaches cannot always reflect the true underlying relationship between clusters, because they consider only a single pair of observations from the two clusters. This may promote the formation of spurious clusters. To overcome this problem, the paper proposes a novel approach, named k-Linkage, which calculates the distance by considering k observations from two clusters separately. The article also introduces two novel concepts: k-min linkage (the average of the k closest pairs) and k-max linkage (the average of the k farthest pairs). In the experimental studies, the improved hierarchical clustering algorithm based on k-Linkage was executed on five well-known benchmark datasets with varying k values to demonstrate its efficiency. The results show that the proposed k-Linkage method often produces more accurate clusters than single, complete, average and centroid linkage.
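
The two linkage definitions in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Euclidean distance, assumes that the "pairs" are all cross-cluster pairs of observations, and uses hypothetical names (`pairwise_distances`, `k_linkage`, `mode`) chosen only for this example. The paper's own procedure for selecting the k observations may differ.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distance between every row of a and every row of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a[:, None, :] - b[None, :, :]  # shape (len(a), len(b), n_features)
    return np.sqrt((diff ** 2).sum(axis=-1))


def k_linkage(a, b, k, mode="min"):
    """Average of the k closest (mode="min") or k farthest (mode="max")
    cross-cluster pair distances. Hypothetical helper, assumed reading
    of the abstract's k-min and k-max definitions."""
    if k < 1:
        raise ValueError("k must be at least 1")
    d = np.sort(pairwise_distances(a, b).ravel())  # ascending order
    k = min(k, d.size)  # assumption: cap k at the number of available pairs
    if mode == "min":
        chosen = d[:k]
    elif mode == "max":
        chosen = d[-k:]
    else:
        raise ValueError("mode must be 'min' or 'max'")
    return float(chosen.mean())


# Two small one-dimensional-looking clusters embedded in 2-D.
cluster_a = [[0.0, 0.0], [1.0, 0.0]]
cluster_b = [[3.0, 0.0], [5.0, 0.0]]

k_min_2 = k_linkage(cluster_a, cluster_b, k=2, mode="min")
k_max_2 = k_linkage(cluster_a, cluster_b, k=2, mode="max")
```

Under this reading, k = 1 reduces k-min linkage to single linkage and k-max linkage to complete linkage, while setting k to the total number of cross-cluster pairs gives average linkage in both modes.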