Beyond Predefined Clusters: A Comprehensive Review of Clustering Methods for Unknown Cluster Numbers


Aghayengejeh N. P., Balafar M., Tanha J., SELVER M. A.

IEEE Transactions on Knowledge and Data Engineering, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2026
  • DOI Number: 10.1109/tkde.2026.3680286
  • Journal Name: IEEE Transactions on Knowledge and Data Engineering
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC
  • Keywords: Automatic clustering, Deep clustering, Semi-supervised clustering, Unknown number of clusters
  • Affiliated with Dokuz Eylül University: Yes

Abstract

Clustering is an unsupervised learning task that groups data points by their inherent similarities. Nonautomatic clustering algorithms require the number of clusters to be predefined, so they face significant challenges when the true number is unknown or changes dynamically. This paper provides a comprehensive review of automatic clustering algorithms specifically designed to handle such uncertainty. We systematically classify these algorithms from three key perspectives: clustering framework (classical vs. deep), clustering strategy (e.g., density-based, model-based, graph-theoretic, and subspace methods), and the use of labeled data (unsupervised vs. semi-supervised). We analyze each algorithm in terms of its core principles, key contributions, strengths, and limitations. Furthermore, we address the current challenges in this area and propose future research directions to enhance the scalability, robustness, and effectiveness of automatic clustering algorithms.