Beyond Predefined Clusters: A Comprehensive Review of Clustering Methods for Unknown Cluster Numbers

Aghayengejeh, Nazila; Balafar, M.A.; Tanha, Jafar; SELVER, MUSTAFA

doi:10.1109/tkde.2026.3680286

Aghayengejeh N. P., Balafar M., Tanha J., SELVER M. A.

IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 7, pp. 4121-4138, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 38 Issue: 7
Publication Date: 2026
DOI Number: 10.1109/tkde.2026.3680286
Journal Name: IEEE Transactions on Knowledge and Data Engineering
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC
Pages: pp. 4121-4138
Keywords: Automatic clustering, deep clustering, semi-supervised clustering, unknown number of clusters
Affiliated with Dokuz Eylül University: Yes

Abstract

Clustering is an unsupervised learning task that groups data points by their inherent similarities. Nonautomatic clustering algorithms require the number of clusters to be predefined, so they face significant challenges when the true number is unknown or changes dynamically. This paper provides a comprehensive review of automatic clustering algorithms specifically designed to handle such uncertainty. The algorithms are systematically classified from three perspectives: clustering framework (classical vs. deep), clustering strategy (e.g., density-based, model-based, graph-theoretic, subspace methods), and the use of labeled data (unsupervised vs. semi-supervised). We analyze each algorithm based on its core principles, key contributions, strengths, and limitations. Furthermore, we address the current challenges in this area and propose future research directions to enhance the scalability, robustness, and effectiveness of automatic clustering algorithms.