Balanced Hoeffding Tree Forest (BHTF): A Novel Multi-Label Classification with Oversampling and Undersampling Techniques for Failure Mode Diagnosis in Predictive Maintenance


Ghasemkhani B., KUT R. A., BİRANT D., YILMAZ R.

MATHEMATICS, vol.13, no.18, 2025 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 13 Issue: 18
  • Publication Date: 2025
  • DOI: 10.3390/math13183019
  • Journal Name: MATHEMATICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Communication Abstracts, Metadex, zbMATH, Directory of Open Access Journals, Civil Engineering Abstracts
  • Keywords: machine learning, predictive maintenance, multi-label classification, ensemble learning, incremental learning, data imbalance, fault detection
  • Dokuz Eylül University Affiliated: Yes

Abstract

Predictive maintenance (PdM) is essential for reducing equipment downtime and enhancing operational efficiency. However, PdM datasets frequently suffer from significant class imbalance and are often limited to single-label classification, which fails to reflect the complexity of real-world industrial systems where multiple failure modes can occur simultaneously. As the main contribution, we propose the Balanced Hoeffding Tree Forest (BHTF), a novel multi-label classification framework that combines oversampling and undersampling strategies to mitigate data imbalance. BHTF uses the binary relevance method to decompose the multi-label problem into multiple binary tasks and an ensemble of Hoeffding Trees to ensure scalability and adaptability to streaming data. In particular, BHTF unifies three learning paradigms, namely multi-label learning (MLL), ensemble learning (EL), and incremental learning (IL), providing a comprehensive and scalable approach for predictive maintenance applications. The key contribution of the proposed method is a hybrid data preprocessing strategy that introduces a novel undersampling technique, Proximity-Driven Undersampling (PDU), and combines it with the Synthetic Minority Oversampling Technique (SMOTE) to handle class imbalance in highly skewed datasets. Experimental results on the benchmark AI4I 2020 dataset showed that BHTF achieved an average classification accuracy of 97.44%, outperforming state-of-the-art methods (88.94%) with an improvement of 11% on average. These findings highlight the potential of BHTF as a robust artificial intelligence-based solution for complex fault detection in manufacturing predictive maintenance applications.
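
As an illustration of two ideas named in the abstract, the following minimal Python sketch shows how binary relevance can split a multi-label target into one binary task per failure mode, and how a SMOTE-style interpolation can then balance each task. It is not the authors' implementation: the toy data, the function names (`binary_relevance`, `smote_like`, `balance_by_oversampling`), the single-nearest-neighbour choice, and the convention that label 1 is the minority class are all assumptions made for this example. The abstract does not describe how PDU works, so undersampling is left out, and no Hoeffding Tree is trained here; in a full pipeline an incremental learner would be fitted on each balanced task.

```python
import random


def binary_relevance(label_rows):
    """Split a multi-label matrix (rows of 0/1 flags) into one binary target list per label."""
    n_labels = len(label_rows[0])
    return [[row[j] for row in label_rows] for j in range(n_labels)]


def smote_like(minority, n_new, rng):
    """Create n_new synthetic points by interpolating between a random minority
    sample and its nearest minority neighbour (k = 1, a simplification)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbour = min(
            others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p))
        )
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance_by_oversampling(X, y, rng):
    """Oversample class 1 (assumed to be the minority) until both classes match in size."""
    minority = [x for x, t in zip(X, y) if t == 1]
    majority = [x for x, t in zip(X, y) if t == 0]
    n_new = len(majority) - len(minority)
    if len(minority) < 2 or n_new <= 0:
        return X, y  # nothing to interpolate, or already balanced
    synth = smote_like(minority, n_new, rng)
    return X + synth, y + [1] * len(synth)


# Hypothetical toy data: two sensor features, two failure-mode labels per sample.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [0.9, 0.8], [1.0, 1.0], [0.2, 0.9], [0.1, 1.0]]
Y = [[0, 0], [0, 0], [0, 0], [0, 0],
     [1, 0], [1, 0], [0, 1], [0, 1]]

rng = random.Random(0)
targets = binary_relevance(Y)  # one binary problem per failure mode
balanced = [balance_by_oversampling(X, y, rng) for y in targets]

for j, (Xb, yb) in enumerate(balanced):
    print(f"label {j}: {len(Xb)} samples, {sum(yb)} positive")
```

Each binary task is balanced independently, which matches the binary relevance view that every failure mode is its own classification problem with its own imbalance ratio.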