An Innovative Hybrid Machine Learning Approach for Student Survey Analysis: Random Tree With Ordinal Noise Filtering and Feature Selection (RTONF)

Tüysüzoğlu, Göksu; Doğan, Yunus; Dalkılıç, Feriştah; Kiyak, Elife; Ghasemkhani, Bita; Birant, Kökten; Birant, Derya

doi:10.1109/access.2025.3614058

Tüysüzoğlu G., Doğan Y., Dalkılıç F., Kiyak E. O., Ghasemkhani B., Birant K. U., ...

IEEE ACCESS, vol. 13, pp. 167798-167822, 2025 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 13
Publication Date: 2025
DOI: 10.1109/access.2025.3614058
Journal Name: IEEE ACCESS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
Pages: pp. 167798-167822
Keywords: Surveys, Noise, Education, Classification algorithms, Feature extraction, Accuracy, Random forests, Prediction algorithms, Radio frequency, Predictive models, Artificial intelligence, classification, education, educational data mining, feature selection, machine learning, noise detection, student performance prediction, student satisfaction prediction, survey data
Affiliated with Dokuz Eylül University: Yes

Abstract

Analyzing student survey data using machine learning has become increasingly important for educational institutions aiming to understand the student experience, identify areas for improvement in teaching and curriculum, and make informed decisions to enhance learning outcomes. A major challenge in this domain is the presence of noisy data, which can substantially reduce the performance of classification algorithms. Existing studies often ignore the ordinal nature of class labels during noise detection, which may lead to suboptimal data cleaning. To address this problem, we propose a new approach entitled Random Tree with Ordinal Noise Filtering and Feature Selection (RTONF). This method explicitly incorporates the inherent order of class labels (e.g., poor < fair < good < very good < excellent) during the noise identification process, before predicting student performance or satisfaction. The Random Tree (RT) algorithm serves as the base classifier, while Pearson correlation is employed for feature selection due to its outstanding performance. Experimental results demonstrated that the proposed hybrid method achieved an average accuracy of 84.61%, a 5.23% improvement over the traditional RT classifier. Furthermore, comparative analysis indicated that our method outperformed state-of-the-art techniques in terms of prediction accuracy on the same dataset.
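The abstract names three ingredients: Pearson-correlation feature selection, a noise filter that respects label order, and a Random Tree base classifier. The sketch below shows one way these pieces could fit together; it is not the authors' implementation. Several parts are assumptions, because the abstract does not specify them: the filtering rule (here, an instance is dropped when its cross-validated prediction lies more than max_gap ranks away from its label), the order of the two steps, the helper names (pearson_select, ordinal_noise_mask, random_tree, fit_rtonf_sketch), the use of scikit-learn's DecisionTreeClassifier with max_features="sqrt" as a stand-in for a Random Tree, and the synthetic survey data.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier


def pearson_select(X, y, k):
    """Return indices of the k features with the largest |Pearson r| to y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; assign them correlation 0.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.argsort(-np.abs(r))[:k]


def random_tree(seed=0):
    # Assumed stand-in for a Random Tree: an unpruned tree that
    # considers a random subset of features at each split.
    return DecisionTreeClassifier(max_features="sqrt", random_state=seed)


def ordinal_noise_mask(X, y, max_gap=1, folds=5, seed=0):
    """Boolean mask of instances to keep.

    Assumed rule: an instance is treated as noisy when its cross-validated
    prediction is more than max_gap ordinal ranks away from its label, so a
    'good' vs 'very good' disagreement is tolerated but 'poor' vs
    'excellent' is not. Labels must be integer-coded in rank order.
    """
    pred = cross_val_predict(random_tree(seed), X, y, cv=folds)
    return np.abs(pred - y) <= max_gap


def fit_rtonf_sketch(X, y, k=3, max_gap=1, seed=0):
    # Assumed step order: select features, filter noise, then train.
    cols = pearson_select(X, y, k)
    keep = ordinal_noise_mask(X[:, cols], y, max_gap=max_gap, seed=seed)
    model = random_tree(seed).fit(X[keep][:, cols], y[keep])
    return model, cols, keep


# Synthetic Likert-style data (hypothetical): 8 items scored 1..5,
# with the label driven by the first three items only.
rng = np.random.default_rng(0)
X = rng.integers(1, 6, size=(300, 8)).astype(float)
y = np.clip(np.rint(X[:, :3].mean(axis=1)), 1, 5).astype(int)

model, cols, keep = fit_rtonf_sketch(X, y, k=3)
print("selected features:", sorted(cols.tolist()))
print("instances kept:", int(keep.sum()), "of", len(y))
```

The point of the ordinal rule is that a plain misclassification filter treats every disagreement equally, whereas a rank-distance threshold removes only instances whose labels are far from what the classifier expects. The threshold max_gap is a hypothetical tuning parameter in this sketch.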