International Journal of Intelligent Engineering Informatics, vol.12, no.4, pp.513-541, 2024 (ESCI)
Glioma grading is a critical task in neuro-oncology that guides treatment planning and prognosis. This study investigates how well machine learning models predict glioma grade from clinical and molecular data, and how data balancing and feature selection methods can improve that performance. In addition, a probabilistic multi-view learning model (P-MWM) is introduced to predict glioma grade from clinical and molecular features. To improve model interpretability, the Shapley additive explanations (SHAP) method is used to analyse and interpret the contribution of each feature to the predicted grade. The study's contributions are the development of the P-MWM, feature selection using the ANOVA F-test, handling of class imbalance using SMOTE and SMOTE-Tomek Links, and improved model interpretability through SHAP. The proposed P-MWM improved overall model performance, most notably for the decision tree (DT) model, culminating in an accuracy of 86.8918%. The individual logistic regression (LR) model combined with feature selection and data balancing outperformed the other experimental settings, achieving 87.8442% accuracy.
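As an illustration of two of the preprocessing steps named above, the following is a minimal sketch, not the authors' implementation, of a one-way ANOVA F statistic for ranking a single feature and of SMOTE-style oversampling of a minority class. The function names `anova_f` and `smote`, the parameters `k` and `seed`, and the toy values are all hypothetical and are not taken from the study's dataset. The Tomek-link cleaning step of SMOTE-Tomek Links, the P-MWM itself, and SHAP are not covered here.

```python
import random


def anova_f(values, labels):
    """One-way ANOVA F statistic of one feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea).
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x by squared Euclidean distance.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy data (hypothetical): two features, five majority and three minority rows.
majority = [[1.0, 5.0], [1.2, 4.0], [0.8, 6.0], [1.1, 5.5], [0.9, 4.5]]
minority = [[3.0, 5.2], [3.3, 4.8], [2.9, 5.0]]
labels = [0] * len(majority) + [1] * len(minority)
rows = majority + minority

# Rank each feature by its F statistic; a higher value separates the classes better.
f_scores = [anova_f([r[j] for r in rows], labels) for j in range(2)]

# Oversample the minority class until both classes have the same size.
balanced_minority = minority + smote(minority, len(majority) - len(minority))
```

In this toy setup the first feature separates the two classes far better than the second, so it receives the larger F statistic, and the oversampled minority class ends up the same size as the majority class. A full pipeline would fit the F-test and the oversampler on training folds only, so that synthetic samples do not leak into evaluation.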