Comparing Auto-Machine Learning and Expert-Designed Models in Diagnosing Vitreomacular Interface Disorders


Durmaz Engin C., Gokkan M. O., Koksaldi S., Kayabasi M., Besenk U., SELVER M. A., ...Daha Fazla

Journal of Clinical Medicine, cilt.14, sa.8, 2025 (SCI-Expanded) identifier identifier identifier

  • Yayın Türü: Makale / Tam Makale
  • Cilt numarası: 14 Sayı: 8
  • Basım Tarihi: 2025
  • Doi Numarası: 10.3390/jcm14082774
  • Dergi Adı: Journal of Clinical Medicine
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Directory of Open Access Journals
  • Anahtar Kelimeler: AutoML, deep learning, EfficientNet B0, optical coherence tomography, ResNet-50, vitreomacular interface disorders
  • Dokuz Eylül Üniversitesi Adresli: Evet

Özet

Background: The vitreomacular interface (VMI) encompasses a group of retinal disorders that significantly impact vision, requiring accurate classification for effective management. This study aims to compare the effectiveness of an expert-designed custom deep learning (DL) model and a code free Auto Machine Learning (ML) model in classifying optical coherence tomography (OCT) images of VMI disorders. Materials and Methods: A balanced dataset of OCT images across five classes—normal, epiretinal membrane (ERM), idiopathic full-thickness macular hole (FTMH), lamellar macular hole (LMH), and vitreomacular traction (VMT)—was used. The expert-designed model combined ResNet-50 and EfficientNet-B0 architectures with Monte Carlo cross-validation. The AutoML model was created on Google Vertex AI, which handled data processing, model selection, and hyperparameter tuning automatically. Performance was evaluated using average precision, precision, and recall metrics. Results: The expert-designed model achieved an overall balanced accuracy of 95.97% and a Matthews Correlation Coefficient (MCC) of 94.65%. Both models attained 100% precision and recall for normal cases. For FTMH, the expert model reached perfect precision and recall, while the AutoML model scored 97.8% average precision, and 97.4% recall. In VMT detection, the AutoML model showed 99.5% average precision with a slightly lower recall of 94.7% compared to the expert model’s 95%. For ERM, the expert model achieved 95% recall, while the AutoML model had higher precision at 93.9% but a lower recall of 79.5%. In LMH classification, the expert model exhibited 95% precision, compared to 72.3% for the AutoML model, with similar recall for both (88% and 87.2%, respectively). Conclusions: While the AutoML model demonstrated strong performance, the expert-designed model achieved superior accuracy across certain classes. AutoML platforms, although accessible to healthcare professionals, may require further advancements to match the performance of expert-designed models in clinical applications.