A Novel Reduced Error Pruning Tree Forest with Time-Based Missing Data Imputation (REPTF-TMDI) for Traffic Flow Prediction


DOĞAN Y., TÜYSÜZOĞLU G., Kiyak E. O., Ghasemkhani B., BİRANT K. U., UTKU S., ...Daha Fazla

CMES - Computer Modeling in Engineering and Sciences, cilt.144, sa.2, ss.1677-1715, 2025 (SCI-Expanded) identifier identifier

  • Yayın Türü: Makale / Tam Makale
  • Cilt numarası: 144 Sayı: 2
  • Basım Tarihi: 2025
  • Doi Numarası: 10.32604/cmes.2025.069255
  • Dergi Adı: CMES - Computer Modeling in Engineering and Sciences
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Communication Abstracts, Compendex, INSPEC, Metadex, zbMATH, Civil Engineering Abstracts
  • Sayfa Sayıları: ss.1677-1715
  • Anahtar Kelimeler: Machine learning, traffic flow prediction, missing data imputation, reduced error pruning tree (REPTree), sustainable transportation systems, traffic management, artificial intelligence
  • Dokuz Eylül Üniversitesi Adresli: Evet

Özet

Accurate traffic flow prediction (TFP) is vital for efficient and sustainable transportation management and the development of intelligent traffic systems. However, missing data in real-world traffic datasets poses a significant challenge to maintaining prediction precision. This study introduces REPTF-TMDI, a novel method that combines a Reduced Error Pruning Tree Forest (REPTree Forest) with a newly proposed Time-based Missing Data Imputation (TMDI) approach. The REPTree Forest, an ensemble learning approach, is tailored for time-related traffic data to enhance predictive accuracy and support the evolution of sustainable urban mobility solutions. Meanwhile, the TMDI approach exploits temporal patterns to estimate missing values reliably whenever empty fields are encountered. The proposed method was evaluated using hourly traffic flow data from a major U.S. roadway spanning 2012–2018, incorporating temporal features (e.g., hour, day, month, year, weekday), holiday indicator, and weather conditions (temperature, rain, snow, and cloud coverage). Experimental results demonstrated that the REPTF-TMDI method outperformed conventional imputation techniques across various missing data ratios by achieving an average 11.76% improvement in terms of correlation coefficient (R). Furthermore, REPTree Forest achieved improvements of 68.62% in RMSE and 70.52% in MAE compared to existing state-of-the-art models. These findings highlight the method’s ability to significantly boost traffic flow prediction accuracy, even in the presence of missing data, thereby contributing to the broader objectives of sustainable urban transportation systems.