STUDY OF THE EFFECTIVENESS OF CLASSICAL AND NEURAL NETWORK METHODS FOR REGRESSION FORECASTING

Authors

Keywords:

Big Data, regression, random forest, DNN, neural network, ML

Abstract

This paper presents a comparative analysis of classical and neural network regression models on large datasets. Three models were implemented: a Random Forest Regressor and two dense neural networks (DNNs), namely a basic PyTorch model and an improved TensorFlow model with Dropout and BatchNormalization layers. The classical model achieved moderate accuracy (R² = 0.7404), while the improved DNN gave the best results (RMSE = 0.1570, R² = 0.9346). The results indicate that neural networks are well suited to high-precision regression, provided that regularization, normalization, and hyperparameter tuning are applied.
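The two metrics reported in the abstract, RMSE and R², can be computed as in the minimal sketch below. It uses their standard definitions; the function names and the toy values are assumptions made for illustration and are not taken from the paper's code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values invented for illustration only (not data from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

A lower RMSE and an R² closer to 1 both indicate a better fit, which is the sense in which the improved DNN is reported to outperform the Random Forest model.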

Published

2025-06-03