Olympic Medal Count Prediction and Analysis Based on Ridge Regression and Weighted Random Forest

Authors

  • Jiale Wang, School of Information Management, Sun Yat-sen University, Guangzhou, China, 510006
  • Boyu Xu, School of Information Management, Sun Yat-sen University, Guangzhou, China, 510006
  • Jialin Liao, School of Software Engineering, Sun Yat-sen University, Zhuhai, China, 519000

DOI:

https://doi.org/10.54097/9gk4d352

Keywords:

Olympic medals, Ridge regression, Weighted random forest, Ensemble learning.

Abstract

This article predicts the medal distribution for the 2028 Summer Olympics by constructing statistical models, including a ridge regression model and a weighted random forest model, to predict the number of Olympic medals won by each country. It also analyzes in depth the countries that won medals for the first time at the Olympics, revealing the factors behind their breakthroughs. In addition, the article explores the relationship between individual events and the number of medals won by each country, and identifies key events that may become the focus of medal competition. The study further analyzes the "host effect", the additional advantage a host country may gain, and the "great coach effect", the potential impact of excellent coaches on national medal tallies, providing a systematic analysis of the multidimensional factors that influence national Olympic performance. In the regression prediction stage, a ridge regression model is constructed to address multicollinearity in the data, and a multiple linear regression model is deployed to capture the linear relationships between variables. The XGBoost and GBDT gradient boosting algorithms are introduced for their strong performance on large-scale data and high-dimensional features. In addition, the Transformer model supports the modeling of complex data through its sequence modeling capability and parallel computation.
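
The role of ridge regression in handling multicollinearity can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fit_two_feature`, the penalty value `lam=1.0`, and the toy numbers are all assumptions chosen for illustration, and no Olympic data is used. Two nearly identical predictors make the ordinary least-squares coefficients unstable, while a small ridge penalty pulls them back toward comparable values.

```python
# Closed-form ridge regression for two features, pure Python (no dependencies).
# Illustrative sketch only: toy numbers, hypothetical function name.

def fit_two_feature(x1, x2, y, lam=0.0):
    """Solve (X^T X + lam * I) beta = X^T y for a two-column X, no intercept."""
    a = sum(u * u for u in x1) + lam          # entry (1, 1) of X^T X + lam * I
    b = sum(u * v for u, v in zip(x1, x2))    # off-diagonal entry of X^T X
    d = sum(v * v for v in x2) + lam          # entry (2, 2) of X^T X + lam * I
    r1 = sum(u * t for u, t in zip(x1, y))    # first entry of X^T y
    r2 = sum(v * t for v, t in zip(x2, y))    # second entry of X^T y
    det = a * d - b * b                       # determinant of the 2x2 system
    return ((d * r1 - b * r2) / det, (a * r2 - b * r1) / det)

# Two almost collinear predictors; the response is roughly x1 + x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.01, 1.99, 3.02, 3.98, 5.01]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

ols = fit_two_feature(x1, x2, y, lam=0.0)     # ordinary least squares
ridge = fit_two_feature(x1, x2, y, lam=1.0)   # ridge with an assumed penalty

print("OLS coefficients:  ", ols)
print("Ridge coefficients:", ridge)
```

In this toy case the unpenalized coefficients take large values of opposite sign, whereas the ridge coefficients both stay close to 1. In practice the penalty would be selected by cross-validation rather than fixed by hand.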

Published

23-12-2025

How to Cite

Wang, J., Xu, B., & Liao, J. (2025). Olympic Medal Count Prediction and Analysis Based on Ridge Regression and Weighted Random Forest. Highlights in Science, Engineering and Technology, 159, 256-264. https://doi.org/10.54097/9gk4d352