Machine learning approaches to injury risk prediction in sport: a scoping review with evidence synthesis
Objective This study reviewed the current state of machine learning (ML) research for the prediction of sports-related injuries. It aimed to chart the approaches used and assess their efficacy, considering data heterogeneity, model specificity and contextual factors in the development of predictive models.
Design Scoping review.
Data sources PubMed, EMBASE, SPORTDiscus and IEEE Xplore.
Results In total, 1241 studies were identified, 58 full texts were screened, and 38 relevant studies were reviewed and charted. Football (soccer) was the most commonly investigated sport. Area under the curve (AUC) was the most common means of model evaluation, reported in 71% of studies. In 60% of studies, tree-based solutions provided the highest predictive performance. Random Forest and Extreme Gradient Boosting (XGBoost) were found to provide the highest performance for injury risk prediction. Logistic regression outperformed ML methods in 4 out of 12 studies. Thre