Efficacy and Applications of Artificial Intelligence and Machine Learning Analyses in Total Joint Arthroplasty: A Call for Improved Reporting.
Abstract
BACKGROUND: There has been a considerable increase in total joint arthroplasty (TJA) research using machine learning (ML). Therefore, the purposes of this study were to synthesize the applications and efficacies of ML reported in the TJA literature, and to assess the methodological quality of these studies.
METHODS: PubMed, OVID/MEDLINE, and Cochrane libraries were queried in January 2021 for articles regarding the use of ML in TJA. Study demographics, topic, primary and secondary outcomes, ML model development and testing, and model presentation and validation were recorded. The TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines were used to assess the methodological quality.
RESULTS: Fifty-five studies were identified: 31 investigated clinical outcomes and resource utilization; 11, activity and motion surveillance; 10, imaging detection; and 3, natural language processing. For studies reporting the area under the receiver operating characteristic curve (AUC), the median AUC (and range) was 0.80 (0.60 to 0.97) among 26 clinical outcome studies, 0.99 (0.83 to 1.00) among 6 imaging-based studies, and 0.88 (0.76 to 0.98) among 3 activity and motion surveillance studies. Twelve studies compared ML to logistic regression, with 9 (75%) reporting that ML was superior. The average number of TRIPOD guidelines met was 11.5 (range: 5 to 18), with 38 (69%) meeting greater than half of the criteria. Presentation and explanation of the full model for individual predictions and assessments of model calibration were poorly reported (<30%).
CONCLUSIONS: The performance of ML models was good to excellent when applied to a wide variety of clinically relevant outcomes in TJA. However, reporting of certain key methodological and model presentation criteria was inadequate. Despite the recent surge in TJA literature utilizing ML, the lack of consistent adherence to reporting guidelines needs to be addressed to bridge the gap between model development and clinical implementation.
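The abstract reports model discrimination as AUC and notes that calibration was rarely assessed. As a rough illustration of the difference between the two, the following minimal Python sketch computes both on a toy data set. The function names (`auc`, `calibration_table`), the two-bin grouping, and the six example predictions are assumptions made for illustration only; none of them come from the reviewed studies.

```python
from typing import List, Sequence, Tuple


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Discrimination: probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_table(
    y_true: Sequence[int], y_score: Sequence[float], n_bins: int = 2
) -> List[Tuple[float, float, int]]:
    """Calibration: for each equal-width risk bin, return
    (mean predicted risk, observed event rate, number of cases).
    Empty bins are skipped."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for y, s in zip(y_true, y_score):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((s, y))
    rows = []
    for b in bins:
        if b:
            mean_pred = sum(s for s, _ in b) / len(b)
            observed = sum(y for _, y in b) / len(b)
            rows.append((mean_pred, observed, len(b)))
    return rows


# Hypothetical toy data: 1 = event occurred, scores are predicted risks.
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.30, 0.35, 0.60, 0.70, 0.90]

toy_auc = auc(toy_outcomes, toy_scores)
toy_calibration = calibration_table(toy_outcomes, toy_scores, n_bins=2)
```

A model can rank patients well (high AUC) while its predicted risks still differ from the observed event rates, which is why calibration is reported separately from discrimination.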