Towards Proactive Surgical Infection Management: Development and External Validation of an AI-based Prediction Tool

Author(s):
Siri van der Meijden; Anna van Boekel; Mark G. J. de Boer; Rob G. H. H. Nelissen; Sebastian Bredie; Bart F. Geerts; Sesmu M.S. Arbous; Harry Van Goor

Background:

Late detection of postoperative infections results in poor outcomes for patients and added costs. To enhance early detection of surgical infections, we developed an Artificial Intelligence (AI) prediction model for clinical use.

Hypothesis:

Postoperative infections can be accurately predicted using AI prediction models.

Methods:

We retrospectively developed and validated an AI model (XGBoost) on surgical procedures in adults performed between 2011/1/1 and 2021/10/5 at the Leiden University Medical Center (LUMC). Model development and validation were performed on 70% and 30% of the LUMC dataset, respectively, using available electronic health record (EHR) data. We aimed to predict, at the end of surgery, the risk of any treated bacterial infection within 30 days. Because not all infections can be found in the data due to under-registration, we broadened the definition used for identifying infections to: non-prophylactic antibiotics, infection-related interventions and elevated CRP. External validation of the LUMC-trained model was performed on a temporal LUMC dataset (2021/10/8 – 2021/11/8) and a general surgical dataset from the Radboud University Medical Center (Radboud UMC).
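As an illustration of the broadened infection definition, the sketch below labels a procedure as positive when any one of the three criteria occurs within 30 days of surgery. This is a minimal sketch and not the study's implementation: the `PostopEvent` schema, the `infection_label` function, the assumption that a single criterion suffices, and the CRP threshold passed by the caller are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass
class PostopEvent:
    """One postoperative EHR event (hypothetical, simplified schema)."""
    day: date
    kind: str                              # "antibiotic", "intervention" or "crp"
    prophylactic: bool = False             # only meaningful for antibiotics
    infection_related: bool = False        # only meaningful for interventions
    crp_mg_per_l: Optional[float] = None   # only meaningful for CRP results


def infection_label(
    surgery_day: date,
    events: Iterable[PostopEvent],
    crp_threshold: float,                  # assumption: not given in the abstract
    window_days: int = 30,
) -> bool:
    """Return True if any criterion is met within the window after surgery.

    Assumes the criteria are combined with a logical OR.
    """
    window_end = surgery_day + timedelta(days=window_days)
    for event in events:
        # Ignore events outside the prediction window.
        if not (surgery_day <= event.day <= window_end):
            continue
        if event.kind == "antibiotic" and not event.prophylactic:
            return True
        if event.kind == "intervention" and event.infection_related:
            return True
        if (
            event.kind == "crp"
            and event.crp_mg_per_l is not None
            and event.crp_mg_per_l > crp_threshold
        ):
            return True
    return False


# Example call with a placeholder threshold (illustrative value only).
surgery = date(2020, 1, 1)
events = [PostopEvent(date(2020, 1, 6), "antibiotic", prophylactic=False)]
label = infection_label(surgery, events, crp_threshold=100.0)
```

A composite label of this kind trades specificity for sensitivity: it captures infections that were treated but never registered, at the cost of possibly including some non-infectious cases.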

Results:

The model predicted postoperative infection with high performance on the LUMC EHR and temporal validation datasets, but did not perform equally well on the Radboud UMC dataset (Table 1). The LUMC and Radboud UMC datasets differed in infection rates and infection-related antibiotic treatments.

[Table 1: uploaded image, not reproduced here]

Conclusions:

We were able to predict postoperative infection with high internal and external temporal (prospective in time) predictive performance in one hospital. However, the same model did not perform as well in a comparable academic hospital. This finding emphasizes the need for AI models to be retrained on external data due to differences in patient populations, EHR coding, local guidelines, human factors, procedure types and data collection. The next step is to investigate whether retraining the AI model improves performance. This is needed before the model's impact can be studied in multi-center clinical trials.