Machine Learning Models Accurately Predict Surgical Site Infection after Emergent Trauma Laparotomy

Author(s):
Michael Cobler-Lichter; Jessica Delamater; Larisa Shagabayeva; Zoe Weiss; Matthew Fastiggi; Brianna L. Collie; Nicole Lyons; Luciana Tito Bustillos; Jonathan P. Meizoso; Nicholas Namias; Brandon Parker; Kenneth G Proctor

Background:

Trauma patients who require emergency laparotomy are at relatively high risk of surgical site infection (SSI), which is associated with increased morbidity, cost, and length of stay. Machine learning (ML) techniques have been developed to detect existing SSI and predict SSI in elective surgery, but have not been developed for prediction after emergent trauma laparotomy.

Hypothesis:

ML can be trained to identify risk of SSI using variables that are readily available at the time of surgery and that could be automatically extracted from the patient’s chart.

Methods:

Patients in the American College of Surgeons Trauma Quality Improvement Program (TQIP) database who underwent laparotomy within 90 minutes of arrival were included. Patients with missing SSI data were excluded. Only variables that would be available in the early postoperative period were considered. ML models were trained to predict SSI. A game-theoretic approach was used to estimate the relative contribution of each variable to the final prediction.
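The abstract does not name the game-theoretic method it used. Shapley values are one common choice for attributing a prediction to individual variables, and the sketch below computes them exactly for a single patient under that assumption. The function `toy_risk`, its three variables, and all coefficients are hypothetical and invented for illustration; they are not the study's model or its Figure 2 variables.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Variables outside a coalition are set to their baseline value, so the
    attributions sum to predict(x) - predict(baseline).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    # Hypothetical risk score with one interaction term (illustration only).
    contaminated, transfused_units, obese = z
    return (0.02 + 0.05 * contaminated + 0.01 * transfused_units
            + 0.04 * contaminated * obese)


patient = [1, 4, 1]    # hypothetical patient
reference = [0, 0, 0]  # hypothetical baseline patient
phi = shapley_values(toy_risk, patient, reference)
for name, contribution in zip(["contaminated", "transfused_units", "obese"], phi):
    print(f"{name}: {contribution:+.3f}")
```

Exact enumeration grows exponentially with the number of variables, so tree-based models in practice rely on approximations or tree-specific algorithms; the toy version only makes the attribution logic visible.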

Results:

Of 5,481,046 patients in TQIP from 2017 to 2021, 74,806 met inclusion criteria. SSI incidence was 4.2%. A gradient-boosted decision tree model performed best, with an area under the receiver operating characteristic curve of 0.736 [95% CI 0.717-0.754] (Figure 1). The variables with the greatest impact on the model’s predictions are displayed in Figure 2.
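The abstract does not state how the confidence interval around the area under the curve was obtained. As a minimal sketch, assuming a percentile bootstrap, the code below computes the area as the probability that a randomly chosen SSI case scores higher than a randomly chosen non-case, then resamples patients to estimate an interval. The data are synthetic and the helper names are hypothetical.

```python
import random


def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic labels and scores, for illustration only.
rng = random.Random(42)
labels = [1 if rng.random() < 0.2 else 0 for _ in range(300)]
scores = [rng.gauss(0.6 if y else 0.0, 1.0) for y in labels]

point = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUROC {point:.3f} [95% CI {lo:.3f}-{hi:.3f}]")
```

The pairwise comparison is quadratic in the number of patients, which is fine for a demonstration; a cohort of tens of thousands would call for a rank-based formulation instead.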

Conclusions:

This is the first report showing that ML can reliably identify emergency trauma laparotomy patients at increased risk for SSI. Such an approach can be integrated directly into electronic medical records to automatically identify high-risk patients on admission, allowing for personalized care plans tailored to each patient’s risk profile. The algorithm is designed to improve its accuracy over time and can capture complex non-linear relationships between variables that may not be apparent to humans or to standard statistical techniques.