Machine Learning Predicts Hip Fracture Death

A machine-learning model can accurately predict the risk of death after hip fracture using information gleaned from basic blood and lab tests and standard demographic data, research indicates.

LightGBM by Microsoft was the most accurate of the models assessed, according to the findings published in the Journal of Orthopaedic Research.

LightGBM is a gradient-boosting algorithm: it trains multiple models in sequence, with each new model correcting the errors of its predecessors.
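To make the idea of sequential training concrete, here is a minimal sketch of boosting in plain Python. It is not LightGBM itself, which grows full decision trees with many optimizations; it uses one-split "stumps" on a single feature, and the ages, outcomes, number of rounds, and learning rate are all invented for illustration.

```python
def fit_stump(xs, residuals):
    """Pick the single threshold on xs that best fits the residuals."""
    best = None
    # Excluding the largest value guarantees both sides are non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean


def boost(xs, ys, rounds=5, learning_rate=0.5):
    """Fit stumps in sequence; each one targets the current residuals."""
    preds = [sum(ys) / len(ys)] * len(xs)  # start from the mean outcome
    errors = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
        errors.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return errors


# Hypothetical data: age in years, and 1 = died within a year.
ages = [60, 65, 70, 75, 80, 85, 90, 95]
died = [0, 0, 0, 1, 0, 1, 1, 1]
errors = boost(ages, died)
print(errors)  # mean squared error on the training data after each round
```

The training error never rises from one round to the next, which is the sense in which each model improves on the one before it.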

Age was the strongest predictor of the risk of death at one year, followed in importance by nine blood tests.

“Our models show that certain biomarkers can be particularly useful in characterizing the risk of poor outcomes following hip fractures,” said corresponding author George Asrian, from the University of Pennsylvania.

Over 300,000 hip fractures occur each year in the U.S. alone, accounting for over 40% of fracture-related nursing home admissions. They make up 70% of the direct costs of fracture care, equivalent to an eye-watering $12 billion.

The researchers trained and tested 10 different machine learning classification models using 3751 hip-fracture patient records from two in‐hospital database systems at the Beth Israel Deaconess Medical Center in Boston.

A total of 776 patients died within a year, giving a mortality rate of 20.7% overall and 29.3% for patients aged 80 plus. The five-year mortality rate was 29.6%, while the 10-year mortality rate was 32.0%.

The classification models were trained using the PyCaret module to aid comparison.

They included K Nearest Neighbors Classifier, Random Forest Classifier, Extra Trees Classifier, Logistic Regression Classifier, Gradient Boosting Classifier, Support Vector Machine, AdaBoost Classifier, Linear Discriminant Analysis, Decision Tree Classifier, Naïve Bayes Classifier, and Quadratic Discriminant Analysis from scikit-learn, as well as LightGBM from Microsoft.

Classification models were applied to 156 features, classifying each patient who died within a year of admission as a ‘one’ and those who survived as a ‘zero’.
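The one/zero labeling described above can be sketched as follows. The field names, the example dates, and the choice of a 365-day window are assumptions made for illustration; the article does not say how the researchers encoded dates.

```python
from datetime import date


def one_year_label(admitted, died):
    """Return 1 if the patient died within 365 days of admission, else 0."""
    if died is None:  # no recorded death: treated as a survivor
        return 0
    return 1 if 0 <= (died - admitted).days <= 365 else 0


# Hypothetical patient records.
records = [
    {"admitted": date(2010, 3, 1), "died": date(2010, 9, 15)},
    {"admitted": date(2011, 6, 10), "died": None},
    {"admitted": date(2012, 1, 5), "died": date(2014, 2, 20)},
]
labels = [one_year_label(r["admitted"], r["died"]) for r in records]
print(labels)
```

A patient who died more than a year after admission, like the third record, counts as a "zero" for the one-year model but would become a "one" at a longer horizon.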

The training size was set to 80% of the final data set, comprising 3000 records, with the remaining fifth reserved for testing.
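A minimal sketch of such an 80/20 split in plain Python follows. The shuffling and the seed value are assumptions; the article does not describe how records were assigned to each set. Rounding the training count down reproduces the 3000 and 751 figures reported.

```python
import random

total_records = 3751
n_train = total_records * 4 // 5  # 80%, rounded down
n_test = total_records - n_train

# Shuffle record indices reproducibly, then cut at the 80% mark.
indices = list(range(total_records))
random.Random(42).shuffle(indices)  # seed chosen arbitrarily
train_idx = indices[:n_train]
test_idx = indices[n_train:]

print(n_train, n_test)
```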

Models were evaluated primarily on accuracy and the area under the receiver operating characteristic curve (AUC), which summarizes how well a model separates the two classes and is derived from a plot of the true positive rate against the false positive rate.
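One way to see what the AUC measures: it equals the probability that a randomly chosen patient who died is given a higher risk score than a randomly chosen survivor, with ties counting as half. The sketch below computes it that way in plain Python; the labels and scores are invented, and real toolkits use faster, equivalent methods.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died) and predicted risk scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75: three of the four pairs are ranked correctly
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation.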

LightGBM performed best, with an accuracy of 81%, an AUC of 0.79, sensitivity of 0.34, and specificity of 0.98.
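For readers unfamiliar with these terms, the sketch below shows how accuracy, sensitivity, and specificity are derived from a table of correct and incorrect predictions. The counts are invented for illustration and are not taken from the study.

```python
# Hypothetical confusion-matrix counts (not from the study).
tp = 40   # died, predicted to die
fn = 60   # died, predicted to survive
tn = 380  # survived, predicted to survive
fp = 20   # survived, predicted to die

sensitivity = tp / (tp + fn)                # share of deaths that were caught
specificity = tn / (tn + fp)                # share of survivors correctly cleared
accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all predictions correct

print(sensitivity, specificity, accuracy)
```

Because most patients in such a data set survive, a model can reach high accuracy on the strength of high specificity even when its sensitivity is modest.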

The 10 features of greatest predictive value for one-year mortality were age, glucose, red blood cell distribution width (RDW), mean corpuscular hemoglobin concentration (MCHC), white blood cells (WBCs), urea nitrogen, prothrombin time (PT), platelet count, calcium levels, and partial thromboplastin time (PTT).

LightGBM performed almost as well using just these 10 predictors as it did with all 156 features, losing only marginal accuracy.

It was then trained on datasets covering the same patients but labeled for five- and 10-year mortality, using all 156 features, and evaluated on the test set of 751 records.

The five-year model had an AUC of 0.78 and accuracy of 74% while the 10-year model had an AUC of 0.79 and accuracy of 72%.

The top 10 features for mortality at five years were the same as those at a year, except that calcium and PT dropped out and chloride and phosphate entered the list.

For 10-year mortality, the top 10 features were the same as those for one year, except that PT, PTT, and calcium were replaced by hematocrit, mean corpuscular hemoglobin, and red blood cell counts.

The five- and 10-year prediction models showed minor losses in predictive ability, with corresponding 6.7% and 8.9% decreases in accuracy on the test set compared with the one-year model, and negligible changes to the AUC.

“The attributes that were top 10 in the one-, five-, and 10-year models may be most important in general for predicting mortality in hip fracture patients at diagnosis given the consistency,” the authors noted.

“These variables include age, MCHC, platelet count, RDW, WBCs, urea nitrogen, and glucose.”
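The seven shared variables can be recovered by intersecting the three top-10 lists reported for each time horizon. The short labels below are shorthand chosen for this sketch.

```python
one_year = {
    "age", "glucose", "RDW", "MCHC", "WBC", "urea nitrogen",
    "PT", "platelet count", "calcium", "PTT",
}
# Five years: calcium and PT dropped out; chloride and phosphate entered.
five_year = (one_year - {"calcium", "PT"}) | {"chloride", "phosphate"}
# Ten years: PT, PTT, and calcium replaced by hematocrit, MCH, and RBC count.
ten_year = (one_year - {"PT", "PTT", "calcium"}) | {"hematocrit", "MCH", "RBC count"}

shared = one_year & five_year & ten_year
print(sorted(shared))
```

The intersection contains exactly the seven variables the authors list.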

They added: “Given that the factors used in the LightGBM model are easily available from lab tests that are routinely done on patients, this tool can be put to use as a triage tool for surgeons and hospitalists to use.”
