
Researchers from Mass General Brigham, reporting in npj Digital Medicine, detail their development of a generative artificial intelligence (AI) tool that can identify social determinants of health (SDoH) from a doctor's notes. The new tool, built by fine-tuning a large language model (LLM), identified 93.8% of patients with an adverse SDoH, whereas diagnostic codes captured SDoH information for only two percent of those patients. The fine-tuned model also proved less prone to bias than generalist models such as GPT-4.

The researchers say the new tool can help meet the goal of seamlessly identifying more patients who could benefit from social work and other health support resources, while also shining a spotlight on the under-documented impact of SDoH on so many patients.

“Algorithms that can pass major medical exams have received a lot of attention, but this is not what doctors need in the clinic to help take better care of patients each day. Algorithms that can notice things that doctors may miss in the ever-increasing volume of medical records will be more clinically relevant and therefore more powerful for improving health,” said corresponding author Danielle Bitterman, MD, a faculty member in the Artificial Intelligence in Medicine (AIM) Program at Mass General Brigham.

Social determinants of health, including where a patient lives, employment status, food insecurity, and distance from appropriate healthcare resources, are now recognized as affecting a patient's health and healthcare outcomes. While some clinicians may add SDoH information to a patient's electronic health record (EHR), it is not currently organized in a systematic way within the record.

To create the new generative AI tool, the Mass General Brigham team manually reviewed 800 clinician notes from 770 cancer patients who had received radiotherapy in the Department of Radiation Oncology at Brigham and Women's Hospital. Using a predetermined list of six SDoH (employment status, housing, transportation, parental status, relationships, and presence or absence of social support), they tagged any sentences that referenced one or more of these.
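For a concrete picture, here is a minimal sketch of how such a sentence-level annotation scheme might be represented in code. The label names, field names, and data layout are illustrative assumptions, not the study's actual format:

```python
# Hypothetical sketch of a sentence-level SDoH annotation scheme.
# The six categories mirror those named in the study; the layout
# and field names are illustrative assumptions, not the authors' format.

SDOH_LABELS = [
    "EMPLOYMENT",      # employment status
    "HOUSING",         # housing issues
    "TRANSPORTATION",  # transportation access
    "PARENT",          # parental status
    "RELATIONSHIP",    # relationship status
    "SOCIAL_SUPPORT",  # presence or absence of social support
]

# Each annotated example pairs a sentence from a clinician note with
# zero or more SDoH labels; most sentences carry no label at all.
annotated_examples = [
    {
        "sentence": "Patient lives alone and has no transportation to appointments.",
        "labels": ["SOCIAL_SUPPORT", "TRANSPORTATION"],
    },
    {
        "sentence": "Blood pressure stable on current regimen.",
        "labels": [],  # no SDoH mention, which is the common case
    },
]
```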

The investigators then used this dataset to train the language models to identify any references to SDoH found in clinician notes. To test the newly developed model, the team used 400 additional clinical notes from patients treated with immunotherapy at Dana-Farber Cancer Institute and from patients admitted to the critical care units at Beth Israel Deaconess Medical Center.
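One plausible way to frame this training step, sketched below, is as a text-to-text task using the Hugging Face transformers library: the model reads a sentence and emits the matching SDoH labels (or "none"). The prompt wording, hyperparameters, and toy data here are assumptions; the paper's exact setup may differ:

```python
# Illustrative sketch of fine-tuning Flan-T5 for sentence-level SDoH
# tagging as a text-to-text task. Prompt format, hyperparameters, and
# the toy examples are assumptions, not the study's actual configuration.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM, AutoTokenizer,
    DataCollatorForSeq2Seq, Seq2SeqTrainer, Seq2SeqTrainingArguments,
)

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy stand-in for the annotated notes; the real training data would
# come from the 800 manually reviewed clinician notes described above.
examples = [
    {"sentence": "Patient lives alone with no family nearby.",
     "target": "SOCIAL_SUPPORT"},
    {"sentence": "Vitals stable, continue current medications.",
     "target": "none"},
]

def preprocess(example):
    # Input: a prompted clinical sentence; target: the SDoH label text.
    enc = tokenizer("List the SDoH mentioned: " + example["sentence"],
                    truncation=True, max_length=256)
    enc["labels"] = tokenizer(example["target"], truncation=True,
                              max_length=32)["input_ids"]
    return enc

dataset = Dataset.from_list(examples).map(
    preprocess, remove_columns=["sentence", "target"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="sdoh-flan-t5",
                                  learning_rate=3e-4,
                                  per_device_train_batch_size=8,
                                  num_train_epochs=5),
    train_dataset=dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```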

From this work the researchers found that fine-tuned LMs, notably Flan-T5 models, consistently identified the rare references to SDoH in the doctors' notes. They also found the models' learning capacity was limited by the rarity of SDoH mentions in the original training set, where only three percent of sentences in the notes contained any reference to an identified SDoH. To improve performance, the investigators employed ChatGPT to produce an additional 900 synthetic example sentences indicating SDoH and used these as a second training set.
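A data-augmentation step along these lines could look like the following sketch, which uses the OpenAI Python client to request synthetic SDoH sentences. The prompt wording and model name are assumptions; the study's actual prompts are not described here:

```python
# Hypothetical sketch of generating synthetic SDoH sentences to augment
# the scarce real examples. Prompt wording and model choice are
# illustrative assumptions, not the study's actual protocol.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def synthesize(label: str, n: int = 5) -> list[str]:
    """Ask the model for n sentences a clinician might write that
    imply the given SDoH category (e.g., HOUSING)."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": (
                f"Write {n} short, de-identified sentences, one per line, "
                f"that a clinician might put in a note indicating a "
                f"patient's {label.lower().replace('_', ' ')} situation."
            ),
        }],
    )
    return response.choices[0].message.content.strip().splitlines()

# Augment the rare categories, then fold the synthetic sentences into
# the fine-tuning set alongside the real annotated examples.
synthetic = {lab: synthesize(lab) for lab in ["HOUSING", "TRANSPORTATION"]}
```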

The research team was keenly aware that some generative AI models developed for healthcare have been shown to perpetuate bias, which can widen existing health disparities. To assess their new tool, the Mass General Brigham researchers compared its performance against OpenAI's GPT-4 and found that their tool was less likely to change its identification of an SDoH based on a patient's race/ethnicity or gender.
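One common way to probe this kind of demographic sensitivity, sketched below, is to swap gender or race/ethnicity terms in a sentence and measure how often the model's prediction flips. The swap lists and the predict function here are illustrative assumptions, not the study's evaluation protocol:

```python
# Illustrative sketch of a demographic-perturbation bias probe: swap
# gendered and race/ethnicity terms, then check whether the model's
# SDoH prediction changes. Swap lists are simplified and one-directional
# in places; `predict` is any callable mapping a sentence to a label.

SWAPS = {
    "he": "she", "his": "her", "him": "her",
    "White": "Black", "Black": "White",
}

def perturb(sentence: str) -> str:
    # Naive token-level swap; real probes would handle casing and
    # multi-word demographic phrases.
    return " ".join(SWAPS.get(tok, tok) for tok in sentence.split())

def prediction_flip_rate(predict, sentences) -> float:
    """Fraction of sentences whose predicted SDoH label changes after
    the demographic perturbation. Lower means less demographic bias."""
    flips = sum(predict(s) != predict(perturb(s)) for s in sentences)
    return flips / len(sentences)
```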

“If we don’t monitor algorithmic bias when we develop and implement large language models, we could make existing health disparities much worse than they currently are,” Bitterman said. “This study demonstrated that fine-tuning LMs may be a strategy to reduce algorithmic bias, but more research is needed in this area.”
