A retrospective study led by researchers at Washington University School of Medicine in St. Louis and Whiterabbit.ai showed that an artificial intelligence (AI)-based protocol theoretically reduced the number of false positive results from routine mammograms. The study is published in Radiology: Artificial Intelligence.
Acknowledging that many AI-based studies in breast cancer focus on improving cancer detection, the team instead created an algorithm designed to reduce the number of callbacks, biopsies, and, ultimately, false positive results from routine mammograms.
“One of the big concerns in breast imaging is that in trying to find more cancers, one also ends up with a lot more false positives, and those false positives can cause a lot of challenges,” explains co-author Richard Wahl, of Washington University. “These challenges include anxiety for women, biopsies that do not show cancer, costs, delays, and then the breasts end up with scars that can be hard to sort out on subsequent mammograms. Biopsies can leave structural alterations that are there forever, increasing the difficulty of reading the scans around them.”
The AI algorithm was trained on 123,248 2D digital mammograms containing 6,161 cancers from Washington University. The researchers then ran a simulation of three groups of patient data: two from the US (including one set of patients from Washington University who were not part of the training), and one group from the U.K. In total, the simulation study included 14,831 screening mammography examinations, including 1,026 cancers.
First, the researchers figured out what the doctors did: how many patients were called back for secondary screening and biopsies, the results of those tests, and the final determination in each case. Then, they applied AI to the datasets to see what would have been different if AI had been used to remove negative mammograms in the initial assessments and physicians had followed standard diagnostic procedures to evaluate the rest.
The simulation was designed to show what would have happened if all of the very low-risk mammograms had been taken off radiologists’ plates, freeing the doctors to concentrate on the more questionable scans. It revealed that fewer people would have been called back for additional testing but that the same number of cancer cases would have been detected.
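The counterfactual logic described above — replay the recorded screening outcomes, but assume exams the AI scores as very low risk never reach a reader — can be sketched in a few lines. This is an illustrative reconstruction, not the study’s actual code: the field names, the 0-to-1 suspicion score, and the threshold are all assumptions.

```python
# Hypothetical sketch of the triage simulation. An exam removed by the AI
# generates no callback (so false positives drop), but any cancer among the
# removed exams would have been missed (the study reports none were).

def simulate_triage(exams, score, threshold):
    """Replay screening outcomes assuming the AI rules out low-risk exams.

    exams: list of dicts with 'recalled' (radiologist callback, bool)
           and 'cancer' (final determination, bool).
    score: function mapping an exam to an assumed 0-1 suspicion score.
    threshold: exams scoring below this are ruled negative without review.
    """
    baseline_fp = sum(e["recalled"] and not e["cancer"] for e in exams)
    sim_fp = sum(
        e["recalled"] and not e["cancer"]
        for e in exams
        if score(e) >= threshold  # low-risk exams never reach a reader
    )
    total_cancers = sum(e["cancer"] for e in exams)
    cancers_found = sum(e["cancer"] and score(e) >= threshold for e in exams)
    return {
        "fp_reduction": 1 - sim_fp / baseline_fp if baseline_fp else 0.0,
        "missed_cancers": total_cancers - cancers_found,
    }
```

On toy data — two false positive callbacks, one of which the AI scores as very low risk — the sketch reports a 50% reduction in false positive callbacks with no missed cancers, mirroring the shape of the study’s result.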
In the Washington University data set, false positive callbacks would have been reduced by 31%; in the second US data set, by 12%; and in the UK group, by 17%. There was no decline in the cancer detection rate in any group.
“Hypothetically, if this algorithm had been used to screen all of these cases, 42% of the Washington University cases would have been negative, 20% in the second data set, and 37% in the U.K. group,” adds Wahl. “So it is a significant number of cases that would have been classified as extremely unlikely to have breast cancer.” The researchers found some differences according to demographics, and even according to the type of mammogram machine used, but no conclusions could be drawn from this information. “But it does appear from these first analyses that the algorithm is generalizable to patients from outside of the U.S.”
Many approaches to breast cancer detection focus on finding more cancers, or on how an algorithm performs relative to the radiologist. “But our approach was that the AI algorithm would, if it were negative, mean that there is really a very, very, very low probability of cancer being present at least until the next mammogram,” says Wahl. The team suggests that having the AI tool rule out these lowest-risk cases frees radiologists to focus on interpreting more complicated mammograms.
Moving ahead, the team ultimately hopes to refine the algorithm further and evaluate it prospectively.