A CNS tumor diagnosis is one of the grimmest a patient can receive. Can artificial intelligence turn this around? Can radiologists learn to use such tools to diagnose these tumors more rapidly and accurately? Do the tools offer the information they need? Do they trust them? Do the tools actually improve outcomes?
According to recent research, the answer to all these questions is “yes,” with important caveats. Radiologists can learn from high-performing machine learning (ML) systems to improve their own performance. However, a lack of explainability in the ML output can stymie that improvement.
In other words, if the tools are good, radiologists will use them and improve their work. Interestingly, they can even learn something from lower-performing tools.
Gliomas, tumors that arise from the glial cells of the brain or CNS, are complicated to treat and can be very deadly. Glioblastoma, one of the most common types of brain cancer, has a dismal 6.9% five-year survival rate.
Enter digital pathology. One of the fastest-growing markets in medicine, it is already estimated to be worth over $1B. AI algorithms, including machine learning, are used for the rapid detection, segmentation, registration, processing, and classification of digitized pathology images. But they also raise significant questions about workflow, data, and the role of the radiologist.
An international team of researchers from TU Darmstadt, the University of Cambridge, Merck, and the Klinikum rechts der Isar of TU Munich studied how software systems collect, process, and evaluate task-specific information to support the work of radiologists.
Their work analyzes the influence of ML systems on human learning. It also shows how important it is for end users that the results of ML methods be comprehensible. The team says these insights are relevant not only for medical diagnosis in radiology but for everyone who becomes a reviewer of ML output through the daily use of AI tools such as ChatGPT.
The research project, led by TU Darmstadt researchers Sara Ellenrieder and Peter Buxmann, investigated the use of ML-based decision support systems in radiology, specifically in the manual segmentation of brain tumors in MRI images. The focus was on how radiologists can learn from these systems to improve their performance and decision-making confidence. The authors compared ML systems of different performance levels and analyzed how explanations of the ML output improved the radiologists’ understanding of the results. The aim was to find out how radiologists can benefit from these systems in the long term and use them safely.
In the experiment, the radiologists performed 690 manual segmentations of brain tumors. Physicians were asked to segment tumors in MRI images before and after receiving ML-based decision support. Different groups were provided with ML systems of varying performance or explainability. In addition to collecting quantitative performance data during the experiment, the researchers also gathered qualitative data through “think-aloud” protocols and subsequent interviews.
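The article does not say which metric the researchers used to quantify segmentation performance, but a standard way to score a manual segmentation against a reference mask is the Dice similarity coefficient. The sketch below is purely illustrative (the function name and toy masks are not from the study); it shows how overlap between two binary tumor masks is reduced to a single score between 0 and 1.

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks (1.0 = perfect overlap)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    # Define two empty masks as a perfect match to avoid division by zero.
    return 1.0 if denom == 0 else 2.0 * intersection / denom

# Toy example: a hand-drawn mask vs. a reference tumor mask.
truth = np.zeros((8, 8), dtype=bool)
truth[2:6, 2:6] = True   # 16-pixel "tumor"
pred = np.zeros((8, 8), dtype=bool)
pred[3:7, 3:7] = True    # offset segmentation, overlapping in a 3x3 region
print(round(dice_score(pred, truth), 4))  # 2*9 / (16+16) = 0.5625
```

A score of 1.0 means the two masks coincide exactly; comparing scores before and after ML support is one straightforward way to measure the kind of learning effect the study describes.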
Radiologists, the results show, can learn from the information provided by high-performing ML systems: through interaction, they improved their performance. However, the study also shows that a lack of explainability in low-performing systems can lead to a decline in radiologists’ performance. Providing explanations of the ML output not only improved the radiologists’ learning outcomes but also prevented them from learning false information. In fact, some physicians were even able to learn from the mistakes made by low-performing but explainable systems.
“The future of human-AI collaboration lies in the development of explainable and transparent AI systems that enable end users in particular to learn from the systems and make better decisions in the long term,” said Buxmann.