Researchers from Brigham and Women’s Hospital have developed a deep learning-based algorithm that can accurately predict the origin of primary tumors from imaging data.
Although the origin of most primary tumors is known, in around 1-2% of cancer cases, where multiple tumors are present at diagnosis, it is difficult to determine where the cancer originated. This is problematic because most treatments target the primary tumor; these patients often have a poor prognosis and survive for only a few months.
Normally, the only option for such patients is to undergo multiple diagnostic biopsies and lab tests, which can delay crucial treatments. To simplify diagnosis in these cases, Faisal Mahmood, Ph.D., a research group leader at the Brigham and an assistant professor at Harvard Medical School, and his team developed an artificial intelligence-based algorithm to help locate the origin of primary tumors.
Even in low-resource settings, it is common practice for histology slides of tumor samples to be produced for diagnostic purposes. Mahmood and colleagues used these images to train their deep learning algorithm to accurately predict the location of a patient’s primary tumor. Deep learning is a form of AI that learns and improves its performance when exposed to data relevant to its purpose, in this case the slide images.
“Almost every patient that has a cancer diagnosis has a histology slide, which has been the diagnostic standard for over a hundred years,” says Mahmood, who is lead author on the paper describing the work, which is published in the journal Nature.
“Our work provides a way to leverage universally acquired data and the power of artificial intelligence to improve diagnosis for these complicated cases that typically require extensive diagnostic work-ups.”
The Tumor Origin Assessment via Deep Learning (TOAD) algorithm developed by Mahmood and his team is designed to identify whether a tumor is primary or metastatic and to predict where it originated in the body.
Before testing it on patient data, the researchers first trained the model using slide images from 22,000 patients with cancer, followed by images from 6,500 cases with a known location of their primary tumor. They also ran images from increasingly complex metastatic cases through the algorithm to ensure it was adequately trained to detect the location of primary tumors in patients where this is unknown.
The model identified the correct primary cancer 83% of the time in the group where the primary tumor origin was known, and the correct diagnosis was among the algorithm’s top three predictions 96% of the time.
In 317 cancer cases where the primary tumor location was unknown, the algorithm’s top prediction agreed with a differential diagnosis given by a pathologist 61% of the time, and that diagnosis was among the algorithm’s top three predictions 82% of the time.
The accuracy of the system was comparable to that reported in some recent research using genomic data to predict tumor origins, but as the researchers point out, genomic data is currently available in only some clinical settings.
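The difference between the "top prediction" and "top three" figures comes down to how a hit is counted. The short Python sketch below illustrates the general top-k accuracy metric on invented toy data; the function name, tumor sites, and rankings are hypothetical and are not taken from the study or its code.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Return the fraction of cases whose true label is among the first k ranked predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("each case needs exactly one true label")
    if not true_labels:
        raise ValueError("at least one case is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical toy data: each row is one case, with the model's guesses
# ordered from most to least likely. These are not results from the study.
ranked = [
    ["lung", "breast", "colorectal"],
    ["breast", "ovarian", "lung"],
    ["colorectal", "gastric", "pancreatic"],
    ["renal", "liver", "lung"],
]
truth = ["lung", "ovarian", "pancreatic", "thyroid"]

top1 = top_k_accuracy(ranked, truth, k=1)  # only the first guess counts
top3 = top_k_accuracy(ranked, truth, k=3)  # any of the first three guesses counts

print(f"top-1 accuracy: {top1:.2f}")
print(f"top-3 accuracy: {top3:.2f}")
```

In this toy set only one of four cases is right on the first guess, while three of four have the correct site somewhere in the top three, which is why a top-three figure is always at least as high as the top-one figure.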
“The top predictions from the model can accelerate diagnosis and subsequent treatment by reducing the number of ancillary tests that need to be ordered, reducing additional tissue sampling, and the overall time required to diagnose patients, which can be long and stressful,” Mahmood said.
“Top-three predictions can be used to guide pathologists’ next steps, and in low-resource settings where pathology expertise may not be available the top prediction could potentially be used to assign a differential diagnosis. This is only the first step in using whole-slide images for AI-assisted cancer origin prediction, and it’s a very exciting area with the potential to standardize and improve the diagnostic process.”