Secondary tuberculosis infection, illustration
Credit: KATERYNA KON/SCIENCE PHOTO LIBRARY/Getty Images

Researchers have found that an artificial intelligence system is at least as good as human radiologists at identifying tuberculosis from chest X-rays, opening up its use for low-resource countries.

Indeed, the deep learning program was superior in sensitivity and noninferior in specificity in identifying active pulmonary TB in frontal chest radiographs when compared with nine radiologists from India.

The system could have particular value in low-income countries where large-scale screening programs are not always feasible due to cost and radiologist availability.

Simulations revealed that using the deep learning system to identify likely TB-positive chest radiographs for confirmation using nucleic acid amplification testing (NAAT) reduced costs by between 40 and 80 percent per positive patient detected.

“We hope this can be a tool used by non-expert physicians and healthcare workers to screen people en masse and get them to treatment where required without getting specialist doctors, who are in short supply,” said researcher Rory Pilgrim, a product manager at Google Health AI in Mountain View, California.

“We believe we can do this with the people on the ground in a low-cost, high-volume way.”

The research is published in Radiology, a journal of the Radiological Society of North America.

The deep learning system was trained using 165,754 images from 22,284 individuals, nearly all from South Africa, and then tested using data from five countries.

The total test set had 1,236 images, of which 212 were identified as positive for TB based on microbiological tests or NAAT. Each image was scored as TB-positive or TB-negative by 10 radiologists from India and five from the USA, although one of the Indian radiologists was excluded from the analysis because their specificity was much lower than that of the others.

Among the 1,236 test individuals assessed, the deep learning system achieved superior sensitivity to the nine radiologists from India in a prespecified analysis, at 88% versus 75%, with noninferior specificity, at 79% versus 84%.

“What’s especially promising in this study is that we looked at a range of different datasets that reflected the breadth of TB presentation, different equipment and different clinical workflows,” said study co-author Sahar Kazemzadeh, a software engineer at Google Health.

In most of the data sets, the AI system met the thresholds that the World Health Organization set in 2014 as a reasonable requirement for any TB screening test, noted Bram van Ginneken, a professor of medical image analysis at Radboud University Medical Center in Nijmegen, The Netherlands, in an editorial accompanying the study.

Yet, he added: “It is shown that for difficult data sets, such as a mining population, whose radiographs may contain other signs of lung disease, and a subset of subjects who are HIV positive, where TB may occur without typical radiographic abnormalities, both the AI software and the human readers performed much lower.

“In such situations, radiography is simply a poorer test for TB. AI cannot magically change that.”
