Large Bone Marrow Data Set Helps Train AI to Identify Blood Disease

Human bone marrow under the microscope. [Ivan Mattioli/Getty Images]

The method of diagnosing blood disorders has remained essentially unchanged for a century. Pathologists use microscopes to manually analyze and classify samples of bone marrow cells in search of a smattering of important cells that indicate the presence of disease. This method is both laborious and time-consuming.

Now, researchers from Helmholtz Munich, together with the LMU University Hospital Munich, the MLL Munich Leukemia Lab (one of the largest diagnostic providers in this field worldwide), and the Fraunhofer Institute for Integrated Circuits, have used a large dataset of microscopy images to train a neural network that identifies bone marrow cells with high accuracy.

Their work, published in the journal Blood, allowed the team to train high-quality classifiers of leukocyte cytomorphology that identify a wide range of diagnostically relevant cell species “with high precision and recall.” Their convolutional neural networks outcompete previous feature-based approaches, they added, and provide a proof-of-concept for the classification problem of single bone marrow cells.
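For readers unfamiliar with the two metrics: precision is the share of cells assigned to a class that truly belong to it, and recall is the share of cells truly in a class that the classifier finds. The short Python sketch below shows how both could be computed per cell class. It is only an illustration, not the researchers' evaluation code; the function name, the class names, and the labels are hypothetical and were invented for this example.

```python
from collections import Counter


def per_class_precision_recall(true_labels, predicted_labels):
    """Return {class: (precision, recall)} for single-label predictions."""
    true_counts = Counter(true_labels)            # cells that really are in each class
    predicted_counts = Counter(predicted_labels)  # cells the classifier put in each class
    true_positives = Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            true_positives[truth] += 1

    scores = {}
    for cls in sorted(set(true_labels) | set(predicted_labels)):
        tp = true_positives[cls]
        # Guard against division by zero for classes never predicted or never present.
        precision = tp / predicted_counts[cls] if predicted_counts[cls] else 0.0
        recall = tp / true_counts[cls] if true_counts[cls] else 0.0
        scores[cls] = (precision, recall)
    return scores


# Hypothetical labels, invented for illustration only (not data from the study).
truth = ["lymphocyte", "lymphocyte", "blast", "blast", "blast", "plasma_cell"]
guess = ["lymphocyte", "blast", "blast", "blast", "lymphocyte", "plasma_cell"]
scores = per_class_precision_recall(truth, guess)

for cell_class, (precision, recall) in scores.items():
    print(f"{cell_class}: precision={precision:.2f}, recall={recall:.2f}")
```

Reporting both numbers per class matters in this setting because a rare but diagnostically important cell type can be missed often (low recall) even when overall accuracy looks high.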

The Helmholtz Munich researchers developed the largest open-access database of microscopic images of bone marrow cells to date. The database consists of more than 170,000 single-cell images from more than 900 patients with various blood diseases.

“On top of our database, we have developed a neural network that outperforms previous machine learning algorithms for cell classification in terms of accuracy, but also in terms of generalizability,” said Christian Matek, PhD, a postdoctoral researcher at the Helmholtz Zentrum München. Convolutional neural networks, the type used here, are deep learning models designed specifically to process images. “The analysis of bone marrow cells has not yet been performed with such advanced neural networks,” Matek explained, “which is also due to the fact that high-quality, public datasets have not been available until now.”

The researchers aim to further expand their bone marrow cell database to capture a broader range of findings and to prospectively validate their model. “The database and the model are freely available for research and training purposes—to educate professionals or as a reference for further AI-based approaches, e.g., in blood cancer diagnostics,” said Carsten Marr, PhD, deputy head of institute and research group leader at the Institute of Computational Biology at Helmholtz Munich.