Deep Learning Predicts Disease-Associated Mutations

A research team led by Hongzhe Sun from the Department of Chemistry at the University of Hong Kong (HKU), in collaboration with researchers including Junwen Wang from Mayo Clinic, Arizona, has used a deep learning approach to predict disease-associated mutations at the metal-binding sites of proteins. Understanding such mutations could facilitate the discovery of new drugs.

This study “… stands for the first deep learning approach for the prediction of disease-associated metal-relevant site mutations in metalloproteins, providing a new platform to tackle human diseases,” the authors write in their paper, which was recently published in Nature Machine Intelligence.

Metal ions play pivotal structural and functional roles in the (patho)physiology of human biological systems. Metals such as zinc, iron and copper are essential for all life, and their concentrations in cells must be strictly regulated. Mutations at metal-binding sites may functionally disrupt metalloproteins, initiating severe diseases; however, until now there appeared to be no effective approach for predicting such mutations. This study suggests that the approach developed by Sun and his collaborators accurately predicts disease-associated mutations at the metal-binding sites of metalloproteins.

The team first integrated omics data from different databases to build a comprehensive training dataset. Statistics from the collected data showed that different metals have different disease associations. Mutations in zinc-binding sites play a major role in breast, liver, kidney, immune system and prostate diseases. By contrast, mutations in calcium- and magnesium-binding sites are associated with muscular and immune system diseases, respectively. Mutations in iron-binding sites are more often associated with metabolic diseases. Furthermore, mutations in manganese- and copper-binding sites are associated with cardiovascular diseases, and those in copper-binding sites are also associated with nervous system diseases.

Sun and his collaborators used a novel approach to extract spatial features from the metal-binding sites using an energy-based affinity grid map. These spatial features were then merged with physicochemical sequential features to train the model. The final results show that adding the spatial features improved prediction performance, yielding an area under the curve (AUC) of 0.90 and an accuracy of 0.82. Given the limited availability of advanced techniques and platforms in the field of metallomics and metalloproteins, their deep learning approach offers a method to integrate experimental data with bioinformatics analysis. This technique could help scientists predict DNA mutations associated with diseases such as cancer, cardiovascular diseases and genetic disorders.
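For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how AUC and accuracy are typically computed for a binary classifier. It is not the authors' code: the function names `roc_auc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made purely for illustration.

```python
# Illustrative only: toy data and helper names are hypothetical,
# not taken from the study.

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive sample scores higher than a randomly
    chosen negative sample (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # 1 = disease-associated mutation, 0 = benign (made-up example)
    toy_labels = [1, 1, 1, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
    print(f"AUC      = {roc_auc(toy_labels, toy_scores):.2f}")
    print(f"accuracy = {accuracy(toy_labels, toy_scores):.2f}")
```

AUC summarizes how well the model ranks disease-associated mutations above benign ones across all thresholds, whereas accuracy depends on the single threshold chosen, which is why the two numbers can differ.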

“Machine learning and AI play important roles in the current biological and chemical science. In my group we worked on metals in biology and medicine using integrative omics approach including metallomics and metalloproteomics, and we already produced a large amount of valuable data using in vivo/vitro experiments,” says Sun in the group’s press release. He adds, “We now develop an artificial intelligence approach based on deep learning to turn these raw data to valuable knowledge, leading to uncover secrets behind the diseases and to fight with them. I believe this novel deep learning approach can be used in other projects, which is undergoing in our laboratory.”