Researchers from Florida Atlantic University (FAU)’s College of Engineering and Computer Science have created a multiclass machine learning model that uses microRNA (miRNA) expression profiles to point to highly specific cancer types. Their study, published in the Institute of Electrical and Electronics Engineers’ IEEE Xplore digital library, suggests that miRNAs may be highly specific to particular cancerous tissues and can be strong biomarkers for detection and classification. Because miRNAs can be detected with high accuracy directly from blood, urine, and saliva, they could be a useful basis for diagnostics.
miRNAs are small non-coding ribonucleic acids (RNAs) whose main role is gene regulation. That role includes regulating the formation and development of cancer. Although miRNAs have different expression profiles across different cancer types, cancers can also share common miRNAs. Most currently verified miRNA biomarkers, the authors note, apply to cancers generally rather than to one specific type.
“Our study looks at a group of different cancers to determine if miRNA can be used to differentiate between them, as well as normal tissue samples. Our results show that indeed, these different cancers can be distinguished from each other, as well as from normal tissue,” Oneeb Rehman, corresponding author of the study, told Inside Precision Medicine. He is a Ph.D. candidate in FAU’s Department of Electrical Engineering and Computer Science.
Machine learning methods have been used to develop high-performance pan-cancer classification models and to identify potentially novel miRNA biomarkers for clinical investigation.
For this study, researchers assessed how the top miRNA features selected by machine learning models relate to clinically and biologically verified miRNA biomarkers. They developed Support Vector Machine and Random Forest models for cancer classification and iteratively added cancer classes to these multiclass models. They then looked at the relationship between the relevant miRNAs identified through feature selection and the performance metrics of the classification models across 20 iterations. Each iteration added another primary sample site to the multiclass models, increasing the number of cancer types involved.
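The iterative procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic data in place of real miRNA expression profiles, runs far fewer iterations than the study's 20, and takes Random Forest feature importances as the feature-selection step, which the article does not specify. All sizes and names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical sizes; the study used real expression data and 20 iterations.
N_CLASSES, N_PER_CLASS, N_FEATURES = 6, 40, 50

rng = np.random.default_rng(0)
# Synthetic stand-in for miRNA expression: each class (primary sample site)
# gets its own mean profile plus Gaussian noise.
centers = rng.normal(0.0, 2.0, size=(N_CLASSES, N_FEATURES))
X = np.vstack([c + rng.normal(0.0, 1.0, size=(N_PER_CLASS, N_FEATURES))
               for c in centers])
y = np.repeat(np.arange(N_CLASSES), N_PER_CLASS)


def run_iterations(X, y, n_classes):
    """Train both models on 2, 3, ..., n_classes classes and record metrics."""
    results = []
    for k in range(2, n_classes + 1):
        mask = y < k  # keep only the first k classes for this iteration
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[mask], y[mask], test_size=0.3, stratify=y[mask], random_state=0
        )
        svm = SVC(kernel="linear").fit(X_tr, y_tr)
        rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
        # Assumption: rank features by Random Forest importance and keep 20.
        top20 = np.argsort(rf.feature_importances_)[::-1][:20]
        results.append({
            "n_classes": k,
            "svm_acc": svm.score(X_te, y_te),
            "rf_acc": rf.score(X_te, y_te),
            "top20": top20.tolist(),
        })
    return results


results = run_iterations(X, y, N_CLASSES)
for r in results:
    print(r["n_classes"], round(r["svm_acc"], 3), round(r["rf_acc"], 3))
```

Recording the accuracy and the selected features at every step is what lets the two be compared as the number of classes grows.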
As more cancer types were introduced to the subset, researchers examined how the success metrics changed and how the 20-miRNA signature changed. They also examined the characteristics of the full dataset via principal component analysis, a popular technique for analyzing large datasets containing a high number of dimensions or features.
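For readers unfamiliar with principal component analysis, the idea can be shown in a few lines. The sketch below assumes NumPy and uses random synthetic data rather than the study's dataset; the `pca` helper is a hypothetical name, and the article does not say which software the authors used.

```python
import numpy as np


def pca(X, n_components=2):
    """Project samples onto the top principal components via SVD."""
    X_centered = X - X.mean(axis=0)  # PCA requires mean-centered features
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    scores = X_centered @ Vt[:n_components].T  # sample coordinates
    explained = S**2 / np.sum(S**2)            # variance share per component
    return scores, explained[:n_components]


rng = np.random.default_rng(0)
# Synthetic stand-in: 100 samples, 30 features; the first 50 samples are
# shifted in one feature so that two groups exist in the data.
X = rng.normal(size=(100, 30))
X[:50, 0] += 5.0

scores, ratio = pca(X)
print(scores.shape, ratio)
```

Plotting the two columns of `scores` against each other gives the kind of two-dimensional view in which groups of samples can be compared visually.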
The study provides insights into potential relationships between the overall clinical relevance of a given signature and the success metrics of the models. It also demonstrates the feasibility of using a single multi-tissue miRNA cancer signature for detection of a number of prominent cancers.
Findings showed that as the number of cancer classes increased, the performance metrics decreased, yet the percentage relevance of the miRNA feature selection signature increased slightly before stabilizing. In addition, principal component analysis showed that the non-cancer tissues from all samples had very similar expression patterns when visualized, while all cancerous tissues had unique profiles.
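The article does not define "percentage relevance" precisely. One plausible reading, assumed here, is the share of miRNAs in the selected signature that also appear in a list of verified biomarkers. The function and the placeholder miRNA names below are hypothetical and do not come from the study.

```python
def percent_relevance(signature, verified):
    """Percentage of signature miRNAs that appear in the verified set."""
    if not signature:
        raise ValueError("signature must not be empty")
    hits = [m for m in signature if m in verified]
    return 100.0 * len(hits) / len(signature)


# Placeholder identifiers only; these are not real miRNA names or results.
signature = ["mirna_01", "mirna_02", "mirna_03", "mirna_04"]
verified = {"mirna_01", "mirna_03", "mirna_99"}

print(percent_relevance(signature, verified))  # 2 of 4 match -> 50.0
```

Computing this value once per iteration would give a curve that can be set beside the accuracy curve of the classifiers.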
“Further research can be done regarding tumor stage classification. Early detection in cancer is paramount for positive outcomes; therefore, it is important to see if and how miRNA expressions change across different tumor stages. Specific miRNAs could be narrowed down, which could be potent biomarkers for early cancer stages,” said Rehman.
He added, “It has also been shown that cancer tissue with the same developmental origins have commonality in their miRNA expression profiles, which can also be a potential avenue of research.”