A big part of personalized medicine in cancer is knowing ahead of time if a drug is likely to be effective or not. That’s usually done by identifying actionable genetic mutations. But a team of researchers recently developed a potentially quicker and more consistent tool based on omics data: a machine learning algorithm that ranks drugs based on their anti-proliferative efficacy in cancer cells.
Known as Drug Ranking Using Machine Learning (DRUML), the method was developed at Queen Mary University of London and is based on machine learning analysis of proteomics data from cancer cells.
DRUML is trained on the responses of cancer cells to 412 cancer drugs and is intended to predict which drug is most appropriate for treating a particular cancer. To develop it, the team used proteomics and phosphoproteomics datasets from 48 leukemia, esophagus and liver cancer cell lines, generated by liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS). The analysis covered more than 22,000 phosphorylation sites and about 6,700 proteins.
By training the models on the responses of these cells to 412 cancer drugs listed in drug response repositories, the team enabled DRUML to produce lists of drugs ordered by how effectively they reduce cancer cell growth.
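The ranking step can be pictured with a short Python sketch. It is an illustration only, not the authors' implementation: the function name `rank_drugs`, the drug labels and the scores are all made up, and it assumes that a lower score means stronger predicted inhibition of cell growth.

```python
from typing import Dict, List, Tuple


def rank_drugs(predicted_response: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Return (rank, drug, score) tuples, most effective drug first.

    Assumption (not from the paper): a lower score means stronger
    predicted growth inhibition, so scores are sorted in ascending order.
    """
    ordered = sorted(predicted_response.items(), key=lambda item: item[1])
    return [(rank, drug, score) for rank, (drug, score) in enumerate(ordered, start=1)]


# Made-up predicted scores for one hypothetical cell line.
example_scores = {"drug_A": 0.82, "drug_B": 0.35, "drug_C": 0.57}

ranking = rank_drugs(example_scores)
for rank, drug, score in ranking:
    print(rank, drug, score)
```

In this toy example the ordered list is produced by a plain sort. In DRUML the scores themselves come from models trained on proteomics and phosphoproteomics data, which the sketch does not attempt to reproduce.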
“We observed a remarkably high correlation between the predicted and actual responses within cell models across drugs of diverse mode of action,” the authors write in their paper published today in Nature Communications.
The team then verified the predictive accuracy of the DRUML models using data from six other cell lines obtained from other laboratories, and saw similar results.
“We observed a significant correlation between the DRUML-derived drug response predictions and the actual responses for these six cell lines across drugs with diverse mode of action and developmental phase,” they write.
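As a rough illustration of what comparing predicted and actual responses involves, the Python sketch below computes a Pearson correlation coefficient between two lists of values. The article does not say which correlation measure the authors used, so Pearson is an assumption here, and the numbers are invented for the example.

```python
import math
from typing import Sequence


def pearson_correlation(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Pearson correlation between predicted and measured drug responses."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_p = sum(predicted) / len(predicted)
    mean_a = sum(actual) / len(actual)

    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)

    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_p * var_a)


# Invented values: one predicted and one measured response per drug.
predicted_responses = [0.20, 0.40, 0.60, 0.80]
measured_responses = [0.25, 0.35, 0.70, 0.75]

print(pearson_correlation(predicted_responses, measured_responses))
```

A value close to 1 would mean the predictions track the measured responses closely. A value near 0 would mean they carry little information about them.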
To test its clinical relevance, DRUML was used to predict responses to cytarabine in 36 primary acute myeloid leukemia (AML) samples. The team hypothesized that AML patients predicted by DRUML to be sensitive to cytarabine would show greater overall survival than those predicted to be resistant, and the results supported that hypothesis. Clinically, the median overall survival was 1.1 vs 3.4 years for patients who achieved a complete response to treatment that included cytarabine. In the DRUML sample cohort, the median overall survival was 1.0 and 1.6 years for patients whose samples were predicted to be low and high cytarabine responders, respectively.
“These results therefore indicate that DRUML-predicted drug responses are clinically relevant in this setting,” the authors conclude.
The team also found that DRUML’s markers of drug response were consistent with the drugs’ modes of action, suggesting that these markers reflect the biological mechanisms that determine responses to the profiled drugs.
Using machine learning to predict drug responses in cancer isn’t new. For example, two projects, the Cancer Target Discovery and Development and the Genomics of Drug Sensitivity in Cancer, rely on the approach by associating genomic features, gene expression patterns and copy number alterations with drug sensitivity.
DRUML takes a different approach, using large-scale proteomics and phosphoproteomics data, and it overcomes one of the previous limitations of this technique: the low sample throughput of proteomics and phosphoproteomics by LC-MS/MS. Most proteomics methods also involve comparing proteins after chemical or metabolic labeling, which restricts the number of samples that can be directly compared and used as input for machine learning models.
“Improvements in LC-MS/MS throughput and label-free analysis in tandem, together with the recent availability of systematic drug response profiles for a large number of cell lines and drugs now make feasible the use of proteomics and phosphoproteomics data as the input of predictive models of drug efficacy,” the authors write.
As new drugs become part of the cancer drug arsenal, DRUML could in theory be retrained to cover them.