Johns Hopkins researchers have devised their own bioinformatics software to evaluate how well current strategies identify cancer-promoting mutations and distinguish them from benign mutations in cancer cells. The new study, published recently in PNAS in an article titled “Evaluating the evaluation of cancer driver genes,” underscores the importance of assessing how well these genetic research methods work, given their potential value in developing treatments that keep cancer in check.
“Identifying the genes that cause cancer when altered is often challenging, but is critical for directing research along the most fruitful course,” explained Howard Hughes Investigator Bert Vogelstein, M.D., a professor at the Johns Hopkins Kimmel Cancer Center and co-author of the study. “This paper establishes novel ways to judge the techniques used to identify true cancer-causing genes and should considerably facilitate advances in this field in the future.”
To start tackling these bioinformatic issues, a team of Johns Hopkins computational scientists and cancer experts devised new software to assess current analytical strategies for identifying cancer-producing mutations. The Hopkins team also addressed the lack of a widely accepted consensus on what qualifies as a cancer driver gene.
“Because there is no generally accepted gold standard of driver genes, it has been difficult to quantitatively compare these methods,” the authors wrote.
“People have lists of what they consider to be cancer driver genes, but there's no official reference guide,” added lead study author Collin Tokheim, a doctoral student in the Institute of Computational Medicine and the Department of Biomedical Engineering at Johns Hopkins.
Nevertheless, Tokheim and his colleagues were able to develop a machine-learning-based method for driver gene prediction and a framework for evaluating and comparing other prediction methods. For the study, this evaluation tool was applied to eight existing cancer driver gene prediction methods.
“We present a machine-learning–based method for driver gene prediction and a protocol to evaluate and compare prediction methods,” the authors penned. “We establish an evaluation framework that can be applied to driver gene prediction methods. We used this framework to compare the performance of eight such methods. One of these methods, described here, incorporated a machine-learning–based ratiometric approach. We show that the driver genes predicted by each of the eight methods vary widely. Moreover, the P values reported by several of the methods were inconsistent with the uniform values expected, thus calling into question the assumptions that were used to generate them.”
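The point about uniform P values can be made concrete. When no true signal is present, a well-calibrated method should report P values spread evenly between 0 and 1. The following is a minimal sketch of one generic way to check this, using the Kolmogorov-Smirnov distance from the uniform distribution. It is not the evaluation protocol from the paper; the function names and the two simulated "methods" are hypothetical and serve only as illustration.

```python
import random


def ks_statistic_uniform(p_values):
    """Kolmogorov-Smirnov distance between the empirical CDF of
    p_values and the Uniform(0, 1) CDF. Values near 0 mean the
    P values look uniform; larger values mean they do not."""
    ps = sorted(p_values)
    n = len(ps)
    if n == 0:
        raise ValueError("need at least one P value")
    distance = 0.0
    for i, p in enumerate(ps):
        # The empirical CDF jumps from i/n to (i+1)/n at p;
        # the uniform CDF at p is simply p.
        distance = max(distance, (i + 1) / n - p, p - i / n)
    return distance


def fraction_below(p_values, alpha=0.05):
    """Share of P values at or below alpha (about alpha if uniform)."""
    return sum(p <= alpha for p in p_values) / len(p_values)


rng = random.Random(0)

# Hypothetical well-calibrated method: P values are uniform under the null.
calibrated = [rng.random() for _ in range(5000)]

# Hypothetical miscalibrated method: P values are skewed toward zero
# (squaring a uniform draw is an arbitrary way to produce that skew).
inflated = [rng.random() ** 2 for _ in range(5000)]

for name, ps in [("calibrated", calibrated), ("inflated", inflated)]:
    print(name, round(ks_statistic_uniform(ps), 3), round(fraction_below(ps), 3))
```

In this toy setup, the skewed P values sit far from the uniform line and fall below 0.05 much more often than 5 percent of the time, which is the kind of inconsistency the authors describe.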
“Our results suggest that most current methods do not adequately account for heterogeneity in the number of mutations expected by chance and consequently yield many false-positive calls, particularly in cancers with high mutation rate,” the authors continued.
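A small simulation can illustrate why ignoring this heterogeneity produces false positives. The sketch below is a toy model, not the authors' method. It assumes each gene has its own background mutation rate, drawn here from an exponential distribution (an arbitrary choice), and that every mutation is a passenger, so any gene flagged as a driver is a false positive. A hypothetical "naive" test assumes one shared rate for all genes, while a hypothetical "rate-aware" test is given each gene's true rate.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), i.e. a one-sided P value."""
    term = math.exp(-lam)  # P(X = 0)
    cdf_below_k = 0.0
    for j in range(k):
        cdf_below_k += term
        term *= lam / (j + 1)  # P(X = j + 1) from P(X = j)
    return max(0.0, 1.0 - cdf_below_k)


def false_positive_rates(n_genes=5000, mean_rate=5.0, alpha=0.05, seed=0):
    """Simulate driver-free genes and return the fraction called
    significant by the naive test and by the rate-aware test."""
    rng = random.Random(seed)
    naive_calls = aware_calls = 0
    for _ in range(n_genes):
        # Gene-specific background rate (assumed exponential, mean mean_rate).
        lam = rng.gammavariate(1.0, mean_rate)
        count = sample_poisson(lam, rng)  # passenger mutations only
        # Naive test: pretends every gene mutates at the average rate.
        naive_calls += poisson_upper_tail(count, mean_rate) <= alpha
        # Rate-aware test: uses the gene's own background rate.
        aware_calls += poisson_upper_tail(count, lam) <= alpha
    return naive_calls / n_genes, aware_calls / n_genes


naive_rate, aware_rate = false_positive_rates()
print(f"naive false-positive rate:      {naive_rate:.3f}")
print(f"rate-aware false-positive rate: {aware_rate:.3f}")
```

Under these assumptions, the naive test flags well over 5 percent of driver-free genes, because genes with high background rates accumulate many mutations by chance alone. The rate-aware test stays at or below the nominal level.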
While a study of this nature may lead to more precise ways to target cancer cells, the current state of affairs for bioinformatics is not entirely reassuring. “These methods still need to get better,” Tokheim concluded. “We're sharing our methodology publicly, and it should help others to improve their systems for identifying cancer driver genes.”