Scientists are applying a common understanding of cancer cells—the fact that most cancer cells are aneuploid—to large-scale RNA-sequencing to better distinguish them from normal cells within a tumor.
The tool, called CopyKAT (copy number karyotyping of aneuploid tumors), is a computational method for detecting cancer cells within the large, complex data sets that RNA sequencing produces.
As a transcriptome-wide analysis tool, single-cell RNA sequencing provides gene expression information from thousands of cells. The result is a massive data set that requires specialized algorithms to parse. Scientists at the University of Texas MD Anderson Cancer Center developed CopyKAT to find a possible genetic fingerprint of cancer cells amidst all that information. This is especially important because most tumor samples include a mix of cell types.
“The underlying logic of calculating DNA copy numbers from RNAseq data is that gene expression levels of many adjacent genes can be dosed by genomic DNA copy numbers in that region,” according to Ruli Gao, PhD, CopyKAT’s developer, now assistant professor at Houston Methodist Research Institute. “The rationale for predicting tumor/normal cell states is that aneuploidy is common in human cancers (90%). Cells with extensive genome-wide copy number aberrations (aneuploidy) are considered as tumor cells, whereas stromal normal cells and immune cells often have 2N diploid or near-diploid copy number profiles.”
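The rationale Gao describes can be illustrated with a minimal Python sketch. It is not CopyKAT's actual implementation: it assumes each cell already has a copy number profile expressed as log2 ratios against a diploid baseline, and the function names and both thresholds (`deviation`, `min_fraction`) are hypothetical values chosen only for illustration.

```python
def aneuploid_fraction(log_ratios, deviation=0.2):
    """Fraction of genomic bins whose log2 ratio departs from the diploid baseline (0)."""
    if not log_ratios:
        raise ValueError("profile is empty")
    altered = sum(1 for r in log_ratios if abs(r) > deviation)
    return altered / len(log_ratios)


def classify_cells(profiles, deviation=0.2, min_fraction=0.1):
    """Label each cell 'aneuploid' or 'diploid' from its copy number profile.

    Both thresholds are illustrative assumptions, not published parameters.
    """
    labels = {}
    for cell, log_ratios in profiles.items():
        frac = aneuploid_fraction(log_ratios, deviation)
        labels[cell] = "aneuploid" if frac >= min_fraction else "diploid"
    return labels


# Toy profiles: one flat (near-diploid) cell and one with broad gains and losses.
profiles = {
    "cell_A": [0.02, -0.05, 0.01, 0.03, -0.02, 0.00, 0.04, -0.01],
    "cell_B": [0.55, 0.60, 0.02, -0.70, -0.65, 0.01, 0.50, 0.03],
}
labels = classify_cells(profiles)
print(labels)  # {'cell_A': 'diploid', 'cell_B': 'aneuploid'}
```

In this toy example the cell with a flat profile is treated as a normal stromal or immune cell, while the cell with alterations across most bins is treated as a tumor cell.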
The goal was to estimate genomic copy number profiles at an average genomic resolution of 5 Mb from read depth in high-throughput single-cell RNA sequencing data.
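To make the idea of a 5 Mb resolution concrete, the sketch below averages the expression of neighboring genes in fixed 5 Mb windows and compares a cell against a diploid reference. This is a simplification under stated assumptions: fixed windows stand in for whatever segmentation the real tool performs, and the function names, input layout, and toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

BIN_SIZE = 5_000_000  # 5 Mb, the average resolution quoted in the article


def bin_expression(genes, bin_size=BIN_SIZE):
    """Average log-expression of the genes that fall in each fixed-width window.

    genes: iterable of (chromosome, start_position_bp, log_expression).
    Returns {(chromosome, bin_index): mean log-expression}.
    """
    buckets = defaultdict(list)
    for chrom, pos, log_expr in genes:
        buckets[(chrom, pos // bin_size)].append(log_expr)
    return {key: mean(values) for key, values in buckets.items()}


def relative_profile(cell_bins, reference_bins):
    """Per-bin difference (log scale) between a cell and a diploid reference."""
    return {key: cell_bins[key] - reference_bins[key]
            for key in cell_bins if key in reference_bins}


# Toy data: the first two genes share the window 0-5 Mb, the third falls in 5-10 Mb.
cell = [("chr1", 1_200_000, 3.0), ("chr1", 4_800_000, 3.5), ("chr1", 7_500_000, 2.0)]
reference = [("chr1", 1_200_000, 2.0), ("chr1", 4_800_000, 2.5), ("chr1", 7_500_000, 2.0)]

profile = relative_profile(bin_expression(cell), bin_expression(reference))
print(profile)  # {('chr1', 0): 1.0, ('chr1', 1): 0.0}
```

Averaging over adjacent genes is what lets a noisy per-gene signal reflect the underlying DNA dosage: here the first window shows elevated expression relative to the reference, consistent with a gain, while the second matches the diploid baseline.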
The team first compared the copy numbers inferred by CopyKAT against the actual DNA copy numbers obtained by whole-genome sequencing. They next analyzed 46,501 single cells from 21 tumors, including triple-negative breast cancer, pancreatic ductal adenocarcinoma, anaplastic thyroid cancer, invasive ductal carcinoma and glioblastoma, to distinguish cancer cells from normal cell types. In all experiments, CopyKAT distinguished tumor cells from immune or stromal cells in a mixed tumor sample with 99 percent accuracy.
“We could then go one step further to discover the subclones present and understand their genetic differences,” said senior author Nicholas Navin, PhD. As an example, the team used CopyKAT to reveal rare subpopulations within triple-negative breast cancers that were not previously well known. These included clonal subpopulations that differed in the expression of cancer genes, such as KRAS, as well as specific genetic signatures, including epithelial-to-mesenchymal transition, DNA repair, apoptosis and hypoxia.
Because CopyKAT gives a snapshot view of a tumor’s complete microenvironment, it yields a patient-specific profile, one that is not easily obtained with current RNA-sequencing methods. The tool is freely available.
One caveat: CopyKAT cannot be used to study all cancer types. It is not suited to cancers in which aneuploidy is rare, such as pediatric and hematologic cancers.