Source: University of Washington

The interface between the biological world and the silicon world has captured the imagination of many science fiction filmmakers and storytellers. By turning DNA into a computer virus, an interdisciplinary group of molecular biologists and computer scientists has recently breached that interface and brought science fiction one step closer to reality.

 

The building blocks of humans and computers actually have a lot in common. For example, both are codes. Watson and Crick’s discovery illuminated the four-base code (A, T, C, G) that ultimately gives the instructions for building life, while a binary code made up of 1s and 0s gives the instructions needed to execute a command in computer software. Scientists have already found ways to use DNA molecules to store and retrieve data in the form of text, images, and even video. However, a research group at the University of Washington’s Paul G. Allen School of Computer Science & Engineering wanted to do more than just store data: they wanted to see whether synthetic DNA could be used to steal or corrupt data files containing genetic information.
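
To make the parallel between the two codes concrete, here is a minimal Python sketch of how binary data could be written as DNA bases, two bits per base. The mapping (A=00, C=01, G=10, T=11) and the function names are illustrative assumptions; the article does not say which encoding the researchers or any DNA storage scheme actually used.

```python
# Illustrative 2-bit mapping. This is an assumption for the sketch,
# not an encoding described in the article.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Write each byte as four bases (8 bits / 2 bits per base)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(dna: str) -> bytes:
    """Invert bytes_to_dna; the strand length must be a multiple of 4."""
    if len(dna) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = bytes_to_dna(b"Hi")
    print(strand, dna_to_bytes(strand))
```

Under this assumed mapping, any file can be expressed as a strand four bases per byte, which is why stored data and executable code look the same at the DNA level.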

 

While these so-called biohackers proved that the interaction between biomolecular information and the systems that analyze it could pose a new, and frankly freaky, security risk for next-generation sequencing, they conceded that researchers have no reason to ring the alarm bells just yet.

 

“If we did not have to synthesize the DNA, and we were able to just create our exploit code in the 1’s and 0’s of computers, that would have been easy,” said author Tadayoshi Kohno, Ph.D., a professor of computer science and engineering at the University of Washington. According to Kohno, the most natural way to exploit a vulnerable program is to send a long sequence with lots of repeating units. However, the majority of next-generation sequencing platforms randomly chop long sequences into pieces, read the pieces in parallel, and then reconstruct the original. If the platform does not put the malware code back together properly, or reads it in the wrong direction, the code won’t function.
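
The chopping and direction problems can be illustrated with a short Python sketch. It simulates reads by cutting random fixed-length fragments from a strand and returning about half of them as the reverse complement, which is what a reader sees when the opposite strand is sequenced. The function names, the fixed read length, and the 50/50 orientation are simplifying assumptions for illustration, not a model of any particular sequencing platform.

```python
import random

# Watson-Crick pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence as read from the opposite strand, in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_reads(seq: str, read_len: int, n_reads: int, rng: random.Random) -> list:
    """Cut random fragments; roughly half come back reverse-complemented.

    Hypothetical toy model: real platforms differ in read length,
    error profile, and orientation handling.
    """
    if read_len > len(seq):
        raise ValueError("read_len is longer than the sequence")
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(0, len(seq) - read_len + 1)
        fragment = seq[start:start + read_len]
        if rng.random() < 0.5:
            fragment = reverse_complement(fragment)
        reads.append(fragment)
    return reads


if __name__ == "__main__":
    strand = "ACGTTGCAAGGCTTAACCGGTA"
    for read in simulate_reads(strand, read_len=8, n_reads=5, rng=random.Random(0)):
        print(read)
```

A payload longer than one read is therefore never seen whole, and a fragment that arrives reverse-complemented decodes to different bits than the attacker intended.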

 

While repeating 1s and 0s works well for computers, this kind of repetition often causes DNA strands to form secondary structures that make both synthesizing and sequencing difficult. Finally, the proportion of bases required to form a stable DNA molecule also limits the language would-be biohackers can use to send their malicious messages.
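
As a rough illustration of these two constraints, the Python sketch below checks a candidate strand for its GC fraction (the share of G and C bases) and for its longest run of a single repeated base, one simple form of repetition. The thresholds and the function names are hypothetical placeholders; the article gives no numeric limits, and real synthesis constraints are more involved.

```python
from itertools import groupby


def gc_fraction(seq: str) -> float:
    """Share of bases that are G or C."""
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated base."""
    if not seq:
        raise ValueError("sequence is empty")
    return max(len(list(group)) for _, group in groupby(seq))


def passes_basic_checks(seq: str, gc_min: float = 0.4, gc_max: float = 0.6,
                        max_run: int = 4) -> bool:
    """Toy screen for a candidate strand.

    gc_min, gc_max, and max_run are illustrative assumptions,
    not values taken from the article.
    """
    return gc_min <= gc_fraction(seq) <= gc_max and longest_run(seq) <= max_run


if __name__ == "__main__":
    for candidate in ("ACGTACGT", "AAAAAAAAAAAA", "GGGGCCCCGGGG"):
        print(candidate, passes_basic_checks(candidate))
```

A payload made of long repeated units, the kind that is convenient in binary, fails a screen like this immediately, which is why the attackers had to reshape their exploit to fit what DNA tolerates.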

 

Still, Kohno and his colleagues proved that it’s possible. After multiple iterations, they constructed a DNA sequence that, when read, converted into a FASTQ file, and compressed using a modified fqzcomp utility, gave them arbitrary remote code execution, effectively allowing them to hijack the computer.
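
For readers unfamiliar with FASTQ, it is a plain-text format in which each read takes four lines: an identifier starting with "@", the base sequence, a separator line starting with "+", and one quality character per base. The Python sketch below parses a single record, assuming the common Phred+33 quality encoding. The record contents and the names used are made up for illustration and have nothing to do with the researchers' actual exploit or with fqzcomp's internals.

```python
from typing import NamedTuple


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    qualities: list  # one Phred score per base


def parse_fastq_record(text: str) -> FastqRecord:
    """Parse one four-line FASTQ record (Phred+33 encoding assumed)."""
    lines = text.strip().split("\n")
    if len(lines) != 4:
        raise ValueError("a FASTQ record has exactly four lines")
    header, sequence, separator, quality = lines
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("malformed FASTQ header or separator line")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    return FastqRecord(header[1:], sequence, [ord(ch) - 33 for ch in quality])


# Made-up example record, for illustration only.
EXAMPLE_RECORD = "@read_1\nGATTACA\n+\nIIIIII#\n"

if __name__ == "__main__":
    print(parse_fastq_record(EXAMPLE_RECORD))
```

Any program that consumes such a file is trusting that the sequence line holds ordinary biological data, and that trust is what a crafted strand abuses.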

 
