With its 150,000th patient recently enrolled, Geisinger Health System’s MyCode project is helping set the standard for big health data projects around the world. The program has a system to easily enroll patients, process both their genomic and clinical data, return relevant results to them, and spur research. It’s what many biobank developers are dreaming of.

Genomics databases now abound: there is the Million Veteran Program, the China Kadoorie Biobank Study, Vanderbilt’s BioVu, the U.K.’s 100,000 Genomes Project, and many more. MyCode is unusual because of its rapid growth and the fact that it has not only a robust sequencing database but also more than 20 years’ worth of electronic health record (EHR) data. Most importantly, Geisinger is already using information gleaned from analysis of this data to guide care of its patients for almost 30 conditions, including breast and ovarian cancers, Lynch syndrome cancers, and hereditary high cholesterol.

“There’s an immense amount of information in clinical records,” said Jeremy Rotter, Ph.D., who is helping build a biobank based at UCLA. “And people are starting to learn how to efficiently mine that data, including billing records as well as doctors’ and nurses’ notes.” Rotter is director of LA BioMed’s Institute for Translational Genomics and Population Sciences, Harbor-UCLA Medical Center.

Another development that has helped MyCode and similar projects is the rise of cloud computing, a tool that Geisinger and most biobanks use. “When you are dealing with a terabyte of data it costs a lot to store it,” explains Andrew Gruen, communications officer at Seven Bridges, which provides genomics-related data analysis and management services to multiple big data projects, including the Cancer Genomics Cloud, the Million Veteran Program, and the U.K.’s 100,000 Genomes Project. Jeff Reid, Ph.D., agrees. He is executive director of Genome Informatics at the Regeneron Genetics Center (RGC), which provides sequencing, analytics, and related services to dozens of organizations, ranging from small to large, including Geisinger. “The idea of being able to spin up as much storage as you need, as you want it, is revolutionary,” he said. “It’s very different from figuring out how many computers you need.”

“We were one of the first health systems to install an EHR; we now have an average of 14 years of data on each patient in the system,” said Andrew Faucett, one of MyCode’s principal investigators, a professor, and director of policy & education at Geisinger. Normalizing and integrating that data with the genomic data, he explains, is one of the key challenges in this field.

Any Geisinger patient can enroll in MyCode, either online or during a visit to one of the system’s facilities. After they agree to donate a blood sample and to have their EHR data mined for research purposes (with privacy protection), that sample undergoes exome sequencing.
