[Source: Yuichiro Chino / Getty Images]

When the pandemic heated up early last year, the UK Biobank set out to urgently gather data on COVID-19 diagnostic tests, deaths, and hospital episodes, such as critical care events. Within just over six months it had amassed details on more than 400,000 patients. To date, 700 research groups have accessed this information, and scientists worldwide have already generated 65 publications from it, UK Biobank reports.

How did they do it? In March 2020 the Department of Health and Social Care (DHSC) sent out word to all general practitioners whose IT systems are supplied by The Phoenix Partnership (TPP) and Egton Medical Information Systems (EMIS), the two main system suppliers for the agency, to “release electronic patient records to UK Biobank for purposes related to further understanding of COVID-19.” Each participant had expressly consented in advance to the use of their data through their health care records.

The rest was tedious. The data included millions of rows of records. And it was complex, since different medical systems use different codes for things such as diagnoses and medications. But all the data was cleaned up and de-identified (including stripping codes relating to specific occupations) before it was released to approved researchers.
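To make that cleanup step concrete, here is a minimal sketch in Python of what harmonizing codes and de-identifying records can look like. It is an illustration only, not UK Biobank's actual pipeline: the supplier names, the codes, the field names, and the salted-hash pseudonym are all invented assumptions.

```python
import hashlib

# Hypothetical mapping from (supplier, local code) to one shared vocabulary.
# Every code below is invented for illustration.
CODE_MAP = {
    ("supplier_a", "A1"): "covid_test_positive",
    ("supplier_b", "B7"): "covid_test_positive",
    ("supplier_a", "A2"): "critical_care_admission",
}

# Hypothetical codes that describe a specific occupation and must be stripped.
OCCUPATION_CODES = {("supplier_a", "A9"), ("supplier_b", "B3")}


def pseudonym(participant_id: str, salt: str) -> str:
    """Replace a real identifier with a salted hash (an assumed scheme)."""
    digest = hashlib.sha256((salt + participant_id).encode("utf-8")).hexdigest()
    return digest[:12]


def clean(records, salt):
    """Drop occupation rows, harmonize codes, and remove direct identifiers."""
    cleaned = []
    for rec in records:
        key = (rec["supplier"], rec["code"])
        if key in OCCUPATION_CODES:
            continue  # strip occupation-related codes entirely
        cleaned.append({
            "pid": pseudonym(rec["participant_id"], salt),
            "event": CODE_MAP.get(key, "unmapped"),  # flag codes with no mapping
        })
    return cleaned


raw = [
    {"participant_id": "1001", "name": "Pat Example", "supplier": "supplier_a", "code": "A1"},
    {"participant_id": "1002", "name": "Sam Example", "supplier": "supplier_b", "code": "B7"},
    {"participant_id": "1002", "name": "Sam Example", "supplier": "supplier_b", "code": "B3"},
]

cleaned = clean(raw, salt="demo-salt")
for row in cleaned:
    print(row)
```

In this sketch the two suppliers' different codes for the same event end up under one label, the occupation row disappears, and names never reach the output. A salted hash is only pseudonymization; a real release would need far more safeguards than this.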

It was exactly the kind of careful and impactful use of patient data that policymakers, scientists, and patients themselves have been talking about for years, well, decades.

Just as the speed and cost of sequencing have become almost afterthoughts, so data privacy and its management are now standard processes for most organizations hoping to parlay medical data into precision medicine.

The trick, it seems, is not to worry about what the whole regulatory landscape looks like, but to adhere to the highest possible standards and let everyone else catch up. And, as the rules evolve, data managers have an even tougher challenge: they need to protect that information while making it fully and easily accessible to its real owners, the patients themselves.

Security, security, security, and access

Scott Kahn, chief information and privacy officer, LunaDNA

“By abiding to a higher level of privacy protection overall, we can operate in more than 180 countries already,” says Scott Kahn, chief information and privacy officer at LunaDNA, which provides a platform for patients to share their data with researchers.

Kahn keeps close a vast spreadsheet of privacy laws across the U.S. and the world. But his job, he says, is not just knowing what those laws are, but staying ahead of the leaders so that patients feel comfortable using the LunaDNA platform. He also has to be nimble. “Security is always a journey,” he says. “No matter how good we get, we always have to get better.”

Ardy Arianpour, CEO and a co-founder of Seqster, concurs. His company offers a decision support system and research platform for consenting and onboarding patient data during clinical trials. “We are ahead of the rules, as we started out putting the patient at the center,” he says.

One rule Arianpour is referring to is CMS-9115-F, the Interoperability and Patient Access final rule that requires “increased patient electronic access to their health care information.” Part of that means easy access for patients, but it also means they get to know who else is authorized to view their data.

Ardy Arianpour, co-founder and CEO, Seqster

Seqster, Arianpour says, has access to 90% of all the EHRs in the U.S. That is more than 3,600 hospitals and 150,000 facilities, including medical groups, cancer and radiation centers, outpatient surgery centers, pain treatment centers, and more.

And the information is self-perpetuating. There is auto-synch (so that as data accumulates, it is added to the record), a clear chain of custody, and the patient can even visualize their own data. How did they do it? “We spent millions of dollars with a ninja team of software engineers,” Arianpour says.

That seems to be paying off. In October Seqster announced a deal with Takeda providing them access to electronic health records (EHRs), genomic profiles (DNA), and wearable/fitness data across health systems for individual patients.

Privacy control, it turns out, just takes a lot of muscle and attention to detail. For example, pioneering consumer genetics firm 23andMe has sold more than 12 million of its testing kits in more than 50 countries since its launch in 2007. About 80% of its customers consent to participate in research, a spokesperson says. The company has a current exclusive four-year therapeutics collaboration with GSK, and has also done research with Pfizer and Genentech.

In July of 2020, the company released preliminary data supporting the idea that a person’s blood type could influence susceptibility to COVID-19, a finding that garnered a lot of headlines. More than one million people opted in to a longitudinal study 23andMe carried out about COVID-19. The company’s study of hospitalized patients with the disease quickly and easily amassed the 10,000 participants it aimed for.

The backstory, again, is about methodology.

23andMe has long required explicit, informed consent prior to use of de-identified customer data for research purposes, and those projects must be overseen by an independent institutional review board (IRB).

The company also applies “the highest industry standards for authentication, encryption, and access rights to its systems,” it says, which includes ISO/IEC certification. Further, it is a founding member of the Coalition for Genetic Data Protection and a supporter of the Future of Privacy Forum’s Privacy Best Practices for Consumer Genetic Testing Services, published in July 2018.

The process may have been grueling, but the bottom line is 23andMe has yet to experience any incident in which customer data was inappropriately used or accessed, the company reports.

And it’s not just COVID-19 research that is benefiting. The Deciphering Developmental Disorders (DDD) project recently delivered a diagnosis to one-third of the families of more than 13,500 children with severe undiagnosed developmental disorders. The DDD project was set up in 2010 by the Wellcome Sanger Institute, working with the U.K.’s National Health Service (NHS) clinicians in 24 Regional Genetic Centres across the U.K. and Ireland.

The Sanger Institute is also sequencing 225,000 genomes for the UK Biobank, a project that started in 2019 and is due to finish in early 2022, according to Sarion Bowers, head of policy at the institute. Researchers from 50 countries have downloaded data from the Biobank. While most of the requests were from the U.S., U.K., China, or Germany, many other countries are also benefiting, including Egypt, Estonia, Finland, and Iceland.

The good, the bad, and the ugly

In today’s world of wildfire hacking and social media, most people realize there is no such thing as complete security of anyone’s data. But experts also agree that having the highest standards possible helps both discourage misuse of data and encourage patient participation.

The GDPR (General Data Protection Regulation) is held up as that highest standard. It basically says that EU citizens “have the right to protection of their personal data.” But, as Bowers says, every nation has its own interpretation of that, and Brexit has, of course, sent things into a tailspin. The best bet, as usual, is just to aim high.

Not surprisingly, the U.S. is still struggling with state-by-state privacy rules as it tries to catch up with the leaders. But California seems to be raising the bar with the California Consumer Privacy Act (CCPA), which gives consumers, patients included, the right to know about personal information that businesses collect about them.

And then there’s China. “Currently, the Chinese DTC genetic testing business is running in a regulatory vacuum, governed by self-regulation,” write Li Du and Meng Wang in their April 2020 article “Genetic Privacy and Data Protection.” The upshot is there is no data privacy in China, nor is any expected. “The government should develop a comprehensive legal framework to regulate DTC genetic testing offerings,” the authors advise.

For the faint of heart, there are now firms such as LunaDNA and Seqster to take the pain out of the process. But there is no doubt that patients have come to expect a reasonable amount of privacy while they are also increasingly interested in sharing their data.

What patients want

While it may be safer and easier to share your data, it’s still a big step, so what makes patients become participants in the big data wave?

Dave deBronkhart, also known as e-patient Dave, has been the poster child for the patient-engagement movement ever since he pulled up his own medical records online in 2009 and ended up on the front page of the Boston Globe because those records were so wrong. Used to working with data in his day job, deBronkhart had, after a life-changing brush with cancer, turned his attention to his own medical data: what it contained, how it was being managed, and his own ability to access and share it.

He realized what most patients do, which is that there is no such thing as privacy anymore. Yes, you can fight, as deBronkhart has, for the right to see your data and hopefully correct it, as new laws should allow, but don’t ever expect perfection. “If someone wants to hack you they will,” he says.

Patients with terminal and rare diseases are at the forefront of the patient-engagement movement. Ian Terry, who is chief experience officer at PXE International, puts it bluntly: “Most people don’t realize that privacy is a principle and not a policy.” PXE recently moved its patient engagement effort from its own homegrown platform to LunaDNA. People just didn’t trust its “mom and pop” little shop, which was run off Excel spreadsheets.

Dawn Barry, president and co-founder, LunaDNA

But what patients want most are results, and for that, they need even more data sharing, says Dawn Barry, president of LunaDNA. “We need natural histories, electronic health data, whole genome sequences, biomarkers, and so much more,” she says. Luna’s platform is already delivering results with about 100,000 participants. But its platform was built to accommodate millions of people, and that’s what it will take to deliver many of the results those people expect.
