Photo Credit: Mark Ostow

Heidi Rehm, Ph.D., is a leader in translational medicine, having spearheaded the creation of the Laboratory for Molecular Medicine at Harvard-Partners Center for Genetics and Genomics in 2002 (now Partners Healthcare Personalized Medicine) and through her work at the Broad Institute spanning rare disease gene discovery, medical and population genetics, and the clinical research sequencing platform.

Her work with Partners and the Broad only begins to tell the story of her impact on the clinical use of genomic information, however. Among other projects, she also has leading roles with public resources such as ClinVar, the human variation-phenotype database; the Clinical Genome Resource (ClinGen), an initiative to build genomic knowledgebases that support genomic medicine and to score and curate claimed gene-disease associations; and Matchmaker Exchange, a global genotype and phenotype data-sharing network aimed at improving the diagnosis of rare diseases.

With so much on her plate, Dr. Rehm is constantly on the move, but she was able to share a few moments while in transit home from the airport to speak with Clinical OMICs Editor in Chief Chris Anderson.


As a molecular biologist, what are some of the biggest challenges you encounter running a clinical lab? 

Heidi Rehm: The biggest challenge we encounter is whether we can accurately identify and interpret the causes of a disease in an individual, and give an accurate answer that those patients can act on in a safe and appropriate way to improve their health. It is not straightforward to interpret genetic information in an informative way. The challenge is to do the best you can, knowing the best you can is not always perfect.


Is there a consensus on success rates in identifying a causal variant in an undiagnosed patient?
HR: There are two layers to that question. One is a question of technical validity. Can we physically detect a variant that may be causative for a patient based on the technology? The more challenging variants can be difficult to detect, so we can technically miss a variant even when it is in a region that we claim to be looking at: the exome, the entire coding region of the genome.

So we don’t know what we don’t know. That is, we don’t always know what the limitations of our technology are and we can’t assess what it is we are missing other than to simply say that there are limitations to the technology. We generally understand those limitations, but we don’t have a quantitative way to say we are missing 5% of these or 10% of those. To some extent, when we validate a test we put in known mutations and we can assess the things that are known that we are looking for that we have missed. But we can’t do that comprehensively enough, with enough variety, and with enough variations to assess the most challenging aspects.

What about the other layer?
HR: I published a review in Nature Genetics recently and I showed a table of typical detection rates for typical genetic disorders and indications. The gist of that was on average, we detect about 25% of the causes of disease in patients who undergo genetic testing. You can argue that we are missing 75% and the question then is: What is the basis for missing that 75%?

It spans four major areas:
One is the patient doesn’t have a genetic disease, and no matter how good we are we will never find it because their disease is not genetic. That is some unknown percentage.
The second is we are doing an exome and the answer lies in the noncoding regions, so we are going to miss it using this technique; the wrong test was run.
Third, even if we are looking in the right places, we technically miss the right variation for the reasons we just talked about, because we can’t detect structural variations or other things as well as we’d like.
And fourth, we got the variant but we weren’t able to understand or interpret it, such that we missed the interpretation of the variants. So it is there among the bucket of 5 million variants in the genome, or 20,000 to 50,000 variants in the exome, but we simply couldn’t identify which was the right one. Those are the reasons we have, on average, 75% of cases that are negative.


There is a lot of chatter about moving from exome sequencing to whole genome sequencing. What do you think about that?
HR: My general answer is that if run appropriately, whole-genome sequencing can increase the yield of diagnoses compared to exome sequencing. But you have to run enough depth of coverage so that the genome's lower average coverage doesn't cause you to miss things you would have caught on an exome because of its higher average coverage.

In general, for clinical diagnosis, I think the whole genome is the way to go. But it is more expensive right now, and the cost of testing is a huge factor in the clinical market. It is an even larger factor in the research market because of volume. If you were sequencing one case, you would probably do a genome. But if you were sequencing 10,000 cases for a research study, the added cost of a genome over an exome across 10,000 samples is enormous, whereas for one patient's diagnosis you might be willing to pay that price.

Today, if you look across the market, I don't think it has reached the point of going to genome broadly, but the cost drops happening right now will make it such that in the next year there will be a fairly significant transition to [whole] genome sequencing.


You mentioned cost being a significant factor in the clinical setting. Do you think payers may balk at reimbursing for whole-genome sequencing?
HR: I don’t think they are going to treat whole genome and exome differently. The big difference is in how they treat any genomic approach, exome or genome, compared to panel-based testing. What the payers are freaked out about is the downstream cascade from incidental findings when you sequence lots of genes that aren’t relevant to the patient’s phenotype. From their standpoint, if you are just sequencing a panel, you are only looking at things relevant to a patient’s indication. You are not at risk of a bunch of other findings you weren’t intending to look for getting back to the patient and leading to a series of medical interventions of unclear utility.

We just published a paper from our MedSeq study in the Annals of Internal Medicine that showed only a very modest increase in cost when you give an enormous range of results back to healthy patients. I think that will help allay the fears about incidental findings.


Can you give an update on ClinVar?
HR: ClinVar has been enormously successful. More so than I thought it would be when we started it five years ago.


Why do you think that is?
HR: I think there were a core set of laboratories that recognized about five years ago that we were headed for an untenable situation in our ability to interpret variation in our genomes, and that no single laboratory could manage the interpretations of millions of variants, all very rare in patients.

We started with the intent to crowdsource this problem. However, we recognized it is not just a question of crowdsourcing. As we began sharing, we discovered that we had each interpreted variants differently, and so instead of just a voluntary crowdsourcing mission, it became a quality assurance effort: we recognized that laboratories that were sharing their data, subjecting it to comparison and peer review by the community, would develop higher quality standards and provide more accurate results to patients.

I just wrote a commentary for Genetics in Medicine that includes a list of clinical laboratories that meet what we consider a minimum standard for data sharing for quality assurance. We are doing this because it will allow payers and providers to decide where to order tests and which tests to reimburse, based on the belief that labs that share data are higher quality and that you should only reimburse or order from labs taking those quality-assurance steps.

Some of the major laboratories have engaged in data sharing and come to these conclusions themselves, without us mandating it. I have been delightfully amazed at how many laboratories were willing to do this on a voluntary basis. But now we are at a stage where we need all of them to do it, because a small number aren’t. We sent a letter to all of the laboratories that had at least some submissions to ClinVar, explaining that we would be launching the list in part to inform insurer and provider decisions on reimbursement and ordering, and laying out the minimum requirements a lab must meet to be on the list.

I wasn’t sure exactly what the response from the laboratories would be, but I got incredible support. From those that didn’t meet the criteria, I got several emails saying they are working diligently to get it accomplished. They have gone to leadership and gotten buy-in from their company executives, and they are going to get this done. That was very gratifying, and I’m very optimistic about the future.

Aside from ClinVar, what else are you working on?
HR: Another challenge we have, parallel to variant interpretation, is gene interpretation: what are the genes that, when disrupted, actually lead to disease phenotypes? This turns out not to be as straightforward as you might think.

There are genes that anyone, off the top of their head, knows are disease genes: BRCA1 and BRCA2 are breast cancer genes, and CFTR is the cystic fibrosis gene. Then you get down to a layer of one paper in which somebody claimed a mutation in [one] gene is associated with [a specific] condition, but the data are not that strong. Claims about disease genes are what lead clinical labs to put those genes into their tests.

This is a major issue. And now we are embarking on a systematic curation of all claimed gene–disease associations through ClinGen. At the recent ClinGen meeting [there was] a commitment to share in this activity as a group and to get all of these 5,000 or 6,000 genes curated systematically to define the level of evidence. We have published a paper from ClinGen that lays out the scoring framework for defining the strength of evidence for each gene, which is the basis by which we will do this.

We also have a project that relates to this. A lot of the genes we review end up in a ‘limited’ bucket, where there is insufficient evidence to implicate them as true disease genes, but they are still ripe for discovery. I’m also in charge of a project called Matchmaker Exchange. We have developed a federated network of rare disease databases used in gene-discovery efforts. When you find a candidate gene in a patient with a disease but have insufficient evidence to implicate it, you can query all of the other databases through this federated network to find patients whose phenotypes match and who have candidate mutations in the same gene, bringing those cases together to implicate the gene. That is a major project we have been working on for the last few years, and it is growing quickly.


This article is published in the July/August issue of Clinical OMICs.
