Researchers studying the genomes of nearly 30,000 families have discovered genetic causes behind three rare diseases.
Their “Rareservoir” database revealed genetic explanations for primary lymphedema, thoracic aortic aneurysm disease, and congenital deafness.
The flexible and compact database, which contains rare variant genotypes and phenotypes collected by the U.K.’s 100,000 Genomes Project (100KGP), also identified other potential disease-causing genes.
Researchers led by Daniel Greene, PhD, an assistant professor in genetics and genomic sciences at the Icahn School of Medicine at Mount Sinai in New York, report their findings in the journal Nature Medicine.
“Etiological discovery is an important step in diagnosing, prognosticating and eventually developing treatments for rare diseases,” they note in a research briefing accompanying their paper.
“Patients with certain types of unexplained primary lymphoedema, thoracic aneurysm disease or congenital hearing impairment will now be able to receive a genetic diagnosis.”
Although rare diseases affect around one in 20 people, fewer than half of the approximately 10,000 catalogued have a resolved genetic etiology.
While genomic studies can offer insight, large genetic datasets are notoriously cumbersome to work with because the data are typically stored in unmodifiable files many terabytes in size.
In an attempt to develop a more streamlined approach, the team examined genomic and phenotypic data collected as part of 100KGP, the largest genome study of patients with rare diseases to date.
The research included 34,523 sequenced patients with rare diseases and 43,016 unaffected relatives across 29,741 families.
From this, the team developed a relational database just 5.5 GB in size and used Bayesian statistics to identify genetic associations between coding genes and each of the 269 rare disease classes assigned to patients by the 100KGP clinicians.
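To give a sense of why a compact relational layout is easier to query than terabyte-scale flat files, here is a minimal sketch in Python using the standard sqlite3 module. It is not the Rareservoir schema: the table names, column names, gene labels (GENE_A, GENE_B), sample IDs, and disease classes are all hypothetical placeholders, and the query simply counts carriers per gene and disease class rather than performing the Bayesian association analysis the researchers used.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with a hypothetical three-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE variant (
            variant_id  INTEGER PRIMARY KEY,
            gene        TEXT,
            consequence TEXT
        );
        CREATE TABLE genotype (
            sample_id  TEXT,
            variant_id INTEGER REFERENCES variant(variant_id)
        );
        CREATE TABLE phenotype (
            sample_id     TEXT,
            disease_class TEXT
        );
        """
    )
    # Invented rows for illustration only; none of this is study data.
    conn.executemany(
        "INSERT INTO variant VALUES (?, ?, ?)",
        [
            (1, "GENE_A", "loss_of_function"),
            (2, "GENE_A", "missense"),
            (3, "GENE_B", "loss_of_function"),
        ],
    )
    conn.executemany(
        "INSERT INTO genotype VALUES (?, ?)",
        [("S1", 1), ("S2", 1), ("S3", 2), ("S4", 3)],
    )
    conn.executemany(
        "INSERT INTO phenotype VALUES (?, ?)",
        [("S1", "class_X"), ("S2", "class_X"), ("S3", "class_Y"), ("S4", "class_X")],
    )
    return conn


def carriers_per_gene_and_class(conn, consequence):
    """Count distinct samples carrying a variant of the given consequence,
    grouped by gene and disease class."""
    return conn.execute(
        """
        SELECT v.gene, p.disease_class, COUNT(DISTINCT g.sample_id)
        FROM variant v
        JOIN genotype g ON g.variant_id = v.variant_id
        JOIN phenotype p ON p.sample_id = g.sample_id
        WHERE v.consequence = ?
        GROUP BY v.gene, p.disease_class
        ORDER BY v.gene, p.disease_class
        """,
        (consequence,),
    ).fetchall()


if __name__ == "__main__":
    db = build_toy_db()
    for row in carriers_per_gene_and_class(db, "loss_of_function"):
        print(row)
```

Because genotypes, variant annotations, and phenotypes sit in separate joined tables, a question such as "which disease classes have carriers of loss-of-function variants in a given gene" becomes a single query instead of a scan over raw variant files.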
Greene and team inferred 260 genetic associations with rare disease classes from the data, of which 241 were already known.
They then prioritized three of the 19 previously unidentified associations for further analysis and validated etiological roles for ERG, PMEPA1, and GPR156 through international collaboration involving the U.S., Europe, the Middle East, and Japan.
The team showed that loss-of-function variants in the ETS-family transcription factor-encoding gene ERG resulted in primary lymphedema.
Whereas the ERG protein usually only exists inside the nuclei of cells, a mutated form in patients was distributed in the cytoplasm where it was unable to bind to DNA.
In addition, the researchers found that truncating variants in the last exon of PMEPA1 resulted in a familial thoracic aneurysm disease reminiscent of Loeys-Dietz syndrome, with the variants likely exerting their pathogenic effects by altering TGFβ signaling.
Lastly, the investigators found that loss-of-function variants in the G-protein-coupled receptor-encoding gene GPR156 gave rise to recessive congenital hearing impairment.
“Given the recently reported role of GPR156 as a critical regulator of stereocilia orientation, it is likely that reduced expression of GPR156 in patients disrupts stereocilia formation, which leads to the deafness phenotype,” they suggest.