Using machine learning algorithms on genomic data from more than 30,000 people, an international team co-led by researchers at the Garvan Institute of Medical Research has revealed thousands of new regulatory regions that control disease-linked genes. Their data is available to researchers worldwide through the eQTLGen Consortium.
“Thanks to the statistical power of this large dataset, we were able to uncover new regulatory regions on the human genome,” says co-senior author Joseph Powell, Director of the Garvan-Weizmann Centre for Cellular Genomics and Deputy Director of the UNSW Cellular Genomics Futures Institute.
“Instead of just cataloguing the regulatory gene locations that were adjacent [cis-eQTLs], we were able to reveal genes that modulated the activity of more distant genes [trans-eQTLs],” he adds.
The findings, published in Nature Genetics, could help identify markers that reveal which patients will benefit most from specific treatments to enable more personalized medicine.
As the researchers write: “eQTL [expression quantitative trait loci] have become a common tool to interpret regulatory mechanism of variants identified by genome-wide association studies (GWAS).”
This study used specialized machine learning algorithms to analyze genomic data from the blood samples of 31,684 individuals from the eQTLGen Consortium dataset. The team then deposited their results there.
Of the genes they investigated, the researchers say 88% had a cis-eQTL effect and 32% also had a trans-eQTL effect from further away in the genome. More than half of these trans effects could be linked to a biological impact, such as a role in cardiovascular or immune disease.
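The cis/trans distinction comes down to how far a variant sits from the gene whose activity it affects. As a minimal sketch, the Python below labels a variant-gene pair by distance. The 1,000,000 base-pair window and the function name `classify_eqtl` are assumptions made for illustration, not the thresholds or code used in the study.

```python
# Illustrative cutoff only: an assumed window, not the study's definition.
CIS_WINDOW_BP = 1_000_000


def classify_eqtl(variant_chrom, variant_pos, gene_chrom, gene_pos,
                  window=CIS_WINDOW_BP):
    """Label a variant-gene pair as 'cis' (nearby) or 'trans' (distant).

    A pair is 'cis' when the variant and gene are on the same chromosome
    and within `window` base pairs of each other; anything else is 'trans'.
    """
    same_chromosome = variant_chrom == gene_chrom
    if same_chromosome and abs(variant_pos - gene_pos) <= window:
        return "cis"
    return "trans"


# Hypothetical coordinates, chosen only to exercise both branches.
nearby = classify_eqtl("1", 1_500_000, "1", 1_000_000)
distant = classify_eqtl("1", 1_500_000, "7", 1_000_000)
```

With these made-up coordinates, the first pair falls inside the window on the same chromosome and the second pair spans two chromosomes.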
“While it’s clear that genetic variants are almost always a root cause of disease, the mechanism by which they influence disease is far less clear. For instance, while a specific condition may be linked to hundreds of genetic variants, the vast majority contribute to disease by regulating gene activity,” says Powell.
To study how human genetic variation affects risk of disease, researchers often carry out GWAS, but interpreting the results of these studies is not straightforward. Rather than directly driving disease, many genetic variants regulate the activity of genes, influencing how much of a protein is produced. By pinpointing these regulatory regions, known as eQTLs, researchers can better understand which genes directly contribute to disease risk and which could be targeted with precision treatments.
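At its simplest, an eQTL is a variant where the number of copies of an allele a person carries tracks with how active a gene is. The sketch below illustrates that idea with a least-squares slope of expression on allele dosage. The function name `eqtl_slope` and the six data points are invented for illustration. It is not the study's pipeline, which would also need to handle covariates and testing across very large numbers of variant-gene pairs.

```python
from statistics import mean


def eqtl_slope(dosages, expression):
    """Least-squares slope of gene expression on allele dosage (0, 1 or 2)."""
    if len(dosages) != len(expression):
        raise ValueError("need one expression value per individual")
    d_mean = mean(dosages)
    e_mean = mean(expression)
    # Covariance between dosage and expression, and variance of dosage
    # (both left unnormalised, since the sample size cancels in the ratio).
    cov = sum((d - d_mean) * (e - e_mean) for d, e in zip(dosages, expression))
    var = sum((d - d_mean) ** 2 for d in dosages)
    if var == 0:
        raise ValueError("variant shows no variation in this sample")
    return cov / var


# Made-up data for six individuals: allele copies and expression level.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope = eqtl_slope(dosages, expression)
```

A slope well away from zero suggests that each extra copy of the allele shifts expression of the gene, which is the kind of signal an eQTL analysis looks for.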
“In this study we have provided an entirely new view of genetic regulation by uncovering an in-depth picture of how genes and disease are linked. It is the most comprehensive analysis of how human genetic variation affects gene expression to date,” says Powell.
“Our discovery provides researchers an entirely new perspective on their genes of interest, and will help prioritize genes that may be more relevant for therapeutic intervention. It could also lead us to better markers for tracking disease progression and the efficacy of medicines,” says co-senior author Professor Lude Franke from the University Medical Centre Groningen, Netherlands.
“Understanding which genes this regulation ‘converges’ on will be invaluable to identify targets for new potential medicines. If a pharmaceutical company develops a therapy that targets a certain molecule, our resource can help identify how its expression is regulated and if the genetic background of different patients is likely to impact its efficacy,” Franke adds. “What we’ve discovered is an entirely new level of genomic information, providing a deeper understanding of biology and disease.”