The mystery of how subtle differences in our DNA give rise to both our unique traits and a variety of diseases is one that thousands of scientists around the world are trying to solve. The Genotype-Tissue Expression (GTEx) Consortium is one of a handful of ambitious endeavors aimed at answering this question. Launched in 2010, the consortium aims to identify and catalog genetic variants related to changes in gene expression, dubbed expression quantitative trait loci (eQTLs), in tissues across the entire human body.
“Over the last decade or so, there have been many studies that have linked genetic variants to a wide range of traits, but many of these were found in regions of the genome that are far away from genes, so it was difficult to link them to disease mechanisms that would allow for better diagnoses and treatment,” said Michelle Ward, a postdoctoral researcher in Yoav Gilad’s lab at the University of Chicago. “What GTEx starts to provide is a way of linking genetic variants to the genes that they’re regulating.”
In 2015, the GTEx group, which includes researchers across multiple research centers, released a pilot dataset that included the results of gene expression analyses on post-mortem tissues collected from 237 donors. The latest version of the atlas, which was released in October, contains more than 7,000 samples from 449 individuals. These span 44 different tissue types throughout the body, including the brain, lungs, muscles, and whole blood.
“None of this would have been possible without the tissue donors, their families, and the bio-specimen collection sites,” said GTEx member Casey Brown, a genetics professor at the University of Pennsylvania. “We’re all extremely grateful.”
Novel Insights
Along with releasing the dataset, the GTEx group also published new findings based on the data in a series of four Nature papers in October. “Being the largest study of this kind, they’ve been able to ask questions that have been difficult to answer in the past,” Ward said.
For example, in one study, the team observed that across the broad spectrum of analyzed tissues, genetic variation influenced the expression of most protein-coding genes. Although most variants affecting expression were located near the affected gene, some were as far away as a different chromosome. The nearby variants, dubbed cis-QTLs, were much more common (researchers identified more than 150,000) and were found in a wider variety of tissues than the more distant trans-QTLs, which numbered in the hundreds. However, Brown noted, because the current dataset is still statistically underpowered for analyzing trans-QTLs, more tissue will need to be analyzed to fully characterize them.
In addition, the researchers discovered that around half of all trait-associated variants previously identified by genome-wide association studies (GWAS) overlapped with eQTLs. According to Stephen Montgomery, a Stanford University professor and GTEx investigator, this suggests there are expression mechanisms underlying those trait associations. “That actually gives us quite a lot of information to move forward with about what molecular mechanisms underlie complex diseases,” he added.
In another analysis, researchers examined how rare variants, which may play an important role in determining individual disease risk, affected gene expression. By pinpointing outliers (genes that were expressed at extremely high or low levels) in the GTEx dataset, the team discovered that these outliers were more likely than other genes to have a rare variant nearby.
A Global Resource
The consortium has made its data publicly available for researchers to access. This catalogue of raw and unprocessed data is a “fantastic resource for the research community,” Ward said. The group is also maintaining a biobank of the analyzed tissue samples.
However, the atlas does have its limitations. For example, Brown pointed out, the group analyzed heterogeneous tissues, rather than single cells. “I think the next big leap will be true cellular-level resolution,” he said.
Another limitation, according to Ward, is that “while this dataset provides insight into the associations between genetic variants and gene expression, I think more work is needed to determine the causality.” One way scientists could do this, she noted, is by using techniques like CRISPR/Cas9 gene editing.
The release of the next version of the GTEx atlas, which researchers expect to publish sometime next year, will more than double the sample size, Brown said. This will allow researchers to address some questions that are difficult to probe with the currently available sample. Brown, for one, is interested in examining more of the trans-QTLs.
In addition to analyzing tissues from more donors, some members of the consortium are working on characterizing additional factors, such as DNA methylation, protein abundance, alternative splicing, and telomere length, by applying diverse molecular assays to the available samples. This effort, dubbed “Enhancing GTEx,” will publish analyses associated with its data over the next two years, according to Montgomery.
Ultimately, pooling all this information could benefit personalized medicine. “It could be in the future that you go and you have your genome taken, but you might get your blood transcriptome, your metabolome, your proteome, all these other types of data,” Montgomery added. “So if we build the [analytical] infrastructure now, we should get better predictions for individuals about genetic risk factors.”
“All biomedical researchers should welcome the wealth of data that continues to be released by the GTEx project, and the insights it provides into the regulatory code of our genomes,” Nature’s editors wrote in a recent editorial. “It is an important step towards the ultimate and ambitious goal of being able to characterize genetic variation and gene regulation in all cells of the human body.”