A simplified example of RNA splicing mechanisms [NIH]

Researchers at the University of Chicago and Stanford University have described in a new study how thousands of RNA splice mutations affect gene regulation in traits such as height and diseases such as multiple sclerosis. The findings highlight the need for a better understanding of the role of RNA splicing in the variation of complex traits and disease, and they enable more accurate functional interpretations of genome-wide association study (GWAS) results.

“We were able to comprehensively identify how mutations perturb gene expression all the way from transcription to translation, and how they affect different regulatory mechanisms,” explained co-senior study author Yoav Gilad, Ph.D., professor of human genetics at the University of Chicago. “We found that a significant proportion of the associations between mutation and variation in disease is explained by effects on RNA splicing. We can now work to better understand this relationship and add another tool to our kit to figure out the biological mechanisms that cause disease.”

The findings from this study will be published in the April 29th issue of Science in an article entitled “RNA Splicing Is a Primary Link Between Genetic Variation and Disease.”

In the past several years, GWAS have been successful at uncovering variations in the human genome that are associated with biological traits and complex diseases. These mutations, known as quantitative trait loci (QTLs), are primarily found in regions outside of genes and were assumed to play a role in gene regulation. However, the functional significance of the vast majority of QTLs is still unclear.

The investigators wanted to take a comprehensive look at the underlying roles of genetic variants and did so by applying a suite of powerful statistical tools to whole-genome and cell line data gathered from 70 individuals. In a series of experiments that spanned eight years, they analyzed QTLs associated with seven regulatory phenotypes, including gene expression levels, RNA transcription, and protein translation.

In each regulatory phenotype, the team identified QTLs and quantified their effect on almost every step of gene regulation. They found that many of the QTLs overlapped in their effect on transcription, translation, and ultimately protein production.

“We have never considered so many data sets from a population sample of the exact same individuals before, so this type of analysis was never done,” Dr. Gilad said.

The scientists developed a new computational algorithm, called LeafCutter, which enables the efficient identification of QTLs that are specifically involved in RNA splicing. Most human genes undergo splicing, in which a precursor form of mRNA is cut and stitched back together in various combinations. This splicing significantly increases the number of proteins a single gene can code for and is thought to explain much of the complexity in higher-order organisms. At least 15% of all human diseases are thought to be due to splicing errors. Yet, until LeafCutter, no methodology existed to identify and analyze splicing QTLs effectively.

Using their newly developed program, the researchers discovered approximately 3,000 splicing-specific QTLs, many of which appear to play a major contributing role in the biology of genetic traits and disease. The splicing QTLs were mainly enriched in multiple sclerosis; for other traits, they were roughly equal in influence to QTLs that affect global gene expression levels. Many of the splicing QTLs did not affect gene expression levels, suggesting that RNA splicing is a separate, but equally important, mechanism that underlies complex traits and disease.

“We now have a new appreciation for how important splicing is for disease,” Dr. Gilad noted. “Intuitively, we had assumed it is important, but we didn't really have a lot of genome-wide evidence until this study.”

Notably, genetic variants identified through GWAS can now be assayed for potential roles in RNA splicing. If only overall gene expression is measured, the function of many of these sites would remain opaque.

“When we incorporate more information about more mechanisms in more diseases, we can better understand how genetic variation drives disease and someday perturb or fix that process,” Dr. Gilad stated. “We clearly have to consider RNA splicing now in addition to gene expression, histone accessibility, and other factors, as we try to learn these rules.”
