Angled DNA Double Helix
Credit: SERGII IAREMENKO / Getty Images

Thousands of previously unidentified rare genetic variants affecting plasma proteins have been identified in a major, new open-access resource from the U.K. Biobank.

Over three-quarters of these protein quantitative trait loci (pQTLs) have never previously been detected, according to a flagship paper published in Nature.

The analysis of nearly 3,000 circulating proteins has identified approximately 20 times more associations than all previous antibody-based studies.

The findings represent the most detailed study to date on how genetics and proteins influence disease and offer new targets for drug discovery.

“This momentous study offers whole new avenues of research to the biomedical community, and is a leading example of how cross-sector collaboration can bring about results that are so much greater than the sum of their parts,” said Naomi Allen, PhD, chief scientist of U.K. Biobank.

She added: “I am excited for researchers to use these data to identify patterns that could transform our understanding of how diseases develop, and to identify potential new treatment pathways.”

To date, large-scale proteogenomic studies have identified upwards of 12,000 pQTLs, which represent independent associations between genetic variants and plasma protein concentrations.

But opportunities to expand this on a massive scale arise in population studies like the U.K. Biobank, which has been collecting data and tracking the health of 500,000 volunteers enrolled between 2006 and 2010.

The open-access framework, deep phenotypic characterization and long-term development of the biobank offer a unique opportunity to broaden research use of high-throughput proteomic data, build more extensive pQTL databases, and accelerate the discovery of biomarkers, diagnostics and medicines.

Recognition of this has led to the development of the Pharma Proteomics Project, a collaboration between 13 leading biopharmaceutical companies to create the world’s largest proteomic atlas.

Researchers in the project characterized the plasma proteomic profiles of 54,219 U.K. Biobank participants using an antibody-based technique.

Benjamin B. Sun, PhD, from Biogen in Cambridge, Massachusetts, and colleagues processed and analyzed 2,941 blood plasma analytes, capturing 2,923 unique proteins.

These were measured across eight protein panels that were relevant to the fields of cardiometabolism, neurology, inflammation and oncology.

Comprehensive pQTL mapping of the proteins identified 14,287 primary genetic associations, of which 81% are previously undescribed.

The findings offer insights into trans pQTLs across multiple biological domains, and highlight the influence of genes on ligand–receptor interactions and pathway disturbances across a variety of cytokine and complement networks.

The findings are accompanied in the journal by a companion piece that focuses on rare and aggregate gene proteogenomic effects. Another sister piece in the same title compares proteogenomics across two large cohorts and proteomic platforms to give perspective.

“To date, the scientific community has invested substantially in genomics for the advancement of precision medicine,” said Chris Whelan, PhD, director of neuroscience data science for Johnson & Johnson, who leads the Pharma Proteomics Project.

“However, to identify the right drug for the right patient at the right time, we must move beyond genomics alone.

“This dataset will help paint a much more nuanced and detailed picture of how the human genome and proteins circulating in the blood influence human health and disease – enabling biomedical researchers to identify new biological associations, find new drug targets and build blood-based diagnostics.”

Full summary statistics are available on the AstraZeneca open-access portal AZPheWAS.com.
