Almost a decade in the making, the largest human protein map is now available thanks to a joint effort involving over 80 researchers in the United States, Canada, Spain, Belgium, France and Israel. The map represents all of the known interactions between gene-encoded proteins. Known as the Human Reference Protein Interactome Mapping Project (HuRI), the map is intended to help understand how faulty genes involved in protein coding cause or influence disease.
The study team describes HuRI as a systematic proteome-wide reference that links genomic variation to phenotypic outcomes. HuRI charts 52,569 protein-protein interactions (PPIs) between 8,275 human proteins, about four times as many PPIs as had been curated from the group’s earlier efforts and other protein mapping projects. HuRI data were first published in a preprint article on bioRxiv in April 2019, with full publication in Nature this week.
The human genome includes about 20,000 protein-coding genes, but much remains unknown about most of the proteins they encode and how these proteins interact in cell physiology in normal and disease states. To learn more about how these proteins function, researchers looked to the proteins’ neighbors to see what biological processes they are involved in. PPIs depend on the presence of two or more proteins acting together in the same location at the same time.
For this reason, the researchers used yeast two-hybrid (Y2H) screening assays, which test pairs of human proteins in yeast cells, to detect PPIs and build the HuRI map. When two proteins bound to each other, yeast cell growth was boosted, providing a clear indication that the proteins had interacted.
The team tested all possible pairwise combinations among 17,500 proteins in the ORFeome database for their ability to interact with each other. Their analysis involved three separate versions of a yeast-based assay, each done in triplicate, amounting to roughly three billion separate tests. The results yielded ~53,000 high-confidence PPIs between 8,275 proteins. Combined with the group’s previously published PPI data, the map includes data on 9,100 proteins and 64,000 PPIs. The majority of interactions had never been detected before.
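As a rough check on the scale of that screen, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: it assumes each pair is counted in both orientations (17,500 × 17,500), which the article does not state, and the variable names are purely illustrative.

```python
# Back-of-the-envelope estimate of the screening scale described above.
# Assumption (not stated in the article): every ordered pair is counted,
# i.e. protein A tested against B and B against A are separate tests.
n_proteins = 17_500
assay_versions = 3   # three versions of the yeast-based assay
replicates = 3       # each version done in triplicate

pairs = n_proteins * n_proteins
total_tests = pairs * assay_versions * replicates

print(f"{pairs:,} pairs, {total_tests:,} tests")
```

Under these assumptions the total comes to about 2.8 billion tests, which is consistent with the "three billion" figure quoted.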
“HuRI substantially expands the number of biomedically interesting genes for which high-quality direct PPI data are available and finds new interaction partners for these genes in previously uncharted regions of the protein interactome,” the authors write.
With this map, HuRI has more than tripled the number of known interactions between human proteins and includes approximately 90 percent of the protein-coding genome. It also revealed new protein functions involved in apoptosis, release of cellular cargo and other processes. By integrating protein interaction data with tissue-specific gene expression, the team identified protein networks behind the development and maintenance of different tissues, revealing new therapeutic targets for diverse genetic diseases including cancer and potentially for infectious diseases as well.
“Genome sequencing can identify the variants carried by an individual that make them susceptible to disease, but it doesn’t reveal how the disease is caused,” says HuRI co-leader Michael Calderwood, PhD, of the Dana-Farber Cancer Institute. “Changes in the interactions of a protein is one possible mechanism of disease, and this map provides a starting point to study the impact of disease associated variants on protein-protein interactions.”
HuRI, also known as HI-III-19, is the third protein interaction map produced by the study team. The group’s first mapping effort, known as HI-I-05, screened about 8,000 open reading frames (ORFs) corresponding to ~7,000 genes and identified ~2,700 high-quality paired PPIs. That search space represents ~12% of the complete search space, assuming a total of ~20,000 protein-coding genes. The second phase of the human interactome mapping project, HI-II-14, generated a dataset of ~14,000 paired PPIs.
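The ~12% figure follows from squaring the fraction of genes screened, since the search space is made up of gene pairs rather than single genes. A minimal sketch in Python, using the approximate counts quoted above (the variable names are illustrative):

```python
# Fraction of the full pairwise search space covered by HI-I-05,
# using the approximate gene counts quoted in the article.
genes_screened = 7_000
genes_total = 20_000

# Pairs scale with the square of the number of genes, so the covered
# fraction of the search space is the square of the gene fraction.
coverage = (genes_screened / genes_total) ** 2

print(f"{coverage:.0%}")
```

Screening 35% of the genes therefore covers only about an eighth of all possible pairs.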
Although HuRI is the largest map of its kind to date, it remains incomplete, with the authors suggesting it captures between 2 and 11 percent of all human protein interactions. Still, they note that the uniform proteome and interactome coverage of HuRI enables its use as a reference for the study of most aspects of human cellular function.