The Complete Computationalist

Damian Doherty interviews Javier Alfaro from The University of Gdansk

Javier Alfaro, data science group leader, immunology and immunotherapeutics, University of Gdansk and senior research fellow at the University of Edinburgh

Damian Doherty spoke to Javier Alfaro, data science group leader–computational immunology and immunotherapeutics at the International Center for Cancer Vaccine Science at the University of Gdansk and senior fellow at the University of Edinburgh, about the evolving field of proteogenomics and his focus on diseases spanning cancer, emerging diseases and autoimmune disorders. He articulates his vision for what it will take to harness the power of the proteome and discusses the transformative work he is doing with T-cell vaccines.

 

IPM: When did your interest in the immune system begin?

Javier: I was trained in cancer proteogenomics at the University of Toronto under the guidance of two amazing mentors: Prof. Paul Boutros and Prof. Thomas Kislinger. So my background and training are in cancer biology and informatics, but immediately after my PhD, I was invited to be a group leader of computational science at the International Center for Cancer Vaccine Science in Gdansk, Poland. My interest in immunology began there, and I have focused particularly on cell-mediated immunity and antigen presentation. I study the short peptides that all cells in our body present to the immune system and characterize their properties. The centre was a brand-new research institute at the time, and I jumped at the opportunity to learn how to start things from scratch. It has been a wonderful opportunity, and I thank the FNP for providing me with it.

Gdansk University of Technology, Gdansk, Poland

 

IPM: Can you talk about the work you’re doing in clinical proteogenomics?

Javier: I have conducted several projects. One noteworthy project that has just reached the pre-print stage is a proteogenomic landscape study of oesophageal cancer, where we used an integrative -omic approach to understand how RNA-to-protein abundances become dysregulated during the emergence of this disease. The trick in proteogenomics is understanding what can be gained by studying multiple layers of the central dogma rather than any single layer. So, you might ask about aberrations and their presence or absence across the central dogma. Alternatively, you could ask about the disruption in abundances across the central dogma. In this oesophageal study, performed in collaboration with a consortium in the U.K. (OCCAMS), we investigated the dysregulation of RNA and protein abundances in aggressive disease. This project represents a colossal investment of energy from co-first authors Dr. Rob O’Neill, a surgeon in Cambridge, U.K., and Dr. Marcos Yebenes Mayordomo, who graduated from the group just last week.

I also have a very exciting project in renal cancer, working in a large consortium of 20 research institutes to leverage publicly available data and artificial intelligence towards personalized medicine. We will also collate multi-omic datasets of real cancer patients to predict the response to therapies, with my particular focus being immunotherapy. Here, I have used the lab’s know-how in computational mass spectrometry to dive deeply into the immunopeptidome, which is the way that all our cells present themselves to the cell-mediated immune system. You can find out more about this in my pre-print here: www.biorxiv.org/content/10.1101/2022.01.13.475872v1. This project was led by the tremendous efforts of one of my students, Georges Bedran, and I am incredibly proud of what he was able to achieve.

 

IPM: Where are we, would you say, in understanding the complexity of the immune system?

Javier: Very far away. The immune system is complex. I think in the end, it will be computational models driving understanding as we accumulate a wealth of high-throughput data about different interfaces in the immune-disease synapse. However, I say that as a computational researcher.

 

IPM: Beyond the success of checkpoint inhibitors what are the other burgeoning areas of immunology research you feel might have real clinical promise?

Javier: Currently, I am focusing on universal cancer vaccines. I would love to have a set of vaccines and antibody therapeutics in play that could be used in precision medicine. When a patient enters the clinic, we would sequence their genome and determine if they have mutations for which we already have a vaccine. Otherwise, we would generate a vaccine on the spot for that patient.

 

IPM: Is single-cell protein sequencing still in the hype cycle stage? Do you believe that it will ultimately have an impact on meaningful clinical insights? What about mass spectrometry, is it in the hype stage?

Javier: Mass spectrometry has been the dominant technique used in proteomics for decades. However, proteomics is a challenging problem, and we have outlined this in our recent perspective paper at www.arxiv.org/abs/2208.00530. With all the efforts and investments in this field, we are still falling short of sequencing proteomes; right now we might get an average sequence coverage of 30-40 percent for a protein in a complex mixture by bottom-up proteomics. Admittedly, this is better than just five years ago, and you can see some improvement over my last survey on the topic (www.doi.org/10.1038/s41592-021-01143-1). However, given the amount of time that has gone into the technology, I wonder if we will eventually reach the physical limits of what can be done by mass spectrometry. This will probably be determined in the next 10 years.

I attribute the challenges in sequencing proteomes by MS to sequencing biases introduced by the various decisions that allow MS-proteomics to work. For example, some peptides do not fly well in the gas phase, and some peptide fragments don’t make it either. In addition, reliance on trypsin leads to short peptides that can be reliably sequenced by fragmentation, but this limits the ability to infer proteoforms. Top-down methods are getting better and better, but are still far from matching the protein identifications of bottom-up methods due to various issues in dealing with intact proteins. So, right now, if you want the full sequence of a protein reliably, you must focus on that protein, attack it with different enzymes, and piece the results together to obtain the sequence.

The dream of complete and accurate sequencing of proteomes is far away. In addition, mass spectrometry is inaccessible to many labs, preventing widespread adoption of proteomics in the community. Compare MS to DNA sequencing, which can be accomplished by most laboratories and even taken straight into the field in the form of nanopore sequencing. Currently, there are limits to both capability and access.

So, to answer your question: Is single-cell protein sequencing still in the hype cycle stage? Do you believe that it will ultimately have an impact on meaningful clinical insights?

I think that complete, accurate, and cheap sequencing of proteomes will usher in a wave of transformations akin to what we saw for DNA sequencing. Currently, many people are unaware that we cannot already do this, but if new technology emerges that can, it will be incredibly transformative in the clinic. But how long will it take? Two years, five years, 10 years, 30 years: that’s what determines whether it’s hype, right? If I say it is 30 years away, it’s hype today, right?

My advice is to just be patient. Whenever humanity has faced a grand challenge like this, there were technical hurdles to overcome, and overcoming them moved the field forward. There are technical challenges in MS and single-molecule proteomics on the path towards single-cell sequencing. How many? I don’t think we know. How long will they take to solve? I don’t think we know. But inspired technologists will take aim and overcome them one by one. I hope sooner rather than later, given the potential medical impact down the line.

 

IPM: Do you feel the research community in the last twenty years has generally favoured the genome over the proteome in its pursuit of new targets?

Javier: Certainly the genome has been in the spotlight. Particularly for cancers, the argument has been that cancer is a disease of the genome, so cancer genomes are the most relevant layer to study. However, I think that this mindset is oversimplified. It already evolved with the recognition that epigenetic marks add a layer of plasticity to the genome and can be inherited from one cellular generation to the next. The catch is that these chemical modifications are everywhere: in the transcriptome as epi-transcriptomic changes, and in the proteome as post-translational modifications. So the logic from the epigenome continues, does it not? When the cell divides, the state of the proteome is also inherited, with all the proteoform diversity from the previous generation. It is not so simple to say that the genome is the fundamental heritable unit in modern cancer biology. The problem is that proteomes are incredibly heterogeneous and difficult to study, as they are compartmentalized in complex organellar structures. In the end, characterizing proteomes adds value to genomic studies, and this mindset will propagate through the community.

 

IPM: Were you focused on SARS CoV-2 during the pandemic given the work you do on the T cell vaccine side?

Javier: Certainly. I now have a whole arm of research dedicated to computational methods for vaccine development. I am convinced that the design of vaccines, particularly universal vaccines, can be improved. I’m surprised that artificial intelligence applications have not really been leveraged well in the most recent pandemic, despite the tremendous wealth of knowledge we have on this virus. I’d like to develop open-source solutions to help us deal with emerging diseases, and T-cell vaccines are an important part of this.

 

IPM: I know the application of ML/AI is critical in the work you are doing–are these tools being built in house or are you working with tech providers?

Javier: Both. I’m involved in a major EU project dedicated to leveraging publicly available data for personalized medicine in renal cancer (www.katy-project.eu/). In this project, we work with 20 other partner institutes, some of which are tech providers. It’s my first time working with industry in this way, and I certainly think it’s the way forward. It can be very difficult in a university setting to attract the right talent, as these technology skills pay very well in industry. I am even considering starting a consultancy to resolve the talent shortage faced in academia when it comes to recruiting bright AI and tech professionals. That said, at the University of Edinburgh, the Biomedical AI CDT houses significant expertise in artificial intelligence applied to the biomedical sciences. I have taken several students through this program over the years, and they have been quite successful.

 

IPM: We talk about the genome, proteome, microbiome, exposome–is there a growing need to understand the immunome?

Javier: Whenever we become sick, the immune system is implicated in some way. So absolutely, understanding the immunome is key for every high-throughput approach. The only problem is that there are many interfaces to study, and the best way to study them in a high-throughput manner is often with single-cell approaches, so achieving this is still very expensive.

 

IPM: How important is keeping track of new innovations and enabling technologies in the work that you do in your lab?

Javier: As a computational researcher, I see new technologies as the forefront of new computational problems. So absolutely, keeping up with new technologies is critical for innovation. I have developed pipelines that use existing tools on yesterday’s technologies, but I am always looking out for new technologies that need new tools developed.

 

IPM: How do you attract creative scientific talent and indeed what do you prioritize when you’re looking to add a valuable member to the team?

Javier: Attracting creative scientific talent is difficult for a starting PI. Particularly for IT talent, I’ve had to wait long stretches of time for the right applicant in the past. Because my lab is remote-friendly, I prioritize self-driven independence in research. Generally, my approach is to shoot people off in a clearly defined and ambitious direction, but then let them drive in directions that inspire them. I try to help them see which research questions lead to high-impact research along the way and hope to help them avoid pitfalls. But mostly I sit beside them and cheer them along. Sometimes this works, sometimes it doesn’t, and I’m sure most starting PIs would recognize that they are figuring things out and adapting as they go. Certainly, things are easier now than when I started.

 

IPM: Finally, any predictions for the World Cup?

Javier: Hah, I’m Canadian of course, so I will cheer for my home country! Surprisingly, my usual Latin American favorites (Chile and Colombia) did not make it this year.

 

Damian Doherty has been in media and publishing for nearly 30 years, beginning in the early nineties at News Corporation in Wapping. Damian has managed, edited, and launched life science titles in drug discovery and precision medicine. He was features editor of the Drug Discovery World for fourteen years and founded, established, and edited the Journal of Precision Medicine in 2014. In parallel, Damian founded and organized the Precision Medicine Leaders’ Summit, a global, immersive 3-day senior leadership conference that still runs today. He edited AIMed magazine in 2019 before launching Photo51Media, a platform for illuminating untold, compelling stories in precision healthcare. Damian joined Mary Ann Liebert in 2021 to help steer the new rebrand and relaunch of Clinical OMICS to Inside Precision Medicine.
