The history of science is replete with instances of multiple discovery, the more or less simultaneous announcement of essentially the same breakthrough by independent researchers. Even so, it may seem uncanny that two separate research groups not only produced a draft map of the human proteome but also published their results on the same day in the same journal.
Today, in the online edition of Nature, researchers from Johns Hopkins University and the Institute of Bioinformatics in Bangalore, India, published an article entitled “A draft map of the human proteome.” Similarly, researchers from Technische Universitaet Muenchen (TUM) published an article entitled “Mass-spectrometry-based draft of the human proteome.”
While the mapping of the human proteome may not prove to be as epochal as the formulation of calculus by Newton and Leibniz, or the development of evolutionary thought by Darwin and Wallace, it is still vitally important. By comprehensively cataloging human proteins, the Baltimore/Bangalore team and the Munich team have created a resource for other researchers that promises to advance personalized medicine.
As stated by the authors of the Baltimore/Bangalore paper, “With the availability of both genomic and proteomic landscapes, integrating the information from both resources is likely to accelerate basic as well as translational research in the years to come through a better understanding of gene-protein-pathway networks in health and disease.”
The dual papers seem a little less coincidental when one considers that both research groups faced similar challenges and exploited similar technologies. As a result, they were almost bound to adopt similar strategies and arrive at similar findings.
Studying proteins is far more technically challenging than studying genes because the structures and functions of proteins are complex and diverse. Moreover, a mere list of existing proteins would not be very helpful without accompanying information about where in the body those proteins are found. Most protein studies to date have focused on individual tissues, often in the context of specific diseases.
To address these challenges, both research teams took advantage of mass spectrometry, which has revolutionized proteomics studies in a manner analogous to the impact of next-generation sequencing on genomics and transcriptomics. In addition, both teams compiled information about the types, distribution, and abundance of proteins in various cells and tissues. For example, the Baltimore/Bangalore team conducted in-depth profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues, and 6 purified primary hematopoietic cells.
While working up their dataset, the Baltimore/Bangalore team identified proteins encoded by 17,294 genes, about 84% of all the genes in the human genome predicted to encode proteins. The Munich team reported that it had cataloged more than 18,000 proteins.
The Baltimore/Bangalore team indicated that it had identified 193 novel proteins that came from regions of the genome not predicted to code for proteins, suggesting that the human genome is more complex than previously thought. Similarly, the Munich team noted that it had discovered “hundreds of protein fragments that are encoded by DNA outside of currently known genes.” These new proteins may possess novel biological properties and functions.
Both teams cited the challenge of “missing proteins,” that is, proteins that should exist, given what we know about the genome, but remain unobserved. “The depth of our analysis enabled us to identify protein products derived from two-thirds (2,555 out of 3,844) of proteins designated as missing proteins for lack of protein-based evidence,” wrote the Baltimore/Bangalore researchers. “Several hypothetical proteins that we identified have a broad tissue distribution, indicating the inadequate sampling of the human proteome thus far.” The Munich researchers speculated that some missing proteins may exist only during embryonic development. They also suggested that many known genes, such as those believed to code for olfactory receptors, have simply become nonfunctional, an indication that modern humans no longer rely on a sophisticated sense of smell to survive.
Yet another parallel finding concerned housekeeping proteins, which are highly abundant, include many histones, ribosomal proteins, metabolic enzymes, and cytoskeletal proteins, and constitute about 75% of total protein mass. The Munich team reported finding around 10,000 such proteins “in many different places.” Similarly, in its article, the Baltimore/Bangalore team noted that it “detected proteins encoded by 2,350 genes across all human cells/tissues.”
“One of the caveats of tissue proteomics is the contribution of vasculature, blood, and hematopoietic cells,” it added. “Thus, proteins designated as housekeeping proteins based on analysis of tissue proteomes could be broadly grouped into two categories, those that are truly expressed in every single cell type and those that are found in every tissue (for example, endothelial cells).”
Both groups highlighted the importance of their work for speeding research and translational developments. For example, the Munich team examined 24 cancer drugs and found that their effectiveness against 35 cancer cell lines correlated strongly with the protein profiles of those cell lines. According to Prof. Bernhard Küster, the TUM Chair of Proteomics and Bioanalytics, “This edges us a little bit closer to the individualized treatment of patients. If we knew the protein profile of a tumor in detail, we might be able to administer drugs in a more targeted way. The new insights also allow medical researchers to investigate combinations of drugs and, thereby, tailoring treatments even more closely to a patient's individual needs.”
The Baltimore/Bangalore team emphasized the importance of using direct protein sequencing technologies such as mass spectrometry to complement genome annotation efforts. In addition, it outlined several proteomic research strategies that could benefit from the sampling of individual cell types of human tissues and organs and the ultimate creation of a “human cell map.”
“You can think of the human body as a huge library where each protein is a book,” said Akhilesh Pandey, M.D., Ph.D., a professor at the McKusick-Nathans Institute of Genetic Medicine and of biological chemistry, pathology and oncology at the Johns Hopkins University and the founder and director of the Institute of Bioinformatics. “The difficulty is that we don’t have a comprehensive catalog that gives us the titles of the available books and where to find them. We think we now have a good first draft of that comprehensive catalog.”
Committed to helping other researchers identify the proteins in their experiments, the Baltimore/Bangalore team has made its human proteome catalog available as an interactive web-based resource at www.humanproteomemap.org. Similarly, the Munich team, together with software company SAP, has made its inventory freely available at www.proteomicsdb.org.