Mount Sinai Health System said today its researchers will be able to explore more complex scientific questions more quickly once the New York City health system builds its second “Big Omics Data Engine” (BODE 2) supercomputer, to be funded through a $2 million grant awarded by the U.S. Department of Health and Human Services.
Through BODE 2, Mount Sinai said, its researchers will be able to use machine learning approaches to mine deep databases of genomic and clinical information, with precision medicine in mind. BODE 2 is intended to let researchers carry out both translational bioinformatics research and data-driven medicine, through broad, user-friendly, integrated access to more diverse data sources with robust, secure, bidirectional data flow between research and point-of-care programs.
“With BODE 2, we are renewing our commitment to push the boundaries of scientific research, tackle questions that we did not previously have the computational power to take on, and achieve breakthroughs that transform clinical care worldwide,” Dennis S. Charney, MD, Anne and Joel Ehrenkranz Dean, Icahn School of Medicine at Mount Sinai, and President for Academic Affairs, Mount Sinai Health System, said in a statement.
Mount Sinai disclosed two research projects to be facilitated through BODE 2:
- Understanding the Mechanism of SPI1-Dependent Alzheimer Disease Risk: BODE 2 will be designed to provide both the necessary storage for whole-genome-sequencing data sets from more than 10,000 study subjects and the processing power (approximately 12 million compute hours) to analyze the data using machine learning techniques. The analyses will be used to enhance current treatments or explore new therapies, Mount Sinai said.
- The Trans-Omics for Precision Medicine (TOPMed) Program: BODE 2 is intended to provide the 1.75 petabytes of storage necessary for the whole-genome-sequencing data, other omics, and molecular, behavioral, imaging, environmental, and clinical data for this unprecedented exploration of the biological causes underlying heart, blood, lung, and sleep disorders.
BODE 2 is also expected to provide hundreds of terabytes required for intermediate results storage, and approximately 7 million compute hours necessary for the highest-powered analysis of the TOPMed data. These processes can be greatly accelerated on BODE 2 compared with standalone machines.
“This new supercomputer will enable us to mine deep databases of genomic and clinical information using machine-learning approaches to propel the personalized medicine of today into better medicine tomorrow,” said Eimear Kenny, PhD, Associate Professor of Medicine (General Internal Medicine), and Genetics and Genomic Sciences, at the Icahn School of Medicine at Mount Sinai, Director of the Center for Genomic Health, and a Principal Investigator of the TOPMed Program. “The technology will help fuel innovative research programs to further our understanding of disease progression and management.”
28M Core Compute Hours per Year
The new BODE 2 supercomputer is a Lenovo ThinkSystem SR360 consisting of 3,840 Intel Cascade Lake cores, with 15 terabytes of memory, 14 petabytes of raw storage, and 11 petabytes of usable storage. Running at a frequency of 2.6 GHz, it will provide approximately 28 million core compute hours per year and will have a peak speed of 220 teraflops, approximately double that of BODE.
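For readers who want to sanity-check these figures, the short Python sketch below works through the arithmetic. It is an illustration only, not Mount Sinai's own accounting. The core count, the 28 million core-hour figure, the storage sizes, and the per-project compute hours come from the article. The variable names, the reading of the gap between the theoretical ceiling and the quoted figure as time lost to maintenance or idle capacity, and the treatment of the project hours as drawn from a single year are assumptions made here.

```python
# Back-of-the-envelope check of the capacity figures quoted in the article.
# The interpretation of the results is an assumption, not a statement
# from Mount Sinai.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def max_core_hours_per_year(cores: int) -> int:
    """Theoretical ceiling: every core busy for every hour of the year."""
    return cores * HOURS_PER_YEAR


CORES = 3_840                    # from the article
QUOTED_CORE_HOURS = 28_000_000   # from the article
USABLE_STORAGE_PB = 11.0         # from the article
TOPMED_STORAGE_PB = 1.75         # from the article

# Compute hours the article attributes to the two named projects.
# Assumption: treated as if both were drawn from one year's capacity.
PROJECT_CORE_HOURS = {"alzheimer_study": 12_000_000, "topmed": 7_000_000}

ceiling = max_core_hours_per_year(CORES)
delivered_fraction = QUOTED_CORE_HOURS / ceiling
project_share = sum(PROJECT_CORE_HOURS.values()) / QUOTED_CORE_HOURS
topmed_storage_share = TOPMED_STORAGE_PB / USABLE_STORAGE_PB

print(f"Theoretical ceiling: {ceiling:,} core hours per year")
print(f"Quoted figure as share of ceiling: {delivered_fraction:.1%}")
print(f"Two named projects as share of quoted hours: {project_share:.1%}")
print(f"TOPMed data as share of usable storage: {topmed_storage_share:.1%}")
```

On these numbers, 3,840 cores running around the clock would yield about 33.6 million core hours, so the quoted 28 million corresponds to roughly 83 percent of that ceiling. The two named projects together account for about two-thirds of one year's quoted capacity, and the TOPMed data set for about 16 percent of usable storage.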
“Computing capability of this size and speed is not available widely, and Mount Sinai’s investment in building this infrastructure will translate into more robust genetics and population analysis, gene expression, machine learning, and structural and chemical biology investigations, and result in new insights and advances in a wide range of diseases including Alzheimer’s, autism, influenza, prostate cancer, schizophrenia, and substance use disorders,” Patricia Kovatch, Senior Associate Dean for Scientific Computing and Data Science at the Icahn School of Medicine at Mount Sinai, said in a statement.
Kovatch is also a member of the Icahn Institute for Data Science and Genomic Technology, and Associate Professor of Genetics and Genomic Sciences, and Pharmacological Sciences. Before joining Mount Sinai, she directed the National Science Foundation’s supercomputer center at the University of Tennessee, Knoxville, located at the U.S. Department of Energy’s Oak Ridge National Laboratory, where she deployed the world’s third-fastest supercomputer in 2009, a $75 million Cray XT3.
BODE 2 is set to launch at year’s end. When it does, it will supersede BODE, an earlier-generation Cray supercomputer that consists of 2,484 Intel Haswell cores and 5 petabytes of storage. BODE was built in 2015 using a $2 million NIH grant as a second expansion of Mount Sinai’s first supercomputer, named Minerva and deployed in May 2012. BODE nodes were fully integrated into the Minerva computer complex and were restricted to users with NIH-funded, genomics-based research projects.
According to Mount Sinai, BODE was used by 61 of its basic and translational researchers, representing more than $100 million in NIH funding, along with their collaborators at 75 external institutions. BODE enabled scientific findings that appeared in more than 167 publications, including papers in Nature and Science, with a total of 2,427 citations in three years.
“Based on our experiences with BODE, BODE 2 is designed to provide our researchers and clinicians, and their external partners in Mount Sinai-led national research projects, with the necessary infrastructure to achieve faster results for greater scientific throughput, increased fidelity in their simulations and analysis, and seamless migration of research applications to the software environment for enhanced scientific productivity,” Kovatch added.