A $2.5 million grant from the National Institutes of Health (NIH) to researchers at Binghamton University, State University of New York, will be used to develop machine learning models to assess and predict cardiometabolic risks in adolescents and young adults. The grant will help fill a gap in current research on cardiometabolic diseases such as heart attack, stroke, and diabetes, most of which focuses on older patient populations.
One goal of the research is to help younger people and their healthcare providers better understand some of the early warning signs and risk factors that lead to these health problems later in life, and to develop methods and care regimens that reduce these risk factors far in advance.
The new research will be led by Bing Si, PhD, an assistant professor in Binghamton University’s Thomas J. Watson College of Engineering and Applied Science. Si will work in collaboration with clinicians at the Mayo Clinic and Harvard University to develop novel statistical models that will leverage machine learning to analyze the anonymized health data of young patients and predict cardiometabolic risks in this population.
“My research is on statistical modeling and machine learning with a focus on multimodal health data analysis, and these data can have very complex structures and challenging properties,” said Si, a faculty member in the Department of Systems Science and Industrial Engineering. “I am working to develop new data fusion and machine learning models that tackle these challenges in data analysis and generate new knowledge to facilitate medical decision-making. In this project, we have this large data set with thousands of individuals to identify those high-risk versus low-risk subgroups from the young population.”
The research will draw on diverse patient data that includes socio-demographic data, dietary information, blood test results, sleep studies, exercise habits, health questionnaires, and information from regular health checkups, among others.
Si noted that one challenge for a study such as this, which pulls in multimodal data from thousands of people, is how to address the issue she called “missingness.”
“If you are collecting multimodal data from thousands of people, for sure somebody will miss something,” she said. “Some tests may be unreliable, and we cannot use them. We are trying to use a statistical modeling approach to address that as well.”
Among the risk factors Si and her team will track are metabolic dysregulation, obesity, physical inactivity, poor nutrition, sleep disorders, and other related conditions that can lead to a higher chance of severe cardiometabolic outcomes, such as cardiovascular morbidity and mortality.
By the end of the five-year grant period, Si’s goal is to develop ways to identify different cardiometabolic subgroups that can not only help guide treatments but also provide a roadmap for early intervention for those identified as high-risk. There is also potential for Si’s approach to be applied to the study of other complex diseases.
“This is not the job of one grant to do, but we hope that after we complete our R01 project, we can contribute some new knowledge to the field and continue to study this area,” she said. “Our overarching goal is to improve cardiometabolic healthcare in young people as they transition into adulthood, and eventually to reduce the health disparity in diverse populations and reduce healthcare costs in the U.S.”