The Trans-Omics for Precision Medicine program has published an analysis of genomic data from 53,000 individuals, mostly from diverse backgrounds, providing important information on genetic variants contributing to heart, lung, blood and sleep disorders.
The NIH-backed research, which involves a large number of researchers and organizations, was published in the journal Nature. Over 400 million single-nucleotide polymorphisms and insertion or deletion variants were identified across the cohort.
The program, known as TOPMed, is ongoing and aims to sequence more than 150,000 participants altogether. The cohort has a diverse makeup: 41% European, 31% African, 15% Hispanic/Latino, 9% Asian and 4% unspecified ethnicity. Historically, most genome sequencing or genotyping studies have included only small numbers of people from diverse backgrounds, so this is a significant step forward for the field.
Calculating disease risk from one or more disease-associated variants can prove statistically complicated. These variants also vary considerably in frequency and type between populations with different ethnic backgrounds. This makes it essential to study as large and diverse a set of participants as possible to ensure that accurate precision medicine is available to everyone.
TOPMed began in 2014 and supports the National Heart, Lung, and Blood Institute's precision medicine research and other related NIH programs. The Nature paper presents an analysis of the first 53,831 genomes sequenced, but more than 90,000 genomes have been sequenced to date. A reference panel and database including 97,000 genotyped samples are available for researchers to access and contribute to.
The TOPMed researchers have also collected phenotype data and RNA, gene, and metabolite profiles for some of the participants to allow more in-depth ‘omic’ analyses of disease risks and to help improve the diagnosis, treatment and prevention of a range of diseases. Data from TOPMed is also available through the NIH Database of Genotypes and Phenotypes (dbGaP).
“We have already identified some surprising new insights,” said study co-corresponding author Timothy O’Connor, Ph.D., a researcher and associate professor at the University of Maryland School of Medicine. “Most of the time, these variants mean nothing, but they can provide a new understanding of mutational processes and recent human evolutionary history.”
For example, the enzyme encoded by the CYP2D6 gene is known to metabolize approximately 25% of prescription drugs and shows extensive genetic variation. In the 53,831 TOPMed individuals, the researchers identified 99 variant alleles of this gene that alter the function of the enzyme, 66 of which were already known and 33 of which had not previously been identified.
Of the more than 400 million genetic variants identified by the team, 97% are very rare, seen in 1% of the population or less. Around 46% of the variants discovered were present in only one individual. “These rare variants provide insights into mutational processes and recent human evolutionary history,” write the authors.
Overall, the African-American group had the most genetic diversity, followed by the Hispanic/Latino, European and Asian groups. “This is consistent with a gradual loss of heterozygosity tracking the recent African origin of modern humans and subsequent migration from Africa to the rest of the globe,” explain the authors.
This study is just the first step in the data analysis for this ambitious project, which will continue to collect genomic and other omic data to facilitate the wider roll-out of precision medicine.