World map, technical looking illustration to represent more diverse and representative genomic studies
Credit: KTSDESIGN/SCIENCE PHOTO LIBRARY/Getty Images

The Global Biobank Meta-Analysis Initiative (GBMI) is already helping to make genomic studies more diverse and representative, according to a range of new studies.

The GBMI is a network of 24 biobanks from five continents, with a combined total of more than 2.2 million participants. With nine North American biobanks, eight European (including the well-known UK Biobank), one West Asian, four East Asian, one Australian and one African biobank in the group, the combined participant pool is very diverse.

This figure shows the 23 biobanks across four continents that have joined GBMI as of April 2022, bringing the total number of samples with matched health data and genotypes to more than 2.2 million. Biobanks are colored based on the sample recruiting strategies [Source: Zhou et al./Cell Genomics]

Formed in 2019, the GBMI aims to improve statistical power for genome-wide association studies (GWAS), improve the accuracy of polygenic risk scores, enable research on understudied diseases, allow better cross-validation studies, and improve fine mapping and subgroup analyses.

This week the first seven research studies demonstrating the value of the GBMI were published in Cell Genomics. Numbers are changing as more participants join the initiative (the Uganda Genome Resource joined after the papers were submitted). But estimates from early 2022 suggest the participants are split by ancestry as follows: 1.4 million European, 42,000 African, 31,000 Central and South Asian, 415,000 East Asian, and 12,000 of Middle Eastern origin.
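To put those estimates in proportion, the short Python sketch below converts the early-2022 figures quoted above into percentage shares. The listed groups sum to about 1.9 million, so the shares are relative to those figures rather than to the full 2.2 million; the variable names are illustrative only.

```python
# Early-2022 ancestry estimates as quoted in the article.
counts = {
    "European": 1_400_000,
    "East Asian": 415_000,
    "African": 42_000,
    "Central and South Asian": 31_000,
    "Middle Eastern": 12_000,
}

# Total across the listed groups only (not the full 2.2 million).
total = sum(counts.values())

# Percentage share of each group within the listed total.
shares = {group: 100 * n / total for group, n in counts.items()}

# Print groups from largest to smallest share.
for group, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{group}: {share:.1f}%")
```

On these figures, participants of European ancestry still make up roughly three quarters of the listed total, which is why adding further non-European biobanks matters.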

A key study led by researchers at the Broad Institute combined data from most of the biobanks to produce a large combined GWAS. From the results, the researchers identified 317 known and 183 new genes linked to 14 different diseases, including asthma, gout and some cancers.

“Together, the pilot work conducted in GBMI shows that biobanks can be meta-analyzed to provide reliable genetic discoveries despite the heterogeneous characteristics across biobanks in many aspects, such as locations, sample sizes, genotyping and phenotyping approaches, sample ancestries, and strategies to recruit participants, with standardized phenotype definitions and analysis pipelines,” write first author Wei Zhou, a researcher at the Broad Institute, and colleagues.
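For readers unfamiliar with how results from separate biobanks can be pooled, the sketch below shows a fixed-effect, inverse-variance-weighted meta-analysis of a single variant, which is one common way to combine per-study GWAS estimates. It is a generic illustration, not the GBMI pipeline itself: the function name `ivw_meta` and the effect sizes and standard errors are hypothetical.

```python
import math

def ivw_meta(estimates):
    """Combine per-biobank (effect size, standard error) pairs for one variant
    using fixed-effect inverse-variance weighting."""
    # Each study is weighted by the inverse of its variance (1 / se^2),
    # so larger, more precise biobanks contribute more.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    # Pooled effect is the weighted mean of the per-study effects.
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    # Pooled standard error shrinks as total weight grows.
    se = math.sqrt(1.0 / total_weight)
    return beta, se, beta / se

# Hypothetical per-biobank estimates (effect size, standard error).
per_biobank = [(0.10, 0.05), (0.20, 0.10)]
pooled_beta, pooled_se, z_score = ivw_meta(per_biobank)
print(f"beta={pooled_beta:.3f} se={pooled_se:.4f} z={z_score:.2f}")
```

The pooled standard error is smaller than that of either input study, which illustrates the gain in statistical power that motivates combining biobanks.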

Two of the other studies published this week used data from the biobanks to look for new drug targets. One looked at samples from people of both European and African ancestry to search for new targets for eight complex diseases. Overall, 16 possible drug targets were identified for future investigation. The other focused on developing drug discovery techniques that can be used across different populations. Across 13 common diseases, its authors found new targets involved with certain types of blood coagulation and immune signalling in gout.

A couple of the studies focused on specific biobanks, namely, the nearly 40-year-old Trøndelag Health Study in Norway and the newer Taiwan Biobank. Both studies looked at findings to date and the benefits of such population-specific studies.

Another of the studies focused on rare diseases and combined data from 13 of the biobanks in a large GWAS of idiopathic pulmonary fibrosis, a rare disease characterized by lung tissue scarring. By including a more diverse participant group, the researchers identified six new variants linked to the condition that would not have been found if the analysis had been restricted to European populations alone.

The seventh study focused on gene expression and how best to carry out transcriptome-wide association studies across different populations to collect more accurate data to help improve genomic medicine.
