An AI-powered study has revealed that rare, penetrant genetic mutations confer ten times the risk of everyday diseases compared with common variants in the same genes.
Of the almost half a million people studied, nearly all carried at least one rare, penetrant mutation for conditions that ranged from cardiovascular disease to Alzheimer’s, osteoporosis, dyslipidemia, and liver disorders.
The findings, published in Science, point to the potential value of sequencing the genomes of apparently healthy people.
“What we find is that 97% of otherwise healthy people in the general population carried highly actionable variants for clinically relevant conditions,” said corresponding author Kyle Kai-How Farh, PhD, principal investigator at the Illumina AI lab, which is behind the algorithm that powered the study.
“Up to now we’ve learned that you need genome sequencing if you have a rare disease or cancer—but actually it looks like every healthy person in the population has highly impactful variants in our genomes that are clinically relevant and are important to be informed about.”
The study deployed PrimateAI-3D, an Illumina algorithm based on a deep-learning architecture similar to that of the large language model ChatGPT, but one that uses genome sequences instead of linguistic constructs.
The three-dimensional, convolutional neural network was trained using common genetic variants identified from sequencing 809 animals from 233 species of non-human primates, representing all 16 families and over 86% of living genera.
This information was then compared with the human genome under the premise that, because the species are closely related, variants that are tolerated through natural selection in other primates are unlikely to cause disease in humans.
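The premise can be illustrated with a minimal Python sketch. It is not Illumina's code and does not reproduce the neural network; it only shows the labeling idea, under which a human variant that is also common in a non-human primate is treated as likely benign. The `Variant` type, the `label_variants` function, the gene names, and the 1% frequency cutoff are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A protein-coding substitution (hypothetical representation)."""
    gene: str
    position: int  # amino-acid position in the protein
    ref: str       # reference amino acid
    alt: str       # substituted amino acid


# Assumed cutoff for calling a primate variant "common"; not from the study.
COMMON_THRESHOLD = 0.01


def label_variants(human_variants, primate_frequencies, threshold=COMMON_THRESHOLD):
    """Label each human variant by whether it is common in other primates.

    primate_frequencies maps a Variant to its highest allele frequency
    observed in any non-human primate species; absent variants count as 0.
    """
    labels = {}
    for variant in human_variants:
        freq = primate_frequencies.get(variant, 0.0)
        # Common in a related species -> tolerated by selection -> likely benign.
        labels[variant] = "likely_benign" if freq >= threshold else "unknown"
    return labels


# Usage with made-up variants in a made-up gene.
seen_in_primates = Variant("GENE1", 10, "A", "V")
not_seen = Variant("GENE1", 20, "R", "W")
labels = label_variants([seen_in_primates, not_seen], {seen_in_primates: 0.2})
```

Variants left as "unknown" are not assumed to be harmful; in this sketch they are simply the ones the primate data cannot vouch for.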
The researchers used PrimateAI-3D to estimate the pathogenicity of rare coding variants in 454,712 participants from the UK Biobank, which contains in-depth health and genetic information on half a million individuals.
Exomes, the protein-coding portions of genomes, were analyzed for genes associated with 90 complex traits and common diseases.
Applying PrimateAI-3D to missense variants greatly improved gene discovery, revealing 73% more significant gene-phenotype associations than without the algorithm.
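The kind of analysis involved, which the authors call rare-variant burden testing, can be sketched in simplified form. The Python below is an illustration only and not the study's method: it collapses each person's rare missense variants in one gene into a single burden score, counting only variants whose predicted pathogenicity passes a cutoff, and then compares a trait between carriers and non-carriers with a Welch t-statistic. The function names, the 0.8 cutoff, and the toy data are assumptions; real analyses typically use regression with covariates.

```python
from math import sqrt
from statistics import mean, variance


def burden_scores(carried, pathogenicity, cutoff=0.8):
    """Sum predicted pathogenicity over each person's qualifying variants.

    carried: {person_id: [variant_id, ...]} rare missense variants in one gene
    pathogenicity: {variant_id: score between 0 and 1}
    cutoff: assumed threshold; variants scoring below it are ignored
    """
    return {
        person: sum(pathogenicity[v] for v in variants if pathogenicity[v] >= cutoff)
        for person, variants in carried.items()
    }


def welch_t(group_a, group_b):
    """Welch t-statistic for a difference in means (needs 2+ values per group)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def burden_test(carried, pathogenicity, phenotype, cutoff=0.8):
    """Compare a quantitative trait between burden carriers and non-carriers."""
    scores = burden_scores(carried, pathogenicity, cutoff)
    carriers = [phenotype[p] for p, s in scores.items() if s > 0]
    non_carriers = [phenotype[p] for p, s in scores.items() if s == 0]
    return welch_t(carriers, non_carriers)


# Toy data: five people, three variants, one trait measurement each.
carried = {"p1": ["v1"], "p2": ["v1", "v2"], "p3": ["v2"], "p4": [], "p5": ["v3"]}
pathogenicity = {"v1": 0.9, "v2": 0.2, "v3": 0.95}
phenotype = {"p1": 5.0, "p2": 6.0, "p3": 1.0, "p4": 2.0, "p5": 7.0}

scores = burden_scores(carried, pathogenicity)
t_stat = burden_test(carried, pathogenicity, phenotype)
```

In this sketch a better pathogenicity predictor helps because it decides which variants count toward the burden: filtering out benign variants reduces noise in the carrier group.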
The investigators found that common and rare variants had complementary roles in predicting the risk of human diseases. While common variants explained a higher proportion of total population variance, rare variants more readily identified outlier individuals at the greatest risk for severe, early-onset disease.
“Understanding the impact of rare variants in common diseases is of prime interest for both precision medicine and the discovery of drug targets,” the authors note.
“By leveraging advances in variant effect prediction, we have demonstrated major improvements in rare-variant burden testing and genetic risk prediction.
“Notably, we observed that nearly all individuals carried at least one rare penetrant variant for the phenotypes we examined, demonstrating the utility of personal genome sequencing for otherwise healthy individuals in the general population.”
Illumina has said PrimateAI-3D will be made broadly available to the genomics community in the next release of its Connected Software products.