GWAS Used to Find Highly Pathogenic SARS-CoV-2 Mutation


A genome-wide association study (GWAS) has been used to detect an emerging highly pathogenic variant of the SARS-CoV-2 virus. This approach could be useful for flagging dangerous variants, according to Harvard T.H. Chan School of Public Health and MIT researchers.

Using GWAS and mortality data, the researchers pinpointed a mutation in the variant known as P.1, or Gamma, and linked it to higher mortality and, potentially, greater transmissibility, higher infection rates, and increased pathogenicity. The mutation, at locus 25,088 bp in the virus’s genome, alters the spike protein and was associated with a significant increase in mortality among COVID-19 patients.

This approach, the researchers said, should have broader applications beyond the P.1 variant and SARS-CoV-2. The team’s methodology is described online today in Genetic Epidemiology.

“Based on our experience, GWAS methodology might provide suitable tools that could be used to analyze potential links between mutations at specific locations in viral genomes and disease outcome,” said Christoph Lange, professor of biostatistics at Harvard Chan School and senior author of the paper. “This could enable better real-time detection of novel, deleterious variants/new viral strains in pandemics.”

The first patients in Brazil with the P.1 variant were documented in January 2021, and within a few weeks the variant caused a spike in cases in Manaus, Brazil. The city had already been hit hard by the pandemic in May 2020, and researchers thought its residents had achieved population immunity because so many people in the area had developed antibodies to the virus during that initial wave. Instead, P.1, which has several mutations in the spike protein, caused a second wave of infections and appeared to be more transmissible and more likely to cause death than the earlier variants seen in the area.

In September 2020, several months before the first P.1 patient was documented, the Harvard Chan School and MIT team repurposed GWAS methodology, which is widely used to link genetic variations with specific diseases, to analyze the relative pathogenicity of various SARS-CoV-2 mutations. The team looked for links between each mutation in the virus’s single-stranded RNA genome and mortality in 7,548 COVID-19 patients. Data for the study came from the Global Initiative on Sharing Avian Influenza Data (GISAID) database, which contains genetic sequences and related clinical and epidemiological data for SARS-CoV-2 and influenza viruses.
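To make the idea of a per-locus scan concrete, here is a minimal Python sketch. It is not the team's pipeline, which the article does not describe in detail. As a stand-in, it applies a plain Pearson chi-square test to a 2x2 table (mutation present or absent versus died or survived) at each locus. The function names and the toy data are hypothetical and serve only as illustration.

```python
import math

def chi2_pvalue_2x2(a, b, c, d):
    """P-value of a Pearson chi-square test (1 degree of freedom, no
    continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # no variation in the mutation or the outcome: nothing to test
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(stat / 2.0))

def scan_loci(genotypes, died):
    """genotypes: one list of 0/1 flags per patient (1 = mutation at that locus).
    died: one 0/1 outcome per patient. Returns one p-value per locus."""
    pvalues = []
    for j in range(len(genotypes[0])):
        a = b = c = d = 0
        for flags, outcome in zip(genotypes, died):
            if flags[j] and outcome:
                a += 1  # mutated, died
            elif flags[j]:
                b += 1  # mutated, survived
            elif outcome:
                c += 1  # not mutated, died
            else:
                d += 1  # not mutated, survived
        pvalues.append(chi2_pvalue_2x2(a, b, c, d))
    return pvalues

# Hypothetical toy data: 40 patients, 2 loci.
# Locus 0 tracks mortality closely; locus 1 is balanced across outcomes.
toy = (
    [([1, i % 2], 1) for i in range(18)]
    + [([1, i % 2], 0) for i in range(2)]
    + [([0, i % 2], 1) for i in range(2)]
    + [([0, i % 2], 0) for i in range(18)]
)
genotypes = [flags for flags, _ in toy]
died = [outcome for _, outcome in toy]
pvalues = scan_loci(genotypes, died)
```

On this toy input the first locus receives a very small p-value and the second does not. A real analysis would also need to account for factors this sketch ignores, such as patient age, sampling location, and the shared ancestry of viral sequences.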

In total, they evaluated 29,891 sequenced loci of the viral genome for their effect on patient mortality. They found two loci, at 12,053 bp and 25,088 bp, that achieved genome-wide significance (p-values of 4.09 × 10⁻⁹ and 4.41 × 10⁻²³, respectively), though only 25,088 bp remained significant in follow-up analyses.
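For a sense of what genome-wide significance can mean when this many loci are tested, the short sketch below applies a Bonferroni correction. Both the correction method and the 0.05 significance level are assumptions made for illustration; the article does not state which threshold the researchers used. Only the number of loci and the two p-values come from the text.

```python
N_LOCI = 29_891   # loci evaluated, as reported in the article
ALPHA = 0.05      # assumed family-wise error rate (not stated in the article)

# Bonferroni: divide the overall error rate by the number of tests
threshold = ALPHA / N_LOCI  # roughly 1.67e-06

# Reported p-values, keyed by genome position in bp
reported = {12_053: 4.09e-09, 25_088: 4.41e-23}
significant = {locus: p < threshold for locus, p in reported.items()}

for locus, passed in significant.items():
    print(f"locus {locus} bp: p = {reported[locus]:.2e}, below threshold: {passed}")
```

Under these assumptions both reported p-values fall well below the corrected threshold, which is consistent with the article's statement that both loci reached genome-wide significance.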

They write: “The locus at 25,088 bp is located in the P.1 strain, which later (April 2021) became one of the distinguishing loci of the Brazilian strain, as defined by the Centers for Disease Control. Specifically, the mutations at 25,088 bp occur in the S2 subunit of the SARS-CoV-2 spike protein, which plays a key role in viral entry of target host cells.” Because the mutations alter amino acid coding sequences, they may cause structural changes that could enhance viral infectivity and symptom severity.

“We expect that this approach would work in similar scenarios involving other diseases, provided the quality of the data collected in public databases is sufficiently high,” said Georg Hahn, research associate and instructor of biostatistics at Harvard Chan School and co-first author of the paper.