DNA from stool samples can reveal a person’s sex, ancestry, and whether they carry particular genetic variants, according to research findings that could have implications for privacy and confidentiality.
The study, reported in the journal Nature Microbiology, also demonstrated that it was possible to accurately reidentify people from human DNA in fecal samples by comparing it with their genotype data.
The discovery is important given that research into microorganisms inhabiting the human body, collectively known as the microbiome, often involves fecal samples.
It suggests that fecal metagenomes—involving the complete sequencing of genetic material from a sample—should be managed sensitively.
“Given that human germline genotype is highly confidential information and care should be taken when shared with the community, human reads in metagenome data should be removed before deposition, especially when personal identifying information should not be accessed,” the Japanese investigators maintain.
“In addition, both the researchers and study participants should recognize that metagenome data include germline genome information.”
One of the driving forces behind the expansion of microbiome research has been the rapid progress of high-throughput metagenome sequencing technologies, including metagenome shotgun sequencing.
This technique directly sequences bulk DNA extracted from microbiome samples without targeted amplification of bacteria-specific marker genes.
As a result, the non-bacterial component of a sample, including host DNA, is also sequenced.
Fecal samples typically contain relatively little host DNA: less than 10%, compared with more than 90% in saliva, nasal cavity, skin, and vaginal samples.
Yet the question of whether personal information can be reconstructed from gut metagenome data had remained unanswered.
To investigate further, Yukinori Okada, PhD, a professor in statistical genetics at Osaka University, and colleagues developed a series of methods to reconstruct personal information from a small number of human reads.
They then applied these methods to gut metagenome data from 343 Japanese individuals with available genotype data.
The team found they could predict genetic sex based on the sequencing depth of the sex chromosomes for 97.3% of the fecal samples.
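To illustrate the idea of depth-based sex prediction, here is a minimal sketch in Python. It is not the study's pipeline: the function name `predict_sex`, the cutoff `ratio_threshold`, and the read counts in the usage note are hypothetical, and the chromosome lengths are the approximate GRCh38 totals. The assumption is simply that samples from males carry reads mapping to the Y chromosome at a depth comparable to the X chromosome, while samples from females carry almost none.

```python
# Approximate GRCh38 chromosome lengths in base pairs.
CHROM_LENGTH = {"X": 156_040_895, "Y": 57_227_415}


def predict_sex(x_reads: int, y_reads: int, ratio_threshold: float = 0.3) -> str:
    """Guess genetic sex from the number of human reads mapped to chrX and chrY.

    ratio_threshold is an illustrative cutoff, not a value from the study.
    """
    if x_reads == 0:
        # Too few human reads to say anything.
        return "undetermined"
    # Normalize read counts by chromosome length to get a depth-like quantity.
    x_depth = x_reads / CHROM_LENGTH["X"]
    y_depth = y_reads / CHROM_LENGTH["Y"]
    # A Y-to-X depth ratio near zero suggests no Y chromosome is present.
    ratio = y_depth / x_depth
    return "male" if ratio >= ratio_threshold else "female"
```

A call such as `predict_sex(1000, 300)` would classify a sample with substantial Y-chromosome coverage, whereas `predict_sex(1000, 5)` would fall below the cutoff. In practice the cutoff would need to be calibrated on samples of known sex.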
Participants could also be reidentified from the matched genotype data with 93.3% sensitivity using a likelihood score-based method.
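The general logic of a likelihood score can be sketched as follows. This is a simplified illustration rather than the researchers' actual method: the function names, the sequencing error rate of 0.01, and the data layout are all assumptions. Each human read that covers a known variant site is scored by how much better a candidate's genotype explains the observed allele than the population allele frequency does, and the candidate with the highest total score is taken as the match.

```python
import math


def read_log_likelihood(is_alt: bool, alt_fraction: float, error: float = 0.01) -> float:
    """Log-probability of observing one allele, given the expected alt-allele fraction."""
    # Mix in a small sequencing error rate so the probability is never 0 or 1.
    p_alt = alt_fraction * (1 - error) + (1 - alt_fraction) * error
    return math.log(p_alt if is_alt else 1 - p_alt)


def likelihood_score(reads, genotype, allele_freq, error: float = 0.01) -> float:
    """Sum of per-read log-likelihood ratios: candidate genotype vs. population.

    reads:       list of (snp_id, is_alt) pairs observed in the metagenome
    genotype:    dict snp_id -> alt-allele dosage (0, 1 or 2) for one candidate
    allele_freq: dict snp_id -> population alt-allele frequency
    """
    score = 0.0
    for snp, is_alt in reads:
        score += read_log_likelihood(is_alt, genotype[snp] / 2, error)
        score -= read_log_likelihood(is_alt, allele_freq[snp], error)
    return score


def best_match(reads, genotypes, allele_freq) -> str:
    """Return the ID of the candidate whose genotype best explains the reads."""
    return max(
        genotypes,
        key=lambda pid: likelihood_score(reads, genotypes[pid], allele_freq),
    )
```

A positive score means the reads fit the candidate better than they fit a random member of the population, and a negative score means the opposite. A real analysis would also need a score threshold to decide when no candidate is a credible match.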
The likelihood score-based method also enabled the researchers to predict the ancestry of 98.3% of the samples.
Ultra-deep shotgun metagenomic sequencing of five fecal samples, together with whole-genome sequencing of blood samples, demonstrated that genotypes of both common and rare variants could be reconstructed from fecal samples.
A subset of the rare variants recovered by direct genotype calling was associated with diseases including orotic aciduria and deficiency of butyryl-CoA dehydrogenase.
This, say the researchers, suggests that the rare-variant information reconstructed from the gut metagenome data could reveal the disease risk of an individual.
“Given the volume of fecal samples used for ultra-deep sequencing, approximately 40 mg of feces could be sufficient to call genome-wide common variants and some highly confidential rare variants,” they say.