A team that included scientists from the University of North Carolina (UNC) School of Medicine reports the development of a research model that relates variations in DNA and gene activity to the risk of brain disorders. The model, described in a paper (“Comprehensive functional genomic resource and integrative model for the human brain”) published in Science, draws from prior studies of thousands of healthy people and people with brain disorders.
Scientists can now use it as a tool to explore the biological mechanisms of disorders such as schizophrenia and autism, which have largely eluded a deep understanding and have no cure, according to the researchers.
“Strong genetic associations have been found for a number of psychiatric disorders. However, understanding the underlying molecular mechanisms remains challenging. To address this challenge, the PsychENCODE Consortium has developed a comprehensive online resource and integrative models for the functional genomics of the human brain,” wrote the investigators.
“The base of the pyramidal resource is the datasets generated by PsychENCODE, including bulk transcriptome, chromatin, genotype, and Hi-C datasets and single-cell transcriptomic data from ~32,000 cells for major brain regions. We have merged these with data from Genotype-Tissue Expression (GTEx), ENCODE, Roadmap Epigenomics, and single-cell analyses. Via uniform processing, we created a harmonized resource, allowing us to survey functional genomics data on the brain over a sample size of 1866 individuals.
“From this uniformly processed dataset, we created derived data products. These include lists of brain-expressed genes, coexpression modules, and single-cell expression profiles for many brain cell types; ~79,000 brain-active enhancers with associated Hi-C loops and topologically associating domains; and ~2.5 million expression quantitative-trait loci (QTLs) comprising ~238,000 linkage-disequilibrium–independent single-nucleotide polymorphisms and of other types of QTLs associated with splice isoforms, cell fractions, and chromatin activity.
“By using these, we found that >88% of the cross-population variation in brain gene expression can be accounted for by cell fraction changes. Furthermore, a number of disorders and aging are associated with changes in cell-type proportions. The derived data also enable comparison between the brain and other tissues. In particular, by using spectral analyses, we found that the brain has distinct expression and epigenetic patterns, including a greater extent of noncoding transcription than other tissues.
“The top level of the resource consists of integrative networks for regulation and machine-learning models for disease prediction. The networks include a full gene regulatory network (GRN) for the brain, linking transcription factors, enhancers, and target genes from merging of the QTLs, generalized element-activity correlations, and Hi-C data. By using this network, we link disease genes to genome-wide association study (GWAS) variants for psychiatric disorders. For schizophrenia, we linked 321 genes to the 142 reported GWAS loci. We then embedded the regulatory network into a deep-learning model to predict psychiatric phenotypes from genotype and expression. Our model gives a ~6-fold improvement in prediction over additive polygenic risk scores. Moreover, it achieves a ~3-fold improvement over additive models, even when the gene expression data are imputed, highlighting the value of having just a small amount of transcriptome data for disease prediction. Lastly, it highlights key genes and pathways associated with disorder prediction, including immunological, synaptic, and metabolic pathways, recapitulating de novo results from more targeted analyses.
“Our resource and integrative analyses have uncovered genomic elements and networks in the brain, which in turn have provided insight into the molecular mechanisms underlying psychiatric disorders. Our deep-learning model improves disease risk prediction over traditional approaches and can be extended with additional data types (e.g., microRNA and neuroimaging).”
“It’s the most comprehensive functional genomic resource ever developed for understanding the brain, and it establishes a framework for integrating different kinds of genomics data to get deep insights into the biology of brain disorders,” said co-first author Hyejung Won, Ph.D., assistant professor of genetics at the UNC School of Medicine and member of the UNC Neuroscience Center.
Scientists in the last few decades have performed hundreds of studies that gather DNA-sequence and related data on large groups of people to identify DNA variations and other genome-related factors associated with diseases. These genomics studies have generated important clues to the biological causes of many illnesses.
But for psychiatric disorders and many other common brain disorders, traditional genomics studies have been less useful. Schizophrenia, for example, has been linked to specific variations at more than 100 locations on the genome (called “risk loci”), but most of these loci do not contain genes, so it is unclear how they relate to the disease.
Moreover, the many gene variants that have been linked to schizophrenia typically have only weak effects on schizophrenia risk. This has suggested to scientists that schizophrenia and probably many other brain disorders are too complex to understand with traditional, one-dimensional genomics approaches.
In pursuit of a more sophisticated approach, a group of genomics researchers formed the PsychENCODE Consortium several years ago. They began to pool data from their own genomics studies and from other publicly available studies, and to develop tools for finding relationships between different kinds of data.
The new resource includes different kinds of genomics data on individuals who had schizophrenia, bipolar disorder, or autism spectrum disorder. The types of genomics data include DNA sequences, data on gene expression from specific kinds of brain cells, maps of DNA regions called enhancers that promote gene expression, and other features of the genome known to affect gene activity.
Dr. Won contributed data from her own studies on “chromosome conformation.” This refers to the three-dimensional organization of looped DNA in the nuclei of cells, and in particular, the points where different loops come close enough to influence each other’s gene expression. She also developed a complex model of how gene expression in brain cells is regulated by chromosome conformation and other genomic factors.
The team used the gene regulatory network model to evaluate 142 schizophrenia risk loci uncovered by prior genomics studies. Most of these risk loci do not contain genes, but they are suspected of contributing to schizophrenia risk by influencing the expression of other genes. The model identified 321 genes, including some that are known schizophrenia risk genes, as the likely regulatory targets of these risk loci.
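One simple way to picture how a risk locus can be tied to a gene it does not contain is to look for regulatory elements that overlap the locus and then follow their links to target genes. The Python sketch below illustrates only that idea. The coordinates, locus names, and gene names are invented, the helper functions are hypothetical, and the consortium's actual network combines QTLs, activity correlations, and Hi-C data in ways this toy example does not attempt to reproduce.

```python
from typing import Dict, List, Set, Tuple

# (chromosome, start, end), treated as a half-open interval [start, end)
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals share at least one position."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def link_loci_to_genes(
    loci: Dict[str, Interval],
    enhancer_targets: List[Tuple[Interval, List[str]]],
) -> Dict[str, Set[str]]:
    """Collect, for each locus, the target genes of every overlapping enhancer."""
    linked: Dict[str, Set[str]] = {}
    for name, locus in loci.items():
        genes: Set[str] = set()
        for enhancer, targets in enhancer_targets:
            if overlaps(locus, enhancer):
                genes.update(targets)
        linked[name] = genes
    return linked


# Hypothetical risk loci (not real coordinates).
loci = {
    "locus_1": ("chr1", 100, 200),
    "locus_2": ("chr2", 500, 600),
}

# Hypothetical enhancers, each paired with the genes it is assumed to regulate.
enhancer_targets = [
    (("chr1", 150, 180), ["GENE_A", "GENE_B"]),
    (("chr1", 400, 450), ["GENE_C"]),
    (("chr2", 590, 650), ["GENE_D"]),
]

linked = link_loci_to_genes(loci, enhancer_targets)
for name in sorted(linked):
    print(name, sorted(linked[name]))
```

In this toy data, the first locus picks up two genes through a single enhancer, which mirrors how a comparatively small number of loci can map to a larger number of candidate genes.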
Dr. Won and colleagues showed that these genes affect the functions of synapses, acetylcholine receptors, ion channels, and other pathways implicated in prior schizophrenia studies. The scientists also determined that schizophrenia is primarily a disorder of neurons, not of other brain cells.
The resource developed for the study includes a deep-learning model that estimates the risk of psychiatric disorders from gene-variant and gene-expression data. The scientists compared the new model with a standard, much simpler model that predicts psychiatric illness from an individual’s genome alone.
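For readers unfamiliar with the simpler baseline, an additive polygenic risk score is essentially a weighted sum: each variant contributes its effect weight multiplied by the number of risk alleles a person carries. The Python sketch below shows that arithmetic only. The variant names, weights, and genotype are made up for illustration, and the function is a hypothetical helper rather than anything taken from the study.

```python
from typing import Dict


def polygenic_risk_score(genotype: Dict[str, int], weights: Dict[str, float]) -> float:
    """Additive score: sum over variants of (risk-allele count x effect weight)."""
    score = 0.0
    for variant, weight in weights.items():
        # Allele counts are 0, 1, or 2; variants absent from the genotype count as 0.
        score += weight * genotype.get(variant, 0)
    return score


# Hypothetical per-variant effect weights (for example, log odds ratios from a GWAS).
weights = {"variant_a": 0.5, "variant_b": 0.25, "variant_c": -0.125}

# Hypothetical individual: number of risk alleles carried at each variant.
genotype = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(genotype, weights)
print(score)
```

Because every variant contributes independently, a score like this cannot capture interactions between variants, enhancers, and gene expression, which is the kind of structure the deep-learning model is described as incorporating.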
“The deep-learning model was much more accurate, and we think it will have a big impact in terms of risk assessment and diagnosis for patients,” said Dr. Won, adding that she and her PsychENCODE colleagues are continuing to develop their model by integrating more types of genomics data and extending their analyses beyond schizophrenia to other brain disorders.