Analyses of GWAS signal using GRIN identify additional genes contributing to suicidal behavior

Multiplex biological network generationThe function of GRIN is contingent on the user supplying networks that represent known gene-gene biological relationships. In order to capture these relationships from diverse types of biological evidence, a multiplex network was assembled from weighted network connections (edges) from a combination of publicly available and newly generated monoplex (single layer) networks. A multiplex network has an advantage over aggregate multilayer networks in that the unique topology of each layer is maintained, resulting in generally higher functional predictive ability34. Multiple component networks from HumanNet v247 were used (co-functional links by co-citation, co-essentiality48, co-expression, molecular pathway databases, gene neighborhood, phylogenetic profile associations, and orthologous protein-protein interactions transferred from model organisms [CC, CE, CX, DB, GN, PG, IL]), and a protein-protein interaction (PPI) network was generated by merging the following networks into a single monoplex layer: HumanNet v2 component PPI networks (HT, LC), and high-confidence physical protein-protein interactions from STRING version 11.049 (taxa = 9606, protein.actions.v11.0, mode = binding, min score = 700).As the dorsolateral prefrontal cortex (dlPFC) is a key brain region involved in processes disrupted in individuals with a history of suicide attempt (e.g., deficits in executive function and impulsivity50,51), we included multiple dlPFC-specific networks to gain tissue type-specific perspective on gene-gene relationships from this brain region. This included a dlPFC-specific transcription factor-gene network layer from a previously published transcription factor binding site network52. A newly generated dlPFC (Brodmann area 9) Predictive Expression Network (PEN) was obtained using the Iterative Random Forest – Leave One Out Prediction (iRF-LOOP) method32,53 using individual-level RNA-seq expression data from the Genotype-Tissue Expression (GTEx) project54. The resulting multiplex network was built using RWRtoolkit55 (https://github.com/dkainer/RWRtoolkit), which incorporates command-line scripts and an R library for generating multiplex networks and running the network exploration algorithm random walk with restart (RWR) by building upon the RandomWalkRestartMH R package34. The multiplex network used for all analyses comprises 10 layers, 51,183 unique genes, and 3,419,975 edges using δ = 0.5, where δ is the probability of the random walker remaining in the current network layer or moving to a different layer. The multiplex network used for all analyses is publicly available at https://github.com/sullivanka/GRIN/tree/main/test/suicide_weighted_Multiplex_0.5Delta.RData.GRIN processGRIN leverages the hypothesis that false positive genes in a user’s gene set, such as from SNP-to-gene assignment from GWAS, are likely to be functionally random with respect to the rest of the gene set, while true positive genes are likely to share function with other members of the gene set. GRIN is a classifier that uses information captured in biological networks from diverse lines of evidence to determine which genes in a gene set are functionally related to each other (and therefore belong together) and which ones appear to be randomly included and are likely false positives. GRIN achieves this by first scoring every gene in the network (including those in the user’s gene set) according to how topologically accessible it is from each gene in the user’s gene set as determined by the network propagation algorithm Random Walk with Restart (RWR). GRIN then classifies the genes in the user’s gene set as true or false positives based on their RWR rankings. GRIN runs its RWR process for 100 random gene sets of the same dimension as the user’s gene set to build a null ranking distribution, so that GRIN can learn what false positive gene set ranks should look like under the assumption they are random. The ordered gene rankings for the user’s gene set is compared to the ordered null rankings to find the position where the distribution of rankings in the user’s gene set no longer diverges significantly from the null distribution. Using this theory, GRIN partitions the user’s gene set, such as from SNP-to-gene assignment from GWAS, into a Retained gene set and Removed gene set in a two-stage process.In Stage 1, every gene in the network is ranked according to how connected it is to the genes in the user-specified gene set (e.g., GWAS-derived genes). This includes ranking the user-specified genes themselves by using leave-one-out cross-validation (LOOCV). RWR provides each gene with a rank that is a proxy for how easily each gene in the network can be reached from the starting set of GWAS genes, including a rank for the GWAS genes themselves. Genes with many paths and interactions to one or more of the GWAS genes rank strongly, while genes that are isolated or distant from the GWAS genes rank poorly. In the current implementation, this RWR-based ranking occurs based on network propagation of probabilities of visiting a given gene in the multiplex network, which is based on a matrix representation of the edge weights between genes in the multiplex network (i.e., the supra-adjacency matrix composed of all intra- and inter-layer connections). Random walks are then simulated many times by propagating the probability of the random walker exploring a given gene beginning from the seed genes, and this process continues until the combined network probabilities no longer change between simulated random walks by a given threshold (1e−10), thereby achieving convergence based upon an asymptotic number of simulated random walks. The advantage of using a propagation algorithm like RWR is that genes that are not direct neighbors of GWAS genes may still rank highly due to indirect paths. Additional parameters can be used to tune RWR to favor certain network layers (τ) or adjust the probability of restart (r) at seed genes. In all analyses in the present study, we used r = 0.7, equivalent τ values for all network layers, and a multiplex network with δ = 0.5 based on previous work that achieved good performance using these parameters34.To obtain accurate rankings for each gene in a gene set of size n, we chose to implement random walk with restart leave-one-out cross validation (RWR-LOOCV) n times, where in each run one gene is left out and the other n-1 genes are used as seed genes (starting points) for the random walker in the multiplex network. Each run of RWR-LOOCV generates a ranking of every non-seed gene in the multiplex, including the left-out gene from the original seed gene set, so that each gene in the user’s obtains n-1 rank values after n runs. Stage 1 then orders the genes in the set from best to worst according to their median rank values. GRIN also needs a representation of what Stage 1 results should look like for purely random gene sets of size n. This empirical null distribution is generated by running RWR-LOO for 100 gene sets, each containing n randomly sampled genes from the multiplex. The median rank at each position in the order from 1 to n thus represents the empirical null distribution of ranks for this specific multiplex and gene set size.In Stage 2, a cutoff C between 1 and n is determined below which all gene set members are considered the equivalent of random and can be discarded. A two-sided Mann–Whitney U test from the R stats base package (“wilcox.test”) is performed over a sliding window of size \({winsize}=0.15\times n\) to see if the RWR-LOOCV ranks for the gene set members come from the same distribution as the null distribution RWR-LOOCV ranks. The expectation is that a gene set window containing functional groups of genes will have a very different ranking distribution to the random genes in the equivalent null window, resulting in very small (significant) p-values. On the other hand, if the window contains genes with little functional relatedness, the ranking distribution will appear to be drawn from the null distribution and the p-value tends towards 1. This test is run for each window sliding by 1, producing a p-value vector of length n-winsize. The cutoff C is chosen by finding an elbow in the p-values using the open source R package “Knee Arrower” with the method = “first” parameter set (https://github.com/agentlans/KneeArrower). The output is a Retained gene set and a Removed gene set.Validation of GRIN using well-characterized gene setsTo determine the ability of GRIN to effectively remove noise genes from a gene set, we obtained a variety of well-characterized biologically interrelated gene sets (“gold” sets) and spiked them with random genes drawn from the full multiplex network. Given our application of this method to suicide GWAS summary statistics, we chose 20 gene sets related to diverse brain functions. We included an additional 10 gene sets related to other organ systems (lung and kidney) in order to demonstrate that GRIN can be used in other biological contexts. These thirty “gold standard” gene sets of functionally interrelated genes (see Supplementary Table 1), ranging in size from 10 to 225 genes, were derived from the following sources: Gene Ontology (GO16); Online Mendelian Inheritance in Man (OMIM56); and DisGENET57. Random genes were added to the full list of genes in each gold set to create gene sets with a 1:1 signal-to-noise ratio (i.e., Ngold : Nrandom). For each of the 30 gold sets we generated 100 test gene sets using varying samples of random genes. GRIN was then used to filter out random genes from each test gene set and the effectiveness of the filter was evaluated using receiver operator characteristics (ROC) and precision/recall (PR) measured at every possible cutoff point, C, in each rank-ordered gene set.For evaluation purposes, “true positive” genes were labeled as genes belonging to a gold gene set that were correctly retained by GRIN; “true negative” genes were randomly added genes that were correctly removed by GRIN; “false positive” genes were randomly added genes that were incorrectly retained by GRIN; and “false negative” genes were gold genes that were incorrectly removed by GRIN. ROC (false positive rate vs true positive rate), and PR curves (precision vs recall) were generated and area under ROC (AUROC) and area under PRC (AUPRC) values were calculated for each test gene set. Median AUROC and AUPRC were calculated for each of the 30 gold standard gene sets to indicate whether Stage 1 of GRIN ranked gold genes more highly than random genes in general. After estimating the optimal cutoff C at Stage 2, precision, recall, and specificity (true negatives / true negatives + false positives) were calculated for the genes removed and the genes retained by Stage 2. Median precision, recall, and specificity values were calculated across the 100 test gene sets for each of the 30 gold standard gene sets. Values are presented as median +/- interquartile range (IQR).GRIN was also tested on unequal ratios of gold standard genes and random genes using the dopaminergic synaptic signaling gene set from GO (GO:0001963) and Acute Kidney Failure gene set from DisGeNET (C0022660) – 2:1 gold genes to random genes, 1:2 gold genes to random genes, 1:4 gold genes to random genes, and 1:10 gold genes to random genes. For each ratio of gold standard genes to noise, 100 test sets were generated. Finally, to test whether GRIN could remove random genes from gene sets containing multiple groups belonging to biological processes that were functionally distinct, multiple gold gene sets were combined and random noise also added. Dopaminergic synaptic transmission (GO:0001963; 23 genes), central nervous system myelination (GO:0022010; 20 genes), Alzheimer’s disease (OMIM #: 607822, 104300, 606889, 608907, 602192, 615590; 12 genes), and major depressive disorder (DisGeNET gene-disease association score ≥ 0.5; 14 genes) were mixed with five ratios of random to gold standard genes – 2:1 gold genes:noise, 1:1 gold genes:noise, 1:2 gold genes:noise, 1:4 gold genes:noise, and 1:10 gold genes:noise. This process was repeated to generate 100 gene sets of gold standard and random genes for each ratio examined.Evaluating GRIN with low- and high-powered GWAS resultsWe evaluated GRIN’s ability to control for the loss of precision when using a lower powered GWAS with relaxed significance thresholds to detect genes that were genome-wide significant in a higher powered GWAS. To do this, we obtained published GWAS summary statistics from multiple studies of the same trait and defined the highest powered study (the one with largest sample size) as the gold GWAS for that trait. To fairly compare GRIN to GWAB and NAGA, SNP-to-gene mapping was performed in the same way for each GWAS using the method used by GWAB and NAGA. SNPs were therefore assigned to protein-coding genes within a +/− 10 kbp window, with the SNP with the lowest p-value assigned to the nearest gene (or multiple genes in this window if the SNP was intergenic), and the genes identified in the gold GWAS at a stringent threshold of p < 5e−8 were labeled as true positives for the trait. We then ran GRIN on the gene sets from the lower powered studies at progressively more relaxed thresholds from p < 5e−8 by half orders of magnitude down to p < 1e−3 and measured the precision and recall of the GRIN-retained set of genes at each threshold. Using the higher powered GWAS as gold standard results, we calculated precision and recall using the total number of genes of the lower powered GWAS (No GWAS Boosting) and compared this to the values from GRIN-retained genes at each significance threshold. We did this for 10 human traits or diseases: coronary artery disease (CAD1-358,59,60); number of alcohol-containing drinks consumed per week (DrinksPerWeek1-261,62; HDL cholesterol (HDL1-363,64,65; height1-366,67,68; LDL cholesterol (LDL1-363,64,65); schizophrenia (SCZ1-369,70,71); smoking initiation1-261,62; type 2 diabetes (T2D1-272,73); total cholesterol (TC1-363,64,65); and total triglycerides (TG1-363,64,65). For CAD58,59, height66,67, SCZ69,70, and blood lipids traits (HDL, LDL, TC, TG63,64), two earlier, lower powered GWAS results were compared to the later, higher powered GWAS60,65,68,71 (e.g., CAD1 and CAD2 were used to measure precision and recall of CAD3). For traits where under 2000 genes were assigned at a given threshold, we did not proceed with lower thresholds with the exception of the second set of height results (Height2), as 2710 genes were assigned at p < 5e-8 (went down by half orders of magnitude to p < 1e−5).Comparing GRIN to other GWAS boosting methodsWe sought to compare the performance of GRIN to two other GWAS-boosting methods, NAGA and GWAB, using the same approach of considering higher powered GWAS results as a gold set from which to evaluate precision and recall of lower powered GWAS results.NAGA works by first assigning to each protein-coding gene the p-value of the best GWAS SNP in near proximity (i.e., +/− 10kbp), which produces a ranking of all protein-coding genes. We used NAGA’s SNP-gene assignment to assign SNPs to protein-coding genes within +/− 10kbp for the genome assembly that was used to run the GWAS: hg18/GRCh36 (CAD158, HDL163, Height166, LDL163, SCZ169, TC163, TG163, T2D172), hg19/GRCh37 (CAD259, CAD360, DrinksPerWeek161, HDL264, HDL365, Height267, LDL264, LDL365, SCZ270, SmokingInitiation161, TC264, TC365, TG264, TG365, SCZ371, T2D273), or hg38/GRCh38 (DrinksPerWeek262, Height368, SmokingInitiation262). NAGA then propagates the p-values over a gene-to-gene functional network using either the random walk with restart algorithm or heat diffusion, which is intended to boost the recall of relevant (true positive) genes for the trait by re-ranking the genes and increasing their rank position. Thus, while the traditional full NAGA output is a full list of ranked genes, it is up to the user to determine a cutoff in the ranked list for further investigation. In order to make a fair comparison between NAGA and GRIN, we used the top n ranked genes from NAGA to calculate precision and recall, where n is the total number of genes that were input to GRIN at a given statistical threshold. All NAGA rankings were performed using a Jupyter Notebook, random walk with restart network propagation, and the NAGA-supplied monoplex network “Original PCNet” (http://www.ndexbio.org/#/network/f93f402c-86d4-11e7-a10d-0ac135e8bacf).We ran GWAB (located at https://www.inetbio.org/gwab/gwab_query.php) using SNP-nearest gene assignment within +/− 10kbp similarly to NAGA. GWAB then uses a gene-to-gene functional network (HumanNet v247) to calculate a new score for each gene based on both its own p-value and the p-values of its network neighbors, modulated by the weights of the edges between the gene and those neighbors. This re-ranks the genes, which are evaluated against a user-provided list of true positive genes (e.g., a literature curated list of disease-relevant genes). The traditional output of GWAB is a list of genes assigned from SNPs at an optimized, less stringent genome-wide significance threshold based upon recall of an a priori list of literature-curated disease-relevant traits. Thus, we ran GWAB for the same traits as GRIN and NAGA where literature-curated genes were available from the DISEASES74 database, with the exception of Height1, Height2, and SmokingInitiation1. We ran GWAB at the same genome-wide significance thresholds as GRIN and compared precision and recall of GWAB boosted genes at each threshold.Million Veteran Program (MVP) suicide attempt genome wide association study (GWAS) summary statisticsSuicide attempts were identified from United States veterans as described previously15. Suicide attempts were characterized by using a combination of Veterans Healthcare Administration (VHA) databases from the VA: the Suicide Prevention Application Network (SPAN) database, electronic health record (EHR) information from the VA Corporate Data Warehouse (CDW), and the CDW Mental Health Domain survey. For the MVP diagnosis, suicide attempt was determined by the presence of one or more of the following International Statistical Classification of Diseases and Related Health Problems (ICD) − 9 and ICD-10 diagnostic codes in a subject’s EHR: ICD-9: E950-959; ICD-10: T14.91, X60-62, X64, X66-X83, Y87.0, Z91.5. Control patients were obtained from veterans enrolled in MVP without a history of suicide attempt or suicidal ideation as determined by a combination of SPAN survey, Mental Health Domain survey, and ICD diagnostic codes in the CDW database (suicidal ideation codes: ICD-9: V62.84; ICD-10: R45.851). A total of 410,464 controls from various ancestries (African, Asian, European, and Hispanic) were included for genome-wide association along with 14,535 cases of non-fatal suicide attempt and 294 fatal attempts. Genome-wide association analyses were conducted using DNA from whole blood samples from subjects enrolled in MVP using a custom Affymetrix Biobank Array. Quality control and imputation was performed as previously described14. All subjects provided informed consent and the activities used to generate the GWAS summary statistics were approved by the VA Central Institutional Review Board. All ethical regulations relevant to human research participants were followed.International Suicide Genetics Consortium (ISGC) suicide attempt GWAS summary statisticsSuicide attempt summary statistics were analyzed from two sets of suicide attempt summary statistics derived from civilian populations compiled by the ISGC13. SNPs were included from a general population of European ancestry (SA-EUR) as cases of suicide attempt or control subjects. Furthermore, additional summary statistics were derived from this general population conditioned on diagnosis status for major depressive disorder (SA-MDD) to generate an additional set of suicide attempt summary statistics. Thus, while the SA-MDD summary statistics are not independent of the SA-EUR summary statistics as they are comprised of the same set of controls and cases of suicide attempt, both SA-EUR and SA-MDD summary statistics were analyzed in order to determine the overlap between these results and results from the MVP cohort. All subjects involved in the ISGC provided informed consent and the activities used to generate the GWAS summary statistics were approved by their local institutional review boards as previously described13. All ethical regulations relevant to human research participants were followed.SNP to gene assignment for suicide attempt summary statisticsSNPs from MVP and ISGC suicide attempt summary statistics were assigned to genes using the union of two separate methods in order to identify multiple possible genes contributing to this phenotype as input to GRIN. H-MAGMA1 was used in combination with publicly available Hi-C data from adult dorsolateral prefrontal cortex (dlPFC)75 to improve intergenic SNP-to-gene assignment based on three-dimensional chromatin structure in this brain region. Adult prefrontal cortex Hi-C data was used as this brain region is known to be involved in executive function and impulsivity processes, which are disrupted in individuals with a history of suicide attempt50,51. Additionally, conventional MAGMA4 was used as an alternate method of SNP-to-gene assignment. Thus, H-MAGMA and conventional MAGMA were applied only as methods of assigning SNPs to genes only using SNPs at given significance thresholds, rather than using these tools as gene-based tests on the entire set of summary statistics.SNPs were assigned to genes from MVP, SA-EUR, or SA-MDD summary statistics at multiple thresholds (p < 5e−8, p < 1e−5, p < 1e−4, p < 1e−3, p < 1e−2, and p < 1e−1; Supplementary Table 2) to determine the number of suicide attempt GWAS genes that could be input to GRIN at each of these thresholds. The union of conventional MAGMA and H-MAGMA-assigned genes (i.e., all genes assigned from either method) from MVP, SA-EUR, or SA-MDD suicide attempt summary statistics were subsequently used as gene set inputs to GRIN at a threshold of p < 1e−5, as this resulted in gene set sizes that would result in high precision while still obtaining more recall compared to genes identified at a threshold of p < 5e−8. The union of genes identified at a threshold of p < 1e−5 were then filtered into retained and removed gene sets using GRIN (Supplementary Tables 5, 8, and 9).Gene set enrichment analysisGene sets from MVP and ISGC summary statistics were tested for multiple enrichments using the online ToppGene suite using ToppFun76. Gene set enrichments were analyzed using the following enrichment categories: GO: Molecular Function; GO: Biological Process; GO: Cellular Component; Human Phenotype; Pathway (all databases selected); Transcription Factor Binding Site (all databases selected); Drug (all databases selected); Disease (all databases selected). Enrichments were considered significant using a Benjamini-Hochberg false discovery rate (FDR)-adjusted p-value threshold < 0.05.Drug to gene target networks for putative drug repurposing and side effect evaluationGenes identified as contributing to suicide attempt pathophysiology from MVP and ISGC summary statistics were used to construct drug to gene target networks from information derived from DrugBank77. Drug to gene target networks were visualized in Cytoscape78 (version 3.8.2, Cytoscape Consortium) to identify drugs known to target genes of interest from MVP and ISGC summary statistics using GRIN-retained genes at p < 1e−5 (Supplementary Table 20). ISGC GWAS genes were compared to genes from the MVP cohort using Venn diagrams generated from the open source R package Vennerable (https://github.com/js229/Vennerable).Statistics and reproducibilityFull data points from Fig. 2, Supplementary Figs. 1–3, and Supplementary Figs. 6 and 7 are provided in Supplementary Data 2. All comparisons between the user’s gene set rank distributions and null distribution were performed using a two-sided Mann–Whitney U test, and the null distribution was generated by generating 100 random gene sets of equivalent size to the user’s gene set to ensure the reproducibility of results.Reporting summaryFurther information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Hot Topics

Related Articles