Integrating plasma proteomics with genome-wide association data to identify novel drug targets for inflammatory bowel disease

Associated plasma proteins identified in PWAS

We conducted a PWAS by integrating GWAS data for IBD, UC and CD with data from 1,348 plasma proteomes using the FUSION pipeline. The abundance of 62, 21 and 30 cis-regulated plasma proteins was significantly associated with IBD, UC and CD, respectively. Of these proteins, 4 were common to IBD, UC and CD. In addition, 15, 17 and 6 proteins were shared between IBD and UC, between IBD and CD, and between UC and CD, respectively. Each of these proteins passed an FDR threshold of < 0.05, indicating a high level of confidence. Detailed information on the plasma proteins associated with IBD, UC and CD is presented in Fig. 1, Table 1 and Supplementary Table S2.

Figure 1. PWAS of IBD (A), UC (B) and CD (C). The plasma proteomes (N = 1,348) and GWAS data were integrated using FUSION and are shown as Manhattan plots. Each point indicates a single association test between a plasma protein and IBD, UC or CD, ordered by genomic position on the x axis, with the association strength, expressed as the −log₁₀(P) of a z-score test, on the y axis. In total, 62, 21 and 30 proteins were identified whose cis-regulated plasma protein abundance correlated with IBD, UC and CD, respectively; the top 10 proteins with the highest correlation are labelled in the figure. The red horizontal line represents the significance threshold for the Bonferroni correction of the FDR (P < 0.05), which was set at the highest unadjusted P value falling below that threshold in IBD, UC and CD separately.

Table 1. Candidate top 10 plasma proteins identified by PWAS for IBD, UC and CD.

Association of plasma proteins with IBD, UC and CD verified via MR

MR was performed to verify the relationship between plasma proteins and the risk of IBD, UC and CD and to elucidate the specific causal relationships. A total of 32, 8 and 9 proteins with strong causal effects were identified as biomarkers for IBD, UC and CD, respectively (P < 0.05).
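As a rough illustration of how an MR estimate of this kind turns into the odds ratios and confidence intervals reported for each protein, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate from per-SNP summary statistics and converts it to an OR with a 95% CI. The function names and the three-SNP input are hypothetical and are not taken from the study, and the software and settings the authors actually used are not stated in this section, so this should be read as a minimal sketch under those assumptions rather than a reproduction of the analysis.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect (log odds scale).

    Equivalent to regressing SNP-outcome effects on SNP-exposure effects
    through the origin, weighting each SNP by 1 / se_outcome**2.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate to (OR, lower, upper) for a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def two_sided_p(beta, se):
    """Two-sided P value from a normal approximation to beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical pQTL instruments for one protein (illustrative values only).
    bx = [0.30, 0.25, 0.40]      # SNP effect on protein level
    by = [-0.06, -0.05, -0.08]   # SNP effect on disease (log odds)
    se = [0.02, 0.02, 0.02]      # standard error of the SNP-disease effect

    beta, beta_se = ivw_estimate(bx, by, se)
    or_, lower, upper = odds_ratio_with_ci(beta, beta_se)
    print(f"OR = {or_:.2f}, 95% CI = {lower:.2f}-{upper:.2f}, "
          f"P = {two_sided_p(beta, beta_se):.2e}")
```

An OR below 1 in this setting corresponds to higher genetically predicted protein levels being associated with lower disease risk, and an OR above 1 to higher risk.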
A partial overlap was observed among proteins associated with the risk of IBD, UC and CD. The top five plasma proteins associated with the risk of IBD were MST1 (P = 6.14 × 10⁻⁸, OR = 0.82, 95% CI = 0.77–0.88), PARK7 (P = 1.76 × 10⁻⁶, OR = 0.81, 95% CI = 0.75–0.89), NADK (P = 3.25 × 10⁻⁵, OR = 0.84, 95% CI = 0.78–0.91), RIPK2 (P = 6.22 × 10⁻⁵, OR = 0.62, 95% CI = 0.49–0.78) and TALDO1 (P = 1.14 × 10⁻⁴, OR = 0.60, 95% CI = 0.46–0.78). The top five plasma proteins associated with the risk of UC were MST1 (P = 6.22 × 10⁻⁸, OR = 0.84, 95% CI = 0.79–0.90), CADM2 (P = 1.46 × 10⁻⁴, OR = 0.64, 95% CI = 0.51–0.80), VSIR (P = 2.77 × 10⁻⁴, OR = 0.89, 95% CI = 0.83–0.95), PRKCB (P = 6.20 × 10⁻⁴, OR = 1.19, 95% CI = 1.08–1.31) and PIGR (P = 7.14 × 10⁻⁴, OR = 0.78, 95% CI = 0.67–0.90). The top five plasma proteins associated with the risk of CD were FLRT3 (P = 4.99 × 10⁻⁸, OR = 0.90, 95% CI = 0.87–0.93), MST1 (P = 6.08 × 10⁻⁶, OR = 0.83, 95% CI = 0.77–0.90), ABO (P = 3.96 × 10⁻⁵, OR = 1.11, 95% CI = 1.06–1.16), TNFRSF1A (P = 5.32 × 10⁻⁴, OR = 1.35, 95% CI = 1.14–1.60) and C7 (P = 8.41 × 10⁻⁴, OR = 1.14, 95% CI = 1.06–1.23). Detailed information is provided in Figs 2 and 3 and Supplementary Tables S1–S5.

Figure 2. Association of protein expression in the blood with IBD (A), UC (B) and CD (C) risk. Forest plot of the estimates of the relationship between genetically predicted protein levels and IBD, UC and CD.

Figure 3. Scatter plots for the MR analysis. IVW scatter plots highlighting the effect of protein level on IBD, UC and CD. (A) ERAP2 (B) RIPK2 (C) TALDO1 (D) CADM2 (E) RHOC (F) HGFAC (G) VSIR (H) CADM2 (I) MST1 (J) FLRT3.

In addition to the aforementioned causal proteins, there are potential risk proteins to consider.
First, the thresholds applied when screening instrumental variables may have been too stringent for IL23R, HINT1 and MFNG, so the SNPs of these proteins could not be used as instrumental variables to explore their causal relationship with IBD. The same applies to IL23R, HINT1, RIPK2 and HSPA1A in CD. Fortunately, all PWAS-significant proteins for UC had SNPs that served as robust instrumental variables in the MR analysis. Second, under multiple-testing correction, some proteins reached nominal significance (P < 0.05) but did not pass the FDR < 0.05 threshold. These include TNFSF15, HGFAC, HYAL1, KLB, HDGF, FCN1, C10orf54, HEBP1, ABO and MAN2B2 in IBD; IL1R2 and PARK7 in UC; and PPIH, GKN and SERPINF2 in CD. Third, some proteins did not show a significant causal relationship, but the direction of their effect on disease was consistent with that observed in the PWAS. These include IL12B, STAT3, FCGR3B, IL1RL1, LRRC32, C2, CD274, CRK, NOG, NCF1, CHRDL2, LY75 and ITLN1 in IBD; STAT3, MICB, HLA-DQA2, IL23R, FCGR3B, FCGR3A, AGER and PCOLCE2 in UC; and C2, TNFSF15, APOM, ADK, IL1RL1, TNFSF8, LRRC32, CFB and C9 in CD.

Colocalisation of plasma proteins associated with disease risk

To verify genetic colocalisation, the posterior probability (PP) was evaluated to identify shared causal variants between the pQTL and IBD GWAS data for genes that met the FDR-corrected P-value threshold in the MR analysis. A PPH4 of ≥ 0.75 was taken as evidence that the GWAS and pQTL data shared a causal variant. Based on this threshold, 5, 3 and 2 proteins were found to play an important role in the progression of IBD, UC and CD, respectively (Figs 2 and 4). Among the proteins identified via colocalisation analysis, CADM2 is an important protein shared between IBD and UC. Proteins with PPH3 > 0.7 are also of significant interest. These comprise 14 proteins in IBD, 3 in UC and 3 in CD.
Specifically, the proteins in IBD are FCGR2A, PARK7, AIF1, MXRA8, IL1R2, NADK, LY9, PIGR, PLAU, PLCG2, ICAM5, FCGR2B, EPHB4 and AGER. For UC, the proteins are AIF1, PRKCB and PIGR. In CD, the proteins are C7, IRF3 and TNFRSF1A.

Figure 4. Genetic colocalisation for IBD (A–E), UC (F–H) and CD (I–J). (A) ERAP2 (B) RIPK2 (C) TALDO1 (D) CADM2 (E) RHOC (F) HGFAC (G) VSIR (H) CADM2 (I) MST1 (J) FLRT3. In this view, each dot is a genetic variant. The SNP with the most significant P value for IBD, UC or CD is marked, and the colours of the other SNPs depend on the strength of linkage disequilibrium (r²). SNPs with missing linkage disequilibrium information are also coded dark blue. In the LocusZoom plots, −log₁₀(P.gwas) for association with IBD, UC and CD risk is on the x axes, and −log₁₀(P.pqtl) for association with protein levels is on the y axes.

Drug prediction analysis

As most drugs exert their therapeutic effects by targeting proteins, we finally explored whether the 10 proteins identified through the comprehensive analysis could serve as potential therapeutic targets. Four potential targets for drug intervention (ERAP2, RIPK2, CADM2 and VSIR) were prioritised on the basis of drug–gene interactions obtained from the DGIdb. Through druggability exploration, we identified tosedostat, an inhibitor of ERAP2, as a potentially effective drug for IBD. These findings are expected to facilitate the development of specific drugs for IBD, UC and CD.
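The PPH3 and PPH4 values used in the colocalisation analysis above can be made more concrete with a small worked example. The sketch below assumes the widely used approximate Bayes factor framework, in which H3 denotes two distinct causal variants and H4 a single shared causal variant, and combines per-SNP log Bayes factors for the pQTL and GWAS signals into five posterior probabilities. The prior values (p1, p2, p12), the function names and the four-SNP inputs are assumptions for illustration only; the section does not state which software or priors the authors used.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)), valid only when a > b."""
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf_pqtl, labf_gwas, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for one region.

    labf_pqtl / labf_gwas: per-SNP log approximate Bayes factors, same SNPs
    in the same order. The priors are assumed values, not taken from the paper.
    """
    if len(labf_pqtl) != len(labf_gwas) or len(labf_pqtl) < 2:
        raise ValueError("need two equal-length lists with at least 2 SNPs")
    l1 = logsumexp(labf_pqtl)
    l2 = logsumexp(labf_gwas)
    l12 = logsumexp([a + b for a, b in zip(labf_pqtl, labf_gwas)])
    log_h = [
        0.0,                                                      # H0: no signal
        math.log(p1) + l1,                                        # H1: pQTL only
        math.log(p2) + l2,                                        # H2: GWAS only
        math.log(p1) + math.log(p2) + logdiffexp(l1 + l2, l12),   # H3: distinct
        math.log(p12) + l12,                                      # H4: shared
    ]
    total = logsumexp(log_h)
    return [math.exp(h - total) for h in log_h]


if __name__ == "__main__":
    # Illustrative log Bayes factors only: both traits peak at the same SNP.
    pp = coloc_posteriors([10.0, 0.0, 0.0, 0.0], [9.0, 0.0, 0.0, 0.0])
    print("PPH3 = %.3f, PPH4 = %.3f" % (pp[3], pp[4]))
    print("shared causal variant supported:", pp[4] >= 0.75)
```

When both traits peak at the same variant, most of the posterior mass falls on H4; when the peaks sit on different variants, mass shifts away from H4 towards H3 and the single-trait hypotheses, which is why proteins with a high PPH3 are reported separately from those passing the PPH4 threshold.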
