Machine learning based intratumor heterogeneity signature for predicting prognosis and immunotherapy benefit in stomach adenocarcinoma

The ITH score of STAD cases

Supplementary Table 1 lists the ITH scores of the STAD cases. A higher ITH score was correlated with lower tumor grade, higher clinical stage, higher pT stage and distant metastasis (Fig. 1A). STAD cases were categorized into low and high ITH score groups. A higher ITH score was associated with a poorer overall survival (OS) rate (Fig. 1B, p = 0.041). Next, we examined the differentially expressed genes (DEGs) between the low and high ITH score groups to identify genes that mediate the ITH of STAD. This yielded 925 DEGs (Fig. 1C, p < 0.05), 21 of which were significantly correlated with the clinical prognosis of STAD patients in univariate Cox analysis (Fig. 1D).

Fig. 1 The intratumor heterogeneity score of STAD cases. A The correlation between the intratumor heterogeneity score and the clinical characteristics of STAD patients. B Low intratumor heterogeneity score indicated a lower overall survival rate in STAD. C The differentially expressed genes between the high and low intratumor heterogeneity score groups. D Univariate Cox analysis identified potential genes significantly correlated with the prognosis of STAD patients.

A prognostic IRS was created by machine learning

To create an IRS, we fed these 21 genes into our machine learning-based integrative process. Using the LOOCV framework, we fitted 101 prediction models in the TCGA cohort and computed the C-index of each model across the TCGA and GEO cohorts (Fig. 2A). With the highest average C-index of 0.63, the IRS built with the RSF + Enet (alpha = 0.1) method was selected as the optimal IRS (Fig. 2A).
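The model selection described above ranks candidate models by their average C-index across cohorts. As a minimal sketch of that metric, and not the authors' pipeline, Harrell's C-index can be computed directly from follow-up times, event indicators, and risk scores. The function names `harrell_c_index` and `mean_c_index` and the toy cohort are illustrative assumptions, not part of the study.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk_scores: model output, higher = predicted worse prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the follow-up times differ and the
        # patient with the shorter follow-up had an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def mean_c_index(cohorts):
    """Average the C-index of one model over several cohorts.

    cohorts: dict mapping cohort name -> (times, events, risk_scores)
    """
    values = [harrell_c_index(t, e, r) for t, e, r in cohorts.values()]
    return sum(values) / len(values)


# Toy data only (not from the study): four patients in one hypothetical cohort.
toy = {"cohort_A": ([5, 3, 8, 2], [1, 0, 1, 1], [0.2, 0.9, 0.1, 0.8])}
print(mean_c_index(toy))
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported average of 0.63 in context.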
The IRS comprised nine genes selected by the RSF + Enet (alpha = 0.1) method, and the IRS score (risk score) of each STAD patient was computed with the following formula:

risk score = (-0.2838) × DGKQexp + 0.1515 × SERPINE1exp + 0.1034 × PRTGexp + 0.0515 × CPNE8exp + 0.1614 × NT5Eexp + 0.1919 × AKR1B1exp + 0.0493 × FGF1exp + (-0.0243) × SLITRK2exp + (-0.022) × ASPAexp

STAD cases were divided into high and low IRS score groups using the optimal cut-off. Patients with a high IRS score had a lower OS rate in the TCGA, GSE15459, GSE26253, GSE62254 and GSE84437 datasets (Fig. 2B–F). The 1-, 3-, and 5-year AUCs were 0.689, 0.683, and 0.669 in the TCGA cohort (Fig. 2G); 0.670, 0.634, and 0.634 in the GSE15459 cohort (Fig. 2H); 0.622, 0.601, and 0.693 in the GSE26253 cohort (Fig. 2I); 0.669, 0.631, and 0.603 in the GSE62254 cohort (Fig. 2J); and 0.652, 0.686, and 0.636 in the GSE84437 cohort (Fig. 2K), respectively.

Fig. 2 Development of IRS by integrative machine learning algorithms. A IRS was evaluated using 101 machine learning combinations. The concordance index was calculated for each model in the TCGA and GEO datasets. The survival curves of STAD patients with different IRS scores in the TCGA (B), GSE15459 (C), GSE26253 (D), GSE62254 (E) and GSE84437 (F) cohorts. Time-dependent ROC curves for IRS in evaluating the 1-year (red line), 3-year (blue line), and 5-year (green line) overall survival for the TCGA (G), GSE15459 (H), GSE26253 (I), GSE62254 (J) and GSE84437 (K) cohorts.

An assessment of IRS’s performance

We also computed the C-index of the IRS and of clinical characteristics to assess how well each predicted the clinical outcome of STAD cases. In all datasets, the C-index of the IRS was higher than that of clinical characteristics such as age, gender, tumor grade, and clinical stage (Fig. 3A).
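The nine-gene risk score formula given earlier in this section is a weighted sum of expression values, which can be sketched in a few lines. The coefficients below are copied from that formula. Everything else is an illustrative assumption: the function names, the input format (a dict of per-gene expression values on whatever scale the model was trained on), and the cut-off value, since the text does not state the optimal cut-off.

```python
# Coefficients as reported in the risk score formula.
IRS_COEFFICIENTS = {
    "DGKQ": -0.2838,
    "SERPINE1": 0.1515,
    "PRTG": 0.1034,
    "CPNE8": 0.0515,
    "NT5E": 0.1614,
    "AKR1B1": 0.1919,
    "FGF1": 0.0493,
    "SLITRK2": -0.0243,
    "ASPA": -0.022,
}


def irs_risk_score(expression):
    """Weighted sum of the nine gene expression values for one patient.

    expression: dict mapping gene symbol -> expression value
    """
    missing = IRS_COEFFICIENTS.keys() - expression.keys()
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in IRS_COEFFICIENTS.items())


def split_by_cutoff(scores, cutoff):
    """Label each patient 'high' or 'low' relative to a given cut-off.

    scores: dict mapping patient id -> risk score
    cutoff: threshold; hypothetical here, the source does not report its value
    """
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical patients with made-up expression values (not study data).
patients = {
    "P1": {gene: 1.0 for gene in IRS_COEFFICIENTS},
    "P2": {gene: 0.0 for gene in IRS_COEFFICIENTS},
}
scores = {pid: irs_risk_score(expr) for pid, expr in patients.items()}
print(split_by_cutoff(scores, cutoff=0.2))  # cutoff chosen only for illustration
```

Because DGKQ, SLITRK2, and ASPA carry negative coefficients, higher expression of those genes lowers the score, while the remaining six genes raise it.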
In the TCGA, GSE15459, GSE26253, GSE62254, and GSE84437 datasets, univariate and multivariate Cox regression analyses identified the IRS as an independent risk factor for the clinical outcome of STAD cases (Fig. 3B and C, all p < 0.05). We also built a nomogram based on risk score and stage to predict the clinical outcome of STAD patients (Fig. 3D). For 1-, 3-, and 5-year survival in the TCGA cohort, the calibration plots showed good agreement between nomogram predictions and actual observations (Fig. 3E). These data indicate that the IRS predicts the clinical outcome of STAD cases accurately and consistently.

Fig. 3 Evaluation of the performance of IRS in predicting the clinical outcome of STAD patients. A The C-index of IRS, age, gender and clinical stage for predicting the clinical outcome of STAD patients in the TCGA and GEO datasets. B, C Univariate and multivariate Cox regression analyses identified risk factors for the clinical outcome of STAD patients. D, E Predictive nomogram and calibration plots evaluating the overall survival rate of STAD patients.

Analysis of the relationship between IRS and the tumor immunological milieu

Figure 4A shows the relationship between the IRS score and immune cell abundance. The abundance of CD8+ T cells, B cells, and dendritic cells was negatively correlated with the IRS score (Fig. 4B–D, p < 0.05). A higher IRS score was also associated with lower levels of B cells, CD8+ T cells, macrophages, neutrophils, and TILs (Fig. 4E), and with lower scores for cytolytic activity, T cell co-stimulation, and APC co-stimulation (Fig. 4F). In addition, stromal, immune, and ESTIMATE scores were significantly lower in STAD patients with high IRS scores (Fig. 4G–I, all p < 0.001).

Fig. 4 The association between IRS and immune infiltration in STAD. A Correlation atlas between IRS and immune infiltration in STAD based on seven state-of-the-art algorithms. B–D IRS score was negatively correlated with the abundance of CD8+ T cells, B cells and dendritic cells. E, F ssGSEA analysis revealing the level of immune cells and immune-related functions in the different IRS score groups. G–I The immune score, stromal score and ESTIMATE score in the different IRS score groups. *p < 0.05, **p < 0.01, ***p < 0.001.

IRS as a predictor of treatment outcomes in STAD

A higher immunophenoscore and TMB score indicate a higher likelihood of benefiting from immunotherapy [20]. As shown in Fig. 5A and B, STAD patients with a low IRS score had a higher TMB score and PD1&CTLA4 immunophenoscore. A low TIDE score indicates a better response to immunotherapy and a decreased risk of immune escape [21,22]. TIDE, T cell exclusion, and T cell dysfunction scores were higher in the high IRS score group (Fig. 5C, all p < 0.05). High expression of immune checkpoints and HLA-related genes indicates a greater variety of antigen presentation, which increases the likelihood that immunotherapy will be beneficial [23]. STAD patients with low IRS scores expressed higher levels of immune checkpoints and HLA-related genes (Fig. 5D and E, all p < 0.05). Consequently, STAD patients with low IRS scores may benefit more from immunotherapy.

To further confirm these findings, we computed the IRS score in patients who received immunotherapy. In the IMvigor210 cohort, non-responders had a higher IRS score than responders (p < 0.01), as illustrated in Fig. 5F. A higher IRS score was associated with a worse clinical outcome (p = 0.02), and a lower IRS score was associated with a higher immunotherapy response rate (p < 0.01). We observed comparable results in the GSE78220 and GSE91061 datasets (Fig. 5G and H).
Given the critical role of standard chemotherapy and targeted therapy in the treatment of STAD, we next investigated the IC50 values of common drugs in STAD cases. STAD patients with high IRS scores had lower IC50 values for 5-Fluorouracil, Docetaxel, Oxaliplatin, Paclitaxel, Cytarabine, Gefitinib, Crizotinib, Erlotinib, and Osimertinib (Fig. 6A and B, all p < 0.05). This suggests that STAD patients with high IRS scores are more sensitive to chemotherapy and targeted therapy.

Fig. 5 IRS acted as an indicator for predicting the immunotherapy response in STAD. The TMB score (A), PD1&CTLA4 immunophenoscore (B), and TIDE score (C) in STAD patients with different IRS scores. The level of HLA-related genes (D) and immune checkpoints (E) in the different IRS score groups. The immunotherapy response and overall survival rate in patients with high and low IRS scores in the IMvigor210 (F), GSE91061 (G) and GSE78220 (H) datasets. *p < 0.05, **p < 0.01, ***p < 0.001.

Fig. 6 The IC50 value of common drugs in the different IRS score groups. A low risk score indicated a high IC50 value for common drugs used in chemotherapy (A) and targeted therapy (B).

Analysis of the variations in functional enrichment across IRS score groups

The gene set scores for angiogenesis, DNA repair, EMT signaling, glycolysis, hypoxia, IL2-STAT5 signaling, mTORC1 signaling, NOTCH signaling, P53 pathway, and PI3K-AKT-mTOR signaling were higher in STAD patients with high IRS scores (Supplementary Fig. 1, all p < 0.05).

Biological functions of the selected gene

To further verify the performance of the IRS, we selected AKR1B1, which contributed the most to the IRS, for further analysis. We examined the expression of AKR1B1 in STAD cell lines and found that it was higher in most STAD cell lines (Supplementary Fig. 2A).
Subsequently, the CCK-8 assay showed that knockdown of AKR1B1 significantly inhibited the proliferation of BGC-823 and HGC-27 cells (Supplementary Fig. 2B–2C, all p < 0.05). Knockdown of AKR1B1 also significantly inhibited the migration of BGC-823 and HGC-27 cells (Supplementary Fig. 2D–2G, all p < 0.05).
