Generation of novel lipid metabolism-based signatures to predict prognosis and immunotherapy response for colorectal adenocarcinoma

Assessment of tumor signatures using ssGSEA scoresTo investigate the tumor signatures, ssGSEA was used to score the EMT-related genes, stem cell-related gene and IC genes in COAD and normal samples. As shown in Fig. 1A–C, primary tumor samples possessed significant increased EMT score (P = 0.010) and stemness score (P < 0.001) than that of normal samples, while the ssGSEA score for ICs was remarkably decreased in primary tumor samples compared with normal samples (P < 0.001). Those data indicated that primary tumor samples maybe presented EMT, acquired characteristics of CSCs and escaped immune killing.Figure 1Assessment of tumor signatures in TCGA dataset. (A–C) ssGSEA scores for EMT, stemness and immune checkpoints in primary COAD samples and normal samples.Identification of co-expression modules and hub genes using WGCNABased on 742 lipid metabolism-related genes, WGCNA was performed to identify co-expression modules. Sample cluster analysis was conducted and the height cutoff value was set at 30 to detect outliers. A total of 65 outliers were excluded from this study (Supplementary Fig. 1A). Next, the soft threshold power at 4 with an R^2 > 0.8 is selected and 12 gene co-expression modules were identified (Supplementary Fig. 1B,C). Supplementary Fig. 1D displayed the eigengene adjacency heatmap that indicated that the green module, the yellow module and some other modules were adjacent.Subsequently, the module-signature relationship was evaluated (Fig. 2A). The green model was significantly correlated with EMT (cor = 0.74, P < 0.001), ICs (cor = 0.79, P < 0.001) and stemness (cor = 0.47, P < 0.001); also, the yellow module was correlated with EMT (cor = 0.74, P < 0.001), ICs (cor = 0.39, P < 0.001) and stemness (cor = 0.58, P < 0.001) of COAD. Then, the green module and yellow module were selected for module membership analysis (Fig. 2B,C). In the green module, the module membership was highly correlated with gene significance (cor = 0.87, P < 0.001); in the yellow module, the module membership had a significant correlation with gene significance (cor = 0.64, P < 0.001), indicating that the green module and yellow module were suitable for identifying hub genes. Based on intramodular connectivity |kME|> 0.8, a total of 12 hub genes (PIK3CG, ALOX5AP, PIK3R5, TNFAIP8L2, DPEP2, PIK3CD, PIK3R6, GGT5, ELOVL4, PTGIS, CYP7B1 and PRKD1) were found within the green module and yellow module.Figure 2Selection of modules associated with tumor signatures. (A) The relationships between12 modules and tumor signatures (EMT, immune checkpoints and stemness). (B,C) Module membership analysis for green module and yellow module.Generation of prognostic signatures and survival analysisLASSO Cox regression was utilized to determine the risk genes for prognostic model construction. With the gradual increase of lambda, the number of independent variable coefficients tending to zero increased gradually (Fig. 3A). Ten-fold cross-validation was utilized to calculate partial likelihood deviances (Fig. 3B). The optimal lambda = 0.01463613. Three genes (PIK3CG, GGT5 and PTGIS) were determined for further analysis (Fig. 3C). The risk score was calculated according to the formula: Risk score = −0.19820*PIK3CG + 0.08726*GGT5 + 0.13305*PTGIS. Figure 3D revealed that the number of death was major in high-risk group. The expression of PIK3CG was decreased in high-risk group while the expression of GGT5 and PTGIS were increased in high-risk group (Fig. 3E). Further survival analysis in TCGA dataset deciphered that patients in high-risk group had dismal prognosis than that of patients in low-risk group (P = 0.0128) (Fig. 3F) with 1 year AUC of 0.66, 3-year AUC of 0.63 and 5-year AUC of 0.59 (Fig. 3G). Furthermore, In the KMPLOT database (https://kmplot.com), COAD samples were selected for survival analysis. The result showed that COAD patients with high expression of PIK3CG (P = 0.033), or PTGIS (P = 0.00032), and low expression of GGT5 (P = 0.047) have a low overall survival rate (Supplementary Fig. 2).Figure 3Generation of prognostic model and survival analysis. (A) Determination of risk genes using LASSO. Independent variable coefficients changed with lambda increase. (B) The partial likelihood deviance of ten-fold cross validation. (C) Three genes (PIK3CG, GGT5 and PTGIS) were determined as risk genes. (D) Distribution of risk score and the status of patients. (E) The expression patterns of PIK3CG, GGT5 and PTGIS in patients with high- and low-risk. (F) Survival analysis using Kaplan–Meier curves in TCGA dataset. G, ROC curves with 1 year AUC of 0.66, 3-year AUC of 0.63 and 5-year AUC of 0.59.Analysis of mutation characteristics between risk groupsFurthermore, the distributions of clinicopathologic characteristics were analyzed between high- and low-risk groups and found that stage, T stage, N stage and M stage were significantly different between the two risk groups (Fig. 4A). Additionally, 100% patients in high-risk group and 99.59% patients in low-risk group exhibited mutation; And compared to low-risk group, patients in high-risk group had a higher genes mutation frequency (APC: 80 vs 73%, TP53: 64 vs 56%, TTN: 52 vs 51%, KRAS: 42 vs 41%) (Fig. 4B,C), indicated the malignancy was higher in the high-risk group.Figure 4Analysis of mutation characteristics between risk groups in TCGA dataset. (A) Distributions of clinicopathologic characteristics between high- and low-risk groups. (B,C) Mutation characteristics in high- and low-risk groups. ***P < 0.001.Metabolism pathways differences between risk groupsThrough GSEA, some metabolism-related pathways such as REACTOME_CHONDROITIN_SULFATE_DERMATAN_SULFATE_METABOLISM (P = 9.330e − 05), REACTOME_DISEASES_ASSOCIATED_WITH_GLYCOSAMINOGLYCAN_METABOLISM (P = 3.031e − 06), REACTOME_DISEASES_OF_METABOLISM (P = 5.284e − 08), REACTOME_GLYCOSAMINOGLYCAN_METABOLISM (P = 1.283e − 04) and REACTOME_METABOLISM_OF_CARBOHYDRATES (P = 1.558e − 03) were enriched in high-risk group (Fig. 5A). From KEGG analysis, ECM-receptor interaction, focal adhesion, calcium signaling pathway and PI3K-Akt signaling pathway were activated in high-risk group (Fig. 5B); Based on GO analysis, extracellular matrix structural constituent and extracellular matrix organization were activated while DNA replication-dependent chromatin assembly was suppressed in high-risk group (Fig. 5C).Figure 5Association of risk score with metabolism pathways in TCGA dataset. (A) The results from GSEA showing significant enriched metabolism-related pathways in high risk group. (B,C) KEGG and GO enrichment analyses using GSEA.Analysis of immune characteristics between risk groupsBased on MCPcounter analysis, infiltration of immune cells was significantly varied between the two risk groups. The abundance of endothelial cells and monocytic lineage was increased in patients in high-risk group, while other immune cells including B lineage, CD8 T cells, cytotoxic lymphocytes, neutrophils and NK cells were enriched in patients in low-risk group (Fig. 6A). The results from ESTIMATE analysis showed that patient in high-risk group possessed higher ESTIMATEScore (P < 0.001 and StromalScore (P < 0.001) than that of patient in low-risk group (Fig. 6B). Meanwhile, the infiltration of central memory CD8 T cell, central memory CD4 T cell, T follicular helper cell, regulatory T cell, natural killer cell, plasmacytoid dendritic cell and mast cell was increased in high-risk patients; activated CD8 T cell, activated CD4 T cell, Type 17 T helper cell, Type 2 T helper cell, activated B cell and neutrophil were abundant in patients in low-risk group (Fig. 6C).Figure 6Association of risk score with immune characteristics in TCGA dataset. (A) Infiltration of immune cells evaluated by MCPcounter analysis. (B) ESTIMATE analysis for evaluating ESTIMATEScore, ImmuneScore and StromalScore. (C) SsGSEA scores for 28 immune cells from TISIDB database. Ns represents P > 0.05; *P < 0.05, **P < 0.01, and ***P < 0.001.Prediction of responses to immunotherapy and chemotherapyTo predict the response to immunotherapy, the expression of ICs was evaluated in the two risk groups. Patients in low-risk group exhibited higher CD274 level while high-risk group had higher expression of CD276 (Fig. 7A). Figure 7B revealed high-risk group had higher TIDE score than that of low-risk group, indicating a low response rate to ICI therapy of high-risk patients. Additionally, the IPS values for ctla4_neg_pd1_neg (P < 0.001), ctla4_neg_pd1_pos (P < 0.001), ctla4_pos_pd1_neg (P < 0.001) and ctla4_pos_pd1_pos (P < 0.001) were higher in low-risk group, which indicated that low-risk group had more immunogenicity on ICIs (Fig. 7C).Figure 7Prediction of responses to immunotherapy and chemotherapy in TCGA dataset. (A) Expression of immune checkpoints in high- and low-risk group. (B) Changes of TIDE score between high- and low-risk group. (C) Estimated IC50 values for traditional chemotherapy drugs. Ns represents P > 0.05; *P < 0.05, **P < 0.01, and ***P < 0.001.Analysis of risk genes at single cell levelSingle cell data was acquired for the comprehensively understanding of the profile of risk genes in COAD patients. Figure 8A displayed the overlapped 23 CRC samples and 10 healthy samples in TSNE diagram. 6 cell subpopulations with some classic markers of immune cells such as B cells, ephithelial cells, mast cells, myeloids, stromal cells and T cells were determined (Fig. 8B). Next, the distribution of risk genes in cell subpopulations were evaluated, and PIK3CG was expressed in B cells; GGT5 and PTGIS were both expressed in stromal cells (Fig. 8C,D). Thereafter, the expression levels of PIK3CG, GGT5 and PTGIS were validated in single cell data. The expression of PIK3CG (P < 0.001) was downregulated in CRC samples whereas the levels of GGT5 (P = 0.0044) and PTGIS (P < 0.001) were increased in CRC samples compared with normal samples (Fig. 8E).Figure 8Analysis of risk genes at single cell level. (A) TSNE diagrams of 23 CRC samples and 10 healthy samples. (B) Screening for 6 cell subpopulations. (C,D) The distribution of risk genes in 6 cell subpopulations. (E) Validation of the expression levels of PIK3CG, GGT5 and PTGIS in single cell data.

Hot Topics

Related Articles