Development of a disulfidptosis-related lncRNA prognostic signature for enhanced prognostic assessment and therapeutic strategies in lung squamous cell carcinoma

The research flow of this paper is demonstrated in Fig. 1A.Figure 1(A) The study flow chart of the article. (B) Sankey diagram showing the lncRNAs that co-expressed with disulfidptosis-related genes (P < 0.05).Identification of kEY-lncRNAs.Using |R|> 0.3 and P < 0.001 as screening criteria, we identified 1,432 lncRNAs that were co-expressed with DRGs. The co-expression relationships between DRGs and lncRNAs were visualized using a Sankey diagram (Fig. 1B). These genes were then integrated with the TCGA-LUSC-lncRNA expression data, ultimately identifying 209 disulfidptosis-related lncRNAs (Supplementary Table S1).The 209 genes were further analyzed by univariate Cox regression in the TRAIN cohort, and a total of 21 lncRNAs related to LUSC prognosis were obtained (P < 0.05) and plotted as a forest plot (Fig. 2A).Figure 2Identification of key-lncRNAs. (A) Forest plot showing lncRNAs associated with LUSC prognosis. The red color represents the poor factors of LUSC prognosis. Green represents favorable factors for LUSC prognosis. (B-C) LASSO regression with tenfold cross-validation and trajectories for each independent variable. (D) Results and coefficients of multivariate Cox regression analysis (P < 0.05). (E) Heatmap showing the relationship between KEY-lncRNA and DRGs. (F) Differential expression of KEY-lncRNA in TCGA-LUSC dataset.LASSO analysis could control overfitting through L1 regularization, select biologically significant and explanatory key genes, and improve model interpretability27,28. We uses LASSO analysis to ascertain the presence of 16 lncRNAs (Fig. 2B, C). Finally, multivariate Cox regression analysis was performed to screen out nine lncRNAs with the highest correlation with LUSC (LINC01311, AC010285.3, C10orf55, AL021368.3, AC009509.4, AL606489.1, AL133410.1, AC130462.2. AC005618.4) are called Key-lncRNAs. The name of Key-lncRNAs and the coefficients of the corresponding multivariate Cox regression analysis are shown in Fig. 2D.Expression of key-lncRNAs in tumor tissues and their correlation with DRGsAccording to the correlation heatmap of key-lncRNAs and DRGs (Fig. 2E), we can find that NDUFA11 presents a strong correlation with Key-lncRNAs, which is also consistent with the Sankey plot results (Fig. 1B). In comparison, AC009509.4 and C10orf55 presented a strong correlation with multiple DRGs. Figure 2F demonstrates the difference in expression of Key-lncRNAs in TCGA-LUSC. The results show that AL021368.3, AC009509.4, AL606489.1, and AL133410.1 were expressed higher in normal samples than in the tumor samples (P < 0.001). The expression of LINC01311, AC010285.3, C10orf55, AC130462.2, and AC005618.4 was higher in the tumor samples than in the normal sample (P < 0.001).Construction and validation of the prognostic signature.We determined the riskscore for each sample by analyzing the expression of the nine Key-lncRNAs. The calculation formula is expLINC01311*(−0.548372490461769) + expAC010285.3*(−1.18133243514436) + expC10orf55*0.351310706169649 + expAL021368.3*(−0.473280791062113) + expAC009509.4*0.511059938402485 + expAL606489.1*0.284340017375496 + expAL133410.1*0.776155666213932 + expAC130462.2*1.16718718964378 + expAC005618.4*(−0.618559418880687). The TRAIN cohort was categorized into HRG and LRG based on the median-RS. The other two cohorts (TEST and ALL) were divided into two risk groups using the same formula and grouping method. Figure 3A–C shows the risk stratification and survival status of HRG and LRG across three cohorts (TRAIN, TEST, and ALL), along with heatmaps depicting the expression of key lncRNAs in the two risk groups. In Fig. 3A, part (a) represents the risk stratification, where red dots indicate patients in the HRG group and blue dots indicate patients in the LRG group. Part (b) features a dashed line representing the median risk score, with patients to the left of the line in the low-risk group and those to the right in the high-risk group. Here, red dots indicate deceased samples and blue dots indicate non-deceased samples, with the vertical axis representing survival time. We observe more deceased samples and shorter survival times in the high-risk group. Part (c) indicates that LINC01311, AC010285.3, AL021368.3, and AC005618.4 are more highly expressed in the LRG, whereas the other five lncRNAs are more highly expressed in the HRG. This pattern is consistent across all three cohorts. Kaplan–Meier curves show that HRG has a poorer prognosis compared to LRG in all three cohorts (P < 0.05) (Fig. 4A–C).Figure 3Construction of prognostic signature. Riskscore distribution, survival status, and key-lncRNA expression in the (A)TRAIN, (B)TEST, and (C) ALL cohorts.Figure 4Validation of prognostic signature. (A) K-M survival curve analysis of two risk groups in the (A)TRAIN, (B)TEST, and (C) ALL cohorts. ROC curves for 1-, 3-, and 5-year survival for the (D) TRAIN, (E) TEST, and (F) ALL cohorts. ROC curves for risk score, age, gender, and staging in (G) TRAIN, (H) TEST, and (I) ALL cohorts.Additionally, the accuracy of the prognostic value was evaluated using time-dependent ROC curves (Fig. 4D). The areas under the Curves (AUCs) for 1 year, 3 year, and 5 year survival in the TRAIN cohort were 0.694, 0.788, and 0.834, respectively. These results underscore the potential predictive value of the prognostic model for patients with LUSC. Similar ROC curves were created using the TEST cohort (Fig. 4E) and the ALL cohorts (Fig. 4F), with both the TEST and ALL cohorts showing relatively high AUC values, consistent with the results from the TRAIN cohort. ROC curves were also used to compare the predictive ability of the RS with other clinical features. In the TRAIN (Fig. 4G), TEST (Fig. 4H), and the ALL cohorts (Fig. 4I), the AUC values for the RS were 0.788, 0.643, and 0.702, indicating a robust predictive capability.The K-M curve analysis consistently suggested that the HRG exhibited a more unfavorable prognosis when categorized by risk for various clinical subgroups (P < 0.05) (Figs. 5A–H). These results indicate that our prognostic model has a broad application scope, even in the presence of various clinical features.Figure 5K-M curves for two risk groups in multiple clinical subgroups. (A) The graph shows subgroups with age < 65. (B) The subgroup with age > 65. (C) and (D) are stage I and stage II-IV subgroups. (E) and (F) are subgroups of T1-T2a and T2b-T4. (G) and (H) for subgroups N0-1 and N2-3.Establishment and validation of the nomogramTo determine whether RS is an independent predictor of survival in LUSC, we conducted univariate and multivariate Cox regression analyses (Fig. 6A, B). The analyses considered factors such as Age, Gender, Stage, and RS. The results indicate that both RS and Stage can independently predict the prognosis of LUSC (P < 0.05). Subsequently, a nomogram was created based on the RS and the aforementioned clinical-pathological indicators to generate an accurate tool for assessing the survival of LUSC patients (Fig. 6C). Taking TCGA-21–1080 as an example, a 66 year-old male patient in Stage I with T2, N0, and a low-risk score, the patient’s total score was 194, resulting in 1-, 3-, and 5 year survival rates of 0.911, 0.734, and 0.626, respectively. Calibration curves were established to assess the predictive capability of the prognostic model. The gray line in the graph represents the ideal line (predicted survival rate), and the calibration curve shows that the actual survival rates at 1, 3, and 5 years closely align with the ideal line (Fig. 6D). The findings suggest that the prognostic signature exhibits predictive solid capabilities.Figure 6Creation and validation of Nomogram. (A-B) Results of univariate Cox regression analysis and multivariate Cox regression analysis of clinical characteristics (Age, Sex, Stage) and RS. (C) The nomogram is built based on RS. (D) Validation of the nomogram using calibration curves.Functional enrichment analysisTo explore the mechanisms by which risk models predict patient prognosis, we explored the differences in biological and immune-related functions between the two groups through gene enrichment analysis. Through differential analysis, we first identified 187 DEGs from the high and low-risk groups (Supplementary Table 2). Subsequently, we conducted GO and KEGG analyses on these 187 DEGs. We visualized the top six enriched terms from each of the three categories (BP, CC, MF) in the GO analysis results (P < 0.05). The results indicate that, in terms of BP, DEGs are primarily enriched in immune and inflammatory responses (cellular response to biotic stimulus, cellular response to lipopolysaccharide, cellular response to molecule of bacterial origin, antimicrobial humoral immune response mediated by antimicrobial peptide). In CC and MF, DEGs are mainly enriched in membrane composition and various activities related to cytokines and receptors (receptor ligand activity, cytokine activity, CXCR chemokine receptor binding, etc.) (Fig. 7A).Figure 7Differences in functional enrichment across two risk groups. (A) Results of GO enrichment analysis of differential genes. The innermost circle (first circle) is the Rich Factor. The second circle is the number of select genes. The third circle indicates the number of genes, and the outermost circle indicates the serial number of GO items. (B) KEGG analysis results of differential genes. (C-D) GSEA analysis results of HRG and LRG.The KEGG analysis indicates that DEGs are mainly concentrated in multiple immune response signaling pathways (Cytokine-cytokine receptor interaction, IL-17 signaling pathway, TNF signaling pathway), amino acid metabolism (Arginine and proline metabolism, Arginine biosynthesis), and transcriptional misregulation in cancer (Fig. 7B). The significance of these enrichment results will be detailed in the discussion section.Additionally, the GSEA analysis findings for the two risk groups suggest that the HRG is predominantly concentrated in pathways associated with cytokines and chemokines (KEGG-CYTOKINE-CYTOKINE-RECEPTOR-INTERACTION, KEGG-CHEMOKINE-SIGNALING-PATHWAY) (Fig. 7C). In contrast, the LRG is primarily enriched in the PPAR signaling and cytochrome P450 metabolism pathways (Fig. 7D). Some studies suggest that the PPAR signaling pathway can have anti-cancer effects. This pathway is typically associated with inhibiting cancer cell growth and promoting cell differentiation. It can suppress angiogenesis in non-small cell lung cancer by inhibiting the production of ELR + CXC chemokines29,30. This may provide evidence for a better prognosis in the LRG. To better explain this difference, further exploration was conducted to investigate the disparities in the immune microenvironment between the two risk groups.Relationship between RS and tumor immune microenvironment (TIME)The ESTIMATE algorithm is a computational method leveraging gene expression data to assess the scores of stromal and immune cells in tumor tissues31. The outcomes illustrate that HRG exhibits significantly elevated StromalScore (indicating the extent of tumor stromal components), ImmuneScore (reflecting the infiltration level of immune cells), and ESTIMATEScore (the composite score) compared to LRG(P < 0.001) (Fig. 8A). This finding implies that the immune microenvironment of HRG patients is more complex, characterized by a greater infiltration of both tumor stromal cells and a heightened level of immune cell infiltration.Figure 8Differences in the immune microenvironment of different risk subgroups. (A-C) Results of ESTIMATE analysis, CIBERSORT analysis, and immune function difference analysis for HRG and LRG.CIBERSORT is a computational method based on gene expression data to estimate the relative abundance of various immune cell subtypes in tissue samples32. The results indicate that CD8 + T cells, resting dendritic cells, and eosinophils have higher enrichment scores in LRG while resting CD4 memory T cells and neutrophils are more enriched in HRG (Fig. 8B). This also suggests that different risk groupings lead to differences in tumor cell infiltration.ssGSEA has been widely used in the analysis of gene expression data, especially in cancer research, to reveal biological differences and clinical features between different tumor subtypes33. We delved deeper into the distinctions in immune function between the HRG and LRG using ssGSEA. This exploration aimed to elucidate the relationship between RS and the immune system. The results caught our attention. On 29 immune-related functions, 27 immune functions (Th1_cells, CD8 + _T_cells, NK_cells, Treg, T_helper_cells, T_fh, Th2_cells, TIL, Cytolytic_activity, iDCs, pDCs, DCs, Macrophages, Neutrophils, Mast_cells, B_cells, APC_co_stimulation, T_cell_co-stimulation, T_cell_co-inhibition, Check-point, HLA, MHC_class_I, Type_IFN_Response, Type_II_IFN_Response, Inflammation-promoting, Parainflammation, CCR) all scored higher in the HRG ( Fig. 8C). The significance of this difference will be explained in detail in the discussion section.Correlation of RS and TMB.TMB refers to the number of mutations in tumor cells, typically expressed as mutations per megabase (must/Mb). TMB is a significant biomarker that can be employed to evaluate the genetic variant load of a tumor, thereby aiding in the prediction of the treatment response and prognosis of patients. Numerous studies have demonstrated that TMB can be a marker for immunotherapy in various cancers34,35,36,37. Consequently, we decided to investigate the inherent connection between TMB and RS. Figure 9A–B demonstrates the top 15 genes with the highest mutation rates in the species of HRG and LRG, with no significant differences. Additionally, the LRG exhibited a considerably greater total TMB value than the HRG’s (Fig. 9C). We subdivided the samples into two groups, H-TMB and L-TMB, based on the median TMB values of all samples to compare the prognosis. K-M curves showed that the OS of the L-TMB group was higher than that of the H-TMB (P = 0.002; Fig. 9D). Moreover, the sample was further divided into “H-TMB + H-RS,” “H-TMB + L-RS,” “L-TMB + H-RS,” and “L-TMB + L-RS” groups to compare OS. The results showed that the L-TMB + H-RS had the worst prognosis, while the H-TMB + H-RS was also worse. This outcome confirms RS’s superior predictive power for patient prognosis and implies that our model can be used with TMB to forecast patient prognosis (Fig. 9E).Figure 9Correlation analysis of RS and TMB. (A-B)Tumor mutation waterfall plots of HRG and LRG. (C) Violin plot of the difference in TMB values between two risk groups. (D) Kaplan–Meier curves showing survival differences between high and low tumor mutation burden groups. (E) Patient survival curve for LUSC based on two variables (TMB and RS).Prognostic modeling for prediction of clinical treatmentWe calculated the TIDE scores for the high and low-risk groups based on the TIDE database. The results showed that the LRG had lower TIDE scores, indicating that the LRG may have better sensitivity to immunotherapy, particularly anti-PD-L1 and anti-CTLA-4 treatments (Fig. 10A). Additionally, using drug sensitivity data from the GDSC database, we calculated the IC50 (half-maximal inhibitory concentration) values for various drugs in both risk groups. And the lower the IC50, the higher the sensitivity. We found that 49 drugs were more sensitive to LRG, and 25 were more sensitive to HRG (Supplementary Table S3). We selected the drugs commonly used in clinical oncology treatment (Selumetinib, Paclitaxel, Trametinib, Afatinib, Erlotinib, Gefitinib, Osimertinib, Cisplatin, AZD4547, BI-2536, MK-2206) to be visualized (Fig. 10B–L). Among them, Selumetinib, Paclitaxel, and Trametinib were highly sensitive to HRG, while the other eight drugs were more sensitive to patients in the LRG. These two results can effectively help clinicians treat lung cancer for drug selection.Figure 10Immunotherapy and drug sensitivity differences between HRG and LRG. (A) Comparison of immunotherapy predictions between HRG and LRG according to the TIDE algorithm. (B-L) Differences in sensitivity to multiple drugs (Selumetinib, Paclitaxel, Trametinib, Afatinib, Erlotinib, Gefitinib, Osimertinib, Cisplatin, AZD4547, BI-2536, MK-2206) between the high and low-risk groups, respectively.Results of the experimentFigure 11A shows that the expression of C10orf55 was higher in three lung squamous carcinoma cell lines (H520, H596, and SW900) than in normal cell lines (BEAS-2B). Figure 11B showed the highest efficiency of SW900-687 interference, so we chose this interference sequence for subsequent analysis. The results of the Transwell assay showed that the invasive migration ability of SW900 cells in the siRNA group was decreased (Fig. 11C–D). According to Fig. 11E–F, we learned that after 48 h of 6-well plate culture, the siRNA group SW900 cells had the weakest ability to move.Figure 11(A) PCR results of C10orf55 in BEAS-2B, H520, H596, and SW900. (B) Expression levels in SW900 cells + NC plasmid, GAPDH, and three interfering plasmids after interference in SW900 cells. (C-D) Invasion results in three groups of SW900, SW900 + NC, and siRNA. (E–F) Scratch results of the three groups of SW900, SW900 + NC, and siRNA.

Hot Topics

Related Articles