Development and validation of a novel endoplasmic reticulum stress-related lncRNAs signature in osteosarcoma

Identification of differentially expressed ERS genesBy conducting differential analysis on normal muscle tissue and OS tissue, we obtained 5179 differentially expressed genes. Among them, there are 4274 upregulated genes and 905 downregulated genes (Fig. 2A, B). After intersecting with 252 ERSs downloaded from MSigDB, we obtained 78 differentially expressed ERGs (Fig. 2C). The results of GO and KEGG enrichment analysis on these 78 intersecting genes included “response to endoplasmic reticulum stress”, “response to topologically incorrect protein”, “response to unfolded protein”, “endoplasmic reticulum lumen”,“endoplasmic reticulum protein-containing complex”, “endoplasmic reticulum quality control compartment” and “Protein processing in endoplasmic reticulum” (Fig. 2D–F). The functions of these intersecting genes are mainly associated with ERS. This further proves the reliability of our screened ERGs.Fig. 2Identification of differentially expressed ERGs. (A,B) Identification of differentially expressed genes between OS and muscle (A) volcano plot, (B) heatmap (C) Intersecting differentially expressed genes with ERGs (D–F) Functional enrichment analysis of differentially expressed ERGs (D) bubble plot, (E) EMAP plot, (F) chord plot. Identification of ERLs and construction of a predictive model for ERLs in the training groupBased on the TARGET database, lncRNAs were extracted using Perl-based methods. Then, we conducted correlation analysis on the 78 differentially expressed ERGs (Cor > 0.5, p< 0.001), and we obtained 420 ERLs (Fig. 3A). Overall, 88 OS samples were classified into two groups: testing and training groups. 64 of these lncRNAs were selected using univariate Cox regression analysis for their potential as prognostic lncRNAs associated with ERS in OS patients (Fig. 3B). The Least Absolute Shrinkage and Selection Operator(LASSO) analysis and multi Cox regression analysis of the training cohort was conducted to build the model and 5 lncRNAs were finally identified when constructing the risk signature (Fig. 3C–E). The ERLs- signature score for each patient was subsequently determined through the formula as given below: ERLs score (Risk Score) = (– 1.954919033×expr AC023157.3) + (0.840838939×exprAL031673.1) + (0.193350763×expr LINC02298) + (– 1.160820595×expr LINC02328) + (0.512805281×expr SNHG26). The patient cohort was categorized into low and high-risk groups based on the median risk score. The survival status map revealed that an increase in the risk score corresponded to a progressive decline in the patients’ survival time and an increase in their mortality rate (Fig. 3F, G). The relative expression levels of each gene were graphically represented (Fig. 3H). Moreover, the KM survival curves indicated that the prognosis for the high-risk group was inferior to that of the low-risk group (Fig. 3I). At 1, 3, and 5 years, the AUC values for the ROC curves were 0.893, 0.936, and 0.927 (Fig. 3J).Fig. 3Identification of lncRNAs associated with ERS and construction of the prognostic signature. (A) The correlation analysis of the 78 ERGs and their related lncRNAs. (B) A graphical representation of a forest plot displays the results of the univariate Cox regression analysis (C) A cross-validation curve is generated to assess the paired likelihood of deviance (D) Elucidation of the LASSO coefficient profiles of predictive lncRNAs. (E) The results of the multivariate Cox regression analysis (F) Patients are classified into high-risk and low-risk groups based on riskscore. (G) Survival status map (H) Risk heatmap of gene expression (I) Kaplan–Meier survival curves (J) ROC curves were generated to assess the overall survival rates at 1, 3, and 5 years.Evaluation of the predictive model and PCA The accuracy of the ERLs signature was checked with both validation and total sets. The results were similar to what was seen in the training set: participants with a higher risk score showed a lower chance of survival and a higher death rate (Fig. 4A, B, F and G). Moreover, heatmaps were created to illustrate the distribution of expression for lncRNAs (Fig. 4C and H). Furthermore, the KM survival curves indicated a substandard prognosis for high-risk group patients (Fig. 4D and I). The model was excellent at predicting prognosis, as shown by the ROC curves of the validation set and the total set. The AUCs at 1, 3, and 5 years for the validation set were 0.716, 0.764, and 0.774, and they were 0.800, 0.867, and 0.873 for the total set (Fig. 4E and J). It was determined, through a PCA of all genes, ERGs, ERLs, and ERLs predictive model, the ERLs prognostic model could distinguish the patients between high and low risk groups more effectively (Fig. 5A–D).Fig. 4The assessment of the prognostic signature of lncRNAs associated with the ERS in the validation and whole cohorts. (A,F) The distribution of riskscore (B,G) Plots indicating the state of survival (C,H) Risk gene expression heatmaps (D,I) The Kaplan-Meier survival curves (E,J) ROC curves were generated to evaluate the overall survival rate at 1, 3, and 5 years.Fig. 5Principal component analysis (A–D) Principal component analysis of all genes, ERS genes, ERS lncRNAs and risk lncRNAs.Nomogram construction and KM survival curve in different subgroupsA nomogram was generated by calculating the prognostic indicators related to clinical characteristics and risk scores (Fig. 6A). These results indicate a strong correlation between the observed and expected findings for the overall survival rates at 1, 3, and 5 years. The calibration curves of the nomogram suggested that the nomogram had a sufficient ability to predict outcomes (Fig. 6B). Survival analysis was further conducted to compare groups based on baseline features such as age, gender and metastasis. The findings demonstrated that OS patients divided as high-risk demonstrated significantly shorter survival durations compared to low-risk patients across many categories, including gender (male, female), age (age  >12, age ≤ 12), and metastasis (metastasis, non-metastasis). These results provide strong evidence supporting the efficacy of the risk model in accurately predicting outcomes for OS patients in multiple subgroups (Fig. 6C–H).Fig. 6The development of a predicted nomogram and subgroup survival analysis. (A) This nomogram displays risk scores and clinicopathological variables that can be used to predict overall survival in OS patients at 1, 3, and 5 years. (B) Calibration curves (C,D) Survival analysis in different age subgroups. (E,F) Survival analysis in different sex subgroups. (G,H) Survival analysis in different metastasis subgroups.GSVAThen, we conducted GSVA and found that the high-risk group had lower scores than the low-risk group in KRAS signaling, Allograft rejection, IL2/STAT5 pathway, P53 pathway, Inflammatory response, PI3K/AKT/mTOR pathway, Complement, Apoptosis, IL6/JAK/STAT3 pathway, and TNFα via NFkβ signaling pathways. In contrast, the high-risk group had higher scores than the low-risk group in E2F targets, G2M checkpoint, Wnt/beta/catenin, and Cholesterol homeostasis (Fig. 7A). The heatmap of gene expression and hallmark pathway scores further demonstrates the relationship between gene expression and pathways (Fig. 7B). Patients in different risk groups may affect their respective prognosis through the above-mentioned pathways.Fig. 7Gene set variation analysis (GSVA) (A) GSVA analysis between high-risk and low-risk groups (B) The relationship between GSVA score and gene expression.Immune infiltration analysisDue to its dynamic and heterogeneous characteristics, the TME significantly influences cancer development, progression, and metastasis while also altering the behavior of tumor cells. Compared to patients in the low-risk group, the APC_co_inhibition, APC_co_stimulation, CCR, Check point, Cytolytic_activity, Inflammation-promoting, and T_cell_co-inhibition in patients in the high-risk group are significantly suppressed (Fig. 8A). ESTIMATE algorithm indicated that the low-risk group exhibited higher stromal, immune, and estimate scores than the high-risk group (Fig. 8B–D). Infiltration of stromal and immune cells is decreased in the high-risk group. Using the ssGSEA algorithm, we found that CD8 T cells, iDCs, Macrophages, Neutrophils, Tgd, and Th1 cells were significantly reduced in the high-risk group (Fig. 8E). Additionally, we also observed a significant correlation between Neutrophils, Macrophages, and iDCs (Fig. 8F). Furthermore, examining the relationship between the risk score and immune cells revealed that iDC, Macrophages, Neutrophils negatively correlated with the risk score and Tgd positively correlated with the risk score (Fig. 8G–K). The expression of common immune checkpoint genes between two groups was analysed and found the patients in high risk group had lower expression levels of LAG3,CTLA4,TNFRSF9,CD80,CD70,LAIR1,TNFSF15,CD274,HAVCR2,TMIGD2,CD28, CD48, CD200R1 (Fig. 8L).These findings suggest that the immune function of patients in the high-risk group is impaired, leading to an increased possibility of immune escape.Fig. 8Immune infiltration analysis and immune checkpoints analysis. (A) Analysis of immune function between two groups (B) ESTIMATE score between two groups (C) Stromal score between two groups (D) Immune score between two groups (E) Infiltration of 24 types of immune cells between high-risk and low-risk groups (F) Heatmap of immune cell infiltration correlation (G) Heatmap of correlation between model genes and immune cell infiltration (H) High infiltration of Tgd is associated with high risk score (I–K) Low infiltration of cytotoxic cells, macrophages, and iDCs is associated with high risk score (L) The expression of immune checkpoints in high-risk and low-risk groups.Drugs with potential efficacy in OSThis study utilized the “pRRophetic” R package to determine the IC50 values of various drugs. These values served as markers to estimate the effectiveness of a drug in inhibiting tumor growth. The results revealed 11 small molecule drugs with higher sensitivity towards the high-risk group: CDK9_5576, CDK9_5038, Acetalax, TAF1_5496, NVP-ADW742, Nilotinib, Linsitinib, KRAS(G12C)inhibitor-12, I-BRD9, GSK1904529A, Dihydrorotenone (Fig. 9).Fig. 9Drug sensitivity analyses (A) CDK9_5576 (B) CDK9_5038 (C) Acetalax (D)TAF1_5496 (E) NVP-ADW742 (F) nilotinib (G) linsitinib (H) KRAS(G12C)inhibitor-12 (I) I-BRD9 (J) GSK1904529A (K) dihydrorotenone.Confirmation of risk ERLs expression in OS cell linesThe risk prognosis predictive model was further validated by RT-qPCR analysis performed in three human OS cell lines (HOS, MG63 and U2OS). Normal osteoblast cells (hFOB1.19) were used as the control group. It was found that AL031673.1, SNHG26 and LINC02298 had increased expression in all three OS cell lines. AC023157.3 and LINC02328 had reduced expression in the OS cell lines (Fig. 10A-E). This indicates that the constructed risk prognosis model is persuasive and reliable.Fig. 10Validation of RT-qPCR in OS cell lines (A) LINC02298 (B) AC031673.1 (C) SNHG26 (D) AC023157.3 (E) LINC02328.The most significant differential expression was LINC02298, which promoted OS cell proliferation, migration, and invasionAfter analyzing the differential expression of the model ERLs in OS cell lines, it was observed that LINC02298 exhibited the most significant differential expression as a risk factor in OS. Therefore, further experimental validation was performed on LINC02298. To further investigate the function of LINC02298 in OS, lentiviral transduction was used to overexpress it in the HOS, MG63 and U2OS cell lines. RT-qPCR confirmed the overexpression of LINC02298 in OS cell lines (Fig. 11A, C and E). The CCK-8 assays demonstrated that LINC02298 overexpression promoted cell proliferation (Fig. 11B, D and F). Moreover, the migration of OS cells was promoted by LINC02298 overexpression, as determined by the wound healing assay (Fig. 11G–I). Moreover, the transwell assay demonstrated that LINC02298 overexpression significantly induced the migration and invasion of OS cells (Fig. 11J–L). In summary, the overexpression of LINC02298 promoted the growth, migration, and invasion of OS cells.Fig. 11The influence of LINC02298 overexpression on cell proliferation, migration, and invasion in HOS, MG63 and U2OS cell lines. (A,C,E) The overexpression of LINC02298 via RT-qPCR (B, D,F) CCK-8 assays (G–I) Wound healing assays (J–L) Transwell assays. Knockdown of LINC02298 inhibited the proliferation, migration, and invasion of OS cellsAfter determining the biological functions of overexpressing LINC02298, LINC02298 was knocked down to further validate changes in its function. It was knocked down in the HOS, MG63 and U2OS cell lines with siRNA. RT-qPCR confirmed the LINC02298 knockdown in OS cell lines (Fig. 12A, C and E). CCK-8 assay indicated that the knockdown of LINC02298 inhibited cell proliferation (Fig. 12B, D and F).Moreover, the migration of OS cells was inhibited by LINC02298 suppression, as determined by the wound healing assay (Fig. 12G–I). Conversely, the transwell assays demonstrated that LINC02298 inhibition substantially inhibited the migration and invasion of OS cells (Fig. 12J–L). In summary, the inhibition of LINC02298 inhibited the proliferation, invasion and migration of OS cells.Fig. 12The effect of LINC02298 knockdown on cell proliferation, migration, and invasion in HOS, MG63 and U2OS cell lines. (A,C,E) RT-qPCR measured the knockdown of LINC02298. (B,D,F) CCK-8 assays (G–I) Wound healing assays (J–L) Transwell assays.Pan-cancer analysis of LINC02298When comparing the expression levels in normal tissue and tumor tissue, we have observed that in the cases of BLCA, CHOL, COAD, ESCA, HNSC, KIRC, KIRP, LIHC, LUAD, LUSC, READ, STAD, THCA, the expression level of LINC02298 is higher in tumor tissue compared to normal tissue. While in many cancers, such as GBM, KICH, PRAD, the expression of LINC02298 is reduced compared to the normal tissue (Fig. 13A). The paired figure shows that the expression of BLCA, BRCA, CHOL, COAD, HNSC, KIRC, KIRP, LIHC, LUSC, STAD, and THCA in tumor tissue is more than that in normal tissue, while the expression in KICH and PRAD is lower than that in normal tissue (Fig. 13B). Subsequently, the Overall Survival (OS) analysis revealed that LINC02298 was significantly associated with poorer OS in ACC, DLBC, THCA, and UVM, while a LINC02298 in LGG was significantly associated with better OS (Fig. 13C). The Disease-Specific Survival (DSS) analysis showed that LINC02298 was significantly associated with poorer DSS in DLBC, READ, THCA, and UVM, while LINC02298 in LGG was significantly associated with better DSS (Fig. 13D). The Progression-Free Interval (PFI) analysis indicated that LINC02298 was significantly associated with better PFI in ACC, COAD, DLBC, and UVM, whereas a LINC02298 in LGG and UCEC was significantly associated with poorer PFI (Fig. E).The Cibersort ssGSEA and ESTIMATE algorithm suggests that LINC02298 is closely related to various immune cells in multiple cancers, indicating that LINC02298 may affect tumor prognosis by mediating multiple immune cells (Fig. 13F–H).Fig. 13Pan-cancer Analysis of LINC02298. (A,B) The expression of LINC02298 in pan-cancer analysis (C) Overall survival analysis. (D) Disease specific survival analysis. (E) Progress free interval analysis. (F–H) The correlation between LINC02298 and immune cells using Cibersort (F), ssGSEA (G), ESTIMATE (H) algorithm.

Hot Topics

Related Articles