Development and validation of a prognostic signature of breast cancer based on drug absorption, distribution, metabolism and excretion (ADME)-related genes

Extraction of ADME-related genes with noteworthy prognostic valueThe TCGA dataset was utilized to determine the expression levels of 298 genes related to ADME in both normal and breast cancer populations. The differential expression of ADME-related genes across various breast cancer subtypes was observed, as depicted in Supplementary Fig. S1. Within the HER2-enriched subtype, heightened NAT1 expression was significantly correlated with an unfavorable prognosis (P = 0.047) (Supplementary Fig. S2A). Within the Luminal B subtype, elevated ADH1B expression was associated with a poorer prognosis (P = 0.042) (Supplementary Fig. S2B). Furthermore, in the Luminal A subtype, patients exhibiting increased levels of CYP1A1, ABCC9, and CYP46A1 had a worse prognosis, whereas those with heightened CYP21A2 expression demonstrated a more favorable prognosis (Supplementary Fig. S2C). Between normal and breast cancer tissues, 81 DEGs were discovered, containing 26 upregulated and 55 downregulated genes (Fig. 2A,B). The crucial points were subsequently exhibited through the analysis of protein-protein interaction (PPI) of 81 DEGs, as shown in Fig. 2C. We observed that the critical genes were CYP3A4, ABCG2, and CYB2B6 (Fig. 2D).Fig. 2Identification of candidate ADME-related genes. (A) A heatmap of the DEGs between tumor tissues and normal tissues. (B) A volcano plot of DEGs. (C) The PPI network diagram of DEGs. (D) The number of nodes associated with hub genes in the PPI network was determined. (E) A forest plot was generated to assess the prognostic significance of ADME-related genes.Creation of a prognostic signature related to ADMEThe univariate Cox analysis identified 38 DEGs that were associated with prognosis (Fig. 2E). Following the elimination of overfitting genes through LASSO analysis, an ADME signature consisting of four genes (ABCB5, ATP7B, KCNJ11, and TAP1) was identified (Fig. 3A,B). The risk score is calculated using the following equation: $$\begin{aligned} {\text{risk score}} &= ({\text{ABCB}}5 \times (1.00448764021319) + ({\text{ATP}}7{\text{B}} \times (-0.246590735857609) \\ &\quad + ({\text{KCNJ}}11 \times (-0.186706924591259) + ({\text{TAP}}1 \times (-0.274209942422424)).\end{aligned}$$Fig. 3Construction of the prognostic signature in TCGA dataset. (A,B) The LASSO analysis of the ADME-related genes. (C) Distribution of risk scores and survival time and status for each case. (D) Heatmap showed the expression of risk genes in different risk groups. (E) The PCA analysis. (F) Survival differences between patients in different risk groups. (G) The time-dependent ROC curves of BRCA patients.Based on the median risk score, breast cancer patients were classified into two different risk subgroups (Fig. 3C). Figure 3D illustrates the diverse manifestation of the four genes among two risk categories. According to results of the PCA, individuals could be effectively categorized into two clusters (Fig. 3E). Figure 3F demonstrates that individuals classified as low-risk exhibit a more favorable prognosis in contrast to those classified as high-risk. Additionally, the ADME signature was assessed through the utilization of the ROC curve, resulting in AUC values of 0.764, 0.708, and 0.710 for the respective 1-, 3-, and 5-year intervals (Fig. 3G).Verification of the prognostic signature related to ADME and the link between risk scores and clinical features of individuals with BRCAMoreover, our findings were validated in the test datasets, which stratified individuals diagnosed with BRCA into two distinct risk subgroups. The KM analysis conducted on the datasets TCGA-test, TCGA-all, and GSE20685 demonstrated that individuals classified as low-risk exhibited a more favorable prognosis in comparison to those classified as high-risk (Fig. 4A,C). Furthermore, within the TCGA cohort, high-risk patients exhibited lower rates of recurrence-free survival (RFS) and disease-specific survival (DSS) compared to low-risk patients (Supplementary Fig. S3). Similarly, in the GSE7390, GSE20711, GSE25066, and GSE58812 cohorts, high-risk patients demonstrated a poorer prognosis (Supplementary Fig. S4). In TCGA-test (Fig. 4D), the AUC values for the ROC curve were 0.718, 0.681, and 0.676 for the 1-, 3-, and 5-year periods, respectively. In TCGA-all (Fig. 4E), the corresponding values were 0.737, 0.706, and 0.701. In GSE20685 (Fig. 4F), the AUC values were 0.932, 0.669, and 0.680. In patients diagnosed with Basal-like, HER2-enriched, Luminal A, and Luminal B breast cancer subtypes, those classified as high-risk exhibited a poorer prognosis compared to low-risk patients, with no statistically significant disparity observed in Normal-like subtypes (Supplementary Fig. S5). Furthermore, a survival analysis was performed using clinical characteristics, which demonstrated that individuals with low-risk scores exhibited superior outcomes across multiple categories, encompassing age groups of 65 and below, over 65, absence of metastasis (M0), presence of metastasis (M1), absence of lymph node involvement (N0), presence of lymph node involvement (N1–3), early-stage disease (Stage I + II), advanced-stage disease (Stage III + IV), and smaller tumor size (T1 + 2) (Supplementary Fig. S6). This further demonstrated the reliability of the risk model we developed. Furthermore, we conducted a comparison with the prognostic model of different individuals, and our signature exhibited a higher C-index compared to theirs (Supplementary Fig. S7). The expression of risk genes was validated through the use of RT-qPCR. Supplementary Fig. S8 demonstrated that tumor cells displayed elevated expression of ABCB5 and reduced expression of ATP7B, KCNJ11, and TAP1 in comparison to normal cells.Fig. 4Validation of the prognostic model. The K-M curves shows the different prognosis in different risk group in TCGA-test (A), TCGA-all (B) and GSE20685 (C). The ROC curves in TCGA-test (D), TCGA-all (E) and GSE20685 (F).Development of a nomogram for breast carcinoma and assessment of its prognostic predictive capacityCox regression analyses yielded compelling evidence supporting the robustness and independence of the risk score as a prognostic indicator (Fig. 5A,B). To further explore the prognostic potential of the ADME signature (Fig. 5C), a novel nomogram was created by incorporating the ADME signature and clinical variables obtained from TCGA. Figure 5D displayed calibration curves that demonstrated a satisfactory capability. Moreover, the nomogram’s ability to predict prognosis was assessed using ROC analysis, comparing it with other factors such as Stage, T, N, M, and age. The AUC value for 1-year survival was 0.954 (nomogram) and 0.789 (risk score) (Fig. 5E). The AUC for 3-year survival was 0.790 (according to the nomogram) and 0.675 (risk score) (Fig. 5F). The AUC value for 5-year survival was 0.718 (nomogram) and 0.700 (risk score) (Fig. 5G). The DCA for 1-, 3-, and 5-year periods demonstrated that this nomogram exhibited a greater overall advantage, as depicted in Fig. 5H,J. These results demonstrated the potential of this innovative nomogram to serve as an excellent prognosis prediction model.Fig. 5Construction and evaluation of the nomogram. The forest plot for univariate Cox (A) and multivariate Cox regression (B) analysis. (C) Nomogram plot based on risk score and clinicopathological factors. (D) Calibration plot for the validation of the nomogram. The multifactor AUC for 1- (E), 3- (F), and 5-years (G) survival. The DCA curves for 1- (H), 3- (I), and 5-years (J).Investigation of the TME and its correlation with the efficacy of immunotherapy among patients stratified into different risk score categoriesAccording to the ESTIMATE analysis, it was observed in Fig. 6A that the high-risk group had lower ImmuneScores compared to the low-risk group. Furthermore, the differentiation of immune cell quantities among the two risk categories was investigated using different algorithms. The low-risk group demonstrated increased levels in the majority of immune cells, as depicted in Fig. 6B. According to the ssGSEA analysis, the high-risk patients showed reduced infiltration of B cells, CD8 + T cells, Neutrophils, Mast cells, pDCs, T helper cells, Tfh, Th1 cells, Th2 cells, TIL, and Treg compared to the low-risk patients (Fig. 6C). In the cohort of patients classified as low risk, there was an observed enhancement in specific immune functions, including APC co-inhibition, checkpoint regulation, T cell co-stimulation/co-inhibition (Fig. 6D). This could potentially clarify the reason behind the superior outlook of the low-risk classification. Furthermore, Fig. 6E illustrated the spread of individuals with different risk among various immune subtypes.Fig. 6Analysis of tumor microenvironment of high- and low-risk groups. (A) Differences in Immunescore between the two groups. (B) The examination of disparities in immune cell infiltration between the two cohorts was conducted utilizing multiple algorithms. (C) The analysis of differences in immune cell infiltration between the two risk groups. (D) The examination of variations in immune functions between the two groups was conducted using ssGSEA. (E) The allocation of patients exhibiting high- and low-risk profiles across various immune subtypes.Next, we analyzed the expression patterns of immune-related genes in different risk patients. The high-risk category showed low expression levels of most immune-related genes (Fig. 7A,D). The TIDE scores played a pivotal role in assessing the efficacy of immunotherapy. Furthermore, an investigation was conducted to analyze the relationship between the TIDE score and risk score. The low-risk category demonstrated significantly lower TIDE scores when compared to the high-risk category (Fig. 7E). As shown in Fig. 7F,I, violin plots showed that a higher IPS in a low-risk category suggested a more robust reaction to PD-1 and CTLA-4 inhibitors, establishing the correlation between IPS and risk groups. Conversely, individuals with a low-risk profile demonstrated a more favorable response to immunotherapy compared to those with a high-risk profile (Fig. 7J).Fig. 7Evaluation of the immunotherapeutic efficacy in cohorts classified as high- and low-risk. (A–D) The immune-related gene expression levels in different groups. (E) The TIDE score exhibits variations between the two groups. (F–I) The violin plots effectively depicted the association between IPS interventions and different risk groups. (J) Prediction of immunotherapy response.The functional enrichment analysis of ADME signatures identified numerous pathways that exhibited significant alterations among patients stratified into different risk groupsTo enhance comprehension regarding the molecular mechanisms of distinct ADME signature subgroups, we conducted enrichment analyses employing GO, KEGG, and DO. The GO analysis revealed that the DEGs between different risk groups were primarily associated with the development of the epidermis, activity of serine-type endopeptidase, and keratin filament enrichment (Fig. 8A). According to the KEGG findings, these genes were mainly enhanced in the Estrogen, Nitrogen metabolism, and Wnt signaling pathway (Fig. 8B). According to the findings, these genes were mainly found to be abundant in various cancerous tumors, such as gastric carcinoma, cancer of the female reproductive system, and breast cancer (Fig. 8C). Additionally, the GSVA revealed that numerous pathways dramatically altered between breast cancer patients with different risk (Fig. 8D).Fig. 8Function analysis. (A) GO, (B) KEGG, and (C) DO analysis of DEGs between high and low-risk groups. (D) GSVA enrichment analysis in high- and low-risk groups.Identification of chemotherapy drug sensitivity on the basis of the prognostic modelIn order to obtain additional understanding regarding potential variations in drug responsiveness among the aforementioned risk categories, we conducted an analysis to assess the relationship between the risk scores of breast cancer patients and the IC50 values of chemotherapy and targeted treatment medications. In the high-risk group, the IC50 values of AZ628, CGP-60,474, FTI-277, and GSK429286A were significantly lower compared to the low-risk group. Conversely, the IC50 values of XL-184, KIN001-102, IPA-3, TAE684, and ZSTK474 were higher in high-risk group (Supplementary Fig. S9). The findings indicated that the group at a greater risk showed greater sensitivity to AZ628, CGP-60,474, FTI-277, and GSK429286A, while the group at a lower risk exhibited higher sensitivity to XL-184, KIN001-102, IPA-3, TAE684, and ZSTK474. Furthermore, our analysis of the correlation between risk scores and the sensitivity of frequently utilized breast cancer medications, utilizing data from the GSE130787 dataset, revealed that individuals in the high-risk group exhibited a more favorable response to docetaxel, carboplatin, and trastuzumab treatment (Supplementary Fig. S10).

Hot Topics

Related Articles