Proteogenomic characterization identifies clinical subgroups in EGFR and ALK wild-type never-smoker lung adenocarcinoma

Clinical characteristics

A total of 1597 fresh-frozen tumor samples were collected from patients with never-smoker lung adenocarcinoma (NSLA) who underwent surgical resection between 2001 and 2018 at the Korean National Cancer Center (KNCC). Among these, 102 samples from tumors without drug-sensitive EGFR mutations or ALK rearrangements were selected for further analyses (Supplementary Fig. 1a). Genomic data were used for tumor purity estimation, resulting in the exclusion of three samples with low purity (purity < 0.4, Supplementary Fig. 1b). The remaining 99 samples constituted the frozen sample set used for all downstream analyses. The median age of the patients was 63 years (range, 40–85 years), and 87.9% (n = 87) of them were women; 39.4%, 33.3%, and 27.3% of the tumors were stage I, II, and III, respectively.

Classification of subgroups

The scarcity of proteogenomic research on lung adenocarcinoma in never smokers without EGFR or ALK mutations has resulted in an oversimplified view of these tumors, treating them collectively as a single, homogeneous, yet undefined entity. To explore whether NSLA without EGFR or ALK mutations can be stratified into molecular subgroups, we employed a set of gene signatures shown in a series of previous pan-cancer studies25,26 to represent tumor-intrinsic properties as well as characteristics of the tumor microenvironment. We applied these gene set signatures, encompassing proliferation and immune enrichment scores (see Supplementary Fig. 1a, “Methods”), to the proteome and RNA data from the present study. These molecular signatures stratified the samples into four subgroups with distinct molecular properties: the proliferation (P), immune (I), angiogenesis (A), and metabolism (M) subgroups (Fig. 1a, “Methods”). The molecular factors employed in the classification demonstrated orthogonal classification at both the transcriptome and proteome levels (Fig.
1a, Supplementary Fig. 2a) and were sufficiently independent of each other to guide the subtyping of NSLA patients with varying degrees of clinical outcomes (Supplementary Fig. 2b, c). Among the subgroups, the P and A subgroups exhibited worse clinical outcomes, while the I subgroup displayed more favorable outcomes (Fig. 1b). Multivariate analysis revealed that the hazard ratio for the P subgroup relative to the I subgroup was 2.91 (P < 0.05, Fig. 1c).

Fig. 1: Immune microenvironment and TME proteogenomic profiles of NSLA. a Heatmap representation of distinct molecular subgroups in the NSLA cohort. The subgroups are color-coded as follows: proliferation-high (P) subgroup, blue; immune-high (I) subgroup, red; angiogenesis-high (A) subgroup, purple; and metabolism (M) subgroup, green. Mutations are ordered by significance calculated by MutSigCV. Three mutational signatures were found in this cohort: age (pink), MSI (blue) and APOBEC (yellow). b Overall survival probability of the molecular subgroups of NSLA (Kaplan–Meier survival analysis with log-rank P value). c Cox proportional hazard model multivariate analysis of molecular classification adjusted for clinical factors, including sex, age, and tumor stage and subtype. d Boxplot of the TMB and e number of weak or strong neoantigens according to molecular subgroup.

We used transcriptomic data from The Cancer Genome Atlas (TCGA) for external validation. We first analyzed TCGA never-smoker lung adenocarcinomas without EGFR or ALK driver mutations (n = 46) and observed a general trend toward a poor prognosis in the P and A subgroups and a better prognosis in the I subgroup (P = 0.0023, Supplementary Fig. 3a, b). Classification of the entire TCGA lung adenocarcinoma dataset into the four molecular subgroups similarly stratified patients according to prognosis (Supplementary Fig.
3c–e).

The genomic landscape of the NSLA cohort reveals subtype features

In the present cohort, TP53 exhibited the highest frequency of genetic alterations, at 26.3%, followed by KRAS (20.2%) and SETD2 (13.1%) (Table 1, Fig. 1a). EGFR oncogenic driver mutations and ALK rearrangements were not detected in the cohort. ERBB2 exon 20 insertions were frequently observed within the cohort (6 out of 99, Supplementary Fig. 4a). Although patients with ERBB2 exon 20 insertions have modest sensitivity to HER2 tyrosine kinase inhibitors (TKIs), they respond better to combinations of immune checkpoint inhibitors (ICIs) and HER2 TKIs32; thus, the ERBB2 insertion mutation is considered a viable combinatorial therapeutic target. Mutations in another driver gene, KRAS, were detected in 20 samples. KRAS G12C is the most common type of KRAS mutation in lung cancer and was previously reported to be positively associated with smoking status, while the KRAS G12D mutation is more frequently found in nonsmokers33. In our cohort, KRAS G12V and G12D mutations were the most frequent (10 and 6 out of 99, respectively), and KRAS G12C was the least frequent of the G12 variants (3 out of 99), alongside a single case of G13C, corroborating the nonsmoking nature of the cohort. The three KRAS G12C mutations occurred once each in the P, A, and M subgroups, with none in the I subgroup. In the A subgroup, KRAS mutations were detected in six samples, among which five had concurrent TP53 mutations (P = 0.0096). TP53 mutations were most frequent in the P subgroup (48%, P = 0.0015) and were almost exclusively found in the P and A subgroups (P = 1.8 × 10⁻⁵). Furthermore, two patients in the P subgroup had mutations in both TP53 and RB1 (Fig. 1a), which are associated with worse clinical outcomes34,35. Other frequent somatic alterations observed in the P subgroup included ARID1A mutations (P = 0.003). Somatic mutations in SETD2 were inversely correlated with B-cell activity (Supplementary Fig.
3g, P = 0.0006) and did not co-occur with functional oncogenic rearrangements (Fisher’s exact test, P = 1). SETD2 mutations were most frequent in the M and A subgroups (P = 0.01, Fig. 1a). STK11 and PIK3CA mutations were almost exclusive to the M subgroup (P < 0.05).

Table 1 Clinical characteristics of patients with EGFR and ALK wild-type never-smoker lung adenocarcinoma.

We assessed mutational signatures in the NSLA samples and found that three signatures, namely the age-related, microsatellite instability (MSI), and APOBEC mutational signatures, were highly represented (Fig. 1a, Supplementary Table 8). As expected, no smoking-related mutational signature was found in the cohort. The proportion of the APOBEC mutational signature was notably greater in the P and M subgroups than in the I and A subgroups (P < 0.00022). Interestingly, despite lacking TP53 mutations, the M subgroup had a tumor mutational burden (TMB) comparable to that of the P and A subgroups, while the TMB was significantly lower in the I subgroup (Fig. 1d). Both strong and weak neoantigen loads were also lower in the I subgroup (Fig. 1e). The combination of a highly immune-infiltrated microenvironment and a decreased neoantigen load may produce an immunologically neutral milieu, suggesting another molecular feature contributing to the limited efficacy of ICI monotherapy in patients with NSLA. Additionally, the I subgroup demonstrated the lowest occurrence of the APOBEC mutational signature (P < 0.05). The minority of patients with MSI features were mostly found within the A subgroup (P = 0.009, Fig. 1a), consistent with previous work highlighting the increased expression of vascular endothelial growth factor (VEGF) in MSI-high tumors36. MSI-high status was also significantly correlated with a lower percentage of patients with a high tumor copy number alteration (CNA) burden (Supplementary Table 6, P < 0.05).

We examined the influence of somatic driver mutations on both the proteome and CNA (Supplementary Fig. 4b–d).
Among the frequently mutated genes depicted in Fig. 1a, STK11 mutation was associated with the most significant upregulation of genes that regulate cancer development. These genes included CLUH, components 2, 4, and 7 of the conserved oligomeric Golgi (COG) complex, as well as COPZ1, IRS2, KRAS, MVD, NEDD4L, PGAM5, PNPT1, RBM28, SEC16A, SLC12A7, SNX27, TFCP2, TRAF2, and TXNRD2 (Supplementary Fig. 4b). Additionally, the protein abundance of ERBB2 exhibited a strong positive correlation with the CNA of genes located on chromosome 17 (Supplementary Fig. 4c, d).

Integrative analysis of copy number alterations in NSLA

We conducted an integrative analysis to explore the genome-wide impact of both cis- and trans-acting copy number alterations (CNAs) on the transcriptome and proteome of the NSLA cohort (Supplementary Fig. 5a, P < 0.05). Initially, our analysis focused on the cis-acting CNAs of 7364 genes (Supplementary Fig. 5b). Among these genes, we observed a significant correlation between CNA and the transcriptome for 2974 genes and between CNA and the proteome for 431 genes (P < 0.05). Gene set enrichment analysis of these 431 genes revealed several oncogenic signaling pathways, including the ERBB, neurotrophin, insulin, and MAPK pathways. Notably, among these genes, 208 showed a positive correlation between CNA and both the transcriptome and proteome (P < 0.01, Supplementary Fig. 5b).

In the analysis of trans-acting CNAs, we identified several broadly acting CNAs in genes that are potentially relevant to lung cancer, including PTK7, METTL1, MSTO1, PIGU, ITGA6, and LPCAT4. Following the exclusion of RNA and unannotated genes, we focused on investigating the trans-acting effects of ITGA6. We identified 262 upregulated and 142 downregulated genes based on their proteomic associations. Among these genes, 135 were found to be highly expressed in tumor samples, and 53 were highly expressed in NAT samples (Supplementary Fig. 5c).
Furthermore, among the proteins affected in trans by ITGA6, 98, including ACTN1, TMOD2, MAP2K1, BRAT1, and SMAD3, were associated with a worse prognosis when upregulated.

Proteogenomic characteristics of the proliferation-high (P) subgroup

Having established that NSLA without driver mutations comprises four clinico-molecular subgroups, we further investigated the proteogenomic details of each subgroup. The P subgroup was characterized by elevated cellular division rates and had the poorest prognosis. Tumors in this subgroup contained intermixed immune components, showed generally decreased immune activity, and exhibited significantly depleted angiogenic activity (Fig. 1a, Supplementary Fig. 2a). Considering the contrasting clinical outcomes exhibited by the immune and proliferation groups (Fig. 1b, c), we sought to identify regulatory factors capable of blocking the inherent proliferation potential while enhancing the surrounding immune activity. An algorithm37 was used to search for regulatory factors that explain the differentially expressed genes (DEGs) specific to the subgroup, which identified transcription factors including E2F1 and TFDP1 (Supplementary Fig. 6a). These factors form heterodimeric complexes, which negatively impact immune activity37 and are directly involved in cell cycle progression; thus, E2F1 is a potential dual-action regulatory target in this subgroup.

Proteome- and transcriptome-based enrichment in this subgroup revealed signaling networks centered on the cell cycle, chromosome modification, and DNA replication (Fig. 2a, b, Supplementary Figs. 8a, e, 9a, 10a). The weighted rank analysis of both the proteome and transcriptome in the P subgroup highlighted the upregulation of proliferation markers such as CDK1, MAD2L1, and the MCM family, while focal adhesion markers were downregulated (Fig. 2b, c).
Analysis of kinases associated with the P subgroup identified numerous actionable targets, such as CDK2 and CDK5, polo-like kinases (PLKs), and ATR (Supplementary Fig. 10a). Inhibition of CDK2 can block hyperphosphorylation of the Rb protein, allowing Rb to bind the E2F and TFDP1 complex and block activation of its target genes38. In clinical practice, elevated protein levels of Ki67 and CDK1 can be used as biomarkers for the identification of this subgroup (Fig. 3a).

Fig. 2: Proteome correlation clustering and characteristics associated with subgroups. a Protein correlation network of NSLA patients. Protein groups are defined and colored based on enrichment analysis of hallmark gene sets. Subtype-specific enrichment is colored red when it was higher and blue when it was lower than that in the other subgroups. b Transcriptomic and proteomic expression of pathway markers depicting the characteristics of each subgroup. The size of the dot indicates significance. c Weighted rank density scatter plot indicating the magnitude of change multiplied by the significance of the protein expression on the y-axis and transcriptomic expression on the x-axis.

Fig. 3: Candidate protein biomarkers highly expressed in each subgroup. Boxplots of candidate protein biomarkers: (a) CDK1 and MKI67 were overexpressed in the proliferation-high subgroup; (b) LCK and CCL5 were overexpressed in the immune-high subgroup; (c) the proangiogenic markers LGALS3, FGF2, and CXCL12 were overexpressed in the angiogenesis-high subgroup; (d) ACAT1 was overexpressed in the metabolism-high subgroup.

Unraveling immune dynamics and therapeutic implications in the immune-high (I) subgroup

Increased levels of antitumor immune components, such as T and B lymphocytes, major histocompatibility complex II (MHC II) pathway components, NK cells, and effector T cells, were observed in the I subgroup, together with increased levels of protumor immune elements, such as regulatory T cells (Tregs), cancer-associated fibroblasts (CAFs), and immune checkpoint pathway molecules (Fig. 1a). B cells constituted a significant portion of the immune components and were markedly enriched within this subgroup (Supplementary Fig. 3f, P = 6.8 × 10⁻⁹). The regulatory network exhibited a preference for beneficial immune activities, with upregulated immune-stimulating transcription factors such as IRF4 and TBX21 and downregulated immune-inhibitory regulators (Supplementary Fig. 6b). Notably, several immune-inhibitory transcription factors, including E2F1, were suppressed in this subgroup, suggesting an immune profile opposite to that of the P subgroup (Supplementary Fig. 6b). This subgroup also demonstrated a highly elevated representation of immune activation (Fig. 2a) and correlated enrichment of diverse immune-related signaling networks (Supplementary Fig. 8b, 8f, 9b).
Furthermore, upregulation of immune activation marker genes and downregulation of oxidative phosphorylation marker genes were observed at both the RNA and protein levels, accompanied by the formation of a robust subnetwork of T-cell receptor (TCR) signaling pathways and high expression of the canonical T-cell signaling kinase LCK (Figs. 2b, 3b)39. Elements in the cytokine signaling pathway were upregulated at both the RNA and protein levels (Supplementary Fig. 9b, 10b). Several kinases that were significantly increased in the I subgroup constituted a network centered on immune cytokine signaling (Supplementary Fig. 10b), including ZAP70 and SYK, which are known to regulate the maturation of T cells39.

Immune checkpoint inhibitors are the first-line treatment for metastatic NSCLC40 but are generally less effective in never-smokers than in smokers. The immunological reason for this reduced effectiveness in NSLA patients remains unclear; one hypothesis attributes it to lower TMB and PD-L1 expression. To analyze the immunomodulatory mechanisms of this cohort, we performed a systematic evaluation of inhibitory receptors and their corresponding ligands (Fig. 4a–c, Supplementary Fig. 11a). The stromal and immune scores exhibited concurrent increases in various immune components in NSLA patients (Fig. 4a). Many samples exhibited B-cell enrichment and corresponding upregulation of tertiary lymphoid structure markers (Fig. 4b). Most inhibitory receptors, including PD-1, TIGIT, BTLA, and CTLA4, were upregulated in the I subgroup and were strongly correlated with immune cell infiltration22 (Fig. 4c). In contrast, B7-H3, CEACAM1, and NECTIN2 were marginally upregulated in the nonimmune-high group and inversely correlated with T-cell infiltration (Supplementary Fig. 11a).

Fig. 4: Immunological characteristics of the immune-high (I) subgroup. a Linear regression of immune and stromal scores depicting the concurrent increase in the stromal score and immune score (R² = 0.61).
b Heatmap representation of the expression patterns of tertiary lymphoid structure (TLS) marker genes. c Scatter plot for the correlation coefficient vectors between the immune checkpoint molecules and immune score calculated using the ESTIMATE algorithm as the x-axis and the RNA expression of the immune checkpoint molecules and T cell, B cell, and regulatory T cell scores as the y-axis. The immune checkpoint receptors are shown in red, the ligands are shown in green, and the expression of the molecules was determined using RNA-seq data. d Scatter plot of the correlation coefficients between cytokine and chemokine levels and the immune score. Type I cytokine receptors are shown in brown, G protein-coupled receptor (GPCR) CC chemokines are shown in green, C chemokines are shown in yellow, and tumor necrosis factor receptors (TNFRs) are shown in red.

Effective immunotherapy requires adequate T-cell recruitment into tumors, a process in which cytokines and chemokines play crucial roles41. Both the proteome and transcriptome levels of immune activation markers, including TCR, were upregulated in the I subgroup, in addition to CD8A, CD79A, and CD247 (Fig. 2b, c). The expression levels of cytokines and chemokines were strongly correlated with tumor immune scores and T-cell infiltration (Supplementary Fig. 12a). Among them, CCL5 was differentially expressed at both the RNA and protein levels and showed the strongest positive correlation with the immune score (Figs. 3b, 4d, Supplementary Fig. 12a). In addition, CXCL13 and CD27, which are cytokines that attract B cells and follicular helper T cells, were specifically elevated in the I subgroup (Supplementary Fig. 12a).

Proteogenomic characteristics of the angiogenesis-high (A) subgroup

The A subgroup displayed certain similarities to the I subgroup in terms of a downregulated immune suppressive milieu; however, it did not exhibit an explicitly favorable immune regulatory network (Supplementary Fig.
6c), indicating the potential for enhancing immune activity through the use of immune activators (Supplementary Fig. 8c, 8g). Patients with the highest angiogenesis scores tended to be mutually exclusive with the P subgroup: 22/24 patients with the highest angiogenesis scores belonged to subgroups other than the P subgroup (Fisher’s exact test, P = 0.018). Furthermore, samples with co-mutations in KRAS and TP53 were found to be enriched in this subgroup. However, the characteristics of immune exclusion were apparent in the A subgroup, highlighted by the presence of CAFs and sporadically elevated levels of myeloid-derived suppressor cells (Fig. 1a), potentially impeding CD8+ T-cell infiltration42. The A subgroup demonstrated statistically significant enrichment of networks, including the TGF-β, cell migration, and angiogenic pathways (Supplementary Figs. 8c, 9c, 10c).

Various proangiogenic factors, including FGF2 derived from CAFs43, CXCL12/SDF-1 secreted by human bone marrow stromal cells44, PDGFB, and proangiogenic LGALS3/galectin-3, were highly expressed at both the RNA and protein levels (Figs. 2b, c and 3c)45. When upregulated, LGALS3 binds to integrin or VEGFR2 on the endothelial cell surface, promoting the secretion of granulocyte colony-stimulating factor (G-CSF) and interleukin-6 (IL-6). Subsequently, IL-6 stimulates the NOTCH ligands JAG1/Jagged1 and DLL4, which play nuanced but distinct roles in angiogenesis46. Similarly, G-CSF induced by galectin-3 may also promote tumor growth by stimulating angiogenesis47.

Proteogenomic enrichment analysis revealed lipid metabolism pathways in the metabolism (M) subgroup

The tumor samples from the M subgroup exhibited evident enrichment in oxidative phosphorylation, lipid metabolism, and carbon metabolism, along with other metabolic pathways (Fig. 2a, b, Supplementary Fig. 8d, 8h, 9d).
Several upregulated kinases or related proteins in this subgroup, such as ERBB3, ICK, ARAF, and CaMK2 family members, were also associated with the MAPK cascade (Supplementary Fig. 10d), potentially affecting cholesterol homeostasis by promoting the expression of sterol regulatory element-binding proteins (SREBPs), which are transcription factors that regulate cholesterol synthesis48. This subgroup exhibited marginally immunosuppressive features, and the expression of the acyl-CoA:cholesterol acyltransferase 1 protein (ACAT1), a suppressor of the proliferation of CD8+ T cells, was elevated (Fig. 3d, Supplementary Fig. 6d). The overexpression of ACAT1 may be involved in depleting the cholesterol required for TCR clustering in CD8+ T cells and decreasing T-cell avidity to antigens by esterifying free cholesterol. Moreover, the expression of CERS4, a pivotal enzyme in sphingolipid metabolism, was upregulated at both the RNA and protein levels (Fig. 2b, c). Prior research has established a positive correlation between CERS4 and the efficacy of anti-PD-1 therapy in NSCLC49.

Our proteogenomic data revealed coordinated enhancement of the citric acid cycle, the pentose phosphate pathway, and nucleotide biosynthesis, which, in turn, influenced amino acid and lipid metabolism (Fig. 5a, Supplementary Fig. 8h). Transcriptomic and proteomic levels of acetyl-CoA carboxylase alpha (ACACA), which catalyzes the conversion of acetyl-CoA to malonyl-CoA for mitochondrial fatty acid synthesis (mtFAS), were increased in the M subgroup50 (Fig. 5a).

Fig. 5: Metabolomic subgroups with upregulated pathways and subtype-specific vulnerabilities. a Signaling pathway diagram of glycolysis, the citric acid cycle, and oxidative phosphorylation showing significantly upregulated expression in the M subgroup.
b Boxplots showing a significantly greater dependency on CDK9 exclusively within the P subgroup (P = 0.02). c–e Heightened dependency on c TRAF2 in the I subgroup (P = 0.04), d GRB2 in the A subgroup (P = 0.01), and e ACACA in the M subgroup (P = 0.01).

Subtype-specific cancer vulnerabilities

We analyzed data from the CCLE51 and DepMap29 databases to identify vulnerabilities and potential treatment avenues in the NSLA subgroups. Our initial analysis specifically focused on 69 lung adenocarcinoma (LUAD) cell lines without oncogenic EGFR/ALK mutations. Pathway enrichment was conducted using marker gene lists, and cell lines were subsequently grouped into four distinct subgroups using the same algorithm applied to the NSLA cohort (Supplementary Fig. 13a)37. The similarity of the four clusters between the bulk KNCC cohort and the cell line cohort was analyzed with Spearman correlation, which revealed a strong correlation between the same subtype in the two different cohorts (Supplementary Fig. 13b). To identify subgroup-specific targets, we examined genes whose perturbation had a significant impact within a subgroup. Our findings revealed that the P subgroup exhibited a greater dependency on the CDK9 gene than did the other subgroups (P < 0.05, Fig. 5b).

The cell lines in the I subgroup showed a strong dependency on TRAF2 (Fig. 5c). Tumor cell-expressed TRAF2 has previously been recognized as a significant factor that restricts the ability of cytotoxic T cells to eliminate cancer cells even after immune checkpoint blockade52. The A subgroup demonstrated a pronounced dependency on GRB2, which is intricately linked to the EGFR pathway (Fig. 5d). The cell lines belonging to the M subgroup exhibited notable reliance on cancer metabolic genes such as ACACA, the pivotal player in mtFAS (Fig. 5e).
Collectively, our results underscore the distinct gene dependencies of the P, I, A, and M subgroups on CDK9, TRAF2, GRB2, and ACACA, respectively.

The PRISM Repurposing dataset was used to assess drug sensitivity in lung cancer cells across each subgroup, revealing several drug matches, including digitoxin and tacedinaline, as promising compounds for inducing cancer cell death in the P subgroup (Supplementary Fig. 13c). KI16425, a lysophosphatidic acid receptor (LPAR) antagonist, was suggested for the I subgroup (Supplementary Fig. 13d). The I subgroup exhibited significantly greater expression of LPAR6 than the other subgroups (Supplementary Table 10, P = 0.002); this receptor was previously associated with negative regulation of CD8+ T-cell tumor infiltration53. For the A subgroup, ibutamoren (MK-677) was selected (Supplementary Fig. 13e); it is a synthetic growth hormone secretagogue that mimics the action of ghrelin by binding to the ghrelin receptor and increasing the release of growth hormone. Ghrelin was previously shown to protect against hypoxia-induced lung injury by preventing hypoxia-induced increases in angiogenesis and HIF1-alpha and VEGF expression54. Finally, clorsulon was suggested for the M subgroup (Supplementary Fig. 13f). Clorsulon is widely used as an anthelmintic in calves and sheep but is also a competitive inhibitor of both 3-phosphoglycerate and ATP, inhibiting glucose utilization and acetate and propionate formation55.

Identification of cancer-specific antigens

Cancer germline antigens (CGAs), which are exclusively present in normal germ cells and some cancer cells, are considered promising targets for cancer immunotherapy due to their potential to enhance treatment efficacy while minimizing patient toxicity30.
To identify patients who are most likely to benefit from targeting cancer-specific proteins, we selected CGAs with at least one outlier sample whose expression was at least 100-fold greater than the average expression (Fig. 6a). In our cohort, 11% of the samples exhibited atypically elevated CGA expression and were mostly clustered in nonimmune subgroups (Fig. 6b). The prevalence of CGA overexpression varied among the subtypes (Fig. 6c). The P subgroup had the greatest proportion of CGA-overexpressing samples (24%), while the I subgroup did not exhibit any instances of CGA overexpression, and CGA overexpression was significantly depleted in the A and I subgroups (Fisher’s exact test, P = 0.02). To gain deeper insights into the molecular distinctions associated with the overexpression of CGAs, we conducted a differentially expressed gene analysis between patients with and without CGA overexpression (Supplementary Fig. 14c, Supplementary Table 12). The downregulated genes in the CGA-positive group were mainly related to immune-related genes and pathways, such as interleukin family signaling and the interferon response (Fig. 6d). The same analysis was conducted with the never-smoker subgroup of the TCGA-LUAD cohort (n = 46). Although statistical significance was not detected due to the limited sample size (P = 0.18), a trend toward underrepresentation of CGA-positive cancers in subgroups A and I was observed, and immune-related genes were downregulated in CGA-positive samples, similar to our current cohort (Supplementary Fig. 14a–d). To compare CGA expression in our patient cohort with that in normal tissue, we used normal tissue data from the GTEx project31, which revealed significantly elevated expression of specific CGAs exclusively in cancer and germline tissues in individual patients (Supplementary Fig. 15).

Fig. 6: Cancer-specific antigens in NSLA. a Boxplots of cancer-specific antigen expression, log-scaled at the transcriptomic level on the y-axis, with outlier samples colored by molecular subtype of NSLA. b Heatmap illustrating the distribution of samples with cancer-specific antigen expression. c Pie graph showing the proportion of CGA-positive samples by subgroup. The subgroup with the highest percentage of samples exhibiting CGA overexpression was the P subgroup (28%), followed by the M subgroup (22%), the A subgroup (5%), and the I subgroup, which did not display marked overexpression of any CGAs. d Pathway enrichment analysis using hallmark gene sets from MSigDB associated with tumors expressing cancer-specific antigens.
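
The outlier criterion used to select CGAs (a sample whose expression is at least 100-fold above the average for that gene) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the function names, the toy gene and sample labels, and the toy subgroup assignments are hypothetical, and the choice of a leave-one-out mean as the "average" is an assumption made here so that an extreme sample does not inflate its own baseline.

```python
from statistics import mean


def flag_cga_outliers(expression, fold=100.0):
    """Return {gene: [sample, ...]} for samples whose expression is at
    least `fold` times the mean of the remaining samples for that gene.

    `expression` maps gene -> {sample: linear-scale expression value}.
    The leave-one-out baseline is an illustrative assumption.
    """
    flagged = {}
    for gene, by_sample in expression.items():
        hits = []
        for sample, value in by_sample.items():
            others = [v for s, v in by_sample.items() if s != sample]
            if not others:
                continue
            baseline = mean(others)
            # Skip genes silent in every other sample: the fold change
            # is undefined there (another assumption of this sketch).
            if baseline > 0 and value >= fold * baseline:
                hits.append(sample)
        if hits:
            flagged[gene] = hits
    return flagged


def positive_fraction_by_subgroup(flagged, subgroup_of):
    """Fraction of samples per subgroup carrying at least one flagged CGA."""
    positive = {s for samples in flagged.values() for s in samples}
    totals, counts = {}, {}
    for sample, subgroup in subgroup_of.items():
        totals[subgroup] = totals.get(subgroup, 0) + 1
        counts[subgroup] = counts.get(subgroup, 0) + (sample in positive)
    return {g: counts[g] / totals[g] for g in totals}


# Toy data (hypothetical values, not taken from the cohort).
toy_expression = {
    "GENE_X": {"s1": 0.5, "s2": 0.4, "s3": 120.0, "s4": 0.6},
    "GENE_Y": {"s1": 1.0, "s2": 1.2, "s3": 0.9, "s4": 1.1},
}
toy_subgroups = {"s1": "I", "s2": "A", "s3": "P", "s4": "P"}

if __name__ == "__main__":
    flagged = flag_cga_outliers(toy_expression)
    print(flagged)
    print(positive_fraction_by_subgroup(flagged, toy_subgroups))
```

In this toy input only `s3` exceeds the threshold for `GENE_X`, so the per-subgroup fractions are non-zero only for the hypothetical P samples. A real analysis would also need to fix the expression scale (linear rather than log) before applying a fold threshold.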
