Prostate cancer subtyping and differential methylation analysis based on the ETS family of transcription factors fusion genes

In this study, we proposed treating ETS family gene fusion status in PCa as two distinct subtypes, fusion positive and fusion negative, and found substantial differences in DNA methylation profiles between them. We analyzed the chromosomal distribution of PCa fusion genes and of ETS family fusion positive genes, the reading-frame modes of the fusions, and the predicted structural domains of the fusion genes and their parental genes, in order to reveal the role of ETS family fusion genes in gene structure and functional regulation. Subsequently, we investigated DNA methylation patterns in three pairs of subgroups: recurrent fusion positive and negative, TMPRSS2-ERG fusion positive and negative, and ETS family fusion positive and negative. This analysis aimed to identify differentially methylated CpG sites and to clarify the relationship between each subgroup and overall survival. The findings revealed a trend toward higher mortality for PCa tumors with recurrent fusion genes, those without TMPRSS2-ERG fusion genes, and those with ETS family fusion genes. Finally, we integrated the methylation results with gene expression data from the same patient samples to explore the potential impact of different DNA methylation patterns on mRNA expression levels in PCa.

To validate the ETS family fusion positive and negative subtypes of PCa, we applied the same survival analysis and DNA methylation clustering approach to the recurrent fusion positive and negative groups and to the TMPRSS2-ERG fusion positive and negative groups. The results showed that ETS family fusion positive and negative PCa tumors could be clearly distinguished by their DNA methylation profiles.
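The survival comparison between fusion positive and fusion negative groups can be illustrated with a Kaplan-Meier estimate. The following is a minimal sketch, not the analysis pipeline used in this study: the function name `kaplan_meier` and the follow-up times and event flags are hypothetical toy values, and the choice of the Kaplan-Meier estimator is itself an assumption about how the survival curves were obtained.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed death.

    times  -- follow-up time per patient (e.g. years); hypothetical input
    events -- 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # patients leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy group (assumed values): two deaths, three censored patients.
toy_times = [2, 4, 6, 9, 10]
toy_events = [0, 1, 0, 1, 0]
print(kaplan_meier(toy_times, toy_events))  # [(4, 0.75), (9, 0.375)]
```

The estimate is only defined up to the longest follow-up time in a group, so a group whose follow-up ends early yields no survival estimate beyond that point.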
In PCa, subtyping based on the DNA methylation landscape has revealed significant differences between TMPRSS2-ERG fusion positive and TMPRSS2-ERG fusion negative tumors, pointing to distinct potential carcinogenic pathways in these molecular subtypes [11]. Similarly, in rhabdomyosarcoma, the methylation characteristics associated with the PAX3-FOXO1 and PAX7-FOXO1 fusion genes, which are formed by linking the DNA-binding domain of PAX3 or PAX7 to the transactivation domain of FOXO1, provided a new method for distinguishing fusion positive from fusion negative rhabdomyosarcoma [17].

We found that fusion events of PCa fusion genes were mainly concentrated on chromosomes 1, 4, and 21, with fusion events within chromosomes 3 and 4 being the most common. The frequency of ETS family fusion events was highest on chromosome 21, and interchromosomal fusions were more frequent than intrachromosomal fusions. Fusion events on these chromosomes have been closely linked to cancer and other diseases. For example, the FIP1L1-PDGFRA fusion gene associated with hypereosinophilic syndrome (HES) is located on chromosome 4. Molecular cytogenetic analyses have shown that an interstitial deletion on chromosome 4 fuses FIP1L1 to PDGFRA and constitutively activates PDGFRA, producing a protein with tyrosine kinase activity that stimulates sustained proliferation of eosinophils [18]. Additionally, the fusion gene AML1-ETO is associated with acute myeloid leukemia (AML), where it has been linked to decreased survival rates and increased recurrence rates [19]. AML1-ETO, which results from a translocation between chromosomes 8 and 21, is one of the most common chromosomal abnormalities in AML [20]. Within the ETS family of transcription factors, ETS proto-oncogene 2 (ETS2) is an important member and is located on chromosome 21.
The protein encoded by ETS2 is a Ca²⁺-dependent phosphorylated protein involved in regulating physiological and pathological processes such as cell proliferation, differentiation, and apoptosis [21].

The reading-frame modes of PCa fusion genes were distributed roughly equally among in-frame, out-of-frame, and 3′-UTR and 5′-UTR fusions. ETS family fusion genes were predominantly out-of-frame. In acute myeloid leukemia, t(1;21) can lead to an out-of-frame RUNX1-CLCA2 fusion, and such out-of-frame fusions can generate hypothetical truncated RUNX1 isoforms [22]. These out-of-frame fusions retain the DNA-binding Runt domain but lack the transcriptional regulatory domains of RUNX1. Truncated RUNX1 can promote the development of leukemia in patients [23]. In Philadelphia chromosome (Ph) positive leukemia, the COOH-terminal portion of the transcription product of the tumor-specific antigen BCR-ABL contains an out-of-frame amino acid sequence encoded by the ABL gene. These variants are expressed in patients with Ph positive chronic myeloid leukemia (CML) and acute lymphoblastic leukemia (ALL) [24]. A frameshift at the fusion junction can convert a functional in-frame fusion into a dysfunctional out-of-frame fusion, thereby altering the structure and function of the fusion protein [25]. Future research should precisely identify fusion breakpoints and analyze the impact of frameshifts on the reading frame to understand disease mechanisms and guide therapeutic strategies.

We conducted a statistical analysis of the domains of parental genes and their corresponding fusion genes. The fusion gene domains fell into several categories: some fusion genes had domains that overlapped with those of both parental genes, while others retained only the domains of one parental gene. A further subset retained only some of the domains of one parental gene.
Only a very small number of fusion genes gained new domains that were absent from the original parental genes. The domains preserved in fusion genes play crucial roles in transcription, cell signaling, and the immune system. For example, the DDT (DNA-binding homeobox and different transcription factors) domain is found in DNA-binding homeobox-containing proteins and in various transcription and chromatin remodeling factors. It works together with other protein domains to regulate biological processes such as transcription, replication, and repair [26]. The DBB domain is shared by the Dof (DNA binding with one finger), BANK1 (B-cell scaffold protein with ankyrin repeats 1), and BCAP (B-cell adapter for PI3K) proteins. This domain typically contains functional regions related to DNA binding or cell signaling [27]. The NLRP3 inflammasome is a crucial component of the innate immune system, and its aberrant activation can lead to inflammatory diseases. The LRR (leucine-rich repeat) domain controls inflammasome activation by mediating NLRP3 self-association, oligomerization, and interaction with the essential regulator NEK7 [28]. Additionally, new domains such as IG_like emerged. The polycystic kidney disease gene PKD1 encodes polycystin-1, which includes 16 IG_like domains (or PKD domains), indicating a significant role in cell–cell or cell–matrix interactions [29].

Survival analysis revealed that after 9 years of follow-up, the survival rate of the recurrent fusion negative group declined, whereas subsequent survival rates could not be estimated for the recurrent fusion positive group because of data limitations. A significant statistical trend indicated that tumors in the recurrent fusion positive group had a higher risk of death than those in the recurrent fusion negative group.
Although the statistical trend for the TMPRSS2-ERG fusion positive and negative groups was not very significant, the risk of disease progression or death in the TMPRSS2-ERG fusion negative group increased over time relative to the fusion positive group. Similarly, for the ETS family fusion positive and negative groups, the survival rate of the fusion positive group decreased significantly after 9 years, while subsequent survival rates could not be estimated for the fusion negative group because of data limitations. Although the statistical trend was not pronounced, the trend estimate suggested a higher risk of disease progression or death in the ETS family fusion positive group than in the fusion negative group. This was also a limitation of the study: we used only PCa samples from TCGA, which is a small dataset for detecting such differences.

By analyzing the methylation levels of the CpG sites that were differentially methylated between the ETS family fusion positive and negative groups, we performed a cluster analysis of the samples and identified two main clusters. One cluster contained 88% of fusion positive tumors, and the other cluster contained 89% of fusion negative tumors. This suggested that the differentially methylated CpG sites were closely related to ETS family fusion status.

We identified hypermethylated CpG sites that differed significantly between the ETS family fusion positive and fusion negative groups, including cg24345747 and cg17701886. We found a strong negative correlation between the methylation levels of these two CpG sites and the mRNA expression of the corresponding genes, CD8A and B3GNT5. These genes play crucial roles in the pathogenesis of cancer and other diseases. CD8A (cluster of differentiation 8A) belongs to the T cell cytotoxic pathway genes and encodes the CD8 antigen, which cooperates with the T cell receptor on T cells to recognize presented antigens [30].
Additionally, in childhood asthma samples, higher methylation of CpG sites in the CD8A promoter region significantly downregulated CD8A expression and affected the TCR (T-cell receptor) signaling pathway, thereby regulating the progression of childhood atopic asthma [31]. Radiogenomic features indicated that preoperative prediction of CD8A expression in bladder cancer patients helped predict patient prognosis and sensitivity to immunotherapy [32]. Copy number amplification and promoter hypomethylation of the B3GNT5 (β-1,3-N-acetylglucosaminyltransferase 5) gene contributed to its overexpression in basal-like breast cancer (BLBC), the most invasive subtype of breast cancer, where it served as a prognostic marker and therapeutic target [33]. Dysregulation of sphingolipid metabolism was the major pathway altered in non-small cell lung cancer patients [34], and the B3GNT5 gene, along with the GAL3ST1 (β-1,4-galactosyltransferase 1) gene, altered the levels of metabolites such as lactate, sphingolipids, and sulfides in the serum of these patients [35]. This differential regulation affected the proliferation, migration, and invasion of tumor cells [35].

The correlation between the top-ranked differentially methylated CpG sites with significant p-values for hypermethylation and the corresponding mRNA expression levels could be either positive or negative, depending on the location of the aberrantly methylated CpG sites within the gene [36]. For instance, the downregulation of the DNMT3B gene could result in upregulation through DNA remethylation, depending on its local chromatin structure [37].
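The relationship between CpG methylation and mRNA expression discussed above can be quantified with a correlation coefficient. The following is a minimal sketch under stated assumptions: it uses the Pearson coefficient (the correlation measure used in this study is not specified here), the helper name `pearson_r` is illustrative, and the beta values and expression values are hypothetical toy numbers rather than data for cg24345747 or cg17701886.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample values for one CpG site and its gene.
beta_values = [0.10, 0.25, 0.40, 0.60, 0.85]   # methylation level (0 to 1)
log2_expression = [9.1, 8.4, 7.0, 5.2, 3.9]    # mRNA expression, assumed log2 scale

r = pearson_r(beta_values, log2_expression)
print(f"r = {r:.3f}")  # a value near -1 indicates a strong negative correlation
```

A negative coefficient corresponds to the pattern described for CD8A and B3GNT5, where higher methylation accompanies lower expression; a positive coefficient would correspond to the opposite case, which the text notes can occur depending on where the CpG site lies within the gene.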
