MOSBY enables multi-omic inference and spatial biomarker discovery from whole slide images

We developed the MOSBY model that learned a mapping from RetCCL (contrastive self-supervised learning)-based WSI features to bulk transcriptome, proteome, whole exome, SNP-array, and DNA methylome based profiles (Fig. 1a). We limited our study to 21 TCGA indications14 with at least 200 paired RNA-seq and H&E whole slide images, resulting in a total of 12,592, 10,192, and 12,090 images matching with transcriptome, proteome and DNA-based data respectively (breakdown by indication in Supplementary Table 1). Analyzed transcriptomic features consisted of 55 TME-related genes15,16 (Supplementary Table 2) and 175 gene sets that covered tumor-related processes17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32, metabolic pathways33, and TME cell type or process signatures34,35,36,37 (Supplementary Table 3). The proteome model involved 191 total and phosphoprotein antibodies from the TCGA reverse-phase protein array (RPPA) panel that focused on tumor-intrinsic and oncogenic processes 38. Tested DNA-based features were limited to tumor purity, cancer DNA fraction, subclonal genome fraction, and average DNA methylation (all continuous measures bound to the interval [0,1]).Figure 1(a) MOSBY flowchart: H&E-stained whole slide images were downloaded from TCGA and partitioned into tiles. Image features were extracted using RetCCL-pretrained weights and the ResNet-50 architecture. A 2-layer perceptron was trained to learn the mapping from image features to bulk omic profiles. A by-product of the process, tile level predictions were utilized to achieve virtual spatialization of omic profiles. Pairwise correlation of omic features resulted in a colocalization map for each patient. Colocalization maps were then flattened, and used as covariates in survival analysis with the goal of spatial biomarker discovery. (b,c,d,e) Test set Spearman correlation between omic feature predictions and ground truth, averaged over 5 cross-validation folds. Each data point is an omic feature, and the x-axis shows different single- and multi-indication runs. Red and turquoise boxes indicate results from RetCCL vs. ImageNet-pretrained ResNet-50 architecture respectively. Results are shown for (b) 55 TME-related genes, (c) 175 tumor cell and microenvironment-related signatures, (d) 191 reverse phase protein array antibodies, and (e) 4 DNA-based features. Mann–Whitney hypothesis tests for (b), (c), (d) were implemented with the ggpubr compare_means function to compare RetCCL and ImageNet based results (Hypothesis tests were not performed for (e) due to the small number of features, N = 4). Significance levels are provided in the figure, original P-values are provided in Supplementary Table 5. ****p < 0.0001, ***p < 0.001, **p < 0.01, *p < 0.05, ns: p > 0.05. Lower and upper hinges in box plots correspond to first and third quartiles, while whiskers extend to 1.5 times interquartile range. (f-g) Concordance between ST ground truth and MOSBY-predicted levels of select (f) signatures, (g) genes. Normalized expression levels in ground truth and MOSBY-predicted levels were both mapped to the [0–1] interval to enable comparison.In addition to single indication models, we also trained MOSBY in pan-tissue, pan-cancer, pan-squamous and pan-adenocarcinoma settings. Pan-tissue models consisted of LUNG (lung adenocarcinoma and squamous cell carcinoma), KIDNEY (clear cell and papillary renal cell carcinoma), BRAIN (glioblastoma and low grade glioma), and COADREAD (colon and rectal adenocarcinoma). The pan-adenocarcinoma (ADENO) model consisted of pancreatic, lung, stomach, colon, rectal, prostate, and ovarian adenocarcinomas; while the pan-SQUAMOUS model included squamous cell carcinomas of the lung, cervix, and head and neck. The ADENO and SQUAMOUS models allowed us to investigate whether histological similarities beyond tissue architecture enabled MOSBY to learn a better mapping from WSI features to multi-omic data.Similar to the HE2RNA12 model, a 2-layer perceptron was adopted as the multi-output regression network that mapped image features to omic variables, and a maximum of 8000 image tiles per slide were used. Following deviations from the HE2RNA model were implemented in TCGA: (1) Separate models were trained for gene, signature, protein, and DNA variables; (2) image tiles were randomly selected from each WSI to capture an unbiased representation of the entire slide, and were all used in training without clustering; (3) batch-normalized transcriptomic and proteomic data were used as ground truth to enable across-indication comparisons with resulting MOSBY predictions; (4) tile-level predictions of omic features were aggregated by averaging all available tiles to obtain slide-level predictions.Multiple model architectures (i.e. number of perceptron layers) and average pooling approaches (varying number of averaged tiles in {100,500,1000,all}) were explored in independent urothelial carcinoma datasets IMvigor21039 and IMvigor21140. Results showed that average pooling across all available tiles led to more accurate test set predictions in this setting (Supplementary Fig. 1a, Supplementary Table 4a), however, a 1-layer perceptron can in some cases be preferable to 2-layer perceptron in MOSBY (Supplementary Fig. 1b, Supplementary Table 4b).Contrastive self-supervised pretraining benefits prediction of omic data from H&E whole slide imagesThe RetCCL feature extractor utilized TCGA H&E images during contrastive training, however it was not supervised by gene expression information. Hence, there are no a priori guarantees for RetCCL-based image features to predict gene expression with higher accuracy than ImageNet-based features. Thus, we investigated the performance differences between MOSBY models trained with RetCCL- or ImageNet-based image features. Models were trained with fivefold cross-validation (80/20 percent training/test set split). A random one fifth of the training set was also allocated as validation set to determine an early stopping criterion using the Spearman correlation between slide-level model prediction and ground truth omic data. Spearman correlation was preferred to mitigate the effects of outlier samples in relatively smaller cohorts. Correlation coefficients from test sets were subsequently averaged across five folds to obtain the final performance score for a given feature.RetCCL-based image features consistently led to higher cross-validated test set averages in all four omic data types compared to features extracted with ImageNet-pretrained ResNet-50 architecture (Fig. 1b-e, Supplementary Table 5). The pan-cancer model (PANCAN) with RetCCL-based features achieved the highest performance for all tested data types with median cross-validated Spearman correlation of 0.722 in single genes (0.673 with ImageNet-features) (Fig. 1b), 0.693 in signatures (0.647 with ImageNet-features) (Fig. 1c), 0.549 in proteomic data (0.512 with ImageNet-features) (Fig. 1d), and 0.6 in DNA features (0.533 with ImageNet-features) (Fig. 1e). In terms of single indication models, the thyroid cancer model (THCA) achieved best performance for single gene and protein expression data sets with RetCCL-based features. Median cross-validated Spearman correlations in these models were 0.59 vs 0.524 for single gene, and 0.443 vs 0.385 for protein expression data with RetCCL and ImageNet-based features respectively (Fig. 1b,d). The single indication models achieving best performance for signature and DNA data were the liver and bladder cancer models respectively (LIHC and BLCA) (Fig. 1c,e). Across tested signatures, the LIHC model showed a median cross-validated Spearman correlation of 0.532 with RetCCL-based vs 0.45 with ImageNet-based features. For tested DNA features, the BLCA model achieved a median cross-validated Spearman correlation of 0.553 with RetCCL-based vs 0.417 with ImageNet-based features.To address the information leakage possibility in TCGA, the RetCCL vs ImageNet comparison was repeated in urothelial carcinoma datasets IMvigor21039 and IMvigor21140. Gene and signature models were trained in the same fashion as in TCGA, and cross-validated averages of test set correlations were compared between RetCCL- and ImageNet-based models. In both gene and signature models, MOSBY slide level predictions were highly significantly better with RetCCL-based features (Supplementary Fig. 1c, Supplementary Table 6a,b). Taken together, these results indicated that contrastive self-supervised pretraining in large-scale histology datasets benefits prediction of multiomic data from H&E-stained whole slide images.Further inspection of multi-indication models in TCGA showed that cohort differences between indications may lead to exaggerated performance estimates, and a pan-cancer model has limited utility in a clinical deployment scenario (Supplementary Notes, Supplementary Fig. 2).MOSBY tile level predictions are validated with spatially resolved ground truthAfter showing the benefit of contrastive self-supervised training for ‘slide level’ predictions, we validated MOSBY ‘tile level’ predictions with spatially resolved transcriptomic data in breast cancer, and CD8 immunohistochemistry (IHC) whole slide image data in urothelial cancer. For both tasks, inference models were trained with data from 80% of patients, with the remaining 20% used as the validation set to determine an early stopping criterion.The inference model for the former validation task was trained in TCGA (BRCA cohort, N = 1576 WSIs), and subsequently deployed on H&E image tiles from a publicly available spatially resolved transcriptomic breast cancer dataset41 (referred to as ST from here on, N = 68 WSIs). Image tiles were centered around ST spot coordinates to enable one-to-one comparison between spot level ground truth data and tile level model predictions (Methods). For individual slides, the MOSBY signature model predictions showed the highest concordance with ground truth for ‘poor differentiation’ (Stemness_Kim_Myc, Spearman r = 0.71) and stromal features (Stroma_Estimate, Spearman r = 0.63) (Fig. 1f, Supplementary Fig. 3a). Across 68 slides, concordance was highest for a monocyte signature with a median Spearman correlation of 0.238 (Supplementary Fig. 3a), and a maximum of 0.611 (Fig. 1f). This large variation across slides was observed for all tested signatures suggesting that the quality of spatially resolved data is critical in validating MOSBY predictions. Of note, CD8 T cell infiltration and proliferation-related model predictions also showed strong concordance with ground truth, demonstrating the variety in phenotypes captured successfully by the model (Supplementary Fig. 3b).Spot level concordance for single gene expression values was overall lower than that for signatures (Supplementary Fig. 3d), potentially driven by the large degree of zero reads (i.e. dropouts) in ground truth data (Supplementary Fig. 3e). However, model predictions for genes associated with stroma, plasma cells, and epithelial features showed good performance for individual slides (COL1A2, MZB1, EPCAM respectively) (Fig. 1g). CD68, a myeloid marker, showed the highest median concordance in gene models (Spearman r = 0.127), which was low in magnitude but consistent with the high concordance of myeloid cells in signature models (Supplementary Fig. 3d).The gene–gene and signature-signature correlation structure in spatial transcriptomics data also showed that MOSBY predictions had stronger concordance with signature level ground truth data, again highlighting that computing gene signature scores is an effective strategy to deal with the dropout issue in spatially resolved transcriptomic datasets (Supplementary Notes, Supplementary Fig. 3).MOSBY tile-level predictions were further validated with CD8 antibody-stained IHC image data (Methods). Inference models were trained with paired H&E and RNA-seq data from urothelial carcinoma trials IMvigor21039 and IMvigor21140 (Methods, N = 1460). Comparing CD8-associated gene and signature model predictions with IHC ground truth showed that MOSBY successfully learned a CD8 T cell-specific representation on H&E images (Figs. 2 and 3, Supplementary Notes).Figure 2Validation of MOSBY in silico spatialization with CD8 IHC whole slide images in IMvigor211. (a) Workflow for computational alignment of serially sectioned CD8 IHC and H&E whole slide images, and correlation of CD8 IHC-based ground truth values with MOSBY tile-level predictions. The derivation of tile level ground truth values by applying DAB mask on CD8 IHC images is described in Methods. (b) Pearson correlation between tile-level ground truth (CD8 IHC) and MOSBY-predicted values for the 42 slides that satisfied quality control criteria in (a). The x-axis shows gene (CD8A, CD8B) and signature (Cibersort CD8 T cell, T effector cell) features compared with CD8 IHC. For a given slide, the feature with the highest CD8 IHC correlation is denoted with a red dot, while the other three features are denoted with blue dots. The number of slides where each feature had the highest CD8 IHC correlation is shown in parenthesis under the x-axis labels. Lower and upper hinges in box plots correspond to first and third quartiles, while whiskers extend to 1.5 times interquartile range. (c–j) Description of results for the slide where MOSBY predictions are the most concordant with CD8 IHC-based ground truth levels: c) Density plot showing the correlation between tile-level ground truth data (DAB mask applied on CD8 IHC WSI) and MOSBY CD8B predictions (Pearson r = 0.79, N = 16,859 tiles, two-sided p ≈ 0 using the exact beta distribution of r). (d) Computationally aligned CD8 IHC and H&E whole slide images. (e) Heat map showing CD8 IHC tile level quantification. DAB mask is applied on CD8 IHC images to call ‘brown’ pixels. Brown pixels are counted within each 100 × 100 px window. Count values are log-transformed, and then clipped at 50th and 98th percentiles for contrast. Background regions are denoted with gray. (f) Heat map showing CD8B tile-level model predictions. Predicted values are clipped at 50th and 98th percentiles for contrast. Background regions are denoted with gray. (g) High magnification example of a CD8 T cell-rich region on the CD8 IHC image. (h) DAB mask classification of ‘brown’ pixels (denoted with green) overlaid on CD8 IHC image. (i) Corresponding region on the H&E image as identified by slide-level computational alignment. (j) CD8B model predictions inferred from the H&E image in (i) overlaid on the CD8 IHC image in (g). Orange and blue show high and low values respectively.Figure 3Representative slides showing the variability for CD8 IHC vs. MOSBY concordance in IMvigor211. (a–d) Slides from the 75th, 50th, 25th, and 0th percentiles of the Pearson correlation between CD8B model predictions and CD8 IHC ground truth quantification. In each figure, subpanels from left to right depict (1) computationally aligned CD8 IHC and H&E whole slide images, (2) heat map showing CD8 IHC tile level quantification (plotted values derived from 100 × 100 px windows, and clipped at 50th and 98th percentiles) with gray-colored background regions, (3) heat map showing CD8B tile level model predictions (clipped at 50th and 98th percentiles) with gray-colored background regions, (4) density plot showing the correlation between tile-level ground truth data (DAB mask applied on CD8 IHC WSI) and MOSBY CD8B predictions. (a) Slide from the 75th percentile with r = 0.48. (b) Slide from the 50th percentile with r = 0.35. (c) Slide from the 25th percentile with r = 0.25. (d) Slide from the 0th percentile with r = 0.011. (e) High magnification example of a CD8 T cell-poor region on the slide with r = 0.011 from (d). This region shows the presence of brown-staining artifacts. (f) DAB mask classification of ‘brown’ pixels (denoted with green) overlaid on CD8 IHC image. Brown-staining artifacts are also captured with DAB mask. (g) Corresponding region on the H&E image as identified by slide-level computational alignment. (h) CD8B model predictions inferred from the H&E image in (g) overlaid on the CD8 IHC image in (e). Orange and blue show high and low values respectively.MOSBY predicts stroma, immune, and proliferation features with highest accuracyIn TCGA indications, cross-validated test set performance in signature and protein models was used to determine biological features with the highest prediction accuracy, and those with 25 highest performances are highlighted in Fig. 4a,d respectively. The best-performing signatures ESTIMATE Stroma and ESTIMATE Immune had median correlations 0.627 and 0.616 across all tested indications (Fig. 4a). In randomly split test sets, the Spearman correlation of ESTIMATE Stroma and ESTIMATE Immune signatures reached as high as 0.787 and 0.781 in skin cutaneous melanoma and thyroid cancer respectively (Fig. 4b). Other best-performing signatures were again largely enriched in stroma and immune features as well as general mesenchymal characteristics (such as Hallmark EMT and Taube et al. mesenchymal signatures). (Fig. 4a). Of note, particular signatures known to play an important role in specific cancer indications also showed strong prediction accuracy in those pertinent settings. For instance, the Hallmark Angiogenesis signature had Spearman r = 0.746 in the liver cancer model (LIHC), and a fatty acid elongation signature had Spearman r = 0.741 in the low grade glioma setting (LGG) (Fig. 4b). After adjusting for multiple hypothesis testing with false discovery rate (adj. p < 0.05), 14 out of 21 tested indications showed significant performance (as measured by Spearman r) for more than 90% of tested signatures (total N = 175 signatures) (Fig. 4c).Figure 4Best performing features in MOSBY signature, protein, gene, and DNA panels. (a) Top 25 signatures with best prediction accuracy according to cross-validated test set performance in single-indication models. (b) Test set correlations of select signatures in a single split run (70% training, 15% validation, and 15% test sets). (c) Percentage of signatures predicted with significant ground truth correlation (adjusted p-value < 0.05). (d) Top 25 proteins/phosphoproteins with best prediction accuracy according to cross-validated test set performance in single-indication models. (e) Test set correlations of select proteins/phosphoproteins in a single split run (70% training, 15% validation, and 15% test sets). (f) Percentage of proteins/phosphoproteins predicted with significant ground truth correlation (adjusted p-value < 0.1). (g) Top 25 genes with best prediction accuracy according to cross-validated test set performance in single-indication models. (h) Cross-validated test set performance of all tested DNA-based features. Lower and upper hinges in box plots correspond to first and third quartiles, while whiskers extend to 1.5 times interquartile range.In contrast with our signature set that represented both tumor-intrinsic pathways and cell populations in the TME, the TCGA RPPA antibody set was heavily focused on tumor-intrinsic pathways. Therefore, the best-performing proteins had a diversity of representation from tumor-intrinsic characteristics, such as proliferation (Cyclin-B1, FOXM1), DNA repair (MSH6, PARP1), and apoptosis (cleaved Caspase7) (Fig. 4d). MSH6 and Cyclin B1 were the two best-performing proteins that were tested in at least 20 indications (Fig. 4d). These proteins had median Spearman r = 0.423 and 0.417 respectively across tested indications, but correlations in individual indications were as high 0.676 in PRAD for Cyclin B1, and 0.593 in LGG for MSH6. Overall, lower prediction accuracy in protein models was expected due to the lower signal-to-noise ratio in the RPPA technology compared to RNA-seq. However, in specific settings such as the PRAD model, both total and phosphoproteins showed strong prediction accuracy in test sets. Progesterone receptor (PR) and cMET_pY1235 reached Spearman r = 0.614 and 0.631 respectively in this indication (Fig. 4e). Also, the AKT/mTOR pathway showed evidence of strong prediction accuracy in the sarcoma setting where S6 and RICTOR_pT1135 antibodies showed Spearman r = 0.731 and 0.695 respectively (Fig. 4e). Despite paucity of representation in the RPPA panel, stromal and immune features were also found among the best-performing proteins such as ECM-associated Collagen-VI, Fibronectin and T cell-associated Lck (Fig. 4d). Overall, 11 out of 20 tested indications showed significant performance (adj. p < 0.1) for more than 90% of tested antibodies (total N = 191 antibodies) (Fig. 4f).In terms of single gene models, best performing features confirmed the strong prediction accuracy associated with stromal features observed above; the highest-ranking genes were marker genes of fibroblasts (e.g. LUM, COL5A1) (Fig. 4g). Top-ranking genes also included markers of macrophages (e.g. CSF1R), and T cells (e.g. CD3E). In DNA models, tumor cellularity measures (tumor purity, cancer DNA fraction) achieved better performance than subclonal genome fraction, and average DNA methylation features (Fig. 4h).Taken together, these results highlight the promise of MOSBY predictions for profiling cell populations in complex tissue architectures. Models trained in TCGA lung adenocarcinoma were further deployed in IMpower15042, an independent validation cohort with pathologist-annotated WSIs. Comparison of pathologist cancer epithelia annotations with model predictions indicated that MOSBY is effective in inferring intratumor heterogeneity (Supplementary Notes, Supplementary Fig. 4). Moreover, model predictions for commonly used epithelial markers revealed important biology, such as DNA-based tumor cellularity estimates being preferable to RNA-based epithelia signatures (e.g. Taube et al.20) in demarcating cancer regions via bulk measurements (Supplementary Notes, Supplementary Fig. 5).Spatial patterns inferred from tile-level MOSBY predictions increase survival predictive power of gene signature-based modelsMOSBY tile-level predictions enable in silico spatialization for a tested omic feature, as well as assess spatial correlation between two tested features (positive and negative correlations indicating colocalization and spatial exclusion respectively). We define a ‘colocalization feature’ as the Pearson correlation between tile-level predictions of two omic features. As correlation coefficients were computed across all tiles on a slide, colocalization features represent slide-level as opposed to ‘local’ spatial patterns. We focused on our signature panel (N = 175) as the omic features to derive colocalization features and investigate survival associations, as the signature panel covered both tumor and non-tumor TME components. Moreover, using signatures as opposed to single genes enable the discovery of ‘biologically interpretable’ spatial biomarkers as well-designed signatures capture pathway and cell type-related gene expression with higher fidelity. For a slide, correlations from all pairwise combinations of signatures (N = 15,225) (i.e. the collection of all colocalization features) are referred to as the ‘colocalization map’ from here on (Fig. 1a).A survival analysis was performed to ask whether slide-level patterns in colocalization maps harbored survival signals that could not be captured by the mere magnitude of signature expression. Three L1-regularized Cox proportional hazards regression models were fit to address this question, and the ability of the models to predict survival was computed with concordance indices (c-index) (Fig. 5a) (Methods). The first model only used flattened colocalization features (MOSBY predictions, N = 15,225). The second Cox model only used signature expression levels (N = 175) with the goal of assessing the survival predictive power of gene expression ‘magnitude’. The third model was a joint model combining all signatures but only lasso-selected colocalization features from the first model in order to prevent colocalization features from dominating the model. Each model was run across 10 cross-validation folds to optimize the shrinkage parameter and also obtain mean and standard error estimates for the relevant c-index (Methods).Figure 5Survival predictive power comparison between gene signatures and colocalization features. (a) Concordance indices of survival models involving only colocalization features (denoted with C), only signatures (denoted with S), or signatures and lasso-selected colocalization features (denoted with C + S) in 21 TCGA indications. Indications are ordered according to increasing concordance index in signature-only models. Error bars denote ± 1 standard error. (b) Colocalization features most consistently associated with poor overall survival and elevated tumor levels in multiple TCGA indications. (c,d) Survival and differential tumor colocalization evidence for the ER17 vs. Neurotransmitter signature pair in TCGA colon, liver, lung, and ovarian cancer cohorts. (c) Median-split Kaplan–Meier plots and logrank test p-values, (d) tumor vs normal box plots and Wilcoxon test p-values. (e,f) Median-split Kaplan–Meier plots and logrank test p-values for the colocalization between ER17 and Neurotransmitter signatures in IMpower110 non-squamous cohort: (e) Chemotherapy arm, (f) Atezolizumab arm. Lower and upper hinges in box plots correspond to first and third quartiles, while whiskers extend to 1.5 times interquartile range. Red and blue color denote high and low levels respectively.This process was implemented separately for all tested TCGA indications, and c-index mean and standard error estimates were plotted (Fig. 5a). We observed that the joint (i.e. third) model had a higher c-index than the signature-only model in most indications. This finding revealed that slide-level spatial patterns, discovered with an inference engine such as MOSBY, have survival predictive power that could not be captured by gene expression alone. As MOSBY colocalization features are biologically interpretable, this opens the door to discovering potentially ‘actionable’ colocalization or spatial exclusion patterns that predict clinical outcomes. Moreover, the c-indices achieved in the joint model were highly competitive with or higher than those reported in the literature from end-to-end survival-trained multimodal models utilizing genomic, transcriptomic, and image datasets43. Of note, comparing colocalization-only and signature-only models, we also found that the total survival predictive power of slide-level spatial patterns was not as high as that of signature levels in most TCGA indications. Ovarian and rectal cancer results were an exception to this general pattern (Fig. 5a), suggesting spatial biomarkers discovered in these indications may have the greatest potential to lead to novel insights.Colocalization maps enable discovery of biologically interpretable spatial biomarkersWe next investigated consistent spatial predictors of risk that were supported across multiple indications and also showed evidence of tumor specificity. In each indication, survival effects and tumor specificity of colocalization features were explored with two tests: (1) A univariate Cox regression model for survival, and (2) a Mann–Whitney test to compare tumor vs. normal levels of colocalization. A potential spatial biomarker of risk was defined as a colocalization feature that was significantly associated with poor overall survival (p < 0.05) and also had elevated levels of colocalization in the tumor (adj. p < 0.05). Given these criteria, four colocalization features had evidence in four different TCGA indications to be a spatial biomarker of risk (Fig. 5b). Of these four, the colocalization between an ER stress signature (XBP1s targets ER1722) and a neurotransmitter signature was associated with poor survival and malignant state in colon adenocarcinoma, lung adenocarcinoma, liver hepatocellular and ovarian cancers (Fig. 5c,d). In an independent non-squamous lung cancer study involving both immune checkpoint blockade (atezolizumab) and chemotherapy arms (IMpower110), this colocalization feature was also found to be associated with poor survival in the chemotherapy, but not immunotherapy arm, suggesting higher relevance as a resistance factor in chemotherapy (Fig. 5e,f). Of note, the ER17 and neurotransmitter signatures were not individually found to be associated with risk in any of the four mentioned indications (Supplementary Fig. 6a,b). Visual inspection of WSIs indicated that the expression of ER17 and neurotransmitter signatures primarily came from the microenvironment (as opposed to tumor region) in the case of high colocalization (Supplementary Fig. 6c). In low colocalization cases, the neurotransmitter signature expression primarily came from the tumor region whereas the ER17 signature expression was again predominantly in the microenvironment (Supplementary Fig. 6d).Focusing on colocalization features involving immune system signatures, the T effector cell vs cysteine colocalization was identified as the most consistent spatial biomarker of risk in TCGA. This colocalization feature was associated with poor survival and also showed significant tumor enrichment in breast, squamous lung, and ovarian cancers (Fig. 6a,b). T effector and cysteine signatures were not individually found to be associated with risk in any of these indications (Supplementary Fig. 7a,b). Visual inspection of WSIs indicated that the expression of T effector and cysteine signatures primarily came from the microenvironment (as opposed to tumor region) in the case of high colocalization (Supplementary Fig. 7c). As high colocalization is a risk factor, this expression pattern may be suggestive of a cysteine-associated immunosuppressive TME. In low colocalization cases, the T effector signature expression primarily came from the microenvironment whereas the cysteine signature expression was predominantly in the tumor region (Supplementary Fig. 7d).Figure 6Immune system-related spatial biomarkers consistently associated with risk. (a,b) Survival and differential tumor colocalization evidence for T effector cell and Sabatini Cysteine signature pair in TCGA breast, lung squamous, and ovarian cancer cohorts: (a) Median-split Kaplan–Meier plots and logrank test p-values, (b) tumor vs normal box plots and Wilcoxon test p-value. Lower and upper hinges in box plots correspond to first and third quartiles, while whiskers extend to 1.5 times interquartile range. (c,d,e,f) Median-split Kaplan–Meier plots and logrank test p-values for Sabatini Cysteine colocalization with various immune cell populations: (c,d) TCGA breast cancer cohort: c) Lymphocytes and NK cells, (d) myeloid cell populations. (e–f) Atezolizumab arm of IMpower110 non-squamous cohort (N = 137): (e) Lymphocytes and NK cells, (f) myeloid cell populations. Red and blue color denote high and low levels respectively.The strongest survival effect for T effector vs cysteine colocalization was observed in breast cancer, where we investigated other immune cell types and found that immune vs cysteine colocalization was a general negative prognosis biomarker in this indication. Significant survival associations were observed for both lymphocytes/NK cells (Fig. 6c), and myeloid populations (Fig. 6d). The same immune vs cysteine colocalization features were also found to be negative prognostics in the atezolizumab arm of Impower110 non-squamous cohort (Fig. 6e,f). Of note, most of these features were not prognostic in the chemotherapy arm (Supplementary Fig. 7e,f.), however did not qualify as predictive biomarkers since the survival association differences between atezolizumab and chemotherapy arms were not significant.

Hot Topics

Related Articles